pandas dayofyear for a period of 2 years - pandas

I am familiar with dayofyear. However, this time I have dates that span 2 years (2018, 2019). I'd like to get a day number that runs from 1 to 730 (365+365). For example, Jan 3rd, 2019 should be 368 and Jan 3rd, 2018 should be 3. Is there a built-in way to do this, or do I need to write a function manually?
Thanks

Use whether the year is even or odd to add 365 for the second year, and you then have 1 to 730 (neither 2018 nor 2019 is a leap year):
import datetime as dt
import pandas as pd
df = pd.DataFrame({"Date":pd.date_range(dt.datetime(2018,1,1), dt.datetime(2019,12,31))})
df["Date"].dt.dayofyear + 365 * (df["Date"].dt.year % 2)
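If the span might include a leap year, counting days from a fixed start date avoids hard-coding 365. A minimal sketch, assuming the window starts on 2018-01-01; the `start` variable and the `DayIndex` column name are illustrative, not from the question:

```python
import pandas as pd

# Assumed first day of the multi-year window; this date becomes day 1.
start = pd.Timestamp("2018-01-01")

df = pd.DataFrame({"Date": pd.date_range("2018-01-01", "2019-12-31")})

# Whole days elapsed since the start, shifted so the first day is 1 rather than 0.
df["DayIndex"] = (df["Date"] - start).dt.days + 1
```

Because this is plain date subtraction, it keeps counting correctly across any number of years.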

Related

After certain week of 2022 and continue with this new year

I would like some advice on how to write a WHERE condition that keeps only the rows after a certain week.
What I mean is:
I have dirty data before a specific week of 2022, so I wrote this:
DATEPART(WK, SA.FECHAE) >= 44
AND
YEAR(SA.FECHAE) >= 2022
But we're now in 2023, so I need to include this year's data in the query too.
The query result only goes up to 12-31-2022, and I need everything from week 44 of 2022 until today.
...
WHERE (
DATEPART(WEEK, SA.FECHAE) >= 44
AND YEAR(SA.FECHAE) = 2022
)
OR (
YEAR(SA.FECHAE) >= 2023
)
In the OP's question they ask how to add an additional date range to their WHERE clause. Adding this OR lets a second date range (in this case, anything where the year is greater than or equal to 2023) match the predicate and be returned, without affecting the original range.
Plain English definition of the amended WHERE clause:
Week 44 or later of 2022, or any date from 2023 forward.

Pandas max dayofyear by year

I have a dataframe with a datetimeindex. There are multiple observations on the same day but different times.
I'm familiar with the dayofyear attribute. Is there a way to use this attribute to also determine the max dayofyear by year? The result would be something like:
2015 252
2016 250
2017 251
If I understand your question, you want to look at a list of dates and, for each year, get the maximum day of year.
# Sample data
import pandas as pd
df = pd.DataFrame({'date':pd.date_range(start='2018-12-24',end='2019-01-02',freq='h')})
df['dayofyear'] = df.date.dt.dayofyear
df['year'] = df.date.dt.year
df.groupby('year').dayofyear.max()
Out:
year
2018 365
2019 2
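Since the question describes the dates as living in a DatetimeIndex rather than a column, the same result can be had without adding helper columns. A minimal sketch, assuming hourly sample data; the `value` column and the `max_doy` name are made up for illustration:

```python
import pandas as pd

# Hourly observations spanning a year boundary, stored in the index.
idx = pd.date_range("2018-12-24", "2019-01-02", freq="h")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Wrap the index's day-of-year values in a Series, then group by the index's year.
doy = pd.Series(df.index.dayofyear, index=df.index)
max_doy = doy.groupby(df.index.year).max()
```

Here `max_doy` is a Series keyed by year, one row per year present in the index.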

How to build a Dax to view data of all the months till data w.r.t all the years?

How can I build a DAX function which calculates all the data until a certain date and compare that with the previous year which have the same months as the "until" date?
For example, today's date is 5 April 2018, so if I select 2017 year inside the slicer, I should be able to see a graph which shows me the comparison between the start of year i.e 1 Jan 2018 to 5 April 2018, and 1 Jan 2017 to 5 April 2017 with the previous year.
Currently I am using YTD, but I think it calculates all 12 months of data for every year except 2018, where it shows data from Jan 1 to April 5. Can anyone shed some light here?
Currently I am using this: YTDQty = TOTALYTD(sum(Bookscan[QtySold]),DATESYTD(Bookscan[Week Date]))
This shows correct data for 2018 to date. I want to compare those four months with the same months of my previous years (2017, 2016, 2015), but those years show totals for all 12 months. I only need each year's data from 1 January up to today's date (or, say, March 1). How can I do this?
Very similar to this question.
Do you have a Date Dimension in your model?
TotalQuantity =
SUM(Bookscan[QtySold])
TotalQuantity YTD =
TOTALYTD([TotalQuantity],'Date'[Date])
TotalQuantity YTD LY =
CALCULATE(
[TotalQuantity YTD],
SAMEPERIODLASTYEAR('Date'[Date])
)
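To make the intended comparison concrete, here is the same "year to date, for every year" logic sketched in pandas rather than DAX. This is only an illustration of the arithmetic, not a replacement for the measures above; the `sales` rows, the `cutoff` date, and all names are hypothetical:

```python
import pandas as pd

# Hypothetical rows standing in for the Bookscan table.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2017-02-01", "2017-04-05", "2017-09-01",
                            "2018-01-15", "2018-04-05"]),
    "qty": [10, 20, 30, 40, 50],
})

cutoff = pd.Timestamp("2018-04-05")  # assumed "today"

# Encode (month, day) as one integer so it can be compared across years.
month_day = sales["date"].dt.month * 100 + sales["date"].dt.day
windowed = sales[month_day <= cutoff.month * 100 + cutoff.day]

# Year-to-date quantity for each year, all cut at the same month and day.
ytd_by_year = windowed.groupby(windowed["date"].dt.year)["qty"].sum()
```

The September 2017 row falls after the cutoff's month and day, so it is excluded from the 2017 total.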

Select Date Between Just Day and Month Excluding Year

The following is the pseudo code for what I want to do:
When Date is Between 04-01 and 03-31 of the following year then output as Q1.
I know how to do this with the year but not excluding the year.
I have no idea what you mean by output "Q1". However, if you want your years to start on April 1st (which seems like a reasonable interpretation of what you are asking), the easiest way is to subtract a number of days. For most years you will deal with, you can do:
select year(dateadd(day, -(31 + 28 + 31), date)) as theyear
Of course, this only works three years out of four, because of leap years. One way to fix this is with explicit logic, but that gets messy. Another way is to add the remaining months (April through December never change length) and subtract one year:
select year(dateadd(day, (30 + 31 + 30 + 31 + 31 + 30 + 31 + 30 + 31), date)) - 1 as theyear
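The leap-year difference between the two day shifts can be checked with plain date arithmetic. A small Python sketch, assuming a fiscal year that starts on 1 April and is labelled by its starting calendar year; the function names are illustrative:

```python
from datetime import date, timedelta

def fiscal_year_by_subtracting(d):
    # Shift back by Jan + Feb + Mar of a non-leap year (90 days).
    return (d - timedelta(days=31 + 28 + 31)).year

def fiscal_year_by_adding(d):
    # Shift forward by Apr through Dec (275 days), then step back one year.
    return (d + timedelta(days=30 + 31 + 30 + 31 + 31 + 30 + 31 + 30 + 31)).year - 1

# 31 March 2016 is the last day of the fiscal year that started in April 2015.
leap_case = date(2016, 3, 31)
subtract_result = fiscal_year_by_subtracting(leap_case)
add_result = fiscal_year_by_adding(leap_case)
```

In the leap year the subtracting version lands on 1 January 2016 and reports 2016, while the adding version lands on 31 December 2016 and reports 2015.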
It's unclear exactly what you're trying to do. Q1 usually indicates a quarter, a three-month period. A quarter running from 1 April to 31 March of the following year isn't much of a quarter :)
However, assuming you're trying to select stuff within a certain span of time starting from a particular date, you might try a little date/time arithmetic. First, a few notes:
datetime values have a nominal precision of 1 millisecond (and an actual precision of approximately 3ms). That means that something like '31 March 2014 23:59:59.999' is rounded up to '1 April 2014 00:00:00.000'. The largest time value for a given day is '23:59:59.997'. This can have...deleterious effects on your queries if you're not cognizant of it. Don't ask me how I know this.
datetime literals without a time component, such as '1 April 2013', are interpreted as start-of-day ('1 April 2013 00:00:00.000').
So, something like this:
declare
  @dtFrom datetime ,
  @dtThru datetime
set @dtFrom = '1 April 2013'
set @dtThru = dateAdd(year,1,@dtFrom)
select *
from foo t
where t.someDateTimeValue >= @dtFrom
  and t.someDateTimeValue < @dtThru
should probably do you.
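The same half-open range (inclusive lower bound, exclusive upper bound) sidesteps the end-of-day rounding problem in any language. A pandas sketch for comparison, with made-up timestamps and variable names:

```python
import pandas as pd

# Timestamps on both sides of each boundary.
stamps = pd.Series(pd.to_datetime([
    "2013-03-31 23:59:59", "2013-04-01 00:00:00",
    "2014-03-31 23:59:59", "2014-04-01 00:00:00",
]))

dt_from = pd.Timestamp("2013-04-01")
dt_thru = dt_from + pd.DateOffset(years=1)  # exclusive upper bound

# >= on the lower bound, strictly < on the upper bound.
in_range = stamps[(stamps >= dt_from) & (stamps < dt_thru)]
```

Only the two middle timestamps survive: the first is before the range, and the last sits exactly on the excluded upper bound.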
You might want to adjust the setting of @dtThru to suit your requirements: if you're actually looking for the end of a quarter, you might change it to something like
set @dtThru = dateAdd(month,3,@dtFrom)
If you have a fiscal year that runs from 1 April through 31 March and want to figure out, say, what fiscal year and quarter your data represents, you might do something like this:
select FiscalYear = datepart(year,t.someDateTimeValue)
                  - case (datepart(month,t.someDateTimeValue) - 1) / 3
                      when 0 then 1 -- jan/feb/mar is quarter 4 of the prev FY
                      else 0 -- everything else is this FY
                    end ,
       FiscalQuarter = case (datepart(month,t.someDateTimeValue) - 1) / 3
                         when 0 then 4 -- jan/feb/mar is Q4 of the prev FY
                         when 1 then 1 -- apr/may/jun is Q1 of the curr FY
                         when 2 then 2 -- jul/aug/sep is Q2 of the curr FY
                         when 3 then 3 -- oct/nov/dec is Q3 of the curr FY
                       end ,
       *
from foo t
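The month-to-quarter mapping is easy to verify outside the database. A Python sketch, assuming a fiscal year that runs 1 April through 31 March and is labelled by the calendar year in which it starts; the function name is illustrative:

```python
from datetime import date

def fiscal_year_and_quarter(d):
    # Calendar quarter index: 0 for Jan-Mar, 1 for Apr-Jun, 2 for Jul-Sep, 3 for Oct-Dec.
    q = (d.month - 1) // 3
    if q == 0:
        # Jan-Mar is the fourth quarter of the fiscal year that began the previous April.
        return d.year - 1, 4
    return d.year, q

last_day = fiscal_year_and_quarter(date(2014, 3, 31))
first_day = fiscal_year_and_quarter(date(2014, 4, 1))
```

Subtracting 1 before the integer division matters: dividing the raw month number by 4 would put July in the same bucket as April through June.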
I think what you want is the following:
SELECT year(dateadd(q, -1, mydate)) AS yearEndingQ1
FROM mytable
This would give the year as 2014 for all dates between 04/01/2014 and 03/31/2015. Of course it's possible you want a result of 2015 instead in which case you want:
SELECT year(dateadd(q, 3, mydate)) AS yearEndingQ1
FROM mytable
Hope this helps.
UPDATE per OP's comment: "I am tracking data for a year ending Quarter x. Our fiscal year is a bit weird around here. So basically it would be fiscal year ending Q1, fiscal year ending Q2, etc. Perhaps I could have provided more clarity in my question."
This would give results in three separate columns for fiscal year ending Q1, fiscal year ending Q2, and fiscal year ending Q3. (I assume you don't need anything for fiscal year ending Q4!!)
SELECT year(dateadd(q, -1, mydate)) AS yearEndingQ1
, year(dateadd(q, -2, mydate)) AS yearEndingQ2
, year(dateadd(q, -3, mydate)) AS yearEndingQ3
FROM mytable

using groupby on pandas dataframe to group by financial year

I have a dataframe with a datetime64 column called DT. Is it possible to use groupby to group by financial year from April 1 to March 31?
For example,
Date | PE_LOW
2010-04-01 | 15.44
...
2011-03-31 | 16.8
2011-04-02 | 17.
...
2012-03-31 | 17.4
For the above data, I want to group by Fiscal Year 2010-2011 and Fiscal Year 2011-2012 without creating an extra column.
The first thing you want to do is define a function that outputs the financial year as a value. You could use the following.
def getFiscalYear(dt):
    year = dt.year
    if dt.month < 4:
        year -= 1
    return year
You say you don't want to use an extra column to group the frame. Typically the groupby method is called like df.groupby("colname"); however, that statement is semantically equivalent to df.groupby(df["colname"]), meaning you can do something like this:
grouped = DT.groupby(DT['Date'].apply(getFiscalYear))
and then apply a method to the groups or whatever you want to do. If you just want these groups separated call grouped.groups
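For larger frames, the same grouping key can be computed without a per-row function call. A vectorized sketch using the sample values from the question; the `fiscal_year` and `mean_by_fy` names are illustrative, and the fiscal year is assumed to be labelled by the calendar year in which it starts:

```python
import pandas as pd

DT = pd.DataFrame({
    "Date": pd.to_datetime(["2010-04-01", "2011-03-31", "2011-04-02", "2012-03-31"]),
    "PE_LOW": [15.44, 16.8, 17.0, 17.4],
})

# Calendar year, minus one for January through March.
fiscal_year = DT["Date"].dt.year - (DT["Date"].dt.month < 4).astype(int)

# Group by the computed Series directly; no extra column is added to DT.
mean_by_fy = DT.groupby(fiscal_year)["PE_LOW"].mean()
```

The result is indexed by fiscal year (2010 and 2011 here), and `DT` itself is left unchanged.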
With pandas.DatetimeIndex, that is very simple:
DT.groupby(pd.DatetimeIndex(DT.Date).shift(-3,freq='m').year)
Or if you use Date as an index of DT, it is even simpler:
DT.groupby(DT.index.shift(-3,freq='m').year)
But beware that shift(-3,freq='m') moves each date to a month end; for example, 8 Apr becomes 31 Jan, and so on. Since only the year of the shifted date is used, it still fits your problem well.
I had a similar problem and used the following to offset the business year end to March (month=3) using Grouper and specifying the frequency:
grouped_df = df.groupby([pd.Grouper(key='DateColumn', freq=pd.tseries.offsets.BYearEnd(month=3))])
Pandas Business Year End and Grouper
The simplest method I've found for this (similar to Alex's answer, but slightly more concise):
df.groupby([pd.Grouper(key='DateColumn', freq="A-MAR")])
If you want year finishing on the last working day you can use freq="BA-MAR"
Similar to this answer, but I would (at the time of this initial post) need to report that the fiscal year is 2023. This is achieved by reversing the inequality and changing the decrement to an increment.
def fiscal_year(dt):
    year = dt.year
    # April onward belongs to the fiscal year that ends in the next calendar year
    if dt.month >= 4:
        year += 1
    return year