Finding the WEEK number for 1st January - Big Query

I am calculating the first week of every month for the past 12 months from the current date. The query logic that I am using is as follows:
SELECT
FORMAT_DATE('%Y%m%d', DATE_TRUNC(DATE_SUB(CURRENT_DATE(),interval 10 month), MONTH)) AS YYMMDD,
FORMAT_DATE('%Y%m', DATE_TRUNC(DATE_SUB(CURRENT_DATE(), interval 10 month), MONTH)) AS YYMM,
FORMAT_DATE('%Y%W', DATE_TRUNC(DATE_SUB(CURRENT_DATE(), interval 10 month), MONTH)) AS YYWW
OUTPUT:
Row  YYMMDD    YYMM    YYWW
1    20210101  202101  202100
The YYWW format returns the week as 00, which causes my logic to fail. Is there any way to handle this? My logic will run a 12-month calculation to find the first week of every month.

At a very basic level, you can accomplish it with something like this:
with calendar as (
select date, extract(day from date) as day_of_month
from unnest(generate_date_array('2021-01-01',current_date(), interval 1 day)) date
)
select
date,
extract(month from date) as month_of_year,
case
when day_of_month < 8 then 1
when day_of_month < 15 then 2
when day_of_month < 22 then 3
when day_of_month < 29 then 4
else 5
end as week_of_month
from calendar
order by date
This approach is very simplistic, but you gave no criteria for your week-of-month definition in the query, so this is a reasonable answer. There is potential for a ton of variation in how you define week-of-month. The logic for week-of-year is built into BigQuery, and provides options to handle items such as the starting day of the week, carryover at the end/beginning of consecutive years, etc. There is no corresponding week-of-month logic out of the box, so any "easy" built-in function like FORMAT_DATE() is unlikely to solve the problem.
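For what it's worth, the 00 in the question is expected: %W counts weeks from the first Monday of the year, and 2021-01-01 was a Friday. Python's strftime follows the same %W convention, so the behaviour can be reproduced outside BigQuery. The sketch below is only an illustration (the helper name week_of_month is made up here); it also writes the day-of-month buckets from the CASE expression as a single arithmetic expression.

```python
from datetime import date


def week_of_month(d: date) -> int:
    """Days 1-7 are week 1, days 8-14 are week 2, and so on (same buckets as the CASE)."""
    return (d.day - 1) // 7 + 1


jan1 = date(2021, 1, 1)  # a Friday

# Monday-based week of year: days before the first Monday fall in week 00.
print(jan1.strftime("%Y%W"))  # 202100

# ISO year and ISO week never produce a week 00.
iso_year, iso_week, _ = jan1.isocalendar()
print(f"{iso_year}{iso_week:02d}")  # 202053

print(week_of_month(jan1))  # 1
```

Whether the ISO week or the simple 7-day bucket is the right replacement depends on how "first week of the month" is defined for the report.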

Related

Average on 12 preceding months interval and ISO Weeks

I have the following query to get an average of units over the preceding 12 months, but my problem is that the 12-month window does not take ISO week 1 of the year into account. For example:
SELECT
*,
avg(units) OVER (
ORDER BY to_date(year::text || '-' || week::text, 'IYYY-IW')
RANGE between interval '12 months' preceding and current row)
FROM
rolling_year_table
order by year,week;
Basically, ISO week 1 of 2020 (which actually starts on '2019-12-30') is not taken into account in the calculations.
Is there a way to say 12 months preceding and current row but using ISO weeks?
Thanks,
This is too long for a comment.
I don't think there is an easy way to do this. The problem is the discrepancy between "week" and "year". There are about 52.2 weeks in a 12-month period, so what you are asking is that the "12 month" period sometimes has 52 weeks and sometimes 53.
I think you could do a cumulative calculation based on the past 52 weeks and then use conditional logic to include the 53rd week previous. The problem is . . . I don't know what the exact rules are for going back 53 weeks.
If the only concern is that in the 53rd week of a year then the entire year should be included, then that would be pretty easy to include. The pseudo code for that would be:
(case when isoweek = 53
then avg() over (. . . range between '53 week' preceding and current row)
else avg() over (. . . range between '52 week' preceding and current row)
end)
EDIT:
I'm not 100% sure if this will work for your use-case. But I have an idea that might do what you want. That is to enumerate the weeks of the year as fractions of the year. So years with 52 weeks would have one enumeration and years with 53 weeks would have another.
This would look like:
select . . .,
avg(units) over (order by isoyear + (isoweek - 1) / weeks_in_year
range between 1 preceding and current row
)
from (select t.*,
extract(isoyear from dte) as isoyear,
extract(week from dte) as isoweek,
-- December 28 always falls in the last ISO week of its ISO year,
-- so its week number is the number of weeks in that year (52 or 53)
extract(week from make_date(extract(isoyear from dte)::int, 12, 28)) as weeks_in_year
from t
) t;
You would need to test this to see if it does what you really want. As I say at the beginning of this answer, "12 months ago" is not clearly defined for ISO weeks, but this may be a reasonable interpretation.
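To make the "fraction of the year" idea concrete, here is a small Python sketch of the ordering key, outside the database. The function names are invented for illustration, and the number of ISO weeks is taken from December 28, which always falls in the last ISO week of its ISO year.

```python
from datetime import date


def iso_weeks_in_year(iso_year: int) -> int:
    """Number of ISO weeks (52 or 53) in the given ISO year."""
    return date(iso_year, 12, 28).isocalendar()[1]


def year_fraction_key(d: date) -> float:
    """ISO year plus the fraction of that ISO year elapsed at the start of d's week."""
    iso_year, iso_week, _ = d.isocalendar()
    return iso_year + (iso_week - 1) / iso_weeks_in_year(iso_year)


print(iso_weeks_in_year(2019), iso_weeks_in_year(2020))  # 52 53
print(year_fraction_key(date(2019, 12, 30)))             # 2020.0 (ISO week 1 of 2020)
```

With this key, a frame of "1 preceding" spans one full ISO year regardless of whether that year has 52 or 53 weeks.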
Doesn't this do what you want?
select ry.*,
avg(units) over (
order by year * 100 + week
range between 100 preceding and current row
)
from rolling_year_table ry
order by year, week;
Say current row is year 2020 and week 45, this will take rows from the same week in 2019 to the current row.
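A quick way to see which rows that frame picks up is to compute the same key by hand. This Python sketch only mimics the RANGE comparison on the numeric key; week_key and in_window are names made up for the illustration.

```python
def week_key(year: int, week: int) -> int:
    """Same ordering key as in the query: year * 100 + week."""
    return year * 100 + week


def in_window(row_key: int, current_key: int, preceding: int = 100) -> bool:
    """True when row_key lies between current_key - preceding and current_key."""
    return current_key - preceding <= row_key <= current_key


current = week_key(2020, 45)
print(in_window(week_key(2019, 45), current))  # True: same week, previous year
print(in_window(week_key(2019, 44), current))  # False: one week too early
```

Because the frame is defined on key values rather than row counts, the jump from 201952 (or 201953) to 202001 does not matter.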

Dynamic scheduled query based on timestamp

I am trying to set up a scheduled query to run on the 1st of each month and capture one month of data. However, it should not be the previous month but 2 months previous, due to delays in data being loaded into the source table. The source table is partitioned by day on session_timestamp, so refining this as much as possible will help reduce query cost.
So far I have this:
WHERE
EXTRACT(YEAR
FROM
session_timestamp) = EXTRACT(YEAR
FROM
DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH))
AND EXTRACT(MONTH
FROM
session_timestamp) = EXTRACT(MONTH
FROM
DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH))
This seems a highly inelegant solution but was intended to address cases where a year boundary would be crossed. However I can see from the "This script will process * when run." area that this is going to query everything in 2020 and not just in May 2020.
As you have pointed out, your query doesn't narrow the partition filter down to the one month of data that you want to query.
You don't need the year trick, because the result of DATE_TRUNC(..., MONTH) still carries the year. Please try the filter below:
-- Last day of the month
DATE(session_timestamp) <= DATE_SUB(DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH), INTERVAL 1 DAY)
AND
-- First day of the month
DATE(session_timestamp) >= DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 2 MONTH), MONTH)
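If it helps to check the boundaries before scheduling, the same window can be computed in plain Python. This is a sketch with a hypothetical helper (month_window), not part of the query; it counts months from year 0 so that a run in January or February rolls back into the previous year.

```python
from datetime import date, timedelta
from typing import Tuple


def month_window(today: date, months_back: int = 2) -> Tuple[date, date]:
    """Return (first_day, last_day) of the calendar month `months_back` months before `today`."""
    index = today.year * 12 + (today.month - 1) - months_back
    first_day = date(index // 12, index % 12 + 1, 1)
    # First day of the following month, minus one day, is the last day of the target month.
    next_index = index + 1
    next_first = date(next_index // 12, next_index % 12 + 1, 1)
    return first_day, next_first - timedelta(days=1)


print(month_window(date(2020, 7, 1)))  # run on 1 July 2020 -> May 2020
print(month_window(date(2021, 1, 1)))  # run on 1 January 2021 -> November 2020
```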

Get last month data from first day until last day in Firebird

I am trying to query Firebird to get data from last month, from day 1 until last day (30 or 31 depending on the month). When I use the code below it gives me shifted dates from current, for example day 11/14/2017 until 12/13/2017.
The code:
WHERE DATE >= DATEADD(MONTH,-1, CURRENT_TIMESTAMP(2)) AND DATE<= 'TODAY'
The desired output is 11/01/2017 - 11/30/2017
What is the correct way to do it?
I don't use Firebird but I've used PostgreSQL fairly extensively and I think this should work:
WHERE
DATE BETWEEN dateadd(month, -1, CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE) + 1)
AND CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE)
Explanation
CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE) + 1 should go back to the first of this month, and dateadd with -1 month should take it to the first of the previous month. The upper bound, CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE), is in other words 12/13/2017 - 13 days, which should be the last day of November. Crossing my fingers. Good luck.
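The day-of-month subtraction can be sanity-checked outside Firebird. The Python sketch below uses the same trick for the last day and, as a slight variation on the SQL, applies it a second time to reach the first day of the previous month instead of calling dateadd; previous_month_bounds is a name invented here.

```python
from datetime import date, timedelta


def previous_month_bounds(today: date):
    """Return (first_day, last_day) of the month before `today`."""
    # Subtracting today's day-of-month lands on the last day of the previous month.
    last_day = today - timedelta(days=today.day)
    # Subtracting (day-of-month - 1) from that date lands on its first day.
    first_day = last_day - timedelta(days=last_day.day - 1)
    return first_day, last_day


print(previous_month_bounds(date(2017, 12, 13)))  # 2017-11-01 .. 2017-11-30
```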

Teradata SQL Same Day Prior Year in same Week

Need help figuring out how to find the date in the prior year that falls on the same 'day' of the week as today in Teradata. I.e., today is 12/1/15, a Tuesday; the same day last year was actually 12/2/2014, a Tuesday.
I tried using current_date - INTERVAL'1'Year but it returns 12/1/2014.
You can do this with a bit of math if you can convert your current date's "Day of the week" to a number, and the previous year's "Day of the week" to a number.
In order to do this in Teradata your best bet is to utilize the sys_calendar.calendar table. Specifically the day_of_week column. Although there are other ways to do it.
Furthermore, instead of using CURRENT_DATE - INTERVAL '1' YEAR, it's a good idea to use ADD_MONTHS(CURRENT_DATE, -12) since INTERVAL arithmetic will fail on 2012-02-29 and other Feb 29th leap year dates.
So, putting it together you get what you need with:
SELECT
ADD_MONTHS(CURRENT_DATE, -12)
+
(
(SELECT day_of_week FROM sys_calendar.calendar WHERE calendar_date = CURRENT_DATE)
-
(SELECT day_of_week FROM sys_calendar.calendar WHERE calendar_date = ADD_MONTHS(CURRENT_DATE, -12))
)
This is basically saying: take the current date's day-of-week number (3) and subtract from it last year's day-of-week number (2) to get 1. Add that to last year's date and you'll have the same day of the week as the current date.
I tested this for all dates between 01/01/2010 and CURRENT_DATE and it worked as expected.
Why don't you simply subtract 52 weeks?
current_date - 364
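The 52-week subtraction is easy to verify with the dates from the question. This Python sketch covers only the "subtract 364 days" approach, not the sys_calendar lookup; the function name is made up for the example.

```python
from datetime import date, timedelta


def same_weekday_last_year(d: date) -> date:
    """Go back exactly 52 weeks (364 days), which always lands on the same weekday."""
    return d - timedelta(weeks=52)


today = date(2015, 12, 1)               # a Tuesday
match = same_weekday_last_year(today)
print(match)                            # 2014-12-02
print(match.weekday() == today.weekday())  # True
```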
The SQL below will get you the abbreviated name for the day of the week. It's cumbersome, but it works across versions of Teradata.
SELECT CAST(CAST(ADD_MONTHS(CURRENT_DATE, -12) AS DATE FORMAT 'E3') AS CHAR(3)) AS LY_DayOfWeek
, CAST(CAST(CURRENT_DATE AS DATE FORMAT 'E3') AS CHAR(3)) AS CY_DayOfWeek
Dates are internally represented as integers in Teradata, as (YEAR - 1900) * 10000 + (MONTH * 100) + DAY. You may be able to do some creative arithmetic on that to figure out that the Tuesday matching 12/1/2015 in the previous year was 12/2/2014.
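As a small illustration of that integer form, the Python sketch below encodes and decodes a date assuming the layout (year - 1900) * 10000 + month * 100 + day. The function names are hypothetical, and the decoder assumes dates from 1900 onward.

```python
from datetime import date


def teradata_date_int(d: date) -> int:
    """Encode a date as (year - 1900) * 10000 + month * 100 + day."""
    return (d.year - 1900) * 10000 + d.month * 100 + d.day


def from_teradata_date_int(n: int) -> date:
    """Decode the integer form back into a date (assumes year >= 1900)."""
    year_offset, rest = divmod(n, 10000)
    month, day = divmod(rest, 100)
    return date(year_offset + 1900, month, day)


print(teradata_date_int(date(2015, 12, 1)))  # 1151201
print(from_teradata_date_int(1141202))       # 2014-12-02
```

Note that these integers are not day counts, so subtracting them directly does not give the number of days between two dates.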

Function for week of the month in mysql

I was looking for a simple function to get the week of the month (rather than the easy week of the year) in a mysql query.
The best I could come up with was:
WEEK(dateField) - WEEK(DATE_SUB(dateField, INTERVAL DAYOFMONTH(dateField)-1 DAY)) + 1
I'd love to know if I'm reinventing the wheel here, and if there is an easier and cleaner solution?
AFAIK, there is no standard for the first week of a month.
The first week of the year (under ISO 8601) is the week containing Jan 4th.
How do you define the first week of the month?
UPDATE:
You'll need to rewrite your query like this:
SELECT WEEK(dateField, 5) -
WEEK(DATE_SUB(dateField, INTERVAL DAYOFMONTH(dateField) - 1 DAY), 5) + 1
so that the year transitions are handled correctly, and the weeks start on Monday.
Otherwise, your query is fine.
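To experiment with this outside MySQL: as far as I understand it, WEEK(date, 5) numbers weeks from Monday, with days before the first Monday of the year in week 0, which is the same convention as strftime's %W. Under that assumption, the difference can be sketched in Python (helper names are made up):

```python
from datetime import date


def monday_week_of_year(d: date) -> int:
    """Monday-based week of year; days before the first Monday are week 0."""
    return int(d.strftime("%W"))


def week_of_month(d: date) -> int:
    """Week of year of d minus week of year of the 1st of d's month, plus 1."""
    first = d.replace(day=1)
    return monday_week_of_year(d) - monday_week_of_year(first) + 1


print(week_of_month(date(2016, 6, 2)))  # 1 (June 2016 started on a Wednesday)
print(week_of_month(date(2016, 6, 6)))  # 2 (first Monday of June 2016)
```

Since both dates are always in the same calendar year, the subtraction never straddles a year boundary.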
There is an alternative that's sometimes used in reporting databases. It's to create a table, let's call it ALMANAC, that has one row per date (the key), and has every needed attribute of the date that might be useful for reporting purposes.
In addition to the week of the month column, there could be a column for whether or not the date is a company holiday, and things like that. If your company had a fiscal year that starts in July or some other month, you could include the fiscal year, fiscal month, fiscal week, etc. that each date belongs to.
Then you write one program to populate this table out of thin air, given a range of dates to populate. You include all the crazy calendar calculations just once in this program.
Then, when you need to know the attribute for a date in some other table, you just do a join, and use the column. Yes, it's one more join. And no, this table isn't normalized. But it's still good design for certain very specific needs.
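A populating program along those lines can be quite small. The Python sketch below generates one row per date with a few example attributes; the column names, the 7-day week-of-month rule, and the convention of labelling a fiscal year by the calendar year in which it ends are all assumptions for illustration, to be replaced by whatever your reports need.

```python
from datetime import date, timedelta


def build_almanac(start: date, end: date, fiscal_start_month: int = 7):
    """Yield one dict per date in [start, end] with precomputed reporting attributes."""
    d = start
    while d <= end:
        rolls_over = fiscal_start_month > 1 and d.month >= fiscal_start_month
        yield {
            "date": d,
            "week_of_month": (d.day - 1) // 7 + 1,  # simple 7-day buckets (assumed rule)
            "iso_week": d.isocalendar()[1],
            "fiscal_year": d.year + 1 if rolls_over else d.year,  # assumed labelling
        }
        d += timedelta(days=1)


rows = list(build_almanac(date(2016, 6, 29), date(2016, 7, 2)))
for row in rows:
    print(row)
```

The rows would then be bulk-loaded into the ALMANAC table once, and every other query just joins to it.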
(Just to clarify for future readers: this answer to this old question was added because of a bounty that implied the current answer wasn't satisfying, so I added this solution/definition with additional "configuration options" for all kinds of situations.)
There is no standard definition for the week of the month, but a general solution would be the following formula, which you can configure to your needs:
select (dayofmonth(dateField) + 6 + (7 - "min_days_for_partial_week")
- (weekday(datefield) - "weekday_startofweek" + 7) MOD 7) DIV 7 as week_of_month;
where
"weekday_startofweek" has to be replaced by the weekday that you want to be the first day of the week (0 = Monday, 6 = Sunday).
"min_days_for_partial_week" is the number of days the first week has to have to count as week 1 (values 1 to 7). Common values will be 1 (the first day of the month is always week 1), 4 (this would be similar to "ISO week of the year", where the first week of the year is the week that contains the Thursday, so has at least 4 days), and 7 (week 1 is the first complete week).
This formula will return values 0 to 6. 0 means that the current week is a partial week that doesn't have enough days to count as week 1, and 6 can only happen when you allow partial weeks with fewer than 3 days to be week 1.
Examples:
If your first day of the week is Monday ("weekday_startofweek" = 0) and week 1 should always be a whole week ("min_days_for_partial_week" = 7), this will simplify to
select (dayofmonth(dateField) + 6 - weekday(datefield)) DIV 7 as week_of_month;
E.g., for 2016-06-02 you will get 0, because June 2016 started with a Wednesday and for 2016-06-06 you will get 1, since this is the first Monday in June 2016.
To emulate your formula, where the first day of the month is always week 1 ("min_days_for_partial_week" = 1), and the week starts with Sunday ("weekday_startofweek" = 6), this will be
select (dayofmonth(dateField) + 12 - (weekday(datefield) + 1) MOD 7) DIV 7 as week_of_month;
Although you might want to comment that properly to know in 2 years where your constants came from.
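For anyone who wants to try the two parameters without a database, here is the same formula as a Python sketch. Python's date.weekday() uses the same numbering as MySQL's WEEKDAY() (0 = Monday, 6 = Sunday); the function name is made up for the example.

```python
from datetime import date


def week_of_month(d: date, weekday_startofweek: int = 0,
                  min_days_for_partial_week: int = 1) -> int:
    """General week-of-month formula with configurable week start and partial-week rule."""
    # Position of d within its week, 0 for the configured first day of the week.
    offset = (d.weekday() - weekday_startofweek + 7) % 7
    return (d.day + 6 + (7 - min_days_for_partial_week) - offset) // 7


# Monday start, week 1 must be a complete week:
print(week_of_month(date(2016, 6, 2), 0, 7))   # 0
print(week_of_month(date(2016, 6, 6), 0, 7))   # 1
# Sunday start, the 1st of the month is always in week 1:
print(week_of_month(date(2014, 7, 17), 6, 1))  # 3
```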
My solution, with weeks starting on Sunday:
SELECT ( 1 + ((DATE_FORMAT( DATE_ADD(LAST_DAY( DATE_ADD('2014-07-17',
INTERVAL -1 MONTH)), INTERVAL 1 DAY),'%w')+1) +
(DATE_FORMAT('2014-07-17', '%d')-2) ) DIV 7) "week_of_month";
Might be 12 years too late, but in case anyone is still looking, I am using this calculation in bigquery MySQL for calculating the week in the month.
I'm using Monday as the first day of the week in my calculation:
case when
(case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7
else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) > 1 then -- check first day of month to decide if it's a complete week (starts on Monday)
case when EXTRACT(DAY FROM my_date) <= 7 then -- for incomplete week
case when
(case when EXTRACT(DAYOFWEEK FROM my_date) = 1 then 7 else EXTRACT(DAYOFWEEK FROM my_date)-1 end) - EXTRACT(DAY FROM my_date) =
(case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7
else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -1 then 1 -- incomplete week 1
else FLOOR(( EXTRACT(DAY FROM my_date) + (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7
else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -2 )/7)+1 end -- calculate week based on date
else FLOOR(( EXTRACT(DAY FROM my_date) + (case when EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) = 1 then 7
else EXTRACT(DAYOFWEEK FROM date_add(date(my_date), INTERVAL -(EXTRACT(DAY FROM my_date))+1 day)) -1 end) -2 )/7)+1 end -- calculate week based on date
else FLOOR((EXTRACT(DAY FROM my_date)-1)/7)+1 -- for complete week
end
The idea is to add the day difference (from Monday to whatever weekday the 1st of that month is) to the date, so that it divides correctly for weeks after the first.
For week 1 (day of month <= 7), I am calculating with day of week - date to find the end of the incomplete first week (when the 1st is not a Monday).
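If I'm reading the branches right, the nested CASE reduces to one expression: (day of month + Monday-based weekday of the 1st - 2) DIV 7 + 1. The Python sketch below mirrors the branches in one function and the compact form in another so the two can be compared over a date range; all names are invented for the comparison, and isoweekday() supplies the Monday = 1 .. Sunday = 7 numbering that the inner CASE builds by hand.

```python
from datetime import date, timedelta


def first_dow(d: date) -> int:
    """Monday-based day of week (1..7) of the first day of d's month."""
    return d.replace(day=1).isoweekday()


def week_of_month_verbose(d: date) -> int:
    """Mirror the branches of the CASE expression."""
    f = first_dow(d)
    if f > 1:  # month does not start on a Monday
        if d.day <= 7 and d.isoweekday() - d.day == f - 1:
            return 1  # still inside the incomplete first week
        return (d.day + f - 2) // 7 + 1
    return (d.day - 1) // 7 + 1  # month starts on a Monday


def week_of_month_compact(d: date) -> int:
    """Single-expression form of the same calculation."""
    return (d.day + first_dow(d) - 2) // 7 + 1


d = date(2020, 1, 1)
mismatches = 0
while d <= date(2023, 12, 31):
    if week_of_month_verbose(d) != week_of_month_compact(d):
        mismatches += 1
    d += timedelta(days=1)
print(mismatches)
```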