Best way to break down by weeks in BigQuery - sql

I want to create a report that shows how many sales a company had on a weekly basis.
We have a timestamp field called created that looks like this:
2016-04-06 20:58:06 UTC
This field represents when the sale took place.
Each sale should be counted under its week, so the example above would fall into something like "Week of 2016-04-03" (the label doesn't have to say exactly that; I'm just going for the simplest way to do this).
Does anyone have any advice? I imagine it involves the UTC_USEC_TO_xxxxxx functions.

The documentation advises to use standard SQL functions, like DATE_TRUNC():
SELECT DATE_TRUNC(DATE '2019-12-25', WEEK) as week;
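To make the truncation concrete, here is a minimal Python sketch of the same weekly bucketing, assuming weeks start on Sunday (the default for DATE_TRUNC with WEEK). The function name week_start and the sample rows are hypothetical and only illustrate the arithmetic; this is not BigQuery code.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def week_start(ts):
    """Return the Sunday on or before ts as a date (assumes Sunday-based weeks)."""
    d = ts.date()
    days_since_sunday = (d.weekday() + 1) % 7  # weekday(): Monday=0 ... Sunday=6
    return d - timedelta(days=days_since_sunday)

# Hypothetical sale timestamps standing in for the `created` column.
sales = [
    datetime(2016, 4, 6, 20, 58, 6, tzinfo=timezone.utc),
    datetime(2016, 4, 9, 8, 0, 0, tzinfo=timezone.utc),
    datetime(2016, 4, 10, 12, 30, 0, tzinfo=timezone.utc),
]

# Count sales per week, keyed by the first day of the week.
weekly_counts = Counter(week_start(ts) for ts in sales)
for week, n in sorted(weekly_counts.items()):
    print(f"Week of {week}: {n} sale(s)")
```

In SQL the equivalent would be to group by the truncated date and count the rows in each group.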

In legacy SQL you can use the WEEK() function, which gives you the week number:
SELECT WEEK('2016-04-06 20:58:06 UTC')
If you need the first day of the week, you can try something like:
SELECT STRFTIME_UTC_USEC(UTC_USEC_TO_WEEK(TIMESTAMP_TO_USEC(TIMESTAMP('2016-05-02 20:58:06 UTC')), 0), '%Y-%m-%d')

I had to add parentheses:
SELECT DATE_TRUNC(DATE('2016-04-06 20:58:06 UTC'), WEEK) as week;

This is quite an old question and things have moved on since.
In my case, I found that the old WEEK function is no longer recognised, so I had to use the EXTRACT function instead. It is described in the BigQuery documentation.
For me it was enough to extract the ISOWEEK from the timestamp, which returns the week number within the ISO year.
ISOWEEK: Returns the ISO 8601 week number of the datetime_expression. ISOWEEKs begin on Monday. Return values are in the range [1, 53]. The first ISOWEEK of each ISO year begins on the Monday before the first Thursday of the Gregorian calendar year.
So I did this:
SELECT EXTRACT(ISOWEEK FROM created) as week
And if you want to see the week's last day, rather than the week's number in a year, then:
SELECT last_day(datetime(created), isoweek) as week
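As a cross-check of the ISO week logic quoted above, here is a small Python sketch. It assumes that the last day of an ISO week is its Sunday; the helper names are hypothetical and the code only mirrors the arithmetic, it is not BigQuery's implementation.

```python
from datetime import date, timedelta

def iso_week_number(d):
    """ISO 8601 week number (1..53); ISO weeks begin on Monday."""
    return d.isocalendar()[1]

def iso_week_last_day(d):
    """Sunday that closes the ISO week containing d."""
    return d + timedelta(days=7 - d.isoweekday())  # isoweekday(): Monday=1 ... Sunday=7

created = date(2016, 4, 6)  # date part of the question's sample timestamp
print(iso_week_number(created), iso_week_last_day(created))
```

Note that dates at the very start of January can belong to the last ISO week of the previous ISO year, so the week number alone is not a unique key across years.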

Related

INTERVAL handling in BigQuery's GENERATE_DATE_ARRAY

I'm using GENERATE_DATE_ARRAY to obtain all end-of-month (EOM) dates between two dates. According to the documentation, GENERATE_DATE_ARRAY supports the MONTH keyword in the INTERVAL part:
SELECT GENERATE_DATE_ARRAY('2020-12-31', '2022-03-31', INTERVAL 1 MONTH)
Unfortunately, the result looks buggy: after February, the process gets stuck and keeps 28 as the day of the month until the final date.
Is there something I'm missing? Or maybe this is a bug?
Also consider the approach below (note the use of the LAST_DAY function):
select last_day(day, month) from
unnest(generate_date_array('2021-01-01', '2022-04-01', interval 1 month)) day
As pointed out in @Jaytiger's comment, this may be expected behaviour, although it is not clearly documented. The documentation of some date functions states (see for example the DATE_ADD docs):
Special handling is required for MONTH, QUARTER, and YEAR parts when
the date is at (or near) the last day of the month. If the resulting
month has fewer days than the original date's day, then the resulting
date is the last date of that month.
As a workaround, if you want to obtain the EOM dates, this approach may be used:
SELECT DATE_SUB(day, INTERVAL 1 DAY) FROM
UNNEST(GENERATE_DATE_ARRAY('2021-01-01', '2022-04-01', INTERVAL 1 MONTH)) day
In other words, instead of generating the EOM dates, it is best to generate the start-of-month dates and then subtract a day.
Another approach is to use the method suggested by @Mikhail in his reply.
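The following Python sketch illustrates why the day gets stuck at 28 and why the start-of-month workaround avoids it. It assumes that each array element is derived from the previous one with month-end clamping, which matches the output described in the question but is not confirmed as BigQuery's actual implementation. All function names are hypothetical.

```python
import calendar
from datetime import date, timedelta

def add_one_month(d):
    """Add one month, clamping the day to the length of the target month."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    last = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last))

def chained_month_steps(start, count):
    """Assumed model: each step starts from the previous (possibly clamped) result."""
    out = [start]
    for _ in range(count - 1):
        out.append(add_one_month(out[-1]))
    return out

def end_of_month_dates(first_of_month, count):
    """Workaround: step through first-of-month dates, then subtract one day."""
    firsts = chained_month_steps(first_of_month, count)
    return [d - timedelta(days=1) for d in firsts]

# Once February clamps the day to 28, every later step keeps day 28.
print(chained_month_steps(date(2020, 12, 31), 4))
# Day 1 exists in every month, so nothing is ever clamped.
print(end_of_month_dates(date(2021, 1, 1), 4))
```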

BigQuery partition on calendar week

Every week, I get a new dataset that I need to insert in BigQuery. The data can arrive on any day of the week. Once the data is ingested, I want to query data that arrived last week.
One option is to partition by the date on which the data arrived, but then the developers would need to know the exact arrival date to query the partition.
Instead, during ingestion, I want to create an INTEGER column that represents the calendar week of the year. The format will be 202005 or 202153, where the former represents the fifth week of 2020 and the latter represents the second last week of 2021.
Since this is an integer, the only option for partitioning seems to be range partitioning. For it, BigQuery asks for a start, an end and an interval. What values should I define?
I can define the following, but as you can imagine, this feels wrong:
start 202001
end 203054
interval 1
Update:
It seems that BigQuery only creates partitions for which it has data. I checked that by executing
#legacySQL
SELECT
project_id, dataset_id, table_id, partition_id, TIMESTAMP(creation_time/1000) AS creation_time
FROM [PROJECT_ID:DATASET_ID.TABLE_ID$__PARTITIONS_SUMMARY__]
Another option would be to still partition by date, but not by ingestion date or whatever date you have in mind. Instead, partition by the start date of the respective week, with the help of the DATE_TRUNC function:
DATE_TRUNC(your_date, WEEK)
Note: you can even define the start day of the week:
WEEK(<WEEKDAY>): Truncates date_expression to the preceding week boundary, where weeks begin on WEEKDAY. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.
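A minimal Python sketch of the week-start partition key, including the key for "last week", could look like this. The function names, the week_begins_on parameter and the sample date are hypothetical; the sketch only models truncation to a configurable week start and is not BigQuery code.

```python
from datetime import date, timedelta

# Same order as date.weekday(): Monday=0 ... Sunday=6.
WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY", "FRIDAY", "SATURDAY", "SUNDAY"]

def week_partition(d, week_begins_on="SUNDAY"):
    """Truncate d to the most recent day on which a week begins."""
    first = WEEKDAYS.index(week_begins_on)
    offset = (d.weekday() - first) % 7  # days elapsed since the week began
    return d - timedelta(days=offset)

def last_week_partition(today, week_begins_on="SUNDAY"):
    """Partition key of the week before the one containing today."""
    return week_partition(today, week_begins_on) - timedelta(days=7)

today = date(2021, 6, 16)  # a Wednesday, chosen only for illustration
print(week_partition(today, "MONDAY"), last_week_partition(today, "MONDAY"))
```

With this scheme a developer only needs the current date to find last week's partition, whatever day the data actually arrived on.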

Wrong US week number calculation for 1st jan using datepart

The SQL Server DATEPART function has two options to retrieve the week number:
ISO_WEEK and WEEK. I know the difference between the two. I want week numbers based on a Sunday start, as is standard in the US, i.e. WEEK. But it doesn't handle partial weeks the way I expected, e.g.
SELECT DATEPART(WEEK,'2015-12-31') --53
SELECT DATEPART(WEEK,'2016-01-01') --1
SELECT DATEPART(WEEK,'2016-01-03') --2
gives two different week numbers for a single week, divided in two years. I wanted to implement something like in the following link for week days.
Week numbers according to US standard
Basically I would like something like this;
SELECT DATEPART(WEEK,'2015-12-31') --1
SELECT DATEPART(WEEK,'2016-01-01') --1
SELECT DATEPART(WEEK,'2016-01-03') --2
EDIT:
Basically, I cannot accept a single week being split in two: I have to perform some calculations based on week numbers, so each week needs exactly one number.
If the above isn't possible, could week number one instead start from 2016-01-03? In that case I would want something like this:
SELECT DATEPART(WEEK,'2015-12-31') --53
SELECT DATEPART(WEEK,'2016-01-01') --53
SELECT DATEPART(WEEK,'2016-01-03') --1
If you want the US numbering, you can do this by taking the WEEK number of the end of the week rather than the date itself.
First ensure that the setting for the first day of the week is in fact Sunday on your system. You can verify this by running SELECT @@DATEFIRST; this should return 7 for Sunday. If it doesn't, run SET DATEFIRST 7; first.
SELECT
end_of_week=DATEADD(DAY, 7-(DATEPART(WEEKDAY, '20151231')), '20151231'),
week_day=DATEPART(WEEK, DATEADD(DAY, 7-(DATEPART(WEEKDAY, '20151231')), '20151231'));
This returns 2016/01/02 as the end of the week and 1 as the week number.
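The idea can be checked with a short Python sketch. It assumes a week numbering in which week 1 contains January 1 and a new week begins every Sunday, which reproduces the DATEPART(WEEK) values shown in the question; the function names are hypothetical and this is not SQL Server's implementation.

```python
from datetime import date, timedelta

def days_since_sunday(d):
    """0 for Sunday ... 6 for Saturday."""
    return (d.weekday() + 1) % 7

def us_week(d):
    """Week number where week 1 contains January 1 and weeks begin on Sunday."""
    jan1 = date(d.year, 1, 1)
    day_of_year = (d - jan1).days + 1
    return (day_of_year + days_since_sunday(jan1) - 1) // 7 + 1

def end_of_week(d):
    """Saturday that closes the Sunday-based week containing d."""
    return d + timedelta(days=6 - days_since_sunday(d))

def week_by_end_of_week(d):
    """Number every day of a week by the week number of its Saturday."""
    return us_week(end_of_week(d))

for d in (date(2015, 12, 31), date(2016, 1, 1), date(2016, 1, 3)):
    print(d, us_week(d), week_by_end_of_week(d))
```

Because every day of a week shares the same Saturday, the week that straddles New Year gets a single number.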
If you ask for the week number of a date, SQL Server returns the week number within the year of that input date.
That is why I think SQL Server treats '2015-12-31' as belonging to the last week of 2015.

SQL ORACLE Get week numbers from multiple datetime rows

I have 70.000 rows of data, including a date time column (YYYY-MM-DD HH24-MM-SS.).
I want to split this data into 3 separate columns; Hour, day and Week number.
The date time column name is 'REGISTRATIONDATE' from the table 'CONTRACTS'.
This is what I have so far for the day and hour columns:
SELECT substr(REGISTRATIONDATE, 0, 10) AS "Date",
substr(REGISTRATIONDATE, 11, 9) AS "Hour"
FROM CONTRACTS;
I have seen the options to get a week number for specific dates, but this assignment concerns 70.000 dates, so that is not an option.
You (the OP) still have to explain what week number to assign to the first few days in a year, until the first Monday of the year. Do you assign a week number for the prior calendar year? In a Comment I asked about January 1, 2017, as an example; that was a Sunday. The week from January 2 to January 8 of 2017 is "week 1" according to your definition; what week number do you assign to Sunday, January 1, 2017?
The straightforward calculation below assigns to it week number 0. Other than that, the computation is trivial.
Notes: To find the Monday of the week for any given date dt, we can use trunc(dt, 'iw'). iw stands for ISO Week, standard week which starts on Monday and ends on Sunday.
Then: To find the first Monday of the year, we can start with the date January 7 and ask for the Monday of the week in which January 7 falls. (The first Monday of any year falls somewhere between January 1 and January 7, so the Monday-based week that contains January 7 always begins on it.)
To input a fixed date, the best way is with the date literal syntax: date '2017-01-07' for January 7. Please check the Oracle documentation for "date literals" if you are not familiar with it.
So: to find the week number for any date dt, compute
1 + ( trunc(dt, 'iw') - trunc(date '2017-01-07', 'iw') ) / 7
This formula finds the Monday of the ISO Week of dt and subtracts the first Monday of the year - using Oracle date arithmetic, where the difference between two dates is the number of days between them. So to find the number of weeks we divide by 7; and to have the first Monday be assigned the number 1, instead of 0, we need to add 1 to the result of dividing by 7.
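The same computation can be sketched in Python to check the edge cases. The helper names are hypothetical, and taking January 7 of the date's own year (rather than the fixed 2017 date) is an assumption added here to generalize the formula.

```python
from datetime import date, timedelta

def monday_of_week(d):
    """Monday of the Monday-to-Sunday week containing d (like trunc(dt, 'iw'))."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday=0 ... Sunday=6

def week_number(d):
    """1 for the week of the year's first Monday; 0 for earlier days in January."""
    first_monday = monday_of_week(date(d.year, 1, 7))
    return 1 + (monday_of_week(d) - first_monday).days // 7

for d in (date(2017, 1, 1), date(2017, 1, 2), date(2017, 1, 8), date(2017, 1, 9)):
    print(d, week_number(d))
```

Sunday, January 1, 2017 comes out as week 0, as described above, because its Monday lies one week before the first Monday of the year.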
The other issue you will have to address is to convert your strings into dates. The best solution would be to fix the data model itself (change the data type of the column so that it is DATE instead of VARCHAR2); then all the bits of data you need could be extracted more easily, you would make sure you don't have dates like '2017-02-29 12:30:00' in your data (currently, if you do, you will have a very hard time making any date calculations work), queries will be a lot faster, etc. Anyway, that's an entirely different issue so I'll leave it out of this discussion.
Assuming your REGISTRATIONDATE is formatted as 'MM/DD/YYYY',
the simplest (and fastest) query is based on to_char(to_date(REGISTRATIONDATE,'MM/DD/YYYY'),'WW')
(otherwise, convert your column to a proper date and then perform the conversion to a week number):
SELECT substr(REGISTRATIONDATE, 0, 10) AS "Date",
substr(REGISTRATIONDATE, 11, 9) AS "Hour",
to_char(to_date(REGISTRATIONDATE,'MM/DD/YYYY'),'WW') as "Week"
FROM CONTRACTS;
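For comparison, here is a hedged Python sketch that splits one string into date, hour and week number. It assumes the strings follow the 'YYYY-MM-DD HH24-MI-SS' layout described in the question and that the week is counted the way Oracle's 'WW' does, in seven-day blocks starting on January 1. The function name and the sample value are hypothetical.

```python
from datetime import datetime

def split_registration(value):
    """Return (date string, hour, WW-style week number) for one timestamp string."""
    ts = datetime.strptime(value, "%Y-%m-%d %H-%M-%S")
    day_of_year = ts.timetuple().tm_yday
    ww = (day_of_year - 1) // 7 + 1  # days 1..7 -> week 1, days 8..14 -> week 2, ...
    return ts.date().isoformat(), ts.hour, ww

# Hypothetical sample value in the assumed layout.
print(split_registration("2017-01-09 13-45-10"))
```

Parsing fails loudly on malformed strings, which is also a quick way to locate rows that cannot be converted to dates.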
This is messy, but it looks like it works:
to_char(
to_date(RegistrationDate,'YYYY-MM-DD HH24-MI-SS') +
to_number(to_char(trunc(to_date(RegistrationDate,'YYYY-MM-DD HH24-MI-SS'),'YEAR'),'D'))
- 2,
'WW')
On the outside you have the solution previously given by others, but using the correct date format. In the middle there is an adjustment of a certain number of days to account for where the 1st of January falls. The trunc part gets the first of January from the date, and the 'D' gets the weekday of the 1st of January. Since 1 represents Sunday, we have to use -2 to get what we need.
EDIT: I may delete this answer later, but it looks to me that the one from @mathguy is the best. See also the comments on that answer for how to extend it to a general solution.
But first you need to:
Decide what to do with dates in January before the first Monday, and
Resolve the underlying problems in the data which prevent it from being converted to dates.
On point 1, if assigning week 0 is not acceptable (you want week 52/53) it gets a bit more complicated, but we'll still be able to help.
As I see it, on point 2, either there is something systematically wrong (perhaps they are timestamps and include fractions of a second) or there are isolated cases of invalid data.
Either the length, or the format, or the specific values don't compute. The error message you got suggests that at least some of the data is "too long", and the code in my comment should help you locate that.

Teradata Change format of Week Number

I'm pretty new to SQL, so I hope this isn't a dumb question; I tried to google it but couldn't find anything.
I'm summing sales of departments per week in SQL and am using TD_SYSFNLIB.WEEKNUMBER_OF_YEAR (trans_dt) to get the week number.
I think everything is working, except that I'd like to change the format of the weeks to the start date of the week, e.g. week 1 = 1/4/15.
Also, I'm not sure how to handle week 0 at the very start of the year, since I think it should be grouped with week 52 of the previous year.
The following date math trick should get you Beginning of Week as an actual date without having to join to the SYS_CALENDAR view or using a function:
SELECT CURRENT_DATE - ((CURRENT_DATE - DATE '0001-01-07') MOD 7) AS BOW;
Starting with TD14 there's NEXT_DAY, which returns the first occurrence of the given weekday after a date; if you subtract 7 days first, you get the most recent occurrence of that weekday instead:
next_day(trans_dt - 7, 'sunday')
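Both expressions can be sanity-checked with a Python sketch. It assumes that 0001-01-07 is a Sunday (true in Python's proleptic Gregorian calendar) and that NEXT_DAY returns the first matching weekday strictly after its argument. The function names are hypothetical and this is not Teradata code.

```python
from datetime import date, timedelta

ANCHOR_SUNDAY = date(1, 1, 7)  # 0001-01-07, a Sunday in the proleptic Gregorian calendar

def beginning_of_week(d):
    """Mirror of d - ((d - DATE '0001-01-07') MOD 7): the Sunday on or before d."""
    return d - timedelta(days=(d - ANCHOR_SUNDAY).days % 7)

def previous_sunday_via_next_day(d):
    """Model of next_day(d - 7, 'sunday'): first Sunday strictly after d - 7."""
    probe = d - timedelta(days=7)
    days_ahead = (6 - probe.weekday()) % 7 or 7  # a full week ahead when probe is a Sunday
    return probe + timedelta(days=days_ahead)

d = date(2015, 1, 7)  # a Wednesday, chosen only for illustration
print(beginning_of_week(d), previous_sunday_via_next_day(d))
```

Both functions map every day of the first week of January 2015 that starts on a Sunday to 2015-01-04, matching the "week 1 = 1/4/15" example in the question.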