BigQuery partition on calendar week - google-bigquery

Every week, I get a new dataset that I need to insert in BigQuery. The data can arrive on any day of the week. Once the data is ingested, I want to query data that arrived last week.
One option is to use date as partitioning when the data arrived but then the developers would need to know the exact date when data arrived to query the partition.
Instead of this, while ingestion, I want to create an INTEGER column which represents the calendar week of the year. The format will be 202005 or 202153 where former represents fifth week of 2020 and latter represents second last week of year 2021.
Since this is an integer, the only option for partition seems to be range partitioning. For it, BigQuery is asking for a start, end and interval. What values should I define?
I can define the following but as you can imagine that this sounds wrong
start 202001
end 203054
inerval 1
Update:
It seems that bigquery will only create partitions for which it has data. I checked that by executing
#legacySQL
SELECT
project_id, dataset_id, table_id, partition_id, TIMESTAMP(creation_time/1000) AS creation_time
FROM [PROJECT_ID:DATASET_ID.TABLE_ID$__PARTITIONS_SUMMARY__]

Another option would be to still Partition by date - but not ingestion date or whatever date you have in mind, rather start date of respective week with the help of DATE_TRUNC function
DATE_TRUNC(your_date, WEEK)
Note: You even can define start day of the week
WEEK(): Truncates date_expression to the preceding week boundary, where weeks begin on WEEKDAY. Valid values for WEEKDAY are SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY.

Related

Amazon Redshift- How to get start date of the Week from existing daily date field from the table?

I am trying to get start date of the week from existing daily date field from the same table. For example daily dates from 05/08/2022 to 05/14/2022 , the start of the week date output need to come as 05/08/2022 for all days in the week. week start on Sunday.
Also similar thing require to first date of the Month and quarter(3 month division)
The date_trunc() function performs this operation - https://docs.aws.amazon.com/redshift/latest/dg/r_DATE_TRUNC.html

Hive - Query to get Saturday as week start date for a given date

I have an requirement in hive to calculate Saturday as week start date for a given date in hive sql.
Eg)
Date week_start
03-27-2021 03-27-2021
03-28-2021 03-27-2021
03-31-2021 03-27-2021
04-07-2021 O4-03-2021
04-09-2021. 04-03-2021
I tried using pmod and other date functions but not getting desired output. Any insight is much appreciated.
Hive offers next_day(), which can be adapted for this purpose. I think the logic you want is:
select date_add(next_day(date, 'SAT'), -7)
This is a little arcane. next_day() gets the next date after the argument date with a given day of the week. So, go to the next Saturday and then subtract 7 days for the start of the week.

BigQuery, Sum by week

I am using standard SQL and am trying to add the weekly sum for product usage by week.
Using code below, I was able to add to each row the respective week and year it falls into. How would I go about summing the totals for an item by week and outputting it in columns, say up to the last 8 weeks.
extract(week from Metrics_Date) as week, EXTRACT(YEAR FROM Metrics_Date) AS year
Image is my raw data with the week and year next to an item:
This image is of above raw data being analyzed further(grouping them together). Here is where I would want to add columns, current_week & firstday of week date, and a sum of that weeks totals.
Any help would be appreciated.
You don't need the extract() by the way, you can do truncation DATE_TRUNC(your_date, WEEK) and it will truncate it to the week, usually easier.
Also, because the result of the truncation is a date, you will have the first day of the week already.
The rest I believe you have it figured out already, but just in case:
SELECT DATE_TRUNC(your_date_field, WEEK) AS week, SUM(message_count) AS total_messages FROM your_table GROUP BY 1

Best way to break down by weeks in BigQuery

So what I'm looking to do is create a report that shows how many sales a company had on a weekly basis.
So we have a time field called created that looks like this:
2016-04-06 20:58:06 UTC
This field represents when the sale takes place.
Now lets say I wanted to create a report that gives you how many sales you had on a weekly basis. So the above example will fall into something like Week of 2016-04-03 (it doesn't have to exactly say that, I'm just going for the simplest way to do this)
Anyone have any advice? I imagine it involves using the UTEC_TO_xxxxxx functions.
The documentation advises to use standard SQL functions, like DATE_TRUNC():
SELECT DATE_TRUNC(DATE '2019-12-25', WEEK) as week;
you can use WEEK() function - it gives you week number
SELECT WEEK('2016-04-06 20:58:06 UTC')
if you need first day of the week - you can try something like
STRFTIME_UTC_USEC((UTC_USEC_TO_WEEK(TIMESTAMP_TO_USEC(TIMESTAMP('2016-05-02 20:58:06 UTC')), 0)),'%Y-%m-%d')
I had to add parentheses:
SELECT DATE_TRUNC(DATE('2016-04-06 20:58:06 UTC'), WEEK) as week;
This is quite an old question and things have moved on since.
In my case, I found that the old WEEK function is no longer recognised, so I had to instead use the EXTRACT function. The doc for it can be found here.
For me it was enough to just extract the ISOWEEK from the timestamp, which results in the week of the year (the ISOYEAR) as a number.
ISOWEEK: Returns the ISO 8601 week number of the datetime_expression. ISOWEEKs begin on Monday. Return values are in the range [1, 53]. The first ISOWEEK of each ISO year begins on the Monday before the first Thursday of the Gregorian calendar year.
So I did this:
SELECT EXTRACT(ISOWEEK FROM created) as week
And if you want to see the week's last day, rather than the week's number in a year, then:
SELECT last_day(datetime(created), isoweek) as week

How to group by week specifying end of week day?

I need to run a report grouped by week. This could be done by using group by week(date) but the client wants to set the day of the week that marks the end of week. So it can be Tuesday, Wednesday etc. How can I work this into a group by query?
The datetime column type is unix timestamp.
The WEEK() function takes an optional second parameter to specify the start of the week:
This function returns the week number for date. The two-argument form of WEEK() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53.
However, it can only be set to Sunday or Monday.
UPDATE: Further to the comments below, you may want to consider adding a new column to your table to act as a grouping field, based on WEEK(DATE_ADD(date INTERVAL x DAY)), as suggested in the comments. You may want to create triggers to automatically generate this values whenever the date field is updated, and when new rows are inserted. You would then be able to create a usable index on this new field as required.