How to retrieve first and last day of the previous month in Google BigQuery?

I want to get the first and last dates of the previous month so I can use them in a WHERE clause with a BETWEEN condition. It would look something like this:
WHERE
FirstSold_Date BETWEEN first_day_previous_month AND last_day_previous_month

Try this:
WHERE FirstSold_Date BETWEEN date_trunc(date_sub(current_date(), interval 1 month), month) AND last_day(date_sub(current_date(), interval 1 month), month)

I would not recommend between for this. Instead:
WHERE FirstSold_Date >= date_add(date_trunc(current_date, month), interval -1 month) and
FirstSold_Date < date_trunc(current_date, month)
The advantage of this approach is that the same logic works for timestamps and datetimes as well. Looking at the last date causes problems when times are involved.
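To see why the half-open range is safer once times are involved, here is a small Python sketch (not BigQuery; the function name and the sample values are made up for illustration). It mirrors both filters for a sale on the last afternoon of the previous month, assuming the date bounds are compared as midnight datetimes:

```python
from datetime import date, datetime, time, timedelta

def previous_month_bounds(today):
    """Return (first day of previous month, first day of current month)."""
    first_of_current = today.replace(day=1)
    # Step back to the 1st of the prior month, wrapping January to December.
    if first_of_current.month == 1:
        first_of_previous = first_of_current.replace(
            year=first_of_current.year - 1, month=12)
    else:
        first_of_previous = first_of_current.replace(
            month=first_of_current.month - 1)
    return first_of_previous, first_of_current

today = date(2022, 10, 27)            # illustrative "current date"
start, end = previous_month_bounds(today)
last_day = end - timedelta(days=1)    # 2022-09-30

sold_at = datetime(2022, 9, 30, 14, 30)   # a sale on the last afternoon

# Half-open range: start <= x < first day of the current month.
in_half_open = (datetime.combine(start, time.min) <= sold_at
                < datetime.combine(end, time.min))

# BETWEEN-style range: the upper bound is midnight of the last day (assumed).
in_between = (datetime.combine(start, time.min) <= sold_at
              <= datetime.combine(last_day, time.min))

print(start, end)      # 2022-09-01 2022-10-01
print(in_half_open)    # True
print(in_between)      # False
```

The BETWEEN-style check drops the row because 14:30 on the last day is later than midnight of that day, while the half-open check keeps it.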

Consider below
where date_trunc(FirstSold_Date, month) = date_trunc(date_sub(current_date, interval 1 month), month)

Related

How to get the first day of the previous month in SQL (BigQuery)

Would any of you know, and be willing to share, how to subtract a number of days from the current date (the data type is DATE) so that I get the first day of the previous month? Here is an example:
Current Date = '2022-10-27'
The date I want = '2022-09-01'
I know how to get the first day of the current month using this:
(CURRENT_DATE() - EXTRACT(DAY FROM CURRENT_DATE()) +1)
But I have no idea how to check how many days there were in the previous month and hence get the correct answer.
I thought that maybe DATE_TRUNC(CURRENT_DATE() - EXTRACT(DAY FROM CURRENT_DATE())) would work but I'm getting this error:
"No matching signature for function DATE_TRUNC for argument types: DATE"
So that's clearly not the way. Any suggestions please? :)

Try using a combination of DATE_TRUNC and DATE_SUB as follows:
select current_date() as curr_date,
date_sub(date_trunc(current_date(), MONTH), INTERVAL 1 MONTH) as lm_day_1
This returns the current date as curr_date and the first day of the previous month as lm_day_1.
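If you want to check the arithmetic outside BigQuery, here is a rough Python sketch (the function name is hypothetical). It never needs the number of days in the previous month: it goes to the first of the current month, steps back one day, and then snaps to the first of that month.

```python
from datetime import date, timedelta

def first_day_of_previous_month(d):
    """First day of the month before d's month."""
    first_of_current = d.replace(day=1)
    # One day before the 1st is always the last day of the previous month.
    last_of_previous = first_of_current - timedelta(days=1)
    return last_of_previous.replace(day=1)

# The example from the question.
print(first_day_of_previous_month(date(2022, 10, 27)))  # 2022-09-01
```

The same steps also handle the January case, where the previous month falls in the previous year.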

Dynamic month names as column aliases within Bigquery

I am trying to create a table in which a few columns are flags showing whether a specific order was placed in one of the previous few months relative to the current date (dynamic). I would like the column aliases / names to show the month and year, all relative to the current date. Below is an example of my current code:
SELECT
order_id
,CAST(o.placed_at_local AS DATE) > DATE_TRUNC(CURRENT_DATE(), MONTH) AS in_current_month
,CAST(o.placed_at_local AS DATE) BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH)
AND DATE_TRUNC(CURRENT_DATE(), MONTH) AS in_previous_month
,CAST(o.placed_at_local AS DATE) BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 2 MONTH), MONTH)
AND DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH) AS in_previous_2_month
FROM my_table
I would like to change the column names so they are dynamic and showing the relevant month and year, e.g. "in_current_month" becomes "April_2022", "in_previous_month" becomes "March_2022", "in_previous_2_month" becomes "February_2022" and so on.
Anyone have an idea how to do this in Bigquery?
Thanks!

Dynamic scheduled query based on timestamp

I am trying to set up a scheduled query to run on the 1st of each month and capture one month of data. However, it should not be the previous month but the month two months back, due to delays in data being loaded into the source table. The source table is partitioned by day on session_timestamp, so refining this as much as possible will help reduce query cost.
So far I have this:
WHERE
EXTRACT(YEAR
FROM
session_timestamp) = EXTRACT(YEAR
FROM
DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH))
AND EXTRACT(MONTH
FROM
session_timestamp) = EXTRACT(MONTH
FROM
DATE_SUB(CURRENT_DATE, INTERVAL 2 MONTH))
This seems a highly inelegant solution but was intended to address cases where a year boundary would be crossed. However I can see from the "This script will process * when run." area that this is going to query everything in 2020 and not just in May 2020.

As you have pointed out, your query doesn't narrow the partition filter down to the one month of data which you want to query.
You don't need the year trick, because the result of DATE_TRUNC(..., MONTH) still carries the year. Please try the filter below:
-- Last day of the month
DATE(session_timestamp) <= DATE_SUB(DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH), INTERVAL 1 DAY)
AND
-- First day of the month
DATE(session_timestamp) >= DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 2 MONTH), MONTH)
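As a sanity check on those two bounds, here is a minimal Python sketch of the same arithmetic. The helper names are hypothetical and the run dates are assumptions picked for illustration (a run on 1 July 2020 would target May 2020).

```python
from datetime import date, timedelta

def shift_to_month_start(d, months_back):
    """First day of the month that lies months_back months before d's month."""
    # Count months since year 0 so that year boundaries need no special case.
    index = d.year * 12 + (d.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)

def two_months_back_range(run_date):
    """Inclusive (first_day, last_day) of the month two months before run_date."""
    first_day = shift_to_month_start(run_date, 2)
    # Day before the 1st of the previous month = last day of the target month.
    last_day = shift_to_month_start(run_date, 1) - timedelta(days=1)
    return first_day, last_day

first_day, last_day = two_months_back_range(date(2020, 7, 1))
print(first_day, last_day)  # 2020-05-01 2020-05-31
```

A run on 1 January lands in November of the previous year without any separate year comparison, which is the point made above about DATE_TRUNC.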

Getting records from the past x weeks

I've been fighting with an issue about querying the records where created_at is within the current, or past x weeks. Say, today is Wednesday, so that'd be from Monday at midnight, up to now, if x = 1. If x > 1, I'm looking for current week, up to today, or past week, but not using regular interval '1 week' as that'll get me Wednesday to Wednesday, and I'm only looking into "whole" weeks.
I've tried the interval-solution, and also things like WHERE created_at > (CURRENT_DATE - INTERVAL '5 week').
A solution that'll work for both day, month, year etc, would be preferred, as I'm actually building the query through some other backend logic.
I'm looking for a generic query for "Find everything that's been created 'x periods' back".
Edit:
Since last time, I've implemented this in my Ruby on Rails application. This has caused some problems when using HOUR. The generated query works for MONTH, DAY, and YEAR, but not for HOUR.
SELECT "customer_uses".*
FROM "customer_uses"
WHERE (customer_uses.created_at > DATE_TRUNC('MONTH', TIMESTAMP '2017-09-17T16:45:01+02:00') - INTERVAL '1 MONTH')
Which works correctly on my test cases. (Checking count of this). The TIMESTAMP is generated by DateTime.now to ensure my test-cases working with a time-override for "time-travelling"-tests, therefore not using the built in function.
(I've stripped away some extra WHERE-calls which should be irrelevant).
Why HOUR isn't working is a mystery to me, as I'm using it with an interpolated string for HOUR instead of MONTH as above, like so:
SELECT "customer_uses".*
FROM "customer_uses"
WHERE (customer_uses.created_at > DATE_TRUNC('HOUR', TIMESTAMP '2017-09-17T16:45:21+02:00') - INTERVAL '1 HOUR')

Your current suggested query is almost right, except that it uses the current date instead of the start of the week:
SELECT * FROM your_table WHERE created_at > (CURRENT_DATE - INTERVAL '5 week')
Instead, we can check 5 week intervals backwards, but from the start of the current week:
SELECT *
FROM your_table
WHERE created_at > DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '5 week';
I believe you should be able to use the above query as a template for other time periods, e.g. a certain number of months. Just replace the unit in DATE_TRUNC and in the interval of the WHERE clause.
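For what it's worth, the cutoff that this truncate-then-subtract pattern produces can be sketched in Python. This assumes weeks start on Monday, as DATE_TRUNC('week', ...) does in PostgreSQL; the function name and the sample timestamp are made up.

```python
from datetime import datetime, timedelta

def whole_weeks_cutoff(now, weeks_back):
    """Monday 00:00 of the current week, moved back by weeks_back whole weeks."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # weekday() is 0 for Monday, so this lands on Monday at midnight.
    start_of_week = start_of_today - timedelta(days=now.weekday())
    return start_of_week - timedelta(weeks=weeks_back)

now = datetime(2017, 9, 20, 16, 45)   # a Wednesday afternoon
print(whole_weeks_cutoff(now, 0))     # 2017-09-18 00:00:00
print(whole_weeks_cutoff(now, 5))     # 2017-08-14 00:00:00
```

Both cutoffs fall on a Monday at midnight rather than on a Wednesday afternoon, which is the "whole weeks" behaviour asked for.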

Group SQL results by week and specify "week-ending" day

I'm trying to select data grouped by week, which I have working, but I need to be able to specify a different day as the last day of the week. I think something needs to go near INTERVAL (6-weekday('datetime')) but not sure. This kind of SQL is above my pay-grade ($0) :P
SELECT
sum(`value`) AS `sum`,
DATE(adddate(`datetime`, INTERVAL (6-weekday(`datetime`)) DAY)) AS `dt`
FROM `values`
WHERE id = '123' AND DATETIME BETWEEN '2010-04-22' AND '2010-10-22'
GROUP BY `dt`
ORDER BY `datetime`
Thanks!

SELECT
sum(`value`) AS `sum`,
CASE WHEN (weekday(`datetime`) <= 3) THEN date(`datetime` + INTERVAL (3 - weekday(`datetime`)) DAY)
ELSE date(`datetime` + INTERVAL (3 + 7 - weekday(`datetime`)) DAY)
END AS `dt`
FROM `values`
WHERE id = '123' AND `datetime` BETWEEN '2010-04-22' AND '2010-10-22'
GROUP BY `dt`
ORDER BY `datetime`
This does look pretty evil, but the query gives you the sum of value grouped by week ending on a Thursday (weekday() returns 3 for Thursday).
If you wish to change which day ends the week, you just need to replace the 3s in the CASE expression, i.e. if you wanted Tuesday (weekday() returns 1) you would have it say
CASE WHEN (weekday(`datetime`) <= 1) THEN date(`datetime` + INTERVAL (1 - weekday(`datetime`)) DAY)
ELSE date(`datetime` + INTERVAL (1 + 7 - weekday(`datetime`)) DAY)
END AS `dt`
I hope this helps.
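A quick way to convince yourself that the CASE expression picks the right day is to mirror it in Python, where date.weekday() uses the same numbering as MySQL's WEEKDAY() (0 = Monday ... 6 = Sunday). This is only a sketch with made-up function names; the second function shows that the two branches collapse into one modulo step.

```python
from datetime import date, timedelta

def week_ending_case(d, end_weekday):
    """Mirror of the CASE expression: the next end_weekday on or after d."""
    wd = d.weekday()
    if wd <= end_weekday:
        return d + timedelta(days=end_weekday - wd)
    return d + timedelta(days=end_weekday + 7 - wd)

def week_ending_mod(d, end_weekday):
    """Same result with a single modulo instead of two branches."""
    return d + timedelta(days=(end_weekday - d.weekday()) % 7)

THURSDAY = 3
print(week_ending_case(date(2010, 4, 19), THURSDAY))  # 2010-04-22 (Mon -> Thu)
print(week_ending_case(date(2010, 4, 22), THURSDAY))  # 2010-04-22 (Thu stays)
print(week_ending_case(date(2010, 4, 23), THURSDAY))  # 2010-04-29 (Fri -> next Thu)
```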
Simple solution that I like. This will return the date for the start of the week assuming the week ends Sunday and starts Monday.
DATE(`datetime`) - INTERVAL WEEKDAY(`datetime`) DAY AS `dt`
This can easily be adjusted to have a week ending on Thursday because Thursday is 3 days earlier than Sunday
DATE(`datetime`) - INTERVAL WEEKDAY(`datetime` + INTERVAL 3 DAY) DAY AS `dt`
This returns the start of a week that starts on Friday and ends on Thursday.
You can group on this no problem. If you want to get the end of the week instead of the start, do this
DATE(`datetime`) - INTERVAL (-6 + WEEKDAY(`datetime` + INTERVAL 3 DAY)) DAY AS `dt`
I think you must choose between Sunday and Monday? Then you can use DATE_FORMAT to group by a string format of the date: use %v for weeks that start on Monday and %V for weeks that start on Sunday.
SELECT
sum(`value`) AS `sum`,
DATE_FORMAT(`datetime`,'%v.%m.%Y') AS `dt`
FROM `values`
WHERE id = '123' AND DATETIME BETWEEN '2010-04-22' AND '2010-10-22'
GROUP BY DATE_FORMAT(`datetime`,'%v.%m.%Y')
ORDER BY `datetime`
How to use DATE_FORMAT
I don't remember the exact math, but you can get WEEKDAY to wrap around on different days of the week by adding or subtracting days to its argument. You'll need to tinker with different values of x and y in the expression:
x-weekday(adddate(`datetime`, INTERVAL y DAY))
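One pair of values that seems to work is x = 6 and y = 6 - e, where e is the WEEKDAY() number of the day the week should end on (0 = Monday ... 6 = Sunday). That is my own working rather than something stated in the answers, so here is a small Python sketch (hypothetical function name) that checks it by brute force over a few months:

```python
from datetime import date, timedelta

def week_ending_shifted(d, end_weekday):
    """d + (x - weekday(d + y days)) days, with x = 6 and y = 6 - end_weekday."""
    x = 6
    y = 6 - end_weekday
    shifted_weekday = (d + timedelta(days=y)).weekday()
    return d + timedelta(days=x - shifted_weekday)

# Brute-force check: the result is always the requested weekday,
# and it is never before d nor more than 6 days after it.
for offset in range(120):
    d = date(2010, 4, 22) + timedelta(days=offset)
    for end_weekday in range(7):
        result = week_ending_shifted(d, end_weekday)
        assert result.weekday() == end_weekday
        assert 0 <= (result - d).days <= 6

print(week_ending_shifted(date(2010, 4, 23), 3))  # 2010-04-29
```

With e = 6 (Sunday) this gives y = 0, which reduces to the 6-weekday(`datetime`) expression from the question.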