SQL: MTD Calculation on specific dates - sql

I am trying to calculate MTD sales from daily sales numbers, but my month runs from the 26th of the previous month to the 25th of the current month.
The data contains only 3 columns (date, vendor_id, total_sales).
The code below works fine for months that start on the 1st. I tried the following approaches for the 26th-to-25th month, but neither works:
date(date - interval '25 day'): not working
Creating a mapping table for each day: will not work for 30/31-day months
Any suggestions would be appreciated.
SUM(sales) OVER (
PARTITION BY
vendor_id,
EXTRACT(YEAR FROM date),
EXTRACT(MONTH FROM date)
ORDER BY
date ROWS UNBOUNDED PRECEDING
) AS mtd_total_sales,

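So, if the day of the month is >= 26, then the date "is in the next month".
As a result, something like
CASE
WHEN EXTRACT(DAY FROM date) >= 26 THEN ADD_MONTHS(date, 1)
ELSE date
END
should suffice, since you extract the year and month anyway: use this expression in place of date inside the two EXTRACT calls of the PARTITION BY. If your database has no ADD_MONTHS, date + INTERVAL '1 month' is the PostgreSQL-style equivalent.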
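To sanity-check the 26th-to-25th rule outside the database, here is a minimal Python sketch of the same idea. It is only an illustration: the names fiscal_month and mtd_totals are made up, and the rows are assumed to be (date, vendor_id, total_sales) tuples like the table in the question.

```python
from datetime import date
from itertools import groupby

def fiscal_month(d):
    """Return (year, month) of the 26th-to-25th period that contains d."""
    if d.day >= 26:
        # Days 26..31 count toward the following calendar month's period.
        return (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return (d.year, d.month)

def mtd_totals(rows):
    """Yield (date, vendor_id, total_sales, mtd_total) with a running sum per vendor and period."""
    # Sort by vendor, then date, so each (vendor, period) group is contiguous.
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    for _, group in groupby(ordered, key=lambda r: (r[1], fiscal_month(r[0]))):
        running = 0
        for day, vendor_id, sales in group:
            running += sales
            yield day, vendor_id, sales, running

# Hypothetical sample data: the running total restarts on the 26th.
sample = [
    (date(2023, 1, 24), 1, 10),
    (date(2023, 1, 25), 1, 20),
    (date(2023, 1, 26), 1, 5),
    (date(2023, 1, 27), 1, 7),
]
mtd = [row[3] for row in mtd_totals(sample)]  # [10, 30, 5, 12]
```

Note the December case: the 26th to 31st of December roll over into January of the next year, which ADD_MONTHS also handles in the SQL version.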

Related

How to find specific holiday date that varies by year in BigQuery?

I'm trying to query for dates that come the week before Labor Day. Labor Day is a US federal holiday on the first Monday of every September. I'm trying to select labor day and the full week before labor day. I'm somewhat close with this sort of query
select * from `bigquery-public-data.ghcn_d.ghcnd_1991`
where extract(month from date) = 9
and extract(dayofweek from date) = 2
and extract(week from date) = 36
But it's not always the 36th week; sometimes it's the 35th (so the query above is wrong).
I'm guessing I'll have to do a date subtraction in order to get the full week before Labor Day...but for now I just need help finding how to query the first Monday of every September.
This is one approach:
select t.*
from `bigquery-public-data.ghcn_d.ghcnd_1991` t
join (
select format_date('%Y%m',date) as yr_mo,
date_sub(min(date), interval 1 week) as week_before,
min(date) as first_monday
from `bigquery-public-data.ghcn_d.ghcnd_1991` v
where extract(dayofweek from date) = 2
and extract(month from date) = 9
group by format_date('%Y%m',date)
) v
on t.date between v.week_before and v.first_monday;
This presumes you want all rows of the table where the date is Labor Day or within the week leading to Labor Day.
Because Labor Day is the first Monday of September, you can take the MIN of date within a CTE or a subquery to get Labor Day and the prior week (this relies on the table holding a single year of data, as the ghcnd_1991 name suggests):
WITH labor_day as (
select MIN(date) date
from `bigquery-public-data.ghcn_d.ghcnd_1991`
where extract(month from date) = 9
and extract(dayofweek from date) = 2
)
SELECT distinct ghcnd.date
FROM `bigquery-public-data.ghcn_d.ghcnd_1991` ghcnd
INNER JOIN labor_day
on ghcnd.date between labor_day.date-7 and labor_day.date
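The "first Monday of September" rule can also be computed directly, without scanning a table. The following is a minimal Python sketch of that calculation, not part of either query above; the function names labor_day and labor_day_window are made up for illustration.

```python
from datetime import date, timedelta

def labor_day(year):
    """Return the first Monday of September for the given year."""
    sept_first = date(year, 9, 1)
    # weekday() is 0 for Monday, so this is the number of days until the next Monday.
    offset = (0 - sept_first.weekday()) % 7
    return sept_first + timedelta(days=offset)

def labor_day_window(year):
    """Return (start, end): the 7 days before Labor Day through Labor Day itself."""
    end = labor_day(year)
    return end - timedelta(days=7), end

start_1991, end_1991 = labor_day_window(1991)  # 26 Aug 1991 to 2 Sep 1991
```

The window matches the `between labor_day.date-7 and labor_day.date` condition used in the query above.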

Exclude partial weeks from results BQ SQL

I am trying to find an easy way to exclude partial weeks from the results.
What I have so far:
WITH a AS (SELECT
FORMAT_DATE("%G-%V", created_date) as report_week
, created_date
, FORMAT_DATE('%A', created_date) AS day
, emp_id
, ROUND(SAFE_DIVIDE(SUM(working_time),3600),2) as hours
FROM `table1` a
WHERE created_date >= current_date()-10
GROUP BY 1,2,3,4)
SELECT
report_week
, emp_id
, hours
FROM a
WHERE day LIKE '%Monday%'
GROUP BY 1,2,3
ORDER BY report_week ASC
Input:
report_week: conversion of employee's shift date into week
created_date: date of employee's shift
day: conversion of date of employee's shift into day of week (Monday, Tuesday..)
emp_id: the employee's ID
hours: Number of worked hours by the employee
If current_date is 19 April 2022, then current_date()-10 is 9 April 2022.
Output:
The desired output is to return the number of hours worked for each employee during the full week 11 - 17 April only (it would exclude 9th, 10th, 18th and 19th of April from the results).
To obtain this, I tried to keep only weeks starting on a Monday with WHERE day LIKE '%Monday%', but in the example this would also return the number of hours worked by each employee on the 18th and 19th (since the 18th is a Monday). And if I combine this clause with AND (for example WHERE day LIKE '%Monday%' AND day LIKE '%Sunday%'), it does not work at all.
Additionally, I see another potential problem here. If a Monday is a day off (like during Easter), then no employees will have hours on that Monday and the rest of the week will not be returned.
My question: Is there an easy way to get only full weeks (Monday-Sunday) regardless of the date range chosen?
Thank you in advance.
Best,
Fabien
You can use UNNEST with a generated array to build the range of dates, DATE_TRUNC to find the week each date belongs to, and LAST_DAY (or date arithmetic, as in the example) to get the last day of that week. Counting the days per week then tells you which weeks are complete.
See this example:
with sample_data as (
SELECT date FROM UNNEST(generate_timestamp_array('2022-04-09 00:00:00', '2022-04-19 00:00:00', INTERVAL 1 DAY)) as date
),
counting as(
select
DATE_ADD(DATE (date), INTERVAL 1 DAY) date
, DATE_TRUNC(DATE(date), WEEK)+1 week_start
, DATE_TRUNC(DATE(date), WEEK) +7 week_end
from sample_data
)
select b.date from (
select week_start,count(*) as ndays
from counting
group by week_start
having ndays=7
) a
join counting b on a.week_start=b.week_start
where timestamp(b.date) between timestamp(b.week_start) and timestamp(b.week_end)
I used the same date range as in your example.
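The underlying idea (keep a week only if all seven of its days fall inside the chosen range) can be checked with a few lines of Python. This is a minimal sketch under that assumption, not a translation of the query; the name full_weeks is made up for illustration.

```python
from datetime import date, timedelta

def full_weeks(start, end):
    """Return the Monday of every Monday-to-Sunday week lying entirely within [start, end]."""
    # Move forward to the first Monday on or after start (weekday() is 0 for Monday).
    monday = start + timedelta(days=(0 - start.weekday()) % 7)
    mondays = []
    # A week is complete only if its Sunday is still inside the range.
    while monday + timedelta(days=6) <= end:
        mondays.append(monday)
        monday += timedelta(days=7)
    return mondays

# The range from the question: 9 April 2022 to 19 April 2022.
weeks = full_weeks(date(2022, 4, 9), date(2022, 4, 19))  # [date(2022, 4, 11)]
```

Because the weeks are derived from the date range rather than from the rows themselves, a Monday with no recorded hours (a holiday, for instance) does not cause the rest of that week to be dropped.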

BigQuery SQL to change start date and end date into groups of months

I work with a hotel client that has a BigQuery database containing hotel booking data. I've shared the relevant columns in the image below, which list the name of each hotel, the arrival date of the guest, the departure date, and the revenue generated from each booking:
My problem statement is that I have to showcase how many rooms have been booked, and how much revenue has been made for each hotel every month where my final grid would look similar to this:
The important points to remember are:
the depart_dt - arrival_dt are the number of nights that the guest is staying
the Rez_rate_total / (depart_dt - arrival_dt) is the revenue made per night
My problem here is figuring out how to change the start date and end date columns into groups of months. The challenge comes when a guest arrives in one month and leaves in the next. For example, Row 5 in the original data has the guest arriving on 18th July and leaving on 1st Aug, so 13 days of the stay and 13 days of revenue have to be included in July and 1 day has to be included in August.
I haven't used SQL in a while so this is as far as I got:
WITH
temp_table AS (
SELECT
hotel_long_nm,
arrival_dt,
depart_dt,
DATE_DIFF(depart_dt, arrival_dt, day) AS room_nights,
rez_rate_total
FROM
`DATABASE.analytics.bookings` )
SELECT
*
FROM
temp_table
Any help would be greatly appreciated!
Consider the following approach:
with bookings as (
select hotel_long_nm, date(arrival_dt) as arrival_dt, date(depart_dt) as depart_dt, rez_rate_total from project.dataset.bookings
),
tmp as (
-- expose the dates in the reservation (excluding last day of reservation)
select *, generate_date_array(arrival_dt,date_sub(depart_dt, interval 1 day)) as stay_dates from bookings
),
calc as (
-- unnest and calculate the daily rate
select
hotel_long_nm,
stay_dt,
1 as stay_nights,
rez_rate_total/array_length(stay_dates) as rez_rate_daily
from tmp
left join unnest(stay_dates) as stay_dt
),
agg as (
-- aggregate to the year-month level
select
date_trunc(stay_dt, month) as year_month,
hotel_long_nm,
sum(stay_nights) as room_nights,
round(sum(rez_rate_daily),2) as rez_rate_total
from calc
group by 1,2
)
select * from agg
order by hotel_long_nm, year_month
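To check the per-month split by hand, the same "one row per night, then aggregate by month" logic can be sketched in Python. This is an illustration only: the function name nights_by_month and the sample dates and rate are made up, and it follows the convention of the query above, where the departure day is not counted as a night.

```python
from collections import defaultdict
from datetime import date, timedelta

def nights_by_month(arrival, depart, total_rate):
    """Split a stay into {(year, month): (nights, revenue)}, excluding the departure day."""
    nights = (depart - arrival).days
    if nights <= 0:
        # Same-day or invalid bookings have no nights to allocate.
        return {}
    nightly_rate = total_rate / nights
    counts = defaultdict(int)
    for i in range(nights):
        night = arrival + timedelta(days=i)
        counts[(night.year, night.month)] += 1
    return {ym: (n, round(n * nightly_rate, 2)) for ym, n in counts.items()}

# Hypothetical booking: arrive 30 July, leave 2 August, 300.0 in total.
split = nights_by_month(date(2022, 7, 30), date(2022, 8, 2), 300.0)
```

Under this convention a stay from 18 July to 1 August is 14 nights, all in July. If the departure day should count toward August instead, as in the question's 13/1 split, the day range would need to be shifted by one.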
You can also consider the following logic:
Check whether both dates are in the same month.
If they are not, take the last day of the arrival month and subtract the arrival date from it.
Then take the first day of the departure month and subtract it from the departure date.
The following code shows an example:
SELECT
/*arrival date*/
CURRENT_DATE() AS the_arrival,
/*depart_dt*/
DATE_ADD(CURRENT_DATE(), INTERVAL 30 DAY) AS the_depart,
/*total nights between arrival date and depart date*/
DATE_DIFF(DATE_ADD(CURRENT_DATE(), INTERVAL 30 DAY), CURRENT_DATE(), DAY) AS total_room_nights,
/*check whether the dates are in the same month: 0 means same month, >0 means different months*/
DATE_DIFF(DATE_ADD(CURRENT_DATE(), INTERVAL 30 DAY), CURRENT_DATE(), MONTH) AS Same_Month,
/*the two columns below only apply when the dates are in different months*/
/*take the last day of the arrival month and subtract the arrival date*/
DATE_DIFF(DATE_SUB(DATE_TRUNC(DATE_ADD(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH), INTERVAL 1 DAY), CURRENT_DATE(), DAY) AS total_room_nights_first_month,
/*take the first day of the depart month and subtract it from the depart date; +1 covers the night between the last day of the month and the first day of the next month*/
DATE_DIFF(DATE_ADD(CURRENT_DATE(), INTERVAL 30 DAY), DATE_TRUNC(DATE_ADD(CURRENT_DATE(), INTERVAL 30 DAY), MONTH), DAY)+1 AS total_room_nights_second_month
You can find more information in the BigQuery documentation on date functions.

What is the Syntax to get weekly data in addition to monthly data from a query?

I have written a query that gives me the numbers I need for a month over month comparison for custom orders. I'd like to get weekly data as well. What is the syntax that could give me both?
I've written an entire query that works very well for the monthly report but I've been asked to dive deeper into the data to get the weekly numbers as well.
SELECT t.customerref_name, t.customerref_value, t.txndate AS full_date,
EXTRACT(MONTH FROM CAST(t.txndate AS DATE)) AS month, EXTRACT (YEAR FROM
t.txndate) AS year,
r.description, r.amount, r.estimate_id, r.qty,
s.first_invoice_order_date,
s.last_invoice_order_date, DATE_DIFF(s.last_invoice_order_date,
s.first_invoice_order_date, DAY) AS days_been_csutomer,
NTILE(4) OVER (ORDER BY DATE_DIFF(s.last_invoice_order_date,
s.first_invoice_order_date, DAY)) AS percentile_lifetime, r.color,
Is it possible? This is just a snippet of my code, but I can share the rest if need be.
Instead of EXTRACT(MONTH FROM day) month you can use EXTRACT(WEEK FROM day) week
So, if your monthly query can be mimicked as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT day, CAST(100 * RAND() AS INT64) value
FROM UNNEST(GENERATE_DATE_ARRAY('2000-01-01', '2001-12-31')) day
)
SELECT EXTRACT(YEAR FROM day) year, EXTRACT(MONTH FROM day) month, SUM(value) value
FROM `project.dataset.table`
GROUP BY year, month
ORDER BY year, month
then your weekly query will be
#standardSQL
WITH `project.dataset.table` AS (
SELECT day, CAST(100 * RAND() AS INT64) value
FROM UNNEST(GENERATE_DATE_ARRAY('2000-01-01', '2001-12-31')) day
)
SELECT EXTRACT(YEAR FROM day) year, EXTRACT(WEEK FROM day) week, SUM(value) value
FROM `project.dataset.table`
GROUP BY year, week
ORDER BY year, week
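You can learn more about the syntax in the documentation for EXTRACT. Note that WEEK counts weeks starting on Sunday and returns 0 for the days before the first Sunday of the year.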
In addition to Mikhail's answer, consider using FORMAT_DATE('%U', day) as week_num. FORMAT_DATE offers several week-related format elements, which gives you more flexibility: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#supported-format-elements-for-datetime
%U The week number of the year (Sunday as the first day of the week) as a decimal number (00-53).
%V The week number of the year (Monday as the first day of the week) as a decimal number (01-53). If the week containing January 1 has four or more days in the new year, then it is week 1; otherwise it is week 53 of the previous year, and the next week is week 1.
%W The week number of the year (Monday as the first day of the week) as a decimal number (00-53).
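The three numbering schemes can disagree around the start of a year, which matters when you group by year and week. As a rough illustration, this Python sketch prints the equivalent values for a single date; Python's strftime uses the same %U and %W definitions, and the ISO week (%V) is taken from isocalendar(). The helper name week_numbers is made up.

```python
from datetime import date

def week_numbers(d):
    """Return the %U, %W, and ISO (%V with its year %G) week values for a date as strings."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "U": d.strftime("%U"),    # Sunday-based, 00-53
        "W": d.strftime("%W"),    # Monday-based, 00-53
        "V": f"{iso_week:02d}",   # ISO week, 01-53
        "G": str(iso_year),       # ISO year that the ISO week belongs to
    }

# 1 January 2021 was a Friday: week 00 under %U and %W, but ISO week 53 of 2020.
new_year = week_numbers(date(2021, 1, 1))
```

When grouping by ISO week, pair %V with the ISO year (%G) rather than the calendar year; otherwise the first days of January can land in week 53 of the wrong year.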

Week of the current month - interval

How can I select each week (1st, 2nd, 3rd, 4th/5th) of the current month as an interval?
So not like:
select to_char(sysdate,'W') from dual
But I need an interval equal to a week of the current month (for example the 2nd week of October, because sysdate is in October). So, concretely:
select SUM(number)/((trunc(sysdate,'WW')/4) from my_table where date between ? and ?
Something like this should work for you.
SELECT TO_CHAR(datecol,'W') AS week,
SUM(countcol) AS sum_of_sales
FROM sales
WHERE TO_CHAR(datecol,'YYYYMM') = TO_CHAR(SYSDATE,'YYYYMM') --current year and month.
GROUP BY TO_CHAR(datecol,'W')
ORDER BY TO_CHAR(datecol,'W')
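Oracle's 'W' format element numbers the weeks of a month by day of month: days 1-7 are week 1, days 8-14 are week 2, and so on, regardless of the weekday. A minimal Python sketch of the same grouping, with made-up names (week_of_month, weekly_sums) and rows assumed to be (date, count) pairs:

```python
from collections import defaultdict
from datetime import date

def week_of_month(d):
    """Mirror Oracle's 'W': days 1-7 are week 1, days 8-14 are week 2, and so on."""
    return (d.day - 1) // 7 + 1

def weekly_sums(rows, today):
    """Sum the counts per week of month, keeping only rows in the same year and month as today."""
    sums = defaultdict(int)
    for d, count in rows:
        if (d.year, d.month) == (today.year, today.month):
            sums[week_of_month(d)] += count
    return dict(sorted(sums.items()))

# Hypothetical data: the September row is filtered out.
rows = [
    (date(2023, 10, 1), 3),
    (date(2023, 10, 8), 4),
    (date(2023, 10, 9), 1),
    (date(2023, 9, 30), 100),
]
october = weekly_sums(rows, date(2023, 10, 15))  # {1: 3, 2: 5}
```

Days 29 to 31 form a short fifth week under this scheme, which is why the question mentions a "4-5" week.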