How to count the number of people over a rolling period with SQL?

I'm trying to count the number of people who bought a product over a 1-year moving period using SQL.
For example:
In August 2022, the number of people from September 2021 to August 2022
In September 2022, the number of people from October 2021 to September 2022
In October 2022, the number of people from November 2021 to October 2022
....
Here is the code I'm using for now.
I'm running a for loop that builds one query for each of the first 12 months, incrementing by 1 to get the result for each single month. I'd like to know how I can do this using only SQL, and what the best way to do it is.
for i in range(1, 13):
    query = f"""
    SELECT
        FORMAT_DATE("%B %Y", DATETIME_SUB(CURRENT_DATE(), INTERVAL {i} MONTH)),
        APPROX_COUNT_DISTINCT(NB_buyer) AS NB_buyer
    FROM
        Table
    WHERE
        DATE_TRUNC(date, MONTH) BETWEEN DATETIME_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL {i+12} MONTH)
        AND DATETIME_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL {i-1} MONTH)
        {conditions}
    UNION ALL
    """
I hope my question is clear
Thanks in advance
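One way to do this in SQL alone is to join each month to all purchases in the 12 months ending with it, then count distinct buyers per month. The following is only a minimal sketch of that idea: it runs against an in-memory SQLite database rather than BigQuery, and the table name purchases, the columns buyer_id and month, and the sample rows are all hypothetical. In BigQuery the window bound would be written with that dialect's date functions instead of SQLite's date().

```python
import sqlite3

# Hypothetical sample data: (buyer_id, purchase month stored as 'YYYY-MM-01').
rows = [
    ("a", "2021-09-01"),
    ("b", "2021-10-01"),
    ("a", "2022-08-01"),
    ("c", "2022-09-01"),
    ("b", "2022-10-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (buyer_id TEXT, month TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

# For every month m, join all purchases whose month lies in the 12-month
# window ending at m (m minus 11 months through m), then count distinct buyers.
ROLLING_SQL = """
WITH months AS (SELECT DISTINCT month FROM purchases)
SELECT m.month, COUNT(DISTINCT p.buyer_id) AS nb_buyer
FROM months AS m
JOIN purchases AS p
  ON p.month BETWEEN date(m.month, '-11 months') AND m.month
GROUP BY m.month
ORDER BY m.month
"""

rolling = dict(conn.execute(ROLLING_SQL).fetchall())
for month, nb_buyer in rolling.items():
    print(month, nb_buyer)
```

Because the list of months is derived from the purchases themselves, a month with no purchases at all gets no row in this sketch; a calendar table would be needed to cover such months.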

Related

Sum of totals grouped by month

I've been out of the dev world for a few years, so forgive me if this is a pretty basic question, but I have an app that logs bookings for holiday accommodation. I want to produce a report detailing how much income per month a user gets.
My query so far is as follows:
SELECT SUM(int_ToOwner) AS TotalIncome,
DateName(m,dtm_StartDate) AS BookingMonth
FROM tbl_Bookings
WHERE dtm_StartDate > '2021-12-31'
GROUP BY DatePart(m,dtm_StartDate), int_ToOwner, dtm_StartDate
But that produces the result below. I want it to give me a total for each month instead.
TotalIncome  BookingMonth
553.00       January
849.00       January
885.00       February
1236.00      February
1239.00      February
896.00       March
927.00       March
940.00       March
959.00       March
971.00       March
1167.00      April
1255.00      April
1500.00      April
2461.00      April
1131.00      May
1172.00      May
1275.00      May
2647.00      May
1466.00      June
1480.00      June
1496.00      June
1899.00      June
2167.00      June
1881.00      July
4990.00      July
4991.00      July
2134.00      August
4162.00      August
4883.00      August
5329.00      August
1430.00      September
1630.00      October
1130.00      November
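You almost have it, but you are also grouping by int_ToOwner and by dtm_StartDate itself, in addition to its month part. Group by the month expressions only (SQL Server needs DateName in the GROUP BY as well, because it appears in the SELECT list).
Try:
SELECT SUM(int_ToOwner) AS TotalIncome, DateName(m,dtm_StartDate) AS BookingMonth
FROM tbl_Bookings
WHERE dtm_StartDate > '2021-12-31'
GROUP BY DatePart(m,dtm_StartDate), DateName(m,dtm_StartDate)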
A little re-format:
SELECT SUM(int_ToOwner) AS TotalIncome
, DateName(m,dtm_StartDate) AS BookingMonth
FROM tbl_Bookings
WHERE dtm_StartDate > '2021-12-31'
GROUP BY DatePart(m,dtm_StartDate)
, int_ToOwner
, dtm_StartDate
Your GROUP BY tells the database to create one group for each distinct combination of:
- DatePart(m,dtm_StartDate)
- int_ToOwner
- dtm_StartDate
Then SELECT asks, for each group, for:
- the calculated SUM of int_ToOwner
- DateName(m,dtm_StartDate)
To get one total per month, group only by the month expressions and drop int_ToOwner and dtm_StartDate from the GROUP BY.
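The effect of grouping by the month alone can be checked on a small data set. This is a rough sketch using an in-memory SQLite database, so strftime('%m', ...) stands in for SQL Server's DatePart(m, ...); the three amounts come from the question's output, but the booking dates attached to them are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Bookings (int_ToOwner REAL, dtm_StartDate TEXT)")
# Amounts taken from the question; the dates are hypothetical.
conn.executemany(
    "INSERT INTO tbl_Bookings VALUES (?, ?)",
    [(553.0, "2022-01-05"), (849.0, "2022-01-20"), (885.0, "2022-02-03")],
)

# Group by the month only, so each month collapses to a single row.
totals = conn.execute(
    """
    SELECT strftime('%m', dtm_StartDate) AS BookingMonth,
           SUM(int_ToOwner) AS TotalIncome
    FROM tbl_Bookings
    WHERE dtm_StartDate > '2021-12-31'
    GROUP BY strftime('%m', dtm_StartDate)
    ORDER BY BookingMonth
    """
).fetchall()

print(totals)
```

The two January bookings are summed into one row instead of appearing separately.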

Weeks rolling up to months in SQL (date_trunc)

I am working on a MODE Case Study which can be accessed here https://mode.com/sql-tutorial/a-drop-in-user-engagement/#the-problem.
I am trying to access Table 2: Events which has a date column 'occurred_at'. I wanted to check the time frame of this case study, that is weeks and months.
I wrote a simple query
select distinct(date_trunc('week', occurred_at)) as week, date_trunc('month', occurred_at) as month
from tutorial.yammer_events
where event_type = 'engagement'
order by week;
and to my surprise, the first week, '2014-04-28', showed the month 'May' instead of 'April'.
Can someone please tell me the reason for this?
Thank you
PostgreSQL's date_trunc truncates a date to the start of the period it falls in, at the given granularity (day, week, month, etc.).
For month, that is the first day of the month, i.e. day 1.
For week, that is the first day of the week, i.e. Monday.
Suppose the date is 4 July 2021, which is a Sunday: date_trunc returns 1 July 2021 for month and 28 June 2021 for week, the Monday of that week.
Suppose the date is 5 July 2021, which is itself a Monday: date_trunc still returns 1 July 2021 for month, but returns 5 July 2021 for week.
The same thing happens in your query: the week starting Monday 2014-04-28 runs through Sunday 2014-05-04, so an event on 1 to 4 May truncates to the week 2014-04-28 but to the month of May.
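The two truncations can be reproduced outside the database to see why a week and a month can disagree. This is a small sketch in Python, not PostgreSQL itself; trunc_week and trunc_month are hypothetical helper names that mimic date_trunc('week', ...) and date_trunc('month', ...) for plain dates.

```python
from datetime import date, timedelta


def trunc_week(d: date) -> date:
    """Return the Monday of the week containing d."""
    # date.weekday() is 0 for Monday, so subtracting it lands on Monday.
    return d - timedelta(days=d.weekday())


def trunc_month(d: date) -> date:
    """Return the first day of the month containing d."""
    return d.replace(day=1)


# An event on Thursday 1 May 2014 belongs to a week that started in April.
event = date(2014, 5, 1)
print(trunc_week(event), trunc_month(event))
```

An event dated 1 May 2014 therefore gets the week 2014-04-28 and the month 2014-05-01, which is the combination seen in the query result.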

Get the data for the full penultimate calendar month + data for the same month, but in 2 previous years

I'm working in PostgreSQL environment and I have the following problem: let's say today is April 8th, 2019 and I want to pull the full list of user IDs who opened their account in the penultimate month (so from Feb 1st till Feb 28th, 2019) + list of user IDs who opened their account in the same month, but in years 2017 and 2018 (so in Feb'18 and Feb'17).
If we assumed that today is June 25th 2019, I would want to pull the list for the following periods:
- 1st till 30th April 2019;
- 1st till 30th April 2018;
- 1st till 30th April 2017.
I have this SQL code at the moment. As you can see, I have to alter the dates in the WHERE clause every month. Can someone advise me on how I could solve this problem?
SELECT
    deposit_id,
    to_char(activation_date, 'yyyy-mm-dd') AS placement_start_date
FROM fixed_term_plan
WHERE
    activation_date BETWEEN DATE '2019-01-01' AND DATE '2019-01-31'
    OR activation_date BETWEEN DATE '2018-01-01' AND DATE '2018-01-31'
    OR activation_date BETWEEN DATE '2017-01-01' AND DATE '2017-01-31'
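Hmmm. I am thinking:
SELECT deposit_id,
       to_char(activation_date, 'yyyy-mm-dd') AS placement_start_date
FROM fixed_term_plan ftp
WHERE extract(month from activation_date) = extract(month from (now() - interval '2 month'));
This matches that calendar month in every year, so add a condition on the year if only the last three years are wanted.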
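To check which date ranges such a filter is meant to cover, the periods can be computed directly from today's date. This is only an illustrative sketch in Python; the function name penultimate_month_ranges and its years_back parameter are hypothetical and not part of the question's schema.

```python
import calendar
from datetime import date


def penultimate_month_ranges(today: date, years_back: int = 2):
    """Return (first_day, last_day) of the month two months before today,
    for the current year and each of the years_back previous years."""
    # Count months since year 0, step back two, then split into year and month.
    month_index = today.year * 12 + (today.month - 1) - 2
    year, month = divmod(month_index, 12)
    month += 1  # back to a 1-based month number

    ranges = []
    for y in range(year, year - years_back - 1, -1):
        last_day = calendar.monthrange(y, month)[1]  # number of days in the month
        ranges.append((date(y, month, 1), date(y, month, last_day)))
    return ranges


for start, end in penultimate_month_ranges(date(2019, 4, 8)):
    print(start, end)
```

For 8 April 2019 this yields February of 2019, 2018 and 2017, and for 25 June 2019 it yields April of the same three years, matching the periods listed in the question.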

How to build a DAX measure to view data for all months to date across all years?

How can I build a DAX measure that calculates all the data up to a certain date and compares it with the same months of the previous year, up to the same "until" date?
For example, today's date is 5 April 2018, so if I select the year 2017 in the slicer, I should see a graph comparing 1 Jan 2018 to 5 April 2018 with the same period of the previous year, 1 Jan 2017 to 5 April 2017.
Currently I am using YTD, but I think it calculates all 12 months of data for every year except 2018, where it shows data from 1 Jan to 5 April. Can anyone shed some light on this?
Currently I am using this: YTDQty = TOTALYTD(sum(Bookscan[QtySold]),DATESYTD(Bookscan[Week Date]))
This shows the correct data for 2018 to date. I want to compare those four months of data with the previous years 2017, 2016 and 2015, but those years show the full 12 months of data. I only need data from 1 January up to today's date (or, say, 1 March) for every year. How can I do this?
Very similar to this question.
Do you have a Date Dimension in your model?
TotalQuantity =
    SUM(Bookscan[QtySold])
TotalQuantity YTD =
    TOTALYTD([TotalQuantity], 'Date'[Date])
TotalQuantity YTD LY =
    CALCULATE(
        [TotalQuantity YTD],
        SAMEPERIODLASTYEAR('Date'[Date])
    )
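The comparison being asked for is a year-to-date total cut off at the same day in each year. The sketch below shows only that arithmetic in plain Python; it does not model how DAX evaluates filter context, and the function names and sample sales figures are hypothetical.

```python
from datetime import date


def same_day_last_year(d: date) -> date:
    """Shift a date back one year, mapping 29 Feb to 28 Feb."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:  # 29 Feb has no counterpart in a non-leap year
        return d.replace(year=d.year - 1, day=28)


def ytd_total(sales, as_of: date) -> int:
    """Sum quantities from 1 January of as_of's year through as_of."""
    start = date(as_of.year, 1, 1)
    return sum(qty for day, qty in sales if start <= day <= as_of)


def ytd_total_last_year(sales, as_of: date) -> int:
    """Year-to-date total for the previous year, cut off at the same day."""
    return ytd_total(sales, same_day_last_year(as_of))


# Hypothetical (date, quantity sold) rows.
sales = [
    (date(2017, 2, 1), 10),
    (date(2017, 6, 1), 20),  # after 5 April, so excluded from the 2017 figure
    (date(2018, 3, 1), 5),
]
as_of = date(2018, 4, 5)
print(ytd_total(sales, as_of), ytd_total_last_year(sales, as_of))
```

The June 2017 row is left out of the prior-year figure, which is the behaviour the question wants for 2017, 2016 and 2015.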

SQL - check if an order date occurs after the second Saturday in July

I am querying against a table of 4 yrs of order transactions (pk = order number) and I'm looking to tag each record with particular date flags based on the order date - e.g., calendar year, calendar month, fiscal year, etc. There are date attributes that are specific to our business (e.g., not easily solved by a datepart function) that I'm having trouble with.
I was able to add "School Year" (for us that runs Aug 1 - July 31) using a case statement:
case
    when datepart(month, oline.order_date_ready) between 8 and 12 then datepart(year, oline.order_date_ready)
    else (datepart(year, oline.order_date_ready)-1)
end as school_yr
So for 1/19/2017, the above would return "2016", because to us the 2016 school year runs from Aug 1 2016 to July 31 2017.
But now I'm having trouble repeating the same kind of case statement for something called "Rollover Year". All of our order history tables are reset/"rolled over" on the 2nd Saturday in July every calendar year, so for example the most recent rollover date was Saturday July 9th 2016. (Image: rollover year date ranges)
My above case statement doesn't apply anymore because I can't just add "datepart(month, oline.order_date_ready) = 7" - I don't need the whole month of July, I just need all the orders occurring after the 2nd Saturday in that July. So in this example, I need everything occurring from Sat July 9 2016 to today to be flagged as rollover_date = 2016.
Is there a flexible way to do this without hard coding previous/future rollover dates into another table? That's the only way I can think to solve it currently, but I'm sure there must be a better way.
Thanks!
If you ask for the day-of-the-week of July 1st, then from there it's simple arithmetic, right? This query gives results matching your image:
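-- 6 - DOW is the number of days from July 1st to the first Saturday
-- (PostgreSQL's DOW is 0 for Sunday, 6 for Saturday); adding 7 gives the second.
SELECT y,
       CONCAT(y, '-07-01')::timestamp +
       CONCAT(6 - EXTRACT(DOW FROM CONCAT(y, '-07-01')::timestamp) + 7, ' days')::interval
FROM generate_series(2013, 2020) s(y)
ORDER BY y DESC;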
So given any date d in year y, if it falls before the 2nd Saturday of July, give it rollover year y - 1. Otherwise give it rollover year y.
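The same arithmetic can be written as a small function to test the rule on individual dates. This is a sketch in Python rather than SQL; second_saturday_of_july and rollover_year are hypothetical names, and order dates are assumed to be plain dates without a time part.

```python
from datetime import date, timedelta


def second_saturday_of_july(year: int) -> date:
    """Return the second Saturday of July in the given year."""
    july_first = date(year, 7, 1)
    # date.weekday() is 0 for Monday and 5 for Saturday.
    days_to_first_saturday = (5 - july_first.weekday()) % 7
    return july_first + timedelta(days=days_to_first_saturday + 7)


def rollover_year(d: date) -> int:
    """Rollover year: the calendar year from the rollover date onward,
    the previous calendar year before it."""
    return d.year if d >= second_saturday_of_july(d.year) else d.year - 1


print(second_saturday_of_july(2016))
print(rollover_year(date(2016, 7, 8)), rollover_year(date(2016, 7, 9)))
```

With these definitions an order on 8 July 2016 is still in rollover year 2015, while one on 9 July 2016 starts rollover year 2016.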