How to create monthly snapshots for the last 6 months? - sql

I'm trying to get detailed data (snapshot) for each month on Business Day=1 for the last 6 months and need to pass 6 different dates (BD1's only) through two date variables.
Two variables will be BOM which will be BD1 for the last 6 months and EOM which will be BD1+1.
For e.g
First snapshot will be
declare @BOM date = '2022-08-01'
declare @EOM date = '2022-09-01'
Second snapshot will be
declare @BOM date = '2022-09-01'
declare @EOM date = '2022-10-01'
and so on for the last 6 months from the current month
Here is what I'm trying to do:
declare @BOM date
set @BOM =
(
    select top 6 cast(date_datetime as date) date_datetime
    from date_dim
    where
        datediff(month, date_datetime, getdate()) <= 6
        and bd = 1
    order by date_datetime asc);
declare @EOM date
set @EOM =
(
    select top 6 date_datetime
    from date_dim
    where
        datediff(month, date_datetime, getdate()) <= 5
        and bd = 1
    order by date_datetime asc);
But my query does not process it as I'm passing more than 1 value through my BOM & EOM variables in my main query WHERE clause.
I need some help with defining and using these variables in my query so that they can take different snapshots and store it in a table.

As you discovered, you cannot store multiple values in a scalar variable. What you possibly need is to use a table variable (which behaves similarly to a temp table). The table variable can have multiple rows (one for each selected month) and multiple columns (BOM and EOM).
The following code defines such a table variable and populates it with BOM and EOM of the most recent 6 full months from the date_dim table. I used the LEAD() window function to select the corresponding EOM for each BOM.
Lacking any provided sample data to actually query, I added a simple query at the end to just list the selected date ranges and calculated number of business days in each.
-- Table variable to hold selected month information
DECLARE @selected_months TABLE (BOM DATE, EOM DATE)
-- Select last 6 full months
INSERT @selected_months
SELECT *
FROM (
    SELECT
        date_datetime AS BOM,
        LEAD(date_datetime) OVER(ORDER BY date_datetime) AS EOM
    FROM date_dim
    WHERE bd = 1 -- keep only BD1 rows so LEAD() returns the next month's BD1
) D
WHERE DATEDIFF(month, BOM, GETDATE()) BETWEEN 1 AND 6
ORDER BY BOM
-- Sample usage
SELECT M.*, DATEDIFF(day, M.BOM, M.EOM) business_days
FROM @selected_months M
-- JOIN your_data D
--     ON D.your_data_date >= M.BOM
--     AND D.your_data_date < M.EOM
GROUP BY M.BOM, M.EOM
ORDER BY M.BOM
Sample results:
BOM         EOM         business_days
2022-08-01  2022-09-05  35
2022-09-05  2022-10-03  28
2022-10-03  2022-11-07  35
2022-11-07  2022-12-05  28
2022-12-05  2023-01-02  28
2023-01-02  2023-02-06  35
See this db<>fiddle for a working demo.
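The commented-out JOIN in the sample usage can be tried end to end with a small stand-in. The sketch below is an assumption-laden illustration, not the answer's T-SQL: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), BD1 dates copied from the sample results, and a hypothetical your_data table with made-up rows. It pairs each BD1 with the next one via LEAD() and counts the data rows that fall into each half-open snapshot window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE date_dim (date_datetime TEXT, bd INTEGER)")
# BD1 dates copied from the sample results, plus one non-BD1 row to show the filter.
conn.executemany("INSERT INTO date_dim VALUES (?, ?)", [
    ("2022-08-01", 1), ("2022-08-02", 2), ("2022-09-05", 1), ("2022-10-03", 1),
    ("2022-11-07", 1), ("2022-12-05", 1), ("2023-01-02", 1), ("2023-02-06", 1),
])
# Hypothetical detail table standing in for the real data to snapshot.
conn.execute("CREATE TABLE your_data (your_data_date TEXT)")
conn.executemany("INSERT INTO your_data VALUES (?)", [
    ("2022-08-15",), ("2022-09-05",), ("2022-09-30",), ("2023-01-10",),
])

snapshots = conn.execute("""
    WITH selected_months AS (
        SELECT BOM, EOM FROM (
            SELECT date_datetime AS BOM,
                   LEAD(date_datetime) OVER (ORDER BY date_datetime) AS EOM
            FROM date_dim
            WHERE bd = 1
        )
        WHERE EOM IS NOT NULL          -- drop the last BD1, which has no successor
    )
    SELECT m.BOM, m.EOM, COUNT(d.your_data_date) AS row_count
    FROM selected_months m
    LEFT JOIN your_data d
           ON d.your_data_date >= m.BOM
          AND d.your_data_date <  m.EOM   -- half-open window: EOM belongs to the next snapshot
    GROUP BY m.BOM, m.EOM
    ORDER BY m.BOM
""").fetchall()

for row in snapshots:
    print(row)
```

The half-open comparison (>= BOM and < EOM) keeps a row dated exactly on a BD1 from landing in two snapshots.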

Related

Select records by month and year between two dates

I have the table record_b. I want to select the records for a specific month and year that falls between begin_date and end_date.
id  begin_date  end_date
2   2022-09-04  2022-10-03
3   2022-10-04  2022-10-31
4   2022-11-04  2022-12-03
5   2022-12-04  2023-01-03
6   2023-01-04  2023-02-03
7   2023-02-04  null
eg1:
Input: 2023-01
Output should be the record with id 5 and 6
eg2:
Input: 2022-12
Output should be the record with id 4 and 5
I have tried using between however there is a problem evaluating the months after the year.
and v_year BETWEEN EXTRACT(YEAR FROM PC.begin_date)
AND EXTRACT(YEAR FROM PC.end_date)
AND v_month BETWEEN EXTRACT(MONTH FROM PC.begin_date)
AND EXTRACT(MONTH FROM PC.end_date)
A very basic rule: when you have a date, store it as a date.
This extends to processing: when you need to process dates, process them as dates.
Most of the time nothing else will be needed: no conversion, extract, date_part, or epoch, just dates.
The task here is to find those rows where a specified year-month (yyyy-mm) falls within the period between the begin and end dates in a table.
Realize that if any portion of the specified year-month falls within the period, then the first day of that month (yyyy-mm-01) falls within that period once both ends of the period are truncated to the first of their months.
You can use the make_date() function to get the first of the specified month. Then JOIN that result to the table with a BETWEEN on the truncated dates.
with input_val(yr_mon) as (values (:yyyymm)) --select * from input_val
   , tgt_date(dt) as
     ( select make_date(substring(yr_mon,1,4)::integer
                       ,substring(yr_mon,6,2)::integer
                       ,01
                       )
         from input_val
     ) --select * from tgt_date;
select rb.*
  from tgt_date t
  join record_b rb
    on t.dt between date_trunc('month',rb.begin_date)
                and date_trunc('month',rb.end_date);
The above, however, does NOT handle data point 7 well, with its null end date (nor would it handle a null start date). But should it?
If so, a null value is often interpreted as there being no ending date, which basically says all dates on or after the start date are included.
You can handle the situation by converting the period to a daterange, which will handle it without getting into null processing logic, then use the range containment operator.
with input_val(yr_mon) as (values (:yyyymm)) --select * from input_val
   , tgt_date(dt) as
     ( select make_date(substring(yr_mon,1,4)::integer
                       ,substring(yr_mon,6,2)::integer
                       ,01
                       )
         from input_val
     ) --select * from tgt_date;
select rb.*
  from tgt_date t
  join record_b rb
    on t.dt <@ daterange(date_trunc('month',rb.begin_date)::date
                        ,date_trunc('month',rb.end_date)::date
                        , '[]'
                        );
Finally, depending on how you use the results, you can hide this whole thing within a SQL function, which can then be used in an SQL statement.
create or replace function periods_with_year_month(year_mm text)
  returns setof record_b
 language sql
as $$
   with tgt_date(dt) as
        (select make_date(substring(year_mm,1,4)::integer
                         ,substring(year_mm,6,2)::integer
                         ,01
                         )
        )
   select rb.*
     from tgt_date t
     join record_b rb
       on t.dt <@ daterange( date_trunc('month',rb.begin_date)::date
                           , date_trunc('month',rb.end_date)::date
                           , '[]'
                           );
$$;
See demo here. Unfortunately, db<>fiddle is non-interactive, so the yyyy-mm parameters are hard-coded.
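The month-containment rule used in the queries above can also be checked outside the database. The following is a minimal Python sketch, assuming the sample rows from the question; the function reuses the name periods_with_year_month purely for illustration and is not a PostgreSQL API. A null end date is treated as open-ended, mirroring what daterange does with a null upper bound.

```python
from datetime import date

# Sample rows from the question: (id, begin_date, end_date); None means no end date.
RECORD_B = [
    (2, date(2022, 9, 4), date(2022, 10, 3)),
    (3, date(2022, 10, 4), date(2022, 10, 31)),
    (4, date(2022, 11, 4), date(2022, 12, 3)),
    (5, date(2022, 12, 4), date(2023, 1, 3)),
    (6, date(2023, 1, 4), date(2023, 2, 3)),
    (7, date(2023, 2, 4), None),
]

def periods_with_year_month(year_mm):
    """Return the ids of periods that touch the month given as 'yyyy-mm'."""
    # First day of the requested month, like make_date(yyyy, mm, 1).
    target = date(int(year_mm[:4]), int(year_mm[5:7]), 1)
    ids = []
    for rec_id, begin, end in RECORD_B:
        first = begin.replace(day=1)                 # date_trunc('month', begin_date)
        last = end.replace(day=1) if end else None   # open-ended when end_date is null
        if first <= target and (last is None or target <= last):
            ids.append(rec_id)
    return ids

print(periods_with_year_month("2023-01"))
print(periods_with_year_month("2022-12"))
print(periods_with_year_month("2023-03"))
```

The first two calls correspond to eg1 and eg2 in the question; the third exercises the open-ended row with id 7.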

Find first value with datediff larger than 1 week from table and join with initial table

I have two tables:
dbo.HistPrices (Which has all Historic Prices for every PId and some non interesting metadata... etc. )
PId (non-unique)               Price  Pricedate   ...
1                              5      2022-11-03
2                              3      2022-11-03
2 (more than 1 date per week)  3.2    2022-11-02
1                              6      2022-10-27
2                              3.4    2022-10-27
and dbo.Stuff (which is like a shopping cart of some sort where the given Price is the price in the current week for a specific item which is encapsulated in the Sid )
SId (unique)  Price  Pricedate   desc  ...
1             9      2022-11-10
2             2.9    2022-11-10
3             7      2022-11-10
The SId and PId columns hold the same identifiers under different names. The HistPrices table also carries information for items that are not related to the Stuff table.
What I want is a Table like this:
SId  Price  Pricedate   desc  ...  last_week_Price  last_week_PriceDate  week_before_Price  week_before_date
1    9      2022-11-10             5                2022-11-03           6                  2022-10-27
2    2.9    2022-11-10             3                2022-11-03           3.4                2022-10-27
So, I want to create two columns in the dbo.Stuff table which get last week's price and the price from the week before. I can not be sure that there is only one price from last week (see the 2022-11-02 Price for PId 2).
So, if there are more prices from last week, I just want to grab the first one which is at least a week old. Similar for the price from the week before. Only the first price which is at least 2 weeks older needs to be fetched.
Another must is, that the length of dbo.Stuff is not changed. So, if no price is found, None should be inserted.
I got a solution for an individual part with a CTE, but I don't know how to form the correct join/insert statement.
My CTE for an individual SId, which I set manually, looks like this:
DECLARE @get_Date VARCHAR(100)
DECLARE @SId int
DECLARE @week_offset int
SET @get_Date = 'teststring in date format'
SET @SId = 12345
SET @week_offset = -1;
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY hp.PId
        ORDER BY hp.PriceDate DESC
        --- I also thought about DATEDIFF(week, hp.Pricedate, CONVERT(DATETIME, @get_Date))
    ) rn
    FROM dbo.HistPrices hp
    WHERE (hp.Pricedate >= DATEADD(Week, @week_offset, CONVERT(DATETIME, @get_Date))
        AND hp.Pricedate < CONVERT(DATETIME, @get_Date))
        AND hp.PId = @SId
)
SELECT *
FROM cte
WHERE rn = 1
ORDER BY PId
I'm struggling to join the two tables for all ids like this. So, I think I get the correct result when I choose an ID manually, but I can somehow not join the two tables with this information.
Edit: I added some actual dates as requested in the comment
You could solve this a variety of ways depending on the requirements, but the first option that comes to mind would be to simply use an outer apply. Other options would include using ranking functions like row_number or analytic functions like first_value/last_value, or standard joins and CTEs if the requirements allow for it.
A simple example using an OUTER APPLY operation would be as follows:
select s.sid, s.price, s.pricedate,
l.price as last_week_price,
l.pricedate as last_week_pricedate
from dbo.Stuff s
outer apply
(
select top 1
h.price, h.pricedate
from dbo.HistPrices h
where h.pid = s.sid
-- Starting 1 week (i.e. 7 days) before the sid.pricedate...NOT on the week start of the week before
and h.pricedate >= dateadd(week, -1, s.pricedate)
-- If you'd rather start on the start of the week of the week before sid.pricedate, it'd be this
-- and h.pricedate >= dateadd(week, datediff(week, 0, s.pricedate) - 1, 0)
and h.pricedate < dateadd(week, datediff(week, 0, s.pricedate), 0)
order by
h.pricedate
) l;
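To see both lookups (last week and the week before) against the sample rows, here is a rough stand-in sketch. It is not the OUTER APPLY query above: SQLite has no APPLY, so correlated subqueries with LIMIT 1 play that role, run through Python's sqlite3 module. It also assumes one particular reading of the question, namely "the newest price that is at least 7 (or 14) days older than the Stuff row"; the helper latest_before is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HistPrices (PId INTEGER, Price REAL, Pricedate TEXT);
    CREATE TABLE Stuff (SId INTEGER PRIMARY KEY, Price REAL, Pricedate TEXT);
    INSERT INTO HistPrices VALUES
        (1, 5, '2022-11-03'), (2, 3, '2022-11-03'), (2, 3.2, '2022-11-02'),
        (1, 6, '2022-10-27'), (2, 3.4, '2022-10-27');
    INSERT INTO Stuff VALUES
        (1, 9, '2022-11-10'), (2, 2.9, '2022-11-10'), (3, 7, '2022-11-10');
""")

def latest_before(column, days_back):
    """Correlated subquery: newest HistPrices value at least days_back days old."""
    return f"""(SELECT h.{column}
                FROM HistPrices h
                WHERE h.PId = s.SId
                  AND h.Pricedate <= date(s.Pricedate, '-{days_back} days')
                ORDER BY h.Pricedate DESC
                LIMIT 1)"""

query = f"""
    SELECT s.SId, s.Price, s.Pricedate,
           {latest_before('Price', 7)}      AS last_week_price,
           {latest_before('Pricedate', 7)}  AS last_week_pricedate,
           {latest_before('Price', 14)}     AS week_before_price,
           {latest_before('Pricedate', 14)} AS week_before_pricedate
    FROM Stuff s
    ORDER BY s.SId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the lookups are scalar subqueries rather than joins, every Stuff row appears exactly once, and rows without a matching price (SId 3 here) get NULL in the added columns, which is the same guarantee OUTER APPLY gives in T-SQL.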

T-sql count number of times a week on rows with date interval

If you have table like this:
Name       Data type
UserID     INT
StartDate  DATETIME
EndDate    DATETIME
With data like this:
UserID  StartDate            EndDate
21      2021-01-02 00:00:00  2021-01-02 23:59:59
21      2021-01-03 00:00:00  2021-01-04 15:42:00
24      2021-01-02 00:00:00  2021-01-06 23:59:59
And you want to calculate the number of users represented on each day in a week, with a result like this:
Year  Week  NumberOfTimes
2021  1     8
2021  2     10
2021  3     4
Basically I want to do a SELECT like this:
SELECT YEAR(dateColumn) AS yearname, WEEK(dateColumn)as week name, COUNT(somecolumen)
GROUP BY YEAR(dateColumn) WEEK(dateColumn)
The problem I have is with the start and end dates: if a row spans several days, I want it counted once for each day. Preferably, the same user should not be counted twice on the same day. There are millions of rows that are constantly being deleted and added, so speed is key.
The database is MS-SQL 2019
I would suggest a recursive CTE:
with cte as (
      select userid, startdate, enddate
      from t
      union all
      -- step forward 7 days at a time until the row reaches the week of enddate
      select userid, dateadd(day, 7, startdate),
             enddate
      from cte
      where startdate < enddate and
            datepart(week, startdate) <> datepart(week, enddate)
     )
select year(startdate), datepart(week, startdate), count(*)
from cte
group by year(startdate), datepart(week, startdate)
option (maxrecursion 0);
The CTE expands the data by adding 7 days to each row. This should be one day per week.
There is a little logic in the second part to handle the situation where the enddate ends in the same week as the last start date. The above solution assumes that the dates are all in the same year -- which seems quite reasonable given the sample data. There are other ways to prevent this problem.
You need to cross-join each row with the relevant dates.
Create a calendar table with columns of years and weeks, include a start and end date of the week. See here for an example of how to create one, and make sure you index those columns.
Then you can cross-join like this
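SELECT
    c.Year AS yearname,
    c.Week AS weekname,
    COUNT(*) AS NumberOfTimes -- or COUNT(DISTINCT t.UserID) to count each user once per week
FROM YourTable t
-- a week matches when it overlaps the row's interval at all, not only when it is fully inside it
JOIN CalendarWeek c ON c.StartDate <= t.EndDate AND c.EndDate >= t.StartDate
GROUP BY c.Year, c.Week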
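For a count per user per day, the same calendar idea works at day granularity. The sketch below is only an illustration under stated assumptions: it runs SQLite through Python's sqlite3 module rather than SQL Server, builds the day calendar on the fly with a recursive CTE instead of a permanent indexed table, and numbers weeks with SQLite's strftime('%W') (Monday-based, days before the first Monday fall in week 00), which does not match T-SQL's DATEPART(week, ...) numbering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (UserID INTEGER, StartDate TEXT, EndDate TEXT)")
# Sample rows from the question.
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (21, "2021-01-02 00:00:00", "2021-01-02 23:59:59"),
    (21, "2021-01-03 00:00:00", "2021-01-04 15:42:00"),
    (24, "2021-01-02 00:00:00", "2021-01-06 23:59:59"),
])

weekly = conn.execute("""
    WITH RECURSIVE calendar(day) AS (
        -- one row per calendar day between the earliest start and the latest end
        SELECT date(min(StartDate)) FROM t
        UNION ALL
        SELECT date(day, '+1 day') FROM calendar
        WHERE day < (SELECT date(max(EndDate)) FROM t)
    ),
    user_days AS (
        -- DISTINCT keeps a user from being counted twice on the same day
        SELECT DISTINCT t.UserID, c.day
        FROM calendar c
        JOIN t ON c.day BETWEEN date(t.StartDate) AND date(t.EndDate)
    )
    SELECT strftime('%Y', day) AS yr,
           strftime('%W', day) AS wk,
           COUNT(*)            AS number_of_times
    FROM user_days
    GROUP BY yr, wk
    ORDER BY yr, wk
""").fetchall()

for row in weekly:
    print(row)
```

With the three sample rows there are eight distinct user-days, split evenly between the two weeks that the interval 2021-01-02 to 2021-01-06 touches.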

create a temporary sql table using recursion as a loop to populate custom time interval

Suppose you have a table like:
id subscription_start subscription_end segment
1 2016-12-01 2017-02-01 87
2 2016-12-01 2017-01-24 87
...
And wish to generate a temporary table with months.
One way would be to encode the month date as:
with months as (
select
'2016-12-01' as 'first',
'2016-12-31' as 'last'
union
select
'2017-01-01' as 'first',
'2017-01-31' as 'last'
...
) select * from months;
So that I have an output table like:
first_day last_day
2017-01-01 2017-01-31
2017-02-01 2017-02-28
2017-03-01 2017-03-31
I would like to generate a temporary table with a custom interval (above), without manually encoding all the dates.
Say the interval is of 12 months, for each year, for as many years there are in the db.
I'd like to have general approach to compute the months table with the same output as above.
Or, one may adjust the range to a custom interval (months split an year in 12 parts, but one may want to split a time in a custom interval of days).
To start, I was thinking to use recursive query like:
with months(id, first_day, last_day, month) as (
select
id,
first_day,
last_day,
0
where
subscriptions.first_day = min(subscriptions.first_day)
union all
select
id,
first_day,
last_day,
months.month + 1
from
subscriptions
left join months on cast(
strftime('%m', datetime(subscriptions.subscription_start)) as int
) = months.month
where
months.month < 13
)
select
*
from
months
where
month = 1;
but it does not do what I'd expect: here I was attempting to select the first row from the table with the minimum date, and populate a table at interval of months, ranging from 1 to 12. For each month, I was comparing the string date field of my table (e.g. 2017-03-01 = 3 is march).
The query above does not work and also seems a bit complicated, but for the sake of learning, which alternative would you propose to create a temporary table months without manually coding the intervals?
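One possible approach, sketched here rather than offered as a definitive answer, is a recursive CTE that starts at the first month found in the data and adds one month per step. The example assumes SQLite (the question uses strftime), run through Python's sqlite3 module so it is self-contained, and it assumes the table is called subscriptions with the two sample rows shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriptions (
        id INTEGER PRIMARY KEY,
        subscription_start TEXT,
        subscription_end TEXT,
        segment INTEGER
    );
    INSERT INTO subscriptions VALUES
        (1, '2016-12-01', '2017-02-01', 87),
        (2, '2016-12-01', '2017-01-24', 87);
""")

MONTHS_SQL = """
WITH RECURSIVE months(first_day) AS (
    -- anchor: first day of the earliest month present in the data
    SELECT date(min(subscription_start), 'start of month') FROM subscriptions
    UNION ALL
    -- step: move forward one month until the latest end date is passed
    SELECT date(first_day, '+1 month')
    FROM months
    WHERE date(first_day, '+1 month') <= (SELECT max(subscription_end) FROM subscriptions)
)
SELECT first_day,
       date(first_day, '+1 month', '-1 day') AS last_day
FROM months
ORDER BY first_day
"""

months = conn.execute(MONTHS_SQL).fetchall()
for first_day, last_day in months:
    print(first_day, last_day)
```

Computing last_day as "first day of the next month minus one day" avoids hard-coding month lengths. For a custom interval in days, the same shape should work with '+N days' in place of '+1 month'. To keep the result around, the query can be materialised by prefixing it with CREATE TEMP TABLE months AS.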

add running list of dates to sql query

This is the result of a query.
It is a calendar in essence.
I want to set the start date and have the Date field populated with a running list of dates, as in the example below, starting with a date I declare (2017-04-29 in the example).
Order is the order in which the Item is to be made.
Days is the number of days that item has been worked on (first day returns 1, second day: 2, and so on).
It is currently ordered by 'order' column then 'days' column
Current Output
Date Item Order Days
Null WP-1 1 1
Null SP1 2 1
Null SP1 2 2
Null WP-2 3 1
Desired Output
Date Item Order Days
2017-04-29 WP-1 1 1
2017-04-30 SP1 2 1
2017-05-01 SP1 2 2
2017-05-02 WP-2 3 1
I do have 'numbers' and 'dates' tables if they help
This is for SQL Server 2008
Thanks
Use row_number and add it to the specified start date.
declare @startdate date;
set @startdate = '2017-04-29';
select dateadd(day, row_number() over(order by [order], days) - 1, @startdate) as [date],
       item, [order], days
from yourtable