How to join tally dates with list of periods to retrieve wanted results? - sql

I have a list of tally dates that I want to combine with prices. The result should contain every date from the tally table, together with the price of the period that covers it (and a null price when no period covers the date).
Dates
Date
2017-12-22
2017-12-23
2017-12-24
2017-12-25
2017-12-26
2017-12-27
2017-12-28
2017-12-29
2017-12-30
2017-12-31
Prices
periodstart periodend price productID
2017-12-23 2017-12-25 50 1
2017-12-26 2017-12-29 10 1
Sql query result
date price productid
2017-12-22 null 1
2017-12-23 50 1
2017-12-24 50 1
2017-12-25 50 1
2017-12-26 10 1
2017-12-27 10 1
2017-12-28 10 1
2017-12-29 10 1
2017-12-30 null 1
2017-12-31 null 1
UPDATE
I added productID column in prices

rextester: http://rextester.com/ADJZSW20744
create table dbo.calendar (
[date] date primary key clustered
);
insert into dbo.calendar values
('2017-12-22'),('2017-12-23'),('2017-12-24')
,('2017-12-25'),('2017-12-26'),('2017-12-27')
,('2017-12-28'),('2017-12-29'),('2017-12-30')
,('2017-12-31');
create table prices (
periodstart date
, periodend date
, price int
, productid int
);
insert into prices values
('2017-12-23','2017-12-25',50,1)
,('2017-12-26','2017-12-29',10,1)
,('2017-12-22','2017-12-23',50,2)
,('2017-12-26','2017-12-27',10,2);
query: This will work with multiple products:
select
c.Date
, p.Price
, x.ProductId
from dbo.Calendar c
outer apply (
select distinct
ProductId
from prices
) x
left join dbo.Prices p on
c.Date >= p.PeriodStart
and c.Date <= p.PeriodEnd
and x.ProductId = p.ProductId
order by x.ProductId, c.Date;
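For readers without a SQL Server instance, here is a minimal sketch of the same join run in SQLite through Python's sqlite3 module. It is an illustration under assumptions, not the answer's exact query: SQLite has no OUTER APPLY, so the distinct product list is attached with a CROSS JOIN, and dates are stored as ISO text so that string comparison matches date order. Only the product 1 rows from the question are loaded.

```python
import sqlite3

# In-memory SQLite database; table and column names mirror the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (date TEXT PRIMARY KEY);
    INSERT INTO calendar VALUES
        ('2017-12-22'), ('2017-12-23'), ('2017-12-24'), ('2017-12-25'), ('2017-12-26'),
        ('2017-12-27'), ('2017-12-28'), ('2017-12-29'), ('2017-12-30'), ('2017-12-31');
    CREATE TABLE prices (periodstart TEXT, periodend TEXT, price INTEGER, productid INTEGER);
    INSERT INTO prices VALUES
        ('2017-12-23', '2017-12-25', 50, 1),
        ('2017-12-26', '2017-12-29', 10, 1);
""")

# Pair every date with every product first, then attach the price period
# that covers the date (if any) with a LEFT JOIN.
rows = conn.execute("""
    SELECT c.date, p.price, x.productid
    FROM calendar AS c
    CROSS JOIN (SELECT DISTINCT productid FROM prices) AS x
    LEFT JOIN prices AS p
           ON c.date BETWEEN p.periodstart AND p.periodend
          AND p.productid = x.productid
    ORDER BY x.productid, c.date
""").fetchall()

for row in rows:
    print(row)  # dates without a matching period print None as the price
```

Because the product id comes from the distinct list rather than from the prices row, it stays filled in even on dates where the price is null.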

A simple left join should do the trick
Select A.Date
,B.Price
From Dates A
Left Join Prices B on A.Date Between B.periodstart and B.periodend

Try this:
SELECT Date
, price
FROM Dates d
LEFT JOIN Prices p
ON d.Date BETWEEN p.periodstart AND ISNULL(p.periodend, d.Date)
To avoid conflicts in case your periods are intersecting or don't have an ending date, take the latest start period using an apply:
SELECT Date
, price
FROM Dates d
OUTER APPLY
(
SELECT TOP 1 price
FROM Prices p
WHERE d.Date BETWEEN p.periodstart AND ISNULL(p.periodend, d.Date)
ORDER BY p.periodstart DESC
) oa

Related

In SQL, is there a way to show all dates even if the date doesn't have data points?

I have a transaction table t as follows in MS SQL Management Studio:
If I run the following SQL to summarise the transaction:
Select
Format(Transaction_Date, 'MMM-yyyy') as 'Year/Month'
,Customer
,Count(Customer) as SalesCount
From t
Group by Format(Transaction_Date, 'MMM-yyyy'), Customer
Order by Customer, Format(Transaction_Date, 'MMM-yyyy')
I'll get:
However I was asked to add all the months for the year 2019 and if there's no transaction in a certain month then return 0 for the SalesCount column:
I tried to create a month table with all the months in 2019 and left join it with the transaction table, but it still returns the same result with no showing of the months without transactions.
Time table I created:
declare @StartDate date = '2019-01-01';
declare @EndDate date = '2020-01-01';
With cte as (
Select @StartDate AS myDate
Union All
Select Dateadd(Month,1,myDate)
From cte
Where Dateadd(Month,1,myDate) < @EndDate
)
,TimeTable as(
SELECT
year(myDate)
,Datename(Month,myDate)
,Format(myDate,'MMMM-yy') as 'Month-Year'
FROM cte
)
Select
tb.'Month-Year'
t.Format(Transaction_Date, 'MMM-yyyy') as Year/Month
,t.Customer
,t.Count(Customer) as SalesCount
From TimeTable tb
Left Join Transaction t on t.'Month/Year' = tb.'Month-Year'
Group by tb.'Month-Year', Format(Transaction_Date, 'MMM-yyyy'), Customer
Order by Customer, Format(Transaction_Date, 'MMM-yyyy')
Your help will be much appreciated!
You need to generate a row for every month. One method uses a recursive CTE. The rest is then a left join and aggregation.
Let me assume you are using SQL Server:
with months as (
select convert(date, '2019-01-01') as mon
union all
select dateadd(month, 1, mon)
from months
where mon < '2019-12-01'
)
select Format(m.mon, 'MMM-yyyy') as year_month,
c.Customer,
count(t.customer) as SalesCount
from months m cross join
(select distinct customer from t) c left join
t
on t.transaction_date >= m.mon and
t.transaction_date < dateadd(month, 1, mon) and
t.customer = c.customer
group by m.mon, c.customer
order by c.customer, m.mon;
Note the other changes to the query:
Year/Month is not a valid column alias.
This orders the rows chronologically. That is usually (always?) preferred over alphabetic sorting of months.
You can achieve this by cross joining a monthYear table with the distinct customer list.
declare @table table(trandate date,customer char(1))
insert into @table values
('2019-01-03','A'),
('2019-01-17','A'),
('2019-06-03','A'),
('2019-07-03','A'),
('2019-06-03','B'),
('2019-07-03','B');
;with monthYear AS
(
select * from
(
values
('Jan-19')
,('Feb-19')
,('Mar-19')
,('Apr-19')
,('May-19')
,('Jun-19')
,('Jul-19')
,('Aug-19')
,('Sep-19')
,('Oct-19')
,('Nov-19')
,('Dec-19')
) as t(mon)
)
SELECT my.mon,c.customer, isnull(t.salescount,0) as salescount FROM monthYear as my
CROSS JOIN (SELECT distinct customer from @table) as c
OUTER APPLY
(
select format(trandate,'MMM-yy') as MonthYear, count(customer) as salescount
from @table
where customer = c.customer
group by format(trandate,'MMM-yy')
having format(trandate,'MMM-yy') = my.mon
) as t
mon customer salescount
Jan-19 A 2
Feb-19 A 0
Mar-19 A 0
Apr-19 A 0
May-19 A 0
Jun-19 A 1
Jul-19 A 1
Aug-19 A 0
Sep-19 A 0
Oct-19 A 0
Nov-19 A 0
Dec-19 A 0
Jan-19 B 0
Feb-19 B 0
Mar-19 B 0
Apr-19 B 0
May-19 B 0
Jun-19 B 1
Jul-19 B 1
Aug-19 B 0
Sep-19 B 0
Oct-19 B 0
Nov-19 B 0
Dec-19 B 0
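Both answers rely on the same pattern: build the full month list, cross join it with the distinct customers, then left join the transactions so that empty months count as 0. The sketch below replays that pattern in SQLite via Python's sqlite3 module, using the six sample rows from the second answer. The SQLite spellings are my substitutions: date(mon, '+1 month') stands in for dateadd(month, 1, mon), and strftime('%Y-%m', ...) for Format, so the month label differs from the 'MMM-yy' shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (transaction_date TEXT, customer TEXT);
    INSERT INTO t VALUES
        ('2019-01-03', 'A'), ('2019-01-17', 'A'), ('2019-06-03', 'A'),
        ('2019-07-03', 'A'), ('2019-06-03', 'B'), ('2019-07-03', 'B');
""")

rows = conn.execute("""
    WITH RECURSIVE months(mon) AS (
        -- first day of each month in 2019
        SELECT '2019-01-01'
        UNION ALL
        SELECT date(mon, '+1 month') FROM months WHERE mon < '2019-12-01'
    )
    SELECT strftime('%Y-%m', m.mon) AS year_month,
           c.customer,
           COUNT(t.customer) AS sales_count   -- counts only matched rows, so 0 when none
    FROM months AS m
    CROSS JOIN (SELECT DISTINCT customer FROM t) AS c
    LEFT JOIN t
           ON t.transaction_date >= m.mon
          AND t.transaction_date < date(m.mon, '+1 month')
          AND t.customer = c.customer
    GROUP BY m.mon, c.customer
    ORDER BY c.customer, m.mon
""").fetchall()

for row in rows:
    print(row)
```

Counting t.customer rather than * matters here: COUNT(*) would report 1 for the months that have no transaction, because the left join still produces one row for them.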

SQL left join same column and table

I have customer order data and would like to analyse customer retention after a price change.
The order table is as follows:
customer_id order_number order_delivered_date
14156 R980193622 2/6/2020 14:51
1926396 R130222714 22/5/2020 11:02
1085123 R313065343 22/5/2020 14:50
699858 R693959049 8/6/2020 17:03
1609769 R195969327 3/6/2020 16:14
14156 R997103187 27/6/2020 14:01
1926396 R403942827 11/6/2020 14:42
1926396 R895013611 8/7/2020 17:04
So, I would like to pull the orders from the period before the new price and left join them, on customer_id, to the orders from the period after it. Assume the new price takes effect on 10/6/2020.
Before is a set of data dated 10/5/2020 00:00:00 to 9/6/2020 23:59:59 while After is a set of data dated 10/6/2020 00:00:00 to 9/7/2020 23:59:59.
The desired table:
Before After
14156 14156
1926396 1926396
1085123 Null
699858 Null
1609769 Null
If a customer_id appears in both columns, that customer is retained. It should be simple, but I have been stuck.
EDIT:
Here are a few queries I have tried:
First try:
select ol2.customer_id as before, ol.customer_id as after
from master.order_level ol,
left join master.order_level ol2
on ol2.customer_id = ol.customer_id
where order_delivered_date between '2020-05-10 00:00:00' and '2020-07-09 23:59:59' and country_id = 2
Second try:
SELECT ol.customer_id as before, ol2.customer_id as after
FROM master.order_level ol,master.order_level ol2
left join master.order_level
ON ol.customer_id = ol2.customer_id
WHERE ol.order_delivered_date between '2020-05-10 00:00:00' and '2020-06-09 23:59:59' and ol.country_id =2 and ol2.order_delivered_date between '2020-06-10 00:00:00' and '2020-07-09 23:59:59' and ol2.country_id =2
No need for a join: a simple GROUP BY with CASE and aggregate functions is enough. I also made a fiddle showing it in action.
SELECT customer_id,
CASE
WHEN MIN(order_delivered_date) < '3-15-2019' THEN customer_id
ELSE NULL END customer_before,
CASE
WHEN MAX(order_delivered_date) >= '3-15-2019' THEN customer_id
ELSE NULL END customer_after
FROM my_table
GROUP BY customer_id
This query will give you results like this:
customer_id customer_before customer_after
4 4 (null)
1 1 1
3 3 (null)
2 2 2
with before (customer_id) as
( select distinct customer_id from orders where order_delivered_date >= '2020-05-10' and order_delivered_date < '2020-06-10'
),
after (customer_id) as
(select distinct customer_id from orders where order_delivered_date >= '2020-06-10' and order_delivered_date < '2020-07-10')
select
before.customer_id,
after.customer_id
from before left outer join after on before.customer_id = after.customer_id
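A minimal sketch of this two-CTE approach, run in SQLite through Python's sqlite3 module against the eight sample rows from the question. Several details are my assumptions: the table is called orders, the d/m/yyyy timestamps are rewritten as ISO text so they compare correctly as strings, both windows are half-open ranges (start inclusive, end exclusive), and the CTEs are named before_cust and after_cust because BEFORE and AFTER are keywords in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, order_number TEXT, order_delivered_date TEXT);
    INSERT INTO orders VALUES
        (14156,   'R980193622', '2020-06-02 14:51'),
        (1926396, 'R130222714', '2020-05-22 11:02'),
        (1085123, 'R313065343', '2020-05-22 14:50'),
        (699858,  'R693959049', '2020-06-08 17:03'),
        (1609769, 'R195969327', '2020-06-03 16:14'),
        (14156,   'R997103187', '2020-06-27 14:01'),
        (1926396, 'R403942827', '2020-06-11 14:42'),
        (1926396, 'R895013611', '2020-07-08 17:04');
""")

rows = conn.execute("""
    WITH before_cust AS (
        SELECT DISTINCT customer_id FROM orders
        WHERE order_delivered_date >= '2020-05-10'
          AND order_delivered_date <  '2020-06-10'
    ),
    after_cust AS (
        SELECT DISTINCT customer_id FROM orders
        WHERE order_delivered_date >= '2020-06-10'
          AND order_delivered_date <  '2020-07-10'
    )
    SELECT b.customer_id AS before_id, a.customer_id AS after_id
    FROM before_cust AS b
    LEFT JOIN after_cust AS a ON a.customer_id = b.customer_id
    ORDER BY b.customer_id
""").fetchall()

for row in rows:
    print(row)  # after_id is None for customers who did not order again
```

Filtering each period inside its own CTE keeps the date conditions out of the outer WHERE clause, where a condition on the right-hand table would silently turn the left join into an inner join.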
you can use union
select customer_id as before, null as after
from #order
where order_delivered_date <'2020-06-10'
union
select null as before, customer_id as after
from #order
where order_delivered_date >='2020-06-10'
results

add missing month in sales

I have a sales table with below values.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-07-01,1234,9
2020-03-01,3241,8
2020-07-01,3241,4
As you can see, the first purchase for CustomerID = 1234 was in Jan 2020, and the first purchase for CustomerID = 3241 was in Mar 2020.
I want an output in which every missing month is filled in with a purchase quantity of 0.
That is, if a customer has no sale in some month between their first and last purchase, the output should be as below.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-02-01,1234,0
2020-03-01,1234,0
2020-04-01,1234,0
2020-05-01,1234,0
2020-06-01,1234,0
2020-07-01,1234,9
2020-03-01,3241,8
2020-04-01,3241,0
2020-05-01,3241,0
2020-06-01,3241,0
2020-07-01,3241,4
You can use a recursive query to create the missing dates per customer.
with recursive dates (customerid, transactiondate, max_transactiondate) as
(
select customerid, min(transactiondate), max(transactiondate)
from sales
group by customerid
union all
select customerid, dateadd(month, 1, transactiondate), max_transactiondate
from dates
where transactiondate < max_transactiondate
)
select
d.customerid,
d.transactiondate,
coalesce(s.quantity, 0) as quantity
from dates d
left join sales s on s.customerid = d.customerid and s.transactiondate = d.transactiondate
order by d.customerid, d.transactiondate;
This is a convenient place to use a recursive CTE. Assuming all your dates are on the first of the month:
with cr as (
select customerid, min(transactiondate) as mindate, max(transactiondate) as maxdate
from t
group by customerid
union all
select customerid, dateadd(month, 1, mindate), maxdate
from cr
where mindate < maxdate
)
select cr.customerid, cr.mindate as transactiondate, coalesce(t.quantity, 0) as quantity
from cr left join
t
on cr.customerid = t.customerid and
cr.mindate = t.transactiondate;
Here is a db<>fiddle.
Note that if you have more than 100 months to fill in, then you will need option (maxrecursion 0).
Also, this can easily be adapted if the dates are not all on the first of the month. But you would need to explain what the result set should look like in that case.
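To check the recursive approach against the expected output in the question, here is a minimal sketch in SQLite through Python's sqlite3 module. It assumes a table named sales, dates stored as ISO text on the first of the month, and date(x, '+1 month') as the SQLite counterpart of dateadd(month, 1, x).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (transactiondate TEXT, customerid INTEGER, quantity INTEGER);
    INSERT INTO sales VALUES
        ('2020-01-01', 1234, 5), ('2020-07-01', 1234, 9),
        ('2020-03-01', 3241, 8), ('2020-07-01', 3241, 4);
""")

rows = conn.execute("""
    WITH RECURSIVE dates(customerid, transactiondate, max_transactiondate) AS (
        -- anchor: each customer's first month, carrying the last month along
        SELECT customerid, MIN(transactiondate), MAX(transactiondate)
        FROM sales
        GROUP BY customerid
        UNION ALL
        -- step: add one month until the customer's last month is reached
        SELECT customerid, date(transactiondate, '+1 month'), max_transactiondate
        FROM dates
        WHERE transactiondate < max_transactiondate
    )
    SELECT d.transactiondate, d.customerid, COALESCE(s.quantity, 0) AS quantity
    FROM dates AS d
    LEFT JOIN sales AS s
           ON s.customerid = d.customerid
          AND s.transactiondate = d.transactiondate
    ORDER BY d.customerid, d.transactiondate
""").fetchall()

for row in rows:
    print(row)
```

Each customer gets their own date range here, so customer 3241 starts in March rather than January, which is what the question's expected output shows.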
[EDIT] Based on what others posted, I updated the code.
;with
min_date_cte(MinTransactionDate, MaxTransactionDate) as (
select min(TransactionDate), max(TransactionDate) from tsales),
unq_yrs_cte(year_int) as (
select distinct year(TransactionDate) from tsales),
unq_cust_cte(CustomerID) as (
select distinct CustomerID from tsales)
select datefromparts(uyc.year_int, v.month_int, 1) TransactionDate,
ucc.CustomerID,
isnull(t.Quantity, 0) Quantity
from min_date_cte mdc
cross join unq_yrs_cte uyc
cross join unq_cust_cte ucc
cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(month_int)
left join tsales t on datefromparts(uyc.year_int, v.month_int, 1)=t.TransactionDate
and ucc.CustomerID=t.CustomerId
where
datefromparts(uyc.year_int, v.month_int, 1)>=mdc.MinTransactionDate
and datefromparts(uyc.year_int, v.month_int, 1)<=mdc.MaxTransactionDate;
Results
TransactionDate CustomerID Quantity
2020-01-01 1234 5
2020-01-01 3241 0
2020-02-01 1234 0
2020-02-01 3241 0
2020-03-01 1234 0
2020-03-01 3241 8
2020-04-01 1234 0
2020-04-01 3241 0
2020-05-01 1234 0
2020-05-01 3241 0
2020-06-01 1234 0
2020-06-01 3241 0
2020-07-01 1234 9
2020-07-01 3241 4
You can make use of recursive query:
WITH cte1 as
(
select customerid, min([TransactionDate]) as Monthly_date, max([TransactionDate]) as end_date from calender_table
group by customerid
union all
select customerid, dateadd(month, 1, Monthly_date), end_date from cte1
where Monthly_date < end_date
)
select a.Monthly_date, a.customerid,coalesce(b.quantity, 0) from cte1 a left outer join calender_table b
on (a.Monthly_date = b.[TransactionDate] and a.customerid = b.customerid)
order by a.customerid, a.Monthly_date;

Aggregate a subtotal column based on two dates of that same row

Situation:
I have 5 columns
id
subtotal (price of item)
order_date (purchase date)
updated_at (if refunded or any other status change)
status
Objective:
I need the order date as column 1
I need the subtotal for each day, regardless of the status, as column 2
I need the subtotal amount for refunds for the third column.
Example:
If a purchase is made on May 1st and refunded on May 3rd. The output should look like this
+-------+----------+--------+
| date | subtotal | refund |
+-------+----------+--------+
| 05-01 | 10.00 | 0.00 |
| 05-02 | 00.00 | 0.00 |
| 05-03 | 00.00 | 10.00 |
+-------+----------+--------+
while the source row looks like this
+-----+----------+------------+------------+----------+
| id | subtotal | order_date | updated_at | status |
+-----+----------+------------+------------+----------+
| 123 | 10 | 2019-05-01 | 2019-05-03 | refunded |
+-----+----------+------------+------------+----------+
Query:
Currently what I have looks like this:
Note: Because of a timezone discrepancy, the dates are shifted back by 8 hours.
;with cte as (
select id as orderid
, CAST(dateadd(hour,-8,order_date) as date) as order_date
, CAST(dateadd(hour,-8,updated_at) as date) as updated_at
, subtotal
, status
from orders
)
select
b.dates
, sum(a.subtotal_price) as subtotal
, -- not sure how to aggregate it to get the refunds
from Orders as o
inner join cte as a on orders.id=cte.orderid
inner join (select * from cte where status = ('refund')) as b on o.id=cte.orderid
where dates between '2019-05-01' and '2019-05-31'
group by dates
And do I need to join it twice? Hopefully not since my table is huge.
This looks like a job for a Calendar Table. Bit of a stab in the dark, but:
--Overly simplistic Calendar table
CREATE TABLE dbo.Calendar (CalendarDate date);
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1, N N2, N N3, N N4, N N5) --Many years of data
INSERT INTO dbo.Calendar
SELECT DATEADD(DAY, T.I, 0)
FROM Tally T;
GO
SELECT C.CalendarDate AS [date],
CASE C.CalendarDate WHEN V.order_date THEN subtotal ELSE 0 END AS subtotal,
CASE WHEN C.CalendarDate = V.updated_at AND V.[status] = 'refunded' THEN subtotal ELSE 0.00 END AS refund
FROM (VALUES(123,10.00,CONVERT(date,'20190501'),CONVERT(date,'20190503'),'refunded'))V(id,subtotal,order_date,updated_at,status)
JOIN dbo.Calendar C ON V.order_date <= C.CalendarDate AND V.updated_at >= C.CalendarDate;
GO
DROP TABLE dbo.Calendar;
Consider joining on a recursive CTE of sequential dates:
WITH dates AS (
SELECT CONVERT(datetime, '2019-01-01') AS rec_date
UNION ALL
SELECT DATEADD(d, 1, CONVERT(datetime, rec_date))
FROM dates
WHERE rec_date < '2019-12-31'
),
cte AS (
SELECT id AS orderid
, CAST(dateadd(hour,-8,order_date) AS date) as order_date
, CAST(dateadd(hour,-8,updated_at) AS date) as updated_at
, subtotal
, status
FROM orders
)
SELECT rec_date AS date,
CASE
WHEN c.order_date = d.rec_date THEN subtotal
ELSE 0
END AS subtotal,
CASE
WHEN c.updated_at = d.rec_date THEN subtotal
ELSE 0
END AS refund
FROM cte c
JOIN dates d ON d.rec_date BETWEEN c.order_date AND c.updated_at
WHERE c.status = 'refund'
option (maxrecursion 0)
GO
Rextester demo
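If one output row per day with summed amounts is what is needed, the date list can also be left joined to orders once and aggregated with conditional sums. The following is only a sketch in SQLite via Python's sqlite3 module, using the single sample row from the question. It assumes the status value is 'refunded' as in that row, and it leaves out the 8-hour timezone shift to stay short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, subtotal REAL, order_date TEXT,
                         updated_at TEXT, status TEXT);
    INSERT INTO orders VALUES (123, 10.0, '2019-05-01', '2019-05-03', 'refunded');
""")

rows = conn.execute("""
    WITH RECURSIVE dates(day) AS (
        SELECT '2019-05-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM dates WHERE day < '2019-05-03'
    )
    SELECT d.day,
           -- money taken on the day the order was placed
           COALESCE(SUM(CASE WHEN o.order_date = d.day THEN o.subtotal END), 0) AS subtotal,
           -- money returned on the day the order was marked refunded
           COALESCE(SUM(CASE WHEN o.updated_at = d.day AND o.status = 'refunded'
                             THEN o.subtotal END), 0) AS refund
    FROM dates AS d
    LEFT JOIN orders AS o
           ON o.order_date = d.day OR o.updated_at = d.day
    GROUP BY d.day
    ORDER BY d.day
""").fetchall()

for row in rows:
    print(row)
```

The orders table appears only once in this query; the two CASE expressions decide which column each matched row contributes to.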

SQL spread month value into weeks

I have a table of values by month and I want to spread these values across weeks. A week that spans two months needs to take part of the value from each month, weighted by the number of its days that fall in each month.
For example I have the table with a different price of steel by month
Product Month Price
------------------------------------
Steel 1/Jan/2014 100
Steel 1/Feb/2014 200
Steel 1/Mar/2014 300
I need to convert it into weeks as follows
Product Week Price
-------------------------------------------
Steel 06-Jan-14 100
Steel 13-Jan-14 100
Steel 20-Jan-14 100
Steel 27-Jan-14 128.57
Steel 03-Feb-14 200
Steel 10-Feb-14 200
Steel 17-Feb-14 200
As you see above, the week that overlaps between Jan and Feb needs to be calculated as follows
(100*5/7)+(200*2/7)
This takes into account that the week of the 27th has 5 days that fall in Jan and 2 that fall in Feb.
Is there any possible way to create a query in SQL that would achieve this?
I tried the following
First attempt:
select
WD.week,
PM.PRICE,
DATEADD(m,1,PM.Month),
SUM(PM.PRICE/7) * COUNT(*)
from
( select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE
)PM
join
( select '2014-1-20' as week
union
select '2014-1-27' as week
union
select '2014-2-3' as week
)WD
ON WD.week>=PM.Month
AND WD.week < DATEADD(m,1,PM.Month)
group by
WD.week,PM.PRICE, DATEADD(m,1,PM.Month)
This gives me the following
week PRICE
2014-1-20 100 2014-02-01 00:00:00.000 14
2014-1-27 100 2014-02-01 00:00:00.000 14
2014-2-3 200 2014-03-01 00:00:00.000 28
I tried also the following
;with x as (
select price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100 as PRICE
union
select '2014-2-1' as Month, 200 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n <= ndm.nd
)
select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
This gives me the following
week price
2014-01-07 00:00:00.000 100.00
2014-01-14 00:00:00.000 100.00
2014-01-21 00:00:00.000 100.00
2014-02-04 00:00:00.000 200.00
2014-02-11 00:00:00.000 200.00
2014-02-18 00:00:00.000 200.00
Thanks
If you have a calendar table it's a simple join:
SELECT
product,
calendar_date - (day_of_week-1) AS week,
SUM(price / 7.0) AS price
FROM prices AS p
JOIN calendar AS c
ON c.calendar_date >= month
AND c.calendar_date < DATEADD(m,1,month)
GROUP BY product,
calendar_date - (day_of_week-1)
This could be further simplified to join only to Mondays and then do some more date arithmetic in a CASE to get 7 or fewer days.
Edit:
Your last query returned Jan 31st twice; you need to remove the = from on n.n <= ndm.nd so that it reads on n.n < ndm.nd. And as you seem to work with ISO weeks, you had better change the DATEPART to isowk to avoid problems with different DATEFIRST settings.
Based on your last query I created a fiddle.
;with x as (
select price,
datepart(isowk,dateadd(day, n.n, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE) t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from
(select '2014-1-1' as Month, 100.00 as PRICE
union
select '2014-2-1' as Month, 200.00 as PRICE)
t
where t1.month = t.month) ndm
inner join
(SELECT (a.Number * 256) + b.Number AS N FROM
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) a (Number),
(SELECT number FROM master..spt_values WHERE type = 'P' AND number <= 255) b (Number)) n --numbers
on n.n < ndm.nd
) select min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by wk
having count(*) = 7
order by wk
Of course the dates might span multiple years, so you need to GROUP BY the year, too.
Actually, you need to spread it over days, and then get the averages by week. To get the days we'll use the Numbers table.
;with x as (
select product, price,
datepart(week,dateadd(day, n.n-2, t1.month)) wk,
dateadd(day, n.n-1, t1.month) dt
from #t t1
cross apply (
select datediff(day, t.month, dateadd(month, 1, t.month)) nd
from #t t
where t1.month = t.month and t1.product = t.product) ndm
inner join numbers n on n.n <= ndm.nd
)
select product, min(dt) as week, cast(sum(price)/count(*) as decimal(9,2)) as price
from x
group by product, wk
having count(*) = 7
order by product, wk
The result of datepart(week,dateadd(day, n.n-2, t1.month)) expression depends on SET DATEFIRST so you might need to adjust accordingly.
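The day-spreading idea behind these answers can be checked outside the database. Here is a small Python sketch, assuming weeks start on Monday as in the question's expected output; the helper name weekly_price is made up for illustration. It gives each day the price of its month and averages over the seven days of the week.

```python
from datetime import date, timedelta

# Monthly steel prices from the question, keyed by (year, month).
monthly_price = {(2014, 1): 100.0, (2014, 2): 200.0, (2014, 3): 300.0}


def weekly_price(monday: date) -> float:
    """Average the monthly price over the 7 days starting at `monday`."""
    days = [monday + timedelta(days=i) for i in range(7)]
    return sum(monthly_price[(d.year, d.month)] for d in days) / 7


for monday in (date(2014, 1, 20), date(2014, 1, 27), date(2014, 2, 3)):
    print(monday, round(weekly_price(monday), 2))
```

For the week of 27 Jan this is (100*5 + 200*2) / 7, which rounds to the 128.57 requested in the question.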