Joining on the same key on the next row - sql

Suppose we have a table which contains customer_id, order_date, and ship_date. A reorder of the product occurs when the same customer's next order_date is within 30 days of the last ship_date.
select * from mytable
customer_id order_date ship_date
1 2017-08-04 2017-08-09
1 2017-09-01 2017-09-05
2 2017-02-02 2017-03-01
2 2017-04-05 2017-04-09
2 2017-04-15 2017-04-19
3 2018-02-02 2018-03-01
Requested: Reorders
customer_id order_date ship_date
1 2017-09-01 2017-09-05
2 2017-04-15 2017-04-19
How can I retrieve only the reorder records, i.e. the rows where the same customer's next order_date is within 30 days of the previous ship_date?

You can use exists as follows:
Select * from your_table t
Where exists (select 1 from your_table tt
Where tt.customer_id = t.customer_id
And t.order_date > tt.ship_date
and t.order_date <= dateadd(day, 30, tt.ship_date))

One method is lead():
select t.customer_id, t.order_date, t.next_ship_date
from (select t.*,
lead(order_date) over (partition by customer_id order by order_date) as next_order_date,
lead(ship_date) over (partition by customer_id order by order_date) as next_ship_date
from t
) t
where next_order_date < dateadd(day, 30, ship_date);
EDIT:
If you want the "reorder" row, just use lag():
select t.*
from (select t.*,
lag(ship_date) over (partition by customer_id order by order_date) as prev_ship_date
from t
) t
where prev_ship_date > dateadd(day, -30, order_date);
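The lag() idea can be tried end to end on the sample data from the question. This is only a sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in database, so date(x, '+30 days') takes the place of dateadd, and "within 30 days" is read as inclusive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (customer_id int, order_date text, ship_date text)")
conn.executemany("insert into mytable values (?, ?, ?)", [
    (1, "2017-08-04", "2017-08-09"),
    (1, "2017-09-01", "2017-09-05"),
    (2, "2017-02-02", "2017-03-01"),
    (2, "2017-04-05", "2017-04-09"),
    (2, "2017-04-15", "2017-04-19"),
    (3, "2018-02-02", "2018-03-01"),
])

# For each row, look back at the same customer's previous ship_date and keep
# the row only if its order_date falls within 30 days of that ship_date.
reorders = conn.execute("""
    select customer_id, order_date, ship_date
    from (select t.*,
                 lag(ship_date) over (partition by customer_id
                                      order by order_date) as prev_ship_date
          from mytable t) t
    where prev_ship_date is not null
      and order_date <= date(prev_ship_date, '+30 days')
    order by customer_id, order_date
""").fetchall()

print(reorders)
# [(1, '2017-09-01', '2017-09-05'), (2, '2017-04-15', '2017-04-19')]
```

Unlike the exists version, this compares each order only with the immediately preceding shipment of the same customer.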

Related

Select max date per year

I have the table as follows:
user_id date
1 2020-11-15
1 2020-10-15
1 2020-09-15
1 2019-12-15
1 2019-11-15
2 2020-11-15
2 2020-10-15
2 2019-12-15
3 2020-10-15
3 2020-09-15
And I'd like to select the max date for every year per user, so the result would be like:
user_id date
1 2020-11-15
1 2019-12-15
2 2020-11-15
2 2019-12-15
3 2020-10-15
Some help?
Thank you!
Just use aggregation:
select user_id, max(date)
from t
group by user_id, date_trunc('year', date);
If you have more columns that you want, then use distinct on:
select distinct on (user_id, date_trunc('year', date)) t.*
from t
order by user_id, date_trunc('year', date), date desc;
You can use not exists as follows:
Select t.*
From your_table t
Where not exists (select 1 from your_table tt
Where t.user_id = tt.user_id
And date_trunc('year', t.date) = date_trunc('year', tt.date)
And tt.date > t.date)
Or you can use row_number analytical function as follows:
Select * from
(Select t.*,
Row_number() over (partition by t.user_id, date_trunc('year', t.date)
order by t.date desc) as rn
From your_table t) t
Where rn = 1
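The aggregation approach can be checked against the sample rows. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in, where strftime('%Y', date) plays the role of date_trunc('year', date):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (user_id int, date text)")
conn.executemany("insert into t values (?, ?)", [
    (1, "2020-11-15"), (1, "2020-10-15"), (1, "2020-09-15"),
    (1, "2019-12-15"), (1, "2019-11-15"),
    (2, "2020-11-15"), (2, "2020-10-15"), (2, "2019-12-15"),
    (3, "2020-10-15"), (3, "2020-09-15"),
])

# One group per (user, calendar year); max() picks the latest date in each.
max_per_year = conn.execute("""
    select user_id, max(date) as max_date
    from t
    group by user_id, strftime('%Y', date)
    order by user_id, max_date desc
""").fetchall()

for row in max_per_year:
    print(row)
# (1, '2020-11-15')
# (1, '2019-12-15')
# (2, '2020-11-15')
# (2, '2019-12-15')
# (3, '2020-10-15')
```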

How to group data after every change, without merging groups even if a value repeats later, in SQL

I have to group data based on the amount column, but if an amount repeats after some interval, it should be treated as a new group. For example:
CREATE TABLE [dbo].[TEST](
[ID] [INT] NULL,
[DLRCODE] [VARCHAR](20) NULL,
[AMN] [DECIMAL](21, 5) NULL,
[RATE] [DECIMAL](7, 5) NULL,
[DTE] [DATETIME] NULL
) ON [NFS_DATA]
-----this should be first group
1 123 10.00000 5.00000 2019-11-01 00:00:00.000
2 123 10.00000 5.00000 2019-11-02 00:00:00.000
3 123 10.00000 5.00000 2019-11-03 00:00:00.000
-----this should be second group
4 123 15.00000 5.00000 2019-11-04 00:00:00.000
-----this should be third group
5 123 10.00000 5.00000 2019-11-05 00:00:00.000
6 123 10.00000 5.00000 2019-11-06 00:00:00.000
-----this should be fourth group
7 123 20.00000 5.00000 2019-11-07 15:02:07.537
As you can see from the code and data above, a new group should be created every time the amount changes.
The result should look like this:
1 30 --- group of the first three records
2 15 --- group of the fourth record
3 20 --- group of the fifth and sixth records
4 20 --- group of the seventh record
You can do this by using a combination of LAG and conditional aggregation:
WITH CTE AS
(
SELECT Id
, DLRCode
, Amn
, Rate
, DTE
, ISNULL(LAG(Amn) OVER(ORDER BY DTE), Amn) As PreviousAmount
FROM dbo.Test
)
SELECT Id
, DLRCode
, Amn
, Rate
, DTE
, SUM(IIF(Amn = PreviousAmount, 0, 1)) OVER(ORDER BY DTE) As Grp
FROM CTE
To get your result set, you only need lag(), taking both the date and the amount into account:
select t.*
from (select t.*,
lag(amn) over (partition by dlrcode, rate order by dte) as prev_amn,
lag(dte) over (partition by dlrcode, rate order by dte) as prev_dte
from test t
) t
where prev_amn is null or
prev_amn <> amn or
prev_dte < dateadd(day, -1, dte);
If you want to incorporate this into a group id and then summarize the groups -- with information from multiple rows -- then we'll add a group id as the cumulative sum of the group changes and aggregate:
select dlrcode, rate, amn, min(dte), max(dte),
count(*)
from (select t.*,
sum(case when prev_amn = amn and prev_dte >= dateadd(day, -1, dte)
then 0 else 1
end) over (partition by dlrcode, rate order by dte) as grp
from (select t.*,
lag(amn) over (partition by dlrcode, rate order by dte) as prev_amn,
lag(dte) over (partition by dlrcode, rate order by dte) as prev_dte
from test t
) t
) t
group by dlrcode, rate, amn, grp;
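To get the totals requested in the question (30, 15, 20, 20), the change flag from lag() can be turned into a running group id and then summed. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for SQL Server, only the amount is used to detect a change (no date-gap check), and the CTE names flagged and grouped are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (id int, dlrcode text, amn real, rate real, dte text)")
conn.executemany("insert into test values (?, ?, ?, ?, ?)", [
    (1, "123", 10.0, 5.0, "2019-11-01 00:00:00"),
    (2, "123", 10.0, 5.0, "2019-11-02 00:00:00"),
    (3, "123", 10.0, 5.0, "2019-11-03 00:00:00"),
    (4, "123", 15.0, 5.0, "2019-11-04 00:00:00"),
    (5, "123", 10.0, 5.0, "2019-11-05 00:00:00"),
    (6, "123", 10.0, 5.0, "2019-11-06 00:00:00"),
    (7, "123", 20.0, 5.0, "2019-11-07 15:02:07"),
])

group_totals = conn.execute("""
    with flagged as (
        -- 1 marks the first row of a new run (including the very first row,
        -- where lag() is null and the comparison is not true)
        select id, amn, dte,
               case when amn = lag(amn) over (order by dte) then 0 else 1 end as is_start
        from test
    ),
    grouped as (
        -- the running sum of the flags numbers the runs 1, 2, 3, ...
        select id, amn, sum(is_start) over (order by dte) as grp
        from flagged
    )
    select grp, sum(amn) as total
    from grouped
    group by grp
    order by grp
""").fetchall()

print(group_totals)
# [(1, 30.0), (2, 15.0), (3, 20.0), (4, 20.0)]
```

Note that the running sum needs an order by in its window; without it every row would get the same total and all runs would collapse into one group.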

Find the start and end date of stock difference

Please suggest a good SQL query to find the start and end date of each run of the same stock value.
Imagine I have data in a table like the one below.
Sample_table
transaction_date stock
2018-12-01 10
2018-12-02 10
2018-12-03 20
2018-12-04 20
2018-12-05 20
2018-12-06 20
2018-12-07 20
2018-12-08 10
2018-12-09 10
2018-12-10 30
Expected result should be
Start_date end_date stock
2018-12-01 2018-12-02 10
2018-12-03 2018-12-07 20
2018-12-08 2018-12-09 10
2018-12-10 null 30
This is a gaps-and-islands problem. You can use row_number and group by for this.
select t.stock, min(transaction_date), max(transaction_date)
from (
select row_number() over (order by transaction_date) -
row_number() over (partition by stock order by transaction_date) grp,
transaction_date,
stock
from data
) t
group by t.grp, t.stock
In the following DBFIDDLE DEMO I also handle the null value for the last group, but the main idea of finding consecutive rows is built on the above query.
You may check this for an explanation of this solution.
You can try below using row_number()
select stock,min(transaction_date) as start_date,
case when min(transaction_date)=max(transaction_date) then null else max(transaction_date) end as end_date
from
(
select *,row_number() over(order by transaction_date)-
row_number() over(partition by stock order by transaction_date) as rn
from t1
)A group by stock,rn
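Try to use GROUP BY with MIN and MAX (this only works if each stock value occurs in a single consecutive run; in the sample data the two separate runs of stock 10 would be merged into one row):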
SELECT
stock,
MIN(transaction_date) Start_date,
CASE WHEN COUNT(*)>1 THEN MAX(transaction_date) END end_date
FROM Sample_table
GROUP BY stock
ORDER BY stock
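You can try with LEAD, LAG functions as below (here EndDate is the start date of the next run, not the last date of the current one):
select currentStockDate as startDate,
LEAD(currentStockDate,1) over(order by currentStockDate) as EndDate,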
currentStock
from
(select *
from
(select
LAG(transaction_date,1) over(order by transaction_date) as prevStockDate,
transaction_date as CurrentstockDate,
LAG(stock,1) over(order by transaction_date) as prevStock,
stock as currentStock
from sample_table) as t
where (prevStock <> currentStock) or (prevStock is null)
) as t2
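The row_number difference trick from the first answers can be run against the sample data to confirm it keeps the two runs of stock 10 apart. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in, and returning null as end_date for a run that has only one row (as in the expected result):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sample_table (transaction_date text, stock int)")
conn.executemany("insert into sample_table values (?, ?)", [
    ("2018-12-01", 10), ("2018-12-02", 10),
    ("2018-12-03", 20), ("2018-12-04", 20), ("2018-12-05", 20),
    ("2018-12-06", 20), ("2018-12-07", 20),
    ("2018-12-08", 10), ("2018-12-09", 10),
    ("2018-12-10", 30),
])

# Within one consecutive run of the same stock, the overall row number and the
# per-stock row number grow in step, so their difference is constant for the run.
islands = conn.execute("""
    select min(transaction_date) as start_date,
           case when count(*) > 1 then max(transaction_date) end as end_date,
           stock
    from (select transaction_date, stock,
                 row_number() over (order by transaction_date)
                 - row_number() over (partition by stock
                                      order by transaction_date) as grp
          from sample_table) t
    group by stock, grp
    order by start_date
""").fetchall()

for row in islands:
    print(row)
# ('2018-12-01', '2018-12-02', 10)
# ('2018-12-03', '2018-12-07', 20)
# ('2018-12-08', '2018-12-09', 10)
# ('2018-12-10', None, 30)
```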

Additional condition within partition over

https://www.db-fiddle.com/f/rgLXTu3VysD3kRwBAQK3a4/3
My problem is that I want the partition over function to start counting rows only from a certain time range.
In this example, if I add rn = 1 at the end, order_id = 5 is excluded from the results (because the partition is ordered by paid_date and order_id = 6 has an earlier date), but it shouldn't be, because I want the time range for the partition to start from '2019-01-10'.
After adding the condition rn = 1, the expected output is order_id 3, 5, 11, 15; currently it is only 3, 11, 15.
It should include only orders with is_paid = 0 that are the first within the given time range (if there is a preceding order with is_paid = 1, it shouldn't be counted).
Use a correlated subquery with not exists:
SELECT order_id, customer_id, amount, is_paid, paid_date, rn FROM (
SELECT o.*,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY paid_date,order_id) rn
FROM orders o
WHERE paid_date between '2019-01-10'
and '2019-01-15'
) x where rn=1 and not exists (select 1 from orders o1 where x.order_id=o1.order_id
and is_paid=1)
OUTPUT:
order_id customer_id amount is_paid paid_date rn
3 101 30 0 10/01/2019 00:00:00 1
5 102 15 0 10/01/2019 00:00:00 1
11 104 31 0 10/01/2019 00:00:00 1
15 105 11 0 10/01/2019 00:00:00 1
If priority should be given to order_id, put it before paid_date in the order by clause of the window function. This will solve your issue.
SELECT order_id, customer_id, amount, is_paid, paid_date, rn FROM (
SELECT o.*,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY order_id,paid_date) rn
FROM orders o
) x WHERE is_paid = 0 and paid_date between
'2019-01-10' and '2019-01-15' and rn=1
Since you need the paid date to be ordered first, you need to apply a where condition inside the subquery that is being partitioned, so that dates outside the range do not interfere with the row numbering.
SELECT order_id, customer_id, amount, is_paid, paid_date, rn FROM (
SELECT o.*,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY paid_date, order_id) rn
FROM orders o
where paid_date between '2019-01-10' and '2019-01-15'
) x WHERE is_paid = 0 and rn=1
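The difference between filtering after and before the row numbering can be seen on a small example. This is only a sketch: the rows below are made up for illustration (the fiddle data is not reproduced here), and SQLite via Python's sqlite3 is assumed as a stand-in database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table orders
                (order_id int, customer_id int, amount int, is_paid int, paid_date text)""")
# Hypothetical rows: customer 102 has an order (6) dated before the time range.
conn.executemany("insert into orders values (?, ?, ?, ?, ?)", [
    (3, 101, 30, 0, "2019-01-10"),
    (4, 101, 10, 1, "2019-01-12"),
    (5, 102, 15, 0, "2019-01-10"),
    (6, 102, 20, 0, "2019-01-05"),
    (7, 103, 5, 1, "2019-01-11"),
    (8, 103, 9, 0, "2019-01-12"),
])

# Date filter applied after numbering: order 6 takes rn = 1 for customer 102,
# so order 5 never qualifies.
filter_outside = [r[0] for r in conn.execute("""
    select order_id
    from (select o.*,
                 row_number() over (partition by customer_id
                                    order by paid_date, order_id) as rn
          from orders o) x
    where is_paid = 0
      and paid_date between '2019-01-10' and '2019-01-15'
      and rn = 1
    order by order_id
""")]

# Date filter applied before numbering: only rows in the range are numbered.
filter_inside = [r[0] for r in conn.execute("""
    select order_id
    from (select o.*,
                 row_number() over (partition by customer_id
                                    order by paid_date, order_id) as rn
          from orders o
          where paid_date between '2019-01-10' and '2019-01-15') x
    where is_paid = 0 and rn = 1
    order by order_id
""")]

print(filter_outside)  # [3]
print(filter_inside)   # [3, 5]
```

Customer 103 is left out in both cases because its first order inside the range is already paid.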

SQL Server Select the most recent past date if no future date available

I have a table structure as below,
CREATE TABLE #CustOrder ( CustId INT, OrderDate DATE )
INSERT #CustOrder ( CustId, OrderDate )
VALUES ( 1, '2016-11-01' ),
( 1, '2019-09-01' ),
( 2, '2019-07-01' ),
( 2, '2019-11-01' ),
( 3, '2017-01-01' ),
( 4, '2016-12-01' ),
( 4, '2017-01-01' )
I want to list each customer with their next future order date. If a customer does not have a future order, I want to list their most recent order instead. I have the following query.
; WITH LastOrder AS
(
SELECT
CO.CustId,
CO.OrderDate,
ROW_NUMBER() OVER(PARTITION BY CO.CustId ORDER BY ABS(DATEDIFF(DAY, CO.OrderDate, GETUTCDATE()))) AS RowNum
FROM #CustOrder AS CO
)
SELECT LO.CustId, LO.OrderDate
FROM LastOrder AS LO
WHERE LO.RowNum = 1
This query gives me the result as,
CustId | OrderDate
--------+-------------
1 | 2016-11-01
2 | 2019-07-01
3 | 2017-01-01
4 | 2017-01-01
However, I need the result as,
CustId | OrderDate
--------+-------------
1 | 2019-09-01
2 | 2019-07-01
3 | 2017-01-01
4 | 2017-01-01
As
Customer 1 has a future order on 2019-09-01
Customer 2 has two future orders but the first one is on 2019-07-01
Customer 3 has no more than 1 order, it should just return 2017-01-01
Customer 4 has two past orders but the most recent is 2017-01-01
rextester: http://rextester.com/PBKNA95127
CREATE TABLE #CustOrder ( CustId INT, OrderDate DATE )
INSERT #CustOrder ( CustId, OrderDate )
VALUES ( 1, '2016-11-01' ),
( 1, '2019-09-01' ),
( 2, '2019-07-01' ),
( 2, '2019-11-01' ),
( 3, '2017-01-01' ),
( 4, '2016-12-01' ),
( 4, '2017-01-01' )
; WITH LastOrder AS
(
SELECT
CO.CustId,
CO.OrderDate,
ROW_NUMBER() OVER(PARTITION BY CO.CustId
ORDER BY case when co.OrderDate > getdate() then 0 else 1 end
, abs(DATEDIFF(DAY, getdate(),CO.OrderDate)) asc
) AS RowNum
FROM #CustOrder AS CO
)
SELECT LO.CustId, LO.OrderDate
FROM LastOrder AS LO
WHERE LO.RowNum = 1
results:
+--------+------------+
| CustId | OrderDate |
+--------+------------+
| 1 | 2019-09-01 |
| 2 | 2019-07-01 |
| 3 | 2017-01-01 |
| 4 | 2017-01-01 |
+--------+------------+
You can use the MAX function to check if the latest date is in the future. If so, get the MIN date after today using MIN. Else get the latest date.
SELECT CUSTID,OrderDate
FROM (SELECT CustId,
OrderDate,
CASE WHEN MAX(orderdate) OVER(PARTITION BY CustId) > GETUTCDATE()
THEN MIN(case when orderdate >getutcdate() then orderdate end) OVER(PARTITION BY CustId)
ELSE MAX(orderdate) OVER(PARTITION BY CustId) end as latest_date
FROM #CustOrder) T
WHERE latest_date=orderDate
Min, Max, UNION approach
select custID, MIN(OrderDate)
from #CustOrder
where OrderDate > '2017-02-17'
group by custID
union all
select co1.custID, max(co1.OrderDate)
from #CustOrder co1
where not exists ( select 1
from #CustOrder co2
where co2.CustId = co1.CustId
and co2.OrderDate > '2017-02-17'
)
group by co1.custID
Start your ORDER BY with a CASE expression that prefers future over past, and then use the ABS DATEDIFF (like you have now) as the second condition in the ORDER BY.
Maybe create another column and use the LAG() window function to grab the previous date, and then put a conditional/case expression in the select list? https://msdn.microsoft.com/en-us/library/hh231256.aspx
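The "CASE first, then distance" ordering can be tested with a fixed reference date so the result does not depend on when it is run. A minimal sketch under assumptions: SQLite via Python's sqlite3 stands in for SQL Server, julianday() differences replace DATEDIFF, and "today" is pinned to 2017-02-17 (the date hard-coded in the UNION answer) through a named parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table custorder (custid int, orderdate text)")
conn.executemany("insert into custorder values (?, ?)", [
    (1, "2016-11-01"), (1, "2019-09-01"),
    (2, "2019-07-01"), (2, "2019-11-01"),
    (3, "2017-01-01"),
    (4, "2016-12-01"), (4, "2017-01-01"),
])

# Future orders sort ahead of past ones; within each bucket the order closest
# to :today comes first, so rn = 1 is the next future order or, failing that,
# the most recent past order.
next_or_last = conn.execute("""
    select custid, orderdate
    from (select custid, orderdate,
                 row_number() over (
                     partition by custid
                     order by case when orderdate > :today then 0 else 1 end,
                              abs(julianday(orderdate) - julianday(:today))
                 ) as rn
          from custorder) lo
    where rn = 1
    order by custid
""", {"today": "2017-02-17"}).fetchall()

print(next_or_last)
# [(1, '2019-09-01'), (2, '2019-07-01'), (3, '2017-01-01'), (4, '2017-01-01')]
```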