How to get next to last MAX date with join - sql

I'm working on a query in MS SQL to show items that have been in our "service center" more than once within the last 30 days. The data needed is from the service order prior to the most recent service order, based on serial number. So if an item has been received in the last 30 days, check to see if it was received at a previous time within the last 30 days.
ServiceOrders table: CustID, ItemID, DateReceived
ItemMaster table: CustID, ItemID, SerialNumber
I can get DateReceived items by using
ServiceOrders.DateReceived >= DATEADD(month,-1,GETDATE())
I could load service orders from the past month into a temp table, and then query against that to get the prior service order, but that doesn't sound like the best plan. Any ideas on an efficient way to get the previous service orders?
Example data
ServiceOrders table:
CustID ItemID DateReceived
1 2 9/26/2016
1 2 9/05/2016
1 2 1/15/2015
5 6 9/20/2016
7 6 9/02/2016
ItemMaster table:
CustID ItemID SerialNumber
1 2 8675309
5 6 101
7 6 101
So in the above example, SerialNumber 8675309 and 101 have been received more than once in the last 30 days. I need the data from ServiceOrders and ItemMaster for the DateReceived 9/05/2016 and 09/02/2016 records (the second most recent within 30 days). There are other fields in both tables, but they're simplified here. CustID won't necessarily stay the same from date to date, as the item can be transferred. SerialNumber is the key.

Filter last month's orders into a common table expression (cte) and number them per item in descending date order. Then select the items with more than one occurrence into cte2, and join both CTEs, selecting the second row.
;With cte as(
Select row_number() over(PARTITION by ItemID order by DateReceived desc) as RowNum, *
from ServiceOrders
where DateReceived >= DateAdd(Month, -1, Getdate())
), cte2 as(
Select
ItemID From cte
Group by ItemID
Having count(*)>1
)
select b.*, c.SerialNumber from cte2 as a
left join cte as b on a.ItemID= b.ItemID and b.RowNum=2
left join ItemMaster as c on b.ItemID=c.ItemID and b.CustID=c.CustID

SELECT *, COUNT(ServiceOrders.DateReceived) as DateCount FROM TABLE
Paste to excel
Filter last column to show results = or > than 2
Might be a quick solution for you.

Get the items which have been received more than once in the last 30 days. Then use row_number to number the rows based on the descending order of date_received. Finally, get the rows whose row_number is 2 (the date before the latest date in the last 30 days).
If serialnumber is needed in the output, just join the itemmaster table to the final resultset.
WITH morethanone
AS (SELECT
so.itemid
FROM serviceorders so
JOIN itemmaster i
ON i.itemid = so.itemid
GROUP BY so.itemid
HAVING COUNT(CASE WHEN DATEDIFF(dd, so.datereceived, GETDATE()) <= 30 THEN 1 END)>1
)
SELECT
custid,
itemid,
datereceived
FROM (SELECT
*,
ROW_NUMBER() OVER (PARTITION BY itemid ORDER BY datereceived DESC) rn
FROM serviceorders
WHERE itemid IN (SELECT itemid FROM morethanone)
AND DATEDIFF(dd, datereceived, GETDATE()) <= 30
) x
WHERE rn = 2
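The approach above can be tried against the example data from the question. The following is only a sketch under some assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server, stores dates as ISO strings, pins "today" to 2016-09-30 through a hypothetical `today` parameter, and partitions by SerialNumber because the question states that the serial number is the key. Keeping only `rn = 2` also drops serial numbers that were received just once, so no separate counting step is needed here.

```python
import sqlite3

# Hypothetical in-memory copy of the example data; dates stored as ISO strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ServiceOrders (CustID INT, ItemID INT, DateReceived TEXT);
CREATE TABLE ItemMaster (CustID INT, ItemID INT, SerialNumber TEXT);
INSERT INTO ServiceOrders VALUES
  (1, 2, '2016-09-26'), (1, 2, '2016-09-05'), (1, 2, '2015-01-15'),
  (5, 6, '2016-09-20'), (7, 6, '2016-09-02');
INSERT INTO ItemMaster VALUES (1, 2, '8675309'), (5, 6, '101'), (7, 6, '101');
""")

QUERY = """
WITH recent AS (
    -- Number the orders of the last 30 days per serial number, newest first.
    SELECT so.CustID, so.ItemID, so.DateReceived, im.SerialNumber,
           ROW_NUMBER() OVER (PARTITION BY im.SerialNumber
                              ORDER BY so.DateReceived DESC) AS rn
    FROM ServiceOrders AS so
    JOIN ItemMaster AS im
      ON im.CustID = so.CustID AND im.ItemID = so.ItemID
    WHERE so.DateReceived >= date(:today, '-30 days')
)
-- rn = 2 is the order just before the most recent one.
SELECT SerialNumber, CustID, ItemID, DateReceived
FROM recent
WHERE rn = 2
ORDER BY SerialNumber
"""


def previous_recent_orders(today):
    """Return the second most recent order per serial number within 30 days."""
    return conn.execute(QUERY, {"today": today}).fetchall()


rows = previous_recent_orders("2016-09-30")
for row in rows:
    print(row)
```

With the sample rows this returns the 2016-09-02 record for serial number 101 and the 2016-09-05 record for 8675309, which are the two records the question asks for.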

Related

Calculate average days between orders The last three records tsql

I trying to take an average per customer, but you're not grouping by customer.
I would like to calculate the average number of days between order dates in a table called invoice. For each BusinessPartnerID, I want the average days between orders over the last three orders only.
I already have the average across all orders for each user, but I need it for just the last three orders.
My current query is below:
;WITH temp (avg,invoiceid,carname,carid,fullname,mobail)
AS
(
SELECT AvgLag = AVG(Lag) , Lagged.idinvoice,
Lagged.carname ,
Lagged.carid ,Lagged.fullname,Lagged.mobail
FROM
(
SELECT
(car2.Name) as carname ,
(car2.id) as carid ,( busin.Name) as fullname, ( busin.Mobile) as mobail , INV.Id as idinvoice , Lag = CONVERT(int, DATEDIFF(DAY, LAG(Date,1)
OVER (PARTITION BY car2.Id ORDER BY Date ), Date))
FROM [dbo].[Invoice] AS INV
JOIN [dbo].[InvoiceItem] AS INITEM on INV.Id=INITEM.Invoiceid
JOIN [dbo].[BusinessPartner] as busin on busin.Id=INV.BuyerId and Type=5
JOIN [dbo].[Product] as pt on pt.Id=INITEM.ProductId and INITEM.ProductId is not null and pt.ProductTypeId=3
JOIN [dbo].[Car] as car2 on car2.id=INv.BusinessPartnerCarId
) AS Lagged
GROUP BY
Lagged.carname,
Lagged.carid,Lagged.fullname,Lagged.mobail, Lagged.idinvoice
-- order by Lagged.fullname
)
SELECT * FROM temp where avg is not null order by avg
I don't really see how your query relates to your question. Starting from a table called invoice that has columns businesspartnerid and date, here is how you would take the average of the day difference between the last 3 invoices of each business partner:
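select businesspartnerid,
avg(1.0 * diff_day) as avg_diff_day
from (
-- lag() runs after the rn filter, so it only sees the last 3 invoices;
-- SQL Server does not allow a window function inside an aggregate
select i.businesspartnerid,
datediff(
day,
lag(date) over(partition by businesspartnerid order by date),
date
) as diff_day
from (
select i.*,
row_number() over(partition by businesspartnerid order by date desc) rn
from invoice i
) i
where rn <= 3
) d
group by businesspartnerid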
Note that 3 rows gives you 2 intervals only, that will be averaged.
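To check the "3 rows, 2 intervals" behaviour on a small data set, here is a sketch using Python's sqlite3 module instead of SQL Server (an assumption made only so the example can run anywhere; it needs SQLite 3.25 or later). The invoice rows are made up, and `julianday()` stands in for `datediff(day, ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (businesspartnerid INT, date TEXT);
-- Made-up sample data: partner 1 has four invoices, partner 2 has two.
INSERT INTO invoice VALUES
  (1, '2020-01-01'), (1, '2020-01-11'), (1, '2020-01-15'), (1, '2020-01-21'),
  (2, '2020-02-01'), (2, '2020-02-08');
""")

QUERY = """
SELECT businesspartnerid, AVG(diff_day) AS avg_diff_day
FROM (
    -- LAG() is evaluated after the rn filter, so it only sees the last 3 rows.
    SELECT businesspartnerid,
           julianday(date) - julianday(
               LAG(date) OVER (PARTITION BY businesspartnerid ORDER BY date)
           ) AS diff_day
    FROM (
        SELECT businesspartnerid, date,
               ROW_NUMBER() OVER (PARTITION BY businesspartnerid
                                  ORDER BY date DESC) AS rn
        FROM invoice
    ) AS numbered
    WHERE rn <= 3
) AS gaps
GROUP BY businesspartnerid
ORDER BY businesspartnerid
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

For partner 1 only the gaps of 4 and 6 days between the last three invoices are averaged (5.0); the 10-day gap to the oldest invoice is ignored. Partner 2 has a single 7-day gap.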

Selecting the difference between dates in a stored procedure using a subquery

I can't get my head around whether this is even possible, but I feel like I might have done it before and lost that bit of code. I am trying to craft a select statement that contains an inner join on a subquery to show the number of days between two dates from the same table.
A simple example of the data structure would look like:
Name ID Date Day Hours
Bill 1 3/3/20 Thursday 8
Fred 2 4/3/20 Monday 6
Bill 1 8/3/20 Tuesday 2
Based on this data, I want to select each row plus an extra column which is the number of days between the date from each row for each ID. Something like:
Select * from tblData
Inner join (datediff(Select Top(1) Date from tblData where Date < Date), Date) And ID = ID)
or for simplicity:
Select * from tblData
Inner join (datediff(Select Top(1) Date from tblData where Date < 8/3/20), 8/3/20) And ID = 1)
The resulting dataset would look like:
Name ID Date Day Hours DaysBtwn
Bill 1 3/3/20 Thursday 8 4 (Assuming there was an earlier row in the table)
Fred 2 4/3/20 Monday 6 5 (Assuming there was an earlier row in the table)
Bill 1 8/3/20 Tuesday 2 5 (Based on the previous row date being 3/3/20 for Bill)
Does this make sense, or am I trying to do this the wrong way? I want to do this for about 600,000 rows in the table, so efficiency is key; if there is a better way to do this, I'm open to suggestions.
You can use lag():
select t.*, datediff(day, lag(date) over(partition by id order by date), date) diff
from mytable t
I think you just want lag():
select t.*,
datediff(day,
lag(date) over (partition by name order by date),
date
) as diff
from tblData t;
Note: If you want to filter the data so that some rows are used by lag() but do not appear in the result set, then use a subquery:
select t.*
from (select t.*,
datediff(day,
lag(date) over (partition by name order by date),
date
) as diff
from tblData t
) t
where date < '2020-08-03';
Also note the use of the date constant as a string in YYYY-MM-DD format.
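The difference between filtering inside and outside the subquery can be seen with the three sample rows from the question. This is a sketch only: it runs on SQLite through Python's sqlite3 module (3.25 or later) rather than SQL Server, reads the sample dates as day/month/year, partitions by ID, and uses a cut-off date of 2020-03-05 chosen purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblData (Name TEXT, ID INT, Date TEXT, Hours INT);
INSERT INTO tblData VALUES
  ('Bill', 1, '2020-03-03', 8),
  ('Fred', 2, '2020-03-04', 6),
  ('Bill', 1, '2020-03-08', 2);
""")

# Days since the previous row of the same ID (NULL when there is none).
DIFF_SQL = """
SELECT Name, ID, Date,
       CAST(julianday(Date) - julianday(
           LAG(Date) OVER (PARTITION BY ID ORDER BY Date)
       ) AS INTEGER) AS DaysBtwn
FROM tblData
"""

# Filter in an outer query: lag() still sees the earlier rows.
outer_filtered = conn.execute(
    "SELECT * FROM (" + DIFF_SQL + ") AS d WHERE Date >= '2020-03-05'"
).fetchall()

# Filter at the same query level: the earlier row is removed before lag() runs.
inner_filtered = conn.execute(
    DIFF_SQL + " WHERE Date >= '2020-03-05'"
).fetchall()

print(outer_filtered)  # [('Bill', 1, '2020-03-08', 5)]
print(inner_filtered)  # [('Bill', 1, '2020-03-08', None)]
```

The outer filter keeps the 5-day difference for Bill's second row, while the inner filter loses it because the previous row was filtered out first.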

find users with consecutive logins sql query

I have a table that contains 2 columns: customer_id and login_date.
For each day, if a customer has logged in, there will be 1 entry in the table for that customer.
Customer_id login_date
--------------------
1 31-Dec-2018
2 31-Dec-2018
3 31-Dec-2018
1 1-Jan-2019
2 1-Jan-2019
3 2-Jan-2019
2 2-Jan-2019
3 3-Jan-2019
3 4-Jan-2019
I need to get the ids of customers who have logged in for at least 3 consecutive days.
Expected output is like below.
Customer_id
------
2
3
So far I have achieved this using the query below.
select customer_id from (
select *, case when (lag(login_date,1) over (partition by customer_id order by login_date)) = dateadd(day, -1,login_date) then 1 else 0 end second_day ,
case when (lag(login_date,2) over (partition by customer_id order by login_date)) = dateadd(day, -2,login_date) then 1 else 0 end third_day
from login_history
) a
where a.second_day =1 and a.third_day =1;
But if I have to get customers with 5 consecutive logins, I have to keep adding lag columns.
Is there a better way to get this done?
You can use lag() or lead():
select distinct customer_id
from (select t.*,
lag(login_date, 2) over (partition by customer_id order by login_date) as prev2_login_date
from t
) t
where prev2_login_date = dateadd(day, -2, login_date);
This looks at the login date two rows behind. If it is two days before the current day -- then voila! Three days in a row. This uses the fact that you do not have duplicate dates for a given customer.
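For an arbitrary streak length, a different technique than the lag() answer above is the "gaps and islands" pattern: within one customer, consecutive dates minus their row number give the same value, so each streak forms a group that can be counted. The sketch below is an illustration under assumptions, not the answer's method: it runs on SQLite via Python's sqlite3 module (3.25 or later), stores dates as ISO strings, and the function name `customers_with_streak` is hypothetical. Like the lag() approach, it relies on there being no duplicate dates per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE login_history (customer_id INT, login_date TEXT);
INSERT INTO login_history VALUES
  (1, '2018-12-31'), (2, '2018-12-31'), (3, '2018-12-31'),
  (1, '2019-01-01'), (2, '2019-01-01'),
  (3, '2019-01-02'), (2, '2019-01-02'),
  (3, '2019-01-03'), (3, '2019-01-04');
""")

STREAK_SQL = """
WITH grouped AS (
    -- Day number minus row number is constant within a run of consecutive days.
    SELECT customer_id,
           julianday(login_date)
             - ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY login_date) AS grp
    FROM login_history
)
SELECT DISTINCT customer_id
FROM grouped
GROUP BY customer_id, grp
HAVING COUNT(*) >= :n
ORDER BY customer_id
"""


def customers_with_streak(n):
    """Return ids of customers with at least n consecutive login days."""
    return [row[0] for row in conn.execute(STREAK_SQL, {"n": n})]


print(customers_with_streak(3))  # [2, 3]
```

Changing the streak length only changes the parameter; no extra lag columns are needed.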

SQL Get records past a certain date, higher than a certain value, with a minimum amount

I'm having a hard time with an SQL query at the moment. I have a list of customer orders, and I want to remove a set of them based on certain criteria:
We need to keep at least 6 of each customer's past orders on hand.
We need to keep all of the customer's orders that occurred within the past 90 days.
We need to keep AT LEAST 1 of each customer's orders that is older than 90 days (if the customer had 4 orders in the past 90 days, we'll need to keep 2 from an earlier time to hit the 6 orders requirement).
So, for example, if a customer had 6 orders in the past 90 days, we would keep 7 of their orders (because we include the 1 order from older than 90 days).
If a customer had 21 orders in the past 90 days, we would keep 22 of their orders.
If a customer had 5 orders in the past 90 days, we would keep 6 of their orders.
Here is the query I am using to build a table of their orders:
INSERT INTO #OrdersToDelete
SELECT TempOrders.Site, TempOrders.Number, TempOrders.RowNumber, TempOrders.CustomerNumber
FROM (SELECT
ROW_NUMBER() OVER ( PARTITION BY CustomerNumber ORDER BY OrderDate DESC) AS 'RowNumber',
Number,
OrderDate,
CustomerNumber
FROM Orders
) TempOrders
LEFT OUTER JOIN (SELECT
ROW_NUMBER() OVER ( PARTITION BY CustomerNumber ORDER BY OrderDate DESC) AS 'RowNumber',
Number,
CustomerNumber
FROM SmartOrders
) SmartOrderOrders
ON TempOrders.Site = SmartOrderOrders.Site
AND TempOrders.Number = SmartOrderOrders.Number
WHERE
DATEDIFF(dd, OrderDate, GETDATE()) > 90
This query returns a list of orders that are up for deletion (older than 90 days). In the WHERE clause, I can also check the order number, but I'm having difficulty figuring out how to exclude the customer's first order past the 90-day period.
Any help would be appreciated.
--Get the row numbers using a case expression in order by
--so all the orders within the last 90 days come first
WITH ROWNUMS AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY CustomerNumber
ORDER BY
CASE WHEN DATEDIFF(dd, OrderDate, GETDATE()) <= 90 THEN 1 ELSE 0 END DESC,
OrderDate DESC) AS rn,
DATEDIFF(dd, OrderDate, GETDATE()) AS diff,
Number,
OrderDate,
CustomerNumber
FROM Orders)
--Get the maximum row number per customer in the last 90 days
,MAXROWNUM AS (select CustomerNumber, MAX(rn) maxrn from ROWNUMS
where diff<=90
group by CustomerNumber)
--Join the previous cte's and get all the orders for a customer in the last 90 days
-- + one more row which is the latest before 90 days
SELECT r.*
FROM ROWNUMS r
JOIN MAXROWNUM c ON c.CustomerNumber=r.CustomerNumber
WHERE r.rn <= c.maxrn+1
--use r.rn <= case when c.maxrn <=5 then 5 else c.maxrn end + 1 to get at least 6 orders per customer
Give this a shot.
Start off by creating 3 Common Table Expressions (CTEs). You can do them as nested subqueries but I find CTEs easier to read and manage, plus they're easier to explain.
WITH ninety_day_cte
AS
(SELECT temporders.site, temporders.number, temporders.customernumber, temporders.orderdate
FROM orders AS temporders
WHERE
temporders.orderdate >= DATEADD(DAY,-90,GETDATE())),
ninety_day_count_cte
AS
(SELECT temporders.customernumber, COUNT(*) AS Order_Count
FROM orders AS temporders
WHERE
temporders.orderdate >= DATEADD(DAY,-90,GETDATE())
GROUP BY
temporders.customernumber),
greater_ninety_day_cte
AS
(SELECT temporders.site, temporders.number, temporders.customernumber, temporders.orderdate,
ROW_NUMBER() OVER(PARTITION BY temporders.customernumber ORDER BY temporders.orderdate DESC) AS Row_Number
FROM orders AS temporders
WHERE
temporders.orderdate < DATEADD(DAY,-90,GETDATE()))
The first CTE, ninety_day_cte will grab all the orders within the past 90 days - we need this for all customers and we need all orders. Simple, we can set this one aside.
The second CTE, ninety_day_count_cte is used to determine the total count of orders per customer within the last 90 days. We need to know this number to determine how many orders older than 90 days we need to grab.
The third CTE, greater_ninety_day_cte, will grab all orders older than 90 days. We add the ROW_NUMBER() to rank the orders per customer by order date; this will help us grab the older orders we need.
Now we need to add the query that will grab the orders older than 90 days:
SELECT g.site, g.number, g.customernumber, g.orderdate
FROM greater_ninety_day_cte AS g
LEFT JOIN ninety_day_count_cte AS c
ON g.customernumber = c.customernumber
WHERE
g.Row_Number <= CASE
WHEN CASE WHEN c.Order_Count IS NULL THEN 0 ELSE c.Order_Count END >= 6 THEN 1
ELSE (6 - CASE WHEN c.Order_Count IS NULL THEN 0 ELSE c.Order_Count END)
END
This uses the 2nd and 3rd CTEs. We use a LEFT JOIN so we also grab data for customers who only have orders older than 90 days. The WHERE clause takes the Row_Number from the 3rd CTE and compares it to the Order_Count from the 2nd CTE. The CASE expressions state that if the Order_Count (the count of orders in the past 90 days) is 6 or more, we only want to pull the single most recent older order (Row_Number <= 1), but if the Order_Count is less than 6, we want to pull the difference (6 - Order_Count). This should get all the orders older than 90 days that meet the requirements.
Now we only need to get the orders that are less than 90 days. This is easily done with a UNION ALL statement using the 1st CTE:
UNION ALL
SELECT site, number, customernumber, orderdate
FROM ninety_day_cte
That should get you all the results you need. At least 6 orders and at least 1 order older than 90 days.
Here's the full query altogether:
WITH ninety_day_cte
AS
(SELECT temporders.site, temporders.number, temporders.customernumber, temporders.orderdate
FROM orders AS temporders
WHERE
temporders.orderdate >= DATEADD(DAY,-90,GETDATE())),
ninety_day_count_cte
AS
(SELECT temporders.customernumber, COUNT(*) AS Order_Count
FROM orders AS temporders
WHERE
temporders.orderdate >= DATEADD(DAY,-90,GETDATE())
GROUP BY
temporders.customernumber),
greater_ninety_day_cte
AS
(SELECT temporders.site, temporders.number, temporders.customernumber, temporders.orderdate,
ROW_NUMBER() OVER(PARTITION BY temporders.customernumber ORDER BY temporders.orderdate DESC) AS Row_Number
FROM orders AS temporders
WHERE
temporders.orderdate < DATEADD(DAY,-90,GETDATE()))
SELECT g.site, g.number, g.customernumber, g.orderdate
FROM greater_ninety_day_cte AS g
LEFT JOIN ninety_day_count_cte AS c
ON g.customernumber = c.customernumber
WHERE
g.Row_Number <= CASE
WHEN CASE WHEN c.Order_Count IS NULL THEN 0 ELSE c.Order_Count END >= 6 THEN 1
ELSE (6 - CASE WHEN c.Order_Count IS NULL THEN 0 ELSE c.Order_Count END)
END
UNION ALL
SELECT site, number, customernumber, orderdate
FROM ninety_day_cte
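The retention rules from the question (all orders from the past 90 days, at least one older order, at least six in total when available) can be checked on generated data. The sketch below makes several assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or later) rather than SQL Server, it fixes "today" to 2021-01-01, the customers A, B and C and the helper `make_orders` are made up, and it expresses the rule in one pass with window functions instead of the three CTEs described above.

```python
import sqlite3
from datetime import date, timedelta

TODAY = date(2021, 1, 1)  # fixed "today" so the example is reproducible


def make_orders(customer, recent, old):
    """Build (customer, orderdate) rows: `recent` within 90 days, `old` beyond."""
    rows = [(customer, (TODAY - timedelta(days=1 + 10 * i)).isoformat())
            for i in range(recent)]
    rows += [(customer, (TODAY - timedelta(days=100 + 10 * i)).isoformat())
             for i in range(old)]
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "number INTEGER PRIMARY KEY, customernumber TEXT, orderdate TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customernumber, orderdate) VALUES (?, ?)",
    make_orders("A", 2, 6) + make_orders("B", 7, 2) + make_orders("C", 0, 3),
)

KEEP_SQL = """
WITH flagged AS (
    SELECT number, customernumber, orderdate,
           orderdate >= date(:today, '-90 days') AS is_recent
    FROM orders
),
ranked AS (
    SELECT *,
           -- how many orders this customer placed in the last 90 days
           SUM(is_recent) OVER (PARTITION BY customernumber) AS recent_count,
           -- newest-first rank within the recent group and within the old group
           ROW_NUMBER() OVER (PARTITION BY customernumber, is_recent
                              ORDER BY orderdate DESC) AS rank_in_group
    FROM flagged
)
SELECT customernumber, number, orderdate
FROM ranked
WHERE is_recent = 1
   OR rank_in_group <= MAX(1, 6 - recent_count)  -- at least 1 older order
ORDER BY customernumber, orderdate DESC
"""

kept = conn.execute(KEEP_SQL, {"today": TODAY.isoformat()}).fetchall()
kept_counts = {}
for customer, _number, _orderdate in kept:
    kept_counts[customer] = kept_counts.get(customer, 0) + 1
print(kept_counts)
```

Customer A (2 recent, 6 older) keeps 6 orders, customer B (7 recent, 2 older) keeps 8, and customer C (only 3 older orders) keeps all 3. The rows to delete are simply the ones not returned.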

query to display additional column based on aggregate value

I've been mulling on this problem for a couple of hours now with no luck, so I thought people on SO might be able to help :)
I have a table with data regarding processing volumes at stores. The first three columns shown below can be queried from that table. What I'm trying to do is to add a 4th column that's basically a flag indicating whether a store has processed >=$150 and, if so, displays the corresponding date. The way this works is that the first instance where the store has surpassed $150 is the date that gets displayed. Subsequent processing volumes don't count once the activated date has been hit. For example, for store 4, there's just one instance of the activated date.
store_id sales_volume date activated_date
----------------------------------------------------
2 5 03/14/2012
2 125 05/21/2012
2 30 11/01/2012 11/01/2012
3 100 02/06/2012
3 140 12/22/2012 12/22/2012
4 300 10/15/2012 10/15/2012
4 450 11/25/2012
5 100 12/03/2012
Any insights as to how to build out this fourth column? Thanks in advance!
The solution starts by calculating the cumulative sales. Then, you want the activation date only when the cumulative sales first pass through the $150 level. This happens when adding the current sales amount pushes the cumulative amount over the threshold. The following case expression handles this.
select t.store_id, t.sales_volume, t.date,
(case when 150 > cumesales - t.sales_volume and 150 <= cumesales
then date
end) as ActivationDate
from (select t.*,
sum(sales_volume) over (partition by store_id order by date) as cumesales
from t
) t
If you have an older version of Postgres that does not support cumulative sum, you can get the cumulative sales with a subquery like:
(select sum(sales_volume) from t t2 where t2.store_id = t.store_id and t2.date <= t.date) as cumesales
Variant 1
You can LEFT JOIN to a derived table that calculates the first date surpassing the $150 limit per store:
SELECT t.*, b.activated_date
FROM tbl t
LEFT JOIN (
SELECT store_id, min(thedate) AS activated_date
FROM (
SELECT store_id, thedate
,sum(sales_volume) OVER (PARTITION BY store_id
ORDER BY thedate) AS running_sum
FROM tbl
) a
WHERE running_sum >= 150
GROUP BY 1
) b ON t.store_id = b.store_id AND t.thedate = b.activated_date
ORDER BY t.store_id, t.thedate;
The calculation of the first day has to be done in two steps, since the window function accumulating the running sum has to be applied in a separate SELECT.
Variant 2
Another window function instead of the LEFT JOIN. May or may not be faster. Test with EXPLAIN ANALYZE.
SELECT *
,CASE WHEN running_sum >= 150 AND thedate = first_value(thedate)
OVER (PARTITION BY store_id, running_sum >= 150 ORDER BY thedate)
THEN thedate END AS activated_date
FROM (
SELECT *
,sum(sales_volume)
OVER (PARTITION BY store_id ORDER BY thedate) AS running_sum
FROM tbl
) b
ORDER BY store_id, thedate;
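The running-sum idea can be checked against the sample rows from the question. This sketch makes a few assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or later) instead of Postgres, the table name `sales` is hypothetical, and dates are stored as ISO strings so that they sort correctly. It follows the first answer's test that the cumulative total crosses $150 on the current row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (store_id INT, sales_volume INT, date TEXT);
INSERT INTO sales VALUES
  (2, 5, '2012-03-14'), (2, 125, '2012-05-21'), (2, 30, '2012-11-01'),
  (3, 100, '2012-02-06'), (3, 140, '2012-12-22'),
  (4, 300, '2012-10-15'), (4, 450, '2012-11-25'),
  (5, 100, '2012-12-03');
""")

QUERY = """
SELECT store_id, sales_volume, date,
       -- the threshold is crossed on this row if the total was below 150
       -- before adding this row's volume and is at least 150 afterwards
       CASE WHEN cumesales >= 150 AND cumesales - sales_volume < 150
            THEN date END AS activated_date
FROM (
    SELECT store_id, sales_volume, date,
           SUM(sales_volume) OVER (PARTITION BY store_id
                                   ORDER BY date) AS cumesales
    FROM sales
) AS running
ORDER BY store_id, date
"""

rows = conn.execute(QUERY).fetchall()
activated = [row[3] for row in rows]
for row in rows:
    print(row)
```

Stores 2, 3 and 4 get exactly one activated date each (2012-11-01, 2012-12-22 and 2012-10-15), and store 5 never reaches the threshold, matching the expected output in the question.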