Selecting a date in the future - SQL

I am working on a shipment delivery report to determine if shipments are made within a shipment window.
Every release has a Ship_Date value, the date the release must ship. Some releases, though, have an integer late-window value meaning that if the shipment is made within X days of Ship_Date it is still considered on time.
This is complicated by another table which holds valid ship days for the month (used to exclude holidays, weekends, and such).
Order_Releases_Table
Part_No,
Quantity,
Ship_Date,
Window

Shipping_Date
Shipping_Day
Sample Data

Order_Releases_Table
Part_No  Quantity  Ship_Date  Window
ABC      100       9/1/2011   0
XYZ      200       9/1/2011   2

Shipping_Date
9/1/2011
9/2/2011
9/5/2011
So with this data, part ABC has to ship on 9/1 to be considered on time. Part XYZ, though, can ship up to 2 days past 9/1 and still be considered on time; but since 9/3 isn't in our shipping days, 9/5 is the last day it can ship and still be on time.
I think the answer lies in joining to a subquery of the shipping-days table that assigns a row number to the Shipping_Day field.
SELECT
    ROW_NUMBER() OVER (ORDER BY Shipping_Day) AS Day_No,
    Shipping_Day
FROM Shipping_Date
WHERE Shipping_Day > Ship_Date
RETURNS
Day_No Shipping_Day
1 9/2/2011
2 9/5/2011
Then if I simply pick the date where Day_No from this subquery equals the release's Window value, I have the last day that particular shipment can ship and still be considered on time.
I'm having a hard time wrapping it all up in to the final query though.
Is this the correct way to approach the problem?

Maybe this will get you started:
DECLARE @t TABLE (Part CHAR(3), ShipDate DATETIME, Window INT)
DECLARE @ship TABLE (ShipDate DATETIME)

INSERT INTO @t ( Part, ShipDate, Window )
SELECT 'abc', '20110901', 0
UNION
SELECT 'xyz', '20110901', 2

INSERT INTO @ship ( ShipDate )
SELECT '20110901'
UNION
SELECT '20110905'
UNION
SELECT '20110910'

SELECT Part, ShipDate, Window,
       (SELECT MIN(ShipDate)
        FROM @ship s
        WHERE s.ShipDate >= DATEADD(day, t.Window, t.ShipDate)) AS NextShip
FROM @t t
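The row-numbering idea from the question can also be wrapped up directly with OUTER APPLY. This is a sketch against the question's Order_Releases_Table and Shipping_Date tables (Last_On_Time_Day is an assumed output column name); it picks the Window-th valid shipping day after Ship_Date, falling back to Ship_Date itself when Window is 0:

```sql
SELECT r.Part_No,
       r.Ship_Date,
       r.Window,
       -- Window = 0 finds no numbered row, so fall back to Ship_Date itself
       COALESCE(w.Shipping_Day, r.Ship_Date) AS Last_On_Time_Day
FROM Order_Releases_Table r
OUTER APPLY (
    SELECT d.Shipping_Day
    FROM (
        SELECT Shipping_Day,
               ROW_NUMBER() OVER (ORDER BY Shipping_Day) AS Day_No
        FROM Shipping_Date
        WHERE Shipping_Day > r.Ship_Date
    ) d
    WHERE d.Day_No = r.Window   -- the Window-th valid shipping day past Ship_Date
) w;
```

With the sample data this would give 9/1 for ABC (Window 0) and 9/5 for XYZ (Window 2, since 9/3 is not a shipping day).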

Related

Group over dynamic date ranges

I have a table with IDs, dates, and values. I want to merge records with the same ID that fall within a 90-day window. In the example below, these are the rows marked in the same color.
The end result should look like this:
The entry with the RowId 1 opens the 90 days window for the ID 133741. RowId 2 and 3 are in this window and should therefore be aggregated together with RowId 1.
RowId 4 would be in a 90 day window with 2 and 3, but since it is outside the window for 1, it should no longer be aggregated with them but should be considered as the start of a new 90 day window. Since there are no other entries in this window, it remains as a single line.
The date for line 5 is clearly outside the 90 day window of the other entries and is therefore also aggregated individually. Just like line 6, since this is a different ID.
Below some example code:
create table #Table(RowId int, ID nvarchar(255) , Date date, Amount numeric(19,1));
insert into #Table values
(1,'133742', '2021-07-30', 1.00 ),
(2,'133742', '2021-08-05', 3.00 ),
(3,'133742', '2021-08-25', 10.00 ),
(4,'133742', '2021-11-01', 7.00 ),
(5,'133742', '2022-08-25', 11.00 ),
(6,'133769', '2021-11-13', 9.00 );
I tried with window functions and CTEs but I couldn't find a way to include all my requirements.
With the window function first_value() over() we calculate the distance in days from each ID's first date, divided by 90, to get the derived Grp.
Example
with cte as (
Select *
,Grp = datediff(day,first_value(Date) over (partition by id order by date) ,date) / 90
from #Table
)
Select ID
,Date = min(date)
,Amount = sum(Amount)
From cte
Group By ID,Grp
Order by ID,min(date)
Results
ID Date Amount
133742 2021-07-30 14.0
133742 2021-11-01 7.0
133742 2022-08-25 11.0
133769 2021-11-13 9.0

Identify double seat bookings via sql

I have to make a report to identify double seat bookings. One can book a seat for a date range or a single date; the columns date_from to date_to can hold a single day or a range (like from 16th Jan till 16th Jan, or from 10th Jan to 30th Jan).
The problem is that the system allows double booking when there is an overlapping date range, e.g. if someone books seat no 7 from 10th Jan to 16th Jan and someone else books the same seat from 12th Jan to 13th Jan. It should not, and that is what I have to flag.
I have tried writing the below query, but it does not identify anything in date ranges; it only works for single dates. I would need to first break these date ranges into single dates for my query to work:
;WITH duplicate_seat (desk_id, date_from, date_to, name) AS
(
    SELECT da.desk_id, da.date_from, da.date_to, hr.name AS name
    FROM [human_resources].[dbo].[desks_temporary_allocations] da
    JOIN [human_resources].[dbo].hrms_mirror hr ON hr.sage_id = da.sage_id
)
SELECT ds.desk_id, ds.date_from, ds.date_to, COUNT(ds.desk_id) AS occurrences, MIN(ds.name) AS Name1, MAX(ds.name) AS Name2
FROM duplicate_seat ds
WHERE ds.name LIKE 'priyanka%'
GROUP BY ds.desk_id, ds.date_from, ds.date_to
HAVING COUNT(ds.desk_id) > 1
This gives a result that only flags bookings whose date_from and date_to match exactly; as you can see it is not picking up any date ranges, only single dates. But there were double bookings across overlapping date ranges which this query is not showing. Can anyone please help me with this?
For simplicity, I've used temp tables to demonstrate this, but it should be easy to convert to a CTE if you wish.
The key to this is having a Date table. If you don't have one, there are plenty of examples of how to generate one quickly. In this case my date table is called [Config].[DatesTable].
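If you don't have a permanent date table, a throwaway one covering the demo period can be generated with a recursive CTE. This is a sketch; #DatesTable and TheDate are assumed names, and the range just covers the sample bookings below:

```sql
-- Build a small ad-hoc dates table, one row per calendar day
;WITH Dates AS (
    SELECT CAST('2022-12-01' AS date) AS TheDate
    UNION ALL
    SELECT DATEADD(day, 1, TheDate)
    FROM Dates
    WHERE TheDate < '2023-01-31'
)
SELECT TheDate
INTO #DatesTable
FROM Dates
OPTION (MAXRECURSION 0);  -- lift the default 100-level recursion cap
```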
CREATE TABLE #t (desk_id int, date_from date, date_to date, EmpName varchar(10));
insert into #t VALUES
(1, '2022-12-25', '2023-01-01', 'Dave'),
(2, '2023-01-15', '2023-01-15', 'Jane'),
(2, '2023-01-12', '2023-01-20', 'Bob'),
(2, '2023-01-15', '2023-01-17', 'Mary');
-- desks and the dates they are over booked on
SELECT desk_id, TheDate
INTO #OverBookedDeskByDate
FROM (SELECT t.* , dt.TheDate
FROM #t t
JOIN Config.DatesTable dt on dt.TheDate between t.date_from and t.date_to
) a
GROUP BY desk_id, TheDate
HAVING Count(*) >1
-- find the bookings that overlap these desks/dates
SELECT t.*, o.TheDate FROM #OverBookedDeskByDate o
JOIN #t t on o.TheDate between t.date_from and t.date_to
ORDER by EmpName, desk_id, TheDate
I've created 3 bookings with some overlapping dates for desk 2.
Here are the results

SQL Selecting records sorted by growth over time

I have a simple table that contains a daily summary of the sales volumes of a couple hundred thousand products. One row for each product and date, with whatever quantity was sold that day. Table format is:
CREATE TABLE DAILYSALES (ID numeric IDENTITY PRIMARY KEY, ProductID numeric NOT NULL, XDate Date NOT NULL, QTY_SOLD int NOT NULL)
A record will only be in the table if there were sales that day, so there are no records where QTY_SOLD is zero.
I need to figure out a way to query this data within a date range, say, the last 30 days, but sorted by a growth trend (products that showed the most growth over the period would be on top).
The difference in quantities sold is off the charts... some products sell 1,000+ units per day consistently, while others sell 1 or 2 or zero on an average day and just have a couple of spikes here and there.
In an ideal result set, a product that sold 10 units a day on the first of the month, and grew by one unit a day to 40 units per day at the end of the month would rank higher than a product that sold 1,000 units a day on average and grew to 2,000 by the end of the month (a 4X growth level vs 2X).
The trouble I keep running into is that products with little to no sales but a couple of big spikes near the end always end up on top. A product that goes from 1 sale at the start of the month, nothing all month, and then 20 sales on the last day would show up first with the above model -- that shouldn't outrank a product with steadier sales.
I'm not sure what the best way to write this query would be. I imagine some kind of subquery that factors in the number of records (ie; number of days with data) that exist in the result set should be a factor, but I'm not sure where to begin. Would appreciate any suggestions, in particular from those who work with large data sets and have had to do something similar.
I would suggest trying linear regression for this task: first compute the slope per article, then sort by slope descending. This way you should be able to identify the articles with the best growth. In the following example I have one article without sales, one with a constant quantity, and one which starts at 0 and then grows in the following months:
DECLARE @t TABLE(
ArticleId int,
SoldDate date,
SoldQty int
)
;WITH cteDat AS(
SELECT CAST('2020-01-01' AS DATE) AS Dat
UNION ALL
SELECT DATEADD(d, 1, Dat)
FROM cteDat
WHERE Dat < '2020-12-31'
)
INSERT INTO @t
SELECT 123 AS ArticleId, Dat AS SoldDate, 0 AS SoldQty
FROM cteDat
UNION ALL
SELECT 456 AS ArticleId, Dat AS SoldDate, 100 AS SoldQty
FROM cteDat
UNION ALL
SELECT 789 AS ArticleId, Dat AS SoldDate, 0 AS SoldQty
FROM cteDat
OPTION (MAXRECURSION 0)
UPDATE @t
SET SoldQty = 50
WHERE ArticleId = 789
AND MONTH(SoldDate) > 7
;WITH cteRaw AS(
SELECT CAST(ArticleId AS FLOAT) AS ArticleId, CAST(CONVERT(NVARCHAR(8), SoldDate, 112) AS FLOAT) AS DatSID, CAST(SoldQty AS FLOAT) AS SoldQty
FROM @t
),
cteLinRegBase AS(
SELECT ArticleId
,COUNT(*) AS SampleSize
,SUM(DatSID) AS SumX
,SUM(SoldQty) AS SumY
,SUM(DatSID*DatSID) AS SumXX
,SUM(SoldQty*SoldQty) AS SumYY
,SUM(DatSID*SoldQty) AS SumXY
FROM cteRaw
GROUP BY ArticleId
)
SELECT ArticleId, CASE
WHEN SampleSize = 1 THEN 0 -- avoid divide by zero error
ELSE ( SampleSize * sumXY - sumX * sumY ) / ( SampleSize * sumXX - Power(sumX, 2) )
END
FROM cteLinRegBase
However, instead of using the date as a number, you could also use a row number (or similar) to represent the X axis.
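That row-number variant might look like the sketch below. To keep it self-contained it assumes a table Sales(ArticleId, SoldDate, SoldQty) rather than the table variable above; the per-article ROW_NUMBER stands in for DatSID as the X value:

```sql
;WITH cteRaw AS (
    SELECT CAST(ArticleId AS FLOAT) AS ArticleId,
           -- a day index per article replaces the raw yyyymmdd number as X
           CAST(ROW_NUMBER() OVER (PARTITION BY ArticleId
                                   ORDER BY SoldDate) AS FLOAT) AS X,
           CAST(SoldQty AS FLOAT) AS SoldQty
    FROM Sales
),
cteLinRegBase AS (
    SELECT ArticleId,
           COUNT(*)         AS SampleSize,
           SUM(X)           AS SumX,
           SUM(SoldQty)     AS SumY,
           SUM(X * X)       AS SumXX,
           SUM(X * SoldQty) AS SumXY
    FROM cteRaw
    GROUP BY ArticleId
)
SELECT ArticleId,
       CASE WHEN SampleSize = 1 THEN 0  -- avoid divide by zero error
            ELSE (SampleSize * SumXY - SumX * SumY)
               / (SampleSize * SumXX - POWER(SumX, 2))
       END AS Slope
FROM cteLinRegBase
ORDER BY Slope DESC;
```

The slope formula is the same least-squares expression as above; only the X values change, which makes articles with irregular sampling dates comparable on a per-observation basis.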

How to implement loops in SQL?

I am trying to calculate a KPI for each patient, called "Initial Prescription Start Date (IPST)".
The definition of IPST is: if the patient has a negative history of using that particular medication for the 60 days before a start date, that start date is an IPST.
For example, see the screenshot below: for the patient with ID = 101, I will start with IPST as 4/15/2019; the difference in days between 4/15/2019 and 4/1/2019 is 14 < 60, thus I will change my IPST to 4/1/2019.
Continuing with this iteration, the IPST for 101 is 3/17/2019 and for 102 is 3/18/2018, as shown in the right-hand table.
I tried to build a UDF as below, where I am passing id of a patient and UDF is returning the IPST.
CREATE FUNCTION [Initial_Prescription_Date]
(
    @id UNIQUEIDENTIFIER
)
RETURNS date
AS
BEGIN
    -- I am failing to implement this code here
END
I can get a list of Start_dates for a patient from a medication table like this
Select id, start_date from patient_medication
I will have to iterate through this list to get to the IPST for a patient.
I'll answer in order to start a dialog that we can work on.
The issue that I have is that the difference in days for ID = 102 between the last record and the one you've picked as the IPST is 29 days, but the IPST you've picked for 102 is 393 days; is that correct?
You don't need to loop to solve this problem. If you're comparing all of your dates only to your most recent, you can simply use MAX:
DECLARE @PatientRecords TABLE
(
    ID INTEGER,
    StartDate DATE,
    Medicine VARCHAR(100)
)
INSERT INTO @PatientRecords VALUES
(101,'20181201','XYZ'),
(101,'20190115','XYZ'),
(101,'20190317','XYZ'),
(101,'20190401','XYZ'),
(101,'20190415','XYZ'),
(102,'20190401','XYZ'),
(102,'20190415','XYZ'),
(102,'20190315','XYZ'),
(102,'20180318','XYZ');
WITH maxCTE AS
(
    SELECT *, DATEDIFF(DAY, StartDate, MAX(StartDate) OVER (PARTITION BY ID, Medicine)) AS [IPSTDateDifference]
    FROM @PatientRecords
)
SELECT m.ID, m.Medicine, MIN(m.StartDate) AS [IPST]
FROM maxCTE m
WHERE [IPSTDateDifference] < 60
GROUP BY m.ID, m.Medicine
ORDER BY 1, 3;

T-Sql Cartesian Join to Fill Dates and Join to CTE to Get Most Recent

Our current method of calculating out of stock no longer works for how we track inventory and how we need to view the data. The new method I want to use is to only look at the final OnHandAfter value for each day in the trailing year. We are not 24/7 so the last value entered at the end of each day will tell us if the item was in/out of stock that day. If an item has no inventory transactions for a date it should use the previous found date.
My current query does a cross join of all our items (I currently have it set to a single item for testing) and a calendar table. This gives me 365 days for each item. This is working.
My cte query returns the final OnHandAfter for each date there was a transaction. This is working if run by itself.
With the <= date condition commented out I get 365 rows returned, but dates from the cte are NULL. If the condition is not commented out, 0 rows are returned.
Note, the next step is to include the OnHandAfter field, but for now I can't seem to get the cte to connect.
ABDailyCalendar abdc
This is a table prefilled with every date in the trailing year
Sample Inventory Data (what the cte returns for a single item if run by itself; I left out some columns for brevity)
ItemCode TransactionDate OnHandAfter rn
Item-123 10/1/2018 960 1
Item-123 9/28/2018 985 1
Item-123 9/27/2018 1085 1
Item-123 9/26/2018 1485 1
Item-123 9/24/2018 1835 1
Item-123 9/20/2018 2035 1
Item-123 9/18/2018 2185 1
Item-123 9/14/2018 2305 1
Item-123 9/13/2018 2605 1
My Query
with cte as
(
Select TOP 1 * from
(
Select
ItemCode
,convert(Date,TransactionDate) TransactionDate
,TransactionType
,TransactionQuantity
,OnHandBefore
,OnHandAfter
,ROW_NUMBER() over (partition by ItemCode, CONVERT(Date, TransactionDate) order by TransactionDate DESC) as rn
from InventoryTransaction
where TransactionType in (1,2,4,8)
) as ss
where rn = 1
order by TransactionDate DESC
)
SELECT
ab.ExternalId
,abdc.[Date]
,cte.TransactionDate
From ABItems ab CROSS JOIN ABDailyCalendar abdc
FULL OUTER JOIN cte on cte.ItemCode = ab.ExternalId --and cte.TransactionDate <= abdc.[Date]
Where ab.ExternalID = 'Item-123'
order by abdc.[Date] DESC
Current Sample Results
ExternalId Date TransactionDate
Item-123 9/30/2018 NULL
Item-123 9/29/2018 NULL
Item-123 9/28/2018 NULL
Item-123 9/27/2018 NULL
Item-123 9/26/2018 NULL
Item-123 9/25/2018 NULL
Item-123 9/24/2018 NULL
Desired Results
ExternalId Date TransactionDate
Item-123 9/30/2018 9/28/2018
Item-123 9/29/2018 9/28/2018
Item-123 9/28/2018 9/28/2018
Item-123 9/27/2018 9/27/2018
Item-123 9/26/2018 9/26/2018
Item-123 9/25/2018 9/24/2018
Item-123 9/24/2018 9/24/2018
The TransactionDate should be the most recent TransactionDate that is <= to the Date.
If it matters - I am running SSMS 2012 connected to SQL Server 2008.
Any pointers or ideas will be greatly appreciated. I have stared at it so long that nothing new is coming to me. Thanks.
I used Postgres, but it's broadly the same as SQL Server for this operation. Here's an implementation of what I wrote in my comment:
https://www.db-fiddle.com/f/uKcgh9yZVvvqfRWTERv2a3/0
We make some sample data on the left side of the fiddle. This is PG-specific but shouldn't matter too much; the end result is that it gets to the same place you are with your data in SQL Server.
Then the query:
SELECT
itemcode,
caldate,
case when caldate = transactiondate then onhandafter else prev_onhandafter end as onhandat,
case when caldate = transactiondate then 'tran occurred today, using current onhandafter' else 'no tran today, using previous onhandafter' end as reasoning,
transactiondate,
onhandafter,
prev_onhandafter
FROM
(
SELECT
itemcode,
transactiondate,
LAG(transactiondate) over(partition by itemcode order by transactiondate) as prev_transactiondate,
onhandafter,
LAG(onhandafter) over(partition by itemcode order by transactiondate) as prev_onhandafter
FROM
t
) t2
INNER JOIN
c
ON
c.caldate > t2.prev_transactiondate and c.caldate <= t2.transactiondate
ORDER BY itemcode, caldate
The sample data has an itemcode/externalid column (you called it both) and a bunch of dates. Whether your dates are DATE or DATETIME they're comparable. There's no harm in casting a DATETIME to a DATE if you want, and if any of your dates contain a time component it may well be vital to do so, because 2018-01-01 00:00 is not the same as 2018-01-01 01:00. If your calendar table has midnight and the transactiondate is 1 am, then the range join condition (caldate > prevtrandate and caldate <= trandate) won't work out properly. Feel free to cast as part of the join: caldate > CAST(prevtrandate AS DATE) and caldate <= CAST(trandate AS DATE). If your datetimes are 100% guaranteed to be exactly on midnight (to the microsecond) then the join will work without casting; casting here is a quick trick to strip the time off and ensure apples are compared to apples.
OK, so how this works:
Rather than number the rows in the table in a cte and join it to itself, I used a similar technique using LAG to get the previous row's values I'm interested in. Previous here is defined as "per item code, in ascending order of trandate". This gives us rows that have a current trandate, a previous trandate (note: NULL for the first row; some extra fiddling with the query, like COALESCE(LAG(...), trandate), will be required if that row is to be kept, otherwise it will disappear when joined), and a previous onhand. We'll use the date pair to join, and we'll later choose whether to present the current or previous onhand.
This is done as a subquery so that the prev values become available to use. It is joined to the calendar table on the calendar date being greater than the prev trandate and less than or equal to the current trandate. This means that a cartesian product fills in all the gaps in the transaction dates so we get a contiguous set of dates out of the calendar table.
We use a case when to examine the data - if the cal date is equal to the tran date, we use the new value for onhand because a tran occurred today and decremented the stock. Otherwise we can assert that there was no transaction today, and we should use the prev onhand instead.
Hopefully this is all the right way round with regards to what you want (you seemed to indicate that it was onhandafter that you actually wanted, but your desired query output only mentioned the transaction date/caldate pair).
Edit: OK, LAG isn't available. Here's a solution that uses ROW_NUMBER:
https://www.db-fiddle.com/f/2ooVrNF18stUQAa4HyTj6r/0
WITH cte AS(
SELECT
itemcode,
transactiondate,
ROW_NUMBER() over(partition by itemcode order by transactiondate) as rown,
onhandafter
FROM
t
)
SELECT
curr.itemcode,
c.caldate,
case when c.caldate = curr.transactiondate then curr.onhandafter else prev.onhandafter end as onhandat,
case when c.caldate = curr.transactiondate then 'tran occurred today, using current onhandafter' else 'no tran today, using previous onhandafter' end as reasoning,
curr.transactiondate,
curr.onhandafter,
prev.onhandafter
FROM
cte curr
INNER JOIN
cte prev
ON curr.rown = prev.rown + 1 and curr.itemcode = prev.itemcode
INNER JOIN
c
ON
c.caldate > prev.transactiondate and c.caldate <= curr.transactiondate
ORDER BY curr.itemcode, c.caldate
How it works: pretty much as my first comment. We number the rows per itemcode in order of tran date. We do this as a cte so it's cleaner to say FROM cte curr INNER JOIN cte prev ON curr.rown = prev.rown + 1.
Lag is thus simulated- we have a row that has current values and previous values. The rest of the query logic remains the same from above