I am having trouble figuring out how to write a window function that solves my problem. I am quite the novice at window functions, but I think one could be written to meet my needs.
Problem Statement:
I want to calculate a transfer sequence showing when a person has changed locations, based on the corresponding LocationID over time.
Sample Data (Table1)
+----------+------------+-----------+---------+
| PersonID | LocationID | Date | Time |
+----------+------------+-----------+---------+
| 12 | A | 6/17/2020 | 12:00PM |
+----------+------------+-----------+---------+
| 12 | A | 6/18/2020 | 1:00PM |
+----------+------------+-----------+---------+
| 12 | B | 6/18/2020 | 6:00AM |
+----------+------------+-----------+---------+
| 12 | C | 6/19/2020 | 3:00PM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 8:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 11:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 12:00AM |
+----------+------------+-----------+---------+
| 13 | B | 6/16/2020 | 4:00PM |
+----------+------------+-----------+---------+
Expected Results
+----------+------------+-----------+---------+-------------------+
| PersonID | LocationID | Date | Time | Transfer Sequence |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/17/2020 | 12:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/18/2020 | 1:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | B | 6/18/2020 | 6:00AM | 2 |
+----------+------------+-----------+---------+-------------------+
| 12 | C | 6/19/2020 | 3:00PM | 3 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 8:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 11:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 12:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | B | 6/16/2020 | 4:00PM | 2 |
+----------+------------+-----------+---------+-------------------+
What I Tried
SELECT
[t1].[PersonID]
,[t1].[LocationID]
,[t1].[Date]
,[t1].[Time]
,DENSE_RANK()
OVER(
partition BY [t1].[PersonID], [t1].[LocationID]
ORDER BY [t1].[Date] ASC, [t1].[Time] ASC) AS
[Transfer Sequence]
FROM Table1 [t1]
Unfortunately, I believe DENSE_RANK() is assigning a rank regardless of whether the value of LocationID has changed. I need a function that will only add one to the sequence when the LocationID has changed.
Any help would be greatly appreciated.
Thank you!
You want to put "adjacent" rows in the same group. Straight window functions cannot do that for you, so we need to use a gaps-and-islands technique:
select
t.*,
sum(case when locationID = lagLocationID then 0 else 1 end)
over(partition by personID order by date, time)
as transfer_sequence
from (
select
t.*,
lag(locationID)
over(partition by personID order by date, time)
as lagLocationID
from mytable t
) t
The idea is to compute a window sum that increments every time the locationID changes.
Note that this properly handles the case where a person comes back to a location they have already been to.
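To make the lag-then-running-sum idea concrete, here is a minimal Python sketch of the same logic outside the database. The function name `transfer_sequence`, the tuple layout, and the integer timestamps are assumptions made for illustration only; they are not part of the original query.

```python
from itertools import groupby
from operator import itemgetter

_NO_PREVIOUS = object()  # stands in for the NULL that lag() yields on a first row


def transfer_sequence(rows):
    """rows: iterable of (person_id, location_id, timestamp) tuples.

    Returns (person_id, location_id, timestamp, sequence) tuples, where
    sequence grows by one each time the location differs from the
    previous row of the same person.
    """
    out = []
    ordered = sorted(rows, key=itemgetter(0, 2))  # partition key, then order key
    for person, group in groupby(ordered, key=itemgetter(0)):
        seq = 0
        previous_location = _NO_PREVIOUS  # plays the role of lag(locationID)
        for _, location, ts in group:
            if location != previous_location:
                seq += 1  # the "else 1" branch of the CASE expression
            previous_location = location
            out.append((person, location, ts, seq))
    return out


# Hypothetical data: person 12 returns to location A at the end.
rows = [
    (12, "A", 1), (12, "A", 2), (12, "B", 3), (12, "A", 4),
    (13, "A", 1), (13, "B", 2),
]
result = transfer_sequence(rows)
```

With this input, the return to location A for person 12 gets a new sequence number instead of reusing the first one, which is what the running sum is meant to achieve.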
What I would do (and I'm sure it's not the best way) is create a second table ordered by PersonID, LocationID, Date and Time, with an empty field for the transfer sequence (sequence), then declare a cursor:
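-- Variable types are assumed; adjust them to match your columns.
DECLARE @person int, @location varchar(10), @date date, @time time
DECLARE @count int = 0, @person_saved int = NULL, @location_saved varchar(10) = NULL
-- TRANSACTION is a reserved word, so the cursor gets a different name.
DECLARE transfer_cursor CURSOR
FOR select PersonID, LocationID, Date, Time from table1
order by PersonID, Date, Time;
Then a loop:
OPEN transfer_cursor
FETCH NEXT FROM transfer_cursor INTO @person, @location, @date, @time
WHILE @@FETCH_STATUS = 0
BEGIN
if @person_saved is null or @person_saved <> @person -- changing personID, reset count
begin
set @count = 0
set @person_saved = @person
set @location_saved = NULL -- forget the previous person's location
end
if @location_saved is null or @location_saved <> @location -- changing location, add count
begin
set @count = @count + 1
set @location_saved = @location
end
update table1 set sequence = @count where PersonId = @person and locationId = @location and date = @date and time = @time
FETCH NEXT FROM transfer_cursor INTO @person, @location, @date, @time
END
CLOSE transfer_cursor
DEALLOCATE transfer_cursor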
I have a very simple requirement but I'm struggling to find a way around this.
I have a very simple query:
SELECT
ServiceCode,
StartDate,
Available,
Nights,
BookingID
FROM #tmpAvailability A
LEFT JOIN vwRSBooking B
ON B.Depart = A.StartDate
AND B.ServiceCode = A.SupplierCode
AND B.StatusID IN (2640, 2621)
ORDER BY StartDate;
Made up of 2 tables
#tmpAvailability which consists of the following fields:
SupplierCode
StartDate
Available
vwRSBooking which consists of the following fields
BookingID
DepartDate
Code
Nights
StatusID
Departure and startdate can be joined to link the first day, and the servicecode and suppliercode can be joined to make sure that the availability is linked to the same supplier.
Which produces an output like this:
Code | Dates | Available | Nights | BookingID
TEST | 2018-01-04 | 1 | NULL | NULL
TEST | 2018-01-05 | 1 | NULL | NULL
TEST | 2018-01-06 | 0 | 4 | 123456
TEST | 2018-01-07 | 0 | NULL | NULL
TEST | 2018-01-08 | 0 | NULL | NULL
TEST | 2018-01-09 | 0 | NULL | NULL
TEST | 2018-01-10 | 1 | NULL | NULL
TEST | 2018-01-11 | 1 | NULL | NULL
TEST | 2018-01-12 | 1 | NULL | NULL
TEST | 2018-01-13 | 0 | NULL | 234567
TEST | 2018-01-14 | 0 | NULL | NULL
TEST | 2018-01-15 | 0 | NULL | NULL
What I need is for the BookingID and the Nights to be spread across every day the booking covers. For example, a booking of 4 nights should appear on all 4 days:
Code | Dates | Available | Nights | BookingID
TEST | 2018-01-04 | 1 | NULL | NULL
TEST | 2018-01-05 | 1 | NULL | NULL
TEST | 2018-01-06 | 0 | 4 | 123456
TEST | 2018-01-07 | 0 | 4 | 123456
TEST | 2018-01-08 | 0 | 4 | 123456
TEST | 2018-01-09 | 0 | 4 | 123456
TEST | 2018-01-10 | 1 | NULL | NULL
TEST | 2018-01-11 | 1 | NULL | NULL
TEST | 2018-01-12 | 1 | NULL | NULL
TEST | 2018-01-13 | 0 | 3 | 234567
TEST | 2018-01-14 | 0 | 3 | 234567
TEST | 2018-01-15 | 0 | 3 | 234567
TEST | 2018-01-16 | 1 | NULL | NULL
If anyone has any ideas on how to solve this, it would be most appreciated.
Andrew
You could replace your vwRSBooking with another view which uses a CTE to obtain all the dates the booking covers. Then use the view's coverdate for joining to the #tmpAvailability table:
CREATE VIEW vwRSBookingFull
AS
WITH cte ( bookingid, nights, depart, code, coverdate)
AS (SELECT bookingid,
nights,
depart,
code,
depart
FROM vwRSBooking
UNION ALL
SELECT c.bookingid,
c.nights,
c.depart,
c.code,
DATEADD(d, 1, c.coverdate)
FROM cte c
WHERE DATEDIFF(d, c.depart, c.coverdate) < (c.nights - 1))
SELECT c.bookingid,
c.nights,
c.depart,
c.code,
c.coverdate
FROM cte c
GO
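The recursive CTE above produces one row per night of each booking. As a rough illustration of that expansion, here is a small Python sketch. The helper `cover_dates` and the dictionary layout are hypothetical, the booking values are taken from the sample output in the question, and the sketch assumes `nights` is at least 1.

```python
from datetime import date, timedelta


def cover_dates(depart, nights):
    """Return every date a booking covers: depart, depart + 1, ..., depart + nights - 1."""
    return [depart + timedelta(days=offset) for offset in range(nights)]


# Bookings taken from the sample output in the question.
bookings = [
    {"bookingid": 123456, "depart": date(2018, 1, 6), "nights": 4},
    {"bookingid": 234567, "depart": date(2018, 1, 13), "nights": 3},
]

# Map each covered date to its booking, which is what the join on coverdate achieves.
covered = {
    day: booking
    for booking in bookings
    for day in cover_dates(booking["depart"], booking["nights"])
}
```

Looking up an availability date in `covered` then corresponds to the left join from #tmpAvailability to the view on coverdate: dates without an entry keep NULL for Nights and BookingID.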
You will need a calendar table containing every date in the range your dates may fall into. For this example, I built one for January 2018. We can then join onto this table to create the additional rows.
Here is the sample code I used. You can see it at SQL Fiddle.
CREATE TABLE code (
code varchar(max),
dates date,
available int,
nights int,
bookingid int
)
INSERT INTO code VALUES
('TEST','2018-01-04','1',NULL,NULL),
('TEST','2018-01-05','1',NULL,NULL),
('TEST','2018-01-06','0',4,123456),
('TEST','2018-01-07','0',NULL,NULL),
('TEST','2018-01-08','0',NULL,NULL),
('TEST','2018-01-09','0',NULL,NULL),
('TEST','2018-01-10','1',NULL,NULL),
('TEST','2018-01-11','1',NULL,NULL),
('TEST','2018-01-12','1',NULL,NULL),
('TEST','2018-01-13','0',3,234567),
('TEST','2018-01-14','0',NULL,NULL),
('TEST','2018-01-15','0',NULL,NULL)
CREATE TABLE dates (
dates date
)
INSERT INTO dates VALUES
('2018-01-01'),('2018-01-02'),('2018-01-03'),('2018-01-04'),('2018-01-05'),('2018-01-06'),('2018-01-07'),('2018-01-08'),('2018-01-09'),('2018-01-10'),('2018-01-11'),('2018-01-12'),('2018-01-13'),('2018-01-14'),('2018-01-15'),('2018-01-16'),('2018-01-17'),('2018-01-18'),('2018-01-19'),('2018-01-20'),('2018-01-21'),('2018-01-22'),('2018-01-23'),('2018-01-24'),('2018-01-25'),('2018-01-26'),('2018-01-27'),('2018-01-28'),('2018-01-29'),('2018-01-30'),('2018-01-31')
Here is the query based on this dataset:
SELECT
code.code,
dates.dates,
code.available,
code.nights,
code.bookingid
FROM code
LEFT JOIN dates ON
dates.dates >= code.dates
AND dates.dates < DATEADD(DAY,nights,code.dates)
Edit: Here is an example using your initial query as a subquery to join your result set onto the dates table if you want a copy & paste. Still requires creating the dates table.
SELECT
ServiceCode,
StartDate,
Available,
Nights,
BookingID
FROM (
SELECT
ServiceCode,
StartDate,
Available,
Nights,
BookingID
FROM #tmpAvailability A
LEFT JOIN vwRSBooking B
ON B.Depart = A.StartDate
AND B.ServiceCode = A.SupplierCode
AND B.StatusID IN (2640, 2621)
) code
LEFT JOIN dates ON
dates.dates >= code.StartDate
AND dates.dates < DATEADD(DAY,code.Nights,code.StartDate)
ORDER BY StartDate;
I'm working on the following Presto SQL query, which uses an inline FILTER clause to get a side-by-side comparison of the current date range against the data from weeks ago.
In my case, the current date range is 2017-09-13 to 2017-09-14.
So far I'm able to get the following results, but unfortunately this is not what I want.
Any kind of help would be greatly appreciated.
SELECT
DATE_TRUNC('day',DATE_PARSE(CAST(sample.datep AS VARCHAR),'%Y%m%d')) AS date,
CAST(SUM(sample.page_views) FILTER (WHERE sample.datep BETWEEN 20170913 AND 20170914) AS DOUBLE) AS page_views,
CAST(SUM(sample.page_views) FILTER (WHERE sample.datep BETWEEN 20170906 AND 20170907) AS DOUBLE) AS page_views_weeks_ago
FROM
sample
WHERE
(
datep BETWEEN 20170906 AND 20170914
)
GROUP BY
1
ORDER BY
1 ASC
LIMIT 50
Actual result:
+------------+------------+----------------------+
| date | page_views | page_views_weeks_ago |
+------------+------------+----------------------+
| 2017-09-06 | 0 | 990,929 |
| 2017-09-07 | 0 | 913,802 |
| 2017-09-08 | 0 | 0 |
| 2017-09-09 | 0 | 0 |
| 2017-09-10 | 0 | 0 |
| 2017-09-11 | 0 | 0 |
| 2017-09-12 | 0 | 0 |
| 2017-09-13 | 1,507,715 | 0 |
| 2017-09-14 | 48,625 | 0 |
+------------+------------+----------------------+
Expected result:
+------------+------------+----------------------+
| date | page_views | page_views_weeks_ago |
+------------+------------+----------------------+
| 2017-09-13 | 1,507,715 | 990,929 |
| 2017-09-14 | 48,625 | 913,802 |
+------------+------------+----------------------+
You can achieve this by joining the table with itself, matching each day to the same day one week earlier. For brevity, I assume that we have a date field so that date subtraction can be done easily.
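-- Assumes one row per date in sample; with several rows per date,
-- aggregate each side before joining so the sums are not inflated.
SELECT curr.date,
SUM(curr.page_views) AS page_views,
SUM(prev.page_views) AS page_views_weeks_ago
FROM sample curr
JOIN sample prev ON prev.date = curr.date - INTERVAL '7' DAY
GROUP BY 1
ORDER BY 1 ASC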
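The self-join pairs each day with the day seven days before it. A minimal Python sketch of that pairing, using the page-view totals from the question, might look like this. The function `week_over_week` and the dictionary of daily totals are assumptions for illustration, and unlike the inner join this sketch reports 0 when the earlier day is missing.

```python
from datetime import date, timedelta


def week_over_week(daily_views, start, end, weeks_back=1):
    """Pair each day in [start, end] with the total from weeks_back weeks earlier.

    daily_views maps a date to that day's total page views (already aggregated).
    """
    offset = timedelta(weeks=weeks_back)
    rows = []
    day = start
    while day <= end:
        current = daily_views.get(day, 0)
        earlier = daily_views.get(day - offset, 0)  # the "prev" side of the join
        rows.append((day, current, earlier))
        day += timedelta(days=1)
    return rows


# Daily totals taken from the "Actual result" table in the question.
daily_views = {
    date(2017, 9, 6): 990929,
    date(2017, 9, 7): 913802,
    date(2017, 9, 13): 1507715,
    date(2017, 9, 14): 48625,
}
rows = week_over_week(daily_views, date(2017, 9, 13), date(2017, 9, 14))
```

Aggregating to one total per day before pairing is the design choice that keeps the comparison from double counting when the raw table holds several rows per date.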
I have a problem with my query.
My table stores data like this:
ContractID | Staff_ID | EffectDate | End Date | Salary | active
-------------------------------------------------------------------------
1 | 1 | 2013-01-01 | 2013-12-30 | 100 | 0
2 | 1 | 2014-01-01 | 2014-12-30 | 150 | 0
3 | 1 | 2015-01-01 | 2015-12-30 | 200 | 1
4 | 2 | 2014-05-01 | 2015-04-30 | 500 | 0
5 | 2 | 2015-05-01 | 2016-04-30 | 700 | 1
I would like to write a query like below:
ContractID | Staff_ID | EffectDate | End Date | Salary | Increase
-------------------------------------------------------------------------
1 | 1 | 2013-01-01 | 2013-12-30 | 100 | 0
2 | 1 | 2014-01-01 | 2014-12-30 | 150 | 50
3 | 1 | 2015-01-01 | 2015-12-30 | 200 | 50
4 | 2 | 2014-05-01 | 2015-04-30 | 500 | 0
5 | 2 | 2015-05-01 | 2016-04-30 | 700 | 200
-------------------------------------------------------------------------
The Increase column is the current contract's salary minus the previous contract's salary.
I am using SQL Server 2008 R2.
Unfortunately, SQL Server 2008 R2 doesn't have LAG, but you can simulate the effect of obtaining the previous row (prev) in the scope of the current row (cur) with a ROW_NUMBER ranking and a self join to the previous ranked row in the same partition (by Staff_ID):
With CTE AS
(
SELECT [ContractID], [Staff_ID], [EffectDate], [End Date], [Salary],[active],
ROW_NUMBER() OVER (Partition BY Staff_ID ORDER BY ContractID) AS Rnk
FROM Table1
)
SELECT cur.[ContractID], cur.[Staff_ID], cur.[EffectDate], cur.[End Date],
cur.[Salary], cur.Rnk,
CASE WHEN (cur.Rnk = 1) THEN 0 -- i.e. baseline salary
ELSE cur.Salary - prev.Salary END AS Increase
FROM CTE cur
LEFT OUTER JOIN CTE prev
ON cur.[Staff_ID] = prev.Staff_ID and cur.Rnk - 1 = prev.Rnk;
(If ContractID were always perfectly incrementing, we wouldn't need the ROW_NUMBER and could join on consecutive ContractIDs, but I didn't want to make this assumption.)
SqlFiddle here
Edit
If you have SQL Server 2012 or later, the LEAD and LAG analytic functions make this kind of query much simpler:
SELECT [ContractID], [Staff_ID], [EffectDate], [End Date], [Salary],
Salary - LAG(Salary, 1, Salary) OVER (Partition BY Staff_ID ORDER BY ContractID) AS Incr
FROM Table1
Updated SqlFiddle
One trick here is that we are calculating delta increments in salary, so for an employee's first contract LAG's default value (its third argument) is set to the current salary, so that Salary - Salary = 0 for the first increase.
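The effect of LAG with the current salary as its default can be sketched in a few lines of Python. The function `salary_increases` and the three-field tuple layout are assumptions for illustration; the contract values come from the sample table in the question.

```python
def salary_increases(contracts):
    """contracts: list of (contract_id, staff_id, salary) tuples.

    Returns (contract_id, staff_id, salary, increase) in contract_id order,
    where increase is the salary minus that staff member's previous salary.
    """
    previous_salary = {}  # last salary seen per staff_id
    out = []
    for contract_id, staff_id, salary in sorted(contracts):
        # Default to the current salary on the first contract, so the
        # first increase is salary - salary = 0, like LAG(Salary, 1, Salary).
        prev = previous_salary.get(staff_id, salary)
        out.append((contract_id, staff_id, salary, salary - prev))
        previous_salary[staff_id] = salary
    return out


# Contract data from the sample table in the question.
contracts = [(1, 1, 100), (2, 1, 150), (3, 1, 200), (4, 2, 500), (5, 2, 700)]
increases = salary_increases(contracts)
```

Keeping a separate "previous salary" per staff member is the in-memory counterpart of PARTITION BY Staff_ID.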
Given the following two table scenario, how would I go about outputting the commission percentage based on the date range:
Commission Percentages
| User ID | Start Date | End Date | Percentage
| -------- | ---------- | ----------- | ----------
| 1 | 11/11/2014 | 11/30/2014 | 10%
| 1 | 11/30/2014 | NULL | 20%
| 2 | 10/10/2014 | NULL | 15%
Sales
| User ID | Sale Date |
| -------- | ---------- |
| 1 | 11/24/2014 |
| 1 | 12/1/2014 |
| 2 | 12/30/2014 |
I would like to end up with a join between the two like so (a null value in the end date field represents present - and the dates will also include a time stamp):
| User ID | Sales Date | Start Date | End Date | Percentage
| -------- | ---------- | ---------- | ---------- | ----------
| 1 | 11/24/2014 | 11/11/2014 | 11/30/2014 | 10%
| 1 | 12/1/2014 | 11/30/2014 | NULL | 20%
| 2 | 12/30/2014 | 10/10/2014 | NULL | 15%
I am using SQL Server 2012
Thanks
Something like this might work for you; however, you need to figure out your date logic (i.e. whether it should be greater than, or greater than or equal to) depending on how your system works:
select S.UserID, S.SalesDate, C.StartDate, C.EndDate, C.Percentage
from Sales AS S
inner join Commission AS C
on C.UserID = S.UserID
AND S.SalesDate > C.StartDate
AND S.SalesDate <= coalesce(C.EndDate, S.SalesDate)
Based on the data, I'm assuming the end date is the first date on which the percentage no longer applies. User ID 1 has ranges that touch: 11/30/2014 is both the end date of one row and the start date of the next.
SELECT s.User_ID,
s.Sale_Date,
cp.Start_Date,
cp.End_Date,
cp.Percentage
FROM Commission_Percentages cp
INNER JOIN Sales s
ON s.User_ID = cp.User_ID
AND s.Sale_Date >= cp.Start_Date
AND (s.Sale_Date < cp.End_Date OR cp.End_Date IS NULL)
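The join condition in the last query treats each commission row as a half-open range: the start date is included, the end date is excluded, and a NULL end date means "still in effect". A minimal Python sketch of that matching rule follows. The function `commission_for` and the tuple layout are hypothetical, and the data is taken from the tables in the question.

```python
from datetime import date


def commission_for(user_id, sale_date, commissions):
    """Return (start, end, percentage) of the commission row covering sale_date.

    A row matches when start <= sale_date < end; end = None means open-ended.
    Returns None when no row matches.
    """
    for row_user, start, end, percentage in commissions:
        if row_user != user_id:
            continue
        if start <= sale_date and (end is None or sale_date < end):
            return (start, end, percentage)
    return None


# Data from the question; None plays the role of a NULL End Date.
commissions = [
    (1, date(2014, 11, 11), date(2014, 11, 30), 10),
    (1, date(2014, 11, 30), None, 20),
    (2, date(2014, 10, 10), None, 15),
]
sales = [(1, date(2014, 11, 24)), (1, date(2014, 12, 1)), (2, date(2014, 12, 30))]

matched = [(user, day) + commission_for(user, day, commissions) for user, day in sales]
```

Because the end date is excluded, a sale made exactly on 11/30/2014 falls into the 20% row only, so the touching ranges for User ID 1 never produce two matches.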