How can I transform SQL to show start and stop dates and show every day? - sql

Person  State  StartDate
joe     blue   2/4/2020
bob     red    12/1/2019
bob     black  12/3/2009
joe     blue   2/4/2018
joe     red    12/1/2015
mary    black  12/3/2009
I have a table set up as shown above. I want to transform this to the following:
Person  State  StartDate  EndDate
joe     blue   2/4/2020
bob     red    12/1/2019
bob     black  12/3/2009  11/30/2019
joe     blue   2/4/2018   2/3/2020
joe     red    12/1/2015  2/3/2018
mary    black  12/3/2009
After this, I want to have one line for every calendar day that a Person is in a given state. If there is no end date, the days in a given state should stop at the current date.
How can I do this with SQL only?

Perhaps the window function lead() in concert with dateadd() would be a good option
Example
Select *
,EndDate = dateadd(day,-1,lead(StartDate,1) over (partition by Person Order by StartDate))
From YourTable A
Returns
Person State StartDate EndDate
bob black 2009-12-03 2019-11-30
bob red 2019-12-01 NULL
joe red 2015-12-01 2018-02-03
joe blue 2018-02-04 2020-02-03
joe blue 2020-02-04 NULL
mary black 2009-12-03 NULL
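For readers without a SQL Server instance at hand, the same LEAD-minus-one-day idea can be tried with Python's built-in sqlite3 module. This is only a sketch: the table name t, the ISO date strings, and the function name end_dates are assumptions made for the illustration, SQLite's date(..., '-1 day') stands in for dateadd(), and a SQLite build with window functions (3.25 or later) is assumed.

```python
import sqlite3

def end_dates():
    # In-memory table holding the question's sample rows as ISO date strings.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (person TEXT, state TEXT, startdate TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", [
        ("joe", "blue", "2020-02-04"),
        ("bob", "red", "2019-12-01"),
        ("bob", "black", "2009-12-03"),
        ("joe", "blue", "2018-02-04"),
        ("joe", "red", "2015-12-01"),
        ("mary", "black", "2009-12-03"),
    ])
    # LEAD() fetches the next start date per person; subtracting one day
    # gives the end date. The last row per person has no next row, so
    # date(NULL, '-1 day') stays NULL.
    return con.execute("""
        SELECT person, state, startdate,
               date(LEAD(startdate) OVER (PARTITION BY person
                                          ORDER BY startdate), '-1 day') AS enddate
        FROM t
        ORDER BY person, startdate
    """).fetchall()

for row in end_dates():
    print(row)
```

The rows come back in the same order as the result listed above, with None where SQL Server shows NULL.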
EDIT - To Expand Into Daily
We take the query above and add IsNull(...,convert(date,getdate())) to trap the end dates. Then we create an ad-hoc calendar table and perform a simple join.
Select A.*
,B.D
From (
Select *
,EndDate = IsNull(dateadd(day,-1,lead(StartDate,1) over (partition by Person Order by StartDate)),convert(date,getdate()))
From YourTable A
) A
Join (
Select Top (datediff(day,'1999-12-31',getdate()))
D=dateadd(day,Row_Number() Over (Order By (Select NULL)),convert(date,'1999-12-31'))
From master..spt_values n1, master..spt_values n2
) B on D between StartDate and EndDate
Order By Person,D
Returns 10,210 rows

I like to use recursive CTEs for expanding data. It is pretty simple in your case:
with cte as (
select person, state, startdate,
lead(dateadd(day, -1, startdate),
1,
convert(date, getdate())
) over (partition by person order by startdate) as enddate
from t
union all
select person, state, dateadd(day, 1, startdate), enddate
from cte
where startdate < enddate
)
select person, state, startdate
from cte
option (maxrecursion 0);
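The recursive expansion can also be checked outside SQL Server. The sketch below ports the idea to SQLite through Python's sqlite3 module; the function name expand_daily, the today parameter (passed in so the result is reproducible instead of calling the current date), and the two-row sample are assumptions for the illustration, not part of the answer above.

```python
import sqlite3

def expand_daily(rows, today):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (person TEXT, state TEXT, startdate TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    return con.execute("""
        WITH RECURSIVE spans AS (
            -- End date = day before the next start, or :today if none.
            SELECT person, state, startdate,
                   COALESCE(date(LEAD(startdate) OVER (PARTITION BY person
                                                       ORDER BY startdate),
                                 '-1 day'),
                            :today) AS enddate
            FROM t
        ),
        days(person, state, day, enddate) AS (
            -- Anchor: the first day of each span.
            SELECT person, state, startdate, enddate FROM spans
            UNION ALL
            -- Step: add one day until the span's end date is reached.
            SELECT person, state, date(day, '+1 day'), enddate
            FROM days
            WHERE day < enddate
        )
        SELECT person, state, day FROM days ORDER BY person, day
    """, {"today": today}).fetchall()

sample = [("joe", "red", "2024-01-01"), ("joe", "blue", "2024-01-04")]
for row in expand_daily(sample, "2024-01-05"):
    print(row)
```

With this sample, joe is red for three days and blue for two, so five rows are printed. SQLite has no maxrecursion option; the recursion simply stops when the WHERE clause no longer matches.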

Related

Find intersecting dates

Can somebody help me with the following problem? I have an MS Access table of my employees, and for each of them I have the start and end dates of their vacations:
Name begin end
John 1.3.2021. 15.3.2021.
Robert 6.3.2021. 8.3.2021.
Lisa 13.3.2021. 16.3.2021.
John 1.4.2021. 3.4.2021.
Robert 2.4.2021. 2.4.2021.
Lisa 15.5.2021. 23.5.2021.
Lisa 5.6.2021. 15.6.2021.
How can I get the number of employees absent from work on each date covered by the table (that is, each date that falls within a begin-end interval)? For example:
1.3.2021. 1 '>>>only John
2.3.2021. 1 '>>>only John
3.3.2021. 1 '>>>only John
4.3.2021. 1 '>>>only John
5.3.2021. 1 '>>>only John
6.3.2021. 2 '>>>John and Robert
7.3.2021. 2 '>>>John and Robert
...
Thank you in advance!
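You can use union to collect the distinct begin and end dates and a correlated subquery to count the vacations covering each one. This returns only dates that appear as a begin or end value; to get every date inside the intervals, replace the union subquery with a calendar table.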
select dte,
(select count(*)
from t
where d.dte between t.[begin] and t.[end]
) as cnt
from (select [begin] as dte
from t
union
select [end]
from t
) d;
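To see what a per-day count looks like when every date inside the intervals is listed, here is a small sketch using Python's sqlite3 module. Access has no recursive CTEs, so this only illustrates the logic; in Access the cal list would have to be a real calendar table. The names absences_per_day, begin_date, end_date, cal and last_day are assumptions for the example, and the dates are rewritten as ISO strings.

```python
import sqlite3

def absences_per_day(vacations):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (name TEXT, begin_date TEXT, end_date TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", vacations)
    return con.execute("""
        WITH RECURSIVE cal(dte, last_day) AS (
            -- Calendar from the earliest begin to the latest end.
            SELECT MIN(begin_date), MAX(end_date) FROM t
            UNION ALL
            SELECT date(dte, '+1 day'), last_day FROM cal WHERE dte < last_day
        )
        -- The inner join drops days on which nobody is absent.
        SELECT cal.dte, COUNT(*) AS cnt
        FROM cal
        JOIN t ON cal.dte BETWEEN t.begin_date AND t.end_date
        GROUP BY cal.dte
        ORDER BY cal.dte
    """).fetchall()

march = [
    ("John", "2021-03-01", "2021-03-15"),
    ("Robert", "2021-03-06", "2021-03-08"),
    ("Lisa", "2021-03-13", "2021-03-16"),
]
for dte, cnt in absences_per_day(march):
    print(dte, cnt)
```

For the March rows of the question this prints one line per day from 2021-03-01 to 2021-03-16, with a count of 2 on the days where two vacations overlap.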

Oracle Function LAG and order by

*Edit Question to reflect true output, see comments.
I have the below data, I need previous program.
TableA
StartDate EndDate Program Id
1/26/15 2/23/15 Red 1
2/24/15 3/31/17 Yellow 1
5/3/16 6/1/17 Silver 1
4/1/17 1/31/18 Orange 1
2/1/18 Blue 1
MyOutput(incorrect)
StartDate EndDate Program Prev_program
1/26/15 2/23/15 Red
2/24/15 3/31/17 Yellow Red
5/3/16 6/1/17 Silver Yellow
4/1/17 1/31/18 Orange Silver
2/1/18 Blue Orange
ExpectedOutput:
StartDate EndDate Program Prev_program
1/26/15 2/23/15 Red
2/24/15 3/31/17 Yellow Red
5/3/16 6/1/17 Silver Red
4/1/17 1/31/18 Orange Yellow
2/1/18 Blue Orange
I would like to take the previous program when previous program end date is not greater than current startdate.
I used Lag which is producing results that I do not want. Lag is not taking into account "program end date is not greater than current startdate."
SELECT a.*
,LAG (PROGRAM, 1) OVER (PARTITION BY ID ORDER BY STARTDATE) AS PREV_PROGRAM
FROM TABLEA a
Here is one way to do this. Not the most elegant or efficient, but it does the job. Two indexes, one on (id, startdate) and one on (id, enddate) may help with performance (worth testing, anyway). You are missing the id column in the output, but I assume it plays a role (and you want the processing to be done separately for each id). I wrote the query to work separately for each id, even though the test data has only one id.
The with clause is not part of the query - I included it at the top instead of creating an actual table. You don't need it - start from SELECT a1.startdate...
with
table_a ( startdate, enddate, program, id ) as (
select date '2015-01-26', date '2015-02-23', 'Red' , 1 from dual union all
select date '2015-02-24', date '2017-03-31', 'Yellow', 1 from dual union all
select date '2016-03-05', date '2017-06-01', 'Silver', 1 from dual union all
select date '2017-04-01', date '2018-01-31', 'Orange', 1 from dual union all
select date '2018-02-01', null , 'Blue' , 1 from dual
)
select a1.startdate, a1.enddate, a1.program, a1.id,
min(a2.program) keep (dense_rank last order by a2.startdate) as prev_program
from table_a a1 left outer join table_a a2
on a1.id = a2.id and a1.startdate > a2.enddate
group by a1.startdate, a1.enddate, a1.program, a1.id
;
STARTDATE ENDDATE PROGRAM ID PREV_PROGRAM
---------- ---------- -------- -- ------------
1/26/2015 2/23/2015 Red 1
2/24/2015 3/31/2017 Yellow 1 Red
3/5/2016 6/1/2017 Silver 1 Red
4/1/2017 1/31/2018 Orange 1 Yellow
2/1/2018 Blue 1 Orange
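The rule "latest program that ended before this one started" can also be written as a correlated subquery, which is easy to try in SQLite via Python's sqlite3 module. This is a sketch of the same logic, not the Oracle KEEP (DENSE_RANK LAST) syntax; the function name previous_programs is an assumption, and the Silver start date is taken as 2016-05-03, reading the question's 5/3/16 as month/day/year.

```python
import sqlite3

def previous_programs():
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE table_a
                   (startdate TEXT, enddate TEXT, program TEXT, id INTEGER)""")
    con.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?)", [
        ("2015-01-26", "2015-02-23", "Red", 1),
        ("2015-02-24", "2017-03-31", "Yellow", 1),
        ("2016-05-03", "2017-06-01", "Silver", 1),   # assumed m/d/y reading
        ("2017-04-01", "2018-01-31", "Orange", 1),
        ("2018-02-01", None, "Blue", 1),
    ])
    # For each row, pick the most recently started program (same id)
    # whose end date is strictly before this row's start date.
    return con.execute("""
        SELECT a1.program,
               (SELECT a2.program
                FROM table_a a2
                WHERE a2.id = a1.id
                  AND a2.enddate < a1.startdate
                ORDER BY a2.startdate DESC
                LIMIT 1) AS prev_program
        FROM table_a a1
        ORDER BY a1.id, a1.startdate
    """).fetchall()

for program, prev in previous_programs():
    print(program, prev)
```

Rows with a NULL end date never qualify as a previous program, because the comparison with NULL is not true.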

Advanced Sql query solution required

player team start_date end_date points
John Jacob SportsBallers 2015-01-01 2015-03-31 100
John Jacob SportsKings 2015-04-01 2015-12-01 115
Joe Smith PointScorers 2014-01-01 2016-12-31 125
Bill Johnson SportsKings 2015-01-01 2015-06-30 175
Bill Johnson AllStarTeam 2015-07-01 2016-12-31 200
The above table has many more rows. I was asked the below questions in an interview.
1.)For each player, which team did they play for on 2015-01-01?
I could not answer this one.
2.)For each player, how can we get the team for whom they scored the most points?
select team from Players
where points in (select max(points) from players group by player).
Please, solutions for both.
1
select *
from PlayerTeams
where start_date <= '2015-01-01' and end_date >= '2015-01-01'
2
Select player, team, points
from(
Select *, row_number() over (partition by player order by points desc) as rank
From PlayerTeams) as player
where rank = 1
For #1:
Select Player
,Team
From table
Where '2015-01-01' between start_date and end_date
For #2:
select t.Player
,t.Team
from table t
inner join (select Player
,Max(points) as points
from table
group by Player) m
on t.Player = m.Player
and t.points = m.points
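Both answers can be tried against the sample rows with Python's sqlite3 module. This is a sketch under some assumptions: the table is called Players as in the question's own query, the function name player_answers is made up for the example, and Bill Johnson's first end date is entered as 2015-06-30 because June has only 30 days.

```python
import sqlite3

def player_answers():
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE Players
                   (player TEXT, team TEXT, start_date TEXT,
                    end_date TEXT, points INTEGER)""")
    con.executemany("INSERT INTO Players VALUES (?, ?, ?, ?, ?)", [
        ("John Jacob", "SportsBallers", "2015-01-01", "2015-03-31", 100),
        ("John Jacob", "SportsKings", "2015-04-01", "2015-12-01", 115),
        ("Joe Smith", "PointScorers", "2014-01-01", "2016-12-31", 125),
        ("Bill Johnson", "SportsKings", "2015-01-01", "2015-06-30", 175),  # assumed valid date
        ("Bill Johnson", "AllStarTeam", "2015-07-01", "2016-12-31", 200),
    ])
    # Question 1: the row whose date range contains the given day.
    on_date = con.execute("""
        SELECT player, team
        FROM Players
        WHERE ? BETWEEN start_date AND end_date
        ORDER BY player
    """, ("2015-01-01",)).fetchall()
    # Question 2: rank each player's rows by points and keep the top one.
    top = con.execute("""
        SELECT player, team, points
        FROM (SELECT player, team, points,
                     ROW_NUMBER() OVER (PARTITION BY player
                                        ORDER BY points DESC) AS rn
              FROM Players)
        WHERE rn = 1
        ORDER BY player
    """).fetchall()
    return on_date, top

on_date, top = player_answers()
print(on_date)
print(top)
```

One design difference is worth knowing: ROW_NUMBER() returns exactly one row per player even when two teams tie on points, while the join against Max(points) returns every tied row.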

How can I add group numbers to sequential records in SQL Server 2012 when a column value changes?

I've been playing with window functions in SQL Server 2012 and can't get this to work, as I'm hoping to avoid a cursor and going row by row. My problem is that I need to add a group number to each record. The tricky part is that the group number has to increment each time a column value changes, even if it changes back to a value that existed before earlier in the sequence of records.
Here's an example of the data and my desired outcome:
if object_id('tempdb..#data') is not null
drop table #data
create table #data
(
id int identity(1,1)
,mytime datetime
,distance int
,direction varchar(20)
)
insert into #data (mytime, distance, direction)
values
('2016-01-01 08:00',10,'North')
,('2016-01-01 08:30',18,'North')
,('2016-01-01 09:00',15,'North')
,('2016-01-01 09:30',12,'South')
,('2016-01-01 10:00',16,'South')
,('2016-01-01 10:30',45,'North')
,('2016-01-01 11:00',23,'North')
,('2016-01-01 11:30',14,'South')
,('2016-01-01 12:00',40,'South')
Desired outcome:
mytime Distance Direction GroupNumber
--------------------------------------------------------
2016-01-01 8:00 10 North 1
2016-01-01 8:30 18 North 1
2016-01-01 9:00 15 North 1
2016-01-01 9:30 12 South 2
2016-01-01 10:00 16 South 2
2016-01-01 10:30 45 North 3
2016-01-01 11:00 23 North 3
2016-01-01 11:30 14 South 4
2016-01-01 12:00 40 South 4
Is this possible using window functions?
One way would be
WITH T
AS (SELECT *,
CASE
WHEN LAG(direction)
OVER (ORDER BY ID) = direction THEN 0
ELSE 1
END AS Flag
FROM #data)
SELECT mytime,
Distance,
Direction,
SUM(Flag) OVER (ORDER BY id) AS GroupNumber
FROM T
The above assumes Direction doesn't contain any NULLs; it would need a minor adjustment otherwise. You would also need to define whether two consecutive NULLs should be treated as equal. Assuming they should, the variant below would work:
WITH T
AS (SELECT *,
prev = LAG(direction) OVER (ORDER BY ID),
rn = ROW_NUMBER() OVER (ORDER BY ID)
FROM #data)
SELECT mytime,
Distance,
Direction,
SUM(CASE WHEN rn > 1 AND EXISTS(SELECT prev
INTERSECT
SELECT Direction) THEN 0 ELSE 1 END) OVER (ORDER BY id) AS GroupNumber
FROM T
ORDER BY ID
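The flag-then-running-sum pattern is not specific to SQL Server. As a quick way to experiment with it, here is a sketch of the first (non-NULL) variant in SQLite through Python's sqlite3 module; the table name data and the function name group_numbers are assumptions for the example.

```python
import sqlite3

def group_numbers(directions):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, direction TEXT)")
    con.executemany("INSERT INTO data (direction) VALUES (?)",
                    [(d,) for d in directions])
    rows = con.execute("""
        WITH flagged AS (
            -- flag = 1 whenever the direction differs from the previous row
            -- (the first row has no previous row, so it is flagged too).
            SELECT id, direction,
                   CASE WHEN LAG(direction) OVER (ORDER BY id) = direction
                        THEN 0 ELSE 1 END AS flag
            FROM data
        )
        -- The running total of the flags is the group number.
        SELECT SUM(flag) OVER (ORDER BY id) AS group_number
        FROM flagged
        ORDER BY id
    """).fetchall()
    return [r[0] for r in rows]

sample = ["North"] * 3 + ["South"] * 2 + ["North"] * 2 + ["South"] * 2
print(group_numbers(sample))
```

With the question's nine rows the group numbers come out as 1, 1, 1, 2, 2, 3, 3, 4, 4, matching the desired outcome.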

SQL Server 2014 - Return Team name based on most recent date (somewhat dynamically)

My title is misleading because I don't know how to sum it up better than that :)
I have a table that keeps a history of changes made to users and what teams they belong to. It starts with their initial team and date, then adds an entry via a trigger when we change their teams in the UserList table.
Our business, like many, loves month to month data. I don't want to have entries for every single month if they don't change teams. I'll get to why that's a problem.
Here is an example of the data in the TeamHistory Table
UserID|CurrentTeam|ChangeDate
User1-|Team1------|01-01-2016
User1-|Team2------|03-01-2016
When I run a view or query that rolls the data up by person and media type (I can have 4 entries for a single person in a single month - voice, fax, email and voicemail) I then need to add the team that they were working on for that month.
Using the above example, if I ran the data for all of last year, I would expect Jan-May to display Team1. Then from June to Dec, Team 2. The problem is if I join the date field in my view/query with this table and use an = sign, then I only get data for 1-1 and 6-1, clearly because I only have those values in the table to match against. If I tell it to do < or <=, I start encountering duplicates as it's just not specific enough.
If we need an example query, I can try to work something up that's not one of these massive views.
So let's assume this is my data:
Userid| Month |Media|Calls
User1-|-01/01/2016|Voice|200
User1-|-01/01/2016|Email|100
User1-|-02/01/2016|Voice|250
User1-|-02/01/2016|Email|120
User1-|-03/01/2016|Voice|250
User1-|-03/01/2016|Email|120
And the TeamHistory table has 2 entries, the team they started on for 1/1/2016 and then they switched for 3/1/2016. How do I join the two data sets, using the date and userid as my variables, to pull in the corresponding Team? Especially when I won't have an actual entry for 2/1/2016?
I'd want my final dataset to look like this:
Userid|Team | Month |Media|Calls
User1-|Team1|-01/01/2016|Voice|200
User1-|Team1|-01/01/2016|Email|100
User1-|Team1|-02/01/2016|Voice|250
User1-|Team1|-02/01/2016|Email|120
User1-|Team2|-03/01/2016|Voice|250
User1-|Team2|-03/01/2016|Email|120
Since you're using SQL Server (2012 and newer) you can use the LEAD() function to identify an end date for a given range:
;with cte aS (SELECT 'User1' as UserID, 'Team1' AS CurrentTeam, CAST('2016-01-01' AS DATE) as ChangeDate
UNION SELECT 'User1' as UserID, 'Team2' AS CurrentTeam, CAST('2016-06-01' AS DATE) as ChangeDate
UNION SELECT 'User1' as UserID, 'Team1' AS CurrentTeam, CAST('2016-08-15' AS DATE) as ChangeDate
UNION SELECT 'User2' as UserID, 'Team1' AS CurrentTeam, CAST('2016-02-01' AS DATE) as ChangeDate
UNION SELECT 'User2' as UserID, 'Team2' AS CurrentTeam, CAST('2016-07-01' AS DATE) as ChangeDate
)
SELECT *,COALESCE(LEAD(ChangeDate,1) OVER(PARTITION BY UserID ORDER BY ChangeDate),CAST(GETDATE() AS DATE)) as End_Dt
FROM cte
Returns:
UserID CurrentTeam ChangeDate End_Dt
User1 Team1 2016-01-01 2016-06-01
User1 Team2 2016-06-01 2016-08-15
User1 Team1 2016-08-15 2017-01-05
User2 Team1 2016-02-01 2016-07-01
User2 Team2 2016-07-01 2017-01-05
You could then join those ranges to a calendar table to get the individual months as well as calculate which team they spent more days in for a given month.
The LEAD() function returns the next row's value for a given field. PARTITION BY restarts that lookup for each group (here, per UserID), and ORDER BY defines which row counts as next (here, from one ChangeDate to the next).
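To make the join step concrete, here is a sketch that applies LEAD() ranges to the question's monthly rows, using SQLite through Python's sqlite3 module. The table name MonthlyCalls and the function name calls_with_team are assumptions, and the open-ended last range is left as NULL here instead of being capped at the current date.

```python
import sqlite3

def calls_with_team():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE TeamHistory (UserID TEXT, CurrentTeam TEXT, ChangeDate TEXT)")
    con.execute("CREATE TABLE MonthlyCalls (UserID TEXT, Month TEXT, Media TEXT, Calls INTEGER)")
    con.executemany("INSERT INTO TeamHistory VALUES (?, ?, ?)", [
        ("User1", "Team1", "2016-01-01"),
        ("User1", "Team2", "2016-03-01"),
    ])
    con.executemany("INSERT INTO MonthlyCalls VALUES (?, ?, ?, ?)", [
        ("User1", "2016-01-01", "Voice", 200),
        ("User1", "2016-01-01", "Email", 100),
        ("User1", "2016-02-01", "Voice", 250),
        ("User1", "2016-02-01", "Email", 120),
        ("User1", "2016-03-01", "Voice", 250),
        ("User1", "2016-03-01", "Email", 120),
    ])
    return con.execute("""
        WITH ranges AS (
            -- Each team is valid from ChangeDate up to the next ChangeDate.
            SELECT UserID, CurrentTeam, ChangeDate,
                   LEAD(ChangeDate) OVER (PARTITION BY UserID
                                          ORDER BY ChangeDate) AS End_Dt
            FROM TeamHistory
        )
        SELECT m.UserID, r.CurrentTeam, m.Month, m.Media, m.Calls
        FROM MonthlyCalls m
        JOIN ranges r
          ON r.UserID = m.UserID
         AND m.Month >= r.ChangeDate
         AND (r.End_Dt IS NULL OR m.Month < r.End_Dt)  -- half-open range
        ORDER BY m.UserID, m.Month, m.Media DESC
    """).fetchall()

for row in calls_with_team():
    print(row)
```

The range test is deliberately half-open (>= start, < end). Because End_Dt is the next ChangeDate itself, a BETWEEN comparison would match the switch month against both teams and bring back the duplicates described in the question.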
You might try this:
--A simple person table
DECLARE @pers TABLE(Person VARCHAR(100));
INSERT INTO @pers VALUES('Bob'),('Tim');
--a table reflecting your work-data
--attention Tim is changing in July to Team Red and still in July back to Blue
DECLARE @Team TABLE(Person VARCHAR(100),Team VARCHAR(100),ChangeDate DATE);
INSERT INTO @Team VALUES
('Bob','Red' ,{d'2016-04-01'})
,('Tim','Blue',{d'2016-04-13'})
,('Tim','Red' ,{d'2016-07-22'})
,('Bob','Blue',{d'2016-06-15'})
,('Tim','Blue',{d'2016-07-28'})
,('Bob','Red' ,{d'2016-10-15'})
,('Tim','Red' ,{d'2016-12-28'})
;
--A CTE to mock-up a numbers/tally/date-table
WITH FirstOfMonthDays(d) AS
(
SELECT {d'2016-01-01'}
UNION ALL SELECT {d'2016-02-01'}
UNION ALL SELECT {d'2016-03-01'}
UNION ALL SELECT {d'2016-04-01'}
UNION ALL SELECT {d'2016-05-01'}
UNION ALL SELECT {d'2016-06-01'}
UNION ALL SELECT {d'2016-07-01'}
UNION ALL SELECT {d'2016-08-01'}
UNION ALL SELECT {d'2016-09-01'}
UNION ALL SELECT {d'2016-10-01'}
UNION ALL SELECT {d'2016-11-01'}
UNION ALL SELECT {d'2016-12-01'}
)
--I use CONVERT(VARCHAR(6),ChangeDate,112) to get a string of YYYYMM
,Numbered AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY Person, CONVERT(VARCHAR(6),ChangeDate,112) ORDER BY ChangeDate DESC) AS Nr
,t.*
FROM @Team AS t
)
--Pick out the one with Nr=1, these are the last changes per month
,LastChangeInMonth AS
(
SELECT *
FROM Numbered
WHERE Nr=1
)
--The actual query
SELECT fom.d
,p.Person
,(
SELECT TOP 1 t.Team
FROM LastChangeInMonth AS t
WHERE t.Person=p.Person
AND CONVERT(VARCHAR(6),t.ChangeDate,112)<=CONVERT(VARCHAR(6),fom.d,112)
ORDER BY t.ChangeDate DESC
) AS fittingTeam
FROM FirstOfMonthDays AS fom
CROSS JOIN @pers AS p
ORDER BY p.Person,fom.d
Since you are using SQL Server 2014 (please tag your questions correctly!) this would be a bit easier with LEAD()/LAG(), but the idea is the same...
The result
2016-01-01 Bob NULL
2016-02-01 Bob NULL
2016-03-01 Bob NULL
2016-04-01 Bob Red
2016-05-01 Bob Red
2016-06-01 Bob Blue
2016-07-01 Bob Blue
2016-08-01 Bob Blue
2016-09-01 Bob Blue
2016-10-01 Bob Red
2016-11-01 Bob Red
2016-12-01 Bob Red
2016-01-01 Tim NULL
2016-02-01 Tim NULL
2016-03-01 Tim NULL
2016-04-01 Tim Blue
2016-05-01 Tim Blue
2016-06-01 Tim Blue
2016-07-01 Tim Blue
2016-08-01 Tim Blue
2016-09-01 Tim Blue
2016-10-01 Tim Blue
2016-11-01 Tim Blue
2016-12-01 Tim Red