Oracle Function LAG and order by - sql

*Edit Question to reflect true output, see comments.
I have the data below, and I need the previous program for each row.
TableA
StartDate  EndDate  Program  Id
1/26/15    2/23/15  Red      1
2/24/15    3/31/17  Yellow   1
5/3/16     6/1/17   Silver   1
4/1/17     1/31/18  Orange   1
2/1/18              Blue     1
MyOutput (incorrect):
StartDate  EndDate  Program  Prev_program
1/26/15    2/23/15  Red
2/24/15    3/31/17  Yellow   Red
5/3/16     6/1/17   Silver   Yellow
4/1/17     1/31/18  Orange   Silver
2/1/18              Blue     Orange
ExpectedOutput:
StartDate  EndDate  Program  Prev_program
1/26/15    2/23/15  Red
2/24/15    3/31/17  Yellow   Red
5/3/16     6/1/17   Silver   Red
4/1/17     1/31/18  Orange   Yellow
2/1/18              Blue     Orange
I would like to take the most recent previous program whose end date is not greater than the current row's start date.
I used LAG, but it produces results that I do not want: it always returns the immediately preceding row and does not take the end-date condition into account.
SELECT a.*
     , LAG (PROGRAM, 1) OVER (PARTITION BY ID ORDER BY STARTDATE) AS PREV_PROGRAM
FROM TABLEA a

Here is one way to do this. Not the most elegant or efficient, but it does the job. Two indexes, one on (id, startdate) and one on (id, enddate) may help with performance (worth testing, anyway). You are missing the id column in the output, but I assume it plays a role (and you want the processing to be done separately for each id). I wrote the query to work separately for each id, even though the test data has only one id.
The with clause is not part of the query - I included it at the top instead of creating an actual table. You don't need it - start from SELECT a1.startdate...
with
  table_a ( startdate, enddate, program, id ) as (
    select date '2015-01-26', date '2015-02-23', 'Red'   , 1 from dual union all
    select date '2015-02-24', date '2017-03-31', 'Yellow', 1 from dual union all
    select date '2016-03-05', date '2017-06-01', 'Silver', 1 from dual union all
    select date '2017-04-01', date '2018-01-31', 'Orange', 1 from dual union all
    select date '2018-02-01', null             , 'Blue'  , 1 from dual
  )
select a1.startdate, a1.enddate, a1.program, a1.id,
       min(a2.program) keep (dense_rank last order by a2.startdate) as prev_program
from   table_a a1 left outer join table_a a2
       on a1.id = a2.id and a1.startdate > a2.enddate
group by a1.startdate, a1.enddate, a1.program, a1.id
;
STARTDATE  ENDDATE    PROGRAM  ID PREV_PROGRAM
---------- ---------- -------- -- ------------
1/26/2015  2/23/2015  Red       1
2/24/2015  3/31/2017  Yellow    1 Red
3/5/2016   6/1/2017   Silver    1 Red
4/1/2017   1/31/2018  Orange    1 Yellow
2/1/2018              Blue      1 Orange
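For readers without an Oracle instance at hand, the same rule can be checked with a minimal sketch in Python's built-in sqlite3 module. This is not the answer's query: SQLite has no KEEP (DENSE_RANK LAST ...), so the sketch swaps in a correlated subquery that picks the latest-starting row whose end date is before the current start date. The function name previous_programs and the storage of dates as ISO text are assumptions made for illustration.

```python
import sqlite3

# Same rows as the answer's table_a. Dates are ISO strings, so plain string
# comparison agrees with chronological order (an assumption of this sketch).
ROWS = [
    ("2015-01-26", "2015-02-23", "Red", 1),
    ("2015-02-24", "2017-03-31", "Yellow", 1),
    ("2016-03-05", "2017-06-01", "Silver", 1),
    ("2017-04-01", "2018-01-31", "Orange", 1),
    ("2018-02-01", None, "Blue", 1),
]

# For each row, look up the latest-starting row of the same id that has
# already ended before this row starts. Rows with a NULL enddate never match.
QUERY = """
select a1.startdate, a1.program,
       (select a2.program
          from table_a a2
         where a2.id = a1.id
           and a2.enddate < a1.startdate
         order by a2.startdate desc
         limit 1) as prev_program
  from table_a a1
 order by a1.id, a1.startdate
"""


def previous_programs():
    """Return (startdate, program, prev_program) tuples for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table table_a (startdate text, enddate text, program text, id integer)"
    )
    conn.executemany("insert into table_a values (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in previous_programs():
        print(row)
```

The correlated subquery reads close to the stated requirement, but on large tables it may be slower than the join-and-aggregate form above, so it is worth testing both.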

Related

How can I transform SQL to show start and stop dates and show every day?

Person  State  StartDate
joe     blue   2/4/2020
bob     red    12/1/2019
bob     black  12/3/2009
joe     blue   2/4/2018
joe     red    12/1/2015
mary    black  12/3/2009
I have a table set up as shown above. I want to transform this to the following
Person  State  StartDate  EndDate
joe     blue   2/4/2020
bob     red    12/1/2019
bob     black  12/3/2009  11/30/2019
joe     blue   2/4/2018   2/3/2020
joe     red    12/1/2015  2/3/2018
mary    black  12/3/2009
After this, I want to have one line for every calendar day that a Person is in a given state. If there is no end date, the days in a given state should stop at the current date.
How can I do this with SQL only?
Perhaps the window function lead() in concert with dateadd() would be a good option
Example
Select *
,EndDate = dateadd(day,-1,lead(StartDate,1) over (partition by Person Order by StartDate))
From YourTable A
Returns
Person State StartDate EndDate
bob black 2009-12-03 2019-11-30
bob red 2019-12-01 NULL
joe red 2015-12-01 2018-02-03
joe blue 2018-02-04 2020-02-03
joe blue 2020-02-04 NULL
mary black 2009-12-03 NULL
EDIT - To Expand Into Daily
We take the query above and add IsNull(...,convert(date,getdate())) to trap the end dates. Then we create an ad-hoc calendar table and perform a simple join.
Select A.*
      ,B.D
 From (
        Select *
              ,EndDate = IsNull(dateadd(day,-1,lead(StartDate,1) over (partition by Person Order by StartDate)),convert(date,getdate()))
         From YourTable A
      ) A
 Join (
        Select Top (datediff(day,'1999-12-31',getdate()))
               D=dateadd(day,Row_Number() Over (Order By (Select NULL)),convert(date,'1999-12-31'))
         From master..spt_values n1, master..spt_values n2
      ) B on D between StartDate and EndDate
 Order By Person,D
Returns 10,210 rows
I like to use recursive CTEs for expanding data. It is pretty simple in your case:
with cte as (
      select person, state, startdate,
             lead(dateadd(day, -1, startdate),
                  1,
                  convert(date, getdate())
                 ) over (partition by person order by startdate) as enddate
      from t
      union all
      select person, state, dateadd(day, 1, startdate), enddate
      from cte
      where startdate < enddate
     )
select person, state, startdate
from cte
option (maxrecursion 0);
Here is a db<>fiddle.
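The two steps above (derive an end date with lead(), then expand each range into one row per day) can be tried without SQL Server using Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite's date() function stands in for dateadd(), the current date is passed in as a :today parameter so the result is reproducible, and the function name expand_days is hypothetical. It needs a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# The question's sample rows, with dates rewritten as ISO strings.
ROWS = [
    ("joe", "blue", "2020-02-04"),
    ("bob", "red", "2019-12-01"),
    ("bob", "black", "2009-12-03"),
    ("joe", "blue", "2018-02-04"),
    ("joe", "red", "2015-12-01"),
    ("mary", "black", "2009-12-03"),
]

# ranges: the end date is the day before the person's next start date,
#         or :today when there is no next row.
# days:   recursive expansion of each range, one row per calendar day.
QUERY = """
with recursive ranges as (
    select person, state, startdate,
           coalesce(
               lead(date(startdate, '-1 day'))
                   over (partition by person order by startdate),
               :today) as enddate
      from t
),
days as (
    select person, state, startdate as d, enddate from ranges
    union all
    select person, state, date(d, '+1 day'), enddate
      from days
     where d < enddate
)
select person, state, d from days order by person, d
"""


def expand_days(today):
    """Return one (person, state, day) tuple per day, up to `today` (ISO string)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (person text, state text, startdate text)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, {"today": today}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in expand_days("2020-02-06")[-5:]:
        print(row)
```

Passing the date in, rather than calling a current-date function inside the query, is a design choice that keeps the expansion testable; a start date later than the supplied date would still produce a single row for that start date.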

SQL get single row based on multiple condition after group by

I need help creating a query for the case below.
Suppose I have a table with the following records:
Name     Date        Time   Category  CategoryKey
John     10/20/2012  10:00  Low       2
Sam      10/20/2012  10:00  High      4
Harry    10/20/2012  10:00  Medium    1
Michael  10/20/2012  10:00  Grey      3
Rob      10/22/2013  11:00  Low       2
Marry    10/23/2014  12:00  Low       2
Richard  10/23/2014  12:00  Grey      3
Jack     10/24/2015  1:30   High      4
If there are multiple names for the same date and time, select only one record based on the following logic, stopping as soon as one of the conditions is met:
If Category is Medium then take name
Else If Category is High then take name
Else If Category is Low then take name
Else If Category is Grey then take name
The final result should be:
Name   Date        Time   Category  CategoryKey
Harry  10/20/2012  10:00  Medium    1
Rob    10/22/2013  11:00  Low       2
Marry  10/23/2014  12:00  Low       2
Jack   10/24/2015  1:30   High      4
The simplest method is row_number():
select t.*
from (select t.*,
             row_number() over (partition by date, time
                                order by (case category when 'Medium' then 1 when 'High' then 2 when 'Low' then 3 when 'Grey' then 4 else 5 end)
                               ) as seqnum
      from t
     ) t
where seqnum = 1;
It can be convenient to use string functions here:
row_number() over (partition by date, time
order by charindex(category, 'Medium,High,Low,Grey')
) as seqnum
This works for your case, but you need to be a little careful that all values are included and that none "contain" another value.
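The row_number() approach can be checked against the question's sample data with a small sketch using Python's built-in sqlite3 module. The table name t and the column names follow the answer; the function name pick_one_per_slot and the final ORDER BY are assumptions added only to make the output order deterministic. It needs a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# The question's sample rows: (name, date, time, category, categorykey).
ROWS = [
    ("John", "10/20/2012", "10:00", "Low", 2),
    ("Sam", "10/20/2012", "10:00", "High", 4),
    ("Harry", "10/20/2012", "10:00", "Medium", 1),
    ("Michael", "10/20/2012", "10:00", "Grey", 3),
    ("Rob", "10/22/2013", "11:00", "Low", 2),
    ("Marry", "10/23/2014", "12:00", "Low", 2),
    ("Richard", "10/23/2014", "12:00", "Grey", 3),
    ("Jack", "10/24/2015", "1:30", "High", 4),
]

# Rank the rows inside each (date, time) group by the required priority
# Medium > High > Low > Grey, then keep only the top-ranked row per group.
QUERY = """
select name, date, time, category, categorykey
  from (select t.*,
               row_number() over (
                   partition by date, time
                   order by case category
                                when 'Medium' then 1
                                when 'High'   then 2
                                when 'Low'    then 3
                                when 'Grey'   then 4
                                else 5
                            end) as seqnum
          from t) as ranked
 where seqnum = 1
 order by date, time
"""


def pick_one_per_slot():
    """Return the single preferred row for every (date, time) combination."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (name text, date text, time text, category text, categorykey integer)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pick_one_per_slot():
        print(row)
```

If two rows in the same group share a category, row_number() picks one of them arbitrarily; adding a tie-breaker such as name to the window's ORDER BY would make that choice stable.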

Expanding/changing my query to find more entries using (potentially) IFELSE

My question will use this dataset as an example. I have a query setup (I have changed variables to more generic variables for the sake of posting this on the internet so the query may not make perfect sense) that picks the most recent date for a given account. So the query returns values with a reason_type of 1 with the most recent date. This query has effective_date set to is not null.
account  date        effective_date  value  reason_type
123456   4/20/2017   5/1/2017        5      1
123456   1/20/2017   2/1/2017        10     1
987654   2/5/2018    3/1/2018        15     1
987654   12/31/2017  2/1/2018        20     1
456789   4/27/2018   5/1/2018        50     1
456789   1/24/2018   2/1/2018        60     1
456123   4/25/2017   null            15     2
789123   5/1/2017    null            16     2
666888   2/1/2018    null            31     2
333222   1/1/2018    null            20     2
What I am looking to do now is to basically use that logic to only apply to reason_type
if there is an entry for it, otherwise have it default to reason_type
I think I should be using an IFELSE, but I'm admittedly not knowledgeable about how I would go about that.
Here is the code that I currently have to return the reason_type 1s most recent entry.
I hope my question is clear.
SELECT account, date, effective_date, value, reason_type
FROM
(
  SELECT account, date, effective_date, value, reason_type,
         ROW_NUMBER() OVER (PARTITION BY account ORDER BY date DESC) rn
  FROM mytable
  WHERE value IS NOT NULL
  AND effective_date IS NOT NULL
)
WHERE rn = 1
I think you might want something like this (do you really have a column named date by the way? That seems like a bad idea):
SELECT account, date, effective_date, value, reason_type
FROM (
SELECT account, date, effective_date, value, reason_type
, ROW_NUMBER() OVER ( PARTITION BY account ORDER BY date DESC ) AS rn
FROM mytable
WHERE value IS NOT NULL
) WHERE rn = 1
-- effective_date IS NULL or is on or before today's date
AND ( effective_date IS NULL OR effective_date < TRUNC(SYSDATE+1) );
Hope this helps.

Conditional Analytic Function

Work:
,MAX(CASE WHEN MAX(HIST)
AND workid IS NOT NULL
AND ROLE = 'red'
THEN 'ASSIGNED'
ELSE 'UNASSIGNED'
END)
OVER (PARTITION BY id) AS ASSIGNED
Criteria:
Partition By ID
Look at last entry from each ID, utilizing the PKHistid column
If Role = Red and Workid IS NOT NULL from the last entry for each ID
Then Assigned
Else Unassigned
Table:
PKHistid  ID   Role    Entry_Date  Workid
1         101  Red     1/1/17      201
2         101  Yellow  1/2/17      201
3         102  Yellow  5/1/17      (Null)
4         102  Red     6/1/17      202
5         103  Red     7/1/17      202
6         103  Red     7/5/17      202
Expected Results: (New Column Assigned_Status)
PKHistid  ID   Role    Entry_Date  Workid  *Assigned_Status
1         101  Red     1/1/17      201     Unassigned
2         101  Yellow  1/2/17      201     Unassigned
3         102  Yellow  5/1/17      (Null)  Assigned
4         102  Red     6/1/17      202     Assigned
5         103  Red     7/1/17      202     Assigned
6         103  Red     7/5/17      202     Assigned
Is this "instead of" your earlier question (also posted today), or is it "in addition to" it? If it is "in addition to", note that you can do both things in the same query.
Here you need a case expression to create the additional column. In the case expression, the condition uses an analytic function. I prefer the analytic version of the LAST function (which, unfortunately, many developers don't seem to know and use). Please read the Oracle documentation for it if it is not familiar to you.
Note that analytic functions can't be nested; but there is absolutely no prohibition against using analytic functions in case expressions. I often see solutions where the analytic function is called in a subquery, and then further processing (such as case expressions using the result from the analytic functions) is done in an outer query. Unnecessary layering!
with
  inputs ( pkhistid, id, role, entry_date, workid) as (
    select 1, 101, 'Red'   , to_date('1/1/17', 'mm/dd/rr'), 201  from dual union all
    select 2, 101, 'Yellow', to_date('1/2/17', 'mm/dd/rr'), 201  from dual union all
    select 3, 102, 'Yellow', to_date('5/1/17', 'mm/dd/rr'), null from dual union all
    select 4, 102, 'Red'   , to_date('6/1/17', 'mm/dd/rr'), 202  from dual union all
    select 5, 103, 'Red'   , to_date('7/1/17', 'mm/dd/rr'), 202  from dual union all
    select 6, 103, 'Red'   , to_date('7/5/17', 'mm/dd/rr'), 202  from dual
  )
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins BELOW THIS LINE. Use your actual table and column names.
select pkhistid, id, role, entry_date, workid,
       case when max(role) keep (dense_rank last order by pkhistid)
                 over (partition by id) = 'Red'
             and
                 max(workid) keep (dense_rank last order by pkhistid)
                 over (partition by id) is not null
            then 'Assigned'
            else 'Unassigned' end as assigned_status
from   inputs
order by id, pkhistid -- If needed
;
  PKHISTID         ID ROLE   ENTRY_DATE     WORKID ASSIGNED_STATUS
---------- ---------- ------ ---------- ---------- ---------------
         1        101 Red    01/01/17          201 Unassigned
         2        101 Yellow 01/02/17          201 Unassigned
         3        102 Yellow 05/01/17              Assigned
         4        102 Red    06/01/17          202 Assigned
         5        103 Red    07/01/17          202 Assigned
         6        103 Red    07/05/17          202 Assigned
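The idea of using an analytic function directly inside a case expression is not specific to Oracle. As a rough sketch, the same result can be reproduced with Python's built-in sqlite3 module. SQLite has no KEEP (DENSE_RANK LAST ...), so this swaps in FIRST_VALUE over a window ordered by pkhistid descending, which returns the last entry's value for every row of the id. The function name assigned_status is hypothetical, the entry_date column is left out because the logic does not use it, and a SQLite build with window-function support (3.25 or later) is assumed.

```python
import sqlite3

# The question's sample rows: (pkhistid, id, role, workid).
ROWS = [
    (1, 101, "Red", 201),
    (2, 101, "Yellow", 201),
    (3, 102, "Yellow", None),
    (4, 102, "Red", 202),
    (5, 103, "Red", 202),
    (6, 103, "Red", 202),
]

# With a descending order and the default window frame, first_value() sees the
# row with the highest pkhistid first, so it yields the last entry per id.
QUERY = """
select pkhistid, id,
       case when first_value(role)
                 over (partition by id order by pkhistid desc) = 'Red'
             and first_value(workid)
                 over (partition by id order by pkhistid desc) is not null
            then 'Assigned'
            else 'Unassigned'
       end as assigned_status
  from inputs
 order by id, pkhistid
"""


def assigned_status():
    """Return (pkhistid, id, assigned_status) for every input row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table inputs (pkhistid integer, id integer, role text, workid integer)"
    )
    conn.executemany("insert into inputs values (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in assigned_status():
        print(row)
```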

SQL Server 2014 - Return Team name based on most recent date (somewhat dynamically)

My title is misleading because I don't know how to sum it up better than that :)
I have a table that keeps a history of changes made to users and what teams they belong to. It starts with their initial team and date, then adds an entry via a trigger when we change their teams in the UserList table.
Our business, like many, loves month-to-month data. I don't want to have entries for every single month if they don't change teams. I'll get to why that's a problem.
Here is an example of the data in the TeamHistory Table
UserID|CurrentTeam|ChangeDate
User1-|Team1------|01-01-2016
User1-|Team2------|03-01-2016
When I run a view or query that rolls the data up by person and media type (I can have 4 entries for a single person in a single month - voice, fax, email and voicemail) I then need to add the team that they were working on for that month.
Using the above example, if I ran the data for all of last year, I would expect Jan-May to display Team1, then June to Dec to display Team2. The problem is that if I join the date field in my view/query with this table using an = sign, I only get data for 1-1 and 6-1, because those are the only values in the table to match against. If I tell it to use < or <=, I start encountering duplicates because the condition is not specific enough.
If we need an example query, I can try to work something up that's not one of these massive views.
So let's assume this is my data:
Userid| Month |Media|Calls
User1-|-01/01/2016|Voice|200
User1-|-01/01/2016|Email|100
User1-|-02/01/2016|Voice|250
User1-|-02/01/2016|Email|120
User1-|-03/01/2016|Voice|250
User1-|-03/01/2016|Email|120
And the TeamHistory table has 2 entries: the team they started on for 1/1/2016, and the team they switched to on 3/1/2016. How do I join the two data sets, using the date and userid as my variables, to pull in the corresponding Team? Especially when I won't have an actual entry for 2/1/2016?
I'd want my final dataset to look like this:
Userid|Team | Month |Media|Calls
User1-|Team1|-01/01/2016|Voice|200
User1-|Team1|-01/01/2016|Email|100
User1-|Team1|-02/01/2016|Voice|250
User1-|Team1|-02/01/2016|Email|120
User1-|Team2|-03/01/2016|Voice|250
User1-|Team2|-03/01/2016|Email|120
Since you're using SQL Server (2012 and newer) you can use the LEAD() function to identify an end date for a given range:
;with cte AS (SELECT 'User1' as UserID, 'Team1' AS CurrentTeam, CAST('2016-01-01' AS DATE) as ChangeDate
        UNION SELECT 'User1' as UserID, 'Team2' AS CurrentTeam, CAST('2016-06-01' AS DATE) as ChangeDate
        UNION SELECT 'User1' as UserID, 'Team1' AS CurrentTeam, CAST('2016-08-15' AS DATE) as ChangeDate
        UNION SELECT 'User2' as UserID, 'Team1' AS CurrentTeam, CAST('2016-02-01' AS DATE) as ChangeDate
        UNION SELECT 'User2' as UserID, 'Team2' AS CurrentTeam, CAST('2016-07-01' AS DATE) as ChangeDate
       )
SELECT *,COALESCE(LEAD(ChangeDate,1) OVER(PARTITION BY UserID ORDER BY ChangeDate),CAST(GETDATE() AS DATE)) as End_Dt
FROM cte
Returns:
UserID CurrentTeam ChangeDate End_Dt
User1 Team1 2016-01-01 2016-06-01
User1 Team2 2016-06-01 2016-08-15
User1 Team1 2016-08-15 2017-01-05
User2 Team1 2016-02-01 2016-07-01
User2 Team2 2016-07-01 2017-01-05
You could then join those ranges to a calendar table to get the individual months, and also to calculate which team a user spent more days in for a given month.
The LEAD() function returns the next row's value for a given field. PARTITION BY restarts the "next row" lookup for each group (here, each UserID), and ORDER BY specifies which row counts as the next one (here, the row with the next ChangeDate).
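To show how the ranges produced by LEAD() can answer the original question, here is a minimal sketch using Python's built-in sqlite3 module and the question's own sample data (team change on 2016-03-01). The table names teamhistory and monthly_calls, the column name next_change, and the function name calls_with_team are assumptions for illustration. Instead of substituting the current date for a missing end date, the sketch treats a NULL next_change as an open-ended range. A SQLite build with window-function support (3.25 or later) is assumed.

```python
import sqlite3

# Team history from the question: (userid, currentteam, changedate).
HISTORY = [
    ("User1", "Team1", "2016-01-01"),
    ("User1", "Team2", "2016-03-01"),
]

# Monthly roll-up from the question: (userid, month, media, calls).
CALLS = [
    ("User1", "2016-01-01", "Voice", 200),
    ("User1", "2016-01-01", "Email", 100),
    ("User1", "2016-02-01", "Voice", 250),
    ("User1", "2016-02-01", "Email", 120),
    ("User1", "2016-03-01", "Voice", 250),
    ("User1", "2016-03-01", "Email", 120),
]

# Each history row is valid from changedate (inclusive) up to the user's next
# changedate (exclusive), so every month matches exactly one range.
QUERY = """
with ranges as (
    select userid, currentteam, changedate,
           lead(changedate) over (partition by userid order by changedate)
               as next_change
      from teamhistory
)
select c.userid, r.currentteam, c.month, c.media, c.calls
  from monthly_calls c
  join ranges r
    on r.userid = c.userid
   and c.month >= r.changedate
   and (r.next_change is null or c.month < r.next_change)
 order by c.userid, c.month, c.media desc
"""


def calls_with_team():
    """Return the monthly rows with the team that was current in that month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table teamhistory (userid text, currentteam text, changedate text)")
    conn.execute("create table monthly_calls (userid text, month text, media text, calls integer)")
    conn.executemany("insert into teamhistory values (?, ?, ?)", HISTORY)
    conn.executemany("insert into monthly_calls values (?, ?, ?, ?)", CALLS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in calls_with_team():
        print(row)
```

The half-open range (>= start, < next start) is what removes the duplicates described in the question: a plain <= comparison matches every earlier history row, while the upper bound keeps only the one in effect.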
You might try this:
--A simple person table
DECLARE @pers TABLE(Person VARCHAR(100));
INSERT INTO @pers VALUES('Bob'),('Tim');
--A table reflecting your work data
--Note: Tim changes to team Red in July and, still in July, back to Blue
DECLARE @Team TABLE(Person VARCHAR(100),Team VARCHAR(100),ChangeDate DATE);
INSERT INTO @Team VALUES
 ('Bob','Red' ,{d'2016-04-01'})
,('Tim','Blue',{d'2016-04-13'})
,('Tim','Red' ,{d'2016-07-22'})
,('Bob','Blue',{d'2016-06-15'})
,('Tim','Blue',{d'2016-07-28'})
,('Bob','Red' ,{d'2016-10-15'})
,('Tim','Red' ,{d'2016-12-28'})
;
--A CTE to mock up a numbers/tally/date table
WITH FirstOfMonthDays(d) AS
(
    SELECT {d'2016-01-01'}
    UNION ALL SELECT {d'2016-02-01'}
    UNION ALL SELECT {d'2016-03-01'}
    UNION ALL SELECT {d'2016-04-01'}
    UNION ALL SELECT {d'2016-05-01'}
    UNION ALL SELECT {d'2016-06-01'}
    UNION ALL SELECT {d'2016-07-01'}
    UNION ALL SELECT {d'2016-08-01'}
    UNION ALL SELECT {d'2016-09-01'}
    UNION ALL SELECT {d'2016-10-01'}
    UNION ALL SELECT {d'2016-11-01'}
    UNION ALL SELECT {d'2016-12-01'}
)
--I use CONVERT(VARCHAR(6),ChangeDate,112) to get a string of YYYYMM
,Numbered AS
(
    SELECT ROW_NUMBER() OVER(PARTITION BY Person, CONVERT(VARCHAR(6),ChangeDate,112) ORDER BY ChangeDate DESC) AS Nr
          ,t.*
    FROM @Team AS t
)
--Pick out the rows with Nr=1; these are the last changes per month
,LastChangeInMonth AS
(
    SELECT *
    FROM Numbered
    WHERE Nr=1
)
--The actual query
SELECT fom.d
      ,p.Person
      ,(
        SELECT TOP 1 t.Team
        FROM LastChangeInMonth AS t
        WHERE t.Person=p.Person
          AND CONVERT(VARCHAR(6),t.ChangeDate,112)<=CONVERT(VARCHAR(6),fom.d,112)
        ORDER BY t.ChangeDate DESC
       ) AS fittingTeam
FROM FirstOfMonthDays AS fom
CROSS JOIN @pers AS p
ORDER BY p.Person,fom.d
Since you are using SQL Server 2014 (please tag your questions correctly!), this would be a bit easier with LEAD()/LAG(), but the idea would be the same...
The result
2016-01-01 Bob NULL
2016-02-01 Bob NULL
2016-03-01 Bob NULL
2016-04-01 Bob Red
2016-05-01 Bob Red
2016-06-01 Bob Blue
2016-07-01 Bob Blue
2016-08-01 Bob Blue
2016-09-01 Bob Blue
2016-10-01 Bob Red
2016-11-01 Bob Red
2016-12-01 Bob Red
2016-01-01 Tim NULL
2016-02-01 Tim NULL
2016-03-01 Tim NULL
2016-04-01 Tim Blue
2016-05-01 Tim Blue
2016-06-01 Tim Blue
2016-07-01 Tim Blue
2016-08-01 Tim Blue
2016-09-01 Tim Blue
2016-10-01 Tim Blue
2016-11-01 Tim Blue
2016-12-01 Tim Red