SQL Conditional update column value based on previous row value - sql
I have a table with attendance dates in SQL Workbench/J. I need to define each person's attendance periods: attendance months separated by a gap of 90 days or less are merged into a single attendance period, and any larger gap starts a new attendance period. For example, this is the table I have for a single person:
id | year | start_date | end_date | prev_att_month | diff
1 | 2012 | 2012-08-01 | 2012-08-31 | 2012-07-01 | 31
1 | 2012 | 2012-07-01 | 2012-07-31 | 2012-04-01 | 91
1 | 2012 | 2012-04-01 | 2012-04-30 | 2012-03-01 | 31
1 | 2012 | 2012-03-01 | 2012-03-31 | 2012-02-01 | 29
1 | 2012 | 2012-02-01 | 2012-02-29 | 2012-01-01 | 31
1 | 2012 | 2012-01-01 | 2012-01-31 | 2011-12-01 | 31
1 | 2011 | 2011-12-01 | 2011-12-31 | 2011-11-01 | 30
1 | 2011 | 2011-11-01 | 2011-11-30 | 2011-10-01 | 31
1 | 2011 | 2011-10-01 | 2011-10-31 | 2011-09-01 | 30
1 | 2011 | 2011-09-01 | 2011-09-30 | 2011-08-01 | 31
1 | 2011 | 2011-08-01 | 2011-08-31 | 2011-07-01 | 31
1 | 2011 | 2011-07-01 | 2011-07-31 | 2011-05-01 | 61
1 | 2011 | 2011-05-01 | 2011-05-31 | 2011-04-01 | 30
1 | 2011 | 2011-04-01 | 2011-04-30 | 2011-03-01 | 31
1 | 2011 | 2011-03-01 | 2011-03-31 | 2011-02-01 | 28
1 | 2011 | 2011-02-01 | 2011-02-28 | 2010-08-01 | 184
1 | 2010 | 2010-08-01 | 2010-08-31 | 2010-07-01 | 31
1 | 2010 | 2010-07-01 | 2010-07-31 | 2010-06-01 | 30
1 | 2010 | 2010-06-01 | 2010-06-30 | 2010-05-01 | 31
1 | 2010 | 2010-05-01 | 2010-05-31 | 2010-04-01 | 30
1 | 2010 | 2010-04-01 | 2010-04-30 | NULL | NULL
I defined the prev_att_month column with a lag function and then computed the difference between that column and start_date in the diff column. This way I can check the gaps between the attendance months.
I want this output of the attendance periods with the 90 day rule explained above:
id | start_date | end_date
1 | 1/04/2010 | 31/08/2010
1 | 1/02/2011 | 30/04/2012
1 | 1/07/2012 | 31/08/2012
Does anyone have an idea of how to do this?
So far I have only been able to compute the difference between the attendance months. Because this is a large data set, I have not found a way to define the attendance periods without a row-by-row analysis.
with [table] as (
    select id, year, start_date, end_date,
           lag(start_date) over (partition by id order by id, year, start_date) as prev_att_month,
           start_date - prev_att_month as diff
    from source
)
select *
from [table]
where id = 1
One method would be to use a windowed COUNT to count how many times a value greater than 90 has appeared in the diff column, which provides a unique group number. Then you can just group your data into those groups and get the MIN and MAX values:
WITH Grps AS(
    SELECT V.id,
           V.year,
           V.start_date,
           V.end_date,
           V.prev_att_month,
           V.diff,
           COUNT(CASE WHEN diff > 90 THEN 1 END) OVER (PARTITION BY ID ORDER BY V.start_date ASC) AS Grp
    FROM (VALUES(1,2012,CONVERT(date,'20120801'),CONVERT(date,'20120831'),CONVERT(date,'2012-07-01'),31),
                (1,2012,CONVERT(date,'20120701'),CONVERT(date,'20120731'),CONVERT(date,'2012-04-01'),91),
                (1,2012,CONVERT(date,'20120401'),CONVERT(date,'20120430'),CONVERT(date,'2012-03-01'),31),
                (1,2012,CONVERT(date,'20120301'),CONVERT(date,'20120331'),CONVERT(date,'2012-02-01'),29),
                (1,2012,CONVERT(date,'20120201'),CONVERT(date,'20120229'),CONVERT(date,'2012-01-01'),31),
                (1,2012,CONVERT(date,'20120101'),CONVERT(date,'20120131'),CONVERT(date,'2011-12-01'),31),
                (1,2011,CONVERT(date,'20111201'),CONVERT(date,'20111231'),CONVERT(date,'2011-11-01'),30),
                (1,2011,CONVERT(date,'20111101'),CONVERT(date,'20111130'),CONVERT(date,'2011-10-01'),31),
                (1,2011,CONVERT(date,'20111001'),CONVERT(date,'20111031'),CONVERT(date,'2011-09-01'),30),
                (1,2011,CONVERT(date,'20110901'),CONVERT(date,'20110930'),CONVERT(date,'2011-08-01'),31),
                (1,2011,CONVERT(date,'20110801'),CONVERT(date,'20110831'),CONVERT(date,'2011-07-01'),31),
                (1,2011,CONVERT(date,'20110701'),CONVERT(date,'20110731'),CONVERT(date,'2011-05-01'),61),
                (1,2011,CONVERT(date,'20110501'),CONVERT(date,'20110531'),CONVERT(date,'2011-04-01'),30),
                (1,2011,CONVERT(date,'20110401'),CONVERT(date,'20110430'),CONVERT(date,'2011-03-01'),31),
                (1,2011,CONVERT(date,'20110301'),CONVERT(date,'20110331'),CONVERT(date,'2011-02-01'),28),
                (1,2011,CONVERT(date,'20110201'),CONVERT(date,'20110228'),CONVERT(date,'2010-08-01'),184),
                (1,2010,CONVERT(date,'20100801'),CONVERT(date,'20100831'),CONVERT(date,'2010-07-01'),31),
                (1,2010,CONVERT(date,'20100701'),CONVERT(date,'20100731'),CONVERT(date,'2010-06-01'),30),
                (1,2010,CONVERT(date,'20100601'),CONVERT(date,'20100630'),CONVERT(date,'2010-05-01'),31),
                (1,2010,CONVERT(date,'20100501'),CONVERT(date,'20100531'),CONVERT(date,'2010-04-01'),30),
                (1,2010,CONVERT(date,'20100401'),CONVERT(date,'20100430'),NULL,NULL))V(id,year,start_date,end_date,prev_att_month,diff))
SELECT id,
       MIN(Start_date) AS Start_date,
       MAX(End_Date) AS End_Date
FROM Grps
GROUP BY Id,
         Grp
ORDER BY id,
         Start_date;
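The windowed COUNT idea can also be tried end to end, starting from the raw attendance rows instead of a hand-typed VALUES list. The following is only a sketch under stated assumptions, not the answerer's T-SQL: it uses Python's built-in sqlite3 module as a stand-in database (SQLite 3.25 or later is needed for window functions), a hypothetical table named attendance, and julianday() to get the day difference between consecutive start dates, as the question defines diff.

```python
import sqlite3

# Month-start dates for person 1, copied from the question's sample table.
MONTHS = [
    "2010-04-01", "2010-05-01", "2010-06-01", "2010-07-01", "2010-08-01",
    "2011-02-01", "2011-03-01", "2011-04-01", "2011-05-01", "2011-07-01",
    "2011-08-01", "2011-09-01", "2011-10-01", "2011-11-01", "2011-12-01",
    "2012-01-01", "2012-02-01", "2012-03-01", "2012-04-01",
    "2012-07-01", "2012-08-01",
]

conn = sqlite3.connect(":memory:")
# "attendance" is a hypothetical table name used only for this sketch.
conn.execute("CREATE TABLE attendance (id INTEGER, start_date TEXT, end_date TEXT)")
# end_date is the last day of the same month, derived with SQLite date modifiers.
conn.executemany(
    "INSERT INTO attendance VALUES (1, ?, date(?, '+1 month', '-1 day'))",
    [(d, d) for d in MONTHS],
)

QUERY = """
WITH diffs AS (
    SELECT id, start_date, end_date,
           -- days between this start_date and the previous one (NULL for the first row)
           julianday(start_date)
             - julianday(LAG(start_date) OVER (PARTITION BY id ORDER BY start_date)) AS diff
    FROM attendance
),
grps AS (
    SELECT id, start_date, end_date,
           -- running count of gaps over 90 days, which acts as a group number
           COUNT(CASE WHEN diff > 90 THEN 1 END)
               OVER (PARTITION BY id ORDER BY start_date) AS grp
    FROM diffs
)
SELECT id, MIN(start_date), MAX(end_date)
FROM grps
GROUP BY id, grp
ORDER BY id, MIN(start_date)
"""

periods = conn.execute(QUERY).fetchall()
for row in periods:
    print(row)
```

For the sample data this yields the three periods the question asks for. Because the whole computation is a pair of window functions followed by a GROUP BY, no row-by-row loop is needed, which is the point for a large data set.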
Related
Compare values for consecutive dates of same month
I have a table:
ID | Value | Date
1 | 10 | 2017-10-02 02:50:04.480
2 | 20 | 2017-10-01 07:28:53.593
3 | 30 | 2017-09-30 23:59:59.000
4 | 40 | 2017-09-30 23:59:59.000
5 | 50 | 2017-09-30 02:36:07.520
I compare each Value with the value for the previous date, but I don't need the comparison between the first day of the current month and the last day of the previous month. For this table, I don't need to compare 2017-10-01 07:28:53.593 with 2017-09-30 23:59:59.000. How can this be done?
Result table for this example:
ID | Value | Date | Diff
1 | 10 | 2017-10-02 02:50:04.480 | 10
2 | 20 | 2017-10-01 07:28:53.593 | NULL
3 | 30 | 2017-09-30 23:59:59.000 | 10
4 | 40 | 2017-09-29 23:59:59.000 | 10
5 | 50 | 2017-09-28 02:36:07.520 | NULL
You can use this:
SELECT *,
       LEAD(Value) OVER (PARTITION BY DATEPART(YEAR, [Date]), DATEPART(MONTH, [Date]) ORDER BY ID) - Value AS Diff
FROM MyTable
ORDER BY ID
You can use a query like the one below:
select *,
       diff = LEAD(Value) OVER (PARTITION BY Month(Date), Year(Date) ORDER BY Date desc) - Value
from t
order by id asc
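To see why partitioning by year and month suppresses the comparison across the month boundary, here is a small sketch that runs the first answer's query shape on the question's input rows. It is an illustration under assumptions: SQLite (via Python's sqlite3 module) stands in for SQL Server, and strftime('%Y-%m', ...) replaces the DATEPART calls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Value INTEGER, [Date] TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 10, "2017-10-02 02:50:04.480"),
        (2, 20, "2017-10-01 07:28:53.593"),
        (3, 30, "2017-09-30 23:59:59.000"),
        (4, 40, "2017-09-30 23:59:59.000"),
        (5, 50, "2017-09-30 02:36:07.520"),
    ],
)

QUERY = """
SELECT ID, Value, [Date],
       -- LEAD only looks at rows in the same year-month partition,
       -- so the last row of each month gets NULL instead of a cross-month diff
       LEAD(Value) OVER (PARTITION BY strftime('%Y-%m', [Date]) ORDER BY ID) - Value AS Diff
FROM MyTable
ORDER BY ID
"""

rows = conn.execute(QUERY).fetchall()
diffs = [r[3] for r in rows]
print(diffs)
```

Row 2 is the last October row and row 5 the last September row in ID order, so both get NULL (None in Python), matching the Diff column of the expected result.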
SQL Date Range Query - Table Comparison
I have two SQL Server tables containing the following information:
Table t_venues (venue_id is unique):
venue_id | start_date | end_date
1 | 01/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 15/01/2014
4 | 20/01/2014 | 30/01/2014
Table t_venueuser (venue_id is not unique):
venue_id | start_date | end_date
1 | 02/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 10/01/2014
4 | 23/01/2014 | 25/01/2014
From these two tables I need to find the dates that haven't been selected for each range, so the output would look like this:
venue_id | start_date | end_date
1 | 01/01/2014 | 01/01/2014
3 | 11/01/2014 | 15/01/2014
4 | 20/01/2014 | 22/01/2014
4 | 26/01/2014 | 30/01/2014
I can compare the two tables and get the date ranges from t_venues to appear in my query using 'except', but I can't get the query to produce the non-selected dates. Any help would be appreciated.
Calendar Table! Another perfect candidate for a calendar table. If you can't be bothered to search for one, here's one I made earlier.
Setup Data
DECLARE @t_venues table (
    venue_id   int
  , start_date date
  , end_date   date
);
INSERT INTO @t_venues (venue_id, start_date, end_date)
VALUES (1, '2014-01-01', '2014-01-02')
     , (2, '2014-01-05', '2014-01-05')
     , (3, '2014-01-09', '2014-01-15')
     , (4, '2014-01-20', '2014-01-30')
;
DECLARE @t_venueuser table (
    venue_id   int
  , start_date date
  , end_date   date
);
INSERT INTO @t_venueuser (venue_id, start_date, end_date)
VALUES (1, '2014-01-02', '2014-01-02')
     , (2, '2014-01-05', '2014-01-05')
     , (3, '2014-01-09', '2014-01-10')
     , (4, '2014-01-23', '2014-01-25')
;
The Query
SELECT t_venues.venue_id
     , calendar.the_date
     , CASE WHEN t_venueuser.venue_id IS NULL THEN 1 ELSE 0 END As is_available
FROM   dbo.calendar /* see: http://gvee.co.uk/files/sql/dbo.numbers%20&%20dbo.calendar.sql for an example */
INNER JOIN @t_venues As t_venues
    ON t_venues.start_date <= calendar.the_date
   AND t_venues.end_date   >= calendar.the_date
LEFT JOIN @t_venueuser As t_venueuser
    ON t_venueuser.venue_id    = t_venues.venue_id
   AND t_venueuser.start_date <= calendar.the_date
   AND t_venueuser.end_date   >= calendar.the_date
ORDER BY t_venues.venue_id
       , calendar.the_date
;
The Result
venue_id    the_date                is_available
----------- ----------------------- ------------
1           2014-01-01 00:00:00.000 1
1           2014-01-02 00:00:00.000 0
2           2014-01-05 00:00:00.000 0
3           2014-01-09 00:00:00.000 0
3           2014-01-10 00:00:00.000 0
3           2014-01-11 00:00:00.000 1
3           2014-01-12 00:00:00.000 1
3           2014-01-13 00:00:00.000 1
3           2014-01-14 00:00:00.000 1
3           2014-01-15 00:00:00.000 1
4           2014-01-20 00:00:00.000 1
4           2014-01-21 00:00:00.000 1
4           2014-01-22 00:00:00.000 1
4           2014-01-23 00:00:00.000 0
4           2014-01-24 00:00:00.000 0
4           2014-01-25 00:00:00.000 0
4           2014-01-26 00:00:00.000 1
4           2014-01-27 00:00:00.000 1
4           2014-01-28 00:00:00.000 1
4           2014-01-29 00:00:00.000 1
4           2014-01-30 00:00:00.000 1
(21 row(s) affected)
The Explanation
Our calendar table contains an entry for every date. We join our t_venues (as an aside, if you have the choice, lose the t_ prefix!) to return every day between our start_date and end_date. Example output for venue_id=4 for just this join:
venue_id    the_date
----------- -----------------------
4           2014-01-20 00:00:00.000
4           2014-01-21 00:00:00.000
4           2014-01-22 00:00:00.000
4           2014-01-23 00:00:00.000
4           2014-01-24 00:00:00.000
4           2014-01-25 00:00:00.000
4           2014-01-26 00:00:00.000
4           2014-01-27 00:00:00.000
4           2014-01-28 00:00:00.000
4           2014-01-29 00:00:00.000
4           2014-01-30 00:00:00.000
(11 row(s) affected)
Now that we have one row per day, we [outer] join our t_venueuser table. We join this in much the same manner as before, but with one added twist: we need to join based on the venue_id too! Running this for venue_id=4 gives this result:
venue_id    the_date                t_venueuser_venue_id
----------- ----------------------- --------------------
4           2014-01-20 00:00:00.000 NULL
4           2014-01-21 00:00:00.000 NULL
4           2014-01-22 00:00:00.000 NULL
4           2014-01-23 00:00:00.000 4
4           2014-01-24 00:00:00.000 4
4           2014-01-25 00:00:00.000 4
4           2014-01-26 00:00:00.000 NULL
4           2014-01-27 00:00:00.000 NULL
4           2014-01-28 00:00:00.000 NULL
4           2014-01-29 00:00:00.000 NULL
4           2014-01-30 00:00:00.000 NULL
(11 row(s) affected)
See how we have a NULL value for rows where there is no t_venueuser record. Genius, no? ;-)
So in my first query I gave you a quick CASE statement that shows availability (1=available, 0=not available). This is for illustration only, but could be useful to you. You can then either wrap the query up and apply an extra filter on this calculated column, or simply add a WHERE clause:
WHERE t_venueuser.venue_id IS NULL
and that will do the same trick.
This is a complete hack, but it gives the results you require. I've only tested it on the data you provided, so there may well be gotchas with larger sets.
In general, what you are solving here is a variation of the gaps and islands problem: (briefly) a sequence where some items are missing. The missing items are referred to as gaps and the existing items as islands. If you would like to understand this issue in general, check a few of these articles: the Simple Talk article, the blogs.MSDN article, and SO answers tagged gaps-and-islands.
Code:
;with dates as
(
    SELECT vdates.venue_id, vdates.vdate
    FROM ( SELECT DATEADD(d,sv.number,v.start_date) vdate, v.venue_id
           FROM t_venues v
           INNER JOIN master..spt_values sv
               ON sv.type='P'
              AND sv.number BETWEEN 0 AND datediff(d, v.start_date, v.end_date)) vdates
    LEFT JOIN t_venueuser vu
        ON vdates.vdate >= vu.start_date
       AND vdates.vdate <= vu.end_date
       AND vdates.venue_id = vu.venue_id
    WHERE ISNULL(vu.venue_id,-1) = -1
)
SELECT venue_id, ISNULL([1],[2]) StartDate, [2] EndDate
FROM (SELECT venue_id, rDate,
             ROW_NUMBER() OVER (PARTITION BY venue_id, DateType ORDER BY rDate) AS rType,
             DateType as dType
      FROM( SELECT d1.venue_id
                  ,d1.vdate AS rDate
                  ,'1' AS DateType
            FROM dates AS d1
            LEFT JOIN dates AS d0 ON DATEADD(d,-1,d1.vdate) = d0.vdate
            LEFT JOIN dates AS d2 ON DATEADD(d,1,d1.vdate) = d2.vdate
            WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 1
              AND ISNULL(d0.vdate, '01 Jan 1753') = '01 Jan 1753'
            UNION
            SELECT d1.venue_id
                  ,ISNULL(d2.vdate,d1.vdate)
                  ,'2'
            FROM dates AS d1
            LEFT JOIN dates AS d2 ON DATEADD(d,1,d1.vdate) = d2.vdate
            WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 2
          ) res
     ) src
PIVOT (MIN (rDate) FOR dType IN ( [1], [2] ) ) AS pvt
Results:
venue_id | StartDate | EndDate
1 | 2014-01-01 | 2014-01-01
3 | 2014-01-11 | 2014-01-15
4 | 2014-01-20 | 2014-01-22
4 | 2014-01-26 | 2014-01-30
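For comparison, the per-day availability can also be collapsed into ranges with the common "date minus ROW_NUMBER" form of gaps and islands. This is a different technique from the PIVOT query above and only a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for SQL Server, a recursive CTE named days replaces the calendar or spt_values table, and the CTE names free and island are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_venues    (venue_id INTEGER, start_date TEXT, end_date TEXT);
CREATE TABLE t_venueuser (venue_id INTEGER, start_date TEXT, end_date TEXT);
INSERT INTO t_venues VALUES
    (1, '2014-01-01', '2014-01-02'), (2, '2014-01-05', '2014-01-05'),
    (3, '2014-01-09', '2014-01-15'), (4, '2014-01-20', '2014-01-30');
INSERT INTO t_venueuser VALUES
    (1, '2014-01-02', '2014-01-02'), (2, '2014-01-05', '2014-01-05'),
    (3, '2014-01-09', '2014-01-10'), (4, '2014-01-23', '2014-01-25');
""")

QUERY = """
WITH RECURSIVE days(venue_id, the_date, end_date) AS (
    -- one row per day of every venue range (stands in for a calendar table)
    SELECT venue_id, start_date, end_date FROM t_venues
    UNION ALL
    SELECT venue_id, date(the_date, '+1 day'), end_date
    FROM days
    WHERE the_date < end_date
),
free AS (
    -- keep only days with no booking for the same venue; consecutive free
    -- days share the same (day number - row number) value, the island key
    SELECT d.venue_id, d.the_date,
           julianday(d.the_date)
             - ROW_NUMBER() OVER (PARTITION BY d.venue_id ORDER BY d.the_date) AS island
    FROM days AS d
    WHERE NOT EXISTS (
        SELECT 1 FROM t_venueuser AS u
        WHERE u.venue_id = d.venue_id
          AND d.the_date BETWEEN u.start_date AND u.end_date
    )
)
SELECT venue_id, MIN(the_date), MAX(the_date)
FROM free
GROUP BY venue_id, island
ORDER BY venue_id, MIN(the_date)
"""

ranges = conn.execute(QUERY).fetchall()
for row in ranges:
    print(row)
```

On the question's data this returns the same four ranges as the Results table above. Venue 2 disappears because its only day is booked.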
Computation of period Start date
I have a table that holds the start date and the end date of each financial period.
CHARGE_PERIOD_ID | START_DATE | END_DATE
13 | 2013-03-31 00:00:00.000 | 2013-04-27 00:00:00.000
14 | 2013-04-28 00:00:00.000 | 2013-05-25 00:00:00.000
15 | 2013-05-26 00:00:00.000 | 2013-06-29 00:00:00.000
16 | 2013-06-30 00:00:00.000 | 2013-07-27 00:00:00.000
17 | 2013-07-28 00:00:00.000 | 2013-08-24 00:00:00.000
18 | 2013-08-25 00:00:00.000 | 2013-09-28 00:00:00.000
19 | 2013-09-29 00:00:00.000 | 2013-10-26 00:00:00.000
20 | 2013-10-27 00:00:00.000 | 2013-11-23 00:00:00.000
21 | 2013-11-24 00:00:00.000 | 2013-12-28 00:00:00.000
22 | 2013-12-29 00:00:00.000 | 2014-01-25 00:00:00.000
23 | 2014-01-26 00:00:00.000 | 2014-02-22 00:00:00.000
24 | 2014-02-23 00:00:00.000 | 2014-03-29 00:00:00.000
The user of a report wants the current financial year split into 12 periods and wants to feed two parameters into the report, a year and a period number, which will go into my SQL. So something like @year=2014 @period=1 will be received. I have to write some SQL that goes to this table and sets a period start date of 31/03/2014 and a period end date of 27/04/2014.
So in pseudo code: look up period 1 for 2014 and return a period start date of 31/03/2014 and a period end date of 27/04/2014.
@PERIOD_START_DATE = select the first period that starts in March for the given year. Every financial year starts in March.
@PERIOD_END_DATE = select the corresponding END_DATE from the table.
The question is how to begin coding this, or what my design approach should be. Should I create a function that calculates this, or should I use a CTE and add a column that holds the period number in the way they want? Thinking about it more, I think I need a mapping table. So the real question is: can I do this without a mapping table?
DECLARE @Year INT
DECLARE @Period INT
SET @Year = 2013
SET @Period = 1
;WITH CTE AS
(
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY CASE WHEN MONTH([START_DATE])<3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END
               ORDER BY CASE WHEN MONTH([START_DATE])<3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END
                       ,CASE WHEN MONTH([START_DATE])<3 THEN MONTH([START_DATE]) + 12 ELSE MONTH([START_DATE]) END
           ) AS RN
    FROM Periods
)
SELECT *
FROM CTE
WHERE RN = @Period
  AND CASE WHEN MONTH([START_DATE])<3 THEN YEAR([START_DATE]) - 1 ELSE YEAR([START_DATE]) END = @Year
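The approach above numbers the periods inside each financial year with ROW_NUMBER, so no mapping table is required. Below is a runnable sketch of that idea under assumptions: SQLite (via Python's sqlite3 module) stands in for SQL Server, the helper period_dates and the column names fin_year and period_no are hypothetical, and the ordering is simplified to START_DATE, which gives the same numbering when periods do not overlap.

```python
import sqlite3

# Period rows copied from the question (time parts dropped).
PERIODS = [
    (13, "2013-03-31", "2013-04-27"), (14, "2013-04-28", "2013-05-25"),
    (15, "2013-05-26", "2013-06-29"), (16, "2013-06-30", "2013-07-27"),
    (17, "2013-07-28", "2013-08-24"), (18, "2013-08-25", "2013-09-28"),
    (19, "2013-09-29", "2013-10-26"), (20, "2013-10-27", "2013-11-23"),
    (21, "2013-11-24", "2013-12-28"), (22, "2013-12-29", "2014-01-25"),
    (23, "2014-01-26", "2014-02-22"), (24, "2014-02-23", "2014-03-29"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Periods (CHARGE_PERIOD_ID INTEGER, START_DATE TEXT, END_DATE TEXT)")
conn.executemany("INSERT INTO Periods VALUES (?, ?, ?)", PERIODS)

QUERY = """
WITH fy AS (
    SELECT *,
           -- periods starting in January or February belong to the previous financial year
           CAST(strftime('%Y', START_DATE) AS INTEGER)
             - (CAST(strftime('%m', START_DATE) AS INTEGER) < 3) AS fin_year
    FROM Periods
),
numbered AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY fin_year ORDER BY START_DATE) AS period_no
    FROM fy
)
SELECT START_DATE, END_DATE
FROM numbered
WHERE fin_year = ? AND period_no = ?
"""

def period_dates(year, period):
    """Return (start, end) for the given financial year and period, or None."""
    return conn.execute(QUERY, (year, period)).fetchone()

print(period_dates(2013, 1))
print(period_dates(2013, 12))
```

With the sample rows, which only cover the financial year starting in March 2013, period 1 is the row starting 2013-03-31 and period 12 is the row starting 2014-02-23. A lookup for a year with no rows returns None.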
Pulling Quarters from date range
How can I break a date range into quarters of a year? For example, the date range 1st Jan 2012 to 31st October 2013 should give me a result set of all 8 quarters. I am using SQL Server 2008. The results should be in the following format:
Quarter | Month start | Month end
1 | Jan-12 | Mar-12
2 | Apr-12 | Jun-12
3 | Jul-12 | Sep-12
4 | Oct-12 | Dec-12
1 | Jan-13 | Mar-13
2 | Apr-13 | Jun-13
3 | Jul-13 | Sep-13
4 | Oct-13 | Oct-13
You'd need to look at DATEPART(QUARTER, date) and break them up that way. Something akin to this:
select datepart(year, dateTarget) as theYear,
       num as theQuarter,
       min(dateTarget) as startDate,
       max(dateTarget) as endDate
from numbers
join dates on datepart(quarter, dateTarget) = num
where num between 1 and 4
group by datepart(year, dateTarget), num
Here the dates table is the table you're looking at, and numbers is, well, a numbers table (something I find pretty useful to just have around).
This gives you quarter start dates for 12 quarters:
with calendar as (
    select --DATEFROMPARTS(year(getdate()),1,1) as [start],
           convert(datetime, convert(char(4), year(getdate()))+'0101') as [start],
           qtrsBack = 1
    union all
    select dateadd(mm,-3,[start]),
           qtrsBack+1
    from calendar
    where qtrsback < 12
)
select * from calendar
producing:
start      qtrsBack
---------- -----------
2013-01-01 1
2012-10-01 2
2012-07-01 3
2012-04-01 4
2012-01-01 5
2011-10-01 6
2011-07-01 7
2011-04-01 8
2011-01-01 9
2010-10-01 10
2010-07-01 11
2010-04-01 12
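Neither answer clips the quarters to the requested range, so here is a sketch of the "group every date by year and quarter" idea applied to the question's range. It is an assumption-laden illustration: SQLite (via Python's sqlite3 module) stands in for SQL Server 2008, a recursive CTE named days plays the role of the dates table, the parameter names first_day and last_day are made up, and the quarter number is computed as (month + 2) / 3 with integer division in place of DATEPART(QUARTER, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

QUERY = """
WITH RECURSIVE days(d) AS (
    -- every date in the requested range, one row per day
    SELECT :first_day
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < :last_day
)
SELECT (CAST(strftime('%m', d) AS INTEGER) + 2) / 3 AS quarter,
       MIN(d) AS range_start,
       MAX(d) AS range_end
FROM days
GROUP BY strftime('%Y', d),
         (CAST(strftime('%m', d) AS INTEGER) + 2) / 3
ORDER BY MIN(d)
"""

quarters = conn.execute(
    QUERY, {"first_day": "2012-01-01", "last_day": "2013-10-31"}
).fetchall()
for row in quarters:
    print(row)
```

Because MIN and MAX are taken over dates that actually fall inside the range, the last quarter ends on 2013-10-31 rather than at the end of December, which matches the "Oct-13 to Oct-13" row the question expects. Formatting the dates as Jan-12 style labels is left out.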
Update a Field/Column based on Current and Previous Record Value
I need assistance with updating a field/column "IsLatest" based on the comparison between the current and previous record. I'm using CTE syntax and I'm able to get the current and previous record, but I'm unable to update the field/column "IsLatest", which I need based on the field/column "Value" of the current and previous record.
Example Current Output
Dates | Customer | Value | IsLatest
2010-01-01 00:00:00.000 | 1 | 12 | 1

Dates | Customer | Value | IsLatest
2010-01-01 00:00:00.000 | 1 | 12 | 0
2010-01-02 00:00:00.000 | 1 | 30 | 1

Dates | Customer | Value | IsLatest
2010-01-01 00:00:00.000 | 1 | 12 | 0
2010-01-02 00:00:00.000 | 1 | 30 | 0
2010-01-03 00:00:00.000 | 1 | 13 | 1
Expected Final Output
Dates | Customer | Value | ValueSetId | IsLatest
2010-01-01 00:00:00.000 | 1 | 12 | 12 | 0
2010-01-01 00:00:00.000 | 1 | 12 | 13 | 0
2010-01-01 00:00:00.000 | 1 | 12 | 14 | 0
2010-01-02 00:00:00.000 | 1 | 30 | 12 | 0
2010-01-02 00:00:00.000 | 1 | 30 | 13 | 0
2010-01-02 00:00:00.000 | 1 | 30 | 14 | 0
2010-01-03 00:00:00.000 | 1 | 13 | 12 | 0
2010-01-03 00:00:00.000 | 1 | 13 | 13 | 0
2010-01-03 00:00:00.000 | 1 | 13 | 14 | 0
2010-01-04 00:00:00.000 | 1 | 14 | 12 | 0
2010-01-04 00:00:00.000 | 1 | 14 | 13 | 0
2010-01-04 00:00:00.000 | 1 | 14 | 14 | 1
;WITH a AS
(
    SELECT Dates, Customer, Value,
           row_number() over (partition by customer order by Dates desc, ValueSetId desc) rn
    FROM #Customers
)
SELECT Dates, Customer, Value,
       case when RN = 1 then 1 else 0 end IsLatest
FROM a