T-SQL to find gaps in Date field of table - sql
I'm inexperienced in asking these questions in a forum. I'm sure there's a more elegant way to display this code, etc. I hope I can explain what I'm after here. I have a simple table with a record ID and a date column in it. You can run this simple code to create the table if you'd like.
IF OBJECT_ID('tempdb..#tmptbl') IS NOT NULL
BEGIN
DROP TABLE #tmptbl
END
create table #tmptbl (recid int, docdate date)
insert into #tmptbl
values
(1, '11/16/19'),(1, '11/15/19'),(1, '11/14/19'),(1, '11/13/19'),(1, '10/29/19'),(1, '10/27/19'),(1, '10/26/19'),(2, '10/31/19'),(2, '10/30/19'),(2, '10/29/19'),(2, '10/1/19'),(3, '11/16/19'),(3, '11/15/19'),(3, '11/13/19'),(3, '8/9/19'),(3, '8/8/19'),(3, '8/7/19')
--select * from #tmptbl order by 1, 2 desc
Here is a picture of a sample in Excel. The highlighted rows are the rows I want to return in a query.
Logic for the select statement to return the rows needed:
For each recid, determine if there is a record on 11/16/19 (this can be a passed parameter but it will always be just one particular date). If the recid does not have a record with 11/16/19 on it, return no rows for that recid. If it does, I need to return the consecutive dated rows up to that date. When there is a gap in the date for the recid, I can omit the rest of the rows for that recid. I've tried to explain the logic in comments in the picture.
Can you help give me some examples of how to accomplish this using T-SQL? ...Return only the consecutive dated rows for each recid up to the provided date (i.e. 11/16/19 in my example).
Thank you.
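You can do this easily with a recursive CTE. The anchor member picks the rows on the requested date, and each recursive step adds the row dated exactly one day earlier for the same recid, so the chain stops at the first gap:
declare @docDate date = '2019-11-16';
with cte as (select recid
                  , docdate
             from #tmptbl
             where docdate = @docDate
             union all
             select t.recid
                  , t.docdate
             from #tmptbl as t
             join cte on t.recid = cte.recid
                     and t.docdate = dateadd(day, -1, cte.docdate))
select *
from cte
order by recid, docdate desc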
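To check the recursion against the sample data without a SQL Server instance, here is a minimal sketch that runs the same idea in SQLite through Python's sqlite3 module. SQLite is only a stand-in: date(x, '-1 day') takes the place of dateadd(day, -1, x), the dates are rewritten in ISO format, and the names tmptbl and consecutive_rows are made up for this illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
ROWS = [
    (1, "2019-11-16"), (1, "2019-11-15"), (1, "2019-11-14"), (1, "2019-11-13"),
    (1, "2019-10-29"), (1, "2019-10-27"), (1, "2019-10-26"),
    (2, "2019-10-31"), (2, "2019-10-30"), (2, "2019-10-29"), (2, "2019-10-01"),
    (3, "2019-11-16"), (3, "2019-11-15"), (3, "2019-11-13"),
    (3, "2019-08-09"), (3, "2019-08-08"), (3, "2019-08-07"),
]

# Anchor: rows on the requested date. Recursive step: the row one day earlier
# for the same recid. The chain ends at the first missing day.
QUERY = """
WITH RECURSIVE cte(recid, docdate) AS (
    SELECT recid, docdate FROM tmptbl WHERE docdate = :doc_date
    UNION ALL
    SELECT t.recid, t.docdate
    FROM tmptbl AS t
    JOIN cte ON t.recid = cte.recid
            AND t.docdate = date(cte.docdate, '-1 day')
)
SELECT recid, docdate FROM cte ORDER BY recid, docdate DESC
"""


def consecutive_rows(doc_date):
    """Return (recid, docdate) rows forming an unbroken run ending at doc_date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tmptbl (recid INTEGER, docdate TEXT)")
    conn.executemany("INSERT INTO tmptbl VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY, {"doc_date": doc_date}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in consecutive_rows("2019-11-16"):
        print(row)
```

With the sample data this returns four rows for recid 1, none for recid 2, and two for recid 3, which matches the logic described in the question.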
Related
Find the most recently updated rows according to a multi-column grouping
I'm using SQL Server and T-SQL.
Sample Data: I have data similar to the following readily consumable test data.
--===== Set the proper date format for the test data.
SET DATEFORMAT dmy
;
--===== Create and populate the Test Table
DROP TABLE IF EXISTS #TestTable
;
CREATE TABLE #TestTable
        (
         Item    VARCHAR(10) NOT NULL
        ,GroupA  TINYINT     NOT NULL
        ,GroupB  SMALLINT    NOT NULL
        ,Updated DATE        NOT NULL
        ,Idx     INT         NOT NULL
        )
;
INSERT INTO #TestTable WITH (TABLOCK)
        (Item,GroupA,GroupB,Updated,Idx)
 VALUES  ('ABC',7,2020,'14/11/2019',8) --Return this row
        ,('ABC',7,2020,'10/11/2019',7)
        ,('ABC',6,2019,'14/11/2019',6) --Return this row
        ,('ABC',5,2018,'13/11/2019',5) --Return this row
        ,('ABC',5,2018,'12/11/2019',4)
        ,('ABC',7,2018,'14/11/2019',3) --Return this row
        ,('ABC',7,2019,'25/11/2019',2) --Return this row
        ,('ABC',7,2019,'18/11/2019',1)
;
--===== Display the test data
SELECT * FROM #TestTable
;
Problem Description: I need help writing a query that will return the rows marked as "--Return this row". I know how to write a basic SELECT but have no idea how to pull this off. The basis of the problem is to return the most recently updated row for each "group" of rows. A "group" of rows is determined by the combination of the Item, GroupA, and GroupB columns, and I need to return the full rows found.
Use row_number():
select t.*
from (select t.*,
             row_number() over (partition by item, groupa, groupb order by updated desc) as seq
      from #TestTable t
     ) t
where seq = 1;
Alternatively, find the latest Updated value per group and join back to the table:
select t.Item, t.GroupA, t.GroupB, t.Updated, t.Idx
from (select Item, GroupA, GroupB, max(Updated) as Updated
      from #TestTable
      group by Item, GroupA, GroupB) a
inner join #TestTable t
   on a.Item = t.Item
  and a.GroupA = t.GroupA
  and a.GroupB = t.GroupB
  and a.Updated = t.Updated
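As a quick check of the row_number() approach, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later); the table name TestTable and the function name latest_rows are invented for the example, and the dates are stored as ISO strings so that text ordering matches date ordering.

```python
import sqlite3

# Test data from the question, with dates converted to ISO format.
ROWS = [
    ("ABC", 7, 2020, "2019-11-14", 8),
    ("ABC", 7, 2020, "2019-11-10", 7),
    ("ABC", 6, 2019, "2019-11-14", 6),
    ("ABC", 5, 2018, "2019-11-13", 5),
    ("ABC", 5, 2018, "2019-11-12", 4),
    ("ABC", 7, 2018, "2019-11-14", 3),
    ("ABC", 7, 2019, "2019-11-25", 2),
    ("ABC", 7, 2019, "2019-11-18", 1),
]

# Number the rows inside each (Item, GroupA, GroupB) group, newest first,
# and keep only the first row of each group.
QUERY = """
SELECT Item, GroupA, GroupB, Updated, Idx
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY Item, GroupA, GroupB
                              ORDER BY Updated DESC) AS seq
    FROM TestTable AS t
) AS ranked
WHERE seq = 1
ORDER BY Idx DESC
"""


def latest_rows():
    """Return the most recently updated row of every group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TestTable "
        "(Item TEXT, GroupA INTEGER, GroupB INTEGER, Updated TEXT, Idx INTEGER)"
    )
    conn.executemany("INSERT INTO TestTable VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_rows():
        print(row)
```

One difference between the two answers is worth keeping in mind: if two rows in a group share the same latest Updated value, the max() join returns both, while row_number() returns only one of them.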
Generating Lines based on a value from a column in another table
I have the following table: EventID=00002,DocumentID=0005,EventDesc=ItemsReceived I have the quantity in another table DocumentID=0005,Qty=20 I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of: ITEM_TAG_001, ITEM_TAG_002, ITEM_TAG_003, ITEM_TAG_004, .. ITEM_TAG_020
Here's your SQL query. The recursive member stops once ctr reaches Qty, so exactly Qty rows are produced per document:
with cte as (
    select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc
    from tableA t1
    inner join tableB t2 on t2.DocumentId = t1.DocumentId
    union all
    select ctr + 1, Qty, EventID, DocumentId, EventDesc
    from cte
    where ctr < Qty
)
select *, concat('ITEM_TAG_', right('000' + cast(ctr as varchar(3)), 3))
from cte
option (maxrecursion 0);
Best is to introduce a numbers table, which is very handy in many places. Something along these lines:
Create some test data:
DECLARE @MockNumbers TABLE(Number BIGINT);
DECLARE @YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE @YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO @MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO @YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO @YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM @YourTable1 t1
INNER JOIN @YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM @MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short: We use an INNER JOIN to get the quantity joined to the item. Then we use APPLY, which is a row-wise action, to bind as many rows to the set as we need. The first item will return 5 lines, the second 7. The trick with STR() and REPLACE() is one way to create a padded number. You might use FORMAT() (v2012+), but it works rather slowly.
The table @MockNumbers is a declared table variable containing a list of numbers from 1 to 100. This answer provides an example of how to create a physical numbers and date table. Any database should have such a table.
If you don't want to create a numbers table, you can search for a tally table or a tally on the fly. There are many answers showing approaches for creating a list of running numbers.
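The numbers-table idea can be tried out quickly with the sketch below, which builds the numbers on the fly in SQLite through Python's sqlite3 module. This is an illustration only: the table names events and quantities and the function name item_tags are hypothetical, the recursive numbers CTE stands in for a physical numbers table, and SQLite's printf() does the zero padding that STR()/REPLACE() does in T-SQL.

```python
import sqlite3

# Join each event to its quantity, then to every number from 1 to Qty.
# The numbers CTE yields 1..100, mirroring the 100-row mock table above.
QUERY = """
WITH RECURSIVE numbers(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 100
)
SELECT e.EventID, printf('ITEM_TAG_%03d', numbers.n) AS ItemTag
FROM events AS e
JOIN quantities AS q ON q.DocumentID = e.DocumentID
JOIN numbers ON numbers.n BETWEEN 1 AND q.Qty
ORDER BY numbers.n
"""


def item_tags():
    """Return one (EventID, ItemTag) row per unit of quantity."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (EventID TEXT, DocumentID TEXT, EventDesc TEXT)")
    conn.execute("CREATE TABLE quantities (DocumentID TEXT, Qty INTEGER)")
    # Values taken from the question.
    conn.execute("INSERT INTO events VALUES ('00002', '0005', 'ItemsReceived')")
    conn.execute("INSERT INTO quantities VALUES ('0005', 20)")
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for event_id, tag in item_tags():
        print(event_id, tag)
```

The join on numbers.n BETWEEN 1 AND q.Qty plays the role of the CROSS APPLY in the answer: each document picks up exactly as many number rows as its quantity.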
Copy Most Recent Date's row values where gaps in dates exist
I am creating a report in Tableau for a new product that captures metrics such as previous applications pending, new apps end of day pending, etc. In order to do this, I need a snapshot of the end-of-day status for each application each day. A decision was made above my pay grade to capture only a rolling seven-day delta of the data. As a result, an application that has not had a status change in the previous seven days stops appearing in the DB until something new happens, which leaves gaps in the dates and throws off the numbers in my report. What I need is a snapshot for each day for each application, so when there is a date gap, I want to grab the most recent previous day's record and insert it to fill in the gap between the two dates. Also, I join to a credit score table, and we sometimes pull all three bureaus, sometimes two, sometimes one, so there could be up to three rows per application per day. I have looked on this site and found some similar issues, but none is an exact match for what I am trying to accomplish, and I honestly do not know where to start. Will a correlated subquery accomplish what I need? I have provided some code below to show what the data looks like currently.
drop table if exists #date
drop table if exists #test
create table #date ( calendar_date date )
insert into #date values
('2019-08-07'), ('2019-08-08'), ('2019-08-09'), ('2019-08-10'), ('2019-08-11'), ('2019-08-12')
create table #test ( id int, period_date date, decision_status varchar(20), credit_score int, expired_flag bit )
insert into #test (id,period_date,decision_status,credit_score,expired_flag) values
(1,'2019-08-08','declined',635,null), (1,'2019-08-08','declined',642,null),
(1,'2019-08-09','declined',635,null), (1,'2019-08-09','declined',642,null),
(1,'2019-08-10','declined',635,null), (1,'2019-08-10','declined',642,null),
(1,'2019-08-11','declined',635,null), (1,'2019-08-11','declined',642,null),
(1,'2019-08-12','declined',635,null), (1,'2019-08-12','declined',642,null),
(2,'2019-08-08','review',656,null), (2,'2019-08-08','review',648,null),
(2,'2019-08-09','review',656,null), (2,'2019-08-09','review',648,null),
(2,'2019-08-12','review',656,null), (2,'2019-08-12','review',648,null),
(3,'2019-08-08','preapproved',678,null), (3,'2019-08-08','preapproved',689,null), (3,'2019-08-08','preapproved',693,null),
(3,'2019-08-09','preapproved',678,null), (3,'2019-08-09','preapproved',689,null), (3,'2019-08-09','preapproved',693,null),
(3,'2019-08-11','preapproved',678,1), (3,'2019-08-11','preapproved',689,1), (3,'2019-08-11','preapproved',693,1),
(3,'2019-08-12','preapproved',678,1), (3,'2019-08-12','preapproved',689,1), (3,'2019-08-12','preapproved',693,1),
(4,'2019-08-08','onboarded',725,null), (4,'2019-08-09','onboarded',725,null), (4,'2019-08-10','onboarded',725,null),
(5,'2019-08-08','approved',685,null), (5,'2019-08-08','approved',675,null),
(5,'2019-08-09','approved',685,null), (5,'2019-08-09','approved',675,null),
(5,'2019-08-12','approved',685,1), (5,'2019-08-12','approved',675,1)
And the query:
select id, calendar_date, period_date, decision_status, credit_score, expired_flag
from #date
join #test on calendar_date=dateadd(day,-1,period_date)
order by id, calendar_date
I just need each application to show for each day.
You may just need a left join:
select t.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score, t.expired_flag
from #date d
left join #test t on d.calendar_date = dateadd(day, -1, t.period_date)
order by id, d.calendar_date;
If by "application" you mean the id in #test, then use a cross join to generate the rows and an outer apply to fill in the values:
select i.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score, t.expired_flag
from #date d
cross join (select distinct id from #test) i
outer apply (select top (1) t.*
             from #test t
             where t.id = i.id and t.period_date <= d.calendar_date
             order by t.period_date desc
            ) t
order by i.id, d.calendar_date;
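Because there can be several rows per application per day (one per bureau), a top (1) lookup carries forward only one of them. A variation that carries forward every row from the most recent earlier day is sketched below in SQLite through Python's sqlite3 module. This is an assumption-laden illustration rather than a drop-in replacement: it uses only the id 2 rows from the question, omits the one-day shift and the expired_flag column, and the table names dates and test and the function name daily_snapshot are invented.

```python
import sqlite3

DATES = ["2019-08-07", "2019-08-08", "2019-08-09",
         "2019-08-10", "2019-08-11", "2019-08-12"]

# Application 2 from the question: two bureau rows per day, gap on 08-10 and 08-11.
TEST_ROWS = [
    (2, "2019-08-08", "review", 656), (2, "2019-08-08", "review", 648),
    (2, "2019-08-09", "review", 656), (2, "2019-08-09", "review", 648),
    (2, "2019-08-12", "review", 656), (2, "2019-08-12", "review", 648),
]

# For every (calendar date, id) pair, join all rows whose period_date is the
# latest one on or before that calendar date (a correlated subquery).
QUERY = """
SELECT i.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score
FROM dates AS d
CROSS JOIN (SELECT DISTINCT id FROM test) AS i
LEFT JOIN test AS t
       ON t.id = i.id
      AND t.period_date = (SELECT MAX(p.period_date)
                           FROM test AS p
                           WHERE p.id = i.id
                             AND p.period_date <= d.calendar_date)
ORDER BY i.id, d.calendar_date, t.credit_score
"""


def daily_snapshot():
    """Return one snapshot per id and calendar date, gaps filled forward."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dates (calendar_date TEXT)")
    conn.execute(
        "CREATE TABLE test "
        "(id INTEGER, period_date TEXT, decision_status TEXT, credit_score INTEGER)"
    )
    conn.executemany("INSERT INTO dates VALUES (?)", [(d,) for d in DATES])
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", TEST_ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in daily_snapshot():
        print(row)
```

In this sketch 2019-08-10 and 2019-08-11 both repeat the two rows recorded on 2019-08-09, and 2019-08-07 comes back with NULLs because nothing had been recorded yet.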
Update: Gordon's reply gave me some inspiration and set me in the right direction. After some additional research, I appear to have found a solution that works. I wanted to share it here in case anyone else runs across this problem. I am posting the code below:
drop table if exists #date
drop table if exists #test
drop table if exists #test1
drop table if exists #row_num
create table #date ( calendar_date date )
insert into #date values
('2019-08-07'), ('2019-08-08'), ('2019-08-09'), ('2019-08-10'), ('2019-08-11')
create table #test ( id int, period_date date, decision_status varchar(20), credit_score int, expired_flag bit )
insert into #test (id,period_date,decision_status,credit_score,expired_flag) values
(1,'2019-08-08','declined',635,null), (1,'2019-08-08','declined',642,null),
(1,'2019-08-09','declined',635,null), (1,'2019-08-09','declined',642,null),
(1,'2019-08-10','declined',635,null), (1,'2019-08-10','declined',642,null),
(1,'2019-08-11','declined',635,null), (1,'2019-08-11','declined',642,null),
(1,'2019-08-12','declined',635,null), (1,'2019-08-12','declined',642,null),
(2,'2019-08-08','review',656,null), (2,'2019-08-08','review',648,null),
(2,'2019-08-09','review',656,null), (2,'2019-08-09','review',648,null),
(2,'2019-08-12','review',656,null), (2,'2019-08-12','review',648,null),
(3,'2019-08-08','preapproved',678,null), (3,'2019-08-08','preapproved',689,null), (3,'2019-08-08','preapproved',693,null),
(3,'2019-08-09','preapproved',678,null), (3,'2019-08-09','preapproved',689,null), (3,'2019-08-09','preapproved',693,null),
(3,'2019-08-11','preapproved',678,1), (3,'2019-08-11','preapproved',689,1), (3,'2019-08-11','preapproved',693,1),
(3,'2019-08-12','preapproved',678,1), (3,'2019-08-12','preapproved',689,1), (3,'2019-08-12','preapproved',693,1),
(4,'2019-08-08','onboarded',725,null), (4,'2019-08-09','onboarded',725,null), (4,'2019-08-10','onboarded',725,null),
(5,'2019-08-08','approved',685,null), (5,'2019-08-08','approved',675,null),
(5,'2019-08-09','approved',685,null), (5,'2019-08-09','approved',675,null),
(5,'2019-08-12','approved',685,1), (5,'2019-08-12','approved',675,1)
select id,calendar_date,decision_status,credit_score,expired_flag
,ROW_NUMBER() over(partition by id,calendar_date order by calendar_date) as row_id
,cast(ROW_NUMBER() over(partition by id,calendar_date order by calendar_date) as char(1)) as row_num
into #test1
from #date
join #test on calendar_date=dateadd(day,-1,period_date)
order by id,calendar_date
create table #row_num ( row_id int, row_num char(1) )
insert into #row_num values (1,'1'), (2,'2'), (3,'3')
select i.id
,d.calendar_date
,coalesce(t.decision_status,t1.decision_status) as decision_status
,coalesce(t.credit_score,t1.credit_score) as credit_score
,coalesce(t.expired_flag,t1.expired_flag) as expired_flag
from #date d
cross join (select distinct id from #test1 ) i
cross join #row_num r
left join #test1 t on t.id=i.id and t.row_id=r.row_id and t.calendar_date=d.calendar_date
join (select id,row_id,decision_status,credit_score,expired_flag
      ,calendar_date as start_date
      ,lead(calendar_date,1,dateadd(day,1,(select max(calendar_date) from #date))) over (partition by id,row_id order by calendar_date) as end_date
      from #test1 ) t1
  on t1.id=i.id and t1.row_id=r.row_id and d.calendar_date>=t1.start_date and d.calendar_date<t1.end_date
order by i.id,d.calendar_date,r.row_id
This gives me what I am looking for: all the daily records for each application for each day.
How to select values by date field (not as simple as it sounds)
I have a table called tblMK. The table contains a datetime field. What I wish to do is create a query which will each time select the 2 latest entries (by the datetime column), get the date difference between them, and show only that. How would I go about creating this expression? This doesn't necessarily need to be a query; it could be a view, function, procedure, or whatever works. I have created a function called getdatediff which receives two dates and returns a string that says (x days y hours z minutes); that will be the calculated field. So how would I go about doing this? Edit: I need to select the rows two at a time, and so on until the oldest one. There will always be an even number of rows.
Use only SQL, like this:
create table t1(c1 integer, dt datetime);
insert into t1 values
(1, getdate()),
(2, dateadd(day,1,getdate())),
(3, dateadd(day,2,getdate()));
with temp as (select top 2 dt from t1 order by dt desc)
select datediff(day,min(dt),max(dt)) as diff_of_dates
from temp;
On MySQL, use the limit clause:
select max(a.updated_at)-min(a.updated_at)
from (
    select * from mytable order by updated_at desc limit 2
) a
Thanks guys, I found the solution. Please ignore the additional columns; they are for my db:
;with numbered as (
    select parit, taarich, hulia, mesirakabala,
           rowno = row_number() over (partition by parit order by taarich)
    from tblMK
)
select a.rowno - 1, a.parit, a.hulia, b.taarich as taarich_kabala, a.taarich,
       a.mesirakabala, getdatediff(b.taarich, a.taarich) as due
from numbered a
left join numbered b on b.parit = a.parit and b.rowno = a.rowno - 1
where b.taarich is not null
order by a.parit, a.taarich
Sorry about any mistakes I might have made, I'm on my smartphone.
SQL JOIN table with a date range
Say I have a table with C columns and N rows. I would like to produce a select statement that represents the "join" of that table with a date range comprising M days. The result set should have C+1 columns (the last one being the date) and N×M rows.
Trivial example to clarify things: given the table A below:
select * from A;
 avalue
--------
 "a"
and a date range from 10 to 12 October 2012, I want the following result set:
 avalue | date
--------+------------
 "a"    | 2012-10-10
 "a"    | 2012-10-11
 "a"    | 2012-10-12
(This is a stepping stone I need towards ultimately calculating inventory levels on any given day, given starting values and deltas.)
The Postgres way for this is simple: CROSS JOIN to the function generate_series():
SELECT t.*, g.day::date
FROM   tbl t
CROSS  JOIN generate_series(timestamp '2012-10-10'
                          , timestamp '2012-10-12'
                          , interval  '1 day') AS g(day);
Produces exactly the output requested.
generate_series() is a set-returning function (a.k.a. "table function") producing a derived table. There are a couple of overloaded variants; here's why I chose timestamp input: Generating time series between two dates in PostgreSQL
For arbitrary dates, replace generate_series() with a VALUES expression. No need to persist a table:
SELECT *
FROM   tbl t
CROSS  JOIN (
   VALUES
     (date '2012-08-13')  -- explicit type in 1st row
   , ('2012-09-05')
   , ('2012-10-10')
   ) g(day);
If the date table has more dates in it than you're interested in, then do:
select a.avalue, b.date
from a, b
where b.date between '2012-10-10' and '2012-10-12'
Otherwise, if the date table contains only the dates you are interested in, a Cartesian join would accomplish this:
select * from a, b;
declare @Date1 datetime = '20121010', @Date2 datetime = '20121012';
with Dates as (
    select @Date1 as [Date]
    union all
    select dateadd(dd, 1, D.[Date]) as [Date]
    from Dates as D
    where D.[Date] <= dateadd(dd, -1, @Date2)
)
select A.avalue, D.[Date]
from Dates as D
cross join A
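The recursive date-range approach can be exercised with the small sketch below, written against SQLite through Python's sqlite3 module as a stand-in for SQL Server. The function name rows_per_day is made up, the dates are ISO strings so that plain text comparison orders them correctly, and date(day, '+1 day') takes the place of dateadd(dd, 1, ...).

```python
import sqlite3

# Build the inclusive list of days with a recursive CTE, then cross join it
# to table a so that every row is repeated once per day.
QUERY = """
WITH RECURSIVE dates(day) AS (
    SELECT :start
    UNION ALL
    SELECT date(day, '+1 day') FROM dates WHERE day < :end
)
SELECT a.avalue, dates.day
FROM a
CROSS JOIN dates
ORDER BY a.avalue, dates.day
"""


def rows_per_day(start, end):
    """Return every row of table a paired with each day from start to end."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (avalue TEXT)")
    conn.execute("INSERT INTO a VALUES ('a')")
    rows = conn.execute(QUERY, {"start": start, "end": end}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in rows_per_day("2012-10-10", "2012-10-12"):
        print(row)
```

In SQL Server, note that a recursive CTE is limited to 100 recursion levels by default, so ranges longer than about 100 days need option (maxrecursion n) on the final select.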
For MySQL, schema/data:
CREATE TABLE someTable ( someCol varchar(8) not null );
INSERT INTO someTable VALUES ('a');
CREATE TABLE calendar ( calDate datetime not null, isBus bit );
ALTER TABLE calendar ADD CONSTRAINT PK_calendar PRIMARY KEY (calDate);
INSERT INTO calendar VALUES ('2012-10-10', 1);
INSERT INTO calendar VALUES ('2012-10-11', 1);
INSERT INTO calendar VALUES ('2012-10-12', 1);
query:
select s.someCol, c.calDate
from someTable s, calendar c;
You really have two options for what you are trying to do.
First, if your RDBMS supports it (I know SQL Server does, but I don't know about others), you can create a table-valued function which takes in a date range and returns a result set of all the discrete dates within that range. You would then do a Cartesian join between your table and the function.
Second, you can create a static table of date values and then do a Cartesian join between the two tables.
The second option will perform better, especially if you are dealing with large date ranges; however, it can only cover the dates already stored in the table. But then, you should know your minimum date, and you can always add more dates to your table as time goes on.
I am not very clear about your M table. Provided that you have such a table (M) with dates, the following cross join will produce the result:
SELECT C.*, M.date
FROM C
CROSS JOIN M