T-SQL to find gaps in Date field of table - sql

I'm inexperienced in asking these questions in a forum. I'm sure there's a more elegant way to display this code, etc. I hope I can explain what I'm after here. I have a simple table with a record ID and a date column in it. You can run this simple code to create the table if you'd like.
IF OBJECT_ID('tempdb..#tmptbl') IS NOT NULL
BEGIN
DROP TABLE #tmptbl
END
create table #tmptbl (recid int, docdate date)
insert into #tmptbl
values
(1, '11/16/19'),(1, '11/15/19'),(1, '11/14/19'),(1, '11/13/19'),(1, '10/29/19'),(1, '10/27/19'),(1, '10/26/19'),(2, '10/31/19'),(2, '10/30/19'),(2, '10/29/19'),(2, '10/1/19'),(3, '11/16/19'),(3, '11/15/19'),(3, '11/13/19'),(3, '8/9/19'),(3, '8/8/19'),(3, '8/7/19')
--select * from #tmptbl order by 1, 2 desc
Here is a picture of a sample in Excel. The highlighted rows are the rows I want to return in a query.
Logic for the select statement to return the rows needed:
For each recid, determine if there is a record on 11/16/19 (this can be a passed parameter but it will always be just one particular date). If the recid does not have a record with 11/16/19 on it, return no rows for that recid. If it does, I need to return the consecutive dated rows up to that date. When there is a gap in the date for the recid, I can omit the rest of the rows for that recid. I've tried to explain the logic in comments in the picture.
Can you help give me some examples of how to accomplish this using T-SQL? ...Return only the consecutive dated rows for each recid up to the provided date (i.e. 11/16/19 in my example).
Thank you.

You can do this easily with a recursive CTE:
declare @docDate date = '2019-11-16';
with cte as (select recid
, docdate
from #tmptbl
where docdate = @docDate
union all
select t.recid
, t.docdate
from #tmptbl as t
join cte on t.recid = cte.recid
and t.docdate = dateadd(day, -1, cte.docdate))
select *
from cte
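If recursion is a concern (a run of more than about 100 consecutive dates would hit SQL Server's default recursion limit unless you add option (maxrecursion ...)), the same rows can probably be found without it. The sketch below is an alternative to the recursive CTE, not a restatement of it: it numbers each recid's rows backwards from the target date and keeps a row only while its position equals the number of days it lies before that date. It assumes a recid never has two rows with the same docdate; @sample is a hypothetical stand-in for #tmptbl holding part of the question's data.

```sql
-- Hypothetical stand-in for #tmptbl, using part of the question's data.
DECLARE @sample TABLE (recid int, docdate date);
INSERT INTO @sample (recid, docdate) VALUES
    (1, '20191116'), (1, '20191115'), (1, '20191114'), (1, '20191113'), (1, '20191029'),
    (2, '20191031'), (2, '20191030'),
    (3, '20191116'), (3, '20191115'), (3, '20191113');

DECLARE @docDate date = '20191116';

WITH numbered AS (
    SELECT recid,
           docdate,
           -- 0 for the newest row on or before @docDate, 1 for the next one, ...
           ROW_NUMBER() OVER (PARTITION BY recid ORDER BY docdate DESC) - 1 AS pos,
           -- how many days the row lies before the target date
           DATEDIFF(day, docdate, @docDate) AS days_back
    FROM @sample
    WHERE docdate <= @docDate
)
SELECT recid, docdate
FROM numbered
WHERE pos = days_back   -- holds only while the run back from @docDate is unbroken
ORDER BY recid, docdate DESC;
```

With this sample, recid 1 should return 11/16 back through 11/13, recid 3 should return 11/16 and 11/15, and recid 2 should return nothing because it has no row on the target date.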

Related

Find the most recently updated rows according to a multi-column grouping

I'm using SQL Server and T-SQL.
Sample Data:
I have data similar to the following readily consumable test data.
--===== Set the proper date format for the test data.
SET DATEFORMAT dmy
;
--===== Create and populate the Test Table
DROP TABLE IF EXISTS #TestTable
;
CREATE TABLE #TestTable
(
Item VARCHAR(10) NOT NULL
,GroupA TINYINT NOT NULL
,GroupB SMALLINT NOT NULL
,Updated DATE NOT NULL
,Idx INT NOT NULL
)
;
INSERT INTO #TestTable WITH (TABLOCK)
(Item,GroupA,GroupB,Updated,Idx)
VALUES ('ABC',7,2020,'14/11/2019',8) --Return this row
,('ABC',7,2020,'10/11/2019',7)
,('ABC',6,2019,'14/11/2019',6) --Return this row
,('ABC',5,2018,'13/11/2019',5) --Return this row
,('ABC',5,2018,'12/11/2019',4)
,('ABC',7,2018,'14/11/2019',3) --Return this row
,('ABC',7,2019,'25/11/2019',2) --Return this row
,('ABC',7,2019,'18/11/2019',1)
;
--===== Display the test data
SELECT * FROM #TestTable
;
Problem Description:
I need help in writing a query that will return the rows marked as "--Return this row". I know how to write a basic SELECT but have no idea how to pull this off.
The basis of the problem is to return the latest updated row for each "group" of rows. A "group" of rows is determined by the combination of the Item, GroupA, and GroupB columns and I need to return the full rows found.
Use row_number():
select t.*
from (select t.*, row_number() over (partition by Item, GroupA, GroupB order by Updated desc) as seq
from #TestTable t
) t
where seq = 1;
Alternatively, join each row back to the latest Updated value of its group:
select t.Item, t.GroupA, t.GroupB, t.Updated, t.Idx
from (select Item, GroupA, GroupB, max(Updated) as Updated
from #TestTable
group by Item, GroupA, GroupB) a
inner join #TestTable t
on (a.Item = t.Item and a.GroupA = t.GroupA and a.GroupB = t.GroupB and
a.Updated = t.Updated);
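The two queries above can differ when a group has two rows sharing the latest Updated date: row_number() keeps one of them arbitrarily, while the join to max(Updated) returns both. Here is a minimal sketch of making that choice explicit. It assumes that a higher Idx should win a tie, which the question does not state, and it uses a hypothetical table variable @Sample in place of #TestTable.

```sql
-- Hypothetical stand-in for #TestTable; the first two rows tie on Updated.
DECLARE @Sample TABLE (Item varchar(10), GroupA tinyint, GroupB smallint, Updated date, Idx int);
INSERT INTO @Sample (Item, GroupA, GroupB, Updated, Idx) VALUES
    ('ABC', 7, 2020, '20191114', 8),
    ('ABC', 7, 2020, '20191114', 7),
    ('ABC', 5, 2018, '20191113', 5),
    ('ABC', 5, 2018, '20191112', 4);

-- Exactly one row per group; Idx breaks ties on Updated (assumed rule).
SELECT s.Item, s.GroupA, s.GroupB, s.Updated, s.Idx
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY Item, GroupA, GroupB
                                ORDER BY Updated DESC, Idx DESC) AS seq
      FROM @Sample) AS s
WHERE s.seq = 1;

-- Every tied row per group: RANK() gives tied rows the same rank.
SELECT s.Item, s.GroupA, s.GroupB, s.Updated, s.Idx
FROM (SELECT *,
             RANK() OVER (PARTITION BY Item, GroupA, GroupB
                          ORDER BY Updated DESC) AS rnk
      FROM @Sample) AS s
WHERE s.rnk = 1;
```

The first query should return Idx 8 and Idx 5; the second should return Idx 8, Idx 7 and Idx 5.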

Generating Lines based on a value from a column in another table

I have the following table:
EventID=00002,DocumentID=0005,EventDesc=ItemsReceived
I have the quantity in another table
DocumentID=0005,Qty=20
I want to generate a result of 20 lines (depending on the quantity) with an auto generated column which will have a sequence of:
ITEM_TAG_001,
ITEM_TAG_002,
ITEM_TAG_003,
ITEM_TAG_004,
..
ITEM_TAG_020
Here's your sql query.
with cte as (
select 1 as ctr, t2.Qty, t1.EventID, t1.DocumentId, t1.EventDesc from tableA t1
inner join tableB t2 on t2.DocumentId = t1.DocumentId
union all
select ctr + 1, Qty, EventID, DocumentId, EventDesc from cte
where ctr < Qty -- stop at Qty; "<=" would produce one row too many
)select *, concat('ITEM_TAG_', right('000'+ cast(ctr AS varchar(3)),3)) as ItemTag from cte
option (maxrecursion 0);
Output:
Best is to introduce a numbers table, which comes in very handy in many places...
Something along these lines:
Create some test data:
DECLARE @MockNumbers TABLE(Number BIGINT);
DECLARE @YourTable1 TABLE(DocumentID INT,ItemTag VARCHAR(100),SomeText VARCHAR(100));
DECLARE @YourTable2 TABLE(DocumentID INT, Qty INT);
INSERT INTO @MockNumbers SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values;
INSERT INTO @YourTable1 VALUES(1,'FirstItem','qty 5'),(2,'SecondItem','qty 7');
INSERT INTO @YourTable2 VALUES(1,5), (2,7);
--The query
SELECT CONCAT(t1.ItemTag,'_',REPLACE(STR(A.Number,3),' ','0'))
FROM @YourTable1 t1
INNER JOIN @YourTable2 t2 ON t1.DocumentID=t2.DocumentID
CROSS APPLY(SELECT Number FROM @MockNumbers WHERE Number BETWEEN 1 AND t2.Qty) A;
The result
FirstItem_001
FirstItem_002
[...]
FirstItem_005
SecondItem_001
SecondItem_002
[...]
SecondItem_007
The idea in short:
We use an INNER JOIN to get the quantity joined to the item.
Then we use APPLY, which is a row-wise action, to bind as many rows to the set as we need.
The first item will return with 5 lines, the second with 7. The trick with STR() and REPLACE() is one way to create a padded number. You might use FORMAT() (v2012+), but it works rather slowly...
The table @MockNumbers is a declared table variable containing a list of numbers from 1 to 100. This answer provides an example of how to create a physical numbers and date table. Any database should have such a table...
If you don't want to create a numbers table, you can search for a tally table or tally on the fly. There are many answers showing approaches to creating a list of running numbers...
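As one illustration of a "tally on the fly", here is a sketch that builds the numbers inline by cross joining a ten-row digits list with itself. The table variables @Events and @Quantities are hypothetical stand-ins for the question's two tables, and the approach as written assumes quantities of at most 999, since the tag is padded to three digits.

```sql
-- Hypothetical stand-ins for the question's two tables.
DECLARE @Events TABLE (EventID char(5), DocumentID char(4), EventDesc varchar(50));
DECLARE @Quantities TABLE (DocumentID char(4), Qty int);
INSERT INTO @Events VALUES ('00002', '0005', 'ItemsReceived');
INSERT INTO @Quantities VALUES ('0005', 20);

WITH digits AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
tally AS (
    -- 10 x 10 x 10 combinations give the numbers 1 through 1000
    SELECT 100 * h.d + 10 * t.d + u.d + 1 AS n
    FROM digits AS h CROSS JOIN digits AS t CROSS JOIN digits AS u
)
SELECT e.EventID, e.DocumentID, e.EventDesc,
       CONCAT('ITEM_TAG_', RIGHT('000' + CAST(n.n AS varchar(10)), 3)) AS ItemTag
FROM @Events AS e
INNER JOIN @Quantities AS q ON q.DocumentID = e.DocumentID
INNER JOIN tally AS n ON n.n <= q.Qty   -- one output row per unit of quantity
ORDER BY e.DocumentID, n.n;
```

With Qty = 20 this should yield ITEM_TAG_001 through ITEM_TAG_020. No recursion is involved, so no maxrecursion hint is needed.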

Copy Most Recent Date's row values where gaps in dates exist

I am creating a report in Tableau for a new product that captures metrics such as previous applications pending, new apps end of day pending etc. In order to do this, I need a snapshot of the end-of-day status for each application each day. A decision was made above my pay grade to only capture a rolling seven-day delta of the data. So an application that has not had a status change in the previous seven days stops appearing in the DB until something new happens, which allows for gaps in dates and throws off the numbers in my report. What I need is a snapshot for each day for each application, so when there is a date gap, I want to grab the most recent previous day's record and insert it to fill in the gaps between the two dates. Also, I join to a credit score table, and we sometimes pull all three bureaus, sometimes two, sometimes one, so there could be up to three rows per application per day.
I have looked on this site for similar issues and I see some, but none are an exact match for what I am trying to accomplish, and I honestly do not know where to start. Will a correlated subquery accomplish what I need? I provided some code below to show what the data looks like currently.
drop table if exists #date
drop table if exists #test
create table #date
(
calendar_date date
)
insert into #date
values
('2019-08-07'),
('2019-08-08'),
('2019-08-09'),
('2019-08-10'),
('2019-08-11'),
('2019-08-12')
create table #test
(
id int,
period_date date,
decision_status varchar(20),
credit_score int,
expired_flag bit
)
insert into #test (id,period_date,decision_status,credit_score,expired_flag)
values
(1,'2019-08-08','declined',635,null),
(1,'2019-08-08','declined',642,null),
(1,'2019-08-09','declined',635,null),
(1,'2019-08-09','declined',642,null),
(1,'2019-08-10','declined',635,null),
(1,'2019-08-10','declined',642,null),
(1,'2019-08-11','declined',635,null),
(1,'2019-08-11','declined',642,null),
(1,'2019-08-12','declined',635,null),
(1,'2019-08-12','declined',642,null),
(2,'2019-08-08','review',656,null),
(2,'2019-08-08','review',648,null),
(2,'2019-08-09','review',656,null),
(2,'2019-08-09','review',648,null),
(2,'2019-08-12','review',656,null),
(2,'2019-08-12','review',648,null),
(3,'2019-08-08','preapproved',678,null),
(3,'2019-08-08','preapproved',689,null),
(3,'2019-08-08','preapproved',693,null),
(3,'2019-08-09','preapproved',678,null),
(3,'2019-08-09','preapproved',689,null),
(3,'2019-08-09','preapproved',693,null),
(3,'2019-08-11','preapproved',678,1),
(3,'2019-08-11','preapproved',689,1),
(3,'2019-08-11','preapproved',693,1),
(3,'2019-08-12','preapproved',678,1),
(3,'2019-08-12','preapproved',689,1),
(3,'2019-08-12','preapproved',693,1),
(4,'2019-08-08','onboarded',725,null),
(4,'2019-08-09','onboarded',725,null),
(4,'2019-08-10','onboarded',725,null),
(5,'2019-08-08','approved',685,null),
(5,'2019-08-08','approved',675,null),
(5,'2019-08-09','approved',685,null),
(5,'2019-08-09','approved',675,null),
(5,'2019-08-12','approved',685,1),
(5,'2019-08-12','approved',675,1)
And the query:
select id, calendar_date, period_date, decision_status, credit_score, expired_flag
from #date join
#test
on calendar_date=dateadd(day,-1,period_date)
order by id, calendar_date
I just need each application to show for each day.
You may just need a left join:
select t.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score, t.expired_flag
from #date d left join
#test t
on d.calendar_date = dateadd(day, -1, t.period_date)
order by id, d.calendar_date;
If by "application" you mean the id in #test, then use cross join to generate the rows and an outer apply to fill in the values:
select i.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score, t.expired_flag
from #date d cross join
(select distinct id from #test) i outer apply
(select top (1) t.*
from #test t
where t.id = i.id and t.period_date <= d.calendar_date
order by t.period_date desc
) t
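One thing to watch with top (1): the question mentions up to three credit-score rows per application per day, and top (1) keeps only one of them. A possible variation, offered as a sketch rather than a tested fix, is top (1) with ties, which returns every row that shares the latest period_date. The sketch compares period_date to calendar_date directly and ignores the one-day offset in the question's join; @dates and @apps are hypothetical stand-ins holding a slice of the sample data.

```sql
-- Hypothetical stand-ins for #date and #test.
DECLARE @dates TABLE (calendar_date date);
INSERT INTO @dates VALUES ('20190808'), ('20190809'), ('20190810'), ('20190811'), ('20190812');

DECLARE @apps TABLE (id int, period_date date, decision_status varchar(20), credit_score int);
INSERT INTO @apps VALUES
    (2, '20190808', 'review', 656), (2, '20190808', 'review', 648),
    (2, '20190809', 'review', 656), (2, '20190809', 'review', 648),
    (2, '20190812', 'review', 656), (2, '20190812', 'review', 648),
    (4, '20190808', 'onboarded', 725),
    (4, '20190810', 'onboarded', 725);

SELECT i.id, d.calendar_date, t.period_date, t.decision_status, t.credit_score
FROM @dates AS d
CROSS JOIN (SELECT DISTINCT id FROM @apps) AS i
OUTER APPLY (
    -- WITH TIES keeps all rows that share the most recent period_date
    SELECT TOP (1) WITH TIES a.*
    FROM @apps AS a
    WHERE a.id = i.id
      AND a.period_date <= d.calendar_date
    ORDER BY a.period_date DESC
) AS t
ORDER BY i.id, d.calendar_date, t.credit_score;
```

For id 2, the calendar dates 2019-08-10 and 2019-08-11 should each come back with both 2019-08-09 rows carried forward.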
Update:
After receiving the reply from Gordon, which gave me some inspiration and set me in the right direction, and conducting some additional research, I appear to have found a solution that is working. I wanted to share the solution here in case anyone else runs across this problem. I am posting the code below:
drop table if exists #date
drop table if exists #test
drop table if exists #test1
drop table if exists #row_num
create table #date
(
calendar_date date
)
insert into #date
values
('2019-08-07'),
('2019-08-08'),
('2019-08-09'),
('2019-08-10'),
('2019-08-11')
create table #test
(
id int,
period_date date,
decision_status varchar(20),
credit_score int,
expired_flag bit
)
insert into #test (id,period_date,decision_status,credit_score,expired_flag)
values
(1,'2019-08-08','declined',635,null),
(1,'2019-08-08','declined',642,null),
(1,'2019-08-09','declined',635,null),
(1,'2019-08-09','declined',642,null),
(1,'2019-08-10','declined',635,null),
(1,'2019-08-10','declined',642,null),
(1,'2019-08-11','declined',635,null),
(1,'2019-08-11','declined',642,null),
(1,'2019-08-12','declined',635,null),
(1,'2019-08-12','declined',642,null),
(2,'2019-08-08','review',656,null),
(2,'2019-08-08','review',648,null),
(2,'2019-08-09','review',656,null),
(2,'2019-08-09','review',648,null),
(2,'2019-08-12','review',656,null),
(2,'2019-08-12','review',648,null),
(3,'2019-08-08','preapproved',678,null),
(3,'2019-08-08','preapproved',689,null),
(3,'2019-08-08','preapproved',693,null),
(3,'2019-08-09','preapproved',678,null),
(3,'2019-08-09','preapproved',689,null),
(3,'2019-08-09','preapproved',693,null),
(3,'2019-08-11','preapproved',678,1),
(3,'2019-08-11','preapproved',689,1),
(3,'2019-08-11','preapproved',693,1),
(3,'2019-08-12','preapproved',678,1),
(3,'2019-08-12','preapproved',689,1),
(3,'2019-08-12','preapproved',693,1),
(4,'2019-08-08','onboarded',725,null),
(4,'2019-08-09','onboarded',725,null),
(4,'2019-08-10','onboarded',725,null),
(5,'2019-08-08','approved',685,null),
(5,'2019-08-08','approved',675,null),
(5,'2019-08-09','approved',685,null),
(5,'2019-08-09','approved',675,null),
(5,'2019-08-12','approved',685,1),
(5,'2019-08-12','approved',675,1)
select id,calendar_date,decision_status,credit_score,expired_flag
,ROW_NUMBER() over(partition by id,calendar_date order by calendar_date) as row_id
,cast(ROW_NUMBER() over(partition by id,calendar_date order by calendar_date) as char(1)) as row_num
into #test1
from #date
join #test
on calendar_date=dateadd(day,-1,period_date)
order by id,calendar_date
create table #row_num
(
row_id int,
row_num char(1)
)
insert into #row_num
values
(1,'1'),
(2,'2'),
(3,'3')
select i.id
,d.calendar_date
,coalesce(t.decision_status,t1.decision_status) as decision_status
,coalesce(t.credit_score,t1.credit_score) as credit_score
,coalesce(t.expired_flag,t1.expired_flag) as expired_flag
from #date d
cross join
(select distinct id
from #test1 ) i
cross join #row_num r
left join #test1 t
on t.id=i.id
and t.row_id=r.row_id
and t.calendar_date=d.calendar_date
join
(select id,row_id,decision_status,credit_score,expired_flag
,calendar_date as start_date
,lead(calendar_date,1,dateadd(day,1,(select max(calendar_date) from #date)))
over (partition by id,row_id order by calendar_date) as end_date
from #test1
) t1
on t1.id=i.id
and t1.row_id=r.row_id
and d.calendar_date>=t1.start_date
and d.calendar_date<t1.end_date
order by i.id,d.calendar_date,r.row_id
This gives me what I am looking for, all the daily records for each application for each day.

How to select values by date field (not as simple as it sounds)

I have a table called tblMK The table contains a date time field.
What I wish to do is create a query which will each time, select the 2 latest entries (by the datetime column) and then get the date difference between them and show only that.
How would I go about creating this expression? This doesn't necessarily need to be a query; it could be a view, function, procedure, or whatever works. I have created a function called getdatediff which receives two dates and returns a string that says (x days y hours z minutes); basically that will be the calculated field. So how would I go about doing this?
Edit: I need to each time select 2 and 2 and so on until the oldest one. There will always be an even amount of rows.
Use only sql like this:
create table t1(c1 integer, dt datetime);
insert into t1 values
(1, getdate()),
(2, dateadd(day,1,getdate())),
(3, dateadd(day,2,getdate()));
with temp as (select top 2 dt
from t1
order by dt desc)
select datediff(day,min(dt),max(dt)) as diff_of_dates
from temp;
sql fiddle
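The query above handles only the two latest rows. For the edit in the question (taking the rows two by two down to the oldest), one possible sketch is to number the rows newest first and integer-divide the number to get a pair id. It assumes an even number of rows, as the question states; @stamps is a hypothetical stand-in for the datetime column of tblMK, and the difference is reported in minutes instead of calling getdatediff.

```sql
-- @stamps is a hypothetical stand-in for tblMK's datetime column.
DECLARE @stamps TABLE (dt datetime);
INSERT INTO @stamps (dt) VALUES
    ('20200101 08:00'), ('20200101 17:30'),
    ('20200102 09:00'), ('20200103 10:15');

WITH numbered AS (
    SELECT dt,
           -- rows 1 and 2 (newest first) get pair 1, rows 3 and 4 get pair 2, ...
           (ROW_NUMBER() OVER (ORDER BY dt DESC) + 1) / 2 AS pair_no
    FROM @stamps
)
SELECT pair_no,
       MIN(dt) AS earlier_dt,
       MAX(dt) AS later_dt,
       DATEDIFF(minute, MIN(dt), MAX(dt)) AS diff_minutes
FROM numbered
GROUP BY pair_no
ORDER BY pair_no;
```

The MIN(dt) and MAX(dt) of each pair could be passed to a formatting function such as getdatediff in place of DATEDIFF.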
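On MySQL, use the limit clause (and timestampdiff(), because subtracting two datetime values directly does not give a real duration):
select timestampdiff(second, min(a.updated_at), max(a.updated_at)) as diff_seconds
from
( select * from mytable order by updated_at desc limit 2 ) a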
Thanks, guys, I found the solution. Please ignore the additional columns; they are for my DB:
;with numbered as (
select parit, taarich, hulia, mesirakabala,
rowno = row_number() over (partition by parit order by taarich)
from tblMK)
select a.rowno - 1, a.parit, a.hulia, b.taarich as taarich_kabala, a.taarich, a.mesirakabala, getdatediff(b.taarich, a.taarich) as due
from numbered a
left join numbered b on b.parit = a.parit
and b.rowno = a.rowno - 1
where b.taarich is not null
order by a.parit, a.taarich
Sorry about any mistakes I might have made; I'm on my smartphone.

SQL JOIN table with a date range

Say I have a table with C columns and N rows. I would like to produce a select statement that represents the "join" of that table with a date range comprising M days. The result set should have C+1 columns (the last one being the date) and N×M rows.
Trivial example to clarify things:
Given the table A below:
select * from A;
avalue |
--------+
"a" |
And a date range from 10 to 12 of October 2012, I want the following result set:
avalue | date
--------+-------
"a" | 2012-10-10
"a" | 2012-10-11
"a" | 2012-10-12
(this is a stepping stone I need towards ultimately calculating inventory levels on any given day, given starting values and deltas)
The Postgres way for this is simple: CROSS JOIN to the function generate_series():
SELECT t.*, g.day::date
FROM tbl t
CROSS JOIN generate_series(timestamp '2012-10-10'
, timestamp '2012-10-12'
, interval '1 day') AS g(day);
Produces exactly the output requested.
generate_series() is a set-returning function (a.k.a. "table function") producing a derived table. There are a couple of overloaded variants, here's why I chose timestamp input:
Generating time series between two dates in PostgreSQL
For arbitrary dates, replace generate_series() with a VALUES expression. No need to persist a table:
SELECT *
FROM tbl t
CROSS JOIN (
VALUES
(date '2012-08-13') -- explicit type in 1st row
, ('2012-09-05')
, ('2012-10-10')
) g(day);
If the date table has more dates in it than you're interested in, then do
select a.avalue, b.date from a, b where b.date between '2012-10-10' and '2012-10-12'
Otherwise, if the date table contains only the dates you are interested in, a cartesian join would accomplish this:
select * from a,b;
For SQL Server:
declare
@Date1 datetime = '20121010',
@Date2 datetime = '20121012';
with Dates
as
(
select @Date1 as [Date]
union all
select dateadd(dd, 1, D.[Date]) as [Date]
from Dates as D
where D.[Date] <= DATEADD(dd, -1, @Date2)
)
select
A.avalue, D.[Date]
from Dates as D
cross join A
For MySQL
schema/data:
CREATE TABLE someTable
(
someCol varchar(8) not null
);
INSERT INTO someTable VALUES ('a');
CREATE TABLE calendar
(
calDate datetime not null,
isBus bit
);
ALTER TABLE calendar
ADD CONSTRAINT PK_calendar
PRIMARY KEY (calDate);
INSERT INTO calendar VALUES ('2012-10-10', 1);
INSERT INTO calendar VALUES ('2012-10-11', 1);
INSERT INTO calendar VALUES ('2012-10-12', 1);
query:
select s.someCol, c.calDate from someTable s, calendar c;
You really have two options for what you are trying to do.
1. If your RDBMS supports it (I know SQL Server does, but I don't know about any others), you can create a table-valued function which takes in a date range and returns a result set of all the discrete dates within that range. You would do a cartesian join between your table and the function.
2. You can create a static table of date values and then do a cartesian join between the two tables.
The second option will perform better, especially if you are dealing with large date ranges; however, that solution will not be able to handle arbitrary date ranges. But then, you should know your minimum date, and you can always add more dates to your table as time goes on.
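For the first option, a rough sketch of what such a function might look like in SQL Server follows. The name dbo.DateRange, its parameters, and the table variable @A are all hypothetical, and as written the function covers at most 1000 days; longer ranges would be cut off silently. GO is the batch separator used by tools such as SSMS and sqlcmd; CREATE FUNCTION has to be the only statement in its batch.

```sql
-- dbo.DateRange is a hypothetical name, not an existing function.
CREATE FUNCTION dbo.DateRange (@StartDate date, @EndDate date)
RETURNS TABLE
AS
RETURN
    WITH digits AS (
        SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
    ),
    offsets AS (
        -- day offsets 0 through 999
        SELECT 100 * h.d + 10 * t.d + u.d AS n
        FROM digits AS h CROSS JOIN digits AS t CROSS JOIN digits AS u
    )
    SELECT DATEADD(day, o.n, @StartDate) AS [Date]
    FROM offsets AS o
    WHERE o.n <= DATEDIFF(day, @StartDate, @EndDate);
GO

-- Usage: the cartesian join between a table and the function.
DECLARE @A TABLE (avalue varchar(10));
INSERT INTO @A VALUES ('a');

SELECT a.avalue, r.[Date]
FROM @A AS a
CROSS JOIN dbo.DateRange('20121010', '20121012') AS r
ORDER BY a.avalue, r.[Date];
```

For the question's example this should produce three rows, one per day from 2012-10-10 to 2012-10-12.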
I am not very clear about your M table. Providing that you have such a table(M) with dates, following cross join will bring the results.
SELECT C.*, M.date FROM C CROSS JOIN M