Query populating dates - sql

Query that generates records to hold a future calculated value.
Hi, I am trying to write a query with the tables below to populate a collection. I want the t2 values when the dates match; when there is no match, I want the dates populated with null values (these will be filled in later with a calculated value). The number of records for an unmatched date should equal the number of records on the most recent date that did match. So in the example, each day after 7/1 should have 3 records, and each day after 7/5 just 2. I am trying to do this in one query but I am not sure it is possible. Any help on creating this and getting it into a collection would be much appreciated.
create table t1 as
WITH DATA AS
(SELECT to_date('07/01/2019', 'MM/DD/YYYY') date1,
to_date('07/10/2019', 'MM/DD/YYYY') date2
FROM dual
)
SELECT date1+LEVEL-1 the_date,
TO_CHAR(date1+LEVEL-1, 'DY','NLS_DATE_LANGUAGE=AMERICAN') day
FROM DATA
WHERE TO_CHAR(date1+LEVEL-1, 'DY','NLS_DATE_LANGUAGE=AMERICAN')
NOT IN ('SAT', 'SUN')
CONNECT BY LEVEL <= date2-date1+1;
create table t2
(cdate date,
camount number);
insert into t2 values
('01-JUL-2019', 10);
insert into t2 values
('01-JUL-2019', 20);
insert into t2 values
('01-JUL-2019', 30);
insert into t2 values
('05-JUL-19', 50);
insert into t2 values
('05-JUL-19', 20);
expected results:
01-JUL-19 10
01-JUL-19 20
01-JUL-19 30
02-JUL-19 null
02-JUL-19 null
02-JUL-19 null
03-JUL-19 null
03-JUL-19 null
03-JUL-19 null
04-JUL-19 null
04-JUL-19 null
04-JUL-19 null
05-JUL-19 50
05-JUL-19 20
08-JUL-19 null
08-JUL-19 null
09-JUL-19 null
09-JUL-19 null
10-JUL-19 null
10-JUL-19 null

One approach to this kind of problem is to build the result set incrementally in a few steps:
1. Count the matches that each THE_DATE in T1 has in T2.
2. Apply the rule you outlined in the question to each THE_DATE that has zero matches: carry forward (across dates in ascending order) the number of matches of the last THE_DATE that did have matches.
3. Generate the extra rows in T1 for each THE_DATE that has zero matches (e.g. if it is supposed to have three null records, duplicate it up to this number).
4. Outer join to T2 to get the CAMOUNT where it is available.
Here's an example (the three named subquery factors correspond to steps 1, 2 and 3 above):
WITH DATE_MATCH_COUNT AS (
SELECT T1.THE_DATE,
COUNT(T2.CDATE) AS MATCH_COUNT,
ROW_NUMBER() OVER (PARTITION BY NULL ORDER BY T1.THE_DATE ASC) AS ROWKEY
FROM T1
LEFT OUTER JOIN T2
ON T1.THE_DATE = T2.CDATE
GROUP BY T1.THE_DATE),
ADJUSTED_MATCH_COUNT AS (
SELECT THE_DATE,
MATCH_COUNT AS ACTUAL_MATCH_COUNT,
GREATEST(MATCH_COUNT,
(SELECT MAX(MATCH_COUNT) KEEP ( DENSE_RANK LAST ORDER BY ROWKEY ASC )
FROM DATE_MATCH_COUNT SCALAR_MATCH_COUNT
WHERE SCALAR_MATCH_COUNT.ROWKEY <= DATE_MATCH_COUNT.ROWKEY AND
SCALAR_MATCH_COUNT.MATCH_COUNT > 0)) AS FORCED_MATCH_COUNT
FROM DATE_MATCH_COUNT),
GENERATED_MATCH_ROW AS (
SELECT THE_DATE, FORCED_MATCH_COUNT, MATCH_KEY
FROM ADJUSTED_MATCH_COUNT CROSS APPLY (SELECT LEVEL AS MATCH_KEY
FROM DUAL CONNECT BY LEVEL <= DECODE(ACTUAL_MATCH_COUNT,0,FORCED_MATCH_COUNT,1)))
SELECT THE_DATE, CAMOUNT
FROM GENERATED_MATCH_ROW
LEFT OUTER JOIN T2
ON GENERATED_MATCH_ROW.THE_DATE = T2.CDATE
ORDER BY THE_DATE, CAMOUNT ASC;
Result:
THE_DATE CAMOUNT
____________ __________
01-JUL-19 10
01-JUL-19 20
01-JUL-19 30
02-JUL-19
02-JUL-19
02-JUL-19
03-JUL-19
03-JUL-19
03-JUL-19
04-JUL-19
04-JUL-19
04-JUL-19
05-JUL-19 20
05-JUL-19 50
08-JUL-19
08-JUL-19
09-JUL-19
09-JUL-19
10-JUL-19
10-JUL-19
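The same carry-forward rule can be sketched outside the database. The snippet below is only an illustration in Python, not part of the Oracle answer; the function name fill_rows and the in-memory lists are assumptions made for the example. Unlike the SQL, it emits no rows for dates that come before the first match.

```python
from datetime import date

def fill_rows(dates, t2_rows):
    """Return (date, amount) rows; unmatched dates repeat the last matched count as None rows."""
    # Step 1: collect the amounts recorded for each date.
    amounts = {}
    for cdate, camount in t2_rows:
        amounts.setdefault(cdate, []).append(camount)

    result = []
    carried = 0  # number of rows on the most recent date that had matches
    for d in sorted(dates):
        matched = amounts.get(d, [])
        if matched:
            # Step 4: a matching date keeps its real amounts.
            carried = len(matched)
            result.extend((d, a) for a in sorted(matched))
        else:
            # Steps 2 and 3: carry the last count forward as placeholder rows.
            result.extend((d, None) for _ in range(carried))
    return result

# Weekdays from T1 and the five rows from T2 in the question.
weekdays = [date(2019, 7, n) for n in (1, 2, 3, 4, 5, 8, 9, 10)]
t2 = [(date(2019, 7, 1), 10), (date(2019, 7, 1), 20), (date(2019, 7, 1), 30),
      (date(2019, 7, 5), 50), (date(2019, 7, 5), 20)]
rows = fill_rows(weekdays, t2)
```

With the sample data this yields 20 rows: three per day from 1 to 4 July and two per day from 5 July onwards.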

Related

SQL - Select date ranges without overlapping

I have the following table (Oracle database):
ID  valid_from  valid_to
--------------------------
1   01.01.22    28.02.22
1   01.03.22    30.06.22
1   01.07.22    31.12.22
1   01.01.23    null
2   01.01.22    31.03.22
2   01.04.22    null
How do I best extract all date ranges without overlaps across both IDs? The final result set should look like:
valid_from  valid_to
----------------------
01.01.22    28.02.22
01.03.22    31.03.22
01.04.22    30.06.22
01.07.22    31.12.22
01.01.23    null
Null stands for max_date (PL / SQL Oracle Max Date).
Moreover, I should only select the values valid for the current year (let's assume we are already in 2022).
Thanks for your help in advance!
You can use the following SELECT statement:
with
-- main table
t1 AS (SELECT w, q1, q2, to_date(q1,'dd.mm.yy') q1d, to_date(q2,'dd.mm.yy') q2d FROM www)
-- custom year in YYYY format
, t0 AS (SELECT '2022' y FROM dual)
-- join and order dates FROM - TO
, t2 AS (SELECT t1.q1, t1.q1d, s2.q2, s2.q2d
FROM t1
LEFT JOIN t1 s2 on t1.q1d <= s2.q2d
ORDER BY t1.q1d, s2.q2d)
-- number the candidate end dates for each start date, nearest end date first
, t3 AS (SELECT t2.*,
row_number() OVER (PARTITION BY t2.q1d ORDER BY t2.q2d ) r
FROM t2 )
-- join custom year value and select desired rows based on that value
SELECT q1, q2 FROM t3
JOIN t0 on 1=1
WHERE r = 1
-- for the custom year
AND t0.y <= to_char(q1d, 'yyyy')
ORDER BY q1d;
In my example table the dates are stored as varchar2 in dd.mm.yy format. If your columns are already of type date, you don't need to apply to_date() to those two fields.
Used table sample:
create table www (w integer, q1 varchar2(30), q2 varchar2(30));
insert into www values (1, '01.01.22', '28.02.22');
insert into www values (1, '01.03.22', '30.06.22');
insert into www values (1, '01.07.22', '31.12.22');
insert into www values (1, '01.01.23', '');
insert into www values (2, '01.01.22', '31.03.22');
insert into www values (2, '01.04.22', '');
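To make the pairing rule explicit, here is a small Python sketch of the same idea: each distinct start date is paired with the earliest end date that is on or after it, and a start with no such end stays open ended. The function name non_overlapping and the tuple layout are assumptions for illustration only, and the year filter from the query is left out.

```python
from datetime import date

def non_overlapping(ranges):
    """Pair each distinct start date with the earliest end date on or after it."""
    starts = sorted({start for start, _ in ranges})
    ends = sorted({end for _, end in ranges if end is not None})
    result = []
    for start in starts:
        # None means "open ended", mirroring the NULL valid_to in the table.
        nearest = next((end for end in ends if end >= start), None)
        result.append((start, nearest))
    return result

# The six sample rows (valid_from, valid_to) for both IDs.
rows = [
    (date(2022, 1, 1), date(2022, 2, 28)),
    (date(2022, 3, 1), date(2022, 6, 30)),
    (date(2022, 7, 1), date(2022, 12, 31)),
    (date(2023, 1, 1), None),
    (date(2022, 1, 1), date(2022, 3, 31)),
    (date(2022, 4, 1), None),
]
result = non_overlapping(rows)
```

For the sample rows this produces the five ranges listed in the question, ending with the open-ended range that starts on 01.01.23.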
If your table has more rows with a null valid_to whose valid_from does not fall in any existing range, for example:
insert into www values (1, '01.01.24', '');
then the previous solution will produce extra rows with a null value at the end.
In that case you can use this more complex solution:
...
-- join custom year value and select desired rows based on that value
, t4 as (SELECT q1, q2, q1d FROM t3
JOIN t0 on 1=1
WHERE r = 1 AND
-- for the custom year
t0.y <= to_char(q1d, 'yyyy')
ORDER BY q1d)
-- filter non-nullable rows
, t5 as ( SELECT q1, q2 FROM t4 WHERE Q2 IS NOT NULL )
-- max date from rows where Q2 field has null value
, t6 as ( SELECT to_char(MAX(Q1D),'dd.mm.yy') q1, q2
FROM t4
WHERE Q2 IS NULL
GROUP BY q2)
-- append rows with max date
SELECT * FROM t5
UNION ALL
SELECT * FROM t6;

Highlight multiple records in a date range

Working with SQL Server 2008.
fromdate todate ID name
--------------------------------
1-Aug-16 7-Aug-16 x jack
3-Aug-16 4-Aug-16 x jack
5-Aug-16 6-Aug-16 x tom
1-Aug-16 2-Aug-16 x john
3-Aug-16 4-Aug-16 x harry
5-Aug-16 6-Aug-16 x mac
Is there a way to script this so that I know if there are multiple names tagged to an ID in the same date range?
In the example above, I want to flag that ID x has the names Jack and Tom tagged in the same date range.
ID multiple_flag
------------------------------------------------
x yes
y no
If your table has a unique key (column i in my example; you could also generate one with ROW_NUMBER()), you can find overlapping date ranges with a query based on an INNER JOIN:
CREATE TABLE #tmp (i int identity primary key,fromdate date,todate date,ID int,name varchar(32));
insert into #tmp (fromdate,todate,ID ,name) values
('1-Aug-16','7-Aug-16',3,'jack'),
('3-Aug-16','4-Aug-16',3,'tom'),
('5-Aug-16','6-Aug-16',3,'jack');
select a.*,b.name bname,b.i i2 from #tmp a
INNER join #tmp b on b.id=a.id AND b.i<>a.i
AND ( b.fromdate between a.fromdate and a.todate
OR b.todate between a.fromdate and a.todate)
(My id column is int). This will give you:
i fromdate todate ID name bname i2
- ---------- ---------- - ---- ----- --
1 2016-08-01 2016-08-07 3 jack tom 2
1 2016-08-01 2016-08-07 3 jack jack 3
Implement further filtering or grouping as required.
Please check the SQL below, although it might not be the optimal one:
SELECT fromdate,todate,id,tab1.name,
case when tab2.#Of >1 then 'yes' else 'no' end as multiple_flag
FROM tab1
inner join (SELECT Name, COUNT(*) as #Of
FROM tab1
GROUP BY Name) as tab2 on tab1.name=tab2.name
order by tab1.id ;
Add your WHERE condition before the ORDER BY if you need to restrict the query to a date range.
One way to do it is to use CASE with EXISTS.
Please note this part of the query:
-- make sure the records date ranges overlap
AND t1.fromdate <= t2.todate
AND t2.fromdate <= t1.todate
For an explanation of how to test for overlapping ranges, read the overlap wiki.
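As a quick illustration of why those two comparisons are enough, here is the same predicate in Python. The helper name overlaps is made up for this sketch; the dates are taken from the sample data further down.

```python
from datetime import date

def overlaps(from1, to1, from2, to2):
    """True when two closed date ranges share at least one day."""
    return from1 <= to2 and from2 <= to1

jack = (date(2016, 8, 1), date(2016, 8, 7))
tom = (date(2016, 8, 3), date(2016, 8, 4))
john = (date(2016, 8, 1), date(2016, 8, 2))
harry = (date(2016, 8, 3), date(2016, 8, 4))

jack_tom = overlaps(*jack, *tom)      # tom's range lies inside jack's
john_harry = overlaps(*john, *harry)  # harry starts after john ends
```

Two ranges fail to overlap only when one ends before the other begins, so negating that gives the two-comparison test.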
Create and populate sample data (Please save us this step in your future questions)
DECLARE @T as table
(
fromdate date,
todate date,
ID char(1),
name varchar(10)
)
INSERT INTO @T VALUES
('2016-08-01', '2016-08-07', 'x', 'jack'),
('2016-08-03', '2016-08-04', 'x', 'tom'),
('2016-08-05', '2016-08-06', 'x', 'jack'),
('2016-08-01', '2016-08-02', 'y', 'john'),
('2016-08-03', '2016-08-04', 'y', 'harry'),
('2016-08-05', '2016-08-06', 'y', 'mac')
The query:
SELECT DISTINCT id,
CASE WHEN EXISTS
(
SELECT 1
FROM @T t2
WHERE t1.Id = t2.Id
-- make sure it's not the same record
AND t1.fromdate <> t2.fromdate
AND t1.todate <> t2.todate
-- make sure the records date ranges overlap
AND t1.fromdate <= t2.todate
AND t2.fromdate <= t1.todate
)
THEN 'Yes'
ELSE 'No'
END As multiple_flag
FROM @T t1
Results:
id multiple_flag
---- -------------
x Yes
y No

Insert data into empty cells in ascending order

I have a Postgres table with following structure:
CREATE TABLE tb1 (
id integer,
name text,
date date,
time time without time zone
);
CREATE TABLE tb2 (
id integer,
name text,
date date
);
I need to generate a third table, tb3, with a column time_now that increases in steps of 10 minutes. If no tb1.time matches tb3.time_now, then tb2.name is filled in. If tb1.time equals (or is close to) time_now, the tb1 row is inserted into tb3.
Example:
tb1
1, xxxx, 2014-10-01, 08:20:00
2, yyyy, 2014-10-01, 08:40:00
tb2
1, zzzz, 2014-10-01
2, vvvv, 2014-10-01
3, eeee, 2014-10-01
3rd table should look like:
1, 08:00:00,zzzz -----> from tb2
2, 08:10:00,vvvv -----> from tb2
3, 08:20:00,xxxx -----> from tb1
4, 08:30:00,eeee -----> from tb2
5, 08:40:00,yyyy -----> from tb1
How to achieve this?
SELECT t.id, t.time::text, COALESCE(t.name, t2.name) AS name
FROM (
SELECT g.id, g.time, t1.name
, CASE WHEN t1.name IS NULL THEN
row_number() OVER (PARTITION BY t1.name ORDER BY g.id)
END AS rn
FROM (
SELECT g AS id, '08:00'::time + '10 min'::interval * (g-1) AS time
FROM generate_series (1,6) g
) g
LEFT JOIN tb1 t1 USING (time)
) t
LEFT JOIN tb2 t2 ON t2.id = t.rn
ORDER BY t.id;
First build a table of desired times with generate_series(). From 8:00 to 8:50 in my example.
Join to tb1 on time. Attach ascending numbers to empty slots with row_number() (rn).
Join tb2 to the remaining empty slots in ascending order.
Use COALESCE to pick names from tb1 and tb2.
Be wary about off-by-1 errors.
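The steps above can be sketched in Python to show the slot-filling logic on its own. This is an illustration under assumptions: the function name fill_slots, the dictionary keyed by time string, and the list of tb2 names are all invented for the example, and only exact time matches are handled.

```python
from datetime import datetime, timedelta

def fill_slots(start, slot_count, tb1, tb2_names):
    """Build (id, time, name) rows; tb1 maps 'HH:MM:SS' to a name."""
    first = datetime.strptime(start, "%H:%M:%S")
    spare = iter(tb2_names)  # tb2 names, consumed in ascending id order
    rows = []
    for i in range(slot_count):
        slot = (first + timedelta(minutes=10 * i)).strftime("%H:%M:%S")
        if slot in tb1:
            # A tb1 row exists for this slot, so its name wins.
            name = tb1[slot]
        else:
            # Empty slot: take the next unused tb2 name, or None when they run out.
            name = next(spare, None)
        rows.append((i + 1, slot, name))
    return rows

tb1 = {"08:20:00": "xxxx", "08:40:00": "yyyy"}
tb2_names = ["zzzz", "vvvv", "eeee"]
rows = fill_slots("08:00:00", 5, tb1, tb2_names)
```

With five slots starting at 08:00 this reproduces the table from the question: zzzz, vvvv, xxxx, eeee, yyyy.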
Aside: I would use none of those column names. id and name are not descriptive - never use those, be more specific. date and time are basic type names.

Grouping rows with a date range

I am using SQL Server 2008 and need to create a query that shows rows that fall within a date range.
My table is as follows:
ADM_ID WH_PID WH_IN_DATETIME WH_OUT_DATETIME
My rule is:
Rows belong to the same group if the WH_OUT_DATETIME of one is on or within 24 hours of the WH_IN_DATETIME of another ADM_ID with the same WH_PID.
I would like another column added to the results which identify the grouped value if possible as EP_ID.
e.g.
ADM_ID WH_PID WH_IN_DATETIME WH_OUT_DATETIME
------ ------ -------------- ---------------
1 9 2014-10-12 00:00:00 2014-10-13 15:00:00
2 9 2014-10-14 14:00:00 2014-10-15 15:00:00
3 9 2014-10-16 14:00:00 2014-10-17 15:00:00
4 9 2014-11-20 00:00:00 2014-11-21 00:00:00
5 5 2014-10-17 00:00:00 2014-10-18 00:00:00
Would return rows with:
ADM_ID WH_PID EP_ID EP_IN_DATETIME EP_OUT_DATETIME WH_IN_DATETIME WH_OUT_DATETIME
------ ------ ----- ------------------- ------------------- ------------------- -------------------
1 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-12 00:00:00 2014-10-13 15:00:00
2 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-14 14:00:00 2014-10-15 15:00:00
3 9 1 2014-10-12 00:00:00 2014-10-17 15:00:00 2014-10-16 14:00:00 2014-10-17 15:00:00
4 9 2 2014-11-20 00:00:00 2014-11-20 00:00:00 2014-10-16 14:00:00 2014-11-21 00:00:00
5 5 1 2014-10-17 00:00:00 2014-10-18 00:00:00 2014-10-17 00:00:00 2014-10-18 00:00:00
The EP_OUT_DATETIME will always be the latest date in the group. Hope this clarifies a bit.
This way, I can group by the EP_ID and find the EP_OUT_DATETIME and start time for any ADM_ID/PID that fall within.
Each should roll into the next, meaning that if another row has a WH_IN_DATETIME which follows on from the WH_OUT_DATETIME of another row for the same WH_PID, then that row's WH_OUT_DATETIME becomes the EP_OUT_DATETIME for all of the rows of that WH_PID within that EP_ID.
I hope this makes some sense.
Thanks,
MR
Since the question does not specify that the solution be a "single" query ;-), here is another approach: using the "quirky update" feature dealy, which is updating a variable at the same time you update a column. Breaking down the complexity of this operation, I create a scratch table to hold the piece that is the hardest to calculate: the EP_ID. Once that is done, it gets joined into a simple query and provides the window with which to calculate the EP_IN_DATETIME and EP_OUT_DATETIME fields.
The steps are:
Create the scratch table
Seed the scratch table with all of the ADM_ID values -- this lets us do an UPDATE as all of the rows already exist.
Update the scratch table
Do the final, simple select joining the scratch table to the main table
The Test Setup
SET ANSI_NULLS ON;
SET NOCOUNT ON;
CREATE TABLE #Table
(
ADM_ID INT NOT NULL PRIMARY KEY,
WH_PID INT NOT NULL,
WH_IN_DATETIME DATETIME NOT NULL,
WH_OUT_DATETIME DATETIME NOT NULL
);
INSERT INTO #Table VALUES (1, 9, '2014-10-12 00:00:00', '2014-10-13 15:00:00');
INSERT INTO #Table VALUES (2, 9, '2014-10-14 14:00:00', '2014-10-15 15:00:00');
INSERT INTO #Table VALUES (3, 9, '2014-10-16 14:00:00', '2014-10-17 15:00:00');
INSERT INTO #Table VALUES (4, 9, '2014-11-20 00:00:00', '2014-11-21 00:00:00');
INSERT INTO #Table VALUES (5, 5, '2014-10-17 00:00:00', '2014-10-18 00:00:00');
Step 1: Create and Populate the Scratch Table
CREATE TABLE #Scratch
(
ADM_ID INT NOT NULL PRIMARY KEY,
EP_ID INT NOT NULL
-- Might need WH_PID and WH_IN_DATETIME fields to guarantee proper UPDATE ordering
);
INSERT INTO #Scratch (ADM_ID, EP_ID)
SELECT ADM_ID, 0
FROM #Table;
Alternate scratch table structure to ensure proper update order (since "quirky update" uses the order of the Clustered Index, as noted at the bottom of this answer):
CREATE TABLE #Scratch
(
WH_PID INT NOT NULL,
WH_IN_DATETIME DATETIME NOT NULL,
ADM_ID INT NOT NULL,
EP_ID INT NOT NULL
);
INSERT INTO #Scratch (WH_PID, WH_IN_DATETIME, ADM_ID, EP_ID)
SELECT WH_PID, WH_IN_DATETIME, ADM_ID, 0
FROM #Table;
CREATE UNIQUE CLUSTERED INDEX [CIX_Scratch]
ON #Scratch (WH_PID, WH_IN_DATETIME, ADM_ID);
Step 2: Update the Scratch Table using a local variable to keep track of the prior value
DECLARE @EP_ID INT; -- this is used in the UPDATE
;WITH cte AS
(
SELECT TOP (100) PERCENT
t1.*,
t2.WH_OUT_DATETIME AS [PriorOut],
t2.ADM_ID AS [PriorID],
ROW_NUMBER() OVER (PARTITION BY t1.WH_PID ORDER BY t1.WH_IN_DATETIME)
AS [RowNum]
FROM #Table t1
LEFT JOIN #Table t2
ON t2.WH_PID = t1.WH_PID
AND t2.ADM_ID <> t1.ADM_ID
AND t2.WH_OUT_DATETIME >= (t1.WH_IN_DATETIME - 1)
AND t2.WH_OUT_DATETIME < t1.WH_IN_DATETIME
ORDER BY t1.WH_PID, t1.WH_IN_DATETIME
)
UPDATE sc
SET @EP_ID = sc.EP_ID = CASE
WHEN cte.RowNum = 1 THEN 1
WHEN cte.[PriorOut] IS NULL THEN (@EP_ID + 1)
ELSE @EP_ID
END
FROM #Scratch sc
INNER JOIN cte
ON cte.ADM_ID = sc.ADM_ID
Step 3: Select Joining the Scratch Table
SELECT tab.ADM_ID,
tab.WH_PID,
sc.EP_ID,
MIN(tab.WH_IN_DATETIME) OVER (PARTITION BY tab.WH_PID, sc.EP_ID)
AS [EP_IN_DATETIME],
MAX(tab.WH_OUT_DATETIME) OVER (PARTITION BY tab.WH_PID, sc.EP_ID)
AS [EP_OUT_DATETIME],
tab.WH_IN_DATETIME,
tab.WH_OUT_DATETIME
FROM #Table tab
INNER JOIN #Scratch sc
ON sc.ADM_ID = tab.ADM_ID
ORDER BY tab.ADM_ID;
Resources
MSDN page for UPDATE
look for "@variable = column = expression"
Performance Analysis of doing Running Totals (not exactly the same thing as here, but not too far off)
This blog post does mention:
PRO: this method is generally pretty fast
CON: "The order of the UPDATE is controlled by the order of the clustered index". This behavior might rule out using this method depending on circumstances. But in this particular case, if the WH_PID values are not at least grouped together naturally via the ordering of the clustered index and ordered by WH_IN_DATETIME, then those two fields just get added to the scratch table and the PK (with implied clustered index) on the scratch table becomes (WH_PID, WH_IN_DATETIME, ADM_ID).
I would do this using exists in a correlated subquery:
select t.*,
(case when exists (select 1
from yourtable t2
where t2.WH_PID = t.WH_PID and
t2.ADM_ID <> t.ADM_ID and -- compare against other rows only
t.WH_OUT_DATETIME between t2.WH_IN_DATETIME and dateadd(day, 1, t2.WH_OUT_DATETIME)
)
then 1 else 0
end) as TimeFrameFlag
from yourtable t;
Try this query :
;WITH cte
AS (SELECT t1.ADM_ID AS EP_ID,*
FROM #yourtable t1
WHERE NOT EXISTS (SELECT 1
FROM #yourtable t2
WHERE t1.WH_PID = t2.WH_PID
AND t1.ADM_ID <> t2.ADM_ID
AND Abs(Datediff(HH, t1.WH_OUT_DATETIME, t2.WH_IN_DATETIME)) <= 24)
UNION ALL
SELECT t2.EP_ID,t1.ADM_ID,t1.WH_PID,t1.WH_IN_DATETIME,t1.WH_OUT_DATETIME
FROM #yourtable t1
JOIN cte t2
ON t1.WH_PID = t2.WH_PID
AND t1.ADM_ID <> t2.ADM_ID
AND Abs(( Datediff(HH, t2.WH_IN_DATETIME, t1.WH_OUT_DATETIME) )) <= 24),
cte_result
AS (SELECT t1.*,Dense_rank() OVER ( partition BY wh_pid ORDER BY t1.WH_PID, ISNULL(t2.EP_ID, t1.ADM_ID)) AS EP_ID
FROM #yourtable t1
LEFT OUTER JOIN (SELECT DISTINCT ADM_ID,
EP_ID
FROM cte) t2
ON t1.ADM_ID = t2.ADM_ID)
SELECT ADM_ID,WH_PID,EP_ID,Min(WH_IN_DATETIME)OVER(partition BY wh_pid, ep_id) AS [EP_IN_DATETIME],Max(WH_OUT_DATETIME)OVER(partition BY wh_pid, ep_id) AS [EP_OUT_DATETIME],
WH_IN_DATETIME,
WH_OUT_DATETIME
FROM cte_result
ORDER BY ADM_ID
I assumed these things :
Rows that satisfy your rule form a group.
min(WH_IN_DATETIME) of the group is shown in the EP_IN_DATETIME column for all rows belonging to that group. Similarly, max(WH_OUT_DATETIME) of the group is shown in the EP_OUT_DATETIME column for all rows belonging to that group.
EP_ID is assigned to groups within each WH_PID separately.
One thing your question does not explain is how EP_OUT_DATETIME and WH_IN_DATETIME of the 4th row become 2014-11-20 00:00:00 and 2014-10-16 14:00:00 respectively. I am assuming that this is a typo and they should be 2014-11-21 00:00:00.000 and 2014-11-20 00:00:00.000.
Explanation:
The first CTE, cte, returns the possible groups based on your rule. The second CTE, cte_result, assigns EP_ID to the groups. Finally, you can select min(WH_IN_DATETIME) and max(WH_OUT_DATETIME) over partitions of wh_pid, ep_id.
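For readers who want to see the grouping rule without the recursion, here is a rough Python sketch. It is not a translation of the query above: it simply walks the rows of each WH_PID in WH_IN_DATETIME order and starts a new episode when more than 24 hours have passed since the latest out time seen. The name assign_episodes and the tuple layout are assumptions for the example.

```python
from datetime import datetime, timedelta

def assign_episodes(rows, gap=timedelta(hours=24)):
    """rows: (adm_id, wh_pid, in_dt, out_dt). Returns {adm_id: ep_id}, numbered per wh_pid."""
    episodes = {}
    last_out = {}  # latest WH_OUT_DATETIME seen so far in the current episode, per wh_pid
    current = {}   # current episode number, per wh_pid
    for adm_id, pid, in_dt, out_dt in sorted(rows, key=lambda r: (r[1], r[2])):
        if pid not in current or in_dt - last_out[pid] > gap:
            # First row for this pid, or more than 24 hours since the previous stay ended.
            current[pid] = current.get(pid, 0) + 1
            last_out[pid] = out_dt
        else:
            last_out[pid] = max(last_out[pid], out_dt)
        episodes[adm_id] = current[pid]
    return episodes

sample = [
    (1, 9, datetime(2014, 10, 12, 0), datetime(2014, 10, 13, 15)),
    (2, 9, datetime(2014, 10, 14, 14), datetime(2014, 10, 15, 15)),
    (3, 9, datetime(2014, 10, 16, 14), datetime(2014, 10, 17, 15)),
    (4, 9, datetime(2014, 11, 20, 0), datetime(2014, 11, 21, 0)),
    (5, 5, datetime(2014, 10, 17, 0), datetime(2014, 10, 18, 0)),
]
episodes = assign_episodes(sample)
```

On the sample rows, ADM_ID 1 to 3 land in episode 1 of WH_PID 9, ADM_ID 4 starts episode 2, and ADM_ID 5 is episode 1 of WH_PID 5, which matches the EP_ID column in the question.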
Here's yet another alternative... which may still not match your expected results.
I agree with @NoDisplayName that there appears to be an error in your ADM_ID 4 output; the 2 OUT dates should match, or at least that seems logical to me. I can't understand why you would ever want an out date to show an in date value, but of course there could be a good reason. :)
Also, the wording of your question makes it sound like this is just part of the problem and that you may take this output further. I'm not sure what you are really aiming for, but I've broken the query below up into 2 CTEs, and you may find your final information in the 2nd CTE (as it sounds like you want to group the data back together).
Here's the complete query:
-- The Cross Join ensures we always have a pair of first and last time pairs
-- The left join matches all overlapping combinations,
-- allowing the where clause to restrict to just the first and last
-- These first/last pairs are then grouped in the first CTE
-- Data is restricted in the second CTE
-- The final select is then quite simple
With GroupedData AS (
SELECT
(Row_Number() OVER (ORDER BY t1.WH_PID, t1.WH_IN_DATETIME) - 1) / 2 Grp,
t1.WH_IN_DATETIME, t1.WH_OUT_DATETIME, t1.WH_PID
FROM yourtable t1
CROSS JOIN (SELECT 0 AS [First] UNION SELECT 1) SetOrder
LEFT OUTER JOIN yourtable t2
ON t1.WH_PID = t2.WH_PID
AND ((DATEADD(d,1,t1.WH_OUT_DATETIME) BETWEEN t2.WH_IN_DATETIME AND t2.WH_OUT_DATETIME AND [First] = 0)
OR (DATEADD(d,1,t2.WH_OUT_DATETIME) BETWEEN t1.WH_IN_DATETIME AND t1.WH_OUT_DATETIME AND [First] = 1))
WHERE t2.WH_PID IS NULL
), RestrictedData AS (
SELECT WH_PID, MIN(WH_IN_DATETIME) AS WH_IN_DATETIME, MAX(WH_OUT_DATETIME) AS WH_OUT_DATETIME
FROM GroupedData
GROUP BY Grp, WH_PID
)
SELECT yourtable.ADM_ID, yourtable.WH_PID, RestrictedData.WH_IN_DATETIME AS EP_IN_DATETIME, RestrictedData.WH_OUT_DATETIME AS EP_OUT_DATETIME, yourtable.WH_IN_DATETIME, yourtable.WH_OUT_DATETIME
FROM RestrictedData
INNER JOIN yourtable
ON RestrictedData.WH_PID = yourtable.WH_PID
AND yourtable.WH_IN_DATETIME BETWEEN RestrictedData.WH_IN_DATETIME AND RestrictedData.WH_OUT_DATETIME
ORDER BY yourtable.ADM_ID
A LEFT OUTER JOIN and the DATEDIFF function should help you filter the records. Finally, use a window function to create the group IDs:
create table #test
(ADM_ID int,WH_PID int,WH_IN_DATETIME DATETIME,WH_OUT_DATETIME DATETIME)
INSERT #test
VALUES ( 1,9,'2014-10-12 00:00:00','2014-10-13 15:00:00'),
(2,9,'2014-10-14 14:00:00','2014-10-15 15:00:00'),
(3,9,'2014-10-16 14:00:00','2014-10-17 15:00:00'),
(1,10,'2014-10-16 14:00:00','2014-10-17 15:00:00'),
(2,10,'2014-10-18 14:00:00','2014-10-19 15:00:00')
SELECT Row_number()OVER(partition by a.WH_PID ORDER BY a.WH_IN_DATETIME) Group_Id,
a.WH_PID,
a.WH_IN_DATETIME,
b.WH_OUT_DATETIME
FROM #test a
LEFT JOIN #test b
ON a.WH_PID = b.WH_PID
AND a.ADM_ID <> b.ADM_ID
where Datediff(hh, a.WH_OUT_DATETIME, b.WH_IN_DATETIME)BETWEEN 0 AND 24
OUTPUT :
Group_Id WH_PID WH_IN_DATETIME WH_OUT_DATETIME
-------- ------ ----------------------- -----------------------
1 9 2014-10-12 00:00:00.000 2014-10-15 15:00:00.000
2 9 2014-10-14 14:00:00.000 2014-10-17 15:00:00.000
1 10 2014-10-16 14:00:00.000 2014-10-19 15:00:00.000

SQL to Return missing Row

I have a scenario where I need to find missing records in a table using SQL, without using cursors, views or stored procedures.
For a particular CustID, the initial Start_Date will be 19000101 and the End_Date will be any random date.
The next record for the same CustID will have its Start_Date equal to the End_Date of the previous record + 1.
Its End_Date will again be any random date.
And so on...
For the last record of the same CustID, the End_Date will be 99991231.
Following population of data will explain it better.
CustID Start_Date End_Date
1 19000101 20121231
1 20130101 20130831
1 20130901 20140321
1 20140321 99991231
Basically I am trying to populate data like in SCD2 scenario.
Now I want to find the missing record (or its CustID).
For example, below there is no record for CustID = 4 with Start_Date = 20120606 and End_Date = 20140101:
CustID Start_Date End_Date
4 19000101 20120605
4 20140102 99991231
Code for Creating Table
CREATE TABLE TestTable
(
CustID int,
Start_Date int,
End_Date int
)
INSERT INTO TestTable values (1,19000101,20121231)
INSERT INTO TestTable values (1,20130101,20130831)
INSERT INTO TestTable values (1,20130901,20140321)
INSERT INTO TestTable values (1,20140321,99991231)
INSERT INTO TestTable values (2,19000101,99991213)
INSERT INTO TestTable values (3,19000101,20140202)
INSERT INTO TestTable values (3,20140203,99991231)
INSERT INTO TestTable values (4,19000101,20120605)
--INSERT INTO TestTable values (4,20120606,20140101) --Missing Value
INSERT INTO TestTable values (4,20140102,99991231)
Now the SQL should return CustID = 4, as it has a missing value.
My idea is based on the following logic. Let's treat 19000101 as 1 and 99991231 as 10. For each ID, if you subtract Start_Date from End_Date for every row and add the differences up, the total must equal 9 (10 - 1). You can do the same here:
SELECT ID, SUM(END_DATE - START_DATE) AS total FROM TABLE GROUP BY ID HAVING SUM(END_DATE - START_DATE) < (MAX_END_DATE - MIN_START_DATE)
You will want to find the function in your SQL dialect that gives the number of days between two dates and use that in the SUM part.
Let's take the following example:
1 1900 2003
1 2003 9999
2 1900 2222
2 2222 9977
3 1900 9999
The query will be evaluated as follows:
1: (2003 - 1900) + (9999 - 2003) = 8099
2: (2222 - 1900) + (9977 - 2222) = 8077
3: (9999 - 1900) = 8099
The HAVING clause will eliminate 1 and 3, giving you only 2, which is what you want.
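The arithmetic can be checked with a few lines of Python. This is only a sketch of the summing idea using the simplified numbers from the example; the function name incomplete_ids and its default bounds are assumptions, and it relies on each range starting exactly where the previous one ends.

```python
def incomplete_ids(rows, min_start=1900, max_end=9999):
    """Return ids whose ranges add up to less than the full span."""
    totals = {}
    for row_id, start, end in rows:
        # Accumulate the length of every range for this id.
        totals[row_id] = totals.get(row_id, 0) + (end - start)
    full_span = max_end - min_start
    return sorted(i for i, total in totals.items() if total < full_span)

rows = [(1, 1900, 2003), (1, 2003, 9999),
        (2, 1900, 2222), (2, 2222, 9977),
        (3, 1900, 9999)]
flagged = incomplete_ids(rows)
```

Only id 2 falls short of the full span, because its last range stops at 9977 instead of 9999.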
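If you just need the CustID then this will do:
SELECT t1.CustID
FROM TestTable t1
LEFT JOIN TestTable t2
ON t2.CustID = t1.CustID
-- the dates are yyyymmdd integers, so convert them before doing date arithmetic
AND DATEADD(D, -1, CONVERT(date, CAST(t2.Start_Date AS char(8)), 112)) = CONVERT(date, CAST(t1.End_Date AS char(8)), 112)
WHERE t2.CustID IS NULL
AND t1.End_Date <> 99991231 -- the final row has no successor
GROUP BY t1.CustID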
You need a row if one of the following conditions is met:
Not a final row (99991231) and no matching next row
Not a start row (19000101) and no matching previous row
You can left join to the same table to find previous and next rows and filter the results where you don't find a row by checking the column values for null:
-- note: + 1 and - 1 on yyyymmdd integers do not roll over at month ends (20121231 + 1 is 20121232)
SELECT t1.CustID, t1.Start_Date, t1.End_Date
FROM TestTable t1
LEFT JOIN TestTable tPrevious on tPrevious.CustID = t1.CustID
and tPrevious.End_Date = t1.Start_Date - 1
LEFT JOIN TestTable tNext on tNext.CustID = t1.CustID
and tNext.Start_Date = t1.End_Date + 1
WHERE (t1.End_Date <> 99991231 and tNext.CustID is null) -- no following
or (t1.Start_Date <> 19000101 and tPrevious.CustID is null) -- no previous
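The previous/next check can also be sketched in Python, converting the yyyymmdd integers to real dates so that "plus one day" crosses month and year boundaries correctly. The names to_date and rows_with_gaps are invented for this illustration, and the sample uses only customers 3 and 4 from the question.

```python
from datetime import datetime, timedelta

FIRST, LAST = 19000101, 99991231

def to_date(yyyymmdd):
    """Convert an integer such as 20140203 to a date."""
    return datetime.strptime(str(yyyymmdd), "%Y%m%d").date()

def rows_with_gaps(rows):
    """Return rows (cust_id, start, end) that lack a neighbouring row before or after."""
    starts = {(c, to_date(s)) for c, s, _ in rows}
    ends = {(c, to_date(e)) for c, _, e in rows}
    one_day = timedelta(days=1)
    flagged = []
    for cust, start, end in rows:
        # The LAST check short-circuits, so 9999-12-31 is never incremented.
        no_next = end != LAST and (cust, to_date(end) + one_day) not in starts
        no_prev = start != FIRST and (cust, to_date(start) - one_day) not in ends
        if no_next or no_prev:
            flagged.append((cust, start, end))
    return flagged

rows = [
    (3, 19000101, 20140202),
    (3, 20140203, 99991231),
    (4, 19000101, 20120605),
    (4, 20140102, 99991231),
]
flagged = rows_with_gaps(rows)
```

Both rows of customer 4 are flagged, since the range from 20120606 to 20140101 is missing between them, while customer 3 is left alone.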