I have a job list that indicates the work performed on any particular job. When work is done during the day then just one record is added and a work_type is included.
Work is not performed on a weekend. Jobs can have work done over a long period of time with the odd day here and there but at some point in its lifecycle it should have a period of work where it is being worked on consistently.
Our management would like to be able to highlight on a report any jobs where this longer period of work hasn't happened.
There are some other conditions around type of work and the team name but the main sticking point is the time issue.
So ... how do I find jobs that have not had a period of at least two consecutive weeks (10 working days) of consistent work?
In the following, job 164354 will not be included as it has the necessary 10 consecutive days (ignoring weekends), while job 214325 will be flagged as there is a gap on the 9th that broke the sequence of consecutive days.
JOB_ID W ACTION_DATE
---------- - -----------
164354 H 10-FEB-17
164354 H 13-FEB-17
164354 H 14-FEB-17
164354 H 15-FEB-17
164354 H 16-FEB-17
164354 H 17-FEB-17
164354 H 20-FEB-17
164354 H 21-FEB-17
164354 H 22-FEB-17
164354 H 23-FEB-17
164354 H 24-FEB-17
214325 H 01-MAR-17
214325 H 02-MAR-17
214325 H 03-MAR-17
214325 H 06-MAR-17
214325 H 07-MAR-17
214325 H 08-MAR-17
214325 H 10-MAR-17
214325 H 13-MAR-17
214325 H 14-MAR-17
214325 H 15-MAR-17
I have this query where I can produce consecutive groups with a number of days against each group but I am struggling to adapt it to span over the weekends. In other words the results below would ideally show a number of consecutive days of 10.
WITH
groups AS (
SELECT
ROW_NUMBER() OVER (ORDER BY action_date) AS rn,
action_date -ROW_NUMBER() OVER (ORDER BY action_date) AS grp,
action_date
FROM test_job_list
WHERE job_id = 164354
)
SELECT count(*) AS num_consec_dates,
min(action_date) AS earliest,
max(action_date) AS latest
FROM groups
group by grp
ORDER BY num_consec_dates desc, earliest desc
NUM_CONSEC_DATES EARLIEST  LATEST
---------------- --------- ---------
               5 20-FEB-17 24-FEB-17
               5 13-FEB-17 17-FEB-17
               1 10-FEB-17 10-FEB-17
You can determine the day of the week as an offset from the start of the ISO week (Monday = 0, Sunday = 6) using:
TRUNC( action_date ) - TRUNC( action_date, 'IW' )
And, using the LAG analytic function you can then compare whether the previous entry is the previous working day and use this to determine the group:
Oracle Setup:
CREATE TABLE test_job_list ( JOB_ID, W, ACTION_DATE ) AS
SELECT 164354, 'H', DATE '2017-02-10' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-13' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-14' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-15' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-16' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-17' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-20' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-21' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-22' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-23' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-24' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-01' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-02' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-03' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-06' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-07' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-08' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-10' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-13' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-14' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-15' FROM DUAL;
Query:
SELECT job_id,
MIN( action_date ) AS start_date,
MAX( action_date ) AS end_date,
COUNT( 1 ) AS num_days
FROM (
SELECT job_id,
action_date,
SUM( has_changed_group ) OVER ( PARTITION BY job_id ORDER BY action_date )
AS group_id
FROM (
SELECT job_id,
action_date,
CASE WHEN
LAG( action_date ) OVER ( PARTITION BY job_id ORDER BY action_date )
= action_date - CASE TRUNC( action_date ) - TRUNC( action_date, 'IW' )
WHEN 0 THEN 3 ELSE 1 END
THEN 0
ELSE 1
END AS has_changed_group
FROM test_job_list
)
)
GROUP BY job_id, group_id
-- HAVING COUNT(1) >= 10;
Output:
JOB_ID START_DATE END_DATE NUM_DAYS
---------- ------------------- ------------------- ----------
164354 2017-02-10 00:00:00 2017-02-24 00:00:00 11
214325 2017-03-10 00:00:00 2017-03-15 00:00:00 4
214325 2017-03-01 00:00:00 2017-03-08 00:00:00 6
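To sanity-check the grouping logic outside the database, here is a minimal Python sketch of the same idea: each date is compared with the previous working day (3 days back on a Monday, otherwise 1), and a new run starts whenever they differ. The function names and the in-memory lists are illustrative assumptions, not part of the query above, and the sketch assumes one row per job per day with no weekend rows.

```python
from datetime import date, timedelta

def previous_working_day(d):
    # Monday (weekday() == 0) looks back 3 days to Friday; other days look back 1.
    return d - timedelta(days=3 if d.weekday() == 0 else 1)

def working_day_runs(dates):
    """Split dates into runs of consecutive working days.

    Returns a list of (start, end, num_days) tuples in date order.
    """
    runs = []
    for d in sorted(dates):
        if runs and runs[-1][1] == previous_working_day(d):
            start, _, n = runs[-1]
            runs[-1] = (start, d, n + 1)   # extend the current run
        else:
            runs.append((d, d, 1))         # gap found: start a new run
    return runs

job_164354 = [date(2017, 2, d) for d in (10, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24)]
job_214325 = [date(2017, 3, d) for d in (1, 2, 3, 6, 7, 8, 10, 13, 14, 15)]

runs_a = working_day_runs(job_164354)
runs_b = working_day_runs(job_214325)
for run in runs_a + runs_b:
    print(run)
```

Filtering the runs on `num_days >= 10` corresponds to the commented-out HAVING clause.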
Alternative:
If you just want the jobs where there has never been a period of 10 consecutive working days then you can use the COUNT() analytic function and specify a RANGE window:
SELECT job_id
FROM (
SELECT job_id,
COUNT( 1 ) OVER ( PARTITION BY job_id
ORDER BY action_date
RANGE BETWEEN INTERVAL '13' DAY PRECEDING
AND INTERVAL '0' DAY FOLLOWING )
AS num_days
FROM test_job_list
)
GROUP BY job_id
HAVING MAX( num_days ) < 10;
Output:
JOB_ID
----------
214325
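The windowed count can be sketched in Python as well. This is only an illustration of the reasoning, assuming one row per job per working day: any 14 calendar days contain exactly 10 weekdays, so a count of 10 inside such a window means there was no gap. The function name `lacks_two_week_run` and the `jobs` dictionary are assumptions made for the example.

```python
from datetime import date, timedelta

def lacks_two_week_run(dates, window_days=14, required=10):
    """True if no 14-calendar-day window ending on a worked date holds 10 rows."""
    best = 0
    for end in dates:
        start = end - timedelta(days=window_days - 1)  # 13 days preceding
        in_window = sum(1 for d in dates if start <= d <= end)
        best = max(best, in_window)
    return best < required

jobs = {
    164354: [date(2017, 2, d) for d in (10, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24)],
    214325: [date(2017, 3, d) for d in (1, 2, 3, 6, 7, 8, 10, 13, 14, 15)],
}

flagged = [job_id for job_id, dates in jobs.items() if lacks_two_week_run(dates)]
print(flagged)
```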
Edit 2
First version had many issues, this one should work.
An option is to join the table to itself on job_id, keeping on the right side only the rows from the two weeks up to and including the date on the left side. Then you can count the remaining dates.
select JOB_ID
from (
select g1.JOB_ID, count(g2.ACTION_DATE) CNT
from GROUPS g1
join GROUPS g2
on g1.JOB_ID = g2.JOB_ID
where g2.ACTION_DATE between g1.ACTION_DATE - 13 and g1.ACTION_DATE
group by g1.JOB_ID, g1.ACTION_DATE
) t1
group by JOB_ID
having max(CNT) < 10
I know this solution is long, but you can see all the details of the query by executing it step by step.
create table calendar1 as
select day_id,WEEK_DAY_SHORT,day_num_of_week from VITDWH.DW_MIS_TAKVIM as calendar order by day_id;
CREATE TABLE JOB_LIST (JOB_ID NUMBER,ACTION_DATE DATE);
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('10-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('13-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('14-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('15-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('16-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('17-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('20-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('21-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('22-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('23-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('24-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('01-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('02-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('03-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('06-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('07-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('08-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('10-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('13-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('14-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('15-MAR-17','DD-MON-YY'));
COMMIT;
with a1 as
(
select A.JOB_ID,A.ACTION_DATE,B.DAY_ID,
(case when action_date is not null and lag(action_date) over(partition by job_id order by day_id) is null then action_date else null end) start_date,
(case when action_date is not null and lead(action_date) over(partition by job_id order by day_id) is null then action_date else null end) max_date
from
(
select * from calendar1
WHERE DAY_ID >=(select MIN(ACTION_DATE) from JOB_LIST)
AND DAY_ID <= (select MAX(ACTION_DATE) from JOB_LIST)
ORDER BY DAY_ID
)
B LEFT OUTER JOIN
JOB_LIST A
PARTITION BY (A.JOB_ID) ON (A.ACTION_DATE= B.DAY_ID)
ORDER BY A.JOB_ID,DAY_ID
)
,a2 as
(
select * from a1 where start_date is not null or max_date is not null
)
,a3 as
(
select a2.*,lead(max_date) over(partition by job_id order by day_id) end_date
from a2
)
select a.job_id,a.start_date,nvl(a.maX_date,a.end_date) end_date, (nvl(a.maX_date,a.end_date) -a.start_date) +1 date_count
from a3 a where start_date is not null;
Ten working days make up two full weeks. If you wanted 11 days, you could look at the date 10 rows back and check whether it is exactly two weeks (14 days) earlier:
select tjl.*,
lag(action_date, 10) over (partition by job_id order by action_date) as minad_2weeks
from test_job_list tjl;
A similar trick works for 10 days: look 9 rows back instead. You can then get the jobs with no such period by using aggregation:
select job_id
from (select tjl.*,
lag(action_date, 9) over (partition by job_id order by action_date) as lag9_ad
from test_job_list tjl
) tjl
group by job_id
having min(action_date - lag9_ad) > 13 or min(action_date - lag9_ad) is null;
That is, if the 9th date back is within the past two weeks (at most 13 days earlier), then there are two full weeks of dates. A job is flagged only when that never happens, which includes jobs with fewer than 10 rows.
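The "9th date back" check is easy to try out in Python. This is a sketch under two assumptions of mine: "within the past two weeks" is read as at most 13 calendar days earlier, and there is one row per job per working day. The function name `has_ten_day_run` is illustrative.

```python
from datetime import date

def has_ten_day_run(dates):
    """True if some date has its 9th predecessor at most 13 days earlier."""
    dates = sorted(dates)
    # dates[i - 9] plays the role of LAG(action_date, 9) for row i.
    return any((dates[i] - dates[i - 9]).days <= 13 for i in range(9, len(dates)))

job_164354 = [date(2017, 2, d) for d in (10, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24)]
job_214325 = [date(2017, 3, d) for d in (1, 2, 3, 6, 7, 8, 10, 13, 14, 15)]

print(has_ten_day_run(job_164354))
print(has_ten_day_run(job_214325))
```

Note that a job with fewer than 10 rows never enters the loop, so it is reported as having no such run.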
I need a SELECT that finds rows with overlapping dates in Oracle SQL, limited to rows created between today and exactly one year ago. ID_FORMULAR is not a UNIQUE value, and I need to include only the rows with overlapping dates where ID_FORMULAR is UNIQUE.
My code:
SELECT T1.*
FROM VISITORS T1, VISITORS T2
WHERE ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.FROM_DATE >= t2.FROM_DATE
AND t1.FROM_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.FROM_DATE
AND t1.TO_DATE <= t2.TO_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
OR ( T1.ID_FORMULAR != T2.ID_FORMULAR
AND t1.TO_DATE >= t2.TO_DATE
AND t1.FROM_DATE <= t2.FROM_DATE
AND T1.CREATED_DATE >= ADD_MONTHS (TRUNC (CURRENT_DATE), -12)
AND T1.CREATED_DATE < TRUNC (CURRENT_DATE) + 1)
It is not working correctly. Any help?
From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:
SELECT *
FROM (
SELECT *
FROM visitors
WHERE created_date >= ADD_MONTHS(TRUNC(CURRENT_DATE), -12)
AND created_date < TRUNC(CURRENT_DATE) + 1
)
MATCH_RECOGNIZE(
ORDER BY from_date
ALL ROWS PER MATCH
PATTERN (any_row overlap+)
DEFINE
overlap AS PREV(id_formular) != id_formular
AND PREV(to_date) >= from_date
)
Which, for the sample data:
CREATE TABLE visitors (id_formular, created_date, from_date, to_date) AS
SELECT 1, DATE '2022-08-01', DATE '2022-08-01', DATE '2022-08-03' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-02', DATE '2022-08-04' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-01', DATE '2022-08-03', DATE '2022-08-05' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-06', DATE '2022-08-06' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-07', DATE '2022-08-09' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-01', DATE '2022-08-08', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 1, DATE '2022-08-01', DATE '2022-08-09', DATE '2022-08-11' FROM DUAL;
Outputs:
FROM_DATE ID_FORMULAR CREATED_DATE TO_DATE
--------- ----------- ------------ ---------
01-AUG-22           1 01-AUG-22    03-AUG-22
02-AUG-22           2 01-AUG-22    04-AUG-22
03-AUG-22           3 01-AUG-22    05-AUG-22
08-AUG-22           2 01-AUG-22    10-AUG-22
09-AUG-22           1 01-AUG-22    11-AUG-22
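For readers without access to MATCH_RECOGNIZE, the pattern `any_row overlap+` can be emulated in a few lines of Python. This is only a sketch of how the match proceeds, assuming the default behaviour of resuming after the last row of each match; the function name `overlapping_chains` is made up, and the `created_date` filter is assumed to have been applied already.

```python
from datetime import date

def overlapping_chains(rows):
    """rows: (id_formular, from_date, to_date) tuples.

    Returns the rows that belong to a chain of at least two rows in which
    each row has a different id from the previous one and starts on or
    before the previous row's to_date.
    """
    rows = sorted(rows, key=lambda r: r[1])          # ORDER BY from_date
    matched, i = [], 0
    while i < len(rows):
        j = i + 1
        # Greedily consume "overlap" rows after the starting "any_row".
        while (j < len(rows)
               and rows[j - 1][0] != rows[j][0]
               and rows[j - 1][2] >= rows[j][1]):
            j += 1
        if j - i >= 2:                               # any_row plus one or more overlaps
            matched.extend(rows[i:j])
            i = j                                    # resume after the match
        else:
            i += 1
    return matched

visitors = [
    (1, date(2022, 8, 1), date(2022, 8, 3)),
    (2, date(2022, 8, 2), date(2022, 8, 4)),
    (3, date(2022, 8, 3), date(2022, 8, 5)),
    (1, date(2022, 8, 6), date(2022, 8, 6)),
    (2, date(2022, 8, 7), date(2022, 8, 9)),
    (2, date(2022, 8, 8), date(2022, 8, 10)),
    (1, date(2022, 8, 9), date(2022, 8, 11)),
]

result = overlapping_chains(visitors)
for row in result:
    print(row)
```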
I don't quite understand the question. What confuses me is that you want only rows where the ID is unique: if an ID is unique, then there is no other row for it to overlap with. Anyway, let's suppose that the sample data is like below:
WITH
tbl AS
(
SELECT 0 "ID", DATE '2021-07-01' "CREATED", DATE '2021-07-01' "DATE_FROM", DATE '2021-07-13' "DATE_TO" FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-01', DATE '2021-12-01', DATE '2021-12-03' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-04', DATE '2021-12-04', DATE '2021-12-14' FROM DUAL UNION ALL
SELECT 1, DATE '2021-12-12', DATE '2021-12-12', DATE '2021-12-29' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-04', DATE '2022-08-04', DATE '2022-08-10' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-21' FROM DUAL UNION ALL
SELECT 2, DATE '2022-08-21', DATE '2022-08-21', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 3, DATE '2022-08-11', DATE '2022-08-11', DATE '2022-08-29' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-14', DATE '2022-08-14', DATE '2022-08-14' FROM DUAL UNION ALL
SELECT 4, DATE '2022-08-29', DATE '2022-08-14', DATE '2022-08-29' FROM DUAL
)
We can add some columns that tell us whether the ID is unique, the order of appearance of rows with the same ID, the end date of the previous row for the same ID, and whether the rows of a particular ID overlap. Here is the code (it uses analytic functions with a windowing clause):
SELECT
ID "ID",
CASE WHEN Count(*) OVER (PARTITION BY ID ORDER BY ID) = 1 THEN 'Y' ELSE 'N' END "IS_UNIQUE",
Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) "ID_ORDER_NO",
CREATED "CREATED",
DATE_FROM "DATE_FROM",
DATE_TO "DATE_TO",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN Null
ELSE
First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
END "PREVIOUS_END_DATE",
CASE
WHEN Count(ID) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
THEN 'N'
ELSE
CASE
WHEN DATE_FROM <= First_Value(DATE_TO) OVER (PARTITION BY ID ORDER BY ID, DATE_FROM, DATE_TO ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
THEN 'Y'
ELSE 'N'
END
END "OVERLAPS"
FROM
TBL
WHERE
CREATED BETWEEN ADD_MONTHS(TRUNC(SYSDATE, 'dd'), -12) And TRUNC(SYSDATE, 'dd')
Here is the resulting dataset...
/* R e s u l t
ID IS_UNIQUE ID_ORDER_NO CREATED DATE_FROM DATE_TO PREVIOUS_END_DATE OVERLAPS
---------- --------- ----------- --------- --------- --------- ----------------- --------
1 N 1 01-DEC-21 01-DEC-21 03-DEC-21 N
1 N 2 04-DEC-21 04-DEC-21 14-DEC-21 03-DEC-21 N
1 N 3 12-DEC-21 12-DEC-21 29-DEC-21 14-DEC-21 Y
2 N 1 04-AUG-22 04-AUG-22 10-AUG-22 N
2 N 2 11-AUG-22 11-AUG-22 21-AUG-22 10-AUG-22 N
2 N 3 21-AUG-22 21-AUG-22 29-AUG-22 21-AUG-22 Y
3 Y 1 11-AUG-22 11-AUG-22 29-AUG-22 N
4 N 1 14-AUG-22 14-AUG-22 14-AUG-22 N
4 N 2 29-AUG-22 14-AUG-22 29-AUG-22 14-AUG-22 Y
*/
This dataset could be used further to get the rows and columns that you are after. You can filter it, do other calculations (like the number of overlapping days), get the number of rows per ID, and so on.
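The PREVIOUS_END_DATE and OVERLAPS columns boil down to "remember the previous row's end date within each ID". A rough Python equivalent, with the function name `flag_overlaps` and the tuple layout chosen only for this example, could look like this:

```python
from datetime import date
from itertools import groupby

def flag_overlaps(rows):
    """rows: (id, date_from, date_to) tuples.

    Returns (id, date_from, date_to, previous_end_date, overlaps) tuples,
    ordered by id, date_from, date_to.
    """
    out = []
    for _, group in groupby(sorted(rows), key=lambda r: r[0]):
        prev_end = None                      # no previous row for the first row of an ID
        for id_, d_from, d_to in group:
            overlaps = prev_end is not None and d_from <= prev_end
            out.append((id_, d_from, d_to, prev_end, "Y" if overlaps else "N"))
            prev_end = d_to
    return out

tbl = [
    (1, date(2021, 12, 1), date(2021, 12, 3)),
    (1, date(2021, 12, 4), date(2021, 12, 14)),
    (1, date(2021, 12, 12), date(2021, 12, 29)),
    (3, date(2022, 8, 11), date(2022, 8, 29)),
    (4, date(2022, 8, 14), date(2022, 8, 14)),
    (4, date(2022, 8, 14), date(2022, 8, 29)),
]

flags = [row[4] for row in flag_overlaps(tbl)]
print(flags)
```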
Regards...
The problem I am facing is how to find distinct time periods from multiple overlapping time periods in Teradata ANSI SQL.
For example, the table below contains multiple overlapping time periods. How can I combine them into 3 unique time periods in Teradata SQL?
I think I can do it in Python with a loop, but I am not sure how to do it in SQL.
ID  Start Date End Date
--- ---------- ----------
001 2005-01-01 2006-01-01
001 2005-01-01 2007-01-01
001 2008-01-01 2008-06-01
001 2008-04-01 2008-12-01
001 2010-01-01 2010-05-01
001 2010-04-01 2010-12-01
001 2010-11-01 2012-01-01
My expected result is:
ID  start_Date end_date
--- ---------- ----------
001 2005-01-01 2007-01-01
001 2008-01-01 2008-12-01
001 2010-01-01 2012-01-01
From Oracle 12, you can use MATCH_RECOGNIZE to perform a row-by-row comparison:
SELECT *
FROM table_name
MATCH_RECOGNIZE(
PARTITION BY id
ORDER BY start_date
MEASURES
FIRST(start_date) AS start_date,
MAX(end_date) AS end_date
ONE ROW PER MATCH
PATTERN (overlapping_ranges* last_range)
DEFINE overlapping_ranges AS NEXT(start_date) <= MAX(end_date)
)
Which, for the sample data:
CREATE TABLE table_name (ID, Start_Date, End_Date) AS
SELECT '001', DATE '2005-01-01', DATE '2006-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2005-01-01', DATE '2007-01-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-01-01', DATE '2008-06-01' FROM DUAL UNION ALL
SELECT '001', DATE '2008-04-01', DATE '2008-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-01-01', DATE '2010-05-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-04-01', DATE '2010-12-01' FROM DUAL UNION ALL
SELECT '001', DATE '2010-11-01', DATE '2012-01-01' FROM DUAL;
Outputs:
ID  START_DATE          END_DATE
--- ------------------- -------------------
001 2005-01-01 00:00:00 2007-01-01 00:00:00
001 2008-01-01 00:00:00 2008-12-01 00:00:00
001 2010-01-01 00:00:00 2012-01-01 00:00:00
Update: Alternative query
SELECT id,
start_date,
end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, ROWNUM) * type AS cnt
FROM table_name
UNPIVOT (dt FOR type IN (start_date AS 1, end_date AS -1))
)
WHERE cnt IN (1,0)
)
PIVOT (MAX(dt) FOR cnt IN (1 AS start_date, 0 AS end_date))
Or, an equivalent that does not use UNPIVOT, PIVOT or ROWNUM and works in both Oracle and PostgreSQL:
SELECT id,
MAX(CASE cnt WHEN 1 THEN dt END) AS start_date,
MAX(CASE cnt WHEN 0 THEN dt END) AS end_date
FROM (
SELECT id,
dt,
SUM(cnt) OVER (PARTITION BY id ORDER BY dt) AS grp,
cnt
FROM (
SELECT ID,
dt,
SUM(type) OVER (PARTITION BY id ORDER BY dt, rn) * type AS cnt
FROM (
SELECT r.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt ASC, type DESC) AS rn
FROM (
SELECT id, 1 AS type, start_date AS dt FROM table_name
UNION ALL
SELECT id, -1 AS type, end_date AS dt FROM table_name
) r
) p
) s
WHERE cnt IN (1,0)
) t
GROUP BY id, grp
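The +1/-1 queries above are a sweep over start and end events. A small Python sketch of that sweep for a single id may make the counting easier to follow. The function name `merge_by_sweep` is an assumption of mine; starts are sorted before ends on the same date, mirroring `ORDER BY dt ASC, type DESC`.

```python
from datetime import date

def merge_by_sweep(ranges):
    """Merge overlapping (start, end) ranges with a running count of open ranges."""
    events = []
    for start, end in ranges:
        events.append((start, 0, 1))    # sort key 0: a start sorts before an end on the same date
        events.append((end, 1, -1))
    events.sort()

    merged, open_count, island_start = [], 0, None
    for dt, _, delta in events:
        open_count += delta
        if delta == 1 and open_count == 1:
            island_start = dt                    # count went 0 -> 1: a merged period begins
        elif delta == -1 and open_count == 0:
            merged.append((island_start, dt))    # count back to 0: the period ends
    return merged

table_name = [
    (date(2005, 1, 1), date(2006, 1, 1)),
    (date(2005, 1, 1), date(2007, 1, 1)),
    (date(2008, 1, 1), date(2008, 6, 1)),
    (date(2008, 4, 1), date(2008, 12, 1)),
    (date(2010, 1, 1), date(2010, 5, 1)),
    (date(2010, 4, 1), date(2010, 12, 1)),
    (date(2010, 11, 1), date(2012, 1, 1)),
]

periods = merge_by_sweep(table_name)
for period in periods:
    print(period)
```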
Update 2: Another Alternative
SELECT id,
MIN(start_date) AS start_date,
MAX(end_Date) AS end_date
FROM (
SELECT t.*,
SUM(CASE WHEN start_date <= prev_max THEN 0 ELSE 1 END)
OVER (PARTITION BY id ORDER BY start_date) AS grp
FROM (
SELECT t.*,
MAX(end_date) OVER (
PARTITION BY id ORDER BY start_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS prev_max
FROM table_name t
) t
) t
GROUP BY id, grp
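The running-maximum approach translates almost line for line into a loop, sketched here in Python for one id. The name `merge_by_running_max` is illustrative. The point of tracking the maximum end date seen so far, rather than just the previous row's end date, is that a long early range can swallow several later ones, as the second call shows.

```python
from datetime import date

def merge_by_running_max(ranges):
    """Merge (start, end) ranges by comparing each start with the running max end."""
    merged = []
    for start, end in sorted(ranges):            # ORDER BY start_date
        if merged and start <= merged[-1][1]:    # start_date <= prev_max: same group
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])          # gap: a new group starts
    return [tuple(r) for r in merged]

sample = [
    (date(2005, 1, 1), date(2006, 1, 1)),
    (date(2005, 1, 1), date(2007, 1, 1)),
    (date(2008, 1, 1), date(2008, 6, 1)),
    (date(2008, 4, 1), date(2008, 12, 1)),
    (date(2010, 1, 1), date(2010, 5, 1)),
    (date(2010, 4, 1), date(2010, 12, 1)),
    (date(2010, 11, 1), date(2012, 1, 1)),
]
nested = [
    (date(2020, 1, 1), date(2020, 12, 31)),
    (date(2020, 2, 1), date(2020, 3, 1)),
    (date(2020, 6, 1), date(2021, 2, 1)),
]

merged_sample = merge_by_running_max(sample)
merged_nested = merge_by_running_max(nested)
print(merged_sample)
print(merged_nested)
```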
This is a gaps and islands problem. Try this:
with u as
(select ID, start_date, end_date,
case
when start_date <= lag(end_date) over(partition by ID order by start_date, end_date) then 0
else 1 end as grp
from table_name),
v as
(select ID, start_date, end_date,
sum(grp) over(partition by ID order by start_date, end_date) as island
from u)
select ID, min(start_date) as start_Date, max(end_date) as end_date
from v
group by ID, island;
Basically, you can identify "islands" by comparing the start_date of the current row to the end_date of the previous row (ordered by start_date, end_date): if the start_date is on or before that end_date, the row belongs to the same island. Then you can do a rolling sum() over those flags to get the island numbers. Finally, select min(start_date) and max(end_date) from each island to get the desired output.
This may work with a little change to the function. I tried it in DBeaver:
select ID,Start_Date,End_Date
from
(
select t.*,
dense_rank () over(partition by extract (year from Start_Date) order BY End_Date desc) drnk
from testing_123 t
) temp
where temp.drnk = 1
ORDER BY Start_Date;
Try this
WITH a as (
SELECT
ID,
LEFT(Start_Date, 4) as Year,
MIN(Start_Date) as New_Start_Date
FROM
TAB1
GROUP BY
ID,
LEFT(Start_Date, 4)
), b as (
SELECT
a.ID,
Year,
New_Start_Date,
End_Date
FROM
a
LEFT JOIN
TAB1
ON LEFT(a.New_Start_Date, 4) = LEFT(TAB1.Start_Date, 4)
)
select
ID,
New_Start_Date as Start_Date,
MAX(End_Date)
from
b
GROUP BY
ID,
New_Start_Date;
Example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=97f91b68c635aebfb752538cdd752ace
I have a table with emplid and end_date columns. I want from all emplids the max end_dates. If at least one end_date is null, I want to have the null value as max. So in this example:
emplid end_date
1 05/04/2019
1 05/10/2019
1 null
2 05/04/2019
2 05/10/2019
I want as result:
emplid end_date
1 null
2 05/10/2019
I tried something like
select emplid,
CASE
WHEN MAX(NVL(end_Date,'01/01/3000'))='01/01/3000' THEN null
ELSE end_date
END as end_dt
from people
group by emplid
then I get a group-by error.
Maybe it is very easy, but I can't figure out how to get what I want.
with s(id, dt) as (
select 1, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 1, to_date('05/10/2019', 'dd/mm/yyyy') from dual union all
select 1, null from dual union all
select 2, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 2, to_date('05/10/2019', 'dd/mm/yyyy') from dual)
select id, decode(count(dt), count(*), max(dt)) max_dt
from s
group by id;
ID MAX_DT
---------- -----------------------------
1
2 2019-10-05 00:00:00
I would simply do:
select emplid,
(case when count(*) = count(end_date)
then max(end_date)
end) as max_end_date
from t
group by emplid;
There is no reason to introduce a "magic" maximum value (even if it is correct).
The first expression in the case is simply asking "do the number of non-NULL end-date values match the number of rows".
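The same comparison can be written out in Python to see why it works: `COUNT(end_date)` skips NULLs while `COUNT(*)` does not, so the two only agree when no end date is missing. This is just an illustration; the function name `max_end_date` and the dictionary of sample rows are assumptions for the example.

```python
from datetime import date

def max_end_date(end_dates):
    """Return the latest date, or None if any value is missing (NULL)."""
    if not end_dates:
        return None
    non_null = sum(d is not None for d in end_dates)   # COUNT(end_date)
    if non_null == len(end_dates):                     # equals COUNT(*)
        return max(end_dates)
    return None                                        # at least one NULL, so NULL wins

people = {
    1: [date(2019, 4, 5), date(2019, 10, 5), None],
    2: [date(2019, 4, 5), date(2019, 10, 5)],
}

result = {emplid: max_end_date(dates) for emplid, dates in people.items()}
print(result)
```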
Try this
SELECT
EMPLID,
CASE WHEN END_DATE='01/01/3000' THEN NULL ELSE END_DATE END AS END_DT
FROM
(
SELECT EMPLID, MAX(END_DATE) AS END_DATE FROM
(
SELECT EMPLID, NVL(END_DATE,'01/01/3000') AS END_DATE FROM PEOPLE
)
GROUP BY EMPLID
);
CASE does not combine with GROUP BY like that; you have to get the max value using GROUP BY first and then evaluate the null values. Try the query below.
select empid, CASE WHEN NVL(eDate,'01-DEC-3000')='01-DEC-3000' THEN null ELSE edate end end_dt from (
select empid, MAX(NVL(eDate,'01-DEC-3000')) eDate
from
(select 1 empid, sysdate-100 edate from dual union all
select 1 empid, sysdate-10 edate from dual union all
select 1 empid, null edate from dual union all
select 2 empid, sysdate-105 edate from dual union all
select 2 empid, sysdate-1 edate from dual ) datad
group by empid);
I have below data
empid date amount
1 12-FEB-2017 10
1 12-FEB-2017 10
1 13-FEB-2017 10
1 14-FEB-2017 10
I need a query to return the total amount for a given id and date i.e, below result set
empid date amount
1 12-FEB-2017 20
1 13-FEB-2017 10
1 14-FEB-2017 10
But the thing is, I will be getting the date as input from the UI. If they pass a date, return the result for that date; if they don't pass a date, return the result for the most recent date.
Below is the query that I wrote, but it only partially works:
SELECT sum(amount),empid,date
FROM employee emp,
where
((date= :ddd) OR aum_valutn_dt = (select max(date) from emp))
AND emp.id = '1'
group by (empid,date)
Please help..
I think you could do something like this, but it is pretty inefficient and you should try to do it some other way, because it does extra work to get the most recent date:
select amt, empid, date
from
(
select amt, empid, date, rank() over (order by date desc) date_rank
from
(SELECT sum(amount) amt,empid,date
FROM employee emp
where emp.id = '1'
and (date = :ddd or :ddd is null)
group by empid, date)
)
where date = :ddd or (:ddd is null and date_rank=1)
Here's another option; scans TEST table twice so ... mind the performance.
SQL> with test (empid, datum, amount) as
2 (select 1, date '2017-02-12', 10 from dual union all
3 select 1, date '2017-02-12', 10 from dual union all
4 select 1, date '2017-02-13', 10 from dual union all
5 select 1, date '2017-02-14', 10 from dual
6 )
7 select t.empid, t.datum, sum(t.amount) sum_amount
8 from test t
9 where t.datum = (select max(t1.datum)
10 from test t1
11 where t1.empid = t.empid
12 and (t1.datum = to_date('&&par_datum', 'dd.mm.yyyy')
13 or '&&par_datum' is null)
14 )
15 group by t.empid, t.datum;
Enter value for par_datum: 13.02.2017
EMPID DATUM SUM_AMOUNT
---------- ---------- ----------
1 13.02.2017 10
SQL> undefine par_datum
SQL> /
Enter value for par_datum:
EMPID DATUM SUM_AMOUNT
---------- ---------- ----------
1 14.02.2017 10
SQL>
SELECT sum(amount),empid,date
FROM employee emp
where date = nvl(:ddd, (select max(date) from employee))
AND emp.id = '1'
group by empid, date
My solution is following:
with t (empid, datum, amount) as
(select 1, date '2017-02-12', 10 from dual union all
select 1, date '2017-02-12', 10 from dual union all
select 1, date '2017-02-13', 10 from dual union all
select 1, date '2017-02-14', 10 from dual
)
select empid, datum, s
from (select empid, datum, sum(amount) s, max(datum) over (partition by empid) md
from t
group by empid, datum)
where datum = nvl(to_date(:p, 'yyyy-mm-dd'), md);
Calculate the maximum date in the inner subquery and then, in the outer query, compare the date with nvl(to_date(:p, 'yyyy-mm-dd'), md). If the parameter is null, the date column is compared with the maximum date instead.
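The "use the parameter, else fall back to the latest date" logic is sketched below in Python. It is not a replacement for the query, just a way to see the NVL fallback in isolation; the function name `total_for` and its arguments are made up for the example.

```python
from collections import defaultdict
from datetime import date

def total_for(rows, empid, wanted=None):
    """Sum amounts per date for one employee; pick `wanted`, or the latest date if None.

    rows: (empid, date, amount) tuples. Returns (date, total) or None if nothing matches.
    """
    totals = defaultdict(int)
    for row_empid, d, amount in rows:
        if row_empid == empid:
            totals[d] += amount                  # SUM(amount) ... GROUP BY empid, datum
    if not totals:
        return None
    target = wanted if wanted is not None else max(totals)   # NVL(:p, md)
    return (target, totals[target]) if target in totals else None

rows = [
    (1, date(2017, 2, 12), 10),
    (1, date(2017, 2, 12), 10),
    (1, date(2017, 2, 13), 10),
    (1, date(2017, 2, 14), 10),
]

print(total_for(rows, 1, date(2017, 2, 12)))
print(total_for(rows, 1))
```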
I have a table named x . The data is as follows.
Account_num start_dt end_dt
A111326 02/01/2016 02/11/2016
A111326 02/12/2016 03/05/2016
A111326 03/02/2016 03/16/2016
A111331 02/28/2016 02/29/2016
A111331 02/29/2016 03/29/2016
A999999 08/25/2015 08/25/2015
A999999 12/19/2015 12/22/2015
A222222 11/06/2015 11/10/2015
A222222 05/16/2016 05/17/2016
Both A111326 and A111331 should be identified as contiguous data, and A999999 and A222222 should be identified as discontinuous data. In my code I currently use the following query to identify discontinuous data, but A111326 is also erroneously identified as discontinuous. Please help me modify the code below so that A111326 is not identified as discontinuous data. Thanks in advance for your help.
(SELECT account_num
FROM (SELECT account_num,
(MAX (
END_DT)
OVER (PARTITION BY account_num
ORDER BY START_DT))
START_DT,
(LEAD (
START_DT)
OVER (PARTITION BY account_num
ORDER BY START_DT))
END_DT
FROM x
WHERE (START_DT + 1) <=
(END_DT - 1))
WHERE START_DT < END_DT);
Oracle Setup:
CREATE TABLE accounts ( Account_num, start_dt, end_dt ) AS
SELECT 'A', DATE '2016-02-01', DATE '2016-02-11' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-02-12', DATE '2016-03-05' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-03-02', DATE '2016-03-16' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-28', DATE '2016-02-29' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-29', DATE '2016-03-29' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-08-25', DATE '2015-08-25' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-12-19', DATE '2015-12-22' FROM DUAL UNION ALL
SELECT 'D', DATE '2015-11-06', DATE '2015-11-10' FROM DUAL UNION ALL
SELECT 'D', DATE '2016-05-16', DATE '2016-05-17' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-01', DATE '2016-01-02' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-05', DATE '2016-01-06' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-03', DATE '2016-01-07' FROM DUAL;
Query:
WITH times ( account_num, dt, lvl ) AS (
SELECT Account_num, start_dt - 1, 1 FROM accounts
UNION ALL
SELECT Account_num, end_dt, -1 FROM accounts
)
, totals ( account_num, dt, total ) AS (
SELECT account_num,
dt,
SUM( lvl ) OVER ( PARTITION BY Account_num ORDER BY dt, lvl DESC )
FROM times
)
SELECT Account_num,
CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
THEN 'N'
ELSE 'Y'
END AS is_contiguous
FROM totals
GROUP BY Account_Num
ORDER BY Account_Num;
Output:
ACCOUNT_NUM IS_CONTIGUOUS
----------- -------------
A Y
B Y
C N
D N
E Y
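To see why counting zero totals detects gaps, here is a Python sketch of the same sweep. Each range contributes +1 on the day before it starts and -1 on its end date, so ranges that merely touch (end on one day, start on the next) still keep the running total above zero; the total drops to zero once at the very end, and any additional zero marks a gap. The function name `is_contiguous` and the dictionary layout are assumptions for the example.

```python
from datetime import date, timedelta

def is_contiguous(ranges):
    """True if the (start, end) ranges cover one unbroken span of days."""
    events = []
    for start, end in ranges:
        events.append((start - timedelta(days=1), 0, 1))   # key 0: +1 sorts first (lvl DESC)
        events.append((end, 1, -1))
    events.sort()

    total, zeros = 0, 0
    for _, _, lvl in events:
        total += lvl
        if total == 0:
            zeros += 1
    return zeros <= 1          # exactly one zero (the final one) means no gaps

accounts = {
    "A": [(date(2016, 2, 1), date(2016, 2, 11)), (date(2016, 2, 12), date(2016, 3, 5)),
          (date(2016, 3, 2), date(2016, 3, 16))],
    "B": [(date(2016, 2, 28), date(2016, 2, 29)), (date(2016, 2, 29), date(2016, 3, 29))],
    "C": [(date(2015, 8, 25), date(2015, 8, 25)), (date(2015, 12, 19), date(2015, 12, 22))],
    "D": [(date(2015, 11, 6), date(2015, 11, 10)), (date(2016, 5, 16), date(2016, 5, 17))],
    "E": [(date(2016, 1, 1), date(2016, 1, 2)), (date(2016, 1, 5), date(2016, 1, 6)),
          (date(2016, 1, 3), date(2016, 1, 7))],
}

contiguous = {name: is_contiguous(ranges) for name, ranges in accounts.items()}
print(contiguous)
```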
Alternative Query:
(It's exactly the same method just using UNPIVOT rather than UNION ALL.)
SELECT Account_num,
CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
THEN 'N'
ELSE 'Y'
END AS is_contiguous
FROM (
SELECT Account_num,
SUM( lvl ) OVER ( PARTITION BY Account_Num
ORDER BY CASE lvl WHEN 1 THEN dt - 1 ELSE dt END,
lvl DESC
) AS total
FROM accounts
UNPIVOT ( dt FOR lvl IN ( start_dt AS 1, end_dt AS -1 ) )
)
GROUP BY Account_Num
ORDER BY Account_Num;
WITH cte AS (
SELECT
AccountNumber
,CASE
WHEN
LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) IS NULL THEN 0
WHEN
LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) >= Start_Dt - 1 THEN 0
ELSE 1
END as discontiguous
FROM
#Table
)
SELECT
AccountNumber
,CASE WHEN SUM(discontiguous) > 0 THEN 'discontiguous' ELSE 'contiguous' END
FROM
cte
GROUP BY
AccountNumber;
One of your problems is that your desired contiguous result also includes overlapping date ranges in your example data set. For example, one A111326 row starts on 3/2/2016 but the row before it ends on 3/5/2016, meaning they overlap by 3 days.