How to generalize or parameterize the SQL query

I have a problem with my employee_det table, where I am categorizing employees' active status by year.
Example 1: if an employee joined on 01-01-2017 and was released from the company on 02-02-2018, he/she falls under the 2017 bucket.
Example 2: if an employee joined on 01-02-2018 and was released on 01-15-2019, he/she falls under the 2018 bucket.
If an employee joined on 01-01-2017 and is still with the company, he/she must fall under 2019.
I have written the following query, which gives me accurate results, but next year I will need to add one more entry to the WHERE condition. Is there a generalized way to solve this instead?
select emp_id, ename, year(effective_start_date) as year_bucket
from employee_det
where worker_status = 'Active'
and manager_name like '%srinivas%'
and (
( date(effective_start_date) <= '2017-12-31'
and date(effective_end_date)>='2017-12-31' )
or
( date(effective_start_date) <= '2018-12-31'
and date(effective_end_date)>='2018-12-31' )
or
( date(effective_start_date) <= current_date()
and date(effective_end_date)>=current_date() )
);

You seem to want the start year for employees who have ended and the current year for active employees. So:
select emp_id, ename,
(case when effective_end_date > current_date
then year(current_date)
else year(effective_start_date)
end) as year_bucket
from employee_det
where worker_status = 'Active' and
manager_name like '%srinivas%';
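The rule in this answer can be sanity-checked outside the database. The following Python sketch is only an illustration: the function name `year_bucket`, the fixed `today` value, and the far-future end date for the still-employed case are assumptions made for the example, not part of the original query.

```python
from datetime import date

def year_bucket(start: date, end: date, today: date) -> int:
    """Mirror the CASE expression: current year if the end date is
    still in the future, otherwise the year the employee started."""
    return today.year if end > today else start.year

# Hypothetical "current date", fixed so the example is reproducible.
today = date(2019, 3, 1)

print(year_bucket(date(2017, 1, 1), date(2018, 2, 2), today))    # 2017
print(year_bucket(date(2018, 1, 2), date(2019, 1, 15), today))   # 2018
print(year_bucket(date(2017, 1, 1), date(9999, 12, 31), today))  # 2019
```

Because the bucket depends only on the current date, no new branch is needed when the year rolls over.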

Below is for BigQuery Standard SQL
#standardSQL
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
You can test and experiment with the above using dummy data, as in the example below
#standardSQL
WITH `project.dataset.employee_det` AS (
SELECT 1 emp_id, 'employee1' ename, DATE '2017-01-01' effective_start_date, DATE '2018-02-02' effective_end_date, 'Active' worker_status, 'srinivas' manager_name UNION ALL
SELECT 2, 'employee2', '2018-01-02', '2019-01-15', 'Active', 'srinivas' UNION ALL
SELECT 3, 'employee3', '2017-01-01', '2019-04-15', 'Active', 'srinivas'
)
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
with result
Row emp_id ename year_bucket
1 1 employee1 2017
2 2 employee2 2018
3 3 employee3 2019
Update: excluding employees whose start and end year are the same
You can use a single "generic" clause, as below:
WHERE EXTRACT(YEAR FROM effective_start_date) != EXTRACT(YEAR FROM effective_end_date)
So the whole query will now look like the example below:
#standardSQL
WITH `project.dataset.employee_det` AS (
SELECT 1 emp_id, 'employee1' ename, DATE '2017-01-01' effective_start_date, DATE '2018-02-02' effective_end_date, 'Active' worker_status, 'srinivas' manager_name UNION ALL
SELECT 2, 'employee2', '2018-01-02', '2019-01-15', 'Active', 'srinivas' UNION ALL
SELECT 3, 'employee3', '2017-01-01', '2019-04-15', 'Active', 'srinivas' UNION ALL
SELECT 4, 'employee4', '2017-01-01', '2017-04-15', 'Active', 'srinivas'
)
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
AND EXTRACT(YEAR FROM effective_start_date) != EXTRACT(YEAR FROM effective_end_date)
with result
Row emp_id ename year_bucket
1 1 employee1 2017
2 2 employee2 2018
3 3 employee3 2019
As you can see, employee4 is not included in any bucket.

Related

Find day of week with most hires

I am trying to find the day of the week (based on hire_date) on which the most employees were hired. In my test case below the answer should be Tuesday.
As you can see, I can list all the days, but I'm having a problem narrowing down the result to 1 row.
Any help would be greatly appreciated. I have listed my failed attempt below. If there is a more efficient way to write the query, I would welcome any input.
CREATE TABLE employees (employee_id, first_name, last_name, hire_date) AS
SELECT 1, 'Lisa', 'Saladino', DATE '2001-04-03' FROM DUAL UNION ALL
SELECT 2, 'Abby', 'Abbott', DATE '2001-04-04' FROM DUAL UNION ALL
SELECT 3, 'Beth', 'Cooper', DATE '2001-04-05' FROM DUAL UNION ALL
SELECT 4, 'Carol', 'Orr', DATE '2001-04-06' FROM DUAL UNION ALL
SELECT 5, 'Nancy', 'Turner', DATE '2001-04-07' FROM DUAL UNION ALL
SELECT 6, 'Cheryl', 'Ford', DATE '2001-04-08' FROM DUAL UNION ALL
SELECT 7, 'Leslee', 'Gold', DATE '2001-04-10' FROM DUAL UNION ALL
SELECT 8, 'Jill', 'Coralnick', DATE '2001-04-11' FROM DUAL UNION ALL
SELECT 9, 'Faith', 'Aaron', DATE '2001-04-17' FROM DUAL;
SELECT TO_CHAR(HIRE_DATE,'DAY') DAY, count(*) cnt FROM EMPLOYEES GROUP BY TO_CHAR(HIRE_DATE,'DAY')
DAY CNT
TUESDAY 3
FRIDAY 1
SUNDAY 1
SATURDAY 1
WEDNESDAY 2
THURSDAY 1
/* not working */
SELECT e.*
FROM EMPLOYEES e
INNER JOIN
(SELECT employee_id, TO_CHAR(HIRE_DATE,'DAY') DAY
FROM EMPLOYEES
GROUP BY TO_CHAR(HIRE_DATE,'DAY')
HAVING COUNT(1)=(SELECT MAX(COUNT(1))FROM EMPLOYEES GROUP BY TO_CHAR(HIRE_DATE,'DAY'))) AS empdays
ON TO_CHAR(e.HIRE_DATE, 'DAY') = empdays.DAY;
You need neither a subquery nor a join if the database version is 12c or later; just sort the counts in descending order and use the FETCH clause after ORDER BY, such as:
SELECT TO_CHAR(hire_date, 'DAY') AS day, COUNT(*) AS cnt
FROM employees
GROUP BY TO_CHAR(hire_date, 'DAY')
ORDER BY cnt DESC
FETCH FIRST 1 ROW WITH TIES
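To see what "first row with ties" means for this data, here is a small Python sketch of the same idea. It is an illustration only: the names `DAY_NAMES`, `counts` and `winners` are made up for the example, and the dates are copied from the question's test case.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]

hire_dates = [
    date(2001, 4, 3), date(2001, 4, 4), date(2001, 4, 5),
    date(2001, 4, 6), date(2001, 4, 7), date(2001, 4, 8),
    date(2001, 4, 10), date(2001, 4, 11), date(2001, 4, 17),
]

# GROUP BY day name, COUNT(*)
counts = Counter(DAY_NAMES[d.weekday()] for d in hire_dates)

# Keep every day that shares the highest count (the WITH TIES part).
top = max(counts.values())
winners = sorted(day for day, n in counts.items() if n == top)

print(winners, top)  # ['TUESDAY'] 3
```

If two days shared the highest count, both would appear in `winners`, which is the behaviour WITH TIES gives over a plain one-row limit.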
This works for me:
select DAY, cnt
from (SELECT TO_CHAR(HIRE_DATE,'DAY') DAY
,count(*) cnt
FROM EMPLOYEES
GROUP BY TO_CHAR(HIRE_DATE,'DAY'))
where cnt = (select max(cnt)
from (SELECT TO_CHAR(HIRE_DATE,'DAY') DAY
,count(*) cnt
FROM EMPLOYEES
GROUP BY TO_CHAR(HIRE_DATE,'DAY')))
Produces following results
DAY CNT
Tuesday 3
Refer to this db<>fiddle
In a comment, you asked for dense_rank example; here it is:
with temp as
(select to_char(hire_date, 'Day') day,
count(*) cnt,
dense_rank() over (order by count(*) desc) rnk
from employees
group by to_char(hire_date, 'Day')
)
select day, cnt
from temp
where rnk = 1;
DAY CNT
------------------------------------ ----------
Tuesday 3

SQL: How to use DECODE and COUNT functions together

I'm working on a homework problem, but we haven't learned about the DECODE function, only CASE. This week's unit is about using aggregate functions. Below is my homework question and what my professor wants as result-
"The following two questions are very challenging (you need to use DECODE function to complete them).
Create a query that will display the total number of employees and of that total
the number who were hired in the year 1980, 1981, 1982, and 1987. Give
appropriate column headings. (5 Points)"
Total 1980 1981 1982 1987
----- ----- ----- ----- -----
14 1 10 1 2
Here is the query I ran on the server, along with its result. I only tried the year 1980 so I don't waste time, but I also need 1981, 1982, and 1987:
SELECT COUNT(ename) AS "Total",
COUNT(DECODE(hiredate, '80', '1980'))
FROM emp;
Total COUNT(DECODE(HIREDATE,'80','1980'))
----- -----------------------------------
14 0
Here is the data in the 'hiredate' column:
HIREDATE
---------
17-NOV-81
01-MAY-81
09-JUN-81
02-APR-81
28-SEP-81
20-FEB-81
08-SEP-81
03-DEC-81
22-FEB-81
03-DEC-81
17-DEC-80
09-DEC-82
12-JAN-83
23-JAN-82
14 rows selected.
Thank you for anyone's help!
Assuming that the hiredate column has the data type DATE, let's do it without DECODE and use PIVOT instead:
Oracle Setup:
CREATE TABLE emp ( ename, hiredate ) AS
SELECT 'A', DATE '1980-01-01' FROM DUAL UNION ALL
SELECT 'B', DATE '1981-01-01' FROM DUAL UNION ALL
SELECT 'C', DATE '1981-02-01' FROM DUAL UNION ALL
SELECT 'D', DATE '1981-03-01' FROM DUAL UNION ALL
SELECT 'E', DATE '1981-04-01' FROM DUAL UNION ALL
SELECT 'F', DATE '1981-05-01' FROM DUAL UNION ALL
SELECT 'G', DATE '1981-06-01' FROM DUAL UNION ALL
SELECT 'H', DATE '1981-07-01' FROM DUAL UNION ALL
SELECT 'I', DATE '1981-08-01' FROM DUAL UNION ALL
SELECT 'J', DATE '1981-09-01' FROM DUAL UNION ALL
SELECT 'K', DATE '1981-10-01' FROM DUAL UNION ALL
SELECT 'L', DATE '1982-01-01' FROM DUAL UNION ALL
SELECT 'M', DATE '1987-01-01' FROM DUAL UNION ALL
SELECT 'N', DATE '1987-02-01' FROM DUAL;
Query:
SELECT *
FROM (
SELECT EXTRACT( YEAR FROM hiredate ) AS hireyear,
COUNT(*) OVER () AS "Total"
FROM emp
)
PIVOT ( COUNT(*) FOR hireyear IN ( 1980, 1981, 1982, 1987 ) )
Output:
Total | 1980 | 1981 | 1982 | 1987
----: | ---: | ---: | ---: | ---:
14 | 1 | 10 | 1 | 2
Query 2:
You can also do it with CASE:
SELECT COUNT(*) AS "Total",
COUNT(
CASE
WHEN hiredate >= DATE '1980-01-01' AND hiredate < DATE '1981-01-01'
THEN hiredate
END
) AS "1980",
COUNT(
CASE
WHEN hiredate >= DATE '1981-01-01' AND hiredate < DATE '1982-01-01'
THEN hiredate
END
) AS "1981",
COUNT(
CASE
WHEN hiredate >= DATE '1982-01-01' AND hiredate < DATE '1983-01-01'
THEN hiredate
END
) AS "1982",
COUNT(
CASE
WHEN hiredate >= DATE '1987-01-01' AND hiredate < DATE '1988-01-01'
THEN hiredate
END
) AS "1987"
FROM emp
Query 3:
Or with DECODE.
SELECT COUNT(*) AS "Total",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 'anything here' ) ) AS "1980",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1981, 'anything here' ) ) AS "1981",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1982, 'anything here' ) ) AS "1982",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1987, 'anything here' ) ) AS "1987"
FROM emp
Note: COUNT counts only the non-NULL values passed to it, so DECODE can return any non-NULL value on a match and the row will be counted. Conversely, when there is no match and no default is supplied as the final argument, DECODE returns NULL and the row is not counted.
So you could use any literal value, such as a string:
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 'anything here' ) ) AS "1980"
or a number
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 1 ) ) AS "1980"
or even the hiredate column (which if it has a year of 1980 then you know it isn't NULL):
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, hiredate ) ) AS "1980"
and if you want to be explicit about the return value from DECODE when it doesn't match then put in an extra NULL argument:
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 42, NULL ) ) AS "1980"
db<>fiddle here
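The NULL-skipping behaviour of COUNT described in the note is not specific to Oracle. As a rough check, the sketch below runs the same conditional count in SQLite through Python's standard `sqlite3` module. SQLite has no DECODE, so a CASE without ELSE stands in for it, and `strftime('%Y', ...)` stands in for EXTRACT; the helper `count_for` and the generated employee names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, hiredate TEXT)")

# Same year distribution as the sample data: 1 / 10 / 1 / 2.
hire_dates = (
    ["1980-01-01"]
    + [f"1981-{month:02d}-01" for month in range(1, 11)]
    + ["1982-01-01", "1987-01-01", "1987-02-01"]
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(f"emp{i}", d) for i, d in enumerate(hire_dates)],
)

def count_for(year: int) -> str:
    # CASE without ELSE yields NULL on a non-match, and COUNT skips NULLs.
    return f"COUNT(CASE WHEN strftime('%Y', hiredate) = '{year}' THEN 1 END)"

query = (
    "SELECT COUNT(*), "
    + ", ".join(count_for(y) for y in (1980, 1981, 1982, 1987))
    + " FROM emp"
)
totals = conn.execute(query).fetchone()
print(totals)  # (14, 1, 10, 1, 2)
```

The value returned on a match (here `1`) is arbitrary; only its being non-NULL matters.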
Your teacher wants you to do conditional aggregation:
SELECT
COUNT(ename) AS "Total",
SUM(DECODE(hiredate, 1980, 1, 0)) AS "1980",
SUM(DECODE(hiredate, 1981, 1, 0)) AS "1981",
SUM(DECODE(hiredate, 1982, 1, 0)) AS "1982",
SUM(DECODE(hiredate, 1987, 1, 0)) AS "1987"
FROM emp;
Within each SUM, DECODE() checks hiredate against the target value, so each column counts only the records that match its year.
This assumes that hiredate actually holds a year, which seems counterintuitive. If it is a DATE, then you would need to extract the year part, like so:
SELECT
COUNT(ename) AS "Total",
SUM(DECODE(EXTRACT(YEAR FROM hiredate), 1980, 1, 0)) AS "1980",
SUM(DECODE(EXTRACT(YEAR FROM hiredate), 1981, 1, 0)) AS "1981",
SUM(DECODE(EXTRACT(YEAR FROM hiredate), 1982, 1, 0)) AS "1982",
SUM(DECODE(EXTRACT(YEAR FROM hiredate), 1987, 1, 0)) AS "1987"
FROM emp;
Please note that DECODE() is an Oracle function that is not part of standard SQL. A more portable way to write this is to use CASE expressions, like:
SUM(CASE WHEN EXTRACT(YEAR FROM hiredate) = 1980 THEN 1 ELSE 0 END) AS "1980"
select "1981"+"1982"+"1983" Total, "1981","1982","1983" from (select
count(case to_char(hiredate,'YYYY') when '1981' then 1 end) "1981",
count(case to_char(hiredate,'YYYY') when '1982' then 1 end) "1982",
count(case to_char(hiredate,'YYYY') when '1983' then 1 end) "1983"
from emp);

sql oracle group by on dates with possibilities of null values

I have a table with emplid and end_date columns. For each emplid, I want the max end_date. If at least one end_date is null, I want null as the max. So in this example:
emplid end_date
1 05/04/2019
1 05/10/2019
1 null
2 05/04/2019
2 05/10/2019
I want as result:
emplid end_date
1 null
2 05/10/2019
I tried something like
select emplid,
CASE
WHEN MAX(NVL(end_Date,'01/01/3000'))='01/01/3000' THEN null
ELSE end_date
END as end_dt
from people
group by emplid
With this I get a group-by error.
Maybe it is very easy, but I can't figure out how to get what I want.
with s(id, dt) as (
select 1, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 1, to_date('05/10/2019', 'dd/mm/yyyy') from dual union all
select 1, null from dual union all
select 2, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 2, to_date('05/10/2019', 'dd/mm/yyyy') from dual)
select id, decode(count(dt), count(*), max(dt)) max_dt
from s
group by id;
ID MAX_DT
---------- -----------------------------
1
2 2019-10-05 00:00:00
I would simply do:
select emplid,
(case when count(*) = count(end_date)
then max(end_date)
end) as max_end_date
from t
group by emplid;
There is no reason to introduce a "magic" maximum value (even if it is correct).
The first expression in the case is simply asking "do the number of non-NULL end-date values match the number of rows".
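The same "NULL wins" rule can be written out in a few lines of Python to make the comparison of the two counts concrete. This is a sketch under assumptions: the function name `max_end_date` is invented for the example, and the sample dates are read as dd/mm/yyyy, as in the first answer.

```python
from collections import defaultdict
from datetime import date

def max_end_date(end_dates):
    """Return None if any end date is missing, otherwise the latest one.
    This is the count(*) = count(end_date) test expressed directly."""
    dates = list(end_dates)
    non_null = [d for d in dates if d is not None]
    if len(non_null) != len(dates) or not dates:
        return None
    return max(non_null)

rows = [
    (1, date(2019, 4, 5)), (1, date(2019, 10, 5)), (1, None),
    (2, date(2019, 4, 5)), (2, date(2019, 10, 5)),
]

# GROUP BY emplid
groups = defaultdict(list)
for emplid, end_date in rows:
    groups[emplid].append(end_date)

result = {emplid: max_end_date(ds) for emplid, ds in groups.items()}
print(result)  # {1: None, 2: datetime.date(2019, 10, 5)}
```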
Try this:
SELECT
EMPLID,
CASE WHEN END_DATE = DATE '3000-01-01' THEN NULL ELSE END_DATE END AS END_DT
FROM
(
SELECT EMPLID, MAX(END_DATE) AS END_DATE FROM
(
SELECT EMPLID, NVL(END_DATE, DATE '3000-01-01') AS END_DATE FROM PEOPLE
)
GROUP BY EMPLID
);
CASE itself works with GROUP BY; the error comes from the ELSE branch, which references end_date outside an aggregate function. One fix is to get the max value with GROUP BY first and then map the placeholder date back to null in an outer query. Try the following:
select empid, CASE WHEN eDate = DATE '3000-12-01' THEN null ELSE eDate end end_dt from (
select empid, MAX(NVL(eDate, DATE '3000-12-01')) eDate
from
(select 1 empid, sysdate-100 edate from dual union all
select 1 empid, sysdate-10 edate from dual union all
select 1 empid, null edate from dual union all
select 2 empid, sysdate-105 edate from dual union all
select 2 empid, sysdate-1 edate from dual ) datad
group by empid);

How to achieve multiple records output with based on single record

I have a table "Managers" that contains data as follows
I am expecting output like the below
or, alternatively, in this format
The conditions are:
manager 1001 joined in 2018 and has an end date of 9999, so he is active in 2018, 2019 and 2020
manager 1004 joined in 2018 and left the company in the same year, so he is active only in 2018
Please help me achieve this.
Build a list of years and JOIN with it:
SELECT manager_id, yearnum, 'Active' AS status
FROM UNNEST(GENERATE_ARRAY(2018, 2020)) AS yearnum
JOIN managers ON yearnum BETWEEN EXTRACT(year FROM eff_start_date)
AND EXTRACT(year FROM eff_end_date)
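The join against a generated list of years amounts to a filtered cross product. The Python sketch below shows that expansion for the two managers described in the question; the tuple layout `(manager_id, start_year, end_year)` and the fixed 2018 to 2020 range are assumptions made for the illustration.

```python
# (manager_id, start_year, end_year); 9999 marks an open-ended record.
managers = [
    (1001, 2018, 9999),
    (1004, 2018, 2018),
]

# Equivalent of GENERATE_ARRAY(2018, 2020): both bounds inclusive.
years = range(2018, 2021)

# One output row per (manager, year) pair where the year falls
# between the manager's start and end year.
rows = [
    (manager_id, year, "Active")
    for manager_id, start_year, end_year in managers
    for year in years
    if start_year <= year <= end_year
]

for row in rows:
    print(row)
# (1001, 2018, 'Active')
# (1001, 2019, 'Active')
# (1001, 2020, 'Active')
# (1004, 2018, 'Active')
```

Capping the year list is what keeps the 9999 end date from producing thousands of rows.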
Below is for BigQuery Standard SQL
#standardSQL
WITH years AS (
SELECT EXTRACT(YEAR FROM year) year
FROM ( SELECT
(SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date,
(SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date
), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
FROM `project.dataset.managers`
)
SELECT manager_id, year, status
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year
You can test and experiment with the above using sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.managers` AS (
SELECT 1001 manager_id, 'Active' status, DATE '2018-02-10' eff_start_date, DATE '9999-12-31' eff_end_date UNION ALL
SELECT 1002, 'Active', '2018-02-14', '2020-12-31' UNION ALL
SELECT 1003, 'Active', '2018-02-16', '2019-02-15' UNION ALL
SELECT 1004, 'Active', '2018-02-16', '2018-12-31'
), years AS (
SELECT EXTRACT(YEAR FROM year) year
FROM ( SELECT
(SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date,
(SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date
), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
FROM `project.dataset.managers`
)
SELECT manager_id, year, status
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year
-- ORDER BY manager_id, year
with result
Row manager_id year status
1 1001 2018 Active
2 1001 2019 Active
3 1001 2020 Active
4 1002 2018 Active
5 1002 2019 Active
6 1002 2020 Active
7 1003 2018 Active
8 1003 2019 Active
9 1004 2018 Active

Find consecutive dates spanning a weekend

I have a job list that indicates the work performed on any particular job. When work is done during the day then just one record is added and a work_type is included.
Work is not performed on a weekend. Jobs can have work done over a long period of time with the odd day here and there but at some point in its lifecycle it should have a period of work where it is being worked on consistently.
Our management would like to be able to highlight on a report any jobs where this longer period of work hasn't happened.
There are some other conditions around type of work and the team name but the main sticking point is the time issue.
So, how do I find jobs that have not had a period of at least two consecutive weeks (10 working days) of consistent work?
In the following, job 164354 will not be included as it has the necessary 10 consecutive days (ignoring weekends), while job 214325 will be flagged as there is a gap on the 9th that broke the sequence of consecutive days.
JOB_ID W ACTION_DATE
---------- - -----------
164354 H 10-FEB-17
164354 H 13-FEB-17
164354 H 14-FEB-17
164354 H 15-FEB-17
164354 H 16-FEB-17
164354 H 17-FEB-17
164354 H 20-FEB-17
164354 H 21-FEB-17
164354 H 22-FEB-17
164354 H 23-FEB-17
164354 H 24-FEB-17
214325 H 01-MAR-17
214325 H 02-MAR-17
214325 H 03-MAR-17
214325 H 06-MAR-17
214325 H 07-MAR-17
214325 H 08-MAR-17
214325 H 10-MAR-17
214325 H 13-MAR-17
214325 H 14-MAR-17
214325 H 15-MAR-17
I have this query where I can produce consecutive groups with a number of days against each group but I am struggling to adapt it to span over the weekends. In other words the results below would ideally show a number of consecutive days of 10.
WITH
groups AS (
SELECT
ROW_NUMBER() OVER (ORDER BY action_date) AS rn,
action_date -ROW_NUMBER() OVER (ORDER BY action_date) AS grp,
action_date
FROM test_job_list
WHERE job_id = 164354
)
SELECT count(*) AS num_consec_dates,
min(action_date) AS earliest,
max(action_date) AS latest
FROM groups
group by grp
ORDER BY num_consec_dates desc, earliest desc
NUM_CONSEC_DATES EARLIEST LATEST
---------------- --------- ---------
5 20-FEB-17 24-FEB-17
5 13-FEB-17 17-FEB-17
1 10-FEB-17 10-FEB-17
You can determine which day of the week it is using (monday = 0, sunday = 6):
TRUNC( action_date ) - TRUNC( action_date, 'IW' )
And, using the LAG analytic function you can then compare whether the previous entry is the previous working day and use this to determine the group:
Oracle Setup:
CREATE TABLE test_job_list ( JOB_ID, W, ACTION_DATE ) AS
SELECT 164354, 'H', DATE '2017-02-10' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-13' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-14' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-15' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-16' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-17' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-20' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-21' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-22' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-23' FROM DUAL UNION ALL
SELECT 164354, 'H', DATE '2017-02-24' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-01' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-02' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-03' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-06' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-07' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-08' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-10' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-13' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-14' FROM DUAL UNION ALL
SELECT 214325, 'H', DATE '2017-03-15' FROM DUAL;
Query:
SELECT job_id,
MIN( action_date ) AS start_date,
MAX( action_date ) AS end_date,
COUNT( 1 ) AS num_days
FROM (
SELECT job_id,
action_date,
SUM( has_changed_group ) OVER ( PARTITION BY job_id ORDER BY action_date )
AS group_id
FROM (
SELECT job_id,
action_date,
CASE WHEN
LAG( action_date ) OVER ( PARTITION BY job_id ORDER BY action_date )
= action_date - CASE TRUNC( action_date ) - TRUNC( action_date, 'IW' )
WHEN 0 THEN 3 ELSE 1 END
THEN 0
ELSE 1
END AS has_changed_group
FROM test_job_list
)
)
GROUP BY job_id, group_id
-- HAVING COUNT(1) >= 10;
Output:
JOB_ID START_DATE END_DATE NUM_DAYS
---------- ------------------- ------------------- ----------
164354 2017-02-10 00:00:00 2017-02-24 00:00:00 11
214325 2017-03-10 00:00:00 2017-03-15 00:00:00 4
214325 2017-03-01 00:00:00 2017-03-08 00:00:00 6
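The grouping logic of this query (start a new group whenever the previous row is not the previous working day) can be traced by hand with a short Python sketch. The function names `previous_working_day` and `working_day_runs` are invented for the example; `date.weekday()` returns 0 for Monday, which matches the `TRUNC(action_date, 'IW')` offset used above.

```python
from datetime import date, timedelta

def previous_working_day(d: date) -> date:
    # A Monday looks back 3 days to Friday; any other day looks back 1 day.
    return d - timedelta(days=3 if d.weekday() == 0 else 1)

def working_day_runs(dates):
    """Split dates into runs of consecutive working days.
    Returns (start, end, length) for each run."""
    runs = []
    for d in sorted(dates):
        if runs and runs[-1][-1] == previous_working_day(d):
            runs[-1].append(d)   # continues the current run
        else:
            runs.append([d])     # gap found: start a new run
    return [(r[0], r[-1], len(r)) for r in runs]

job_164354 = [date(2017, 2, day) for day in
              (10, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24)]
job_214325 = [date(2017, 3, day) for day in
              (1, 2, 3, 6, 7, 8, 10, 13, 14, 15)]

print(working_day_runs(job_164354))  # one run of 11 days
print(working_day_runs(job_214325))  # runs of 6 and 4 days
```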
Alternative:
If you just want the jobs where there has never been a period of 10 consecutive working days then you can use the COUNT() analytic function and specify a RANGE window:
SELECT job_id
FROM (
SELECT job_id,
COUNT( 1 ) OVER ( PARTITION BY job_id
ORDER BY action_date
RANGE BETWEEN INTERVAL '13' DAY PRECEDING
AND INTERVAL '0' DAY FOLLOWING )
AS num_days
FROM test_job_list
)
GROUP BY job_id
HAVING MAX( num_days ) < 10;
Output:
JOB_ID
----------
214325
Edit 2
First version had many issues, this one should work.
An option is to join the table with itself on job_id, keeping on the right side only the rows from the two weeks up to and including the date on the left side. Then you can count the remaining dates.
select JOB_ID
from (
select g1.JOB_ID, count(g2.ACTION_DATE) CNT
from test_job_list g1
join test_job_list g2
on g1.JOB_ID = g2.JOB_ID
where g2.ACTION_DATE between g1.ACTION_DATE - 13 and g1.ACTION_DATE
group by g1.JOB_ID, g1.ACTION_DATE
) t1
group by JOB_ID
having max(CNT) < 10
I know this solution is long, but you can see all the details of the query by executing it step by step.
create table calendar1 as
select day_id,WEEK_DAY_SHORT,day_num_of_week from VITDWH.DW_MIS_TAKVIM as calendar order by day_id;
CREATE TABLE JOB_LIST (JOB_ID NUMBER,ACTION_DATE DATE);
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('10-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('13-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('14-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('15-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('16-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('17-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('20-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('21-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('22-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('23-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(164354,TO_DATE('24-FEB-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('01-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('02-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('03-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('06-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('07-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('08-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('10-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('13-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('14-MAR-17','DD-MON-YY'));
INSERT INTO JOB_LIST VALUES(214325,TO_DATE('15-MAR-17','DD-MON-YY'));
COMMIT;
with a1 as
(
select A.JOB_ID,A.ACTION_DATE,B.DAY_ID,
(case when action_date is not null and lag(action_date) over(partition by job_id order by day_id) is null then action_date else null end) start_date,
(case when action_date is not null and lead(action_date) over(partition by job_id order by day_id) is null then action_date else null end) max_date
from
(
select * from calendar1
WHERE DAY_ID >=(select MIN(ACTION_DATE) from JOB_LIST)
AND DAY_ID <= (select MAX(ACTION_DATE) from JOB_LIST)
ORDER BY DAY_ID
)
B LEFT OUTER JOIN
JOB_LIST A
PARTITION BY (A.JOB_ID) ON (A.ACTION_DATE= B.DAY_ID)
ORDER BY A.JOB_ID,DAY_ID
)
,a2 as
(
select * from a1 where start_date is not null or max_date is not null
)
,a3 as
(
select a2.*,lead(max_date) over(partition by job_id order by day_id) end_date
from a2
)
select a.job_id,a.start_date,nvl(a.maX_date,a.end_date) end_date, (nvl(a.maX_date,a.end_date) -a.start_date) +1 date_count
from a3 a where start_date is not null;
10 working days = 2 full weeks. For 11 days, you can look at the 10th date back and see if it is exactly two weeks ago:
select tjl.*,
lag(action_date, 10) over (partition by job_id order by action_date) as minad_2weeks
from test_job_list tjl;
A simple trick works for 10 days: look at the 9th date back and check whether it is less than 14 days earlier.
Then you can get jobs with no such period by using aggregation:
select job_id
from (select tjl.*,
lag(action_date, 9) over (partition by job_id order by action_date) as lag9_ad
from test_job_list tjl
) tjl
group by job_id
having coalesce(min(action_date - lag9_ad), 14) >= 14;
That is, if the 9th date back is within the past two weeks, then there are two full weeks of dates. Jobs with fewer than 10 rows have no 9th date back, so the coalesce flags them as well.
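The "9th date back" check can be tried on the sample data with a few lines of Python. This is a sketch under the question's stated assumptions (at most one row per job per day, never on a weekend); the function name `has_ten_day_run` is made up for the example.

```python
from datetime import date

def has_ten_day_run(dates) -> bool:
    """True if some date has its 9th predecessor at most 13 days earlier.
    With no weekend rows, any 14-day window holds exactly 10 working days,
    so 10 rows inside one window must be 10 consecutive working days."""
    ds = sorted(set(dates))
    return any((ds[i] - ds[i - 9]).days <= 13 for i in range(9, len(ds)))

job_164354 = [date(2017, 2, day) for day in
              (10, 13, 14, 15, 16, 17, 20, 21, 22, 23, 24)]
job_214325 = [date(2017, 3, day) for day in
              (1, 2, 3, 6, 7, 8, 10, 13, 14, 15)]

print(has_ten_day_run(job_164354))  # True
print(has_ten_day_run(job_214325))  # False
```

Job 214325 fails because its only candidate pair, 1 March and 15 March, is 14 days apart.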