Fetch the rehire period using the LAG and LEAD functions - SQL

I want to check whether an employee was rehired to a contract. If so, return the rehire period.
If multiple employees were rehired, return all of their rehire periods.
Sample data (table 'Contract'):
Employee_id Period Contract
111 202204 1NA
111 202205 1NA
111 202206 1NA
112 202207 1NA
112 202208 1NA
111 202209 1NA
In the above case the output should be:
Employee_id Period Contract
111 202209 1NA
The query should first check whether the employee was rehired and, if so, return the rehire period.
If the contract has no rehires, return NULL.
Any logic other than LAG and LEAD would also be appreciated!
Thanks in advance:)

Use LAG to find the previous period for each employee, then select only the rows where the interval between the two periods is greater than 1:
create table contracts (employee_id,period,contract) as
(
SELECT 111, 202204,'1NA' FROM DUAL UNION ALL
SELECT 111, 202205,'1NA' FROM DUAL UNION ALL
SELECT 111, 202206,'1NA' FROM DUAL UNION ALL
SELECT 111, 202209,'1NA' FROM DUAL UNION ALL
SELECT 112, 202207,'1NA' FROM DUAL UNION ALL
SELECT 112, 202208,'1NA' FROM DUAL
);
Table CONTRACTS created.
with contracts_w_lags (
employee_id
,period
,last_period
,contract
) as ( select employee_id
,period
,lag(period)
over(partition by employee_id
order by period)
,contract
from contracts
)
select employee_id
,period
,contract
from contracts_w_lags
where period - nvl( last_period ,period ) > 1;
EMPLOYEE_ID PERIOD CON
----------- ---------- ---
111 202209 1NA
Note that your sample data only has periods within the same year. This example will fail if periods cross years, because YYYYMM values are not contiguous across a year boundary (202201 - 202112 is 89, not 1).
To overcome that, create a pseudo "periods" table with a row number to identify consecutive rows:
create table contracts (employee_id,period,contract) as
(
SELECT 111, 202111,'1NA' FROM DUAL UNION ALL
SELECT 111, 202112,'1NA' FROM DUAL UNION ALL
SELECT 111, 202201,'1NA' FROM DUAL UNION ALL
SELECT 111, 202203,'1NA' FROM DUAL UNION ALL
SELECT 112, 202207,'1NA' FROM DUAL UNION ALL
SELECT 112, 202208,'1NA' FROM DUAL
);
Table CONTRACTS created.
with month_count ( cnt ) as
( select months_between(
to_date( max(period) ,'YYYYMM' )
,to_date( min(period) ,'YYYYMM' ))
from contracts
),contract_start ( dt ) as
( select to_date( min(period) ,'YYYYMM' )
from contracts
),contract_periods ( period ,rn ) as
( select to_char( add_months( c.dt ,level - 1 ) ,'YYYYMM' )
,row_number() over( order by add_months( c.dt ,level - 1 ) )
from contract_start c
,month_count m
connect by level <= m.cnt + 1
),contracts_w_lags ( employee_id ,period ,contract ,period_rn ,last_period_rn ) as
( select c.employee_id
,c.period
,c.contract
,p.rn
,lag(p.rn) over(partition by c.employee_id order by p.rn )
from contracts c
join contract_periods p on c.period = p.period
)
select employee_id
,period
,contract
from contracts_w_lags
where period_rn - nvl( last_period_rn ,period_rn ) > 1;
EMPLOYEE_ID PERIOD CON
----------- ---------- ---
111 202203 1NA
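The year-boundary problem can also be checked outside the database. The Python sketch below is only an illustration and not part of the SQL above: the helper names `month_index` and `rehire_periods` are made up here, and it assumes every period is a valid YYYYMM integer. It maps each period to a running month count, which plays the same role as the row number from the pseudo "periods" table.

```python
from collections import defaultdict

def month_index(period):
    """Map a YYYYMM integer to a running month count (hypothetical helper)."""
    year, month = divmod(period, 100)
    return year * 12 + month

def rehire_periods(rows):
    """Return (employee_id, period) pairs that follow a gap of more than one month."""
    by_employee = defaultdict(list)
    for employee_id, period in rows:
        by_employee[employee_id].append(period)
    result = []
    for employee_id, periods in sorted(by_employee.items()):
        periods.sort()
        # Compare each period with the previous one, like LAG(period) in SQL.
        for previous, current in zip(periods, periods[1:]):
            if month_index(current) - month_index(previous) > 1:
                result.append((employee_id, current))
    return result

rows = [(111, 202111), (111, 202112), (111, 202201), (111, 202203),
        (112, 202207), (112, 202208)]

# Plain subtraction sees a large gap between December and January ...
naive_gap = 202201 - 202112
# ... while the month index sees two adjacent months.
true_gap = month_index(202201) - month_index(202112)
rehires = rehire_periods(rows)
print(naive_gap, true_gap, rehires)
```

With the cross-year sample rows, only the period that follows the skipped month is reported for employee 111, and employee 112 has no rehire.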

The answer (after comments) is at the end....
It is unclear what your expected result is with this sample data:
WITH
contracts (EMP_ID, PERIOD, CONTRACT) as
(
SELECT 111, 202204, '1NA' FROM DUAL UNION ALL
SELECT 111, 202205, '1NA' FROM DUAL UNION ALL
SELECT 111, 202206, '1NA' FROM DUAL UNION ALL
SELECT 112, 202207, '1NA' FROM DUAL UNION ALL
SELECT 112, 202208, '1NA' FROM DUAL UNION ALL
SELECT 111, 202209, '1NA' FROM DUAL
)
There are multiple consecutive periods for both sample employees. One option is to show the first and last periods for employees with multiple periods:
SELECT EMP_ID, Min(PREV_PERIOD) "FIRST_PERIOD", Max(PERIOD) "LAST_PERIOD", CONTRACT
FROM (Select EMP_ID, PERIOD, CONTRACT,
LAG(PERIOD, 1, 0) OVER(Partition By EMP_ID Order By PERIOD) "PREV_PERIOD"
From contracts)
WHERE PREV_PERIOD != 0
GROUP BY EMP_ID, CONTRACT
--
-- R e s u l t :
-- EMP_ID FIRST_PERIOD LAST_PERIOD CONTRACT
-- ---------- ------------ ----------- --------
-- 111 202204 202209 1NA
-- 112 202207 202208 1NA
... another could be to show them all:
SELECT EMP_ID, PERIOD "PERIOD", PREV_PERIOD "PREV_PERIOD", CONTRACT
FROM (Select EMP_ID, PERIOD, CONTRACT,
LAG(PERIOD, 1, 0) OVER(Partition By EMP_ID Order By PERIOD) "PREV_PERIOD"
From contracts)
WHERE PREV_PERIOD != 0
--
-- R e s u l t :
-- EMP_ID PERIOD PREV_PERIOD CONTRACT
-- ---------- ---------- ----------- --------
-- 111 202205 202204 1NA
-- 111 202206 202205 1NA
-- 111 202209 202206 1NA
-- 112 202208 202207 1NA
... and if you want the same with the LEAD() function:
SELECT EMP_ID, PERIOD "PERIOD", NEXT_PERIOD "NEXT_PERIOD", CONTRACT
FROM (Select EMP_ID, PERIOD, CONTRACT,
LEAD(PERIOD, 1, 0) OVER(Partition By EMP_ID Order By PERIOD) "NEXT_PERIOD"
From contracts)
WHERE NEXT_PERIOD != 0
--
-- R e s u l t :
-- EMP_ID PERIOD NEXT_PERIOD CONTRACT
-- ---------- ---------- ----------- --------
-- 111 202204 202205 1NA
-- 111 202205 202206 1NA
-- 111 202206 202209 1NA
-- 112 202207 202208 1NA
It is pretty much the same - just showing next period instead of previous.
NOTE: If "rehire to a contract" means a rehire to the same contract, partition by the contract as well:
OVER(Partition By EMP_ID, CONTRACT ....)
To do the opposite (non-consecutive periods):
SELECT EMP_ID, PERIOD "PERIOD", NEXT_PERIOD "NEXT_PERIOD", CONTRACT
FROM (Select EMP_ID, PERIOD, CONTRACT,
LEAD(PERIOD, 1, 0) OVER(Partition By EMP_ID Order By PERIOD) "NEXT_PERIOD"
From contracts)
WHERE NEXT_PERIOD != 0 And CASE WHEN SubStr(NEXT_PERIOD, 1, 4) = SubStr(PERIOD, 1, 4)
THEN NEXT_PERIOD - PERIOD
ELSE NEXT_PERIOD - (PERIOD + 88) -- handling the year change
END > 1
--
-- R e s u l t :
-- EMP_ID PERIOD NEXT_PERIOD CONTRACT
-- ---------- ---------- ----------- --------
-- 111 202206 202209 1NA
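The `+ 88` adjustment works because December of one year and January of the next differ by 89 when written as YYYYMM. The Python check below is only a sketch with made-up helper names, not part of the query: it compares the "> 1" test from the CASE expression with a true month difference for every ordered pair of periods from 2020 through 2023.

```python
def month_index(period):
    """Running month count for a YYYYMM integer (hypothetical helper)."""
    year, month = divmod(period, 100)
    return year * 12 + month

def case_gap(period, next_period):
    """Mirror of the CASE expression in the WHERE clause above."""
    if next_period // 100 == period // 100:   # same year
        return next_period - period
    return next_period - (period + 88)        # year change

periods = [year * 100 + month
           for year in range(2020, 2024)
           for month in range(1, 13)]

# Collect pairs where the "> 1" test disagrees with the real month difference.
mismatches = [
    (p, n)
    for i, p in enumerate(periods)
    for n in periods[i + 1:]
    if (case_gap(p, n) > 1) != (month_index(n) - month_index(p) > 1)
]
print(mismatches)  # an empty list means the two tests agree on this range
```

Note that the value returned by the CASE expression equals the real month difference only within one year or across adjacent years. For wider gaps it is larger than the real difference, but it is still greater than 1, so the filter gives the same answer.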

From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row pattern matching:
SELECT emp_id, period, contract
FROM contracts
MATCH_RECOGNIZE(
PARTITION BY contract
ORDER BY period
MEASURES
FIRST(emp_id) AS emp_id,
LAST(period) AS period
AFTER MATCH SKIP TO FIRST different_emp
PATTERN (emp+ different_emp+ emp)
DEFINE
emp AS FIRST(emp_id) = emp_id,
different_emp AS FIRST(emp_id) != emp_id
);
Which, for the sample data:
CREATE TABLE contracts (EMP_ID, PERIOD, CONTRACT) as
SELECT 111, 202204, '1NA' FROM DUAL UNION ALL
SELECT 111, 202205, '1NA' FROM DUAL UNION ALL
SELECT 111, 202206, '1NA' FROM DUAL UNION ALL
SELECT 112, 202207, '1NA' FROM DUAL UNION ALL
SELECT 112, 202208, '1NA' FROM DUAL UNION ALL
SELECT 111, 202209, '1NA' FROM DUAL;
Outputs:
EMP_ID  PERIOD  CONTRACT
111     202209  1NA
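Unlike the LAG-based answers, this pattern does not look for a gap in the periods: it needs rows of a different employee on the same contract between the two appearances. The Python sketch below is a rough analogue of that idea, not a faithful reimplementation of MATCH_RECOGNIZE semantics, and the function name `rehired_after_handover` is made up for illustration.

```python
from itertools import groupby

def rehired_after_handover(rows):
    """rows are (emp_id, period, contract) tuples.

    Report the first period of each run in which an employee comes back to a
    contract after a different employee held it."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[2], r[1]))
    for contract, contract_rows in groupby(ordered, key=lambda r: r[2]):
        seen = set()
        # Collapse consecutive rows of the same employee into runs.
        for emp_id, run in groupby(list(contract_rows), key=lambda r: r[0]):
            first_row = next(run)
            if emp_id in seen:
                result.append((emp_id, first_row[1], contract))
            seen.add(emp_id)
    return result

rows = [
    (111, 202204, "1NA"), (111, 202205, "1NA"), (111, 202206, "1NA"),
    (112, 202207, "1NA"), (112, 202208, "1NA"), (111, 202209, "1NA"),
]
handover_rehires = rehired_after_handover(rows)

# A gap with nobody else on the contract is not reported by this approach.
gap_only = rehired_after_handover([(111, 202204, "1NA"), (111, 202209, "1NA")])
print(handover_rehires, gap_only)
```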

Related

I want to get the count of roll number for each month

ID    STUDENT_ID    STATUS_DATE
1002  434120010026  25-FEB-22
1000  434120010026  03-MAY-03
1001  434120010026  25-FEB-22
1020  434120020023  18-MAR-22
1021  434120020025  18-MAR-22
1022  434120020025  16-MAR-22
I tried this:
select count(*),
trunc(status_date, 'mm')
from test_studentattendance
group by trunc(status_date, 'mm');
This gave me the number of rows in each month, not the number of roll numbers:
COUNT(*)  TRUNC(STATUS_DATE,'MM')
1         01-MAY-03
2         01-FEB-22
3         01-MAR-22
COUNT(*) is counting the number of rows in each group and is counting students that appear in the same month multiple times.
To get the number on roll each month, you need to COUNT the DISTINCT identifier for each student (which, I assume would be student_id):
SELECT COUNT(DISTINCT student_id) AS number_of_students,
TRUNC(status_date, 'mm') AS month
FROM test_studentattendance
GROUP BY TRUNC(status_date, 'mm');
Which, for your sample data:
CREATE TABLE test_studentattendance (ID, STUDENT_ID, STATUS_DATE) AS
SELECT 1002, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1000, 434120010026, DATE '2003-05-03' FROM DUAL UNION ALL
SELECT 1001, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1020, 434120020023, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1021, 434120020025, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1022, 434120020025, DATE '2022-03-16' FROM DUAL;
Outputs:
NUMBER_OF_STUDENTS  MONTH
1                   2022-02-01 00:00:00
1                   2003-05-01 00:00:00
2                   2022-03-01 00:00:00
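The difference between the two counts can be reproduced with a few lines of Python. This is only a sketch over the sample rows, with variable names chosen here for illustration: one dictionary counts rows per month, the other collects distinct student ids per month.

```python
from collections import defaultdict
from datetime import date

rows = [
    (1002, 434120010026, date(2022, 2, 25)),
    (1000, 434120010026, date(2003, 5, 3)),
    (1001, 434120010026, date(2022, 2, 25)),
    (1020, 434120020023, date(2022, 3, 18)),
    (1021, 434120020025, date(2022, 3, 18)),
    (1022, 434120020025, date(2022, 3, 16)),
]

row_counts = defaultdict(int)   # like COUNT(*)
students = defaultdict(set)     # like COUNT(DISTINCT student_id)
for _id, student_id, status_date in rows:
    month = status_date.replace(day=1)   # like TRUNC(status_date, 'mm')
    row_counts[month] += 1
    students[month].add(student_id)

distinct_counts = {month: len(ids) for month, ids in students.items()}
for month in sorted(row_counts):
    print(month, row_counts[month], distinct_counts[month])
```

February 2022 has two rows but only one student, and March 2022 has three rows but two students, which is why the two counts differ.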

Oracle SQL to find population by percentage range

I have a table with customers, purchase date and zip code. Key is (customer_id, purchase_dt and zip_cd)
I am trying to find the zip codes where customers are doing business, grouped into ranges such as 80% and above, 60-80%, and 40-60%. Can someone help me with a query to achieve this?
with tmp as
(
select 123 as cust_id, date '2017-01-01' purchase_dt, '10035' zip_cd from dual
union
select 1234 as cust_id, date '2019-06-01' purchase_dt, '11377' zip_cd from dual
union
select 12345 as cust_id, date '2019-07-01' purchase_dt, '11377' zip_cd from dual
union
select 234 as cust_id, date '2019-08-01' purchase_dt, '11377' zip_cd from dual
union
select 2345 as cust_id, date '2019-09-01' purchase_dt, '11417' zip_cd from dual
)
select * from tmp;
Expected output:
80% and above zip code: 11377 and so on..
You can combine the aggregate COUNT with the analytic COUNT as follows:
with tmp as
(
select 123 as cust_id, date '2017-01-01' purchase_dt, '10035' zip_cd from dual
union
select 1234 as cust_id, date '2019-06-01' purchase_dt, '11377' zip_cd from dual
union
select 12345 as cust_id, date '2019-07-01' purchase_dt, '11377' zip_cd from dual
union
select 234 as cust_id, date '2019-08-01' purchase_dt, '11377' zip_cd from dual
union
select 2345 as cust_id, date '2019-09-01' purchase_dt, '11417' zip_cd from dual
)
select zip_cd, 100*(count(1)/cnt) percntg from
(select zip_cd, count(1) over () cnt from tmp)
group by zip_cd, cnt
order by percntg desc;
This answer will take into account if a customer has made purchases on multiple days and won't double count them. Additionally, this response adds in the grouping discussed in the question:
with tmp as
(
select 123 as cust_id, date '2017-01-01' purchase_dt, '10035' zip_cd from dual
union
select 1234 as cust_id, date '2019-06-01' purchase_dt, '11377' zip_cd from dual
union
select 12345 as cust_id, date '2019-07-01' purchase_dt, '11377' zip_cd from dual
union
select 234 as cust_id, date '2019-08-01' purchase_dt, '11377' zip_cd from dual
union
select 2345 as cust_id, date '2019-09-01' purchase_dt, '11417' zip_cd from dual
)
SELECT sub2.pct_range, listagg(sub2.zip_cd||' ('||sub2.zip_pct||')', ', ') WITHIN GROUP (ORDER BY zip_pct DESC) AS ZIP_CODES
FROM (SELECT CASE
-- >= comparisons avoid gaps for fractional percentages such as 79.5
WHEN sub.zip_pct >= 80 THEN '80% and above'
WHEN sub.zip_pct >= 60 THEN '60% to 79%'
WHEN sub.zip_pct >= 40 THEN '40% to 59%'
WHEN sub.zip_pct >= 20 THEN '20% to 39%'
ELSE 'Below 20%'
END AS PCT_RANGE,
sub.zip_cd,
sub.zip_pct
FROM (SELECT DISTINCT
zip_cd,
100*COUNT(DISTINCT cust_id) OVER (PARTITION BY zip_cd)/COUNT(DISTINCT cust_id) OVER () AS ZIP_PCT
FROM tmp) sub) sub2
GROUP BY pct_range
ORDER BY pct_range DESC;
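The same calculation can be sketched in Python to see what the query is aiming for. The helper name `pct_range` is made up here, and checking the thresholds with >= from the top down is one possible way to keep fractional percentages from falling between two ranges; it is an assumption of this sketch rather than something the question specifies.

```python
from collections import defaultdict

purchases = [  # (cust_id, zip_cd) pairs from the sample data
    (123, "10035"), (1234, "11377"), (12345, "11377"),
    (234, "11377"), (2345, "11417"),
]

def pct_range(pct):
    """Assign a percentage to a range label (hypothetical helper)."""
    if pct >= 80:
        return "80% and above"
    if pct >= 60:
        return "60% to 79%"
    if pct >= 40:
        return "40% to 59%"
    if pct >= 20:
        return "20% to 39%"
    return "Below 20%"

# Distinct customers per zip code, so repeat purchases are not double counted.
customers_by_zip = defaultdict(set)
for cust_id, zip_cd in purchases:
    customers_by_zip[zip_cd].add(cust_id)
total_customers = len({cust_id for cust_id, _ in purchases})

zip_pct = {z: 100 * len(c) / total_customers for z, c in customers_by_zip.items()}
ranges = {z: pct_range(p) for z, p in zip_pct.items()}
print(zip_pct, ranges)
```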

Retrieve single row from a query

I am creating a query to find the salary details of an employee with date_to as '31-dec-4712' (the latest row).
However, if two rows for an employee have date_to 31-dec-4712, the one with status 'Approved' should be picked. In other cases, when only
a single row comes back, it should be returned as is.
I have created the query below for the salary details and need help with the above scenario:
select distinct PAPF.EMPLOYEE_NUMBER ,
TO_CHAR (EMP_DOJ (PAPF.PERSON_ID),'DD-MON-YYYY' ) DOJ ,
TO_CHAR(HR_EMPLOYEE_ORIGINAL_DOJ(PAPF.EMPLOYEE_NUMBER,42) ,'DD-MON-YYYY' ) ORIGINAL_DOJ,
PPP.CHANGE_DATE,
PPP.DATE_TO,
PPP.PROPOSED_SALARY_N TOTAL_REMUN,
HR_GENERAL.DECODE_LOOKUP('PER_SAL_PROPOSAL_STATUS',APPROVED) status
from PER_ALL_ASSIGNMENTS_F PAAF,
PER_ALL_PEOPLE_F PAPF,
PER_PAY_PROPOSALS PPP
where 1 = 1
and PAPF.PERSON_ID = PAAF.PERSON_ID
and PAPF.BUSINESS_GROUP_ID = 21
and PAPF.CURRENT_EMPLOYEE_FLAG = 'Y'
and papf.employee_number = '109575'
and :P_DATE1 between PAAF.EFFECTIVE_START_DATE
and PAAF.EFFECTIVE_END_DATE
and :P_DATE1 between PAPF.EFFECTIVE_START_DATE
and PAPF.EFFECTIVE_END_DATE
and :P_DATE1 between PPP.CHANGE_DATE(+)
and NVL(PPP.DATE_TO, HR_GENERAL.END_OF_TIME)
and PPP.ASSIGNMENT_ID(+) = PAAF.ASSIGNMENT_ID
order by TO_NUMBER(PAPF.EMPLOYEE_NUMBER);
Emp_num DOJ ORIGINAL_DOJ CHANGE_DATE DATE_TO TOTAL_REMUN STATUS
109575 01-DEC-2016 24-JUL-2014 01-MAY-19 31-DEC-12 250000 Proposed
109575 01-DEC-2016 24-JUL-2014 01-APR-19 31-DEC-12 100000 Approved
You can use conditional ordering for each employee separately, like here:
-- sample rows
with salaries (emp_id, name, salary, date_to, status) as (
select 1001, 'Orange', 1400, date '4712-12-31', 'Rejected' from dual union all
select 1001, 'Orange', 1200, date '4712-12-31', 'Approved' from dual union all
select 1002, 'Red', 2500, date '4712-12-31', 'Approved' from dual union all
select 1003, 'Blue', 2700, date '4712-12-31', 'Proposed' from dual union all
select 1004, 'Green', 2200, date '2012-07-31', 'Approved' from dual union all
select 1005, 'White', 1200, date '4712-12-31', 'Approved' from dual union all
select 1005, 'White', 1300, date '4712-12-31', 'Rejected' from dual )
-- end of sample data
select emp_id, name, salary, date_to, status
from (
select s.*,
row_number() over (partition by emp_id
order by case status when 'Approved' then 1 end) rn
from salaries s
where date_to = date '4712-12-31')
where rn = 1
Result:
EMP_ID NAME SALARY DATE_TO STATUS
---------- ------ ---------- ----------- --------
1001 Orange 1200 4712-12-31 Approved
1002 Red 2500 4712-12-31 Approved
1003 Blue 2700 4712-12-31 Proposed
1005 White 1200 4712-12-31 Approved
If the STATUS takes only two values, "Approved" and "Proposed", you can order by STATUS and fetch the first row. If you have (or in the future you'll have) more statuses and you want to define a priority add a column in the select with a "CASE" that assigns to each status the corresponding priority. Then you order by this column and you fetch the first row....
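The priority idea can be sketched in Python as follows. The ordering in `PRIORITY` is an assumption made for illustration (the question only says that 'Approved' should win), and the function name `pick_one_per_employee` is hypothetical.

```python
# Assumed priority order; the question only says that 'Approved' should win.
PRIORITY = {"Approved": 1, "Proposed": 2, "Rejected": 3}

def pick_one_per_employee(rows):
    """rows are (emp_id, salary, status); keep the best-ranked row per employee."""
    best = {}
    for emp_id, salary, status in rows:
        rank = PRIORITY.get(status, len(PRIORITY) + 1)  # unknown statuses rank last
        if emp_id not in best or rank < best[emp_id][0]:
            best[emp_id] = (rank, (emp_id, salary, status))
    return [best[emp_id][1] for emp_id in sorted(best)]

rows = [
    (1001, 1400, "Rejected"), (1001, 1200, "Approved"),
    (1002, 2500, "Approved"), (1003, 2700, "Proposed"),
    (1005, 1200, "Approved"), (1005, 1300, "Rejected"),
]
picked = pick_one_per_employee(rows)
print(picked)
```

An employee with a single row keeps that row whatever its status, and an employee with several rows keeps the one whose status ranks best.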

Month counts between dates

I have the table below. I need to count how many ids were active in a given month. I think I'll need to generate a row for each id for every month it was active, so that the id can be counted in each month. A row should also be generated for the month that contains term_dt.
active_dt  term_dt    id
1/1/2018              101
1/1/2018   5/15/2018  102
3/1/2018   6/1/2018   103
1/1/2018   4/25/2018  104
Apparently this is a "count number of overlapping intervals" problem. The algorithm goes like this:
Create a sorted list of all start and end points
Calculate a running sum over this list, add one when you encounter a start and subtract one when you encounter an end
If two points are same then perform subtractions first
You will end up with list of all points where the sum changed
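The steps above can be sketched in Python. This is a minimal illustration with a made-up function name, and the far-future default for open-ended rows is an assumption of the sketch.

```python
from datetime import date

def running_active_counts(intervals, open_end=date(2099, 1, 1)):
    """Return (point, running_sum) for every start and end point."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        # A missing end date is treated as valid indefinitely (assumption).
        events.append((end or open_end, -1))
    # Sorting the tuples puts -1 before +1 on the same date: subtractions first.
    events.sort()
    total = 0
    changes = []
    for point, delta in events:
        total += delta
        changes.append((point, total))
    return changes

intervals = [
    (date(2018, 1, 1), None),
    (date(2018, 1, 1), date(2018, 5, 15)),
    (date(2018, 3, 1), date(2018, 6, 1)),
    (date(2018, 1, 1), date(2018, 4, 25)),
]
changes = running_active_counts(intervals)
peak = max(total for _, total in changes)
print(changes, peak)
```

For the sample rows the running sum climbs to 4 when id 103 starts in March and falls back as each term date passes.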
Here is a rough outline of the query. It is for SQL Server but could be ported to any RDBMS that supports window functions:
WITH cte1(date, val) AS (
SELECT active_dt, 1 FROM #t AS t
UNION ALL
SELECT COALESCE(term_dt, '2099-01-01'), -1 FROM #t AS t
-- if end date is null then assume the row is valid indefinitely
), cte2 AS (
SELECT date, SUM(val) OVER(ORDER BY date, val) AS rs
FROM cte1
)
SELECT YEAR(date) AS YY, MONTH(date) AS MM, MAX(rs) AS MaxActiveThisYearMonth
FROM cte2
GROUP BY YEAR(date), MONTH(date)
I was toying with a simpler query, that seemed to do the trick, for Oracle:
with candidates (month_start) as (
select to_date ('2018-' || column_value || '-01','YYYY-MM-DD')
from
table
(sys.odcivarchar2list('01','02','03','04','05',
'06','07','08','09','10','11','12'))
), sample_data (active_dt, term_dt, id) as (
select to_date('01/01/2018', 'MM/DD/YYYY'), null, 101 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('05/15/2018', 'MM/DD/YYYY'), 102 from dual
union select to_date('03/01/2018', 'MM/DD/YYYY'),
to_date('06/01/2018', 'MM/DD/YYYY'), 103 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('04/25/2018', 'MM/DD/YYYY'), 104 from dual
)
select c.month_start, count(1)
from candidates c
join sample_data d
on c.month_start between d.active_dt and nvl(d.term_dt,current_date)
group by c.month_start
order by c.month_start
An alternative solution would be to use a hierarchical query, e.g.:
WITH your_table AS (SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, NULL term_dt, 101 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('15/05/2018', 'dd/mm/yyyy') term_dt, 102 ID FROM dual UNION ALL
SELECT to_date('01/03/2018', 'dd/mm/yyyy') active_dt, to_date('01/06/2018', 'dd/mm/yyyy') term_dt, 103 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('25/04/2018', 'dd/mm/yyyy') term_dt, 104 ID FROM dual)
SELECT active_month,
COUNT(*) num_active_ids
FROM (SELECT add_months(TRUNC(active_dt, 'mm'), -1 + LEVEL) active_month,
ID
FROM your_table
CONNECT BY PRIOR ID = ID
AND PRIOR sys_guid() IS NOT NULL
AND LEVEL <= FLOOR(months_between(coalesce(term_dt, SYSDATE), active_dt)) + 1)
GROUP BY active_month
ORDER BY active_month;
ACTIVE_MONTH NUM_ACTIVE_IDS
------------ --------------
01/01/2018 3
01/02/2018 3
01/03/2018 4
01/04/2018 4
01/05/2018 3
01/06/2018 2
01/07/2018 1
01/08/2018 1
01/09/2018 1
01/10/2018 1
Whether this is more or less performant than the other answers is up to you to test.
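The row-per-month expansion can also be sketched in Python. This is only an illustration: `months_active` is a made-up helper, `as_of` is an arbitrary reference date standing in for SYSDATE, and the sketch counts whole calendar months, which matches the FLOOR(months_between(...)) + 1 logic only because every active_dt in the sample falls on the first of a month.

```python
from collections import Counter
from datetime import date

def months_active(active_dt, term_dt, as_of):
    """Yield the first day of every month from active_dt through the end date."""
    end = term_dt or as_of
    year, month = active_dt.year, active_dt.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

rows = [
    (date(2018, 1, 1), None, 101),
    (date(2018, 1, 1), date(2018, 5, 15), 102),
    (date(2018, 3, 1), date(2018, 6, 1), 103),
    (date(2018, 1, 1), date(2018, 4, 25), 104),
]
as_of = date(2018, 10, 15)  # arbitrary reference date standing in for SYSDATE

counts = Counter(
    month
    for active_dt, term_dt, _id in rows
    for month in months_active(active_dt, term_dt, as_of)
)
for month in sorted(counts):
    print(month, counts[month])
```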

Finding missing dates in a sequence

I have following table with ID and DATE
ID DATE
123 7/1/2015
123 6/1/2015
123 5/1/2015
123 4/1/2015
123 9/1/2014
123 8/1/2014
123 7/1/2014
123 6/1/2014
456 11/1/2014
456 10/1/2014
456 9/1/2014
456 8/1/2014
456 5/1/2014
456 4/1/2014
456 3/1/2014
789 9/1/2014
789 8/1/2014
789 7/1/2014
789 6/1/2014
789 5/1/2014
789 4/1/2014
789 3/1/2014
In this table, I have three customer ids, 123, 456, 789 and date column which shows which month they worked.
I want to find out which of the customers have gap in their work.
Our customers work record is kept per month...so, dates are monthly..
and each customer have different start and end dates.
Expected results:
ID First_Absent_date
123 10/01/2014
456 06/01/2014
To get a simple list of the IDs with gaps, with no further details, you need to look at each ID separately. As @mikey suggested, you can count the number of months and compare that with the first and last dates to see how many months they span.
If your table has a column called month (since date isn't allowed unless it's a quoted identifier) you could start with:
select id, count(month), min(month), max(month),
months_between(max(month), min(month)) + 1 as diff
from your_table
group by id
order by id;
ID COUNT(MONTH) MIN(MONTH) MAX(MONTH) DIFF
---------- ------------ ---------- ---------- ----------
123 8 01-JUN-14 01-JUL-15 14
456 7 01-MAR-14 01-NOV-14 9
789 7 01-MAR-14 01-SEP-14 7
Then compare the count with the month span, in a having clause:
select id
from your_table
group by id
having count(month) != months_between(max(month), min(month)) + 1
order by id;
ID
----------
123
456
If you can actually have multiple records in a month for an ID, and/or the date recorded might not be the start of the month, you can do a bit more work to normalise the dates:
select id,
count(distinct trunc(month, 'MM')),
min(trunc(month, 'MM')),
max(trunc(month, 'MM')),
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1 as diff
from your_table
group by id
order by id;
select id
from your_table
group by id
having count(distinct trunc(month, 'MM')) !=
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1
order by id;
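The count-versus-span comparison can be sketched in Python as well. The function name `ids_with_gaps` is made up for illustration, and each date is reduced to a month index so that several records in one month, or dates that are not on the first of the month, do not matter.

```python
from collections import defaultdict
from datetime import date

def ids_with_gaps(rows):
    """Return the ids whose number of distinct months is smaller than their span."""
    months = defaultdict(set)
    for id_, d in rows:
        months[id_].add(d.year * 12 + d.month)   # normalise to a month index
    return sorted(
        id_ for id_, m in months.items()
        if len(m) != max(m) - min(m) + 1
    )

rows = (
    [(123, date(2015, m, 1)) for m in (4, 5, 6, 7)]
    + [(123, date(2014, m, 1)) for m in (6, 7, 8, 9)]
    + [(456, date(2014, m, 1)) for m in (3, 4, 5, 8, 9, 10, 11)]
    + [(789, date(2014, m, 1)) for m in range(3, 10)]
)
gaps = ids_with_gaps(rows)
print(gaps)
```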
Oracle Setup:
CREATE TABLE your_table ( ID, "DATE" ) AS
SELECT 123, DATE '2015-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-06-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-05-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-04-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-11-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-10-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-03-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-03-01' FROM DUAL;
Query:
SELECT ID,
MIN( missing_date )
FROM (
SELECT ID,
CASE WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
= ADD_MONTHS( "DATE", 1 ) THEN NULL
WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
IS NULL THEN NULL
ELSE ADD_MONTHS( "DATE", 1 )
END AS missing_date
FROM your_table
)
GROUP BY ID
HAVING COUNT( missing_date ) > 0;
Output:
ID MIN(MISSING_DATE)
---------- -------------------
123 2014-10-01 00:00:00
456 2014-06-01 00:00:00
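The same next-row comparison can be sketched in Python. The function name `first_missing_month` is made up for illustration, and the sketch assumes one record per month, as in the sample data.

```python
from collections import defaultdict
from datetime import date

def first_missing_month(rows):
    """Return {id: first missing month} for ids that have a gap."""
    by_id = defaultdict(set)
    for id_, d in rows:
        by_id[id_].add(d.year * 12 + d.month - 1)   # zero-based month index
    result = {}
    for id_, indexes in sorted(by_id.items()):
        ordered = sorted(indexes)
        # Pair each month with the next one, like LEAD("DATE") in SQL.
        for current, following in zip(ordered, ordered[1:]):
            if following != current + 1:
                year, month0 = divmod(current + 1, 12)
                result[id_] = date(year, month0 + 1, 1)
                break
    return result

rows = (
    [(123, date(2015, m, 1)) for m in (4, 5, 6, 7)]
    + [(123, date(2014, m, 1)) for m in (6, 7, 8, 9)]
    + [(456, date(2014, m, 1)) for m in (3, 4, 5, 8, 9, 10, 11)]
    + [(789, date(2014, m, 1)) for m in range(3, 10)]
)
missing = first_missing_month(rows)
print(missing)
```

Ids without a gap, such as 789, do not appear in the result at all.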
You could use the LAG() function to see whether records have been skipped for a particular date. LAG() compares the data in the current row with the previous row, so if we order by DATE we can easily compare rows and find any gaps.
select * from
(
select ID,DATE_, case when DATE_DIFF>1 then 1 else 0 end comparison from
(
select ID, DATE_ ,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
This groups all the entries by id, and then arranges the records by date. If a customer is always present, there would not be a gap in his date. So anyone who has a date difference greater than 1 had a gap. You could tweak this as per your requirement.
EDIT: Looking more closely at the answers above, I noticed that you store the data in mm/dd/yyyy format and keep only the first day of every month. So the above query can be tweaked as follows:
select * from
(
select ID,DATE_,PREV_DATE,last_day(PREV_DATE)+1 ABSENT_DATE, case when DATE_DIFF>31 then 1 else 0 end comparison from
(
select ID, DATE_ ,LAG(DATE_,1) OVER (PARTITION BY ID ORDER BY DATE_) PREV_DATE,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;