Month counts between dates - sql

I have the below table. I need to count how many ids were active in a given month. So thinking I'll need to create a row for each id that was active during that month so that id can be counted each month. A row should be generated for a term_dt during that month.
active_dt term_dt id
1/1/2018 101
1/1/2018 5/15/2018 102
3/1/2018 6/1/2018 103
1/1/2018 4/25/18 104

Apparently this is a "count number of overlapping intervals" problem. The algorithm goes like this:
Create a sorted list of all start and end points
Calculate a running sum over this list, add one when you encounter a start and subtract one when you encounter an end
If two points are same then perform subtractions first
You will end up with list of all points where the sum changed
Here is a rough outline of the query. It is for SQL Server but could be ported to any RDBMS that supports window functions:
WITH cte1(date, val) AS (
SELECT active_dt, 1 FROM #t AS t
UNION ALL
SELECT COALESCE(term_dt, '2099-01-01'), -1 FROM #t AS t
-- if end date is null then assume the row is valid indefinitely
), cte2 AS (
SELECT date, SUM(val) OVER(ORDER BY date, val) AS rs
FROM cte1
)
SELECT YEAR(date) AS YY, MONTH(date) AS MM, MAX(rs) AS MaxActiveThisYearMonth
FROM cte2
GROUP BY YEAR(date), MONTH(date)
DB Fiddle

I was toying with a simpler query, that seemed to do the trick, for Oracle:
with candidates (month_start) as (
select to_date ('2018-' || column_value || '-01','YYYY-MM-DD')
from
table
(sys.odcivarchar2list('01','02','03','04','05',
'06','07','08','09','10','11','12'))
), sample_data (active_dt, term_dt, id) as (
select to_date('01/01/2018', 'MM/DD/YYYY'), null, 101 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('05/15/2018', 'MM/DD/YYYY'), 102 from dual
union select to_date('03/01/2018', 'MM/DD/YYYY'),
to_date('06/01/2018', 'MM/DD/YYYY'), 103 from dual
union select to_date('01/01/2018', 'MM/DD/YYYY'),
to_date('04/25/2018', 'MM/DD/YYYY'), 104 from dual
)
select c.month_start, count(1)
from candidates c
join sample_data d
on c.month_start between d.active_dt and nvl(d.term_dt,current_date)
group by c.month_start
order by c.month_start

An alternative solution would be to use a hierarchical query, e.g.:
WITH your_table AS (SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, NULL term_dt, 101 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('15/05/2018', 'dd/mm/yyyy') term_dt, 102 ID FROM dual UNION ALL
SELECT to_date('01/03/2018', 'dd/mm/yyyy') active_dt, to_date('01/06/2018', 'dd/mm/yyyy') term_dt, 103 ID FROM dual UNION ALL
SELECT to_date('01/01/2018', 'dd/mm/yyyy') active_dt, to_date('25/04/2018', 'dd/mm/yyyy') term_dt, 104 ID FROM dual)
SELECT active_month,
COUNT(*) num_active_ids
FROM (SELECT add_months(TRUNC(active_dt, 'mm'), -1 + LEVEL) active_month,
ID
FROM your_table
CONNECT BY PRIOR ID = ID
AND PRIOR sys_guid() IS NOT NULL
AND LEVEL <= FLOOR(months_between(coalesce(term_dt, SYSDATE), active_dt)) + 1)
GROUP BY active_month
ORDER BY active_month;
ACTIVE_MONTH NUM_ACTIVE_IDS
------------ --------------
01/01/2018 3
01/02/2018 3
01/03/2018 4
01/04/2018 4
01/05/2018 3
01/06/2018 2
01/07/2018 1
01/08/2018 1
01/09/2018 1
01/10/2018 1
Whether this is more or less performant than the other answers is up to you to test.

Related

How do you combine query results from different rows into one?

My original query:
SELECT desc, start_date
FROM foo.bar
WHERE desc LIKE 'Fall%' AND desc NOT LIKE '%Med%'
UNION
SELECT desc, end_date
FROM foo.bar
WHERE desc LIKE 'Spring%' AND desc NOT LIKE '%Med%'
ORDER BY start_date;
With this query, I get (roughly) the data set I am looking for. I now need to take that data and combine the results taking two at a time in order and then produce a result like:
DESC
START_DATE
END_DATE
Fall 1971 - Spring 1972
15-AUG-71
15-MAY-72
Fall 1971 - Spring 1972
15-AUG-72
15-MAY-73
Where DESC is a concatenation of the DESC form row 1 and 2, START_DATE is the date from row 1 and END_DATE is the date from row 2. Following this same pattern for the entire data set.
Any help with a query that will produce the result I need is greatly appreciated. Not sure if I'm heading down the right path or if that originally query is just wrong.
As stated above, I tried the supplied query, which gives me the data I need. However, I've been unsuccessful in finding a way to format it into my desired output. It should also be noted that I am running this on an Oracle database.
Instead of union, use each of those queries as CTEs (with a slight modification - include row number you'll later use in JOIN):
Sample data:
SQL> with test (description, datum) as
2 (select 'Fall 1971' , date '1971-08-15' from dual union all
3 select 'Spring 1972', date '1972-05-15' from dual union all
4 select 'Fall 1972' , date '1972-08-15' from dual union all
5 select 'Spring 1973', date '1973-05-15' from dual union all
6 select 'Fall 1973' , date '1973-08-15' from dual union all
7 select 'Spring 1974', date '1974-05-15' from dual union all
8 select 'Fall 1974' , date '1974-08-15' from dual union all
9 select 'Spring 1975', date '1975-05-15' from dual
10 ),
Query begins here: t_start and t_end represent your current queries
11 t_start as
12 (select description, datum,
13 row_number() Over (order by datum) rn
14 from test
15 where description like 'Fall%' and description not like '%Med%'
16 ),
17 t_end as
18 (select description, datum,
19 row_number() Over (order by datum) rn
20 from test
21 where description like 'Spring%' and description not like '%Med%'
22 )
Finally:
23 select s.description ||' - '|| e.description as description,
24 s.datum start_date,
25 e.datum end_date
26 from t_start s join t_end e on s.rn = e.rn
27 order by s.rn;
DESCRIPTION START_DAT END_DATE
------------------------- --------- ---------
Fall 1971 - Spring 1972 15-AUG-71 15-MAY-72
Fall 1972 - Spring 1973 15-AUG-72 15-MAY-73
Fall 1973 - Spring 1974 15-AUG-73 15-MAY-74
Fall 1974 - Spring 1975 15-AUG-74 15-MAY-75
SQL>
You can also use the MODEL clause to avoid to scan the table twice:
with data(description,datum) as (
select 'Fall 1971' , date '1971-08-15' from dual union all
select 'Spring 1972', date '1972-05-15' from dual union all
select 'Fall 1972' , date '1972-08-15' from dual union all
select 'Spring 1973', date '1973-05-15' from dual union all
select 'Fall 1973' , date '1973-08-15' from dual union all
select 'Spring 1974', date '1974-05-15' from dual union all
select 'Fall 1974' , date '1974-08-15' from dual union all
select 'Spring 1975', date '1975-05-15' from dual
)
select description, start_date, end_date
from (
select rn, desc1 as description, start_date, end_date
from (
select row_number() over(order by datum) as rn, description, datum
from data
where description not like '%Med%'
)
model
dimension by (rn)
measures (
cast(' ' as varchar2(256)) as desc1, description, cast(NULL as DATE) start_date, cast(NULL as DATE) end_date , datum
)
rules (
desc1[mod(rn,2)=1] = description[cv()] || ' - ' || description[cv()+1],
start_date[mod(rn,2)=1] = datum[cv()],
end_date[mod(rn,2)=1] = datum[cv()+1]
)
)
where mod(rn,2)=1
;

sql oracle goup by on dates with possibilities of null values

I have a table with emplid and end_date columns. I want from all emplids the max end_dates. If at least one end_date is null, I want to have the null value as max. So in this example:
emplid end_date
1 05/04/2019
1 05/10/2019
1 null
2 05/04/2019
2 05/10/2019
I want as result:
emplid end_date
1 null
2 05/10/2019
I tried something like
select emplid,
CASE
WHEN MAX(NVL(end_Date,'01/01/3000'))='01/01/3000' THEN null
ELSE end_date
END as end_dt
from people
group by emplid
then I get a group-by error.
Maybe it is very easy, but I don't figure out how to get properly what I want.
with s(id, dt) as (
select 1, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 1, to_date('05/10/2019', 'dd/mm/yyyy') from dual union all
select 1, null from dual union all
select 2, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 2, to_date('05/10/2019', 'dd/mm/yyyy') from dual)
select id, decode(count(dt), count(*), max(dt)) max_dt
from s
group by id;
ID MAX_DT
---------- -----------------------------
1
2 2019-10-05 00:00:00
I would simply do:
select emplid,
(case when count(*) = count(end_date)
then max(end_date)
end) as max_end_date
from t
group by emplid;
There is no reason to introduce a "magic" maximum value (even if it is correct).
The first expression in the case is simply asking "do the number of non-NULL end-date values match the number of rows".
Try this
SELECT
EMPLID,
CASE WHEN END_DATE='01/01/3000' THEN NULL ELSE END_DATE END AS END_DT
FROM
(
SELECT EMPLID, MAX(END_DATE) AS END_DATE FROM
(
SELECT EMPLID, NVL(END_DATE,'01/01/3000') AS END_DATE FROM PEOPLE
)
GROUP BY EMPLID
);
Case does not go with group by , you have to get the max value using group by first then evaluate the null values. Try below.
select empid, CASE WHEN NVL(eDate,'01-DEC-3000')='01-DEC-3000' THEN null ELSE edate end end_dt from (
select empid, MAX(NVL(eDate,'01-DEC-3000')) eDate
from
(select 1 empid, sysdate-100 edate from dual union all
select 1 empid, sysdate-10 edate from dual union all
select 1 empid, null edate from dual union all
select 2 empid, sysdate-105 edate from dual union all
select 2 empid, sysdate-1 edate from dual ) datad
group by empid);

Finding missing dates in a sequence

I have following table with ID and DATE
ID DATE
123 7/1/2015
123 6/1/2015
123 5/1/2015
123 4/1/2015
123 9/1/2014
123 8/1/2014
123 7/1/2014
123 6/1/2014
456 11/1/2014
456 10/1/2014
456 9/1/2014
456 8/1/2014
456 5/1/2014
456 4/1/2014
456 3/1/2014
789 9/1/2014
789 8/1/2014
789 7/1/2014
789 6/1/2014
789 5/1/2014
789 4/1/2014
789 3/1/2014
In this table, I have three customer ids, 123, 456, 789 and date column which shows which month they worked.
I want to find out which of the customers have gap in their work.
Our customers work record is kept per month...so, dates are monthly..
and each customer have different start and end dates.
Expected results:
ID First_Absent_date
123 10/01/2014
456 06/01/2014
To get a simple list of the IDs with gaps, with no further details, you need to look at each ID separately, and as #mikey suggested you can count the number of months and look at the first and last date to see if how many months that spans.
If your table has a column called month (since date isn't allowed unless it's a quoted identifier) you could start with:
select id, count(month), min(month), max(month),
months_between(max(month), min(month)) + 1 as diff
from your_table
group by id
order by id;
ID COUNT(MONTH) MIN(MONTH) MAX(MONTH) DIFF
---------- ------------ ---------- ---------- ----------
123 8 01-JUN-14 01-JUL-15 14
456 7 01-MAR-14 01-NOV-14 9
789 7 01-MAR-14 01-SEP-14 7
Then compare the count with the month span, in a having clause:
select id
from your_table
group by id
having count(month) != months_between(max(month), min(month)) + 1
order by id;
ID
----------
123
456
If you can actually have multiple records in a month for an ID, and/or the date recorded might not be the start of the month, you can do a bit more work to normalise the dates:
select id,
count(distinct trunc(month, 'MM')),
min(trunc(month, 'MM')),
max(trunc(month, 'MM')),
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1 as diff
from your_table
group by id
order by id;
select id
from your_table
group by id
having count(distinct trunc(month, 'MM')) !=
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1
order by id;
Oracle Setup:
CREATE TABLE your_table ( ID, "DATE" ) AS
SELECT 123, DATE '2015-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-06-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-05-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-04-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-11-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-10-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-03-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-03-01' FROM DUAL;
Query:
SELECT ID,
MIN( missing_date )
FROM (
SELECT ID,
CASE WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
= ADD_MONTHS( "DATE", 1 ) THEN NULL
WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
IS NULL THEN NULL
ELSE ADD_MONTHS( "DATE", 1 )
END AS missing_date
FROM your_table
)
GROUP BY ID
HAVING COUNT( missing_date ) > 0;
Output:
ID MIN(MISSING_DATE)
---------- -------------------
123 2014-10-01 00:00:00
456 2014-06-01 00:00:00
You could use a Lag() function to see if records have been skipped for a particular date or not.Lag() basically helps in comparing the data in current row with previous row. So if we order by DATE, we could easily compare and find any gaps.
select * from
(
select ID,DATE_, case when DATE_DIFF>1 then 1 else 0 end comparison from
(
select ID, DATE_ ,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
This groups all the entries by id, and then arranges the records by date. If a customer is always present, there would not be a gap in his date. So anyone who has a date difference greater than 1 had a gap. You could tweak this as per your requirement.
EDIT : Just observed that you are storing data in mm/dd/yyyy format, when I closely observed above answers.You are storing only first date of every month. So, the above query can be tweaked as :
select * from
(
select ID,DATE_,PREV_DATE,last_day(PREV_DATE)+1 ABSENT_DATE, case when DATE_DIFF>31 then 1 else 0 end comparison from
(
select ID, DATE_ ,LAG(DATE_,1) OVER (PARTITION BY ID ORDER BY DATE_) PREV_DATE,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;

SQL (Oracle) to query for record with max date, only if the end_dt has a value

I am trying to select a record from a row by looking at both the start date and the end date. What I need to do is pick the max start date, then only return a result from that max date if the end date has a value.
I hope the images below help clarify this a bit more. This is in Oracle based SQL.
Example #2
I can, so far, either return all the records or incorrectly return a record in scenario #2 but I've yet to figure out the best way to make this work. I would greatly appreciate any assistance.
Thank you!
I would use an analytic function:
with sample_data as (select 1 id, 1 grp_id, to_date('01/01/2015', 'dd/mm/yyyy') st_dt, to_date('23/01/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 2 id, 1 grp_id, to_date('24/02/2015', 'dd/mm/yyyy') st_dt, to_date('15/02/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 3 id, 1 grp_id, to_date('17/03/2015', 'dd/mm/yyyy') st_dt, to_date('30/03/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 4 id, 2 grp_id, to_date('01/01/2015', 'dd/mm/yyyy') st_dt, to_date('17/01/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 5 id, 2 grp_id, to_date('21/01/2015', 'dd/mm/yyyy') st_dt, to_date('23/03/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 6 id, 2 grp_id, to_date('14/04/2015', 'dd/mm/yyyy') st_dt, to_date('16/05/2015', 'dd/mm/yyyy') ed_dt from dual union all
select 7 id, 2 grp_id, to_date('28/05/2015', 'dd/mm/yyyy') st_dt, null ed_dt from dual),
res as (select id,
grp_id,
st_dt,
ed_dt,
max(st_dt) over (partition by grp_id) max_st_dt
from sample_data)
select id,
grp_id,
st_dt,
ed_dt
from res
where st_dt = max_st_dt
and ed_dt is not null;
ID GRP_ID ST_DT ED_DT
---------- ---------- ---------- ----------
3 1 17/03/2015 30/03/2015
This would be one of the simplest way.
select * from
(
select apay_id,
max(start_dt) OVER () max_start_dt,
start_dt,
end_dt
from sample
)
where
start_dt=max_start_dt
and end_dt is not null
Idea is to get maximum start_dt and corresponding end_dt.
And then filter result if end_dt is null.
SQL Fiddle
Database Schema
create table sample
(apay_id number(7),
account_number number(7),
start_dt date,
end_dt date);
Sample1
insert into sample values(554433, 123456, '15-Aug-15', null);
insert into sample values(112266, 123456, '21-Jul-15', '31-Aug-15');
insert into sample values(733221, 123456, '29-Jun-15', '31-Jul-15');
Output for Sample1
No rows
Sample2
insert into sample values(554433, 123456, '15-Aug-15', '11-Nov-15');
insert into sample values(112266, 123456, '21-Jul-15', '31-Aug-15');
insert into sample values(733221, 123456, '29-Jun-15', '31-Jul-15');
Output for Sample2
| APAY_ID | MAX_START_DT | END_DT |
|---------|--------------------------|----------------------------|
| 554433 | August, 15 2015 00:00:00 | November, 11 2015 00:00:00 |
select * from ( select apay_id from sample where end_dt is not null order by start_dt desc) where rownum=1
I think this can also work.

SQL Oracle Query self query

I am trying to figure out how to populate the below NULL values with 1.245 for dates from 07-OCT-14 to 29-SEP-14 then from 26-SEP-14 to 28-JUL-14 it will be 1.447.
This means if the date is less than or equal to the given date then use the value of max effective date which is less than the given date
We could select the last available index_ratio value for given security_alias and effective date <=p.effective_date , so in other words we will need to modify the sql to return from the subquery the index ratio value identified for the maximum available effective date assuming that this effective date is less or equal position effective date
How to populate the value ?
select ab.security_alias,
ab.index_ratio,
ab.effective_date
from securitydbo.security_analytics_fi ab
where ab.security_alias = 123627
order by ab.effective_date desc
Below should be the output
Assuming I understand your requirements correctly, I think the analytic function LAST_VALUE() is what you're after. E.g.:
with sample_data as (select 1 id, 10 val, to_date('01/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('02/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('03/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('04/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, 20 val, to_date('05/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, 21 val, to_date('06/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('07/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('08/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, 31 val, to_date('09/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, null val, to_date('10/08/2015', 'dd/mm/yyyy') dt from dual union all
select 1 id, 42 val, to_date('11/08/2015', 'dd/mm/yyyy') dt from dual)
select id,
last_value(val ignore nulls) over (partition by id order by dt) val,
dt
from sample_data
order by id, dt desc;
ID VAL DT
---------- ---------- ----------
1 42 11/08/2015
1 31 10/08/2015
1 31 09/08/2015
1 21 08/08/2015
1 21 07/08/2015
1 21 06/08/2015
1 20 05/08/2015
1 10 04/08/2015
1 10 03/08/2015
1 10 02/08/2015
1 10 01/08/2015