Grouping by Date inclusivity - sql

Here is the data I'm working with:
Accountid  Month
---------  ----------
123        08/01/2021
123        09/01/2021
123        03/01/2022
123        04/01/2022
123        05/01/2022
123        06/01/2022
I'm trying to insert into a new table so that the data looks like this:
Accountid  Start Month  End Month
---------  -----------  ----------
123        08/01/2021   09/01/2021
123        03/01/2022   06/01/2022
I'm not sure how to separate them with the gap, and group by the account id in this case.
Thanks in advance

In 12c+ you may also use match_recognize for gaps-and-islands problems to define grouping rules (islands) in a more readable and natural way.
select *
from input_
match_recognize(
partition by accountid
order by month asc
measures
first(month) as start_month,
last(month) as end_month
/*Any month followed by any number of subsequent months*/
pattern(any_ next*)
define
/*Next is the month right after the previous one*/
next as months_between(month, prev(month)) = 1
)
ACCOUNTID  START_MONTH  END_MONTH
---------  -----------  ----------
123        2021-08-01   2021-09-01
123        2022-03-01   2022-06-01

That's a gaps and islands problem; one option to do it is:
Sample data (note that these rows are consecutive days in January 2021, not the first-of-month dates from the question):
with test (accountid, month) as
  (select 123, date '2021-01-08' from dual union all
   select 123, date '2021-01-09' from dual union all
   select 123, date '2021-01-03' from dual union all
   select 123, date '2021-01-04' from dual union all
   select 123, date '2021-01-05' from dual union all
   select 123, date '2021-01-06' from dual
  ),
Query begins here; the Julian day number minus the row number is constant within each run of consecutive days, so it works as a group key:
temp as
  (select accountid, month,
          to_char(month, 'J') - row_number() over
            (partition by accountid order by month) diff
     from test
  )
select accountid,
       min(month) as start_month,
       max(month) as end_month
  from temp
 group by accountid, diff
 order by accountid, start_month;
 ACCOUNTID START_MONT END_MONTH
---------- ---------- ----------
       123 03/01/2021 06/01/2021
       123 08/01/2021 09/01/2021
Although related to MS SQL Server, have a look at Introduction to Gaps and Islands Analysis; should be interesting reading for you, I presume.
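The sample data in the answer above is made of consecutive days, while the question has one row per month. Here is a minimal sketch of the same row_number() idea for monthly data, assuming every month value is the first day of its month. The CTE names test and temp are reused from above and grp is just an illustrative alias. Moving each date back by its row number in months with ADD_MONTHS lands every row of an unbroken run on the same anchor date, which can then be grouped on.

```sql
-- Sketch only: assumes MONTH always holds the first day of a month.
with test (accountid, month) as
  (select 123, date '2021-08-01' from dual union all
   select 123, date '2021-09-01' from dual union all
   select 123, date '2022-03-01' from dual union all
   select 123, date '2022-04-01' from dual union all
   select 123, date '2022-05-01' from dual union all
   select 123, date '2022-06-01' from dual
  ),
temp as
  (select accountid, month,
          -- shift each month back by its position; consecutive months
          -- within one account end up on the same anchor date
          add_months(month,
                     -row_number() over
                        (partition by accountid order by month)) grp
     from test
  )
select accountid,
       min(month) as start_month,
       max(month) as end_month
  from temp
 group by accountid, grp
 order by accountid, start_month;
```

For the six rows from the question this should give the two islands asked for (08/2021 to 09/2021 and 03/2022 to 06/2022). The final select can then be used as the source of an INSERT ... SELECT into the new table.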

Related

I want to get the count of roll number for each month

ID    STUDENT_ID    STATUS_DATE
----  ------------  -----------
1002  434120010026  25-FEB-22
1000  434120010026  03-MAY-03
1001  434120010026  25-FEB-22
1020  434120020023  18-MAR-22
1021  434120020025  18-MAR-22
1022  434120020025  16-MAR-22
Tried this
select count(*),
trunc(status_date, 'mm')
from test_studentattendance
group by trunc(status_date, 'mm');
This returned the number of rows in each month, not the number of roll numbers.
COUNT(*)  TRUNC(STATUS_DATE,'MM')
--------  -----------------------
1         01-MAY-03
2         01-FEB-22
3         01-MAR-22
COUNT(*) is counting the number of rows in each group and is counting students that appear in the same month multiple times.
To get the number on roll each month, you need to COUNT the DISTINCT identifier for each student (which, I assume would be student_id):
SELECT COUNT(DISTINCT student_id) AS number_of_students,
TRUNC(status_date, 'mm') AS month
FROM test_studentattendance
GROUP BY TRUNC(status_date, 'mm');
Which, for your sample data:
CREATE TABLE test_studentattendance (ID, STUDENT_ID, STATUS_DATE) AS
SELECT 1002, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1000, 434120010026, DATE '2003-05-03' FROM DUAL UNION ALL
SELECT 1001, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1020, 434120020023, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1021, 434120020025, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1022, 434120020025, DATE '2022-03-16' FROM DUAL;
Outputs:
NUMBER_OF_STUDENTS  MONTH
------------------  -------------------
1                   2022-02-01 00:00:00
1                   2003-05-01 00:00:00
2                   2022-03-01 00:00:00

sql query to get the salary from past 3 years

I have a table salary that has salary details for an employee for various years like:
person_number date_from date_to salary RN
---------------------------------------------------------------------------
272 03-Mar-2022 31-dec-4712 109000 1
272 05-Mar-2021 02-Mar-2022 100000 1
272 10-Mar-2020 04-Mar-2020 100000 1
10 10-Mar-2019 31-dec-4712 4678 1
I want to get the latest salary for past 2 years along with current year. I created the below query for the same -
SELECT *
FROM
(SELECT
person_number,
sal.salary_amount,
ROW_NUMBER() OVER (PARTITION BY person_number
ORDER BY date_to DESC) rn,
sal.date_from
FROM
cmp_salary sal
WHERE
1 = 1
AND TO_CHAR(sal.date_from, 'YYYY') = (SELECT TO_CHAR(Add_months(SYSDATE, -12), 'yyyy')
FROM dual)
)
WHERE
rn = 1
I have created a query like the one above for each of the 3 years (just replacing the 12 with 24 and 36).
These separate queries return the correct data going back 2 years, i.e. to 2020. The only case where they do not work is when the same salary row spans more than one year.
E.g. for person #10, the salary has been the same from 2019 through 2021 and 2022.
Because I am comparing on the year of date_from in the query above, it will not return any output for 2020, 2021 and 2022, because the date_from is in 2019. Ideally it should give me the same salary for 2020, 2021 and 2022.
How can I tweak this? At the moment I have to create 3 separate queries for it.
I don't think you need 3 separate queries; one would suffice. Also, according to data you posted, there's only one row per year per person so it is a "fixed" 3 years back. Therefore, I'd think of something like this instead:
Sample data:
WITH
  cmp_salary (person_number,
              date_from,
              date_to,
              salary_amount)
  AS
    (SELECT 272, DATE '2022-03-03', DATE '4712-12-31', 109000 FROM DUAL
     UNION ALL
     SELECT 272, DATE '2021-03-05', DATE '2022-03-02', 100000 FROM DUAL
     UNION ALL
     SELECT 272, DATE '2020-03-10', DATE '2020-03-04', 100000 FROM DUAL
     UNION ALL
     SELECT 10, DATE '2019-03-10', DATE '4712-12-31', 4678 FROM DUAL)
Query you might be interested in begins here:
SELECT person_number, sal.salary_amount, sal.date_from
  FROM cmp_salary sal
       CROSS JOIN
       TABLE (CAST (MULTISET (SELECT LEVEL
                                FROM DUAL
                              CONNECT BY LEVEL <= 3) AS SYS.odcinumberlist))
 WHERE EXTRACT (YEAR FROM sal.date_from) =
       EXTRACT (YEAR FROM SYSDATE) - COLUMN_VALUE + 1
ORDER BY person_number, date_from DESC;
PERSON_NUMBER SALARY_AMOUNT DATE_FROM
------------- ------------- ----------
          272        109000 03.03.2022
          272        100000 05.03.2021
          272        100000 10.03.2020
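The person #10 case from the question (one salary row that started in 2019 and is still open) is not matched by comparing on the year of date_from. One possible way to cover it, shown here only as a rough sketch, is to test whether the salary row's date_from/date_to range overlaps each of the three calendar years and then keep the latest overlapping row per person and year. The names years, yr_start, next_yr_start and salary_year are made up for the illustration; cmp_salary and its columns are taken from the question.

```sql
-- Sketch only: assumes cmp_salary(person_number, date_from, date_to,
-- salary_amount) as in the question, with open-ended rows using a far-future date_to.
WITH years (yr_start, next_yr_start) AS
  (SELECT ADD_MONTHS(TRUNC(SYSDATE, 'YYYY'), -12 * (LEVEL - 1)),
          ADD_MONTHS(TRUNC(SYSDATE, 'YYYY'), -12 * (LEVEL - 2))
     FROM dual
   CONNECT BY LEVEL <= 3)          -- current year and the two before it
SELECT person_number, salary_year, salary_amount, date_from
  FROM (SELECT s.person_number,
               EXTRACT(YEAR FROM y.yr_start) AS salary_year,
               s.salary_amount,
               s.date_from,
               -- latest salary row touching that calendar year comes first
               ROW_NUMBER() OVER (PARTITION BY s.person_number, y.yr_start
                                  ORDER BY s.date_from DESC) AS rn
          FROM years y
          JOIN cmp_salary s
            ON s.date_from < y.next_yr_start   -- started before the year ended
           AND s.date_to  >= y.yr_start)       -- still valid when the year began
 WHERE rn = 1
 ORDER BY person_number, salary_year DESC;
```

Because the year list is derived from SYSDATE, the rows returned depend on when the query runs. A row such as person #10's, whose range covers all three years, would be returned once per year with the same salary.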

How to loop in Oracle SQL

Assume we have exampleDB and select contractdate like
SELECT DB.contractdate
FROM exampleDB DB
contractdate
2014/12/1
2015/12/1
2016/12/1
2017/12/1
2018/12/1
2019/12/1
I would like to count the policy number at each time like
each time policy count
2014/1/1 0
2015/1/1 1
2016/1/1 2
2017/1/1 3
2018/1/1 4
2019/1/1 5
I tried
WHERE DB.contractdate <='2014/1/1';
But then I must repeat that code manually for every date.
How can I loop?
If the bins are monthly, doing this by hand is a very tedious process.
Can the queries be combined into one?
Best regards
select contractdate as "each time",
count(*) as "policy count"
from exampleDB
where contractdate in (date '2014-12-01', date '2015-12-01') -- list all the dates you want here
group by contractdate
Hope this will help you.
You can use row_number() and trunc() to get the first day of the month:
SELECT TRUNC(DB.contractdate, 'MONTH'), row_number() over (order by DB.contractdate) -1 as policy_count
FROM exampleDB DB
You can use the COUNT analytic function with a RANGE window as follows:
with dataa(contractdate) as
  (
    select date '2014-12-01' from dual union all
    select date '2015-12-01' from dual union all
    select date '2016-12-01' from dual union all
    select date '2017-12-01' from dual union all
    select date '2018-12-01' from dual union all
    select date '2019-12-01' from dual
  )
SELECT
    TRUNC(CONTRACTDATE, 'year') as "each time",
    COUNT(1) OVER(
        ORDER BY
            CONTRACTDATE DESC
        RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING
    ) as "policy count"
FROM
    DATAA order by 1;
each time policy count
--------- ------------
01-JAN-14            0
01-JAN-15            1
01-JAN-16            2
01-JAN-17            3
01-JAN-18            4
01-JAN-19            5
6 rows selected.
Cheers!!
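If the bins need to be arbitrary dates (for example every month) rather than something derived from the contract dates themselves, another option is to generate the bin dates in a subquery and join them to the table. This is only a sketch: bins and bin_date are made-up names, the start date and number of bins are assumptions to adjust, and exampleDB.contractdate is assumed to be a DATE column.

```sql
-- Sketch only: six yearly bins starting 2014-01-01.
-- For monthly bins, use ADD_MONTHS(..., LEVEL - 1) and a larger LEVEL limit.
WITH bins (bin_date) AS
  (SELECT ADD_MONTHS(DATE '2014-01-01', 12 * (LEVEL - 1))
     FROM dual
   CONNECT BY LEVEL <= 6)
SELECT b.bin_date            AS "each time",
       COUNT(d.contractdate) AS "policy count"  -- counts only matched contracts
  FROM bins b
  LEFT JOIN exampleDB d
    ON d.contractdate <= b.bin_date             -- contracts signed on or before the bin date
 GROUP BY b.bin_date
 ORDER BY b.bin_date;
```

The LEFT JOIN keeps bins that have no contracts yet, so the first bin shows a count of 0 instead of disappearing from the result.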

Finding missing dates in a sequence

I have following table with ID and DATE
ID DATE
123 7/1/2015
123 6/1/2015
123 5/1/2015
123 4/1/2015
123 9/1/2014
123 8/1/2014
123 7/1/2014
123 6/1/2014
456 11/1/2014
456 10/1/2014
456 9/1/2014
456 8/1/2014
456 5/1/2014
456 4/1/2014
456 3/1/2014
789 9/1/2014
789 8/1/2014
789 7/1/2014
789 6/1/2014
789 5/1/2014
789 4/1/2014
789 3/1/2014
In this table I have three customer IDs (123, 456 and 789) and a date column that shows which months they worked.
I want to find out which of the customers have a gap in their work.
Our customers' work records are kept per month, so the dates are monthly,
and each customer has different start and end dates.
Expected results:
ID First_Absent_date
123 10/01/2014
456 06/01/2014
To get a simple list of the IDs with gaps, with no further details, you need to look at each ID separately, and as #mikey suggested you can count the number of months and look at the first and last date to see how many months that spans.
If your table has a column called month (since date isn't allowed unless it's a quoted identifier) you could start with:
select id, count(month), min(month), max(month),
months_between(max(month), min(month)) + 1 as diff
from your_table
group by id
order by id;
ID COUNT(MONTH) MIN(MONTH) MAX(MONTH) DIFF
---------- ------------ ---------- ---------- ----------
123 8 01-JUN-14 01-JUL-15 14
456 7 01-MAR-14 01-NOV-14 9
789 7 01-MAR-14 01-SEP-14 7
Then compare the count with the month span, in a having clause:
select id
from your_table
group by id
having count(month) != months_between(max(month), min(month)) + 1
order by id;
ID
----------
123
456
If you can actually have multiple records in a month for an ID, and/or the date recorded might not be the start of the month, you can do a bit more work to normalise the dates:
select id,
count(distinct trunc(month, 'MM')),
min(trunc(month, 'MM')),
max(trunc(month, 'MM')),
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1 as diff
from your_table
group by id
order by id;
select id
from your_table
group by id
having count(distinct trunc(month, 'MM')) !=
months_between(max(trunc(month, 'MM')), min(trunc(month, 'MM'))) + 1
order by id;
Oracle Setup:
CREATE TABLE your_table ( ID, "DATE" ) AS
SELECT 123, DATE '2015-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-06-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-05-01' FROM DUAL UNION ALL
SELECT 123, DATE '2015-04-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 123, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-11-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-10-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 456, DATE '2014-03-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-09-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-08-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-07-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-06-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-05-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-04-01' FROM DUAL UNION ALL
SELECT 789, DATE '2014-03-01' FROM DUAL;
Query:
SELECT ID,
MIN( missing_date )
FROM (
SELECT ID,
CASE WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
= ADD_MONTHS( "DATE", 1 ) THEN NULL
WHEN LEAD( "DATE" ) OVER ( PARTITION BY ID ORDER BY "DATE" )
IS NULL THEN NULL
ELSE ADD_MONTHS( "DATE", 1 )
END AS missing_date
FROM your_table
)
GROUP BY ID
HAVING COUNT( missing_date ) > 0;
Output:
ID MIN(MISSING_DATE)
---------- -------------------
123 2014-10-01 00:00:00
456 2014-06-01 00:00:00
You could use the Lag() function to see whether records have been skipped for a particular date. Lag() basically lets you compare the data in the current row with the previous row, so if we order by DATE we can easily compare and find any gaps.
select * from
(
select ID,DATE_, case when DATE_DIFF>1 then 1 else 0 end comparison from
(
select ID, DATE_ ,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
This groups all the entries by id, and then arranges the records by date. If a customer is always present, there would not be a gap in his date. So anyone who has a date difference greater than 1 had a gap. You could tweak this as per your requirement.
EDIT: Looking more closely at the other answers, I noticed that you are storing the data in mm/dd/yyyy format and only store the first day of every month. So the above query can be tweaked as:
select * from
(
select ID,DATE_,PREV_DATE,last_day(PREV_DATE)+1 ABSENT_DATE, case when DATE_DIFF>31 then 1 else 0 end comparison from
(
select ID, DATE_ ,LAG(DATE_,1) OVER (PARTITION BY ID ORDER BY DATE_) PREV_DATE,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
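As a variation on the Lag() approach above, the day-count threshold can be replaced with MONTHS_BETWEEN, which also makes it easy to return just the first missing month per ID, as in the expected results. This is a rough sketch that reuses the trial table and DATE_ column from the queries above and assumes every date is the first day of a month; prev_date and first_absent_date are illustrative aliases.

```sql
-- Sketch only: assumes trial(ID, DATE_) with first-of-month dates.
SELECT id,
       MIN(ADD_MONTHS(prev_date, 1)) AS first_absent_date
  FROM (SELECT id,
               date_,
               -- previous month worked by the same customer, NULL for the first row
               LAG(date_) OVER (PARTITION BY id ORDER BY date_) AS prev_date
          FROM trial)
 WHERE MONTHS_BETWEEN(date_, prev_date) > 1   -- more than one month apart means a gap
 GROUP BY id
 ORDER BY id;
```

The first row of each ID has a NULL prev_date, so the comparison is not true and the row is filtered out. An ID with no gaps, such as 789 in the sample data, returns no row at all.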

oracle count based on month

I am attempting to write an Oracle SQL query.
I am looking for a solution to something like the following. Here is the data I have:
start_date end_date customer
01-01-2012 31-06-2012 a
01-01-2012 31-01-2012 b
01-02-2012 31-03-2012 c
I want the count of customers in each month of that date period. My result should look like this:
Month : Customer Count
JAN-12 : 2
FEB-12 : 2
MAR-12 : 2
APR-12 : 1
MAY-12 : 1
JUN-12 : 1
One option would be to generate the months separately in another query and join that to your data table (note that I'm assuming that you intended customer A to have an end-date of June 30, 2012 since there is no June 31).
with mnths as(
  select add_months( date '2012-01-01', level - 1 ) mnth
    from dual
 connect by level <= 6 ),
data as (
  select date '2012-01-01' start_date, date '2012-06-30' end_date, 'a' customer from dual union all
  select date '2012-01-01', date '2012-01-31', 'b' from dual union all
  select date '2012-02-01', date '2012-03-31', 'c' from dual
)
select mnths.mnth, count(*)
  from data,
       mnths
 where mnths.mnth between data.start_date and data.end_date
 group by mnths.mnth
 order by mnths.mnth;
MNTH COUNT(*)
--------- ----------
01-JAN-12 2
01-FEB-12 2
01-MAR-12 2
01-APR-12 1
01-MAY-12 1
01-JUN-12 1
6 rows selected.
WITH TMP(monthyear,start_date,end_date,customer) AS (
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from data
union all
select LAST_DAY(start_date),
CAST(ADD_MONTHS(start_date, 1) AS DATE),
end_date,
customer
from TMP
where LAST_DAY(end_date) >= LAST_DAY(start_date)
)
SELECT TO_CHAR(MonthYear, 'MON-YY') TheMonth,
Count(Customer) Customers
FROM TMP
GROUP BY MonthYear
ORDER BY MonthYear;