Date Range Report - Aggregation - sql

I'm having trouble writing a typical report-style SQL query and am hoping someone with more experience might be able to help.
I have the following tables
products
    product_id
    product_name
    product_category
product_defects
    product_id
    defect_date
    high_priority
    med_priority
    low_priority
calendar
    date
What I want is a report that shows the number of high / medium / low defects for each product category on each day. If product_defects has no data for a particular day, the counts should be returned as 0. Example:
product_category | date | high | medium | low
1 2012-10-01 1 5 6
2 2012-10-01 2 4 3
3 2012-10-01 1 5 6
1 2012-10-02 0 0 0
2 2012-10-02 2 4 3
3 2012-10-02 1 5 6
…
What I’ve done so far is:
Create a lookup table called calendar which has a series of days in it going back/forward several years
Right joined the lookup/product_defects table to get a series of dates so missing days can be marked as 0
Used COALESCE and SUM to calculate totals and change any missing data to 0
Used MIN / MAX on the defect_date to get the exact report range
I've banged my head on this for a few days now, hoping someone can help.
Thank you

You need to start with all combinations of products and dates, and then join in the defects:
select p.product_category, c.date,
coalesce(SUM(pd.high_priority), 0) as high_priority,
coalesce(SUM(pd.med_priority), 0) as med_priority,
coalesce(SUM(pd.low_priority), 0) as low_priority
from products p cross join
calendar c left outer join
product_defects pd
on pd.product_id = p.product_id and
pd.defect_date = c.date
group by p.product_category, c.date
order by 2, 1
(Note: this is untested, so may have syntax errors.)
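As a rough, self-contained illustration of the cross-join-then-left-join idea, here is a sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows, the use of SQLite rather than the asker's database, and storing dates as ISO text are all assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two categories, two calendar days,
# and no defects for category 1 on 2012-10-02.
conn.executescript("""
    CREATE TABLE products (product_id INTEGER, product_name TEXT, product_category INTEGER);
    CREATE TABLE product_defects (product_id INTEGER, defect_date TEXT,
                                  high_priority INTEGER, med_priority INTEGER,
                                  low_priority INTEGER);
    CREATE TABLE calendar (date TEXT);
    INSERT INTO products VALUES (1, 'widget', 1), (2, 'gadget', 2);
    INSERT INTO calendar VALUES ('2012-10-01'), ('2012-10-02');
    INSERT INTO product_defects VALUES
        (1, '2012-10-01', 1, 5, 6),
        (2, '2012-10-01', 2, 4, 3),
        (2, '2012-10-02', 2, 4, 3);
""")

# Every (product, day) pair exists first; defects are attached only where present.
rows = conn.execute("""
    SELECT p.product_category, c.date,
           COALESCE(SUM(pd.high_priority), 0),
           COALESCE(SUM(pd.med_priority), 0),
           COALESCE(SUM(pd.low_priority), 0)
    FROM products p
    CROSS JOIN calendar c
    LEFT JOIN product_defects pd
           ON pd.product_id = p.product_id AND pd.defect_date = c.date
    GROUP BY p.product_category, c.date
    ORDER BY c.date, p.product_category
""").fetchall()

for row in rows:
    print(row)
```

Category 1 on 2012-10-02 comes back with zeros because the cross join produced the row before any defect data was looked up.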

Something like this. I generated the dates from dual instead of using a calendar table; you can keep that or substitute your calendar table.
The example uses a start date of 01-Jan-2013 through to 15-Jan-2013.
with dates as (select to_date('01/01/2013', 'dd/mm/yyyy') + rownum - 1 dte
from dual
connect by level <= to_date('15/01/2013', 'dd/mm/yyyy')
- to_date('01/01/2013', 'dd/mm/yyyy') + 1)
select dt.dte, p.product_id, p.product_name,
sum(d.high_priority), sum(d.med_priority), sum(d.low_priority)
from products p
inner join product_defects d
on d.product_id = p.product_id
right outer join dates dt
on dt.dte = d.defect_date -- trunc(d.defect_date) if you store with a time element.
group by dt.dte, p.product_id, p.product_name
order by dt.dte;

So this uses a subquery factoring clause (a CTE) to aggregate all the defects for each category for each day. I used this construct to make the logic clearer; there are other ways to do it. The subquery is then outer-joined to the calendar table.
with cte as
( select p.product_category
, pd.defect_date
, sum(pd.high_priority) as high_priority
, sum(pd.med_priority) as med_priority
, sum(pd.low_priority) as low_priority
from products p
join product_defects pd
on (pd.product_id = p.product_id )
group by p.product_category
, pd.defect_date )
select cte.product_category
, cal.date
, nvl(cte.high_priority, 0) as high_priority
, nvl(cte.med_priority, 0) as med_priority
, nvl(cte.low_priority, 0) as low_priority
from calendar cal
left outer join cte
on cal.date = cte.defect_date
order by cte.product_category
, cal.date

Calendar table example only. Increase the number of months back from -24 (2 years) to any number - copy/paste the code:
-- 2 years back by date --
SELECT TRUNC(SYSDATE, 'YEAR') - LEVEL AS mydate
FROM dual
CONNECT BY LEVEL <= TRUNC(SYSDATE, 'yy') - TRUNC(Add_Months(SYSDATE, -24), 'yy')
/
Add more dates:
-- 2 years back by date and week --
SELECT mydate
, TRUNC(mydate, 'iw') wk_starts
, TRUNC(mydate, 'iw') + 7 - 1/86400 wk_ends
, TO_NUMBER (TO_CHAR (mydate, 'IW')) ISO_wk#
FROM
(
SELECT TRUNC(SYSDATE, 'YEAR') - LEVEL AS mydate
FROM dual
CONNECT BY LEVEL <= TRUNC(SYSDATE, 'yy') - TRUNC(Add_Months(SYSDATE, -24), 'yy')
)
/
Post create table and inserts scripts to answer the rest of your questions or use sqlfiddle...

Related

How to get current employee counts as of every date in the last 5 years

--PL/SQL
I want two resulting columns. One column is every date in the last 5 years and the other column is the employee count as of each date.
I have a query below where you enter a date as a parameter and it tells you how many people are employed on that date but I don't know how to extrapolate it to achieve my goal above. Achieving my goal would make the parameter unnecessary so I am looking to get rid of it.
Please help, I'm stumped!
select count(person_id), :EFF_DATE
from
(
select paa.person_id
from apps.per_all_assignments_f paa --employee assignments
,apps.per_assignment_status_types past --assignment statuses
where paa.assignment_status_type_id = past.assignment_status_type_id
and past.user_status in ('Active Assignment','Transitional - Active','Transitional - Inactive','Sabbatical','Sabbatical 50%')
and :EFF_DATE between paa.effective_start_date and paa.effective_end_date
group by paa.person_id
)
If I understood you correctly, you need to create a calendar which contains all dates in last 5 years, and then join it to tables you currently use in that query.
Something like this (untested, as I don't have your tables nor data):
WITH
calendar (datum)
AS
-- last 5 years
( SELECT TRUNC (SYSDATE) - LEVEL + 1
FROM DUAL
CONNECT BY LEVEL <=
TRUNC (SYSDATE) - ADD_MONTHS (TRUNC (SYSDATE), -12 * 5))
SELECT c.datum, COUNT (DISTINCT paa.person_id) cnt
FROM calendar c
JOIN apps.per_all_assignments_f paa
ON c.datum BETWEEN paa.effective_start_date
AND paa.effective_end_date
JOIN apps.per_assignment_status_types past
ON past.assignment_status_type_id = paa.assignment_status_type_id
WHERE past.user_status IN ('Active Assignment',
'Transitional - Active',
'Transitional - Inactive',
'Sabbatical',
'Sabbatical 50%')
GROUP BY c.datum
ORDER BY c.datum;
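To make the "calendar joined on BETWEEN" idea concrete, here is a small sketch using an in-memory SQLite database from Python. The assignments table, its rows, and the six-day calendar are hypothetical stand-ins for the real tables. It also uses a LEFT JOIN rather than the inner join above, so that days with nobody employed still appear with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical date-tracked assignments; person 2 has two overlapping rows,
# which COUNT(DISTINCT ...) must count only once.
conn.executescript("""
    CREATE TABLE assignments (person_id INTEGER,
                              effective_start_date TEXT,
                              effective_end_date TEXT);
    INSERT INTO assignments VALUES
        (1, '2024-01-01', '2024-01-03'),
        (2, '2024-01-02', '2024-01-05'),
        (2, '2024-01-02', '2024-01-05');
""")

headcount = conn.execute("""
    WITH RECURSIVE calendar(datum) AS (
        SELECT '2024-01-01'
        UNION ALL
        SELECT date(datum, '+1 day') FROM calendar WHERE datum < '2024-01-06'
    )
    SELECT c.datum, COUNT(DISTINCT a.person_id)
    FROM calendar c
    LEFT JOIN assignments a
           ON c.datum BETWEEN a.effective_start_date AND a.effective_end_date
    GROUP BY c.datum
    ORDER BY c.datum
""").fetchall()

for day, cnt in headcount:
    print(day, cnt)
```

Each calendar row is matched against every assignment whose effective range covers that day, so the :EFF_DATE parameter is no longer needed.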

Show all results in date range replacing null records with zero

I am querying the counts of logs that appear on particular days. However on some days, no log records I'm searching for are recorded. How can I set count to 0 for these days and return a result with the full set of dates in a date range?
SELECT r.LogCreateDate, r.Docs
FROM(
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs, to_char(l.log_create_date,'YYYY-MM-DD') AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
HAVING to_char(l.log_create_date , 'YYYY-MM-DD') BETWEEN '2020-01-01' AND '2020-01-07'
GROUP BY to_char(l.log_create_date,'YYYY-MM-DD')
ORDER BY to_char(l.log_create_date,'YYYY-MM-DD') DESC
) r
ORDER BY r.logcreatedate
Current result. I'd like to include the 01, 04, and 05 with 0 docs.
LOGCREATEDATE | Docs
2020-01-02 7
2020-01-03 3
2020-01-06 6
2020-01-07 1
You need a full list of dates first, then outer join the log data to that. There are several ways to generate the list of dates, but a recursive common table expression (CTE) is an ANSI-standard way to do it, like so:
with cte (dt) as (
select to_date('2020-01-01','yyyy-mm-dd') as dt from dual -- start date here
union all
select dt + 1 from cte
where dt + 1 < to_date('2020-02-01','yyyy-mm-dd') -- finish (the day before) date here
)
select to_char(cte.dt,'yyyy-mm-dd') as LogCreateDate
, coalesce(r.Docs, 0) as Docs -- days with no log rows come back as 0
from cte
left join (
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs
, trunc(l.log_create_date) AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
WHERE trunc(l.log_create_date) BETWEEN to_date('2020-01-01','yyyy-mm-dd') AND to_date('2020-01-07','yyyy-mm-dd')
GROUP BY trunc(l.log_create_date)
) r on cte.dt = r.LogCreateDate
order by cte.dt
Also, when dealing with dates I prefer not to convert them to strings until final output, which preserves proper date ordering and lets the query run as efficiently as possible.
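For readers without an Oracle instance to hand, the same pattern (generate the dates with a recursive CTE, left join the aggregated log data, coalesce the gaps) can be sketched against in-memory SQLite from Python. The logs table and its rows are invented for the demo, and SQLite's date() stands in for Oracle's trunc().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified log table: timestamps with a time element,
# and a numeric document count per log row.
conn.executescript("""
    CREATE TABLE logs (log_create_date TEXT, docs INTEGER);
    INSERT INTO logs VALUES
        ('2020-01-02 09:15:00', 4),
        ('2020-01-02 17:40:00', 3),
        ('2020-01-03 08:00:00', 3);
""")

daily = conn.execute("""
    WITH RECURSIVE cte(dt) AS (
        SELECT '2020-01-01'                  -- start date
        UNION ALL
        SELECT date(dt, '+1 day') FROM cte
        WHERE dt < '2020-01-04'              -- last date to generate
    )
    SELECT cte.dt, COALESCE(r.docs, 0)
    FROM cte
    LEFT JOIN (
        SELECT date(log_create_date) AS log_day, SUM(docs) AS docs
        FROM logs
        GROUP BY date(log_create_date)
    ) r ON r.log_day = cte.dt
    ORDER BY cte.dt
""").fetchall()

for day, docs in daily:
    print(day, docs)
```

The aggregation happens inside the subquery, so the outer join only has to match one row per day and fill in 0 where nothing matched.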

Filter customers with at least 3 transactions a year for the past 2 years Presto/SQL

I have a table of customer transactions called cust_trans where each transaction made by a customer is stored as one row. A column called visit_date contains the transaction date. I would like to filter the customers who transact at least 3 times a year for the past 2 years.
The data looks like below
Id visit_date
---- ------
1 01/01/2019
1 01/02/2019
1 01/01/2019
1 02/01/2020
1 02/01/2020
1 03/01/2020
1 03/01/2020
2 01/02/2019
3 02/04/2019
I would like to know the customers who visited at least 3 times every year for the past two years,
i.e. I want the output below.
id
---
1
In this table, only one customer visited at least 3 times in each of the 2 years.
I tried the query below, but it only checks whether the total number of visits is greater than or equal to 3:
select id
from
cust_scan
GROUP by
id
having count(visit_date) >= 3
and year(date(max(visit_date)))-year(date(min(visit_date))) >=2
I would appreciate any help, guidance or suggestions
One option would be to generate a list of distinct ids, cross join it with the last two years, and then bring in the original table with a left join. You can then aggregate to count how many visits each id had each year. The final step is to aggregate again, and filter with a having clause:
select t.id
from (
select i.id, y.yr, count(c.id) cnt
from (select distinct id from cust_scan) i
cross join (values
(date_trunc('year', current_date)),
(date_trunc('year', current_date) - interval '1' year)
) as y(yr)
left join cust_scan c
on i.id = c.id
and c.visit_date >= y.yr
and c.visit_date < y.yr + interval '1' year
group by i.id, y.yr
) t
group by t.id
having min(cnt) >= 3
Another option would be to use two correlated subqueries:
select distinct id
from cust_scan c
where
(
select count(*)
from cust_scan c1
where
c1.id = c.id
and c1.visit_date >= date_trunc('year', current_date)
and c1.visit_date < date_trunc('year', current_date) + interval '1' year
) >= 3
and (
select count(*)
from cust_scan c1
where
c1.id = c.id
and c1.visit_date >= date_trunc('year', current_date) - interval '1' year
and c1.visit_date < date_trunc('year', current_date)
) >= 3
I assume you mean calendar years. I think I would use two levels of aggregation:
select ct.id
from (select ct.id, year(visit_date) as yyyy, count(*) as cnt
from cust_trans ct
where ct.visit_date >= '2019-01-01' -- or whatever
group by ct.id, year(visit_date) -- one row per customer per year
) ct
group by ct.id
having count(*) = 2 and -- both years
min(cnt) >= 3; -- at least three transactions
If you want the last two complete years, just change the where clause in the subquery.
You can use a similar idea -- of two aggregations -- if you want the last two years relative to the current date. That would be two full years, rather than 1 and some fraction of the current year.
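The two-level aggregation can be tried out with a short sketch against in-memory SQLite from Python. The rows mirror the question's sample data rewritten as ISO dates, customer 4 is an invented extra who falls short in the second year, the years 2019 and 2020 are hard-coded as an assumption, and strftime('%Y', ...) stands in for Presto's year().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust_trans (id INTEGER, visit_date TEXT)")
conn.executemany(
    "INSERT INTO cust_trans VALUES (?, ?)",
    [
        (1, "2019-01-01"), (1, "2019-01-02"), (1, "2019-01-01"),
        (1, "2020-02-01"), (1, "2020-02-01"), (1, "2020-03-01"), (1, "2020-03-01"),
        (2, "2019-01-02"),
        (3, "2019-02-04"),
        # Hypothetical customer: 3 visits in 2019 but only 2 in 2020.
        (4, "2019-05-01"), (4, "2019-05-02"), (4, "2019-05-03"),
        (4, "2020-05-01"), (4, "2020-05-02"),
    ],
)

qualifying = conn.execute("""
    SELECT id
    FROM (
        -- First level: one row per customer per calendar year.
        SELECT id, strftime('%Y', visit_date) AS yyyy, COUNT(*) AS cnt
        FROM cust_trans
        WHERE visit_date >= '2019-01-01' AND visit_date < '2021-01-01'
        GROUP BY id, strftime('%Y', visit_date)
    ) per_year
    -- Second level: require both years, each with at least 3 visits.
    GROUP BY id
    HAVING COUNT(*) = 2 AND MIN(cnt) >= 3
    ORDER BY id
""").fetchall()

print(qualifying)
```

The COUNT(*) = 2 check matters because a customer with no visits at all in one year produces no row for that year, so MIN(cnt) alone would not catch them.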

How to fill value as zero when No data exists for particular week in oracle

I have a table with following structure.
Note_title varchar2(100)
Note_created_on date
Now in a report, I want to show all notes created week-wise, So I implemented the following solution for it.
SELECT to_char(Note_created_on - 7/24,'ww')||'/'||to_char(Note_created_on - 7/24,'yyyy') as Week ,
nvl(COUNT(Note_title),'0') as AMOUNT
FROM Notes
GROUP BY to_char(Note_created_on - 7/24,'ww') ,
to_char(Note_created_on -7/24,'yyyy')
ORDER BY to_char(Note_created_on - 7/24,'ww') DESC
I am getting correct output from it, but if, say, weeks 42 and 45 do not have any notes created, they are simply missing from the result.
Sample Output:
WEEK AMOUNT
46/2018 3
44/2018 22
43/2018 45
41/2018 1
40/2018 2
39/2018 27
38/2018 23
So how can I get zero values for weeks 42 and 45 instead of leaving them out?
First you would need to generate all the weeks for each year, then left join that list to the Notes table on the week and group by the generated weeks. E.g.:
with weeks
as ( select level as lvl /*Assume 52 weeks in a calendar year..*/
from dual
connect by level <=52
)
,weeks_year
as (select distinct
lpad(b.lvl,2,'0')||'/'||to_char(a.Note_created_on - 7/24,'yyyy') as week_year_val /*Zero-padded week plus the year of Note_created_on, to match the join format*/
from Notes a
join weeks b
on 1=1
)
SELECT a.week_year_val as Week
,COUNT(Note_title) as AMOUNT
FROM weeks_year a
LEFT JOIN Notes b
ON a.week_year_val=to_char(b.Note_created_on - 7/24,'ww')||'/'||to_char(b.Note_created_on - 7/24,'yyyy')
GROUP BY a.week_year_val
ORDER BY a.week_year_val DESC
If you want to perform this for the current year, you may use the following SQL statement, which uses a RIGHT JOIN:
SELECT d.week as Week,
nvl(COUNT(Note_title), '0') as AMOUNT
FROM Notes
RIGHT JOIN
(SELECT lpad(level,2,'0')|| '/' ||to_char(sysdate,'yyyy') as week,
'0' as amount FROM dual CONNECT BY level <= 53) d
ON
( d.week =
to_char(Note_created_on - 7 / 24, 'ww') ||'/'||to_char(Note_created_on - 7 / 24, 'yyyy') )
GROUP BY d.week
ORDER BY d.week DESC;
P.S. There's a common belief that a year is composed of 52 weeks; that's true, but truncated :). So I used 53.
Notice that select to_char( date'2016-12-31' - 7 / 24, 'ww') from dual yields 53, for example.
As mentioned by jarlh:
Create a list of weeks:
SELECT TO_CHAR(LEVEL, 'FM00')||'/2018' wk
FROM dual
CONNECT BY LEVEL <= 53
This query generates 53 rows, and level is just a number: 1, 2, ... up to 53. We format it to become 01/2018, 02/2018, ... 53/2018.
If you plan to use this query in other years, you'd be better off making the year dynamic:
SELECT TO_CHAR(LEVEL, 'FM00')||TO_CHAR(sysdate-7/24,'/YYYY') wk
FROM dual
CONNECT BY LEVEL <= 53
(Credits to Barbaros for pointing out that the last day of any year is reported by Oracle as being in week 53, or said another way 7*52 = 364)
We left join the notes data onto it. I wasn't really clear on why you subtracted 7 hours from the date (time zone?) but I left it. I removed the complexity of the count, as you seem to only want the count of records in a particular week. I also removed the double to_char, because you can do it all in a single operation. You don't need TO_CHAR(date, 'WW')||'/'||TO_CHAR(date,'YYYY'); a single TO_CHAR with WW/YYYY as the format does it. Our query now looks like:
SELECT lst.wk as week, COALESCE(amt, 0) as amount FROM
(
SELECT TO_CHAR(LEVEL, 'FM00')||TO_CHAR(sysdate-7/24,'/YYYY') wk
FROM dual
CONNECT BY LEVEL <= 53
) lst
LEFT OUTER JOIN
(
SELECT
to_char(Note_created_on - 7/24,'ww/yyyy') as wk,
COUNT(*) as amt
FROM Notes
GROUP BY to_char(Note_created_on - 7/24,'ww/yyyy')
) dat
ON lst.wk = dat.wk
ORDER BY lst.wk
For weeks where there are no notes, the left join records a null against that week, so we coalesce it to make it 0.
You can, of course, write the query in other ways (many ways); here's one for comparison:
SELECT lst.wk as week, COUNT(dat.wk) as amount FROM
(
SELECT TO_CHAR(LEVEL, 'FM00')||TO_CHAR(sysdate-7/24,'/YYYY') wk
FROM dual
CONNECT BY LEVEL <= 53
) lst
LEFT OUTER JOIN
(
SELECT
to_char(Note_created_on - 7/24,'ww/yyyy') as wk
FROM Notes
) dat
ON lst.wk = dat.wk
GROUP BY lst.wk
ORDER BY lst.wk
In this form we do the group by/count after the join. By counting dat.wk, which is NULL for any lst.wk with no notes, we can omit the coalesce, because COUNT ignores NULLs and so returns 0 for those weeks.
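The "count after the join" form can be checked with a small sketch against in-memory SQLite from Python. Several things here are assumptions for the demo: the notes rows are invented, the year is fixed at 2018, the 7-hour shift is left out, and Oracle's WW week number is emulated as (day of year - 1) / 7 + 1 using integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical notes: two in week 01, one in week 42, one on the last day of the year.
conn.executescript("""
    CREATE TABLE notes (note_title TEXT, note_created_on TEXT);
    INSERT INTO notes VALUES
        ('a', '2018-01-01'),
        ('b', '2018-01-03'),
        ('c', '2018-10-16'),
        ('d', '2018-12-31');
""")

rows = conn.execute("""
    WITH RECURSIVE weeks(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM weeks WHERE n < 53
    )
    SELECT printf('%02d/2018', w.n) AS wk,
           COUNT(nt.wk) AS amount          -- COUNT skips the NULLs from unmatched weeks
    FROM weeks w
    LEFT JOIN (
        SELECT printf('%02d/%s',
                      (CAST(strftime('%j', note_created_on) AS INTEGER) - 1) / 7 + 1,
                      strftime('%Y', note_created_on)) AS wk
        FROM notes
    ) nt ON nt.wk = printf('%02d/2018', w.n)
    GROUP BY w.n
    ORDER BY w.n
""").fetchall()

counts = dict(rows)
print(counts["42/2018"], counts["43/2018"], counts["53/2018"])
```

The note dated 2018-12-31 lands in week 53, which is why the week list runs to 53 rather than 52.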

SQL query for all the days of a month

I have the following table: RENTAL(book_date, copy_id, member_id, title_id, act_ret_date, exp_ret_date), where book_date shows the day the book was booked. I need to write a query that shows, for every day of the month (so 1-30, 1-29, or 1-31 depending on the month), the number of books booked.
I currently know how to show the number of books rented on the days that are in the table:
select count(book_date), to_char(book_date,'DD')
from rental
group by to_char(book_date,'DD');
My questions are:
How do I show the rest of the days (say my database has no books rented on the 19th or 20th, or on several days) and put the number 0 there?
How do I show only the days of the current month (28, 29, 30, or 31 are all possible depending on the month and year)? I am lost. This must be done using only a SQL query, no PL/SQL or other stuff.
The following query gives you all the days in the current month. In your case you can replace SYSDATE with your date column and join with this query to get the counts for a given month:
SELECT DT
FROM(
SELECT TRUNC (last_day(SYSDATE) - ROWNUM) dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')
The answer is to create a table like this:
create table yearsmonthsdays (year varchar(4), month varchar(2), day varchar(2));
Use any language you wish, e.g. iterate in Java with Calendar.getInstance().getActualMaximum(Calendar.DAY_OF_MONTH) to get the last day of the month for as many years and months as you like, and fill that table with the year, month, and each day from 1 to the last day of the month.
You'd get something like:
insert into yearsmonthsdays values ('1995','02','01');
insert into yearsmonthsdays values ('1995','02','02');
...
insert into yearsmonthsdays values ('1995','02','28'); /* non-leap year */
...
insert into yearsmonthsdays values ('1996','02','01');
insert into yearsmonthsdays values ('1996','02','02');
...
insert into yearsmonthsdays values ('1996','02','28');
insert into yearsmonthsdays values ('1996','02','29'); /* leap year */
...
and so on.
Once you have this table, your work is almost finished. Do a left outer join from this table to your table, joining on year, month, and day together; when no rows match, the count will be zero, as you wish. Without using programming, this is your best bet.
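If you do go the lookup-table route, the rows can be generated in a few lines. This is a minimal Python sketch rather than the Java approach mentioned above; the helper name month_rows is made up for illustration, and it relies on the standard library's calendar.monthrange to get the last day of each month, leap years included.

```python
import calendar

def month_rows(year, month):
    """Return (year, month, day) string tuples for every day of the given month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return [
        (f"{year:04d}", f"{month:02d}", f"{day:02d}")
        for day in range(1, last_day + 1)
    ]

# Print insert statements for February of a non-leap year and a leap year.
for y in (1995, 1996):
    for row in month_rows(y, 2):
        print("insert into yearsmonthsdays values ('%s','%s','%s');" % row)
```

Looping month_rows over every year and month you care about gives the full contents of the yearsmonthsdays table.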
In Oracle, you can query from dual and use the CONNECT BY LEVEL syntax to generate a series of rows (in your case, dates). From there on, it's just a matter of deciding which dates you want to display (in my example I used all the dates from 2014) and joining on your table:
SELECT all_date, COALESCE (cnt, 0)
FROM (SELECT to_date('01/01/2014', 'dd/mm/yyyy') + rownum - 1 AS all_date
FROM dual
CONNECT BY LEVEL <= 365) d
LEFT JOIN (SELECT TRUNC(book_date) AS book_day, COUNT(book_date) AS cnt
FROM rental
GROUP BY TRUNC(book_date)) r ON d.all_date = r.book_day
There's no need to get ROWNUM involved ... you can just use LEVEL in the CONNECT BY:
WITH d1 AS (
SELECT TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL AS book_date
FROM dual
CONNECT BY TRUNC(SYSDATE, 'MONTH') - 1 + LEVEL <= LAST_DAY(SYSDATE)
)
SELECT TRUNC(d1.book_date), COUNT(r.book_date)
FROM d1 LEFT JOIN rental r
ON TRUNC(d1.book_date) = TRUNC(r.book_date)
GROUP BY TRUNC(d1.book_date);
Simply replace SYSDATE with a date in the month you're targeting for results.
All days of the month based on current date
select trunc(sysdate) - (to_number(to_char(sysdate,'DD')) - 1)+level-1 x from dual connect by level <= TO_CHAR(LAST_DAY(sysdate),'DD')
This worked for me:
SELECT DT
FROM (SELECT TRUNC(LAST_DAY(SYSDATE) - (CASE WHEN ROWNUM=1 THEN 0 ELSE ROWNUM-1 END)) DT
FROM DUAL
CONNECT BY ROWNUM <= 32)
WHERE DT >= TRUNC(SYSDATE, 'MM')
In Oracle SQL the query must look like this to not miss the last day of month:
SELECT DT
FROM(
SELECT trunc(add_months(sysdate, 1),'MM')- ROWNUM dt
FROM DUAL CONNECT BY ROWNUM < 32
)
where DT >= trunc(sysdate,'mm')