substituting "filter" in a sql query on oracle - sql

We have a table with data that has one date column indicating what day the data is for ("planning_day") and another column for logging when the data was sent ("first_sent_time").
I'm trying to make a report showing how far in the past/future we've sent data on which day. So if today we sent 2 data for yesterday, 5 for today and 1 for the day after tomorrow, the result should be something like this:
sent_day minus2 minus1 sameDay plus1 plus2
2021-11-24 0 2 5 0 1
...
I know I could do this in postgres with a query using "filter":
select
trunc(t.first_sent_time),
count(t.id) filter (where e.planning_day - trunc(e.first_sent_time) = -2) as "minus2",
count(t.id) filter (where e.planning_day - trunc(e.first_sent_time) = -1) as "minus1",
count(t.id) filter (where e.planning_day - trunc(e.first_sent_time) = 0) as "sameDay",
count(t.id) filter (where e.planning_day - trunc(e.first_sent_time) = 1) as "plus1",
count(t.id) filter (where e.planning_day - trunc(e.first_sent_time) = 2) as "plus2"
from
my_table t
group by
trunc(t.first_sent_time)
;
Unfortunately, this "filter" doesn't exist in Oracle. I need help here. I tried something like following:
select
sent_day,
sum(minus2),
sum(minus1),
sum(sameDay),
sum(plus1),
sum(plus2)
from (
select
*
from (
select
b.id,
trunc(b.first_sent_time) as sent_day,
b.planning_day,
b.planning_day - trunc(b.first_sent_time) as day_diff
from
my_table b
where
b.first_sent_time >= DATE '2021-11-01'
)
pivot (
count(id) for day_diff in (-2 as "minus2",-1 as "minus1",0 as "sameDay", 1 as "plus1",2 as "plus2")
)
)
group by
sent_day
order by
sent_day
;
but it doesn't work and it feels like I'm going too complicated and there must be an easier solution.

Use a CASEexpression within the aggregation function to simulate the filter.
Here a simplified example
with dt as (
select 1 id , 1 diff_days from dual union all
select 2 id , 1 diff_days from dual union all
select 3 id , -1 diff_days from dual union all
select 4 id , -1 diff_days from dual union all
select 4 id , -1 diff_days from dual)
/* query */
select
count(case when diff_days = 1 then id end) as cnt_1,
count(case when diff_days = -1 then id end) as cnt_minus_1
from dt;
results in
CNT_1 CNT_MINUS_1
---------- -----------
2 3

Related

See the distribution of secondary requests grouped by time interval in sql

I have the following table:
RequestId,Type, Date, ParentRequestId
1 1 2020-10-15 null
2 2 2020-10-19 1
3 1 2020-10-20 null
4 2 2020-11-15 3
For this example I am interested in the request type 1 and 2, to make the example simpler. My task is to query a big database and to see the distribution of the secondary transaction based on the difference of dates with the parent one. So the result would look like:
Interval,Percentage
0-7 days,50 %
8-15 days,0 %
16-50 days, 50 %
So for the first line from teh expected result we have the request with the id 2 and for the third line from the expected result we have the request with the id 4 because the date difference fits in this interval.
How to achieve this?
I'm using sql server 2014.
We like to see your attempts, but by the looks of it, it seems like you're going to need to treat this table as 2 tables and do a basic GROUP BY, but make it fancy by grouping on a CASE statement.
WITH dateDiffs as (
/* perform our date calculations first, to get that out of the way */
SELECT
DATEDIFF(Day, parent.[Date], child.[Date]) as daysDiff,
1 as rowsFound
FROM (SELECT RequestID, [Date] FROM myTable WHERE Type = 1) parent
INNER JOIN (SELECT ParentRequestID, [Date] FROM myTable WHERE Type = 2) child
ON parent.requestID = child.parentRequestID
)
/* Now group and aggregate and enjoy your maths! */
SELECT
case when daysDiff between 0 and 7 then '0-7'
when daysDiff between 8 and 15 then '8-15'
when daysDiff between 16 and 50 THEN '16-50'
else '50+'
end as myInterval,
sum(rowsFound) as totalFound,
(select sum(rowsFound) from dateDiffs) as totalRows,
1.0 * sum(rowsFound) / (select sum(rowsFound) from dateDiffs) * 100.00 as percentFound
FROM dateDiffs
GROUP BY
case when daysDiff between 0 and 7 then '0-7'
when daysDiff between 8 and 15 then '8-15'
when daysDiff between 16 and 50 THEN '16-50'
else '50+'
end;
This seems like basically a join and group by query:
with dates as (
select 0 as lo, 7 as hi, '0-7 days' as grp union all
select 8 as lo, 15 as hi, '8-15 days' union all
select 16 as lo, 50 as hi, '16-50 days'
)
select d.grp,
count(*) as cnt,
count(*) * 1.0 / sum(count(*)) over () as raio
from dates left join
(t join
t tp
on tp.RequestId = t. ParentRequestId
)
on datediff(day, tp.date, t.date) between d.lo and d.hi
group by d.grp
order by d.lo;
The only trick is generating all the date groups, so you have rows with zero values.

SQL - '1' IF hour in month EXISTS, '0' IF NOT EXISTS

I have a table that has aggregations down to the hour level YYYYMMDDHH. The data is aggregated and loaded by an external process (I don't have control over). I want to test the data on a monthly basis.
The question I am looking to answer is: Does every hour in the month exist?
I'm looking to produce output that will return a 1 if the hour exists or 0 if the hour does not exist.
The aggregation table looks something like this...
YYYYMM YYYYMMDD YYYYMMDDHH DATA_AGG
201911 20191101 2019110100 100
201911 20191101 2019110101 125
201911 20191101 2019110103 135
201911 20191101 2019110105 95
… … … …
201911 20191130 2019113020 100
201911 20191130 2019113021 110
201911 20191130 2019113022 125
201911 20191130 2019113023 135
And defined as...
CREATE TABLE YYYYMMDDHH_DATA_AGG AS (
YYYYMM VARCHAR,
YYYYMMDD VARCHAR,
YYYYMMDDHH VARCHAR,
DATA_AGG INT
);
I'm looking to produce the following below...
YYYYMMDDHH HOUR_EXISTS
2019110100 1
2019110101 1
2019110102 0
2019110103 1
2019110104 0
2019110105 1
... ...
In the example above, two hours do not exist, 2019110102 and 2019110104.
I assume I'd have to join the aggregation table against a computed table that contains all the YYYYMMDDHH combos???
The database is Snowflake, but assume most generic ANSI SQL queries will work.
You can get what you want with a recursive CTE
The recursive CTE generates the list of possible Hours. And then a simple left outer join gets you the flag for if you have any records that match that hour.
WITH RECURSIVE CTE (YYYYMMDDHH) as
(
SELECT YYYYMMDDHH
FROM YYYYMMDDHH_DATA_AGG
WHERE YYYYMMDDHH = (SELECT MIN(YYYYMMDDHH) FROM YYYYMMDDHH_DATA_AGG)
UNION ALL
SELECT TO_VARCHAR(DATEADD(HOUR, 1, TO_TIMESTAMP(C.YYYYMMDDHH, 'YYYYMMDDHH')), 'YYYYMMDDHH') YYYYMMDDHH
FROM CTE C
WHERE TO_VARCHAR(DATEADD(HOUR, 1, TO_TIMESTAMP(C.YYYYMMDDHH, 'YYYYMMDDHH')), 'YYYYMMDDHH') <= (SELECT MAX(YYYYMMDDHH) FROM YYYYMMDDHH_DATA_AGG)
)
SELECT
C.YYYYMMDDHH,
IFF(A.YYYYMMDDHH IS NOT NULL, 1, 0) HOUR_EXISTS
FROM CTE C
LEFT OUTER JOIN YYYYMMDDHH_DATA_AGG A
ON C.YYYYMMDDHH = A.YYYYMMDDHH;
If your timerange is too long you'll have issues with the cte recursing too much. You can create a table or temp table with all of the possible hours instead. For example:
CREATE OR REPLACE TEMPORARY TABLE HOURS (YYYYMMDDHH VARCHAR) AS
SELECT TO_VARCHAR(DATEADD(HOUR, SEQ4(), TO_TIMESTAMP((SELECT MIN(YYYYMMDDHH) FROM YYYYMMDDHH_DATA_AGG), 'YYYYMMDDHH')), 'YYYYMMDDHH')
FROM TABLE(GENERATOR(ROWCOUNT => 10000)) V
ORDER BY 1;
SELECT
H.YYYYMMDDHH,
IFF(A.YYYYMMDDHH IS NOT NULL, 1, 0) HOUR_EXISTS
FROM HOURS H
LEFT OUTER JOIN YYYYMMDDHH_DATA_AGG A
ON H.YYYYMMDDHH = A.YYYYMMDDHH
WHERE H.YYYYMMDDHH <= (SELECT MAX(YYYYMMDDHH) FROM YYYYMMDDHH_DATA_AGG);
You can then fiddle with the generator count to make sure you have enough hours.
You can generate a table with every hour of the month and LEFT OUTER JOIN your aggregation to it:
WITH EVERY_HOUR AS (
SELECT TO_CHAR(DATEADD(HOUR, HH, TO_DATE(YYYYMM::TEXT, 'YYYYMM')),
'YYYYMMDDHH')::NUMBER YYYYMMDDHH
FROM (SELECT DISTINCT YYYYMM FROM YYYYMMDDHH_DATA_AGG) t
CROSS JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY NULL) - 1 HH
FROM TABLE(GENERATOR(ROWCOUNT => 745))
) h
QUALIFY YYYYMMDDHH < (YYYYMM + 1) * 10000
)
SELECT h.YYYYMMDDHH, NVL2(a.YYYYMM, 1, 0) HOUR_EXISTS
FROM EVERY_HOUR h
LEFT OUTER JOIN YYYYMMDDHH_DATA_AGG a ON a.YYYYMMDDHH = h.YYYYMMDDHH
Here's something that might help get you started. I'm guessing you want to have 'synthetic' [YYYYMMDD] values? Otherwise, if the value aren't there, then they shouldn't appear in the list
DROP TABLE IF EXISTS #_hours
DROP TABLE IF EXISTS #_temp
--Populate a table with hours ranging from 00 to 23
CREATE TABLE #_hours ([hour_value] VARCHAR(2))
DECLARE #_i INT = 0
WHILE (#_i < 24)
BEGIN
INSERT INTO #_hours
SELECT FORMAT(#_i, '0#')
SET #_i += 1
END
-- Replicate OP's sample data set
CREATE TABLE #_temp (
[YYYYMM] INTEGER
, [YYYYMMDD] INTEGER
, [YYYYMMDDHH] INTEGER
, [DATA_AGG] INTEGER
)
INSERT INTO #_temp
VALUES
(201911, 20191101, 2019110100, 100),
(201911, 20191101, 2019110101, 125),
(201911, 20191101, 2019110103, 135),
(201911, 20191101, 2019110105, 95),
(201911, 20191130, 2019113020, 100),
(201911, 20191130, 2019113021, 110),
(201911, 20191130, 2019113022, 125),
(201911, 20191130, 2019113023, 135)
SELECT X.YYYYMM, X.YYYYMMDD, X.YYYYMMDDHH
-- Case: If 'target_hours' doesn't exist, then 0, else 1
, CASE WHEN X.target_hours IS NULL THEN '0' ELSE '1' END AS [HOUR_EXISTS]
FROM (
-- Select right 2 characters from converted [YYYYMMDDHH] to act as 'target values'
SELECT T.*
, RIGHT(CAST(T.[YYYYMMDDHH] AS VARCHAR(10)), 2) AS [target_hours]
FROM #_temp AS T
) AS X
-- Right join to keep all of our hours and only the target hours that match.
RIGHT JOIN #_hours AS H ON H.hour_value = X.target_hours
Sample output:
YYYYMM YYYYMMDD YYYYMMDDHH HOUR_EXISTS
201911 20191101 2019110100 1
201911 20191101 2019110101 1
NULL NULL NULL 0
201911 20191101 2019110103 1
NULL NULL NULL 0
201911 20191101 2019110105 1
NULL NULL NULL 0
With (almost) standard sql, you can do a cross join of the distinct values of YYYYMMDD to a list of all possible hours and then left join to the table:
select concat(d.YYYYMMDD, h.hour) as YYYYMMDDHH,
case when t.YYYYMMDDHH is null then 0 else 1 end as hour_exists
from (select distinct YYYYMMDD from tablename) as d
cross join (
select '00' as hour union all select '01' union all
select '02' union all select '03' union all
select '04' union all select '05' union all
select '06' union all select '07' union all
select '08' union all select '09' union all
select '10' union all select '11' union all
select '12' union all select '13' union all
select '14' union all select '15' union all
select '16' union all select '17' union all
select '18' union all select '19' union all
select '20' union all select '21' union all
select '22' union all select '23'
) as h
left join tablename as t
on concat(d.YYYYMMDD, h.hour) = t.YYYYMMDDHH
order by concat(d.YYYYMMDD, h.hour)
Maybe in Snowflake you can construct the list of hours with a sequence much easier instead of all those UNION ALLs.
This version accounts for the full range of days, across months and years. It's a simple cross join of the set of possible days with the set of possible hours of the day -- left joined to actual dates.
set first = (select min(yyyymmdd::number) from YYYYMMDDHH_DATA_AGG);
set last = (select max(yyyymmdd::number) from YYYYMMDDHH_DATA_AGG);
with
hours as (select row_number() over (order by null) - 1 h from table(generator(rowcount=>24))),
days as (
select
row_number() over (order by null) - 1 as n,
to_date($first::text, 'YYYYMMDD')::date + n as d,
to_char(d, 'YYYYMMDD') as yyyymmdd
from table(generator(rowcount=>($last-$first+1)))
)
select days.yyyymmdd || lpad(hours.h,2,0) as YYYYMMDDHH, nvl2(t.yyyymmddhh,1,0) as HOUR_EXISTS
from days cross join hours
left join YYYYMMDDHH_DATA_AGG t on t.yyyymmddhh = days.yyyymmdd || lpad(hours.h,2,0)
order by 1
;
$first and $last can be packed in as sub-queries if you prefer.

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group them and count by attr column row wise and also create additional columns in to show their counts per day and percentages as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count by using group by but unable to find out how to even seperate them to multiple columns. I tried to generate day1 percentage with
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this also is not giving me correct answer, I'm getting all zeroes for percentage and count as 1. Any help is appreciated. I'm trying to do this in Redshift which follows postgresql syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, DATEPART(day, time) as theday, count(*) as thecount
from MyTable
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot to create a day by day if you feel the need
I am trying to enhance the query #johnHC btw if you needs for 7days then you have to those days in case when
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case that you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs for calculating the values I am using window functions which is a bit shorter and more readable I think. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A counting the rows per day and counting the rows per day AND attr
B for more readability I convert the date into numbers. Here I take the difference between current date of the row and the maximum date available in the table. So I get a counter from 0 (first day) up to n - 1 (last day)
C calculating the percentage and rounding
D pivot by filter the day numbers. The COALESCE avoids the NULL values and switched them into 0. To add more days you can multiply these columns.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;
Here is a SQL Fiddle.

List all months with a total regardless of null

I have a very small SQL table that lists courses attended and the date of attendance. I can use the code below to count the attendees for each month
select to_char(DATE_ATTENDED,'YYYY/MM'),
COUNT (*)
FROM TRAINING_COURSE_ATTENDED
WHERE COURSE_ATTENDED = 'Fire Safety'
GROUP BY to_char(DATE_ATTENDED,'YYYY/MM')
ORDER BY to_char(DATE_ATTENDED,'YYYY/MM')
This returns a list as expected for each month that has attendees. However I would like to list it as
January 2
February 0
March 5
How do I show the count results along with the nulls? My table is very basic
1234 01-JAN-15 Fire Safety
108 01-JAN-15 Fire Safety
1443 02-DEC-15 Healthcare
1388 03-FEB-15 Emergency
1355 06-MAR-15 Fire Safety
1322 09-SEP-15 Fire Safety
1234 11-DEC-15 Fire Safety
I just need to display each month and the total attendees for Fire Safety only. Not used SQL developer for a while so any help appreciated.
You would need a calendar table to select a period you want to display. Simplified code would look like this:
select to_char(c.Date_dt,'YYYY/MM')
, COUNT (*)
FROM calendar as c
left join TRAINING_COURSE_ATTENDED as tca
on tca.DATE_ATTENDED = c.Date_dt
WHERE tca.COURSE_ATTENDED = 'Fire Safety'
and c.Date_dt between [period_start_dt] and [period_end_dt]
GROUP BY to_char(c.Date_dt,'YYYY/MM')
ORDER BY to_char(c.Date_dt,'YYYY/MM')
You can create your own set required year month's on-fly with 0 count and use query as below.
Select yrmth,sum(counter) from
(
select to_char(date_attended,'YYYYMM') yrmth,
COUNT (1) counter
From TRAINING_COURSE_ATTENDED Where COURSE_ATTENDED = 'Fire Safety'
Group By Y to_char(date_attended,'YYYYMM')
Union All
Select To_Char(2015||Lpad(Rownum,2,0)),0 from Dual Connect By Rownum <= 12
)
group by yrmth
order by 1
If you want to show multiple year's, just change the 2nd query to
Select To_Char(Year||Lpad(Month,2,0)) , 0
From
(select Rownum Month from Dual Connect By Rownum <= 12),
(select 2015+Rownum-1 Year from Dual Connect By Rownum <= 3)
Try this :
SELECT Trunc(date_attended, 'MM') Month,
Sum(CASE
WHEN course_attended = 'Fire Safety' THEN 1
ELSE 0
END) Fire_Safety
FROM training_course_attended
GROUP BY Trunc(date_attended, 'MM')
ORDER BY Trunc(date_attended, 'MM')
Another way to generate a calendar table inline:
with calendar (month_start, month_end) as
( select add_months(date '2014-12-01', rownum)
, add_months(date '2014-12-01', rownum +1) - interval '1' second
from dual
connect by rownum <= 12 )
select to_char(c.month_start,'YYYY/MM') as course_month
, count(tca.course_attended) as attended
from calendar c
left join training_course_attended tca
on tca.date_attended between c.month_start and c.month_end
and tca.course_attended = 'Fire Safety'
group by to_char(c.month_start,'YYYY/MM')
order by 1;
(You could also have only the month start in the calendar table, and join on trunc(tca.date_attended,'MONTH') = c.month_start, though if you had indexes or partitioning on tca.date_attended that might be less efficient.)

SQL Query Help (Advanced - for me!)

I have a question about a SQL query I am trying to write.
I need to query data from a database.
The database has, amongst others, these 3 fields:
Account_ID #, Date_Created, Time_Created
I need to write a query that tells me how many accounts were opened per hour.
I have written said query, but there are times that there were 0 accounts created, so these "hours" are not populated in the results.
For example:
Volume Date__Hour
435 12-Aug-12 03
213 12-Aug-12 04
125 12-Aug-12 06
As seen in the example above, hour 5 did not have any accounts opened.
Is there a way that the result can populate the hour but and display 0 accounts opened for this hour?
Example of how I want my results to look like:
Volume Date_Hour
435 12-Aug-12 03
213 12-Aug-12 04
0 12-Aug-12 05
125 12-Aug-12 06
Thanks!
Update: This is what I have so far
SELECT count(*) as num_apps, to_date(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM accounts
WHERE To_Date(created_ts,'DD-Mon-RR') >= To_Date('16-Aug-12','DD-Mon-RR')
GROUP BY To_Date(created_ts,'DD-Mon-RR'), To_Char(created_ts,'HH24')
ORDER BY app_date, app_hour
To get the results you want, you will need to create a table (or use a query to generate a "temp" table) and then use a left join to your calculation query to get rows for every hour - even those with 0 volume.
For example, assume I have a table with app_date and app_hour fields. Also assume that this table has a row for every day/hour you wish to report on.
The query would be:
SELECT NVL(c.num_apps,0) as num_apps, t.app_date, t.app_hour
FROM time_table t
LEFT OUTER JOIN
(
SELECT count(*) as num_apps, to_date(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM accounts
WHERE To_Date(created_ts,'DD-Mon-RR') >= To_Date('16-Aug-12','DD-Mon-RR')
GROUP BY To_Date(created_ts,'DD-Mon-RR'), To_Char(created_ts,'HH24')
ORDER BY app_date, app_hour
) c ON (t.app_date = c.app_date AND t.app_hour = c.app_hour)
I believe the best solution is not to create some fancy temporary table but just use this construct:
select level
FROM Dual
CONNECT BY level <= 10
ORDER BY level;
This will give you (in ten rows):
1
2
3
4
5
6
7
8
9
10
For hours interval just little modification:
select 0 as num_apps, (To_Date('16-09-12','DD-MM-RR') + level / 24) as created_ts
FROM dual
CONNECT BY level <= (sysdate - To_Date('16-09-12','DD-MM-RR')) * 24 ;
And just for the fun of it adding solution for you(I didn't try syntax, so I'm sorry for any mistake, but the idea is clear):
SELECT SUM(num_apps) as num_apps, to_date(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM(
SELECT count(*) as num_apps, created_ts
FROM accounts
WHERE To_Date(created_ts,'DD-Mon-RR') >= To_Date('16-09-12','DD-MM-RR')
UNION ALL
select 0 as num_apps, (To_Date('16-09-12','DD-MM-RR') + level / 24) as created_ts
FROM dual
CONNECT BY level <= (sysdate - To_Date('16-09-12','DD-MM-RR')) * 24 ;
)
GROUP BY To_Date(created_ts,'DD-Mon-RR'), To_Char(created_ts,'HH24')
ORDER BY app_date, app_hour
;
You can also use a CASE statement in the SELECT to force the value you want.
It can be useful to have a "sequence table" kicking around, for all sorts of reasons, something that looks like this:
create table dbo.sequence
(
id int not null primary key clustered ,
)
Load it up with million or so rows, covering positive and negative values.
Then, given a table that looks like this
create table dbo.SomeTable
(
account_id int not null primary key clustered ,
date_created date not null ,
time_created time not null ,
)
Your query is then as simple as (in SQL Server):
select year_created = years.id ,
month_created = months.id ,
day_created = days.id ,
hour_created = hours.id ,
volume = t.volume
from ( select * ,
is_leap_year = case
when id % 400 = 0 then 1
when id % 100 = 0 then 0
when id % 4 = 0 then 1
else 0
end
from dbo.sequence
where id between 1980 and year(current_timestamp)
) years
cross join ( select *
from dbo.sequence
where id between 1 and 12
) months
left join ( select *
from dbo.sequence
where id between 1 and 31
) days on days.id <= case months.id
when 2 then 28 + years.is_leap_year
when 4 then 30
when 6 then 30
when 9 then 30
when 11 then 30
else 31
end
cross join ( select *
from dbo.sequence
where id between 0 and 23
) hours
left join ( select date_created ,
hour_created = datepart(hour,time_created ) ,
volume = count(*)
from dbo.SomeTable
group by date_created ,
datepart(hour,time_created)
) t on datepart( year , t.date_created ) = years.id
and datepart( month , t.date_created ) = months.id
and datepart( day , t.date_created ) = days.id
and t.hour_created = hours.id
order by 1,2,3,4
It's not clear to me if created_ts is a datetime or a varchar. If it's a datetime, you shouldn't use to_date; if it's a varchar, you shouldn't use to_char.
Assuming it's a datetime, and borrowing #jakub.petr's FROM Dual CONNECT BY level trick, I suggest:
SELECT count(*) as num_apps, to_char(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM (select level-1 as hour FROM Dual CONNECT BY level <= 24) h
LEFT JOIN accounts a on h.hour = to_number(to_char(a.created_ts,'HH24'))
WHERE created_ts >= To_Date('16-Aug-12','DD-Mon-RR')
GROUP BY trunc(created_ts), h.hour
ORDER BY app_date, app_hour