SQL Query to convert number value into date - sql

My transaction table has the columns id NUMBER(11), name VARCHAR2(25) and transactiondate NUMBER(22).
I need to write a SQL query to fetch the transaction details. transactiondate should be returned in date & time format instead of as a number.
transaction table
ID Name transactiondate
1 AAA 2458010
2 BBB 2458351
3 CCC 2458712
I got the result below when I executed this query:
Select * from transaction where transactiondate <= TO_CHAR(TO_DATE('2019/09/17 00:00:00', 'YYYY/MM/DD hh24:mi:ss') , 'J');
ID Name transactiondate
1 AAA 2458010
2 BBB 2458351
I got a syntax error when I tried to execute the query below:
Select name, convert(datetime, convert(varchar(10), transactiondate)) as txndateformat
from transaction;
I am expecting a query that returns name and transactiondate, with transactiondate as a date instead of a number.
I got the result below when I executed this command:
Desc transaction;
Name Null? Type
Id Not Null Number(19)
Name Not Null VarChar2(100)
transactiondate Not Null Number(22)

It all depends on where you are measuring time zero (the epoch) from and what your units are.
Here are some typical solutions:
Oracle Setup:
CREATE TABLE transaction ( ID, Name, transactiondate ) AS
SELECT 1, 'AAA', 2456702 FROM DUAL UNION ALL
SELECT 2, 'BBB', 2456703 FROM DUAL
Query:
SELECT name,
TO_DATE( transactiondate, 'J' )
AS julian_date,
DATE '1970-01-01' + NUMTODSINTERVAL( transactiondate / 1000, 'SECOND' )
AS unix_timestamp,
DATE '1970-01-01' + NUMTODSINTERVAL( transactiondate, 'SECOND' )
AS seconds_since_1970,
DATE '1970-01-01' + NUMTODSINTERVAL( transactiondate, 'MINUTE' )
AS minutes_since_1970,
DATE '1970-01-01' + NUMTODSINTERVAL( transactiondate, 'HOUR' )
AS hours_since_1970,
DATE '1900-01-01' + NUMTODSINTERVAL( transactiondate, 'HOUR' )
AS hours_since_1900,
DATE '1899-12-30' + transactiondate
AS excel_date
FROM transaction
Output:
NAME | JULIAN_DATE | UNIX_TIMESTAMP | SECONDS_SINCE_1970 | MINUTES_SINCE_1970 | HOURS_SINCE_1970 | HOURS_SINCE_1900 | EXCEL_DATE
:--- | :------------------ | :------------------ | :------------------ | :------------------ | :------------------ | :------------------ | :------------------
AAA | 2014-02-13 00:00:00 | 1970-01-01 00:40:56 | 1970-01-29 10:25:02 | 1974-09-03 01:02:00 | 2250-04-05 14:00:00 | 2180-04-04 14:00:00 | 8626-03-21 00:00:00
BBB | 2014-02-14 00:00:00 | 1970-01-01 00:40:56 | 1970-01-29 10:25:03 | 1974-09-03 01:03:00 | 2250-04-05 15:00:00 | 2180-04-04 15:00:00 | 8626-03-22 00:00:00
db<>fiddle here
(Note: Excel dates are slightly more complicated if you want to support values before 1900-03-01 but most people do not need this so there is only the simplified version included above.)
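As a cross-check of the arithmetic behind those conversions, here is a minimal Python sketch. The helper names `from_julian_day` and `from_epoch` are hypothetical, and the Julian offset assumes the same convention as Oracle's 'J' format (day 2451545 is 2000-01-01).

```python
from datetime import date, datetime, timedelta

# Assumption: Julian day 2451545 corresponds to 2000-01-01, as with Oracle's 'J' format.
JULIAN_OFFSET = 2451545 - date(2000, 1, 1).toordinal()


def from_julian_day(n: int) -> date:
    """Interpret n as a Julian day number."""
    return date.fromordinal(n - JULIAN_OFFSET)


def from_epoch(n: float, unit_seconds: float = 1.0,
               epoch: datetime = datetime(1970, 1, 1)) -> datetime:
    """Interpret n as a count of units (unit_seconds long) since the given epoch."""
    return epoch + timedelta(seconds=n * unit_seconds)


for n in (2456702, 2456703):
    print(n,
          from_julian_day(n),    # Julian day number
          from_epoch(n),         # seconds since 1970
          from_epoch(n, 60),     # minutes since 1970
          from_epoch(n, 0.001))  # milliseconds since 1970
```

Comparing the printed candidates against a transaction whose real date is known is one way to decide which interpretation fits the data.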

I assume that the numbers are epoch values (seconds since 1970-01-01).
For SQL Server:
SELECT DATEADD(ss, 2456702, '19700101') --ss means interval = seconds
For Oracle:
select to_date('19700101', 'YYYYMMDD') + ( 1 / 24 / 60 / 60) * 2456702
from dual;

Related

Oracle generating schedule rows with an interval

I have some SQL that generates rows for every 5 minutes. How can this be modified to get rid of overlapping times (see below)?
Note: Each row should be associated with a location_id with no repeats on the location_id. In this case there should be 25 rows generated so the CONNECT by should be something like SELECT count(*) from locations.
My goal is to create a function that takes in a schedule_id and a start_date in the format
'MMDDYYYY HH24:MI'; and stop creating rows if the next entry will cross midnight; that means some of the location_id may not be used.
The end result is to have the rows placed in the schedule table below. Since I don't have a function yet, the schedule_id can be hard-coded to 1. I've heard about recursive CTEs; would this qualify for that method?
Thanks in advance to all who answer and your expertise.
ALTER SESSION SET NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';
create table schedule(
schedule_id NUMBER(4),
location_id number(4),
start_date DATE,
end_date DATE,
CONSTRAINT start_min check (start_date=trunc(start_date,'MI')),
CONSTRAINT end_min check (end_date=trunc(end_date,'MI')),
CONSTRAINT end_gt_start CHECK (end_date >= start_date),
CONSTRAINT same_day CHECK (TRUNC(end_date) = TRUNC(start_date))
);
CREATE TABLE locations AS
SELECT level AS location_id,
'Door ' || level AS location_name,
CASE round(dbms_random.value(1,3))
WHEN 1 THEN 'A'
WHEN 2 THEN 'T'
WHEN 3 THEN 'G'
END AS location_type
FROM dual
CONNECT BY level <= 25;
with
row_every_5_mins as
( select trunc(sysdate) + (rownum-1)*5/1440 t_from,
trunc(sysdate) + rownum*5/1440 t_to
from dual
connect by level <= 1440/5
) SELECT * from row_every_5_mins;
Current output:
|T_FROM|T_TO|
|-----------------|-----------------|
|08162021 00:00:00|08162021 00:05:00|
|08162021 00:05:00|08162021 00:10:00|
|08162021 00:10:00|08162021 00:15:00|
|08162021 00:15:00|08162021 00:20:00|
…
Desired output
|T_FROM|T_TO|
|-----------------|-----------------|
|08162021 00:00:00|08162021 00:05:00|
|08162021 00:10:00|08162021 00:15:00|
|08162021 00:20:00|08162021 00:25:00|
…
You can avoid a recursive query or a loop, because you essentially need the row number of each row in the locations table. You just need to provide an appropriate sort order to the analytic function. Below is the query:
with a as (
select
date '2021-01-01'
+ to_dsinterval('0 23:30:00')
as start_dt_param
from dual
)
, date_gen as (
select
location_id
, start_dt_param
, start_dt_param + (row_number() over(order by location_id) - 1)
* interval '10' minute as start_dt
, start_dt_param + (row_number() over(order by location_id) - 1)
* interval '10' minute + interval '5' minute as end_dt
from a
cross join locations
)
select
location_id
, start_dt
, end_dt
from date_gen
where end_dt < trunc(start_dt_param + 1)
LOCATION_ID | START_DT | END_DT
----------: | :------------------ | :------------------
1 | 2021-01-01 23:30:00 | 2021-01-01 23:35:00
2 | 2021-01-01 23:40:00 | 2021-01-01 23:45:00
3 | 2021-01-01 23:50:00 | 2021-01-01 23:55:00
UPD:
Or, if you want a procedure, it is even simpler: from 12c onward Oracle has the FETCH FIRST clause, and the analytic function can be replaced with the ROWNUM pseudocolumn:
create or replace procedure populate_schedule (
p_schedule_id in number
, p_start_date in date
) as
begin
insert into schedule (schedule_id, location_id, start_date, end_date)
select
p_schedule_id
, location_id
, p_start_date + (rownum - 1) * interval '10' minute
, p_start_date + (rownum - 1) * interval '10' minute + interval '5' minute
from locations
/*Put your order of location assignment here*/
order by location_id
/*The number of 10-minute intervals before midnight from the first end_date*/
fetch first ((trunc(p_start_date + 1) - p_start_date + 1/24/60*5)*24*60/10) rows only
;
commit;
end;
/
begin
populate_schedule(1, timestamp '2020-01-01 23:37:00');
populate_schedule(2, timestamp '2020-01-01 23:35:00');
populate_schedule(3, timestamp '2020-01-01 23:33:00');
end;
/
select *
from schedule
order by schedule_id, start_date
SCHEDULE_ID | LOCATION_ID | START_DATE | END_DATE
----------: | ----------: | :------------------ | :------------------
1 | 1 | 2020-01-01 23:37:00 | 2020-01-01 23:42:00
1 | 2 | 2020-01-01 23:47:00 | 2020-01-01 23:52:00
2 | 1 | 2020-01-01 23:35:00 | 2020-01-01 23:40:00
2 | 2 | 2020-01-01 23:45:00 | 2020-01-01 23:50:00
2 | 3 | 2020-01-01 23:55:00 | 2020-01-02 00:00:00
3 | 1 | 2020-01-01 23:33:00 | 2020-01-01 23:38:00
3 | 2 | 2020-01-01 23:43:00 | 2020-01-01 23:48:00
3 | 3 | 2020-01-01 23:53:00 | 2020-01-01 23:58:00
db<>fiddle here
Just loop every 10 minutes instead of every 5 minutes:
WITH input (start_time) AS (
SELECT TRUNC(SYSDATE) + INTERVAL '23:30' HOUR TO MINUTE FROM DUAL
)
SELECT start_time + (LEVEL-1) * INTERVAL '10' MINUTE
AS t_from,
start_time + (LEVEL-1) * INTERVAL '10' MINUTE + INTERVAL '5' MINUTE
AS t_to
FROM input
CONNECT BY (LEVEL-1) * INTERVAL '10' MINUTE < INTERVAL '1' DAY
AND LEVEL <= (SELECT COUNT(*) FROM locations)
AND start_time + (LEVEL-1) * INTERVAL '10' MINUTE < TRUNC(start_time) + INTERVAL '1' DAY;
db<>fiddle here
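The slot logic itself (a 5-minute slot every 10 minutes, one per location, stopping before a slot would leave the start day) can be sketched outside the database. This is an illustrative Python version, not the Oracle implementation. The function name `schedule_slots` is hypothetical, and the stop rule assumes the `same_day` constraint from the question's schedule table.

```python
from datetime import datetime, timedelta


def schedule_slots(start, location_ids,
                   slot=timedelta(minutes=5), step=timedelta(minutes=10)):
    """Return (location_id, t_from, t_to) rows, one per location.

    Generation stops as soon as a slot would end on a different day than
    `start`, so some locations may stay unused.
    """
    rows = []
    for i, location_id in enumerate(location_ids):
        t_from = start + i * step
        t_to = t_from + slot
        if t_to.date() != start.date():
            break
        rows.append((location_id, t_from, t_to))
    return rows


# 25 locations, as in the question's locations table.
rows = schedule_slots(datetime(2021, 8, 16, 23, 30), range(1, 26))
for row in rows:
    print(row)
```

With a 23:30 start only three locations get a slot; the fourth slot would start at midnight of the next day.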
A CTE is certainly the fastest solution. If you want more flexibility for the intervals, you can use a DBMS_SCHEDULER calendar string. The drawback is that performance might be worse.
CREATE OR REPLACE TYPE TimestampRecType AS OBJECT (
T_FROM TIMESTAMP(0),
T_TO TIMESTAMP(0)
);
CREATE OR REPLACE TYPE TimestampTableType IS TABLE OF TimestampRecType;
CREATE OR REPLACE FUNCTION GetGchedule(
start_time IN TIMESTAMP,
stop_time in TIMESTAMP DEFAULT TRUNC(SYSDATE)+1)
RETURN TimestampTableType AS
ret TimestampTableType := TimestampTableType();
return_date_after TIMESTAMP := start_time;
next_run_date TIMESTAMP ;
BEGIN
LOOP
DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING('FREQ=MINUTELY;INTERVAL=5;', NULL, return_date_after, next_run_date);
ret.EXTEND;
ret(ret.LAST) := TimestampRecType(return_date_after, next_run_date);
return_date_after := next_run_date;
EXIT WHEN next_run_date >= stop_time;
END LOOP;
RETURN ret;
END;
SELECT *
FROM TABLE(GetGchedule(trunc(sysdate)));
See syntax for calendar here: Calendaring Syntax

Cannot get the row that contain the last day of each month

I am quite new to SQL and I am trying to find the row that contains the last day of each month.
product_table example:
log_date | product_id | stock
10/30/2018 | 1001 | 59
10/29/2018 | 1002 | 100
10/28/2018 | 1003 | 2
...
9/30/2018 | 1001 | 1
9/30/2018 | 1002 | 45
This is my code:
SELECT *
FROM product_table
WHERE log_date IN
(
SELECT MAX(log_date)
FROM product_table
GROUP BY strftime('%m', log_date), strftime('%y', log_date)
)
Output:
9/9/2018 1001 28
9/9/2018 1002 94
9/9/2018 1003 29
9/9/2018 1004 89
9/9/2018 1005 3
9/9/2018 1006 46
...
Expected output:
9/30/2018 1001 28
9/30/2018 1002 94
9/30/2018 1003 29
...
8/31/2018 1001 89
8/31/2018 1002 3
...
7/31/2018 1001 46
...
I am working with a data file in which the date is formatted like this: mm/dd/yyyy.
Should I change the date format to the usual yyyy-mm-dd, since the code above returns the wrong result?
Does anyone know how to fix this? Thank you.
Update your table so the dates have the format YYYY-MM-DD which is the only valid date format for SQLite:
update product_table
set log_date =
substr(log_date, -4) || '-' ||
case
when log_date like '__/__/____' then
substr(log_date, 1, 2) || '-' || substr(log_date, 4, 2)
when log_date like '_/__/____' then
'0' || substr(log_date, 1, 1) || '-' || substr(log_date, 3, 2)
when log_date like '__/_/____' then
substr(log_date, 1, 2) || '-0' || substr(log_date, 4, 1)
when log_date like '_/_/____' then
'0' || substr(log_date, 1, 1) || '-0' || substr(log_date, 3, 1)
end;
Then your query should work.
This is a simplification of the GROUP BY clause:
SELECT *
FROM product_table
WHERE log_date IN
(
SELECT MAX(log_date)
FROM product_table
GROUP BY strftime('%Y%m', log_date)
)
See the demo.
Results:
| log_date | product_id | stock |
| ---------- | ---------- | ----- |
| 2018-10-30 | 1001 | 59 |
| 2018-09-30 | 1001 | 1 |
| 2018-09-30 | 1002 | 45 |
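To see why the text format matters, here is a small Python sketch using the standard `sqlite3` module. It assumes the data can be reloaded rather than updated in place; the helper `to_iso` is hypothetical and the rows are a subset of the sample data plus one 9/9/2018 row.

```python
import sqlite3
from datetime import datetime


def to_iso(mdy: str) -> str:
    """Convert 'm/d/yyyy' or 'mm/dd/yyyy' text to 'yyyy-mm-dd'."""
    return datetime.strptime(mdy, "%m/%d/%Y").strftime("%Y-%m-%d")


# As plain text, '9/9/2018' sorts after '9/30/2018', which is why MAX() misbehaves.
print(max("9/9/2018", "9/30/2018"))

raw_rows = [
    ("10/30/2018", 1001, 59),
    ("10/29/2018", 1002, 100),
    ("10/28/2018", 1003, 2),
    ("9/30/2018", 1001, 1),
    ("9/30/2018", 1002, 45),
    ("9/9/2018", 1001, 28),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_table (log_date TEXT, product_id INTEGER, stock INTEGER)"
)
conn.executemany(
    "INSERT INTO product_table VALUES (?, ?, ?)",
    [(to_iso(d), product_id, stock) for d, product_id, stock in raw_rows],
)

# Same query as above, with an ORDER BY added only to make the output stable.
result = conn.execute(
    """
    SELECT log_date, product_id, stock
    FROM product_table
    WHERE log_date IN (
        SELECT MAX(log_date)
        FROM product_table
        GROUP BY strftime('%Y%m', log_date)
    )
    ORDER BY log_date DESC, product_id
    """
).fetchall()
print(result)
```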
If you would like to get the last day of the month based on the log_date field, you have to use the date function.
Compute the last day of the current month:
SELECT date('now','start of month','+1 month','-1 day');
So, to get the last day of the month for each log_date, use:
SELECT T.log_date, date(substr(log_date, 6+T.noofdays, 4) || '-' || substr('00' || substr(log_date, 1, 1+T.noofdays), -2) || '-' || substr(log_date, 3+T.noofdays, 2), 'start of month', '1 month', '-1 day') AS lastdayofmonth
FROM (
SELECT log_date, instr(log_date, '/')-2 as noofdays
FROM product_table
) AS T;
DbFiddle
You can also use datetime function in the same way.
[EDIT]
According to the discussion in the comments with #forpas (thank you for your valuable comments)...
If 'last day of month' means 'the last day of the month that exists in the table', then MAX(log_date) should do the job. But if you want the last day of the month even when there is no corresponding date in the table, the query above shows how to achieve that.
Good luck!
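For comparison, the calendar-based "last day of month" can be computed outside SQLite as well. This is a minimal Python sketch, assuming the dates arrive as m/d/yyyy text; the function name `last_day_of_month` is hypothetical.

```python
import calendar
from datetime import date, datetime


def last_day_of_month(mdy: str) -> date:
    """Return the calendar last day of the month that contains the given date."""
    d = datetime.strptime(mdy, "%m/%d/%Y").date()
    # monthrange returns (weekday of the first day, number of days in the month).
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


for text in ("10/29/2018", "9/9/2018", "2/1/2020"):
    print(text, "->", last_day_of_month(text))
```

This gives the calendar answer even when no row exists for that day, which is the second interpretation discussed above.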

Counting records and grouping them by the hour

I'm trying to count the records in my table and group them by hour. I'm getting results with my query, but I want it to return every hour even if there are no records.
My current query is:
SELECT nvl(count(*),0) AS transactioncount, trunc(date_modified, 'HH') as TRANSACTIONDATE
FROM TABLE
WHERE date_modified between to_date('23-JAN-19 07:00:00','dd-MON-yy hh24:mi:ss') and to_date('24-Jan-19 06:59:59','dd-MON-yy hh24:mi:ss')
group by trunc(date_modified, 'HH');
This returns a result like this,
TRANSACTIONCOUNT | TRANSACTIONDATE
43 | 23-Jan-19 07:00:00
47 | 23-Jan-19 08:00:00
156 | 23-Jan-19 14:00:00
558 | 23-Jan-19 15:00:00
What I want is for it to return every hour between my 2 dates so,
TRANSACTIONCOUNT | TRANSACTIONDATE
43 | 23-Jan-19 07:00:00
47 | 23-Jan-19 08:00:00
0 | 23-Jan-19 09:00:00
0 | 23-Jan-19 10:00:00
0 | 23-Jan-19 11:00:00
0 | 23-Jan-19 12:00:00
0 | 23-Jan-19 13:00:00
156 | 23-Jan-19 14:00:00
558 | 23-Jan-19 15:00:00
--......
0 | 24-Jan-19 00:00:00
0 | 24-Jan-19 01:00:00
0 | 24-Jan-19 02:00:00
--and so on
To fill the holes in the transaction hours, first create a complete table of hours.
You may use recursive subquery factoring to do it:
WITH hour_table(TRANSACTIONDATE) AS (
SELECT to_date('23-JAN-19 07:00:00','dd-MON-yy hh24:mi:ss') /* init hour here */
FROM DUAL
UNION ALL
SELECT TRANSACTIONDATE + 1/24
FROM hour_table
WHERE TRANSACTIONDATE + 1/24 < to_date('24-JAN-19 06:59:59','dd-MON-yy hh24:mi:ss') /* limit here */
)
select * from hour_table;
TRANSACTIONDATE
-------------------
23.01.2019 07:00:00
23.01.2019 08:00:00
...
24.01.2019 05:00:00
24.01.2019 06:00:00
Note that you use the starting and ending date in this query; the starting date must fall exactly on the hour.
The next step is simply to outer join this hour table to your aggregation and use NVL to set the default value for the missing hours.
with hour_table(TRANSACTIONDATE) AS (
SELECT to_date('23-JAN-19 07:00:00','dd-MON-yy hh24:mi:ss') /* init hour here */
FROM DUAL
UNION ALL
SELECT TRANSACTIONDATE + 1/24
FROM hour_table
WHERE TRANSACTIONDATE + 1/24 < to_date('24-JAN-19 06:59:59','dd-MON-yy hh24:mi:ss') /* limit */
),
agg as (
SELECT nvl(count(*),0) AS transactioncount, trunc(date_modified, 'HH') as TRANSACTIONDATE
FROM "TABLE"
WHERE date_modified between to_date('23-JAN-19 07:00:00','dd-MON-yy hh24:mi:ss') and to_date('24-Jan-19 06:59:59','dd-MON-yy hh24:mi:ss')
group by trunc(date_modified, 'HH')
)
select t.TRANSACTIONDATE, nvl(transactioncount,0) transactioncount
from hour_table t
left outer join agg a
on t.TRANSACTIONDATE = a.TRANSACTIONDATE
order by 1;
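The same "generate every hour, then left join the counts" idea can be sketched in Python to make the logic explicit. The function name `hourly_counts` and the sample timestamps are hypothetical, not taken from the asker's table.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(timestamps, first_hour, last_hour):
    """Return (hour, count) for every hour in [first_hour, last_hour], zeros included."""
    # Truncate each timestamp to its hour, like trunc(date_modified, 'HH').
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    rows = []
    hour = first_hour
    while hour <= last_hour:
        # counts.get(..., 0) plays the role of the outer join plus NVL.
        rows.append((hour, counts.get(hour, 0)))
        hour += timedelta(hours=1)
    return rows


# Hypothetical sample data.
sample = [
    datetime(2019, 1, 23, 8, 0),
    datetime(2019, 1, 23, 8, 30),
    datetime(2019, 1, 23, 9, 0),
    datetime(2019, 1, 24, 5, 1),
]
rows = hourly_counts(sample, datetime(2019, 1, 23, 7), datetime(2019, 1, 24, 6))
for hour, count in rows:
    print(hour, count)
```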
You might consider the following, which uses CONNECT BY level logic:
SELECT sum(transactioncount) as transactioncount, transactiondate
FROM
(
with "TABLE"(date_modified) as
(
SELECT timestamp'2019-01-23 08:00:00' FROM dual union all
SELECT timestamp'2019-01-23 08:30:00' FROM dual union all
SELECT timestamp'2019-01-23 09:00:00' FROM dual union all
SELECT timestamp'2019-01-24 05:01:00' FROM dual
)
SELECT nvl(count(*),0) AS transactioncount, trunc(date_modified, 'hh24') as transactiondate
FROM "TABLE" t
GROUP BY trunc(date_modified, 'HH24')
UNION ALL
SELECT 0, timestamp'2019-01-23 07:00:00' + ( level - 1 )/24
FROM dual
CONNECT BY level <= 24 * extract( day from
timestamp'2019-01-24 06:59:59'-
timestamp'2019-01-23 07:00:00') +
extract( hour from
timestamp'2019-01-24 06:59:59'-
timestamp'2019-01-23 07:00:00') + 1
)
GROUP BY transactiondate
ORDER BY transactiondate
Rextester Demo

Is there a way to group timestamp data by 30 day intervals starting from the min(date) and add them as columns

I am trying to use the min() value of a timestamp as a starting point and then group the data into 30-day intervals, in order to get a count of occurrences for each unique value within the timestamp date range as columns.
I have two tables that I am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2 (page_visits) has 2 other columns labeled url and date. The tables are joined on table1.link = table2.url.
After the join I get a table similar to this:
+-------------------+----------------------+
| url               | date                 |
+-------------------+----------------------+
| www.google.com    | 2018-01-01 00:00:00  |
| www.google.com    | 2018-01-02 00:00:00  |
| www.google.com    | 2018-02-01 00:00:00  |
| www.google.com    | 2018-02-05 00:00:00  |
| www.google.com    | 2018-03-04 00:00:00  |
| www.facebook.com  | 2014-01-05 00:00:00  |
| www.facebook.com  | 2014-01-07 00:00:00  |
| www.facebook.com  | 2014-04-02 00:00:00  |
| www.facebook.com  | 2014-04-10 00:00:00  |
| www.facebook.com  | 2014-04-11 00:00:00  |
| www.facebook.com  | 2014-05-01 00:00:00  |
| www.twitter.com   | 2016-02-01 00:00:00  |
| www.twitter.com   | 2016-03-04 00:00:00  |
+-------------------+----------------------+
What I am trying to get is a result like this:
+-------------------+----------------------+------------+------------+------------+
| url               | MIN_Date             | Interval 1 | Interval 2 | Interval 3 |
+-------------------+----------------------+------------+------------+------------+
| www.google.com    | 2018-01-01 00:00:00  | 2          | 2          | 1          |
| www.facebook.com  | 2014-01-05 00:00:00  | 2          | 0          | 1          |
| www.twitter.com   | 2016-02-01 00:00:00  | 1          | 1          | 0          |
+-------------------+----------------------+------------+------------+------------+
So the 30-day intervals begin from the min(date), as shown in Interval 1, and are counted every 30 days.
I've looked at other questions such as:
Group rows by 7 days interval starting from a certain date
MySQL query to select min datetime grouped by 30 day intervals
However, they did not seem to answer my specific problem.
I've also looked into pivot syntax but noticed it is only supported by certain DBMSs.
Any help would be greatly appreciated.
Thank you.
If I understood your question correctly, you want to count page visits in the 30-, 60- and 90-day intervals after page creation. If that is the requirement, try the SQL below:
select a11.link as url
,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0 end) Interval_1
,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0 end) Interval_2
,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0 end) Interval_3
from page_creation a11
join page_visits a12
on a11.link = a12.url
group by a11.link
If you are using BigQuery, I would recommend:
countif() to count a boolean value
timestamp_add() to add intervals to timestamps
The exact boundaries are a bit vague, but I would go for:
select pc.link as url,
countif(pv.date >= pc.dt_crtd and
pv.date < timestamp_add(pc.dt_crtd, interval 30 day)
) as Interval_00_29,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 60 day)
) as Interval_30_59,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 90 day)
) as Interval_60_89
from page_creation pc join
page_visits pv
on pc.link = pv.url
group by pc.link
The way I read your scenario, especially based on the example under "After the join i get a table similar to ...", is that you have two tables that you need to UNION, not JOIN.
So, based on that reading, the example below is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table 2):
#standardSQL
WITH `project.dataset.page_creation` AS (
SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
SELECT 'www.twitter.com', '2016-02-01 00:00:00'
), `project.dataset.page_visits` AS (
SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
SELECT 'www.twitter.com', '2016-03-04 00:00:00'
), `After the join` AS (
SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
SELECT link, dt_crtd FROM `project.dataset.page_creation`
)
SELECT
url, min_date,
COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
FROM (
SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
FROM `After the join`
)
GROUP BY url, min_date
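with the result:
| Row | url              | min_date                | Interval_1 | Interval_2 | Interval_3 |
| --- | ---------------- | ----------------------- | ---------- | ---------- | ---------- |
| 1   | www.facebook.com | 2014-01-05 00:00:00 UTC | 2          | 0          | 1          |
| 2   | www.google.com   | 2018-01-01 00:00:00 UTC | 2          | 2          | 1          |
| 3   | www.twitter.com  | 2016-02-01 00:00:00 UTC | 1          | 1          | 0          |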
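The bucketing rule (days since the first date per url, integer-divided by 30) can be checked with a short Python sketch. The function name `interval_counts` is hypothetical; the rows are the ones from the question's joined table.

```python
from collections import defaultdict
from datetime import datetime


def interval_counts(visits, days=30, buckets=3):
    """Map each url to (min_date, [count per consecutive `days`-long interval])."""
    by_url = defaultdict(list)
    for url, ts in visits:
        by_url[url].append(ts)

    result = {}
    for url, stamps in by_url.items():
        start = min(stamps)
        counts = [0] * buckets
        for ts in stamps:
            index = (ts - start).days // days  # 0 for days 0-29, 1 for 30-59, ...
            if index < buckets:                # later visits are ignored
                counts[index] += 1
        result[url] = (start, counts)
    return result


visits = [
    ("www.google.com", datetime(2018, 1, 1)),
    ("www.google.com", datetime(2018, 1, 2)),
    ("www.google.com", datetime(2018, 2, 1)),
    ("www.google.com", datetime(2018, 2, 5)),
    ("www.google.com", datetime(2018, 3, 4)),
    ("www.facebook.com", datetime(2014, 1, 5)),
    ("www.facebook.com", datetime(2014, 1, 7)),
    ("www.facebook.com", datetime(2014, 4, 2)),
    ("www.facebook.com", datetime(2014, 4, 10)),
    ("www.facebook.com", datetime(2014, 4, 11)),
    ("www.facebook.com", datetime(2014, 5, 1)),
    ("www.twitter.com", datetime(2016, 2, 1)),
    ("www.twitter.com", datetime(2016, 3, 4)),
]
summary = interval_counts(visits)
for url, (min_date, counts) in summary.items():
    print(url, min_date, counts)
```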

How to count ratio hourly?

I'm a bit stuck on how to proceed with my queries.
I have two tables, "A"(date, response, b_id) and "B"(id, country). I need to compute the hourly ratio of the number of entries where response exists to the total number of entries on a specific date. The final result should consist of the columns "hour" and "ratio".
SELECT COUNT(*) FROM A WHERE RESPONSE IS NOT NULL; -- counting entries with response
SELECT COUNT(*) FROM A; -- counting total number of entries
How do I compute the ratio? Should I create a separate variable for it?
How do I compute it for each hour of a day? Should I use something like a loop? And how can I get the "hour" part of a date?
What is the best way to select the hours and the computed ratio? Should I make a separate table for it?
I'm rather new to writing complex queries, so I would be happy for any kind of help.
You can do this as:
select to_char(datecol, 'HH24') as hour,
count(response) as has_response, count(*) as total,
count(response) / count(*) as ratio
from a
where datecol >= date '2018-09-18' and datecol < date '2018-09-19'
group by to_char(datecol, 'HH24');
You can also do this using avg() -- which is also fun:
select to_char(datecol, 'HH24'),
avg(case when response is not null then 1.0 else 0 end) as ratio
from a
where datecol >= date '2018-09-18' and datecol < date '2018-09-19'
group by to_char(datecol, 'HH24')
In this case, that requires more typing, though.
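For readers who want to verify the per-hour arithmetic, here is a minimal Python sketch of the same grouping. The function name `hourly_ratio` and the sample rows are hypothetical; `None` stands in for a NULL response.

```python
from collections import defaultdict
from datetime import datetime


def hourly_ratio(rows):
    """Return {hour: entries with a response / all entries} for rows of one day."""
    totals = defaultdict(int)
    responded = defaultdict(int)
    for ts, response in rows:
        totals[ts.hour] += 1          # like count(*)
        if response is not None:
            responded[ts.hour] += 1   # like count(response)
    return {hour: responded[hour] / totals[hour] for hour in sorted(totals)}


# Hypothetical sample rows for 2018-09-18.
sample = [
    (datetime(2018, 9, 18, 0, 0), None),
    (datetime(2018, 9, 18, 0, 10), "A"),
    (datetime(2018, 9, 18, 0, 20), "B"),
    (datetime(2018, 9, 18, 1, 0), "C"),
    (datetime(2018, 9, 18, 1, 10), "D"),
    (datetime(2018, 9, 18, 2, 0), None),
    (datetime(2018, 9, 18, 3, 0), "E"),
    (datetime(2018, 9, 18, 5, 10), "F"),
]
ratios = hourly_ratio(sample)
for hour, ratio in ratios.items():
    print(f"{hour:02d}", ratio)
```

Hours with no entries at all do not appear in the result, just as they do not appear in the GROUP BY output.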
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE A ( dt, response, b_id ) AS
SELECT DATE '2018-09-18' + INTERVAL '00:00' HOUR TO MINUTE, NULL, 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '00:10' HOUR TO MINUTE, 'A', 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '00:20' HOUR TO MINUTE, 'B', 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '01:00' HOUR TO MINUTE, 'C', 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '01:10' HOUR TO MINUTE, 'D', 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '02:00' HOUR TO MINUTE, NULL, 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '03:00' HOUR TO MINUTE, 'E', 1 FROM DUAL UNION ALL
SELECT DATE '2018-09-18' + INTERVAL '05:10' HOUR TO MINUTE, 'F', 1 FROM DUAL;
Query 1:
SELECT b_id,
TO_CHAR( TRUNC( dt, 'HH' ), 'YYYY-MM-DD HH24:MI:SS' ) AS hour,
COUNT(RESPONSE) AS total_response_per_hour,
COUNT(*) AS total_per_hour,
total_response_per_day,
total_per_day,
COUNT(response) / total_response_per_day AS ratio_for_responses,
COUNT(*) / total_per_day AS ratio
FROM (
SELECT A.*,
COUNT(RESPONSE) OVER ( PARTITION BY b_id, TRUNC( dt ) ) AS total_response_per_day,
COUNT(*) OVER ( PARTITION BY b_id, TRUNC( dt ) ) AS total_per_day
FROM A
)
GROUP BY
b_id,
total_per_day,
total_response_per_day,
TRUNC( dt, 'HH' )
ORDER BY
TRUNC( dt, 'HH' )
Results:
| B_ID | HOUR | TOTAL_RESPONSE_PER_HOUR | TOTAL_PER_HOUR | TOTAL_RESPONSE_PER_DAY | TOTAL_PER_DAY | RATIO_FOR_RESPONSES | RATIO |
|------|---------------------|-------------------------|----------------|------------------------|---------------|---------------------|-------|
| 1 | 2018-09-18 00:00:00 | 2 | 3 | 6 | 8 | 0.3333333333333333 | 0.375 |
| 1 | 2018-09-18 01:00:00 | 2 | 2 | 6 | 8 | 0.3333333333333333 | 0.25 |
| 1 | 2018-09-18 02:00:00 | 0 | 1 | 6 | 8 | 0 | 0.125 |
| 1 | 2018-09-18 03:00:00 | 1 | 1 | 6 | 8 | 0.16666666666666666 | 0.125 |
| 1 | 2018-09-18 05:00:00 | 1 | 1 | 6 | 8 | 0.16666666666666666 | 0.125 |
SELECT withResponses.hour,
withResponses.cnt AS withResponse,
alls.cnt AS AllEntries,
(withResponses.cnt / alls.cnt) AS ratio
FROM
( SELECT to_char(d, 'DD-MM-YY - HH24') || ':00 to :59 ' hour,
count(*) AS cnt
FROM A
WHERE RESPONSE IS NOT NULL
GROUP BY to_char(d, 'DD-MM-YY - HH24') || ':00 to :59 ' ) withResponses,
( SELECT to_char(d, 'DD-MM-YY - HH24') || ':00 to :59 ' hour,
count(*) AS cnt
FROM A
GROUP BY to_char(d, 'DD-MM-YY - HH24') || ':00 to :59 ' ) alls
WHERE alls.hour = withResponses.hour ;
SQLFiddle: http://sqlfiddle.com/#!4/c09b9/2