Select data grouped by time over midnight - sql

I have a table like:
ID TIMEVALUE
----- -------------
1 06.07.15 06:43:01,000000000
2 06.07.15 12:17:01,000000000
3 06.07.15 18:21:01,000000000
4 06.07.15 23:56:01,000000000
5 07.07.15 04:11:01,000000000
6 07.07.15 10:47:01,000000000
7 07.07.15 12:32:01,000000000
8 07.07.15 14:47:01,000000000
and I want to group this data by special times.
My current query looks like this:
SELECT TO_CHAR(TIMEVALUE, 'YYYY\MM\DD'), COUNT(ID),
SUM(CASE WHEN TO_CHAR(TIMEVALUE, 'HH24MI') <=700 THEN 1 ELSE 0 END) as morning,
SUM(CASE WHEN TO_CHAR(TIMEVALUE, 'HH24MI') >700 AND TO_CHAR(TIMEVALUE, 'HH24MI') <1400 THEN 1 ELSE 0 END) as daytime,
SUM(CASE WHEN TO_CHAR(TIMEVALUE, 'HH24MI') >=1400 THEN 1 ELSE 0 END) as evening FROM Table
WHERE TIMEVALUE >= to_timestamp('05.07.2015','DD.MM.YYYY')
GROUP BY TO_CHAR(TIMEVALUE, 'YYYY\MM\DD')
and I am getting this output
day overall morning daytime evening
----- ---------
2015\07\05 454 0 0 454
2015\07\06 599 113 250 236
2015\07\07 404 139 265 0
so that is fine grouping on the same day (0-7 o'clock, 7-14 o'clock and 14-24 o'clock)
But my question now is:
How can I group over midnight?
For example count from 6-14 , 14-23 and 23-6 o'clock on next day.
I hope you understand my question. You are welcome to even improve my upper query if there is a better solution.

EDIT: It is tested now: SQL Fiddle
The key is simply to adjust the group by so that anything before 6am gets grouped with the previous day. After that, the counts are pretty straight-forward.
SELECT TO_CHAR(CASE WHEN EXTRACT(HOUR FROM timevalue) < 6
THEN timevalue - 1
ELSE timevalue
END, 'YYYY\MM\DD') AS day,
COUNT(*) AS overall,
SUM(CASE WHEN EXTRACT(HOUR FROM timevalue) >= 6 AND EXTRACT(HOUR FROM timevalue) < 14
THEN 1 ELSE 0 END) AS morning,
SUM(CASE WHEN EXTRACT(HOUR FROM timevalue) >= 14 AND EXTRACT(HOUR FROM timevalue) < 23
THEN 1 ELSE 0 END) AS daytime,
SUM(CASE WHEN EXTRACT(HOUR FROM timevalue) < 6 OR EXTRACT(HOUR FROM timevalue) >= 23
THEN 1 ELSE 0 END) AS evening
FROM my_table
WHERE timevalue >= TO_TIMESTAMP('05.07.2015','DD.MM.YYYY')
GROUP BY TO_CHAR(CASE WHEN EXTRACT(HOUR FROM timevalue) < 6
THEN timevalue - 1
ELSE timevalue
END, 'YYYY\MM\DD');

Substract 1 day from timevalue for times lower than '06:00' at first and then:
SQLFiddle demo
select TO_CHAR(day, 'YYYY\MM\DD') day, COUNT(ID) cnt,
SUM(case when '23' < tvh or tvh <= '06' THEN 1 ELSE 0 END) as midnight,
SUM(case when '06' < tvh and tvh <= '14' THEN 1 ELSE 0 END) as daytime,
SUM(case when '14' < tvh and tvh <= '23' THEN 1 ELSE 0 END) as evening
FROM (
select id, to_char(TIMEVALUE, 'HH24') tvh,
trunc(case when (to_char(timevalue, 'hh24') <= '06')
then timevalue - interval '1' day
else timevalue end) day
from t1
)
GROUP BY day

Maybe you can do it like this (with some reformatting or PIVOT):
WITH spans AS
(SELECT TIMESTAMP '2015-01-01 00:00:00' + LEVEL * INTERVAL '1' HOUR AS start_time
FROM dual
CONNECT BY TIMESTAMP '2015-01-01 00:00:00' + LEVEL * INTERVAL '1' HOUR < LOCALTIMESTAMP),
t AS
(SELECT start_time, lead(start_time, 1) OVER (ORDER BY start_time) AS end_time, ROWNUM AS N
FROM spans
WHERE EXTRACT(HOUR FROM start_time) IN (6,14,23))
SELECT N, start_time, end_time, COUNT(*) AS ID_COUNT,
DECODE(EXTRACT(HOUR FROM start_time), 6,'morning', 14,'daytime', 23,'evening') AS daytime
FROM t
JOIN YOUR_TABLE WHERE TIMEVALUE BETWEEN start_time AND end_time
GROUP BY N;
Of course, the initial time value ('2015-01-01 00:00:00' in my example) has to be lower than the least date in your table.

Related

Writing a sql query that captures the 2022 YTD data and compare to same period from 2021.. ex"Jan 1 to Jun 10 2022" vs "Jan 1 to Jun 10 2021"

SUM(case when TO_DATE(WO.W_MEASURE_CLAIMS_SAVING_DATE,'YYYY-MM-DD') between (sysdate -365) and sysdate then PM.M_GROSS_ANNUAL_KWH_SVG else 0 end) as CURRENT_KWH_12,
SUM(case when TO_DATE(WO.W_MEASURE_CLAIMS_SAVING_DATE,'YYYY-MM-DD') between sysdate -730 and (sysdate -365) then PM.M_GROSS_ANNUAL_KWH_SVG else 0 end) as PREVIOUS_LAST_KWH_12,
SUM(case when TO_DATE(WO.W_MEASURE_CLAIMS_SAVING_DATE,'YYYY-MM-DD') between (sysdate -365) and sysdate then PM.M_GROSS_ANNUAL_THERM_SVG else 0 end) as CURRENT_THERMS_12,
sum(case when TO_DATE(WO.W_MEASURE_CLAIMS_SAVING_DATE,'YYYY-MM-DD') between sysdate -730 and (sysdate -365) then PM.M_GROSS_ANNUAL_THERM_SVG else 0 end) as PREVIOUS_LAST_THERMS_12,
... i think the "current" data needs to be W_MEASURE_CLAIMS_SAVING_DATE >= 2022-01-01
and the [2:09 PM] Siegel, James R
and the "previous" data needs to be W_MEASURE_CLAIMS_SAVING_DATE between 2021-01-01 and today's date -365 days
but i just need help writing it
You appear to want to compare values between the start of this year until now to between the start of last year and the same date as now last year:
SUM(
CASE
WHEN WO.W_MEASURE_CLAIMS_SAVING_DATE BETWEEN TRUNC(SYSDATE, 'YY')
AND SYSDATE
THEN PM.M_GROSS_ANNUAL_KWH_SVG
END
) AS CURRENT_KWH,
SUM(
CASE
WHEN WO.W_MEASURE_CLAIMS_SAVING_DATE BETWEEN ADD_MONTHS(TRUNC(SYSDATE, 'YY'), -12)
AND ADD_MONTHS(SYSDATE, -12)
THEN PM.M_GROSS_ANNUAL_KWH_SVG
END
) as PREVIOUS_KWH

SQL why does dateA - dateB <= '3 years' give a different result than dateA <= dateB + '3 years'

I was doing a MODE.com SQL practice question about date format.
The practice question is: Write a query that counts the number of companies acquired within 3 years, 5 years, and 10 years of being founded (in 3 separate columns). Include a column for total companies acquired as well. Group by category and limit to only rows with a founding date.
It uses two tables:
tutorial.crunchbase_companies_clean_date table, which includes information about all the companies, like company name, founded year, etc.
tutorial.crunchbase_acquisitions_clean_datetable, which includes the information about all the acquired companies, like acquired company name, acquired date, etc.
My code is:
SELECT companies.category_code,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '3 years' THEN 1 ELSE NULL END) AS less_than_3_years,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '5 years' THEN 1 ELSE NULL END) AS between_3_to_5_years,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '10 years' THEN 1 ELSE NULL END) AS within_10_years,
COUNT(1) AS total
FROM tutorial.crunchbase_companies_clean_date companies
JOIN tutorial.crunchbase_acquisitions_clean_date acq
ON companies.permalink = acq.company_permalink
WHERE companies.founded_at_clean IS NOT NULL
GROUP BY 1
ORDER BY total DESC
The result is:
My result
The answer query is:
SELECT companies.category_code,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '3 years'
THEN 1 ELSE NULL END) AS acquired_3_yrs,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '5 years'
THEN 1 ELSE NULL END) AS acquired_5_yrs,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '10 years'
THEN 1 ELSE NULL END) AS acquired_10_yrs,
COUNT(1) AS total
FROM tutorial.crunchbase_companies_clean_date companies
JOIN tutorial.crunchbase_acquisitions_clean_date acquisitions
ON acquisitions.company_permalink = companies.permalink
WHERE founded_at_clean IS NOT NULL
GROUP BY 1
ORDER BY 5 DESC
The result is:
The answer result
You can see in the screenshots that the results are very similar, but some numbers are different.
The only difference I can see between my query and the answer is in the COUNT statements, but I don't really see the difference, for example, between: acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '3 years' and acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '3 years'
I tried adding INTERVAL in my SELECT statement:
SELECT companies.category_code,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '3 years' THEN 1 ELSE NULL END) AS less_than_3_years,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '5 years' THEN 1 ELSE NULL END) AS between_3_to_5_years,
COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '10 years' THEN 1 ELSE NULL END) AS within_10_years,
COUNT(1) AS total
and remove the INTERVAL from the answer query:
SELECT companies.category_code,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '3 years'
THEN 1 ELSE NULL END) AS acquired_3_yrs,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '5 years'
THEN 1 ELSE NULL END) AS acquired_5_yrs,
COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '10 years'
THEN 1 ELSE NULL END) AS acquired_10_yrs,
COUNT(1) AS total
But the results are the same.
I tried to know the result of just the difference between the acquired_date and founded_date, to see if the value can be compared with INTERVAL. The result is in days, which looks promising to me.
The result
I try to give all the information for your consideration. Hope somebody could help. Thank you in advance!
My suggestion is to add/subtract the INTERVAL to/from one date/time and then compare with the other date/time. Don't subtract the date/times and then compare to a string literal. Your database seems to understand '3 YEARS' as 3 * 365 days, regardless of the actual number of days between someDateTime and someDateTime +/- '3 YEARS'. The actual number of days from year to year could be 365 or 366, depending on whether a leap year is crossed.
Here's a simple example of comparing with a specific interval, which also requires we know whether and how many leap years were crossed.
Fiddle
The test case:
WITH dates AS (
SELECT '2021-01-01'::date AS xdate
)
SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
, xdate - (xdate - INTERVAL '1' YEAR) = '1 YEAR' AS b1
, xdate - (xdate - INTERVAL '1' YEAR) = '365 DAYS' AS b2
, xdate - (xdate - INTERVAL '1' YEAR) = '366 DAYS' AS b3
FROM dates
;
-- AND --
WITH dates AS (
SELECT '2021-01-01'::date AS xdate
)
SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
, xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '1' YEAR AS b1
, xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '365 DAYS' AS b2
, xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '366 DAYS' AS b3
FROM dates
;
Result:
diff
b1
b2
b3
366 days
f
f
t
Fiddle
WITH dates AS (
SELECT '2021-01-01'::date AS xdate
)
, diff AS (
SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
FROM dates
)
SELECT diff
, CASE WHEN diff = (366*24*60*60 * INTERVAL '1' SECOND)
THEN 1
END AS compare1
, 366*24*60*60 AS seconds
, CASE WHEN diff = (366*24*60*60 * INTERVAL '1' SECOND)
THEN 1
END AS compare2
, CASE WHEN diff = '31622400 SECONDS'
THEN 1
END AS compare3
FROM diff
;
The result:
diff
compare1
seconds
compare2
compare3
366 days
1
31622400
1
1
Original response:
The fiddle for PostgreSQL
The behavior shown here (below) is similar to the posted behavior.
The problem is the value generated isn't necessarily what you think.
Here's a test case in postgresql which might be representative of your issue.
Maybe this is related to leap year, where the number of days in a year isn't constant.
So it's probably safer to compare the dates rather than assume some number of days, which is probably the assumption <= '3 years' makes.
The test SQL:
WITH test (acquired_at_cleaned, founded_at_clean, n) AS (
SELECT current_date, current_date - INTERVAL '4' YEAR, 4 UNION
SELECT current_date, current_date - INTERVAL '3' YEAR, 3 UNION
SELECT current_date, current_date - INTERVAL '2' YEAR, 2 UNION
SELECT current_date, current_date - INTERVAL '1' YEAR, 1
)
, cases AS (
SELECT test.*
, CASE WHEN acquired_at_cleaned <= founded_at_clean::timestamp + INTERVAL '3' year
THEN 1 ELSE NULL
END AS acquired_3_yrs_case1
, CASE WHEN acquired_at_cleaned - founded_at_clean::timestamp <= '3 year'
THEN 1 ELSE NULL
END AS acquired_3_yrs_case2
, acquired_at_cleaned - founded_at_clean::timestamp AS x1
, acquired_at_cleaned - (n * INTERVAL '1' YEAR) AS x2
FROM test
)
SELECT acquired_at_cleaned AS acquired
, founded_at_clean AS founded
, n
, acquired_3_yrs_case1 AS case1
, acquired_3_yrs_case2 AS case2
, x1, x2
FROM cases
ORDER BY founded_at_clean
;
The result:
acquired
founded
n
case1
case2
x1
x2
2021-12-25
2017-12-25 00:00:00
4
null
null
1461 days
2017-12-26 00:00:00
2021-12-25
2018-12-25 00:00:00
3
1
null
1096 days
2018-12-26 00:00:00
2021-12-25
2019-12-25 00:00:00
2
1
1
731 days
2019-12-26 00:00:00
2021-12-25
2020-12-25 00:00:00
1
1
1
365 days
2020-12-26 00:00:00
Interesting result.

how to optimize my oracle sql?

I need count in range two date,this sql is work,bug not better,can you help me?
select dmc.doctor_id,
(
select count(*)
from hele_dct_member_config dmc
WHERE (EXTRACT(YEAR FROM dmc.start_time) = 2016 OR EXTRACT(YEAR FROM dmc.end_time) = 2016) AND dmc.status=1
AND TO_DATE('2016-01-31', 'yyyy-mm-dd') BETWEEN start_time AND end_time
) Jan,
(
select count(*)
from hele_dct_member_config dmc
WHERE (EXTRACT(YEAR FROM dmc.start_time) = 2016 OR EXTRACT(YEAR FROM dmc.end_time) = 2016) AND dmc.status=1
AND TO_DATE('2016-02-28', 'yyyy-mm-dd') BETWEEN start_time AND end_time
) Feb,
.
.
.
from hele_dct_member_config dmc
enter code here
WHERE (EXTRACT(YEAR FROM dmc.start_time) = 2016 OR EXTRACT(YEAR FROM dmc.end_time) = 2016) AND dmc.status=1
grouy by dmc.doctor_id
I need count in range two date,this sql is work,bug not better,can you help me?
Use conditional aggregation:
select dmc.doctor_id,
sum(case when date '2016-01-31' BETWEEN start_time AND end_time then 1 else 0
end) as Jan,
sum(case when date '2016-02-31' BETWEEN start_time AND end_time then 1 else 0
end) as Feb,
.
.
.
from hele_dct_member_config dmc
where (extract(year from dmc.start_time) = 2016 or
extract(year from dmc.end_time) = 2016) AND
dmc.status = 1
group by dmc.doctor_id;
if i want get quarter count ? this is one way
SUM(case when date '2016-01-31' BETWEEN start_time AND end_time then 1 else 0 end) +
SUM(case when date '2016-02-28' BETWEEN start_time AND end_time then 1 else 0 end) +
SUM(case when date '2016-03-31' BETWEEN start_time AND end_time then 1 else 0 end) one

count rows before time

i have the following situation. every row has a timestamp when it was written on table. now i want to evaluate per day how many rows have been inserted before 5 am and how many after. how can that be done??
You can use the HH24 format to get the hour in 24-hour time:
select trunc(created_Date) as the_day
,sum(case when to_number(to_char(created_Date,'HH24')) < 5 then 1 else 0 end) as before_five
,sum(case when to_number(to_char(created_Date,'HH24')) >= 5 then 1 else 0 end) as after_five
from yourtable
group by trunc(created_Date)
Per USER's comment on 5:10, to show timestamps just before and after 5:
select trunc(created_Date) as the_day
,sum(case when to_number(to_char(created_Date,'HH24')) < 5 then 1 else 0 end) as before_five
,sum(case when to_number(to_char(created_Date,'HH24')) >= 5 then 1 else 0 end) as after_five
from (
-- one row januar 1 just after 5:00 a.m.
select to_Date('01/01/2015 05:10:12','dd/mm/yyyy hh24:mi:ss') as created_date from dual
union all
-- one row Januar 2 just before 5:00 a.m.
select to_Date('02/01/2015 04:59:12','dd/mm/yyyy hh24:mi:ss') as created_date from dual
)
group by trunc(created_Date);
THE_DAY, BEFORE_FIVE, AFTER_FIVE
02/01/2015, 1, 0
01/01/2015, 0, 1
Assuming your timestamp is a DATE column:
select trunc(date_written) as day
, count (case when (date_written-trunc(date_written))*24 < 5 then 1 end) before_5_count
, count (case when (date_written-trunc(date_written))*24 >= 5 then 1 end) after_5_count
from mytable
group by trunc(date_written)
select to_char(time_column, 'dd/mm/yyyy'),
sum( decode ( greatest(extract(hour from time_column), 5), extract(hour from time_column), 1, 0)) post_5,
sum( decode ( greatest(extract(hour from time_column), 5), extract(hour from time_column), 0, 1)) pre_5
from test_time
group by to_char(time_column, 'dd/mm/yyyy')

SQL Results group by month

I'm trying to return some results spread over a rolling 12 month period eg:
MONTH IN OUT
January 210 191
February 200 111
March 132 141
April 112 141
May 191 188
etc...
How do I spread the results over a date range, populating the first column with the month name?
IN MSSQL it would be something like:
SELECT COUNT(problem.problem_type = 'IN') AS IN,
COUNT(problem.problem_type = 'OUT') AS OUT,
DATEPART(year, DateTime) as Year,
DATEPART(month, DateTime) as Month
FROM problem
WHERE (DateTime >= dbo.FormatDateTime('2010-01-01'))
AND
(DateTime < dbo.FormatDateTime('2010-01-31'))
GROUP BY DATEPART(year, DateTime),
DATEPART(month, DateTime);
But this is against an Oracle database so DATEPART and DateTime are not available.
My Problem table is roughly:
problem_ID Problem_type IN_Date OUT_Date
1 IN 2010-01-23 16:34:29.0 2010-02-29 13:06:28.0
2 IN 2010-01-27 12:34:29.0 2010-01-29 12:01:28.0
3 OUT 2010-02-13 13:24:29.0 2010-09-29 15:04:28.0
4 OUT 2010-02-15 16:31:29.0 2010-07-29 11:03:28.0
Use:
SELECT SUM(CASE WHEN p.problem_type = 'IN' THEN 1 ELSE 0 END) AS IN,
SUM(CASE WHEN p.problem_type = 'OUT' THEN 1 ELSE 0 END) AS OUT,
TO_CHAR(datetime, 'YYYY') AS year,
TO_CHAR(datetime, 'MM') AS month
FROM PROBLEM p
WHERE p.DateTime >= TO_DATE('2010-01-01', 'YYYY-MM-DD')
AND p.DateTime < TO_DATE('2010-01-31', 'YYYY-MM-DD')
GROUP BY TO_CHAR(datetime, 'YYYY'), TO_CHAR(datetime, 'MM')
You could also use:
SELECT SUM(CASE WHEN p.problem_type = 'IN' THEN 1 ELSE 0 END) AS IN,
SUM(CASE WHEN p.problem_type = 'OUT' THEN 1 ELSE 0 END) AS OUT,
TO_CHAR(datetime, 'MM-YYYY') AS mon_year
FROM PROBLEM p
WHERE p.DateTime >= TO_DATE('2010-01-01', 'YYYY-MM-DD')
AND p.DateTime < TO_DATE('2010-01-31', 'YYYY-MM-DD')
GROUP BY TO_CHAR(datetime, 'MM-YYYY')
Reference:
TO_CHAR
TO_DATE
You probably want something like
SELECT SUM( (CASE WHEN problem_type = 'IN' THEN 1 ELSE 0 END) ) in,
SUM( (CASE WHEN problem_type = 'OUT' THEN 1 ELSE 0 END) ) out,
EXTRACT( year FROM DateTime ) year,
EXTRACT( month FROM DateTime ) month
FROM problem
WHERE DateTime >= date '2010-01-01'
AND DateTime < date '2010-01-31'
GROUP BY EXTRACT( year FROM DateTime ),
EXTRACT( month FROM DateTime )