I have an Orders table in my SQLite database. What I want to do is group the data by every 168 hours (7 days) and count the total Orders per 168-hour bucket.
What I did was create an in-memory "calendar table" and join my Orders table to that calendar set.
This works fine when I group by 12, 24, 48, or even 120 hours (5 days). But for some reason it doesn't work when I group by 168 hours (7 days): I get NULL values back instead of what count() should really return.
The following SQL code is an example that groups by every 120 hours (5 days):
CREATE TABLE IF NOT EXISTS Orders (
Id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
Key TEXT,
Timestamp TEXT NOT NULL
);
INSERT INTO Orders (Key, Timestamp) VALUES ('k1', '2019-10-01 10:00:23');
INSERT INTO Orders (Key, Timestamp) VALUES ('k2', '2019-10-01 15:45:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k3', '2019-10-02 17:05:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k4', '2019-10-03 20:12:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k5', '2019-10-04 08:49:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k6', '2019-10-05 11:24:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k7', '2019-10-07 11:24:19');
WITH RECURSIVE dates(date1) AS (
VALUES('2019-10-01 00:00:00')
UNION ALL
SELECT datetime(date1, '+120 hours')
FROM dates
WHERE date1 <= '2019-10-29 00:00:00'
)
SELECT date1 as __ddd, d2.* FROM dates AS d1
LEFT JOIN (
SELECT count(Key) AS OrderKey,
datetime((strftime('%s', timestamp) / 432000) * 432000, 'unixepoch') as __interval
FROM `Orders`
WHERE `Timestamp` >= '2019-09-29T00:00:00.000'
GROUP BY __interval LIMIT 10
) d2 ON d1.date1 = d2.__interval
Important note:
If you want to update this code to test it with 168 hours (7 days), you should do the following:
Change +120 hours to +168 hours
Change 432000 (432000 seconds == 120 hours) to 604800 (604800 seconds == 168 hours); note that this number occurs twice, and both occurrences should be replaced
Does anyone have any idea why it stops working properly when I change the SQL code to 168 hours?
Your problem is that when you change to a 7-day interval, the values in your dates CTE no longer align with the intervals generated from your Orders table: integer-dividing the Unix timestamp by 604800 produces buckets aligned to the Unix epoch (1970-01-01, which was a Thursday), so every bucket starts on a Thursday at 00:00, while your CTE starts on 2019-10-01, a Tuesday. You can fix that by making the dates CTE start on a similarly aligned date:
WITH RECURSIVE dates(date1) AS (
SELECT datetime((strftime('%s', '2019-10-01 00:00:00') / 604800) * 604800, 'unixepoch')
UNION ALL
SELECT datetime(date1, '+168 hours')
FROM dates
WHERE date1 <= '2019-10-29 00:00:00'
)
Output:
__ddd OrderKey __interval
2019-09-26 00:00:00 3 2019-09-26 00:00:00
2019-10-03 00:00:00 4 2019-10-03 00:00:00
2019-10-10 00:00:00 null null
2019-10-17 00:00:00 null null
2019-10-24 00:00:00 null null
2019-10-31 00:00:00 null null
Demo on dbfiddle
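To see why the 168-hour case misaligns, here is a small Python sketch (using the stdlib sqlite3 module; not part of the original post) that evaluates the same bucket expression the subquery uses:

```python
import sqlite3
from datetime import datetime

# Illustration, not the asker's schema: the Unix epoch (1970-01-01) was a
# Thursday, so floor-dividing epoch seconds by 604800 yields week buckets
# that all begin on a Thursday at 00:00 UTC -- never on 2019-10-01.
con = sqlite3.connect(":memory:")
bucket, = con.execute(
    "SELECT datetime((strftime('%s', '2019-10-01 00:00:00') / 604800) * 604800,"
    " 'unixepoch')"
).fetchone()
print(bucket)  # 2019-09-26 00:00:00 -- a Thursday, matching the answer's output
```

With a 120-hour (432000-second) step this happened to work only because 2019-10-01 00:00 UTC is itself an exact multiple of 432000 seconds since the epoch.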
I am trying to calculate the number of hours of operation per week for each facility in a region. The part I am struggling with is that there are multiple overlapping programs each day, which contribute to the total hours.
Here is a sample of the table I am working with:
location  program  date      start_time  end_time
a         1        09-22-21  14:45:00    15:45:00
a         2        09-22-21  15:30:00    16:30:00
b         88       09-22-21  10:45:00    12:45:00
b         89       09-22-21  10:45:00    14:45:00
I am hoping to get:
location  hours of operation
a         1.75
b         4
I've tried using SUM with DATEDIFF and some WHERE clauses but couldn't get them to work. What I have found is how to identify the overlapping ranges (Detect overlapping date ranges from the same table), but not how to sum the differences to get the desired outcome of total non-overlapping hours of operation.
I believe you are trying to identify the total hours of operation for each location. Because some programs can overlap, you want to avoid double-counting the overlap. To do this, I generate a tally table of every possible 15-minute increment in the day and then count the increments that have at least one program operating.
Identify Total Hours of Operation per Date
DROP TABLE IF EXISTS #OperationSchedule
CREATE TABLE #OperationSchedule (ID INT IDENTITY(1,1),Location CHAR(1),Program INT,OpDate DATE,OpStart TIME(0),OpEnd TIME(0))
INSERT INTO #OperationSchedule (Location, Program, OpDate, OpStart, OpEnd)
VALUES ('a',1,'09-22-21','14:45:00','15:45:00')
,('a',2,'09-22-21','15:30:00','16:30:00')
,('b',88,'09-22-21','10:45:00','12:45:00')
,('b',89,'09-22-21','10:45:00','14:45:00');
/*1 row per 15 minute increment in a day*/
;WITH cte_TimeIncrement AS (
SELECT StartTime = CAST('00:00' AS TIME(0))
UNION ALL
SELECT DATEADD(minute,15,StartTime)
FROM cte_TimeIncrement
WHERE StartTime < '23:45'
),
/*1 row per date in data*/
cte_DistinctDate AS (
SELECT OpDate
FROM #OperationSchedule
GROUP BY OpDate
),
/*Cross join to generate 1 row for each time increment*/
cte_DatetimeIncrement AS (
SELECT *
FROM cte_DistinctDate
CROSS JOIN cte_TimeIncrement
)
/*Join and count each time interval that has a match to identify times when location is operating*/
SELECT Location
,A.OpDate
,HoursOfOperation = CAST(COUNT(DISTINCT StartTime) * 15/60.0 AS Decimal(4,2))
FROM cte_DatetimeIncrement AS A
INNER JOIN #OperationSchedule AS B
ON A.OpDate = B.OpDate
AND A.StartTime >= B.OpStart
AND A.StartTime < B.OpEnd
GROUP BY Location,A.OpDate
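The tally-table idea can also be sketched in plain Python (sample data from the question; an illustration of the technique, not the answer's T-SQL):

```python
from datetime import timedelta

# Enumerate every 15-minute slot of the day and count the distinct slots
# covered by at least one program per location -- overlapping programs
# cover the same slot, so it is only counted once.
schedule = [
    ("a", "14:45", "15:45"), ("a", "15:30", "16:30"),
    ("b", "10:45", "12:45"), ("b", "10:45", "14:45"),
]
slots = [timedelta(minutes=15 * i) for i in range(96)]  # 00:00 .. 23:45

def hours_of_operation(location):
    def td(hhmm):
        h, m = map(int, hhmm.split(":"))
        return timedelta(hours=h, minutes=m)
    covered = {s for s in slots
               for loc, start, end in schedule
               if loc == location and td(start) <= s < td(end)}
    return len(covered) * 15 / 60.0

print(hours_of_operation("a"))  # 1.75
print(hours_of_operation("b"))  # 4.0
```

Note that this only works because the sample times all fall on 15-minute boundaries, which is the same assumption the tally-table answer makes.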
Here is an alternative method that avoids rounding to the nearest 15-minute increment:
Declare @OperationSchedule table (
ID int Identity(1, 1)
, Location char(1)
, Program int
, OpDate date
, OpStart time(0)
, OpEnd time(0)
);
Insert Into @OperationSchedule (Location, Program, OpDate, OpStart, OpEnd)
Values ('a', 1, '09-22-21', '14:45:00', '15:45:00')
, ('a', 2, '09-22-21', '15:30:00', '16:30:00')
, ('b', 88, '09-22-21', '10:45:00', '12:45:00')
, ('b', 89, '09-22-21', '10:45:00', '14:45:00')
, ('c', 23, '09-22-21', '12:45:00', '13:45:00')
, ('c', 24, '09-22-21', '14:45:00', '15:15:00')
, ('3', 48, '09-22-21', '09:05:00', '13:55:00')
, ('3', 49, '09-22-21', '14:25:00', '15:38:00')
;
With overlappedData
As (
Select *
, overlap_op = lead(os.OpStart, 1, os.OpEnd) Over(Partition By os.Location Order By os.ID)
From @OperationSchedule os
)
)
Select od.Location
, start_date = min(od.OpStart)
, end_date = max(iif(od.OpEnd < od.overlap_op, od.OpEnd, od.overlap_op))
, hours_of_operation = sum(datediff(minute, od.OpStart, iif(od.OpEnd < od.overlap_op, od.OpEnd, od.overlap_op)) / 60.0)
From overlappedData od
Group By
od.Location;
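For completeness, the classic way to get non-overlapping totals (a hypothetical helper, not part of either answer) is to sort each location's intervals and merge overlaps before summing, which also copes with an interval that is entirely contained in another:

```python
# Merge sorted intervals and sum only the covered minutes.
def merged_minutes(intervals):
    """intervals: list of (start_min, end_min) tuples; returns covered minutes."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:   # gap: close the current run
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:                                    # overlap: extend the current run
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

# location 'a': 14:45-15:45 and 15:30-16:30 expressed as minutes since midnight
print(merged_minutes([(885, 945), (930, 990)]) / 60.0)  # 1.75
```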
I have a table in DB2 which contains data such as the following:
completed_timestamp  details
2021-12-19-15.38.10  abcd
2021-12-19-15.39.10  efgh
2021-12-19-15.48.10  ijkl
2021-12-19-15.49.10  mnop
2021-12-19-15.54.10  qrst
I want to be able to count the number of rows in the table for every 10 minutes e.g.
Time              count
2021-12-19-15.40  2
2021-12-19-15.50  2
2021-12-19-16.00  1
(completed_timestamp is a TIMESTAMP column; details is a VARCHAR.)
I've seen this done in other SQL dialects, but so far I have not figured out how to do it with DB2. How would I do that?
You can get the minute from the timestamp, divide it by ten, round up to the next integer, multiply by ten, and add the result as minutes to the hour-truncated timestamp:
WITH TABLE1 (completed_timestamp, details) AS (
VALUES
(timestamp '2021-12-19-15.38.10', 'abcd'),
(timestamp '2021-12-19-15.39.10', 'efgh'),
(timestamp '2021-12-19-15.48.10', 'ijkl'),
(timestamp '2021-12-19-15.49.10', 'mnop'),
(timestamp '2021-12-19-15.54.10', 'qrst')
),
trunc_timestamps (time, details) AS (
SELECT trunc(completed_timestamp, 'HH24') + (ceiling(minute(completed_timestamp) / 10.0) * 10) MINUTES, details FROM table1
)
SELECT trunc_timestamps.time, count(*) FROM trunc_timestamps GROUP BY trunc_timestamps.time
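The same rounding logic can be sketched in Python (sample rows from the question) to check the expected buckets:

```python
import math
from collections import Counter
from datetime import datetime, timedelta

# Truncate to the hour, then add ceil(minute / 10) * 10 minutes --
# mirroring trunc(.., 'HH24') + ceiling(minute(..) / 10.0) * 10 MINUTES.
def round_up_10min(ts):
    hour = ts.replace(minute=0, second=0, microsecond=0)
    return hour + timedelta(minutes=math.ceil(ts.minute / 10) * 10)

rows = ["2021-12-19 15:38:10", "2021-12-19 15:39:10", "2021-12-19 15:48:10",
        "2021-12-19 15:49:10", "2021-12-19 15:54:10"]
counts = Counter(round_up_10min(datetime.fromisoformat(r)) for r in rows)
for t, n in sorted(counts.items()):
    print(t, n)
# 2021-12-19 15:40:00 2
# 2021-12-19 15:50:00 2
# 2021-12-19 16:00:00 1
```

Like the SQL, this ignores seconds, so a row at exactly hh:40:30 would still land in the hh:40 bucket.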
I have data like this
availabilities
[{"starts_at":"09:00","ends_at":"17:00"}]
I have the query below and it works:
select COALESCE(availabilities,'Total') as availabilities,
SUM(DATEDIFF(minute,start_at,end_at)) as 'Total Available Hours in Minutes'
from (
select cast(availabilities as NVARCHAR) as availabilities,
cast(SUBSTRING(availabilities,16,5) as time) as start_at,
cast(SUBSTRING(availabilities,34,5) as time) as end_at
from alfy.dbo.daily_availabilities
)x
GROUP by ROLLUP(availabilities);
Result
availabilities Total Available Hours in Minutes
[{"starts_at":"09:00","ends_at":"17:00"}] 480
But what if the data looks like this?
availabilities
[{"starts_at":"10:00","ends_at":"13:30"},{"starts_at":"14:00","ends_at":"18:00"}]
[{"starts_at":"09:00","ends_at":"12:30"},{"starts_at":"13:00","ends_at":"15:30"},{"starts_at":"16:00","ends_at":"18:00"}]
How do I count the number of minutes over two or more time ranges?
Since you have JSON data use OPENJSON (Transact-SQL) to parse it, e.g.:
create table dbo.daily_availabilities (
id int,
availabilities nvarchar(max) --JSON
);
insert dbo.daily_availabilities (id, availabilities) values
(1, N'[{"starts_at":"09:00","ends_at":"17:00"}]'),
(2, N'[{"starts_at":"10:00","ends_at":"13:30"},{"starts_at":"14:00","ends_at":"18:00"}]'),
(3, N'[{"starts_at":"09:00","ends_at":"12:30"},{"starts_at":"13:00","ends_at":"15:30"},{"starts_at":"16:00","ends_at":"18:00"}]');
select id, sum(datediff(mi, starts_at, ends_at)) as total_minutes
from dbo.daily_availabilities
cross apply openjson(availabilities) with (
starts_at time,
ends_at time
) av
group by id
id  total_minutes
1   480
2   450
3   480
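As a cross-check of the per-range arithmetic, the same sum in plain Python (hypothetical helper name, using the stdlib json module rather than OPENJSON):

```python
import json
from datetime import datetime

# Parse the JSON array and sum the minutes of each range independently
# (ranges in this data do not overlap, so a plain sum is enough).
def total_minutes(availabilities):
    fmt = "%H:%M"
    return sum(
        int((datetime.strptime(r["ends_at"], fmt)
             - datetime.strptime(r["starts_at"], fmt)).total_seconds() // 60)
        for r in json.loads(availabilities)
    )

print(total_minutes('[{"starts_at":"10:00","ends_at":"13:30"},'
                    '{"starts_at":"14:00","ends_at":"18:00"}]'))  # 450
```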
I'm relatively new to working with PostgreSQL and I could use some help with this.
Suppose I have a table in which forecasted values (let's say temperature) are stored, indicated by a dump_date_time. The dump_date_time is the date-time at which the values were stored in the table. The temperature forecasts are also indicated by the date_time to which the forecast corresponds. Let's say that a forecast is published every 6 hours.
Example:
At 06:00 today the temperature for tomorrow at 16:00 is published and stored in the table. Then at 12:00 today the temperature for tomorrow at 16:00 is published and also stored in the table. I now have two forecasts for the same date_time (16:00 tomorrow) which are published at two different times (06:00 and 12:00 today), indicated by the dump_date_time.
All these values are stored in the same table, with three columns: dump_date_time, date_time and value. My goal is to SELECT from this table the difference between the temperatures of the two forecasts. How do I do this?
One option uses a join:
select date_time, t1.value - t2.value value_diff
from mytable t1
inner join mytable t2 using (date_time)
where t1.dump_date_time = '2020-01-01 06:00:00'::timestamp
and t2.dump_date_time = '2020-01-01 12:00:00'::timestamp
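A self-contained sketch of this self-join, run through SQLite in memory for portability (column names from the question; the 06:00 and 12:00 dump times come from the question's example):

```python
import sqlite3

# Two forecasts for the same target date_time, published at different times.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE forecast (dump_date_time TEXT, date_time TEXT, value REAL)")
con.executemany("INSERT INTO forecast VALUES (?, ?, ?)", [
    ("2020-09-24 06:00", "2020-09-25 16:00", 50.0),
    ("2020-09-24 12:00", "2020-09-25 16:00", 52.0),
])
# Later forecast minus earlier forecast for the same date_time.
diff, = con.execute("""
    SELECT t2.value - t1.value
    FROM forecast t1
    JOIN forecast t2 USING (date_time)
    WHERE t1.dump_date_time = '2020-09-24 06:00'
      AND t2.dump_date_time = '2020-09-24 12:00'
""").fetchone()
print(diff)  # 2.0
```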
Something like:
create table forecast(dump_date_time timestamptz, date_time timestamptz, value numeric)
insert into forecast values ('09/24/2020 06:00', '09/25/2020 16:00', 50), ('09/24/2020 12:00', '09/25/2020 16:00', 52);
select max(value) - min(value) from forecast where date_time = '09/25/2020 16:00';
?column?
----------
2
--Specifying dump_date_time range
select
max(value) - min(value)
from
forecast
where
date_time = '09/25/2020 16:00'
and
dump_date_time <@
tstzrange(current_date + interval '6 hours',
current_date + interval '12 hours', '[]');
?column?
----------
2
This is a very simple case. If you need something else you will need to provide more information.
UPDATE
Add example that uses timestamptz range to select dump_date_time in range.
I need to get the count of records using PostgreSQL from 7:00:00 am until 6:59:59 am the next day, with the count resetting again at 7:00 am.
The backend is Java (Spring Boot).
The columns in my table are
id (primary_id)
createdon (timestamp)
name
department
createdby
How do I write the condition for this shift-wise count?
You'd need to pick a slice based on the current time of day (I am assuming this is some kind of counter that will be auto-refreshed in an application).
One way to do that is using time ranges:
SELECT COUNT(*)
FROM mytable
WHERE createdon <@ (
SELECT CASE
WHEN current_time < '07:00'::time THEN
tsrange(CURRENT_DATE - '1d'::interval + '07:00'::time, CURRENT_DATE + '07:00'::time, '[)')
ELSE
tsrange(CURRENT_DATE + '07:00'::time, CURRENT_DATE + '1d'::interval + '07:00'::time, '[)')
END
)
;
Example with data: https://rextester.com/LGIJ9639
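The CASE logic can be sketched as a small Python helper (hypothetical name, for illustration): before 07:00 the current shift started yesterday at 07:00; otherwise it started today at 07:00.

```python
from datetime import datetime, time, timedelta

def shift_window(now):
    """Return the half-open [start, end) window of the shift containing `now`."""
    seven = time(7, 0)
    start_day = now.date() - timedelta(days=1) if now.time() < seven else now.date()
    start = datetime.combine(start_day, seven)
    return start, start + timedelta(days=1)

start, end = shift_window(datetime(2024, 3, 5, 6, 30))
print(start, end)  # 2024-03-04 07:00:00 2024-03-05 07:00:00
```

The half-open `[)` bound matches the tsrange in the query: a record created at exactly 07:00:00 belongs to the new shift.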
As I understand the question, you need to have a separate group for values in each 24-hour period that starts at 07:00:00.
SELECT
(
date_trunc('day', (createdon - '7h'::interval))
+ '7h'::interval
) AS date_bucket,
count(id) AS count
FROM lorem
GROUP BY date_bucket
ORDER BY date_bucket
This uses the date and time functions and the GROUP BY clause:
Shift the timestamp value back 7 hours ((createdon - '7h'::interval)), so the distinction can be made by a change of date (at 00:00:00). Then,
Truncate the value to the date (date_trunc('day', …)), so that all values in a bucket are flattened to a single value (the date at midnight). Then,
Add 7 hours again to the value (… + '7h'::interval), so that it represents the starting time of the bucket. Then,
Group by that value (GROUP BY date_bucket).
A more complete example, with schema and data:
DROP TABLE IF EXISTS lorem;
CREATE TABLE lorem (
id serial PRIMARY KEY,
createdon timestamp not null
);
INSERT INTO lorem (createdon) (
SELECT
generate_series(
CURRENT_TIMESTAMP - '36h'::interval,
CURRENT_TIMESTAMP + '36h'::interval,
'45m'::interval)
);
Now the query:
SELECT
(
date_trunc('day', (createdon - '7h'::interval))
+ '7h'::interval
) AS date_bucket,
count(id) AS count
FROM lorem
GROUP BY date_bucket
ORDER BY date_bucket
;
produces this result:
date_bucket | count
---------------------+-------
2019-03-06 07:00:00 | 17
2019-03-07 07:00:00 | 32
2019-03-08 07:00:00 | 32
2019-03-09 07:00:00 | 16
(4 rows)
You can use aggregation -- by subtracting 7 hours:
select (createdon - interval '7 hour')::date as dy, count(*)
from t
group by dy
order by dy;
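The shift-back trick that both grouping answers rely on can be sketched in Python (sample timestamps are made up): subtracting 7 hours makes a 07:00-to-07:00 shift fall on a single calendar date, which can then be grouped on directly.

```python
from datetime import date, datetime, timedelta
from collections import Counter

# 06:59 belongs to the previous day's shift; 07:00 starts a new one.
rows = [datetime(2024, 3, 5, 6, 59), datetime(2024, 3, 5, 7, 0),
        datetime(2024, 3, 5, 23, 30), datetime(2024, 3, 6, 6, 59)]
counts = Counter((ts - timedelta(hours=7)).date() for ts in rows)
for d, n in sorted(counts.items()):
    print(d, n)
# 2024-03-04 1
# 2024-03-05 3
```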