Group By day for custom time interval - cratedb

I'm very new to SQL and time series databases. I'm using CrateDB. I want to aggregate the data by day, but each day should start at 9 AM instead of 12 AM.
The time interval is 9 AM to 11:59 PM.
The data is stored with Unix timestamps. The following is my sample table.
|sensorid | timestamp | reading|
====================================
|1 | 1616457600 | 10 |
|1 | 1616461200 | 100 |
|2 | 1616493600 | 1 |
|2 | 1616493601 | 10 |
Currently I group using the following query, but it gives the start time as 12 AM.
select date_trunc('day', v.timestamp) as day,sum(reading)
from sensor1 v(timestamp)
group by (DAY)
From the above table, I want the sum of the readings at 1616493600 and 1616493601 (the result is 11), because 1616457600 and 1616461200 fall before 9 AM.

You want to add nine hours to midnight:
date_trunc('day', v.timestamp) + interval '9' hour
Edit: If you want to exclude hours before 9:00 from the data you add up, you must add a WHERE clause:
where extract(hour from v.timestamp) >= 9
Here is a complete query with all relevant data:
select
    date_trunc('day', v.timestamp) as day,
    date_trunc('day', v.timestamp) + interval '9' hour as day_start,
    min(v.timestamp) as first_data,
    max(v.timestamp) as last_data,
    sum(reading) as total_reading
from sensor1 v(timestamp)
where extract(hour from v.timestamp) >= 9
group by day
order by day;
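For anyone who wants to check the logic outside the database, here is a minimal Python sketch of the same idea: keep only readings at or after 09:00 and sum them per calendar day. It assumes the timestamps are Unix seconds interpreted as UTC. The function name and the `start_hour` parameter are made up for illustration and are not part of CrateDB.

```python
from collections import defaultdict
from datetime import datetime, timezone

def sum_per_day_from_hour(rows, start_hour=9):
    """Sum readings per UTC day, ignoring rows before start_hour.

    rows: iterable of (sensorid, unix_seconds, reading) tuples.
    Assumption: timestamps are Unix seconds and days are UTC days.
    """
    totals = defaultdict(int)
    for _sensorid, unix_seconds, reading in rows:
        moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
        if moment.hour >= start_hour:  # same filter as the WHERE clause
            totals[moment.date().isoformat()] += reading
    return dict(totals)

# Sample rows from the question: (sensorid, timestamp, reading).
sample = [
    (1, 1616457600, 10),   # 2021-03-23 00:00:00 UTC, before 09:00
    (1, 1616461200, 100),  # 2021-03-23 01:00:00 UTC, before 09:00
    (2, 1616493600, 1),    # 2021-03-23 10:00:00 UTC
    (2, 1616493601, 10),   # 2021-03-23 10:00:01 UTC
]
print(sum_per_day_from_hour(sample))  # {'2021-03-23': 11}
```

With the sample rows from the question, only the two readings at 10:00 survive the filter, which gives the expected total of 11.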

Related

how to query time-series data in postgresql to find spikes

I have a table called cpu_usages and I'm trying to find spikes of cpu usage. My table stores 4 columns:
id serial
at timestamp
cpu_usage float
cpu_core int
The at column stores a timestamp for every minute of every day.
For each timestamp, I want to look at the next 3 minutes and return the row if any of those timestamps has a CPU usage value at least 3% higher than the starting value for that timestamp.
So for example if I have these rows:
id|at|cpu_usage|cpu_core
1 | 2019-01-01-00:00|1|0
2 | 2019-01-01-00:01|1|0
3 | 2019-01-01-00:02|4|0
4 | 2019-01-01-00:03|1|0
5 | 2019-01-01-00:04|1|0
6 | 2019-01-01-00:05|1|0
7 | 2019-01-01-00:06|1|0
8 | 2019-01-01-00:07|1|0
9 | 2019-01-01-00:08|6|0
10 | 2019-01-01-00:00|1|1
11 | 2019-01-01-00:01|1|1
12 | 2019-01-01-00:02|4|1
13 | 2019-01-01-00:03|1|1
14 | 2019-01-01-00:04|1|1
15 | 2019-01-01-00:05|1|1
16 | 2019-01-01-00:06|1|1
17 | 2019-01-01-00:07|1|1
18 | 2019-01-01-00:08|6|1
It would return rows:
1,2,6,7,8
I am not sure how to do this because it sounds like it needs some sort of nested joins.
Can anyone assist me with this?
This answers the original version of the question.
Just use window functions. Assuming you want the larger value, then you want to look back not forward:
select t.*
from (select t.*,
             max(cpu_value) over (order by timestamp
                                  range between interval '3 minute' preceding and interval '1 second' preceding
                                 ) as previous_min
      from t
     ) t
where previous_min * 1.03 < cpu_value;
EDIT:
Looking forward, this would be:
select t.*
from (select t.*,
             min(cpu_value) over (order by timestamp
                                  range between interval '1 second' following and interval '3 minute' following
                                 ) as next_min
      from t
     ) t
where cpu_value * 1.03 > next_min;
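Since the queries above were written for an earlier version of the question, here is a small Python sketch of the rule as the question currently states it: for each row, look at the following 3 minutes on the same core and keep the row if any later value is at least 3% higher. Treating "3% higher" as a factor of 1.03 and handling each cpu_core separately are my assumptions, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

def find_spike_starts(rows, window=timedelta(minutes=3), factor=1.03):
    """Return ids of rows followed, within `window`, by a value >= usage * factor.

    rows: iterable of (id, at, cpu_usage, cpu_core) tuples.
    Assumption: rows are compared only with rows of the same cpu_core.
    """
    by_core = {}
    for row in rows:
        by_core.setdefault(row[3], []).append(row)

    hits = []
    for core_rows in by_core.values():
        core_rows.sort(key=lambda r: r[1])  # order by timestamp
        for i, (row_id, at, usage, _core) in enumerate(core_rows):
            for _later_id, later_at, later_usage, _c in core_rows[i + 1:]:
                if later_at - at > window:
                    break  # past the 3-minute look-ahead
                if later_usage >= usage * factor:
                    hits.append(row_id)
                    break
    return sorted(hits)

# Rebuild the sample data from the question: 9 minutes for each of 2 cores.
values = [1, 1, 4, 1, 1, 1, 1, 1, 6]
base = datetime(2019, 1, 1, 0, 0)
rows = []
for core in (0, 1):
    for minute, usage in enumerate(values):
        row_id = core * len(values) + minute + 1
        rows.append((row_id, base + timedelta(minutes=minute), usage, core))

print(find_spike_starts(rows))  # [1, 2, 6, 7, 8, 10, 11, 15, 16, 17]
```

For core 0 this returns rows 1, 2, 6, 7 and 8, matching the expected output in the question; core 1 has the same pattern, so its corresponding rows are returned as well.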

Grouping the table data (id, machine_id, telemetry_time, riskscore, current) by plant shift time and considering the date as the shift start time

I have a table with a timestamp and some other metrics, like riskscore and current, of a machine. The plant shift starts at 08:00 am and ends at 08:00 am the next day.
I want to group the data by day (shift: 08:00 am to 08:00 am the next day) of the timestamp and label it with the shift start date. I have 6 months of data.
expected output:
machine | date | avg_riskscore | avg_current
2 | 2020-12-02 | 25.5 | 10
This record is the group of data between '2020-12-02 08:00:00' and '2020-12-03 08:00:00' and should be inserted with date '2020-12-02'.
I need to aggregate the 6 months of data like this.
DB Fiddle
You can just offset the timestamp by 8 hours, then truncate to date and aggregate. Based on your fiddle, that would be:
select
    equipment_id,
    (telemetry_time - interval '8 hour')::date as date,
    avg(riskscore) as avg_riskscore,
    avg(i_rms) as avg_i_rms
from telemetry_test
group by equipment_id, date
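To see why the 8-hour offset works, here is a minimal Python sketch of the same trick: subtract the shift start from the timestamp and take the date of the result. The constant and function names are hypothetical, and the timestamps are assumed to be naive local times, as in the question.

```python
from datetime import datetime, timedelta

SHIFT_START = timedelta(hours=8)  # shift begins at 08:00, as in the question

def shift_date(telemetry_time):
    """Return the date of the shift that a timestamp belongs to.

    Anything before 08:00 is pushed back to the previous calendar day,
    which is the day the shift started.
    """
    return (telemetry_time - SHIFT_START).date()

print(shift_date(datetime(2020, 12, 2, 8, 0, 0)))    # 2020-12-02
print(shift_date(datetime(2020, 12, 3, 7, 59, 59)))  # 2020-12-02
print(shift_date(datetime(2020, 12, 3, 8, 0, 0)))    # 2020-12-03
```

Every timestamp from '2020-12-02 08:00:00' up to, but not including, '2020-12-03 08:00:00' lands on 2020-12-02, so grouping by this value groups by shift.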

Aggregate data based on unix time stamp crate database

I'm very new to SQL and time series databases. I'm using CrateDB (which I think uses PostgreSQL). I want to aggregate the data by hour, day, week and month. The data is stored with Unix timestamps. The following is my sample table.
|sensorid | timestamp | reading|
====================================
|1 | 1604192522 | 10 |
|1 | 1604192702 | 9.65 |
|2 | 1605783723 | 8.1 |
|2 | 1601514122 | 9.6 |
|2 | 1602292210 | 10 |
|2 | 1602291611 | 12 |
|2 | 1602291615 | 10 |
I tried a SQL query using FROM_UNIXTIME, but it is not supported.
Please help me.
I'm looking for hourly output as follows.
sensorid ,reading , timestamp
1 19.65(10+9.65) 1604192400(starting hour unix time)
2 8.1 1605783600(starting hour unix time)
2 9.6 1601514000(starting hour unix time)
2 32 (10+12+10) 1602291600(starting hour unix time)
I'm looking for monthly output like this:
sensorid , reading , timestamp
1 24.61(10+9.65+8.1) 1604192400(starting month unix time)
2 41.6(9.6+10+12+10) 1601510400(starting month unix time)
A straightforward approach is:
SELECT
(date '1970-01-01' + unixtime * interval '1 second')::date as date,
extract(hour from date '1970-01-01' + unixtime * interval '1 second') AS hour,
count(c.user) AS count
FROM core c
GROUP BY 1,2
If you are content with having the date and time in the same column (which would seem more helpful to me), you can use date_trunc():
select
date_trunc('hour', date '1970-01-01' + unixtime * interval '1 second') as date_hour,
count(c.user) AS count
FROM core c
GROUP BY 1
You can convert a unix timestamp to a date/time value using to_timestamp(). You can aggregate along multiple dimensions at the same time using grouping sets. So, you might want:
select date_trunc('year', v.ts) as year,
date_trunc('month', v.ts) as month,
date_trunc('week', v.ts) as week,
date_trunc('day', v.ts) as day,
date_trunc('hour', v.ts) as hour,
count(*), avg(reading), sum(reading)
from t cross join lateral
(values (to_timestamp(timestamp))) v(ts)
group by grouping sets ( (year), (month), (week), (day), (hour) );
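As a quick cross-check of the hourly numbers in the question, here is a minimal Python sketch that truncates each Unix timestamp to the start of its hour with integer arithmetic and sums the readings per sensor and bucket. The function name and the `bucket_seconds` parameter are hypothetical. Fixed-width buckets like this only work for hours and UTC days; weeks and months need calendar-aware truncation such as date_trunc.

```python
from collections import defaultdict

def sum_per_bucket(rows, bucket_seconds=3600):
    """Sum readings per (sensorid, bucket start), with buckets aligned to the epoch.

    rows: iterable of (sensorid, unix_seconds, reading) tuples.
    """
    totals = defaultdict(float)
    for sensorid, unix_seconds, reading in rows:
        # Drop the remainder to get the start of the bucket.
        bucket_start = unix_seconds - unix_seconds % bucket_seconds
        totals[(sensorid, bucket_start)] += reading
    return dict(totals)

# Sample rows from the question: (sensorid, timestamp, reading).
sample = [
    (1, 1604192522, 10),
    (1, 1604192702, 9.65),
    (2, 1605783723, 8.1),
    (2, 1601514122, 9.6),
    (2, 1602292210, 10),
    (2, 1602291611, 12),
    (2, 1602291615, 10),
]
for (sensorid, hour_start), total in sorted(sum_per_bucket(sample).items()):
    print(sensorid, hour_start, round(total, 2))
```

The bucket starts it produces (1604192400, 1601514000, 1602291600 and 1605783600) are the same hour boundaries listed in the expected hourly output.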

Get a rolling count of timestamps in SQL

I have a table (in an Oracle DB) that looks something like what is shown below with about 4000 records. This is just an example of how the table is designed. The timestamps range for several years.
| Time | Action |
| 9/25/2019 4:24:32 PM | Yes |
| 9/25/2019 4:28:56 PM | No |
| 9/28/2019 7:48:16 PM | Yes |
| .... | .... |
I want to be able to get a count of timestamps that occur on a rolling 15 minute interval. My main goal is to identify the maximum number of timestamps that appear for any 15 minute interval. I would like this done by looking at each timestamp and getting a count of timestamps that appear within 15 minutes of that timestamp.
My goal would be to have something like
| Interval | Count |
| 9/25/2019 4:24:00 PM - 9/25/2019 4:39:00 PM | 2 |
| 9/25/2019 4:25:00 PM - 9/25/2019 4:40:00 PM | 2 |
| ..... | ..... |
| 9/25/2019 4:39:00 PM - 9/25/2019 4:54:00 PM | 0 |
I am not sure how I would be able to do this, if at all. Any ideas or advice would be much appreciated.
If you want any 15 minute interval in the data, then you can use:
select t.*,
count(*) over (order by timestamp
range between interval '15' minute preceding and current row
) as cnt_15
from t;
If you want the maximum, then use rank() on this:
select t.*
from (select t.*, rank() over (order by cnt_15 desc) as seqnum
      from (select t.*,
                   count(*) over (order by timestamp
                                  range between interval '15' minute preceding and current row
                                 ) as cnt_15
            from t
           ) t
     ) t
where seqnum = 1;
This doesn't produce exactly the results you specify in the question. But it does answer the question:
I want to be able to get a count of timestamps that occur on a rolling 15 minute interval. My main goal is to identify the maximum number of timestamps that appear for any 15 minute interval.
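For readers who want to see what the window frame counts, here is a minimal Python sketch of the same rolling count: for each timestamp, count the timestamps that fall in the 15 minutes up to and including it. It uses binary search on a sorted list; the function name is hypothetical, and rows with an identical timestamp are counted together, which is how I understand a RANGE frame ending at the current row to behave.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

def rolling_counts(times, window=timedelta(minutes=15)):
    """For each timestamp, count timestamps in [timestamp - window, timestamp]."""
    ordered = sorted(times)
    result = []
    for current in ordered:
        first = bisect_left(ordered, current - window)  # oldest row still in range
        last = bisect_right(ordered, current)           # include ties with current
        result.append((current, last - first))
    return result

# The three sample rows from the question.
times = [
    datetime(2019, 9, 25, 16, 24, 32),
    datetime(2019, 9, 25, 16, 28, 56),
    datetime(2019, 9, 28, 19, 48, 16),
]
counts = rolling_counts(times)
busiest = max(counts, key=lambda pair: pair[1])
print(busiest)  # (datetime.datetime(2019, 9, 25, 16, 28, 56), 2)
```

Taking the maximum of the counts gives the busiest 15-minute stretch, which is the role rank() plays in the second query.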
You could enumerate the minutes with a recursive query, then bring the table with a left join:
with cte (start_dt, max_dt) as (
select trunc(min(time), 'mi'), max(time) from mytable
union all
select start_dt + interval '1' minute, max_dt from cte where start_dt < max_dt
)
select
c.start_dt,
c.start_dt + interval '15' minute end_dt,
count(t.time) cnt
from cte c
left join mytable t
on t.time >= c.start_dt
and t.time < c.start_dt + interval '15' minute
group by c.start_dt

SQLite: Sum of differences between two dates group by every date

I have a SQLite database with start and stop datetimes
With the following SQL query I get the difference hours between start and stop:
SELECT starttime, stoptime, cast((strftime('%s',stoptime)-strftime('%s',starttime)) AS real)/60/60 AS diffHours FROM tracktime;
I need a SQL query, which delivers the sum of multiple timestamps, grouped by every day (also whole dates between timestamps).
The result should be something like this:
2018-08-01: 12 hours
2018-08-02: 24 hours
2018-08-03: 12 hours
2018-08-04: 0 hours
2018-08-05: 1 hours
2018-08-06: 14 hours
2018-08-07: 8 hours
You can try this: use a recursive CTE to build a calendar table with the start and end time for every date, then do the calculation.
Schema (SQLite v3.18)
CREATE TABLE tracktime(
id int,
starttime timestamp,
stoptime timestamp
);
insert into tracktime values
(11,'2018-08-01 12:00:00','2018-08-03 12:00:00');
insert into tracktime values
(12,'2018-09-05 18:00:00','2018-09-05 19:00:00');
Query #1
WITH RECURSIVE cte AS (
select id,starttime,date(starttime,'+1 day') totime,stoptime
from tracktime
UNION ALL
SELECT id,
date(starttime,'+1 day'),
date(totime,'+1 day'),
stoptime
FROM cte
WHERE date(starttime,'+1 day') < stoptime
)
SELECT strftime('%Y-%m-%d', starttime),(strftime('%s',CASE
WHEN totime > stoptime THEN stoptime
ELSE totime
END) -strftime('%s',starttime))/3600 diffHour
FROM cte;
| strftime('%Y-%m-%d', starttime) | diffHour |
| ------------------------------- | -------- |
| 2018-08-01 | 12 |
| 2018-09-05 | 1 |
| 2018-08-02 | 24 |
| 2018-08-03 | 12 |
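The same splitting idea can be sketched in a few lines of Python, which may help when checking the query's output: walk from the start time to the stop time, cutting at every midnight, and add the length of each piece to the day it falls on. The function name is hypothetical, and the timestamps are assumed to be naive datetimes without time zone or daylight-saving handling.

```python
from datetime import datetime, time, timedelta

def hours_per_day(intervals):
    """Split (start, stop) intervals at midnight and sum the hours per day.

    Returns a dict mapping 'YYYY-MM-DD' to a number of hours.
    """
    totals = {}
    for start, stop in intervals:
        cursor = start
        while cursor < stop:
            # The current piece ends at the next midnight or at stop, whichever is first.
            next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
            piece_end = min(next_midnight, stop)
            day = cursor.date().isoformat()
            hours = (piece_end - cursor).total_seconds() / 3600
            totals[day] = totals.get(day, 0.0) + hours
            cursor = piece_end
    return totals

# The two rows inserted into tracktime above.
tracktime = [
    (datetime(2018, 8, 1, 12, 0), datetime(2018, 8, 3, 12, 0)),
    (datetime(2018, 9, 5, 18, 0), datetime(2018, 9, 5, 19, 0)),
]
for day, hours in sorted(hours_per_day(tracktime).items()):
    print(day, hours)
```

For the two sample rows this gives 12, 24 and 12 hours for 2018-08-01 to 2018-08-03 and 1 hour for 2018-09-05, the same values as the query result.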