I'm having trouble writing a query that aggregates my results per second. Here I create an example table and make two inserts.
create table example (
start timestamp,
stop timestamp,
qty INTEGER
);
insert into example(start, stop, qty)
values ('2019-06-11 09:59:59', '2019-06-11 10:00:04', 14);
insert into example(start, stop, qty)
values ('2019-06-11 10:00:00', '2019-06-11 10:00:03', 12);
I need a query that would return something like this:

SECOND  SUM_QTY
------  -------
     1       14
     2       26
     3       26
     4       26
     5       14

or the same with the actual timestamps instead of second numbers.
Here 1, 2, 3, 4, 5 are the seconds covered by the first 2 inserts: 09:59:59 to 10:00:04 gives 5 seconds.
And 14, 26, 26, 26, 14 is the sum of qty for the rows active during the same second:
14 + 12 = 26, hence this number. This addition only occurs for the seconds during which both rows overlap.
Is such a query possible?
In Oracle SQL, you could do something like this:
WITH test_data AS (
    SELECT to_date('2019-06-11 09:59:59', 'yyyy-mm-dd hh24:mi:ss') AS start_time,
           to_date('2019-06-11 10:00:04', 'yyyy-mm-dd hh24:mi:ss') AS end_time,
           14 AS qty
    FROM dual
    UNION ALL
    SELECT to_date('2019-06-11 10:00:00', 'yyyy-mm-dd hh24:mi:ss'),
           to_date('2019-06-11 10:00:03', 'yyyy-mm-dd hh24:mi:ss'),
           12
    FROM dual
), seconds_between_first_last AS (
    SELECT MIN(t.start_time) AS first_start_time,
           MAX(t.end_time) AS last_end_time,
           /* Get the number of seconds between the first start time and the last end time */
           (MAX(t.end_time) - MIN(t.start_time)) * (24*60*60) AS seconds_elapsed
    FROM test_data t
), second_rows AS (
    SELECT LEVEL AS seconds_since_start,
           d.first_start_time + ((LEVEL - 1) / (24*60*60)) AS target_time
    FROM seconds_between_first_last d
    CONNECT BY LEVEL <= d.seconds_elapsed /* Get one row for each second in the interval */
)
SELECT r.seconds_since_start,
       COALESCE(SUM(d.qty), 0) AS total_qty_in_interval
FROM second_rows r
LEFT JOIN test_data d
       ON d.start_time <= r.target_time
      AND d.end_time > r.target_time
GROUP BY r.seconds_since_start
ORDER BY r.seconds_since_start
You can get the boundaries easily enough:
with ss as (
      select start as ts, qty    -- +qty where an interval opens
      from t
      union all
      select stop, -qty          -- -qty where it closes
      from t
     )
select ts, sum(qty) as net_qty,
       sum(sum(qty)) over (order by ts) as running_qty
from ss
group by ts;
This has all the timestamps when something starts or stops. It does not "fill in" values. The best way to do that depends on the database.
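For example, in Oracle you could fill in the gaps with a recursive CTE and a left join. A minimal sketch against the question's table, with the start/stop columns renamed to start_ts/stop_ts (START is a reserved word in Oracle):

with bounds (ts, max_ts) as (
  select min(start_ts), max(stop_ts) from example
  union all
  select ts + interval '1' second, max_ts   -- one row per second
  from bounds
  where ts + interval '1' second < max_ts
)
select b.ts,
       coalesce(sum(e.qty), 0) as total_qty -- 0 for seconds nothing covers
from bounds b
left join example e
  on e.start_ts <= b.ts
 and e.stop_ts > b.ts                       -- half-open interval [start, stop)
group by b.ts
order by b.ts;

For the two sample rows this yields the 14, 26, 26, 26, 14 sequence from the question.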
In Access we have to use a workaround; see the example below.
SELECT TimeStamp,
       (SELECT SUM(Value) FROM Table1 WHERE Table1.TimeStamp <= T1.TimeStamp) AS Total
FROM Table1 AS T1;
I have an Oracle table with timestamps, and I need to group rows: as long as the current row's timestamp is no more than a minute after the previous row's, it belongs to the same group, and I need to report each group's start and end time; if the gap is more than a minute, a new group starts, as in the example below. (The table is ordered by time ascending.)
I have the table:

ID  TIME (TIMESTAMP)
--  ----------------
 1  11:33:03
 1  11:34:01
 1  11:34:40
 1  11:35:59
 1  11:38:00
 1  11:38:50
I need to pull:

GROUP NUMBER  START TIME  END TIME
------------  ----------  --------
           1  11:33:03    11:34:40
           2  11:35:59    11:35:59
           3  11:38:00    11:38:50
You can use:
SELECT id,
       grp,
       MIN(time) AS start_time,
       MAX(time) AS end_time
FROM (
    SELECT id,
           time,
           SUM(grp_change) OVER (PARTITION BY id ORDER BY time) AS grp
    FROM (
        SELECT t.*,
               CASE
                 WHEN time - LAG(time) OVER (PARTITION BY id ORDER BY time)
                      <= INTERVAL '1' MINUTE
                 THEN 0
                 ELSE 1
               END AS grp_change
        FROM table_name t
    )
)
GROUP BY id, grp;
Which, for the sample data:
CREATE TABLE table_name (ID, TIME) AS
SELECT 1, TIMESTAMP '2022-06-14 11:33:03' FROM DUAL UNION ALL
SELECT 1, TIMESTAMP '2022-06-14 11:34:01' FROM DUAL UNION ALL
SELECT 1, TIMESTAMP '2022-06-14 11:34:40' FROM DUAL UNION ALL
SELECT 1, TIMESTAMP '2022-06-14 11:35:59' FROM DUAL UNION ALL
SELECT 1, TIMESTAMP '2022-06-14 11:38:00' FROM DUAL UNION ALL
SELECT 1, TIMESTAMP '2022-06-14 11:38:50' FROM DUAL;
Outputs:
ID  GRP  START_TIME                     END_TIME
--  ---  -----------------------------  -----------------------------
 1    2  2022-06-14 11:35:59.000000000  2022-06-14 11:35:59.000000000
 1    3  2022-06-14 11:38:00.000000000  2022-06-14 11:38:50.000000000
 1    1  2022-06-14 11:33:03.000000000  2022-06-14 11:34:40.000000000
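For what it's worth, on Oracle 12c and later the same grouping can also be expressed with MATCH_RECOGNIZE. A sketch against the same sample table:

SELECT id, grp, start_time, end_time
FROM table_name
MATCH_RECOGNIZE (
  PARTITION BY id
  ORDER BY time
  MEASURES MATCH_NUMBER() AS grp,      -- group number
           FIRST(time)    AS start_time,
           LAST(time)     AS end_time
  ONE ROW PER MATCH
  PATTERN ( strt within_minute* )      -- a run of rows each <= 1 minute apart
  DEFINE within_minute AS time - PREV(time) <= INTERVAL '1' MINUTE
);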
Ex:

DATE  IP
----  ---------
1/2   1.1.127.0
1/3   1.1.127.1
1/3   1.1.127.0
1/4   1.1.127.3
1/4   1.1.127.5
1/5   1.1.127.3

Output:

DATE  COUNT
----  -----
1/2   1
1/3   1
1/4   2
1/5   0

I want the number of new, unique IPs that logged in on each day.
You want to count how many IPs exist for a date that have not occurred on any previous date. You can use analytic functions for this.
The number of new IPs is the total number of distinct IPs seen up to a date minus the total seen up to the previous date. To get this, first select the running count per row, then aggregate per date to get the distinct number of IPs seen through each date, and finally use LAG to take the difference per day.
select
  date,
  max(cnt) - lag(max(cnt), 1, 0) over (order by date) as new_ips
from
(
  -- Oracle doesn't allow COUNT(DISTINCT ...) with an ORDER BY in the OVER clause,
  -- so count each IP only at its first occurrence to get a running distinct count.
  select date,
         count(case when rn = 1 then 1 end) over (order by date) as cnt
  from
  (
    select date, ip,
           row_number() over (partition by ip order by date) as rn
    from mytable
  )
) running_counts
group by date
order by date;
The same without analytic functions, which is probably more readable:
select date, count(distinct ip) as cnt
from mytable
where not exists
(
select null
from mytable before
where before.date < mytable.date
and before.ip = mytable.ip
)
group by date
order by date;
The DISTINCT in this latter query is not necessary if there can be no duplicates (two rows with the same date and IP) in the table. Note also that a date on which every IP was already seen (like 1/5 above) returns no row at all here, rather than a zero count.
You can also use the solution below, based on a left join.
with t (dt, ip) as (
select to_date( '1/2', 'MM/DD' ), '1.1.127.0' from dual union all
select to_date( '1/3', 'MM/DD' ), '1.1.127.1' from dual union all
select to_date( '1/3', 'MM/DD' ), '1.1.127.0' from dual union all
select to_date( '1/4', 'MM/DD' ), '1.1.127.3' from dual union all
select to_date( '1/4', 'MM/DD' ), '1.1.127.5' from dual union all
select to_date( '1/5', 'MM/DD' ), '1.1.127.3' from dual
)
select t.DT, count( decode(t2.IP, null, 1, null) ) cnt
from t
left join t t2
on ( t2.DT < t.DT and t2.IP = t.IP )
group by t.DT
order by 1
;
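The DECODE(t2.IP, null, 1, null) yields 1 only when t2.IP is null (i.e. no earlier date had that IP) and null otherwise, so the COUNT counts exactly the new IPs. An equivalent, perhaps more readable, spelling of the same count:

select t.DT, count(case when t2.IP is null then 1 end) cnt
from t
left join t t2
  on ( t2.DT < t.DT and t2.IP = t.IP )
group by t.DT
order by 1;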
I have a query like the following.
select some_date_col, count(*) as cnt
from <the table>
group by some_date_col
I get something like this as output:
13-12-2021, 6
13-12-2021, 8
13-12-2021, 9
....
How is that possible? Here some_date_col is of type Date.
A DATE is a binary data-type that is composed of 7 bytes (century, year-of-century, month, day, hour, minute and second) and will always have those components.
The user interface you use to access the database can choose to display some or all of those components of the binary representation of the DATE; however, regardless of whether or not they are displayed by the UI, all the components are always stored in the database and used in comparisons in queries.
When you GROUP BY a date data-type you aggregate values that are identical down to an accuracy of one second (regardless of the accuracy displayed by the user interface).
So, if you have the data:
CREATE TABLE the_table (some_date_col) AS
SELECT DATE '2021-12-13' FROM DUAL CONNECT BY LEVEL <= 6 UNION ALL
SELECT DATE '2021-12-13' + INTERVAL '1' SECOND FROM DUAL CONNECT BY LEVEL <= 8 UNION ALL
SELECT DATE '2021-12-13' + INTERVAL '1' MINUTE FROM DUAL CONNECT BY LEVEL <= 9;
Then the query:
SELECT TO_CHAR(some_date_col, 'YYYY-MM-DD HH24:MI:SS') AS some_date_col,
count(*) as cnt
FROM the_table
GROUP BY some_date_col;
Will output:
SOME_DATE_COL        CNT
-------------------  ---
2021-12-13 00:01:00    9
2021-12-13 00:00:01    8
2021-12-13 00:00:00    6
The values are grouped according to equal values (down to the maximum precision stored in the date).
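You can see what is actually stored with the DUMP function; an illustrative query against the same sample table:

-- DUMP exposes the internal representation of a stored DATE:
-- type 12, length 7, one byte each for century, year-of-century,
-- month, day, hour, minute and second.
SELECT some_date_col,
       DUMP(some_date_col) AS internal_bytes
FROM the_table
WHERE ROWNUM = 1;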
If you want to GROUP BY dates with the same date component but any time component, then use the TRUNC function (which returns a value with the same date component but the time component set to midnight):
SELECT TRUNC(some_date_col) AS some_date_col,
       count(*) as cnt
FROM   the_table
GROUP BY TRUNC(some_date_col)
Which, for the same data outputs:
SOME_DATE_COL  CNT
-------------  ---
13-DEC-21       23
And:
SELECT TO_CHAR(TRUNC(some_date_col), 'YYYY-MM-DD HH24:MI:SS') AS some_date_col,
count(*) as cnt
FROM the_table
GROUP BY TRUNC(some_date_col)
Outputs:
SOME_DATE_COL        CNT
-------------------  ---
2021-12-13 00:00:00   23
Oracle's DATE type holds both a date and a time component. If the time components do not match, grouping by that value will place the same date (with different times) into different groups:
CREATE TABLE test ( xdate date );
INSERT INTO test VALUES (current_date);
INSERT INTO test VALUES (current_date + INTERVAL '1' MINUTE);
With the default display format:
SELECT xdate, COUNT(*) FROM test GROUP BY xdate;
Result:
XDATE      COUNT(*)
---------  --------
13-DEC-21         1
13-DEC-21         1
Now alter the format and rerun:
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MON-DD HH24:MI:SS';
SELECT xdate, COUNT(*) FROM test GROUP BY xdate;
Result:

XDATE                 COUNT(*)
--------------------  --------
2021-DEC-13 23:29:36         1
2021-DEC-13 23:30:36         1
Also try this:
SELECT to_char(xdate, 'YYYY-MON-DD HH24:MI:SS') AS formatted FROM test;
Result:

FORMATTED
--------------------
2021-DEC-13 23:29:36
2021-DEC-13 23:30:36
and this:
SELECT to_char(xdate, 'YYYY-MON-DD HH24:MI:SS') AS formatted, COUNT(*) FROM test GROUP BY xdate;
Result:

FORMATTED             COUNT(*)
--------------------  --------
2021-DEC-13 23:29:36         1
2021-DEC-13 23:30:36         1
I have this statement, which returns values for the dates that exist in the table; the CTE then just fills in the half-hourly intervals.
with cte (reading_date) as (
select date '2020-11-17' from dual
union all
select reading_date + interval '30' minute
from cte
where reading_date + interval '30' minute < date '2020-11-19'
)
select c.reading_date, d.reading_value
from cte c
left join dcm_reading d on d.reading_date = c.reading_date
order by c.reading_date
However, later on I needed to use a SELECT within a SELECT, like this:
SELECT serial_number,
register,
reading_date,
reading_value,
ABS(A_plus)
FROM
(
SELECT
serial_number,
register,
TO_DATE(reading_date, 'DD-MON-YYYY HH24:MI:SS') AS reading_date,
reading_value,
LAG(reading_value,1, 0) OVER(ORDER BY reading_date) AS previous_read,
LAG(reading_value, 1, 0) OVER (ORDER BY reading_date) - reading_value AS A_plus,
reading_id
FROM DCM_READING
WHERE device_id = 'KXTE4501'
AND device_type = 'E'
AND serial_number = 'A171804699'
AND reading_date BETWEEN TO_DATE('17-NOV-2019' || ' 000000', 'DD-MON-YYYY HH24MISS') AND TO_DATE('19-NOV-2019' || ' 235959', 'DD-MON-YYYY HH24MISS')
ORDER BY reading_date)
ORDER BY serial_number, reading_date;
For extra information:
I am selecting data from a table that exists and using the LAG function to work out the difference in reading_value from the previous record. However, later on I needed to insert dummy data where there are missing half-hour reads. The CTE brings back a list of all half-hour intervals between the two dates I am querying on.
Ultimately I want a result that has all the reading_dates at half-hour intervals, the reading_value (if there is one), and then the difference between the reading_values that do exist. For the half-hourly reads that have no data in DCM_READING I just want NULL.
Is it possible to use a CTE with multiple SELECTs?
Not sure what you would like to achieve, but you can have multiple CTEs and even reference one from another:
with
cte_1 as
(
select username
from dba_users
where oracle_maintained = 'N'
),
cte_2 as
(
select owner, round(sum(bytes)/1024/1024) as megabytes
from dba_segments
group by owner
),
cte_3 as
(
select username, megabytes
from cte_1
join cte_2 on cte_1.username = cte_2.owner
)
select *
from cte_3
order by username;
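Applied to the question, the recursive calendar CTE and the LAG query can simply sit in the same WITH list. A sketch reusing the question's table and column names (filters abbreviated):

with calendar (reading_date) as (
  -- every half hour between the two dates
  select date '2020-11-17' from dual
  union all
  select reading_date + interval '30' minute
  from calendar
  where reading_date + interval '30' minute < date '2020-11-19'
),
reads as (
  select reading_date,
         reading_value,
         lag(reading_value, 1, 0) over (order by reading_date) - reading_value as a_plus
  from dcm_reading
  where device_id = 'KXTE4501'
    and serial_number = 'A171804699'
)
select c.reading_date,
       r.reading_value,        -- NULL where no read exists
       abs(r.a_plus) as a_plus
from calendar c
left join reads r on r.reading_date = c.reading_date
order by c.reading_date;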
I have an events table with two columns eventkey (unique, primary-key) and createtime, which stores the creation time of the event as the number of milliseconds since Jan 1 1970 in a NUMBER column.
I would like to create a "histogram" or frequency distribution that shows me how many events were created in each hour of the past week.
Is this the best way to write such a query in Oracle, using the width_bucket() function? Or is it possible to derive the number of rows that fall into each bucket using one of Oracle's other analytic functions, rather than using width_bucket to determine which bucket each row belongs to and doing a count(*) over that?
-- 1305504000000 = 5/16/2011 12:00am GMT
-- 1306108800000 = 5/23/2011 12:00am GMT
select
  timestamp '1970-01-01 00:00:00'
    + numtodsinterval(1305504000000/1000 + (bucket * 60 * 60), 'second') period_start,
  numevents
from (
  select bucket, count(*) as numevents
  from (
    select eventkey, createtime,
           width_bucket(createtime, 1305504000000, 1306108800000, 24 * 7) bucket
    from events
    where createtime between 1305504000000 and 1306108800000
  )
  group by bucket
)
order by period_start
If your createtime were a date column, this would be trivial:
SELECT TO_CHAR(createtime, 'DAY:HH24'), COUNT(*)
FROM EVENTS
GROUP BY TO_CHAR(createtime, 'DAY:HH24');
As it is, casting the createtime column isn't too hard:
select TO_CHAR(
         TO_DATE('19700101', 'YYYYMMDD') + createtime / 86400000,
         'DAY:HH24') AS BUCKET, COUNT(*)
FROM EVENTS
WHERE createtime between 1305504000000 and 1306108800000
group by TO_CHAR(
         TO_DATE('19700101', 'YYYYMMDD') + createtime / 86400000,
         'DAY:HH24')
order by 1
If, alternatively, you're looking for the fencepost values (for example, where you go from the first decile (0-10%) to the next (11-20%)), you'd do something like:
select distinct   -- collapse the per-row analytic results to one row per decile
       min(createtime) over (partition by decile) as decile_start,
       max(createtime) over (partition by decile) as decile_end,
       decile
from (select createtime,
             ntile(10) over (order by createtime asc) as decile
      from events
      where createtime between 1305504000000 and 1306108800000
     )
I'm unfamiliar with Oracle's date functions, but I'm pretty certain there's an equivalent way of writing this Postgres statement:
select date_trunc('hour', stamp), count(*)
from your_data
group by date_trunc('hour', stamp)
order by date_trunc('hour', stamp)
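For reference, the Oracle counterpart of date_trunc('hour', ...) is TRUNC with the 'HH' format element (same hypothetical your_data/stamp names):

select trunc(stamp, 'HH'), count(*)
from your_data
group by trunc(stamp, 'HH')
order by trunc(stamp, 'HH')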
Pretty much the same response as Adam's, but I would prefer to keep period_start as a time field so it is easier to filter further if needed:
with
events as
(
select rownum eventkey, round(dbms_random.value(1305504000000, 1306108800000)) createtime
from dual
connect by level <= 1000
)
select
trunc(timestamp '1970-01-01 00:00:00' + numtodsinterval(createtime/1000, 'second'), 'HH') period_start,
count(*) numevents
from
events
where
createtime between 1305504000000 and 1306108800000
group by
trunc(timestamp '1970-01-01 00:00:00' + numtodsinterval(createtime/1000, 'second'), 'HH')
order by
period_start
Use the Oracle-provided function WIDTH_BUCKET to bucket continuous or finely discrete data. The following example shows one way to create a histogram with 5 buckets, gathering COLUMN_VALUE from 510 to 520 (so each bucket covers a range of 2). WIDTH_BUCKET also creates extra buckets with id = 0 and id = num_buckets + 1 for values below the minimum and above the maximum.
SELECT "BUCKET_ID", count(*),
CASE
WHEN "BUCKET_ID"=0 THEN -1/0F
ELSE 510+(520-510)/5*("BUCKET_ID"-1)
END "BUCKET_MIN",
CASE
WHEN "BUCKET_ID"=5+1 THEN 1/0F
ELSE 510+(520-510)/5*("BUCKET_ID")
END "BUCKET_MAX"
FROM
(
SELECT "COLUMN_VALUE",
WIDTH_BUCKET("COLUMN_VALUE", 510, 520, 5) "BUCKET_ID"
FROM "MY_TABLE"
)
group by "BUCKET_ID"
ORDER BY "BUCKET_ID";
Sample output
BUCKET_ID COUNT(*) BUCKET_MIN BUCKET_MAX
---------- ---------- ---------- ----------
0 45 -Inf 5.1E+002
1 220 5.1E+002 5.12E+002
2 189 5.12E+002 5.14E+002
3 43 5.14E+002 5.16E+002
4 3 5.16E+002 5.18E+002
In my table there are no values between 518 and 520, so the bucket with id = 5 is not shown. On the other hand, there are values below the minimum (510), so a bucket with id = 0 appears, gathering the values from -Inf up to 510.
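If you want the empty buckets (like id = 5 here) to appear with a zero count, one sketch is to generate the full list of bucket ids and left join to it, assuming the same MY_TABLE:

SELECT b.bucket_id,
       COUNT(t.bucket_id) AS cnt   -- counts matched rows only, so empty buckets get 0
FROM (
  SELECT LEVEL - 1 AS bucket_id    -- ids 0 .. 6 (5 buckets plus under/overflow)
  FROM dual
  CONNECT BY LEVEL <= 5 + 2
) b
LEFT JOIN (
  SELECT WIDTH_BUCKET("COLUMN_VALUE", 510, 520, 5) AS bucket_id
  FROM "MY_TABLE"
) t ON t.bucket_id = b.bucket_id
GROUP BY b.bucket_id
ORDER BY b.bucket_id;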