Group Timestamps into intervals of 5 minutes, take value that's closest to timestamp and always give out a value - sql

I'm new to SQL and would greatly appreciate help with a problem I'm facing. I have the following SQL script, which gives me the output shown in picture 1:
WITH speicher as(
select a.node as NODE_ID, d.name_0 as NODE_NAME, d.parent as PARENT_ID, c.time_stamp as ZEITSTEMPEL, c.value_num as WERT, b.DESCRIPTION_0 as Beschreibung, TO_CHAR(c.time_stamp, 'HH24:MI:SS') as Uhrzeit
from p_value_relations a, l_nodes d, p_values b, p_value_archive c
where a.node in (select sub_node from l_node_relations r where r.node in (
50028,
49989,
49848
))
and a.node = d.id
and (b."DESCRIPTION_0" like 'Name1' OR b."DESCRIPTION_0" like 'Name2')
and c.time_stamp between SYSDATE-30 AND SYSDATE-1
and a.value = b.id and b.id = c.value)
SELECT WERT as Value, NODE_NAME, ZEITSTEMPEL as Timestamp, Uhrzeit as Time, Beschreibung as Category
FROM speicher
I would like to create 5-minute time intervals and output one value per interval. For each interval timestamp, it should choose the value closest above that timestamp. If there is no value inside a given 5-minute interval, it should still output the last value it finds, since the value has not changed in that case. To see what I mean, please see the following picture. Any help would be greatly appreciated. This data is from an Oracle database.
Result so far (picture 1)
Result I would like (picture)

Since I do not understand your data and can't test with it, I present something I could test with. My data has a table which tracks when people log in to a system.
This is not intended as a complete answer, but as something to potentially point you in the right direction:
with time_range
as
(
select rownum, sysdate - (1/288)*rownum time_stamp
from dual
connect By Rownum <= 288*30
)
select time_stamp, min(LOGIN_TIME)
from time_range
left outer join WEB_SECURITY_LOGGED_IN on LOGIN_TIME >= time_stamp
group by time_stamp
order by 1;
Good luck...
Edit:
The WITH part of the query builds a time_stamp column with one row for every 5 minutes over the last 30 days (1/288 of a day is 5 minutes, and 288*30 rows cover 30 days). The main query then left joins to my login log table and, for each time_stamp, returns the earliest login at or after it.
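For readers who want to try the grid-plus-join idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden port, not the answer's code: the table `readings`, its columns, and the sample values are hypothetical, timestamps are plain epoch seconds, and a recursive CTE stands in for CONNECT BY. As in the query above, each grid point gets the earliest timestamp at or after it, and a grid point with nothing after it still appears (with NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample table; ts is in epoch seconds.
conn.execute("CREATE TABLE readings (ts INTEGER, val REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(130, 1.0), (290, 2.0), (910, 3.0)],
)

QUERY = """
WITH RECURSIVE time_range(time_stamp) AS (
    SELECT 0                                   -- first grid point
    UNION ALL
    SELECT time_stamp + 300                    -- step of 5 minutes
    FROM time_range
    WHERE time_stamp + 300 <= 1200
)
SELECT g.time_stamp, MIN(r.ts)                 -- earliest reading at or after the grid point
FROM time_range g
LEFT JOIN readings r ON r.ts >= g.time_stamp
GROUP BY g.time_stamp
ORDER BY g.time_stamp
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN is what keeps the grid point 1200 in the result even though no reading follows it.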

Related

POSTGRES DATE_TRUNC should return 0 for intervals that have no data

I am trying to do time-series-like reporting using the Postgres DATE_TRUNC function. It works fine and I get the expected output, but when a specific interval has no records, that interval is skipped. My expected output should include that interval with 0 as the count. Below is the query that I have right now. What should I change to get the intervals that have no data? Thanks in advance.
SELECT date_trunc('days', sent_at), count('*')
FROM (select * from invoice
WHERE supplier = 'ABC' and sent_at BETWEEN '2021-12-01' AND '2022-07-31') as inv
GROUP BY date_trunc('days', sent_at)
ORDER BY date_trunc('days', sent_at);
Expected: As you can see below, the current output shows 02/12 and then 07/12, skipping the dates in between. It should also show 03/12, 04/12, 05/12 with a count of 0.
Current output
It doesn't seem like you have those dates in your data, in which case you need to generate them. Also, casting your timestamp to date instead of using date_trunc() gets rid of the 00:00:00 time portion in the output.
SELECT dates::date, count(*) filter (where sent_at is not null)
FROM (
select *
from invoice a
right join generate_series( '2021-12-01'::date,
'2021-12-31'::date,
'1 day'::interval ) as b(dates)
on sent_at::date=b.dates) as inv
GROUP BY 1
ORDER BY 1;
Here's a working example. Also, please try to improve your question according to @nbk's comment.
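The same gap-filling pattern can be sketched in SQLite through Python's sqlite3 module. This is only an illustration under assumptions: the sample rows are made up, a recursive CTE plays the role of generate_series, and the date range is shortened to five days. One detail worth noting is that the supplier filter sits in the ON clause; putting it in WHERE would discard the generated days that have no invoices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (supplier TEXT, sent_at TEXT)")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?)",
    [  # made-up sample data
        ("ABC", "2021-12-01 09:15:00"),
        ("ABC", "2021-12-02 11:00:00"),
        ("ABC", "2021-12-02 17:30:00"),
        ("ABC", "2021-12-05 08:00:00"),
    ],
)

DAILY_COUNTS = """
WITH RECURSIVE days(d) AS (
    SELECT date('2021-12-01')
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < '2021-12-05'
)
SELECT days.d, COUNT(invoice.sent_at)      -- COUNT(column) ignores NULLs, so empty days give 0
FROM days
LEFT JOIN invoice
       ON date(invoice.sent_at) = days.d
      AND invoice.supplier = 'ABC'         -- filter in ON keeps the empty days
GROUP BY days.d
ORDER BY days.d
"""
daily = conn.execute(DAILY_COUNTS).fetchall()
for day, n in daily:
    print(day, n)
```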

sql query to get today new records compared with yesterday

I have this table:
COD (Integer) (PK)
ID (Varchar)
DATE (Date)
I just want to get the new IDs from today, compared with yesterday (the IDs from today that are not present yesterday).
This needs to be done with just one query and maximum efficiency, because the table will have 4-5 million records.
As a Java developer I am able to do this with 2 queries, but doing it with just one is beyond my knowledge, so any help would be much appreciated.
EDIT: the date format is dd/mm/yyyy, and each ID may appear 0 or 1 times per day.
Here is a solution that will go over the base data one time only. It selects the id and the date where the date is either yesterday or today (or both). Then it GROUPS BY id - each group will have either one or two rows. Then it filters by the condition that the MIN date in the group is "today". Those are the id's that exist today but did not exist yesterday.
DATE is an Oracle keyword, best not used as a column name. I changed that to DT. I also assume that your "dt" field is a pure date (as pure as it can be in Oracle, meaning: time of day, which is always present, is 00:00:00).
select id
from your_table
where dt in (trunc(sysdate), trunc(sysdate) - 1)
group by id
having min(dt) = trunc(sysdate)
;
Edit: Gordon makes a good point: perhaps you may have more than one such row per ID, in the same day? In that case the time-of-day may also be different from 00:00:00.
If so, the solution can be adapted:
select id
from your_table
where dt >= trunc(sysdate) - 1 and dt < trunc(sysdate) + 1
group by id
having min(dt) >= trunc(sysdate)
;
Either way: (1) the base table is read just once; (2) the column DT is not wrapped within any function, so if there is an index on that column, it can be used to access just the needed rows.
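To see the GROUP BY / HAVING MIN idea on concrete rows, here is a small sketch run through Python's sqlite3 module rather than Oracle. The sample rows are invented, dates are stored as ISO strings (which sort chronologically), and two fixed dates stand in for TRUNC(SYSDATE) and TRUNC(SYSDATE) - 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (cod INTEGER PRIMARY KEY, id TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO your_table (id, dt) VALUES (?, ?)",
    [
        ("A", "2024-03-09"),  # yesterday only
        ("B", "2024-03-09"),  # yesterday and today
        ("B", "2024-03-10"),
        ("C", "2024-03-10"),  # today only, so it is new
        ("D", "2024-03-01"),  # older rows fall outside the WHERE filter
        ("D", "2024-03-10"),  # not seen yesterday, so it is new
    ],
)

# Assumed fixed dates standing in for TRUNC(SYSDATE) and TRUNC(SYSDATE) - 1.
today, yesterday = "2024-03-10", "2024-03-09"

new_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM your_table
        WHERE dt IN (?, ?)
        GROUP BY id
        HAVING MIN(dt) = ?      -- earliest row in the two-day window is today
        ORDER BY id
        """,
        (today, yesterday, today),
    )
]
print(new_ids)
```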
The typical method would use not exists:
select t.*
from t
where t.date >= trunc(sysdate) and t.date < trunc(sysdate + 1) and
not exists (select 1
from t t2
where t2.id = t.id and
t2.date >= trunc(sysdate - 1) and t2.date < trunc(sysdate)
);
This is a general solution. If you know that there is at most one record per day, there are better solutions, such as using lag().
Use MINUS. I suppose your date column has a time part, so you need to truncate it.
select id from mytable where trunc(date) = trunc(sysdate)
minus
select id from mytable where trunc(date) = trunc(sysdate) - 1;
I suggest the following function-based index. Without it, the query would have to full scan the table, which would probably be quite slow.
create index idx on mytable( trunc(date) , id );

How to compare time stamps from consecutive rows

I have a table that I would like to sort by a timestamp desc and then compare all consecutive rows to determine the difference between each row. From there, I would like to find all the rows whose difference is greater than ~2hours.
I'm stuck on how to actually compare consecutive rows in a table. Any help would be much appreciated.
I'm using Oracle SQL Developer 3.2
You didn't show us your table definition, but something like this:
select *
from (
select t.*,
t.timestamp_column - lag(timestamp_column) over (order by timestamp_column) as diff
from the_table t
) x
where diff > interval '2' hour;
This assumes that timestamp_column is defined as timestamp, not date (otherwise the result of the difference wouldn't be an interval).
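The LAG pattern can be tried quickly with Python's sqlite3 module. This sketch is not Oracle: it assumes a SQLite build with window functions (3.25 or newer), stores timestamps as epoch seconds in a hypothetical `ts` column, and expresses the two-hour threshold in seconds instead of as an interval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; ts is in epoch seconds.
conn.execute("CREATE TABLE the_table (id INTEGER, ts INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [(1, 0), (2, 3600), (3, 14400), (4, 16200)],
)

GAPS = """
SELECT id, diff
FROM (
    SELECT id,
           ts - LAG(ts) OVER (ORDER BY ts) AS diff   -- seconds since the previous row
    FROM the_table
) x
WHERE diff > 2 * 3600                                -- more than two hours
"""
gaps = conn.execute(GAPS).fetchall()
print(gaps)
```

The first row has no predecessor, so its diff is NULL and the WHERE clause drops it.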

generate each minute string for a day within specified time limit

My aim is to generate per minute count of all records existing in a table like this.
SELECT
COUNT(*) as RECORD_COUNT,
to_Char(MY_DATE,'HH24:MI') MINUTE_GAP
FROM
TABLE_A
WHERE
BLAH='Blah! Blah!!'
GROUP BY
to_Char(MY_DATE,'HH24:MI')
However, this query doesn't give me the minutes where there were no results.
To get the desired result, I'm using the following query to fill the gaps in the original query by doing a JOIN between these two results.
SELECT
*
FROM
( SELECT
TO_CHAR(TRUNC(SYSDATE)+( (ROWNUM-1) /1440) ,'HH24:MI') as MINUTE_GAP,
0 as COUNT
FROM
SOME_LARGE_TABLE_B
WHERE
rownum<=1440
)
WHERE
minute_gap>'07:00' /*I want only the data starting from 7:00AM*/
This works for me, but:
1. I can't rely on SOME_LARGE_TABLE_B to generate the minutes, because it might have no records at some point in the future.
2. The query doesn't look like a professional solution.
Is there any easier way to do this?
NOTE:I don't want any new tables created with static values for all the minutes just for one query.
Just generate your timestamps and left join your grouped data to it:
SELECT MINUTE, ....
FROM (
SELECT TO_CHAR(TO_DATE((LEVEL + 419) * 60, 'SSSSS'), 'HH24:MI') MINUTE /* 07:00 - 23:59 */ FROM DUAL CONNECT BY LEVEL <= 1020)
LEFT JOIN (
<your grouped subquery>
) ON MINUTE = MINUTE_GAP
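The offset 419 and the limit 1020 in the generator may look arbitrary. A short Python check of the arithmetic (the helper name `minute_label` is made up for this illustration) shows that LEVEL 1 maps to 07:00 and LEVEL 1020 maps to 23:59:

```python
def minute_label(level: int) -> str:
    """Mirror TO_CHAR(TO_DATE((LEVEL + 419) * 60, 'SSSSS'), 'HH24:MI')."""
    seconds = (level + 419) * 60          # seconds past midnight
    hours, rest = divmod(seconds, 3600)
    return f"{hours:02d}:{rest // 60:02d}"

labels = [minute_label(level) for level in range(1, 1021)]
print(labels[0], labels[-1], len(labels))
```

There are 1440 minutes in a day and 420 of them fall before 07:00, which leaves the 1020 rows the query generates.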

Sqlite3: Need to Cartesian On date

I have a table which is a list of games that have been played in a sqlite3 database. The field "datetime" is the datetime of when the game ended. The field "duration" is the number of seconds the game lasted. I want to know what percent of the past 24 hours had at least 5 games running simultaneously. I figured out how to tell how many games were running at a given time:
select count(*)
from games
where strftime('%s',datetime)+0 >= 1257173442 and
strftime('%s',datetime)-duration <= 1257173442
If I had a table that was simply a list of every second (or every 30 seconds or something) I could do an intentional Cartesian product like this:
select count(*)
from (
select count(*) as concurrent, d.second
from games g, date d
where strftime('%s',datetime)+0 >= d.second and
strftime('%s',datetime)-duration <= d.second and
d.second >= strftime('%s','now') - 24*60*60 and
d.second <= strftime('%s','now')
group by d.second) x
where concurrent >=5
Is there a way to create this date table on the fly? Or that I can get a similar effect to this without having to actually create a new table that is simply a list of all the seconds this week?
Thanks
First, I can't think of a way to approach your problem by creating a table on the fly or without the aid of an extra table. Sorry.
My suggestion is for you to rely on a static Numbers table.
Create a fixed table with the format:
CREATE TABLE Numbers (
number INTEGER PRIMARY KEY
);
Populate it with the number of seconds in 24h (24*60*60 = 86400). I would use any scripting language to do that, running this insert statement 86400 times:
insert into numbers default values;
Now the Numbers table has the numbers 1 through 86400. Your query will then be modified to be:
select count(*)
from (
select count(*) as concurrent, strftime('%s','now') - 86401 + n.number second
from games g, numbers n
where strftime('%s',datetime)+0 >= strftime('%s','now') - 86401 + n.number and
strftime('%s',datetime)-duration <= strftime('%s','now') - 86401 + n.number
group by second) x
where concurrent >=5
Without a procedural language in the mix, that is the best you'll be able to do, I think.
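As a possible alternative to a stored Numbers table, newer SQLite versions (3.8.3 and later) support recursive common table expressions, which can produce the sequence on the fly. The sketch below uses Python's sqlite3 module only to run and check the statement; the CTE name `numbers` is chosen to match the table above and is otherwise arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

NUMBERS = """
WITH RECURSIVE numbers(number) AS (
    SELECT 1
    UNION ALL
    SELECT number + 1 FROM numbers WHERE number < 86400   -- 24*60*60 seconds
)
SELECT COUNT(*), MIN(number), MAX(number) FROM numbers
"""
count, lowest, highest = conn.execute(NUMBERS).fetchone()
print(count, lowest, highest)
```

In a real query the final SELECT would be replaced by the join against games, with `numbers` used in place of the static table.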
Great question!
Here's a query that I think gives you what you want without using a separate table. Note this is untested (so probably contains errors) and I've assumed datetime is an int column with # of seconds to avoid a ton of strftime's.
select sum(concurrent_period) from (
select min(end_table.datetime - begin_table.begin_time) as concurrent_period
from (
select g1.datetime, g1.num_end, count(*) as concurrent
from (
select datetime, count(*) as num_end
from games group by datetime
) g1, games g2
where g2.datetime >= g1.datetime and
g2.datetime-g2.duration < g1.datetime and
g1.datetime >= strftime('%s','now') - 24*60*60 and
g1.datetime <= strftime('%s','now')+0
group by g1.datetime, g1.num_end
) end_table, (
select g3.begin_time, g3.num_begin, count(*) as concurrent
from (
select datetime-duration as begin_time,
count(*) as num_begin
from games group by datetime-duration
) g3, games g4
where g4.datetime >= g3.begin_time and
g4.datetime-g4.duration < g3.begin_time and
g3.begin_time >= strftime('%s','now') - 24*60*60 and
g3.begin_time <= strftime('%s','now')+0
group by g3.begin_time, g3.num_begin
) begin_table
where end_table.datetime > begin_table.begin_time
and begin_table.concurrent < 5
and begin_table.concurrent+begin_table.num_begin >= 5
and end_table.concurrent >= 5
and end_table.concurrent-end_table.num_end < 5
group by begin_table.begin_time
) aah
The basic idea is to make two tables: one with the # of concurrent games at the begin time of each game, and one with the # of concurrent games at the end time. Then join the tables together and only take rows at "critical points" where # of concurrent games crosses 5. For each critical begin time, take the critical end time that happened soonest and that hopefully gives all the periods where at least 5 games were running concurrently.
Hope that's not too convoluted to be helpful!
Kevin rather beat me to the punchline there (+1), but I'll post this variation as it differs at least a little.
The key ideas are:
- Map the data into a stream of events with attributes time and 'polarity' (= start or end of game).
- Keep a running total of how many games are open at the time of each event (this is done by forming a self-join on the event stream).
- Find the event times where the number of games (as Kevin says) transitions up to 5, or down to 4.
- A little trick: add up all the down-to-4 times and take away the up-to-5s; the order is not important.
- The result is the number of seconds spent with 5 or more games open.
I don't have SQLite, so I've been testing with MySQL, and I've not bothered to limit the time window to preserve some sanity. Shouldn't be difficult to revise.
Also, and more importantly, I've not considered what to do if games are open at the beginning or end of the period!
Something tells me there's a big simplification to be had here, but I've not spotted it yet.
SELECT SUM( event_time )
FROM (
SELECT -ga.event_type * ga.event_time AS event_time,
SUM( ga.event_type * gb.event_type ) event_type
FROM
( SELECT UNIX_TIMESTAMP( g1.endtime - g1.duration ) AS event_time
, 1 event_type
FROM games g1
UNION
SELECT UNIX_TIMESTAMP( g1.endtime )
, -1
FROM games g1 ) AS ga,
( SELECT UNIX_TIMESTAMP( g1.endtime - g1.duration ) AS event_time
, 1 event_type
FROM games g1
UNION
SELECT UNIX_TIMESTAMP( g1.endtime )
, -1
FROM games g1 ) AS gb
WHERE
ga.event_time >= gb.event_time
GROUP BY ga.event_time
HAVING SUM( ga.event_type * gb.event_type ) IN ( -4, 5 )
) AS gr
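One possible simplification of the event-stream idea, sketched here as an assumption rather than as this answer's method, is to replace the self-join with window functions: a running SUM gives the number of open games after each event, and LEAD gives the time of the next event. The sketch runs on SQLite 3.25 or newer through Python's sqlite3 module, uses invented sample games with integer epoch seconds, ignores the 24-hour window just as the query above does, and uses a threshold of 2 instead of 5 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (endtime INTEGER, duration INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [(10, 10), (12, 8), (9, 4)],  # games run 0-10, 4-12 and 5-9
)

THRESHOLD = 2  # the question uses 5

BUSY_SECONDS = """
WITH events AS (
    SELECT endtime - duration AS event_time, 1 AS event_type FROM games
    UNION ALL
    SELECT endtime, -1 FROM games
),
deltas AS (                      -- net change in open games per distinct time
    SELECT event_time, SUM(event_type) AS delta
    FROM events
    GROUP BY event_time
),
running AS (
    SELECT event_time,
           SUM(delta) OVER (ORDER BY event_time) AS open_games,
           LEAD(event_time) OVER (ORDER BY event_time) AS next_time
    FROM deltas
)
SELECT COALESCE(SUM(next_time - event_time), 0)   -- length of each busy stretch
FROM running
WHERE open_games >= ?
"""
busy = conn.execute(BUSY_SECONDS, (THRESHOLD,)).fetchone()[0]
print(busy)
```

With the sample data, at least two games are open from second 4 to second 10, so the query reports 6 seconds.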
Why don't you trim the date and keep only the time? If you filter your data for any given date, every time is unique. This way you'll only need a table with numbers from 1 to 86400 (or fewer if you take bigger intervals). You may create two columns, "from" and "to", to define the intervals.
I'm not familiar with SQLite functions, but according to the manual you have to use the strftime function with the format '%H:%M:%S' (HH:MM:SS).