I'm working with a table in SAP Advantage with separate date and time fields. I want to find records with a date and time within the last 5 minutes.
This works 99% of the time:
SELECT
*
FROM
table_name
WHERE
TIMESTAMPDIFF(SQL_TSI_DAY, date_field, CURRENT_TIMESTAMP()) < 1
AND
TIMESTAMPDIFF(SQL_TSI_MINUTE, time_field, CURRENT_TIMESTAMP()) < 5
However, this won't work around midnight. For instance, at 12:00 AM, any records created at 11:57 PM the previous day won't match the filter.
Any ideas on how to do this? Thanks!
Sample data. Based on this data, at 7/12/19 12:01 AM, I'd like to return the last 2 rows.
Item  EmpNo  LastName  FirstName  date_field  time_field
---------------------------------------------------------
1     2      Nelson    Roberto    7/11/2019   21:00:00
2     4      Young     Bruce      7/11/2019   22:00:00
3     5      Lambert   Kim        7/11/2019   23:00:00
4     8      Johnson   Leslie     7/11/2019   23:56:00
5     9      Forest    Phil       7/12/2019   00:00:00
The easiest way is to recombine the fields and then use TIMESTAMPDIFF():
TRY DROP TABLE #test; CATCH ALL END TRY;
CREATE TABLE #test
(
date_field DATE
, time_field TIME
);
INSERT INTO #test
SELECT '2019-07-11', '21:00:00' FROM system.iota
UNION SELECT '2019-07-11', '22:00:00' FROM system.iota
UNION SELECT '2019-07-11', '23:00:00' FROM system.iota
UNION SELECT '2019-07-11', '23:56:00' FROM system.iota
UNION SELECT '2019-07-12', '00:00:00' FROM system.iota
;
SELECT
TIMESTAMPDIFF(SQL_TSI_MINUTE,
CREATETIMESTAMP(
YEAR(date_field)
, MONTH(date_field)
, DAY(date_field)
, HOUR(time_field)
, MINUTE(time_field)
, SECOND(time_field)
, 0
)
, DATETIME'2019-07-12T00:00:00' -- CURRENT_TIMESTAMP()
)
FROM #test;
Which gives the expected result of:
180
120
60
4
0
It would be even simpler if ADS supported an operator or a function to directly combine a date and a time, but I can't find one in the documentation.
So if you integrate that into your original SQL code, it would be:
SELECT
*
FROM
table_name
WHERE
TIMESTAMPDIFF(SQL_TSI_MINUTE,
CREATETIMESTAMP(
YEAR(date_field)
, MONTH(date_field)
, DAY(date_field)
, HOUR(time_field)
, MINUTE(time_field)
, SECOND(time_field)
, 0
)
, CURRENT_TIMESTAMP()
) < 5
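As a side note, in engines with standard interval arithmetic the same midnight-safe filter can be written without recombining anything per row, by comparing the (date, time) pair against a single cutoff. Here is a rough, standard-SQL-flavored sketch (not ADS-specific; the INTERVAL and CAST syntax varies by engine, so treat it as an illustration of the idea rather than working ADS code):
SELECT
*
FROM
table_name
WHERE
date_field > CAST(CURRENT_TIMESTAMP - INTERVAL '5' MINUTE AS DATE)
OR (date_field = CAST(CURRENT_TIMESTAMP - INTERVAL '5' MINUTE AS DATE)
    AND time_field >= CAST(CURRENT_TIMESTAMP - INTERVAL '5' MINUTE AS TIME))
A nice property of this form is that date_field and time_field appear bare on the left-hand side of each comparison, so existing indexes on them can still be used.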
Related
I am facing a problem getting the cumulative distinct count of resource IDs as of different modified dates in Vertica. As you can see in the table below, I have a resource ID, a modified date, and a deleted date, and I want to calculate the count of distinct active resources as of each unique modified date. A resource is considered active when its deleted date is NULL as of/before that modified date.
I was able to get the count when, for a particular resource (let's say resource ID 1), the active rows (deleted date NULL) and inactive rows (deleted date not NULL) don't occur consecutively.
But when they occur consecutively, I have to count the resource as 1 until it becomes inactive, then count it as 0 from the row where it becomes inactive and through all consecutive inactive rows until it becomes active again. Likewise for all the distinct resource IDs, and then I need the cumulative sum of those.
sa_resource_id  modified_date            deleted_Date
------------------------------------------------------------------
1               2022-01-22 15:46:06.758
2               2022-01-22 15:46:06.758
16              2022-04-22 15:46:06.758
17              2022-04-22 15:46:06.758
18              2022-04-22 15:46:06.758
16              2022-04-29 15:46:06.758  2022-04-29 15:46:06.758
17              2022-04-29 15:46:06.758  2022-04-29 15:46:06.758
1               2022-05-22 15:46:06.758  2022-05-22 15:46:06.758
2               2022-05-22 15:46:06.758  2022-05-22 15:46:06.758
1               2022-05-23 22:16:06.758
1               2022-05-24 22:16:06.758  2022-05-24 22:16:06.758
1               2022-05-25 22:16:06.758
1               2022-05-27 22:16:06.758
This is the partition-and-sum query I have tried, where I partition the table by resource ID and sum over the different modified dates.
SELECT md,
dca_agent_count
FROM
(
SELECT modified_date AS md,
SUM(SUM(CASE WHEN deleted_Date IS NULL THEN 1
WHEN deleted_Date IS NOT NULL THEN -1 ELSE 0
END)) OVER (ORDER BY modified_date) AS dca_agent_count
FROM
(
SELECT sa_resource_id,
modified_date,
deleted_Date,
ROW_NUMBER() OVER (
PARTITION BY sa_Resource_id, deleted_Date
ORDER BY modified_date desc
) row_num
FROM mf_Shared_provider_Default.dca_entity_resource_raw
WHERE sa_ResourcE_id IS NOT NULL
AND sa_resource_id IN ('1','2','34','16','17','18')
) t
GROUP BY modified_date
ORDER BY modified_Date
) b
Current Output:
md                       dca_agent_count
-----------------------------------------
2022-01-22 15:46:06.758  2
2022-04-22 15:46:06.758  5
2022-04-29 15:46:06.758  3
2022-05-22 15:46:06.758  1
2022-05-23 22:16:06.758  2
2022-05-24 22:16:06.758  1
2022-05-25 22:16:06.758  2
2022-05-27 22:16:06.758  3
If you look at the output above, all the values are correct except the last row, 2022-05-27, where I need to get count 2 instead of 3.
How do I get the cumulative distinct count of sa_resource_ids as of the modified dates, based on the deleted-date condition (NULL/not NULL), such that the count does not change when deleted dates (NULL/not NULL) occur consecutively?
To me, a DATE has no hours, minutes, or seconds, let alone fractional seconds, so I renamed the time-containing attributes to %_ts, as they are TIMESTAMPs.
I had to start completely from scratch to solve it.
I think this is the first problem I've had to solve with as many as 5 common table expressions:
1. Add a Boolean is_active that is never NULL.
2. Add the previous row's is_active using LAG(). A NULL here means there is no predecessor for the same resource ID.
3. Remove the rows whose previous is_active is equal to the current is_active.
4. UNION ALL the positive COUNT DISTINCTs of the active rows and the negative COUNT DISTINCTs of the inactive rows. This also removes the last timestamp.
5. Get the distinct timestamps from the original input for the final query.
The final query takes CTE 5 and LEFT JOINs it with CTE 4, making a running sum of the obtained distinct counts.
Here goes:
WITH
-- not part of the final query: this is your input data
indata(sa_resource_id,modified_ts,deleted_ts) AS (
SELECT 1,TIMESTAMP '2022-01-22 15:46:06.758',NULL
UNION ALL SELECT 2,TIMESTAMP '2022-01-22 15:46:06.758',NULL
UNION ALL SELECT 16,TIMESTAMP '2022-04-22 15:46:06.758',NULL
UNION ALL SELECT 17,TIMESTAMP '2022-04-22 15:46:06.758',NULL
UNION ALL SELECT 18,TIMESTAMP '2022-04-22 15:46:06.758',NULL
UNION ALL SELECT 16,TIMESTAMP '2022-04-29 15:46:06.758',TIMESTAMP '2022-04-29 15:46:06.758'
UNION ALL SELECT 17,TIMESTAMP '2022-04-29 15:46:06.758',TIMESTAMP '2022-04-29 15:46:06.758'
UNION ALL SELECT 1,TIMESTAMP '2022-05-22 15:46:06.758',TIMESTAMP '2022-05-22 15:46:06.758'
UNION ALL SELECT 2,TIMESTAMP '2022-05-22 15:46:06.758',TIMESTAMP '2022-05-22 15:46:06.758'
UNION ALL SELECT 1,TIMESTAMP '2022-05-23 22:16:06.758',NULL
UNION ALL SELECT 1,TIMESTAMP '2022-05-24 22:16:06.758',TIMESTAMP '2022-05-24 22:16:06.758'
UNION ALL SELECT 1,TIMESTAMP '2022-05-25 22:16:06.758',NULL
UNION ALL SELECT 1,TIMESTAMP '2022-05-27 22:16:06.758',NULL
)
-- real query starts here, replace the following comma with "WITH" ...
,
-- need a "active flag" that is never null
w_active_flag AS (
SELECT
*
, (deleted_ts IS NULL) AS is_active
FROM indata
)
,
-- need current and previous is_active to filter ..
w_prev_flag AS (
SELECT
*
, LAG(is_active) OVER w AS prev_flag
FROM w_active_flag
WINDOW w AS(PARTITION BY sa_resource_id ORDER BY modified_ts)
)
,
-- use obtained filter arguments to filter out two consecutive
-- active or non-active rows for same sa_resource_id
-- this can remove timestamps from the final result
de_duped AS (
SELECT
sa_resource_id
, modified_ts
, is_active
FROM w_prev_flag
WHERE prev_flag IS NULL OR prev_flag <> is_active
)
-- get count distinct "sa_resource_id" only now
,
grp AS (
SELECT
modified_ts
, COUNT(DISTINCT sa_resource_id) AS dca_agent_count
FROM de_duped
WHERE is_active
GROUP BY modified_ts
UNION ALL
SELECT
modified_ts
, COUNT(DISTINCT sa_resource_id) * -1 AS dca_agent_count
FROM de_duped
WHERE NOT is_active
GROUP BY modified_ts
)
,
-- get back all input timestamps in a help table
tslist AS (
SELECT DISTINCT
modified_ts
FROM indata
)
SELECT
tslist.modified_ts
, SUM(NVL(dca_agent_count,0)) OVER w AS dca_agent_count
FROM tslist LEFT JOIN grp USING(modified_ts)
WINDOW w AS (ORDER BY tslist.modified_ts);
-- out modified_ts | dca_agent_count
-- out -------------------------+-----------------
-- out 2022-01-22 15:46:06.758 | 2
-- out 2022-04-22 15:46:06.758 | 5
-- out 2022-04-29 15:46:06.758 | 3
-- out 2022-05-22 15:46:06.758 | 1
-- out 2022-05-23 22:16:06.758 | 2
-- out 2022-05-24 22:16:06.758 | 1
-- out 2022-05-25 22:16:06.758 | 2
-- out 2022-05-27 22:16:06.758 | 2
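If you want to see why 2022-05-27 no longer adds anything, swap the final SELECT for a quick debugging query against the de_duped CTE (keeping the CTEs above unchanged):
SELECT * FROM de_duped WHERE sa_resource_id = 1 ORDER BY modified_ts;
For resource 1 this keeps 2022-01-22 (active), 2022-05-22 (inactive), 2022-05-23 (active), 2022-05-24 (inactive) and 2022-05-25 (active), and drops 2022-05-27 because its predecessor is also active. That is exactly why the running sum stays at 2 on the last timestamp.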
Suppose I have my table like:
uid day_used_app
--- -------------
1 2012-04-28
1 2012-04-29
1 2012-04-30
2 2012-04-29
2 2012-04-30
2 2012-05-01
2 2012-05-21
2 2012-05-22
Suppose I want the number of unique users who returned to the app at least 2 different days in the last 7 days (from 2012-05-03).
So as an example to retrieve the number of users who have used the application on at least 2 different days in the past 7 days:
select count(distinct case when num_different_days_on_app >= 2
then uid else null end) as users_return_2_or_more_days
from (
select uid,
count(distinct day_used_app) as num_different_days_on_app
from table
where day_used_app between current_date() - 7 and current_date()
group by 1
) t -- most engines require an alias on the derived table
This gives me:
users_return_2_or_more_days
---------------------------
2
The question I have is:
What if I want to do this for every day up to now, so that my table looks like the one below, where the second field equals the number of unique users who returned on 2 or more different days within the week prior to the date in the first field?
date users_return_2_or_more_days
-------- ---------------------------
2012-04-28 2
2012-04-29 2
2012-04-30 3
2012-05-01 4
2012-05-02 4
2012-05-03 3
Would this help?
WITH
-- your original input, don't use in "real" query ...
input(uid,day_used_app) AS (
SELECT 1,DATE '2012-04-28'
UNION ALL SELECT 1,DATE '2012-04-29'
UNION ALL SELECT 1,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-04-29'
UNION ALL SELECT 2,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-05-01'
UNION ALL SELECT 2,DATE '2012-05-21'
UNION ALL SELECT 2,DATE '2012-05-22'
)
-- end of input, start "real" query here, replace ',' with 'WITH'
,
one_week_b4 AS (
SELECT
uid
, day_used_app
, day_used_app -7 AS day_used_1week_b4
FROM input
)
SELECT
one_week_b4.uid
, one_week_b4.day_used_app
, count(*) AS users_return_2_or_more_days
FROM one_week_b4
JOIN input
ON input.day_used_app BETWEEN one_week_b4.day_used_1week_b4 AND one_week_b4.day_used_app
GROUP BY
one_week_b4.uid
, one_week_b4.day_used_app
HAVING count(*) >= 2
ORDER BY 1;
Output is:
uid | day_used_app | users_return_2_or_more_days
1   | 2012-04-29   | 3
1   | 2012-04-30   | 5
2   | 2012-04-29   | 3
2   | 2012-04-30   | 5
2   | 2012-05-01   | 6
2   | 2012-05-22   | 2
Does that help your needs?
Marco the Sane ...
SELECT DISTINCT
t1.day_used_app,
(
SELECT SUM(CASE WHEN t.num_visits >= 2 THEN 1 ELSE 0 END)
FROM
(
SELECT uid,
COUNT(DISTINCT day_used_app) AS num_visits
FROM table
WHERE day_used_app BETWEEN t1.day_used_app - 7 AND t1.day_used_app
GROUP BY uid
) t
) AS users_return_2_or_more_days
FROM table t1
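Note that the correlated reference to t1.day_used_app inside a doubly-nested derived table is rejected by many engines. A more portable sketch of the same idea joins against the distinct dates first (assumptions: usage_log stands in for your table, and day_used_app - 7 is valid date arithmetic in your dialect, as it was in the question):
WITH days AS (
    SELECT DISTINCT day_used_app AS d FROM usage_log
)
, per_user AS (
    -- per reporting day, per user: distinct days used within the trailing week
    SELECT days.d, u.uid,
           COUNT(DISTINCT u.day_used_app) AS num_days
    FROM days
    JOIN usage_log u
      ON u.day_used_app BETWEEN days.d - 7 AND days.d
    GROUP BY days.d, u.uid
)
SELECT d, COUNT(*) AS users_return_2_or_more_days
FROM per_user
WHERE num_days >= 2
GROUP BY d
ORDER BY d;
This reports one row per distinct activity date; if you need every calendar day, replace the days CTE with a calendar table.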
Following is the data.
select * from (
select to_date('20140601','YYYYMMDD') log_date, null weight from dual
union
select to_date('20140601','YYYYMMDD')+1 log_date, 0 weight from dual
union
select to_date('20140601','YYYYMMDD')+2 log_date, 4 weight from dual
union
select to_date('20140601','YYYYMMDD')+3 log_date, 4 weight from dual
union
select to_date('20140601','YYYYMMDD')+4 log_date, null weight from dual
union
select to_date('20140601','YYYYMMDD')+5 log_date, 8 weight from dual);
Log_date   weight  avg_weight
---------------------------------------------------------------------------
6/1/2014   NULL    0     (0/1; since there is no previous data, I consider it 0)
6/2/2014   0       0     ((0+0)/2)
6/3/2014   4       4/3   ((0+0+4)/3)
6/4/2014   4       2     ((0+0+4+4)/4)
6/5/2014   NULL    2     ((0+0+4+4+2)/5; since it is NULL I take the previous day's avg = 2)
6/6/2014   8       3     ((0+0+4+4+2+8)/6 = 3)
So the average for the above data should be 3.
How can I achieve this in SQL instead of PL/SQL? I'd appreciate any help on this.
I just learned how to use recursive CTEs today, really excited! Hope this helps...
; WITH RawData (log_Date, Weight) AS (
select cast('2014-06-01' as SMALLDATETIME)+0, null
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+1, 0
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+2, 4
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+3, 4
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+4, null
UNION ALL select cast('2014-06-01' as SMALLDATETIME)+5, 8
)
, IndexedData (Id, log_Date, Weight) AS (
SELECT ROW_NUMBER() OVER (ORDER BY log_Date)
, log_Date
, Weight
FROM RawData
)
, ResultData (Id, log_Date, Weight, total, avg_weight) AS (
SELECT Id
, log_Date
, Weight
, CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
, CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
FROM IndexedData
WHERE Id = 1
UNION ALL
SELECT i.Id
, i.log_Date
, i.Weight
, CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT)
, CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT) / i.Id
FROM ResultData r
JOIN IndexedData i ON i.Id = r.Id + 1
)
SELECT Log_Date, Weight, avg_weight FROM ResultData
OPTION (MAXRECURSION 0)
This gives the output:
Log_Date Weight avg_weight
----------------------- ----------- ----------------------
2014-06-01 00:00:00 NULL 0
2014-06-02 00:00:00 0 0
2014-06-03 00:00:00 4 1.33333333333333
2014-06-04 00:00:00 4 2
2014-06-05 00:00:00 NULL 2
2014-06-06 00:00:00 8 3
Note that in my answer, I modified the "Data" section of your question as it didn't compile for me. It's still the same data though, hope it helps.
Edit: By default, MAXRECURSION is set to 100. This means that the query will not work for more than 101 rows of Raw Data. By adding the OPTION (MAXRECURSION 0), I have removed this limit so that the query works for all input data. However, this can be dangerous if the query isn't tested thoroughly because it might lead to infinite recursion.
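Since the question's sample data uses dual (Oracle), here is a hedged translation of the same idea to Oracle's recursive subquery factoring (available from 11g Release 2; raw_data stands in for your table):
WITH indexed AS (
    SELECT ROW_NUMBER() OVER (ORDER BY log_date) AS id, log_date, weight
    FROM raw_data
)
, result (id, log_date, weight, total, avg_weight) AS (
    SELECT id, log_date, weight, NVL(weight, 0), NVL(weight, 0)
    FROM indexed
    WHERE id = 1
    UNION ALL
    SELECT i.id, i.log_date, i.weight,
           -- a NULL weight contributes the previous running average
           r.total + NVL(i.weight, r.avg_weight),
           (r.total + NVL(i.weight, r.avg_weight)) / i.id
    FROM result r
    JOIN indexed i ON i.id = r.id + 1
)
SELECT log_date, weight, avg_weight
FROM result
ORDER BY log_date;
Oracle has no MAXRECURSION option; the recursion simply ends when the join finds no row with id = r.id + 1.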
I have an SQL table in which I keep project information coming from Primavera.
Suppose that I have columns for Start Date, End Date, Duration, and Total Qty, as shown below.
How can I distribute Total Qty over months using this information? What kind of additional columns or SQL queries do I need in order to get a correct monthly distribution?
Thanks in advance.
Columns in order:
itemname,quantity,startdate,duration,enddate
item1 -- 108 -- 2013-03-25 -- 720 -- 2013-07-26
item2 -- 640 -- 2013-03-25 -- 720 -- 2013-07-26
.
.
I think the key is to break the records apart by month. Here is an example of how to do it:
with months as (
select 1 as mon union all select 2 union all select 3 union all
select 4 as mon union all select 5 union all select 6 union all
select 7 as mon union all select 8 union all select 9 union all
select 10 as mon union all select 11 union all select 12
)
select item, m.mon, quantity / nummonths
from (select t.*, (month(enddate) - month(startdate) + 1) as nummonths
from t
) t join
months m
on month(t.startDate) <= m.mon and
month(t.endDate) >= m.mon;
This works because all the months are within the same year -- as in your example. You are quite vague on how the split should be calculated. So, I assumed that every month from the start to the end gets an equal amount.
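If the range can cross a year boundary, one option is to expand each record into one row per calendar month first, for example with a recursive CTE. A T-SQL-flavored sketch (assumptions: DATEFROMPARTS needs SQL Server 2012+; t, itemname, quantity, startdate, and enddate are the names from the question; every month still gets an equal share):
with month_list as (
      select itemname, quantity, enddate,
             datefromparts(year(startdate), month(startdate), 1) as mon
      from t
      union all
      select itemname, quantity, enddate,
             dateadd(month, 1, mon)
      from month_list
      where dateadd(month, 1, mon) <= enddate
     )
select itemname, mon,
       quantity * 1.0 / count(*) over (partition by itemname) as monthly_qty
from month_list
order by itemname, mon
option (maxrecursion 0); -- lift the default 100-level limit for very long ranges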
How can I compress / aggregate / group a table with events dynamically over time?
I have a table with values and time of occurrence.
Something like this:
value_col   time_col
3         | 2011-02-16 22:21:05.250
2         | 2011-02-16 21:21:06.170
15        | 2011-02-16 21:21:05.250
I need to aggregate the values by a given time span (e.g. hourly) starting from the first row (latest event). So in this example I want to end up with two rows for hourly aggregation.
5
15
So if a new value comes in:
value_col   time_col
6         | 2011-02-16 23:21:05.247
3         | 2011-02-16 22:21:05.250
2         | 2011-02-16 21:21:06.170
15        | 2011-02-16 21:21:05.250
If I would run that query again I want to end up with:
9
17
It should be easy to change the time span in the query, for example to compress over the last 30 seconds, the past 6 hours, the past 24 hours, etc. How can I do that in Oracle and MS SQL?
Thanks to the previous answers I got the idea of how to fulfill all the requirements.
For each record I calculate the time difference to the latest record in milliseconds (or seconds, depending on the resolution). I then take that difference modulo the time span I am currently interested in (e.g. 3600 sec = 1 h).
Then I add that value to the time_col of the same record and group over that.
Create table:
CREATE TABLE [dbo].[test_table](
[value_col] [int] NOT NULL,
[time_col] [datetime] NOT NULL
) ON [PRIMARY]
GO
INSERT [dbo].[test_table] ([value_col], [time_col]) VALUES (3, CAST(0x00009E8C01705737 AS DateTime))
INSERT [dbo].[test_table] ([value_col], [time_col]) VALUES (2, CAST(0x00009E8C015FDD8B AS DateTime))
INSERT [dbo].[test_table] ([value_col], [time_col]) VALUES (15, CAST(0x00009E8C015FDC77 AS DateTime))
INSERT [dbo].[test_table] ([value_col], [time_col]) VALUES (6, CAST(0x00009E8C0180D1F6 AS DateTime))
Solution for SQL Server:
SELECT SUM(value_col) AS s_val, aggregation_time FROM
(SELECT value_col, time_col,
DATEADD(millisecond,DATEDIFF(millisecond,time_col,(SELECT MAX(time_col)
FROM test_table)) % (3600 * 1000), time_col) AS aggregation_time
FROM test_table) t -- SQL Server requires an alias on the derived table
GROUP BY aggregation_time
ORDER BY aggregation_time DESC
Solution for Oracle:
SELECT SUM(value_col) as s_val, aggregation_time FROM
(SELECT value_col, time_col +
(MOD(ROUND(((CAST((SELECT MAX(time_col) FROM test_table) AS DATE ) -
CAST(time_col AS DATE ))*86400),0),3600))/86400 as aggregation_time
FROM test_table l)
GROUP BY aggregation_time
ORDER BY aggregation_time DESC
If I want to aggregate over the last 2 h I just change 3600 to 7200 seconds.
The result is:
9 2011-02-16 23:21:05.247
17 2011-02-16 22:21:05.247
a    b
3  | 2011-02-16 23:21:05.250
2  | 2011-02-16 22:21:05.267
15 | 2011-02-16 22:21:05.155
with tmp as (
select a, to_char(b, 'YYYYMMDDHH24') h from tab
)
select sum(a), h from tmp group by h
/
Here's an example how to aggregate hourly:
SELECT TO_CHAR(TRUNC(a.created, 'HH24'), 'DD.MM.YYYY HH24:MI'), COUNT(*)
FROM all_objects a
GROUP BY TRUNC(a.created, 'HH24');
This gives you the number of objects from all_objects aggregated hourly by their creation time. The key is TRUNC(column, 'HH24') which aggregates your data hourly.
In your case, something like this:
create table t (i int, d date);
insert into t values (3, to_date('2011-02-16 22:21:05', 'YYYY-MM-DD HH24:MI:SS'));
insert into t values (2, to_date('2011-02-16 21:21:05', 'YYYY-MM-DD HH24:MI:SS'));
insert into t values (15, to_date('2011-02-16 21:21:05', 'YYYY-MM-DD HH24:MI:SS'));
commit;
select sum(i), TO_CHAR(TRUNC(t.d, 'HH24'), 'DD.MM.YYYY HH24:MI') from t group by TRUNC(t.d, 'HH24');
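For the three sample rows this should return something like:
17  16.02.2011 21:00
3   16.02.2011 22:00
Keep in mind that this buckets by calendar hour (21:00-22:00, 22:00-23:00), not by hour-long windows anchored at the latest event, so the totals differ from the 5/15 split the question asks for.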
Here is an Oracle variant, using only one table access.
SQL> create table t (value,mydate)
2 as
3 select 3, to_timestamp('2011-02-16 22:21:05.250','yyyy-mm-dd hh24:mi:ss.ff3') from dual union all
4 select 2, to_timestamp('2011-02-16 21:21:05.267','yyyy-mm-dd hh24:mi:ss.ff3') from dual union all
5 select 15, to_timestamp('2011-02-16 21:21:05.155','yyyy-mm-dd hh24:mi:ss.ff3') from dual
6 /
Table created.
The next query groups by difference in hours, counted from the most recent timestamp, which seems to be what you want:
SQL> select sum(value)
2 from ( select extract(hour from (max(mydate) over () - mydate)) difference_in_hours
3 , value
4 from t
5 )
6 group by difference_in_hours
7 order by difference_in_hours
8 /
SUM(VALUE)
----------
5
15
2 rows selected.
But apparently your example is not accurate, because when I add the fourth row from your example, the 15 value is more than two hours away from the most recent timestamp, which leads to an extra group:
SQL> insert into t values (6,to_timestamp('2011-02-16 23:21:05.249','yyyy-mm-dd hh24:mi:ss.ff3'))
2 /
1 row created.
SQL> select sum(value)
2 from ( select extract(hour from (max(mydate) over () - mydate)) difference_in_hours
3 , value
4 from t
5 )
6 group by difference_in_hours
7 order by difference_in_hours
8 /
SUM(VALUE)
----------
9
2
15
3 rows selected.
So did I misinterpret your requirement or do you have a mistake in your example?
Regards,
Rob.
For SQL Server you will have something like:
SELECT DATEDIFF(hour, b.time_col, a.dt), SUM(b.value_col)
FROM (SELECT MAX(time_col) AS dt FROM test_table) a,
     test_table b
GROUP BY DATEDIFF(hour, b.time_col, a.dt)
Oracle doesn't have DATEDIFF; the equivalent would be TRUNC(24*(a.dt - b.time_col)).
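Written out against the thread's test_table, a hedged Oracle equivalent would be (assuming time_col is cast to DATE so that subtraction yields a number of days):
SELECT TRUNC(24 * (a.dt - b.time_col)) AS hours_back,
       SUM(b.value_col) AS s_val
FROM (SELECT MAX(CAST(time_col AS DATE)) AS dt FROM test_table) a,
     (SELECT value_col, CAST(time_col AS DATE) AS time_col FROM test_table) b
GROUP BY TRUNC(24 * (a.dt - b.time_col))
ORDER BY 1;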