I am using the DATEDIFF function to calculate the difference between my two timestamps.
payment_time = 2021-10-29 07:06:32.097332
trigger_time = 2021-10-10 14:11:13
What I have written is: date_diff('minute',payment_time,trigger_time) <= 15
I basically want the count of users who paid within 15 minutes of the trigger time,
so I have also written count(s.user_id) as count.
However, it returns a count of 1 even in the above case. The minute values are within 15 of each other, but the dates, 10 October and 29 October, are 19 days apart, so it should return 0 or not count this row in my query.
How do I compare the dates in both columns and then count the users who paid within 15 minutes?
This also works to calculate the minutes between two timestamps: it first finds the interval (by subtraction), then converts that to seconds (by extracting EPOCH), and divides by 60:
extract(epoch from (payment_time-trigger_time))/60
In PostgreSQL, I prefer to subtract the two timestamps from each other, and extract the epoch from the resulting interval:
Like here:
WITH
indata(payment_time,trigger_time) AS (
SELECT TIMESTAMP '2021-10-29 07:06:32.097332',TIMESTAMP '2021-10-10 14:11:13'
UNION ALL SELECT TIMESTAMP '2021-10-29 00:00:14' ,TIMESTAMP '2021-10-29 00:00:00'
)
SELECT
EXTRACT(EPOCH FROM payment_time-trigger_time) AS epdiff
, (EXTRACT(EPOCH FROM payment_time-trigger_time) <= 15 * 60) AS filter_matches -- epoch is in seconds, so 15 minutes = 900
FROM indata;
-- out epdiff | filter_matches
-- out ----------------+----------------
-- out 1616119.097332 | false
-- out 14.000000 | true
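For readers who want to sanity-check the arithmetic outside the database, here is a minimal Python sketch of the same idea: subtract the two timestamps and compare the difference against 15 minutes. The function name paid_within and the lower bound of zero (so that payments made before the trigger are not counted) are assumptions added for illustration; they are not part of the answer above.

```python
from datetime import datetime, timedelta

def paid_within(payment_time, trigger_time, limit=timedelta(minutes=15)):
    """True if the payment came at or after the trigger and at most `limit` later."""
    diff = payment_time - trigger_time
    return timedelta(0) <= diff <= limit

# The two sample rows used in the query above.
rows = [
    (datetime(2021, 10, 29, 7, 6, 32, 97332), datetime(2021, 10, 10, 14, 11, 13)),
    (datetime(2021, 10, 29, 0, 0, 14), datetime(2021, 10, 29, 0, 0, 0)),
]

for payment, trigger in rows:
    seconds = (payment - trigger).total_seconds()
    print(seconds, paid_within(payment, trigger))
```

The first row is about 18.7 days apart and is rejected; the second is 14 seconds apart and is accepted.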
I am calculating a TIMESTAMPDIFF from timestamps that can have a fairly large range of time intervals between them, from a few tenths of a second to 60+ minutes. Since the DB2 TIMESTAMPDIFF() function returns an integer, I am using microseconds as my numeric interval expression. The DB2 TIMESTAMPDIFF documentation states:
Microseconds (the absolute value of the duration must be less than 3547.483648)
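Read as a duration in mmss.zzzzzz form, this is 35 minutes 47.483648 seconds (2,147,483,648 microseconds, just past what a 32-bit integer can hold), so any interval over this amount returns a null value, which is the issue I'm trying to address.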
Sample queries/timestamps I'm working with in the data:
select timestampdiff(1, char(timestamp('2022-09-12 14:30:40.444896') - timestamp('2022-09-12 14:30:40.115789'))) from sysibm.SYSDUMMY1
select timestampdiff(1, char(timestamp('2022-09-12 15:59:14.548636') - timestamp('2022-09-12 14:56:10.791140'))) from sysibm.SYSDUMMY1
The second query above is an example that returns a null value as the result exceeds the maximum result interval limit. I am pigeon-holed into using microseconds as my interval as results less than 1 whole second are still valid.
Are there any methods of working around this limit to return results exceeding the limit?
SELECT
A, B
, (
(DAYS (A) - DAYS (B)) * DEC (86400, 31)
+ MIDNIGHT_SECONDS (A) - MIDNIGHT_SECONDS (B)
) * 1000
+ (MICROSECOND (A) - MICROSECOND (B)) / 1000
AS DIFF_MS
FROM
(
VALUES
(timestamp('2022-09-12 14:30:40.444896'), timestamp('2022-09-12 14:30:40.115789'))
, (timestamp('2022-09-12 15:59:14.548636'), timestamp('2022-09-12 14:56:10.791140'))
) T (A, B)
A                          | B                          | DIFF_MS
---------------------------+----------------------------+---------
2022-09-12 14:30:40.444896 | 2022-09-12 14:30:40.115789 |     329
2022-09-12 15:59:14.548636 | 2022-09-12 14:56:10.791140 | 3783758
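As a cross-check of the two sample pairs, the following Python sketch computes the exact difference in whole microseconds using integer arithmetic, so no 32-bit limit applies. It is only a model of the calculation, not Db2 code, and the helper name diff_microseconds is made up for this example.

```python
from datetime import datetime, timedelta

def diff_microseconds(a, b):
    """Exact a - b in whole microseconds (Python integers do not overflow)."""
    return (a - b) // timedelta(microseconds=1)

pairs = [
    (datetime(2022, 9, 12, 14, 30, 40, 444896), datetime(2022, 9, 12, 14, 30, 40, 115789)),
    (datetime(2022, 9, 12, 15, 59, 14, 548636), datetime(2022, 9, 12, 14, 56, 10, 791140)),
]

for a, b in pairs:
    us = diff_microseconds(a, b)
    print(us, us // 1000)  # microseconds, then whole milliseconds
```

For the second pair this gives 3783757 whole milliseconds, one less than the 3783758 shown above, presumably because the SQL expression divides the negative microsecond difference on its own and that division truncates toward zero.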
Update
Just in case: you should be aware of how TIMESTAMPDIFF works.
It operates on timestamp DURATIONs (in the yyyymmddhhmmss.zzzzzzzzzzzz format; I've truncated the fractional z part below) and may produce quite a "surprising" result if the difference is more than 1 month.
SELECT
DATE (A) AS A, DATE (B) AS B
, DAYS (B) - DAYS (A) AS DIFF_REAL
, TIMESTAMPDIFF (16, CHAR (B - A)) AS DIFF_TS
, INT (CHAR (B - A)) AS DURATION
, (DAYS (B) - DAYS (A)) * 86400 AS DIFF_REAL_SEC
, TIMESTAMPDIFF (2, CHAR (B - A)) AS DIFF_TS_SEC
FROM
(
VALUES
('2022-01-01'::TIMESTAMP, '2022-02-01'::TIMESTAMP)
, ('2022-01-31'::TIMESTAMP, '2022-03-01'::TIMESTAMP)
, ('2022-02-01'::TIMESTAMP, '2022-03-01'::TIMESTAMP)
, ('2022-02-01'::TIMESTAMP, '2022-02-28'::TIMESTAMP)
) T (A, B)
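A          | B          | DIFF_REAL | DIFF_TS | DURATION      | DIFF_REAL_SEC | DIFF_TS_SEC
-----------+------------+-----------+---------+---------------+---------------+-------------
2022-01-01 | 2022-02-01 |        31 |      30 | 1,00,00,00,00 |     2 678 400 |   2 592 000
2022-01-31 | 2022-03-01 |        29 |      31 | 1,01,00,00,00 |     2 505 600 |   2 678 400
2022-02-01 | 2022-03-01 |        28 |      30 | 1,00,00,00,00 |     2 419 200 |   2 592 000
2022-02-01 | 2022-02-28 |        27 |      27 |   27,00,00,00 |     2 332 800 |   2 332 800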
So, in short: don't use DURATIONs and TIMESTAMPDIFF to calculate the difference between 2 timestamps in days, hours, minutes, etc. if the difference is more than 1 month.
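The mismatch between DIFF_REAL and DIFF_TS can be reproduced with a small Python model. It assumes, as the numbers above suggest, that TIMESTAMPDIFF estimates days by treating every month as 30 days and every year as 365 days. The function estimated_days is hypothetical, and the (years, months, days) triples are copied by hand from the DURATION column rather than computed.

```python
from datetime import date

def estimated_days(years, months, days):
    """Day count under the assumed estimate: 365-day years, 30-day months."""
    return years * 365 + months * 30 + days

# (start, end, duration as (years, months, days) read off the DURATION column)
cases = [
    (date(2022, 1, 1), date(2022, 2, 1), (0, 1, 0)),
    (date(2022, 1, 31), date(2022, 3, 1), (0, 1, 1)),
    (date(2022, 2, 1), date(2022, 3, 1), (0, 1, 0)),
    (date(2022, 2, 1), date(2022, 2, 28), (0, 0, 27)),
]

for start, end, duration in cases:
    real = (end - start).days          # calendar days actually elapsed
    estimate = estimated_days(*duration)
    print(start, end, real, estimate)
```

Only the last case, which stays under one month, gives the same value both ways.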
When you subtract dates or timestamps, you end up with a duration.
A duration is a number formatted as yyyymmddhhmmss.zzzzzzzzzzzz.
From the manual:
A timestamp duration represents a number of years, months, days, hours, minutes, seconds, and
fractional seconds, expressed as a DECIMAL(14+s,s) number, where s is the number of digits of
fractional seconds ranging from 0 to 12. To be properly interpreted, the number must have the format
yyyymmddhhmmss.zzzzzzzzzzzz, where yyyy, mm, dd, hh, mm, ss, and zzzzzzzzzzzz represent,
respectively, the number of years, months, days, hours, minutes, seconds, and fractional seconds. The
result of subtracting one timestamp value from another is a timestamp duration with scale that
matches the maximum timestamp precision of the timestamp operands.
select timestamp('2022-09-12 15:59:14.548636') - timestamp('2022-09-12 14:56:10.791140') from sysibm.SYSDUMMY1;
returns 10303.757496
And is read as 1 hour, 3 minutes, 3.757496 seconds
So if you wanted to, you could do the math yourself. Better yet, build your own UDF that returns a big integer, or a decimal value if you need an even larger range.
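To illustrate what "doing the math yourself" involves, here is a Python sketch that decodes a duration string into seconds. It is a model of the arithmetic only, not a Db2 UDF. The function name duration_to_seconds is invented for this example, and it deliberately refuses durations that contain month or year digits, since a month has no fixed length.

```python
from decimal import Decimal

def duration_to_seconds(duration):
    """Convert a duration such as '10303.757496' (ddhhmmss.zzzzzz) to seconds."""
    value = Decimal(duration)
    sign = -1 if value < 0 else 1
    value = abs(value)
    whole = int(value)          # digits before the decimal point
    fraction = value - whole    # fractional seconds
    if whole >= 10**8:
        raise ValueError("duration contains months or years")
    seconds = whole % 100
    minutes = whole // 100 % 100
    hours = whole // 10_000 % 100
    days = whole // 1_000_000 % 100
    total = days * 86_400 + hours * 3_600 + minutes * 60 + seconds
    return sign * (total + fraction)

print(duration_to_seconds("10303.757496"))  # 3783.757496
```

Using Decimal keeps the fractional seconds exact instead of introducing floating-point rounding.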
I am trying to group the rows in a table fortnightly, but can't seem to work out how to do it - especially, as the date_part function does not have a 'fortnight' keyword argument.
This is what I have so far:
CREATE TABLE foo(
dt DATE NOT NULL,
f1 REAL NOT NULL,
f2 REAL NOT NULL,
f3 REAL NOT NULL,
f4 REAL NOT NULL
);
SELECT AVG((f1+f2+f3+f4)/4) as fld_avg FROM
(
SELECT date_part('year', dt) AS year_part,
date_part('fortnight', dt) AS fortnight_part,
f1, f2, f3, f4
FROM foo
WHERE dt >= date_trunc('day', NOW() - '3 month')
) foo
GROUP BY year_part, fortnight_part
How may I rewrite (or modify) the query above so as to group data fortnightly?
Basic idea
What we need to do is take intervals of 14 consecutive days, map each of them to a unique bucket, and then group by those buckets. The buckets can be of any type (int, char, timestamp) as long as each one has a unique value.
Division
A simple way to accomplish this is division. Count the days since a fixed reference date, divide by 14, and truncate the result to an integer.
For example, we can extract the number of seconds since 1970-01-01, the UNIX epoch, and divide by the number of seconds in a fortnight: 14 * 24 * 60 * 60 = 14 * 86400 = 1209600. (I'll use Vao Tsun's example data)
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT (EXTRACT(EPOCH FROM d)::int/86400)/14 fortnight FROM c
which yields fortnights since 1970-01-01 (a Thursday):
fortnight
-----------
1251
1252
1254
1254
(4 rows)
The integer values we get represent the number of fortnights since 1970-01-01, but we don't have to care about this. The important thing is that each value uniquely identifies a fortnight.
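The same bucketing can be modelled in a few lines of Python, which may help when checking results by hand. This is a sketch under the assumption that all dates fall on or after 1970-01-01; the name fortnight_number is made up for the example.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # a Thursday

def fortnight_number(d):
    """Whole fortnights elapsed between 1970-01-01 and d (d on or after the epoch)."""
    return (d - EPOCH).days // 14

dates = [date(2017, 12, 21), date(2017, 12, 31), date(2018, 1, 26), date(2018, 2, 1)]
print([fortnight_number(d) for d in dates])  # [1251, 1252, 1254, 1254]
```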
Because 1970-01-01 was a Thursday, all fortnights will start on a Thursday. We can move the starting point of our fortnight to a different day of the week (e.g. Monday) by adding an offset in days:
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT (EXTRACT(EPOCH FROM d)::int/86400 + 3)/14 fortnight FROM c
Adding three days moves every fortnight boundary three days earlier, from Thursday back to Monday.
If you would rather have fortnights with respect to the beginning of the year, instead of some arbitrary absolute date such as 1970-01-01, you can use the day of the year instead:
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT EXTRACT(year FROM d) * 26 + EXTRACT(doy FROM d)::int/14 AS fortnight FROM c;
which yields
fortnight
-----------
52467
52468
52469
52470
(4 rows)
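We need to multiply the extracted year by 26, because there are 26.1… fortnights in a year. Note that doy/14 ranges from 0 to 26, so the last two days of a year (three in a leap year) get the same value as the first 13 days of the next year; multiply by 27 instead if those must stay distinct.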
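The formula can be checked with a short Python model. The helper year_fortnight and its per_year parameter are invented for this sketch. It reproduces the four values above and also shows what happens right at a year boundary with a multiplier of 26 versus 27.

```python
from datetime import date

def year_fortnight(d, per_year=26):
    """Bucket value: year * per_year + (day of year) // 14."""
    doy = d.timetuple().tm_yday
    return d.year * per_year + doy // 14

dates = [date(2017, 12, 21), date(2017, 12, 31), date(2018, 1, 26), date(2018, 2, 1)]
print([year_fortnight(d) for d in dates])  # [52467, 52468, 52469, 52470]

# Day 365 of 2017 and day 1 of 2018 share a value when the multiplier is 26 ...
print(year_fortnight(date(2017, 12, 31)), year_fortnight(date(2018, 1, 1)))  # 52468 52468
# ... and stay apart when it is 27.
print(year_fortnight(date(2017, 12, 31), 27), year_fortnight(date(2018, 1, 1), 27))  # 54485 54486
```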
Truncation
Instead of division another approach is truncation. We map each day of a specific fortnight to the first timestamp of that fortnight.
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT d - make_interval(secs => EXTRACT(EPOCH FROM d)::int % (86400 * 14)) AS fortnight FROM c;
which yields
fortnight
---------------------
2017-12-14 00:00:00
2017-12-28 00:00:00
2018-01-25 00:00:00
2018-01-25 00:00:00
(4 rows)
This might seem a bit more complicated, but it has some benefits: the result is still a date/time type, and other code does not need to worry about the fact that we used fortnights.
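A Python model of the truncation approach looks like this. It assumes dates on or after 1970-01-01, and the function name fortnight_start is hypothetical.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def fortnight_start(d):
    """First day of the 14-day block (counted from 1970-01-01) that contains d."""
    return d - timedelta(days=(d - EPOCH).days % 14)

dates = [date(2017, 12, 21), date(2017, 12, 31), date(2018, 1, 26), date(2018, 2, 1)]
for d in dates:
    print(d, fortnight_start(d))
```

The four sample dates map to 2017-12-14, 2017-12-28, 2018-01-25 and 2018-01-25, matching the query output above.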
Again, instead of absolute fortnights, we can calculate this with respect to the beginning of the year, using the day of the year (doy):
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT d - make_interval(days => EXTRACT(doy FROM d)::int % 14) AS fortnight FROM c;
which yields
fortnight
---------------------
2017-12-16 00:00:00
2017-12-30 00:00:00
2018-01-14 00:00:00
2018-01-28 00:00:00
(4 rows)
The result is of type timestamp; you might want a date instead. This can be addressed by casting:
(d - make_interval(days => EXTRACT(doy FROM d)::int % 14))::date
or by subtracting an int instead of an interval from the date:
d - (EXTRACT(doy FROM d)::int % 14)
There are many more possibilities. With this scheme, we can calculate the fortnight, or any other interval, with respect to the beginning of the month, some arbitrary date, etc.
update
A fortnight is a two-week period made of one odd and one even week, e.g. weeks 1 and 2, 3 and 4, 5 and 6.
In detail: 2 is even, mod(2,2)=0, and 1 is odd, mod(1,2)=1
4 is even, mod(4,2)=0 and 3 is odd, mod(3,2)=1
6 is even, mod(6,2)=0 and 5 is odd, mod(5,2)=1
Thus you can assume that the consecutive week number within the year leaves a remainder of 1 when divided by two for one week, and a remainder of 0 for the next.
The general idea is to use the sequential number of the week in the year. To avoid Jan 1st always falling in week 1 and Dec 31 possibly falling in week 53 (and thus two odd weeks in a row), I use IW:
week number of ISO 8601 week-numbering year (01-53; the first Thursday
of the year is in week 1)
Then I assume that if one week number is odd, the next will be even, so we divide all the time into parts of two weeks: even + odd.
SQL Example:
o=# with c(d) as (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
select d,to_char(d,'IW'),right(to_char(d,'IW'),1)::int,mod(right(to_char(d,'IW'),1)::int, 2) from c;
d | to_char | right | mod
------------+---------+-------+-----
2017-12-21 | 51 | 1 | 1
2017-12-31 | 52 | 2 | 0
2018-01-26 | 04 | 4 | 0
2018-02-01 | 05 | 5 | 1
(4 rows)
mod is either 0 or 1 - group by this column
https://www.postgresql.org/docs/current/static/functions-math.html
https://www.postgresql.org/docs/current/static/functions-formatting.html
Of course you would need to add outer join on generate_series if you want data without gaps...
I am posting another answer to explain how I was wrong and why my "smart-n-neat" way failed...
the schema build and queries are at:
https://www.db-fiddle.com/f/j5i2Td8CvxCVXQQYePKzCe/0
the first (and correct) query:
select distinct w2, avg(c) over (partition by w2)
from d
join generate_series('2016.11.28'::date,'2017.02.23'::date,'2 weeks'::interval) w2
on gs >= w2 and gs < w2 + '2 weeks'::interval
order by w2;
is a long, simple and correct approach. The idea is to join on a two-week interval. It works, it is reliable, and all is good.
Now the second query:
select distinct div(to_char(gs,'IW')::int,2), min(gs) over w, avg(c) over w
from d
window w as (partition by div(to_char(gs,'IW')::int,2))
order by min;
is much shorter, neater and smarter, yet it has a huge limitation that makes it unusable. Here's why:
My approach splits the next-to-last two-week interval into two parts, the last week of 2016 and the first week of 2017, thus cutting the result in half. If you take the sum of the averages for those two weeks and multiply it by a half, the results of both queries will match. Alas, introducing CASE WHEN logic for the weeks at the edges of the year makes the neat solution heavy and adds overhead, and thus the very point is lost.
TL;DR: the neat and lightweight solution works only within an interval of one year, farther than two weeks from the end or start of the year, and only if our fortnightly interval starts on a Monday.
Now the idea behind the lightweight solution: round(2/2, 0)=1 and round(3/2, 0)=1, so you can divide the year into intervals of two weeks and use that for grouping.
Also, I deliberately did not use the most recent New Year transition, because Jan 1, 2018 is a Monday, so IW is the same as WW there, which usually is not the case.
Lastly, my first answer with odd and even weeks is not viable at all. It divides the year not into two-week intervals but into two parts, one for even and one for odd weeks... I deceived myself with a "something close" idea and worked on the remainder, while I should have done the opposite and used the whole value of the division...
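The year-boundary problem described above is easy to see in a small Python model of the grouping key. The function name iso_week_pair is made up, and the four Mondays are chosen only to straddle the 2016/2017 boundary.

```python
from datetime import date, timedelta

def iso_week_pair(d):
    """Grouping key of the short query: ISO week number divided by 2."""
    return d.isocalendar()[1] // 2

day = date(2016, 12, 19)  # a Monday
for _ in range(4):
    print(day, day.isocalendar()[1], iso_week_pair(day))
    day += timedelta(weeks=1)
```

ISO weeks 51, 52, 1 and 2 get the keys 25, 26, 0 and 1, so the last week of 2016 and the first week of 2017 each end up alone in a group instead of forming one fortnight.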
I have a table of records, call it "game".
It has an id and a timestamp.
What I need to know is unrelated to the table specifically. In order to know the average number of games played per hour, I need to know:
Total games played for each hour over the date range
Number of hourly periods within the date range
Finding the first is a matter of extracting the hour from the timestamp and grouping by it.
For the second, if the date range was rounded to the nearest day, finding this value would be easy (totalgames/numdays).
Unfortunately I can't assume this. What I need help with is finding the number of specific hour periods existing within a time range.
Example:
If the range is 5 PM today to 8 PM tomorrow, there is one "00" hour (midnight to 1 AM), but two 17, 18, 19 hours (5-6, 6-7, 7-8)
Thanks for the help
Edit: for clarity, consider the following query:
I have table game:
id, daytime
select EXTRACT(hour from daytime) as hour_period, count (*)
from game
where daytime > dateFrom and daytime < dayTo
group by hour_period
This will give me the number of games played broken down into hourly chunks for the time period.
In order to find the average games played per hour, I need to know exactly how many specific hour durations are between two timestamps. Simply dividing by the number of days is not accurate.
Edit: The ideal output will look something like this:
00 275
01 300
02 255
...
Consider the following: How many times does midnight occur between date 1 and date 2 ? If you have 1.5 days, that doesn't guarantee that midnight will occur twice. 6 AM today to 6 PM tomorrow night, for example, has 1 midnight, but 9PM tonight to 9 AM two days from now has 2 midnights.
What I'm trying to find is how many of the EXACT HOUR occurs between two timestamps, so I can use it to average the number of games played at THAT HOUR over a time period.
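To make the counting problem concrete, here is a Python sketch that counts how often each hour of the day occurs between two timestamps. It is an illustration of the requirement, not a proposed SQL solution. The function name hour_slot_counts is invented, and the sketch assumes that an hour slot counts as soon as any part of it lies inside the range.

```python
from collections import Counter
from datetime import datetime, timedelta

def hour_slot_counts(start, end):
    """Count occurrences of each hour of day (0-23) among the hour slots touching [start, end)."""
    counts = Counter()
    slot = start.replace(minute=0, second=0, microsecond=0)
    while slot < end:
        counts[slot.hour] += 1
        slot += timedelta(hours=1)
    return counts

# 5 PM on one day to 8 PM on the next (the dates themselves are arbitrary).
counts = hour_slot_counts(datetime(2014, 5, 29, 17), datetime(2014, 5, 30, 20))
print(counts[0], counts[17], counts[18], counts[19])  # 1 2 2 2
```

Dividing the per-hour game totals by these counts would give the average for each hour of the day.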
EDIT:
The following query gets the days, hours, and # of games, giving an output as below:
29 23 100
29 00 130
30 22 140
30 23 150
Then, the outer query adds up the number of games for each distinct hour and divides by the number of hours, as follows
22 140
23 125
00 130
The modified query is below:
SELECT
hour_period,
sum(hourly_no_of_games) / count(hour_period)
FROM
(
SELECT
EXTRACT(DAY from daytime) as day_period,
EXTRACT(HOUR from daytime) as hour_period,
count (*) hourly_no_of_games
from game
where daytime > dateFrom and daytime < dayTo
group by EXTRACT(DAY from daytime), EXTRACT(HOUR from daytime)
) hourly_data
GROUP BY hour_period
ORDER BY hour_period;
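The two-step aggregation can be mirrored in Python using the four sample rows shown earlier. This is only a model of the query's logic, and the function name average_per_hour is hypothetical.

```python
from collections import defaultdict

# (day, hour, games) rows as produced by the inner query in the example above.
rows = [(29, 23, 100), (29, 0, 130), (30, 22, 140), (30, 23, 150)]

def average_per_hour(rows):
    """Average the per-day counts for each hour of day, like the outer query."""
    per_hour = defaultdict(list)
    for _day, hour, games in rows:
        per_hour[hour].append(games)
    return {hour: sum(values) / len(values) for hour, values in per_hour.items()}

print(average_per_hour(rows))
```

Like the SQL, this only averages over days that have at least one game in a given hour, so an hour in which no games were played on some day contributes nothing to the divisor.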
If you need something to GROUP BY, you can truncate the timestamp to the level of hour, as in the following:
DECLARE @Date DATETIME
SET @Date = GETDATE()
SELECT @Date, DATEADD(Hour, DATEDIFF(Hour, 0, @Date), 0) AS RoundedDate
If you just need to find the total hours, you can just select the DATEDIFF in hours, such as with
SELECT DATEDIFF(Hour, '5/29/2014 20:01:32.999', GETDATE())
Extract not only the hour of the day but the day of the year (1-366). Then group on those. If there is the possibility the interval could span a year, then add the year itself and group by all three.
year dy hr games
2013 365 23 115
2014 1 00 103
I want to aggregate data at 5 minute intervals in PostgreSQL. If I use the date_trunc() function, I can aggregate data at an hourly, monthly, daily, weekly, etc. interval but not a specific interval like 5 minute or 5 days.
select date_trunc('hour', date1), count(*) from table1 group by 1;
How can we achieve this in PostgreSQL?
SELECT date_trunc('hour', date1) AS hour_stump
, (extract(minute FROM date1)::int / 5) AS min5_slot
, count(*)
FROM table1
GROUP BY 1, 2
ORDER BY 1, 2;
You could GROUP BY two columns: a timestamp truncated to the hour and a 5-minute-slot.
The example produces slots 0 - 11. Add 1 if you prefer 1 - 12.
I cast the result of extract() to integer, so the division / 5 truncates fractional digits. The result:
minute 0 - 4 -> slot 0
minute 5 - 9 -> slot 1
etc.
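The slot mapping can be tried out in Python as follows. The sample timestamps are made up, and five_minute_key is a hypothetical helper that returns the same pair of grouping values as the query (the hour and the slot number).

```python
from collections import Counter
from datetime import datetime

def five_minute_key(ts):
    """Return (timestamp truncated to the hour, 5-minute slot 0-11)."""
    return ts.replace(minute=0, second=0, microsecond=0), ts.minute // 5

samples = [
    datetime(2024, 1, 1, 10, 0, 30),
    datetime(2024, 1, 1, 10, 4, 59),
    datetime(2024, 1, 1, 10, 5, 0),
    datetime(2024, 1, 1, 11, 59, 0),
]

counts = Counter(five_minute_key(ts) for ts in samples)
for (hour, slot), n in sorted(counts.items()):
    print(hour, slot, n)
```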
This query only returns values for those 5-minute slots where values are found. If you want a value for every slot or if you want a running sum over 5-minute slots, consider this related answer:
PostgreSQL: running count of rows for a query 'by minute'
Here's a simple query you can either wrap in a function or cut and paste all over the place:
select now()::timestamp(0), (extract(epoch from now()::timestamptz(0)-date_trunc('d',now()))::int)/60;
It'll give you the current time and the number of whole minutes since the start of the day (0 to 1439), because the seconds are divided by 60. To group into 5-minute slots, divide by 300 instead, and so on. It groups by the seconds since the start of the day. To group by seconds since the beginning of the year, the hour, or whatever else, change the 'd' in the date_trunc.
I have a table with measures and the time these measures were taken, in the following form: MM/DD/YYYY HH:MI:SS AM. I have measures over many days, starting at the same time every day. The data is minute by minute, so the seconds are always 0. I want to select only the measures for the first 5 minutes of each day. I would have used a WHERE clause, but the condition would apply only to the minutes and not to the date. Is there a way to do this?
Thanks
You could try something like this:
SELECT * FROM SomeTable
WHERE
DATEPART(hh, timestamp_col) = 0 AND -- filter for first hour of the day
DATEPART(mi, timestamp_col) < 5 -- filter for the first five minutes (0-4); mi is minute, mm would be month
Careful! 0 means midnight. If your "first hour" of the day is actually 8 or 9 AM then you should replace the 0 with an 8 or 9.
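The filter logic can be checked against a few sample strings in the MM/DD/YYYY HH:MI:SS AM format from the question. This is a Python sketch with made-up data; the function name in_first_minutes and the assumed starting hour of 8 AM are for illustration only.

```python
from datetime import datetime

FORMAT = "%m/%d/%Y %I:%M:%S %p"  # MM/DD/YYYY HH:MI:SS AM

def in_first_minutes(ts, first_hour=0, minutes=5):
    """True if ts falls in the first `minutes` minutes of the hour `first_hour`."""
    return ts.hour == first_hour and ts.minute < minutes

samples = [
    "03/01/2024 08:00:00 AM",
    "03/01/2024 08:04:00 AM",
    "03/01/2024 08:05:00 AM",
    "03/02/2024 08:03:00 AM",
    "03/02/2024 09:02:00 AM",
]

# Assume measurements start at 8 AM each day.
kept = [s for s in samples if in_first_minutes(datetime.strptime(s, FORMAT), first_hour=8)]
print(kept)
```

The condition never looks at the date part, so it selects minutes 0 through 4 of the starting hour on every day.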