I have the following two cases:
Case 1:
Table1
Name  Start_Sub1           End_Sub1             Start_Sub2           End_Sub2
A     2018-09-19 07:42:00  2018-09-19 09:12:00  2018-09-23 04:02:00  2018-09-23 05:09:00
I want to find the total time the student has spent in the exam, i.e. in both subjects. Which function should I use to get this?
Case 2:
Due to human error, the data has been documented like this:
Name  Start_Sub1           End_Sub1             Start_Sub2           End_Sub2
A     2018-09-19 07:42:00  2018-09-19 09:12:00  2018-09-19 08:02:00  2018-09-19 02:09:00
In this case, the two time ranges overlap. Can the total time spent in the exam still be calculated in such a scenario?
You can convert the timestamps to seconds using the EXTRACT() function and find out whether the segments overlap. The query should look like:
select
    *,
    case
        when extract(epoch from End_Sub1) < extract(epoch from Start_Sub2)
            then extract(epoch from End_Sub1) - extract(epoch from Start_Sub1) +
                 extract(epoch from End_Sub2) - extract(epoch from Start_Sub2)
        else extract(epoch from End_Sub2) - extract(epoch from Start_Sub1)
    end as diff
from table1
For details on how to get the time difference see Time difference in seconds using Netezza.
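For Case 2 specifically, a hedged alternative (a sketch in PostgreSQL-style syntax, assuming the same Table1 columns) is to sum both durations and subtract any overlap, clamping the overlap to zero when the segments do not intersect:
select
    *,
    (extract(epoch from End_Sub1) - extract(epoch from Start_Sub1))
  + (extract(epoch from End_Sub2) - extract(epoch from Start_Sub2))
  -- overlap of the two segments, or 0 when they are disjoint
  - greatest(0, least(extract(epoch from End_Sub1), extract(epoch from End_Sub2))
              - greatest(extract(epoch from Start_Sub1), extract(epoch from Start_Sub2))) as total_secs
from table1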
I am using the DATEDIFF function to calculate the difference between my two timestamps.
payment_time = 2021-10-29 07:06:32.097332
trigger_time = 2021-10-10 14:11:13
What I have written is : date_diff('minute',payment_time,trigger_time) <= 15
I basically want the count of users who paid within 15 mins of the triggered time
thus I have also done count(s.user_id) as count
However, it returns a count of 1 even in the above case: the minute values are within 15, but the dates, 10th October and 29th October, are 19 days apart, so it should return 0 and not count this row in my query.
How do I compare the dates in both columns and then count the users who have paid within 15 minutes?
This also works to calculate minutes between two timestamps: it first finds the interval (by subtraction), then converts that to seconds (extracting the EPOCH), and divides by 60:
extract(epoch from (payment_time-trigger_time))/60
In PostgreSQL, I prefer to subtract the two timestamps from each other and extract the epoch from the resulting interval, like here:
WITH
indata(payment_time,trigger_time) AS (
SELECT TIMESTAMP '2021-10-29 07:06:32.097332',TIMESTAMP '2021-10-10 14:11:13'
UNION ALL SELECT TIMESTAMP '2021-10-29 00:00:14' ,TIMESTAMP '2021-10-29 00:00:00'
)
SELECT
EXTRACT(EPOCH FROM payment_time - trigger_time) AS epdiff
, (EXTRACT(EPOCH FROM payment_time - trigger_time) <= 15 * 60) AS filter_matches
FROM indata;
-- out epdiff | filter_matches
-- out ----------------+----------------
-- out 1616119.097332 | false
-- out 14.000000 | true
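To get the count the question asks for, here is a minimal sketch using the same Presto/Athena-style date_diff as the question (the payments table name and s alias are hypothetical); note the extra check that the payment actually came after the trigger:
SELECT COUNT(DISTINCT s.user_id) AS paid_within_15_min
FROM payments s                              -- hypothetical table name
WHERE s.payment_time >= s.trigger_time       -- paid after the trigger
  AND date_diff('second', s.trigger_time, s.payment_time) <= 15 * 60;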
In my Spiceworks database there is a table, tickets, with two columns I am concerned with, first_response_secs and created_at.
I have been tasked with finding the average response time of tickets for every week.
So if I run the following query:
select AVG(first_response_secs) from (
    select first_response_secs, created_at
    from tickets
    where created_at BETWEEN '2017-03-19' and '2017-03-25'
) t
I will get back the average first response seconds for that week. But that's as far as my limited SQL gets me. I need 6 months' worth of data, and I don't want to manually edit the date range and rerun the query 24 times.
I would like to write a query that will return output similar to the following:
WEEK AVERAGE RESPONSE TIME(secs)
-----------------------------------------------------------
2017-02-26 - 2017-03-04 21447
2017-03-05 - 2017-03-11 20564
2017-03-12 - 2017-03-18 25883
2017-03-19 - 2017-03-25 12244
Or something like that, back 6 months.
Weeks are tricky. How about:
select min(created_at) as weekstart, avg(first_response_secs) as avg_response_secs
from tickets
group by cast((julianday('2017-03-25') - julianday(created_at)) / 7 as int)
order by weekstart
One dirty way is to use a CASE expression to define the week boundaries:
select week, avg(first_response_secs)
from (
select case
when created_at between '2017-02-26' and '2017-03-04' then '2017-02-26 - 2017-03-04'
when created_at between '2017-03-05' and '2017-03-11' then '2017-03-05 - 2017-03-11'
when created_at between '2017-03-12' and '2017-03-18' then '2017-03-12 - 2017-03-18'
when created_at between '2017-03-19' and '2017-03-25' then '2017-03-19 - 2017-03-25'
end as week,
first_response_secs
from tickets
) t
group by week;
Note that this method is a general-purpose one and can be modified to change the boundaries as you wish.
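If hand-writing every CASE branch over 6 months gets tedious, here is a hedged sketch (assuming SQLite, since julianday() appeared above) that generates the week buckets with a recursive CTE instead; the boundary dates are the ones from the question:
WITH RECURSIVE weeks(week_start) AS (
    SELECT date('2017-02-26')
    UNION ALL
    SELECT date(week_start, '+7 days') FROM weeks
    WHERE week_start < date('2017-03-19')
)
SELECT w.week_start || ' - ' || date(w.week_start, '+6 days') AS week,
       AVG(t.first_response_secs) AS avg_response_secs
FROM weeks w
JOIN tickets t
  ON t.created_at >= w.week_start
 AND t.created_at < date(w.week_start, '+7 days')
GROUP BY w.week_start
ORDER BY w.week_start;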
I work with a Vertica database and I needed to make a query that, given two dates, would give me a list of all months between those dates. For example, if I were to give the query 2015-01-01 and 2015-12-31, it would output the following list:
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01
After a bit of digging, I was able to discover the following query:
SELECT date_trunc('MONTH', ts)::date as Mois
FROM
(
SELECT '2015-01-01'::TIMESTAMP as tm
UNION
SELECT '2015-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 month' OVER (ORDER BY tm)
This query works and gives me the following output:
2014-12-01
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01
As you can see, by giving the query a starting date of '2015-01-01' (or anywhere in January, for that matter), I end up with an extra entry, namely 2014-12-01. In itself, the bug (or whatever you want to call this unexpected behavior) is easy to circumvent (just start in February), but I have to admit my curiosity is piqued. Why exactly does the series start one month BEFORE the date I specified?
EDIT: Alright, after reading Kimbo's warning and confirming that indeed, long periods will eventually cause problems, I was able to come up with the following query that readjusts the dates correctly.
SELECT ts as originalMonth,
ts +
(
mod
(
day(first_value(ts) over (order by ts)) - day(ts) + day(last_day(ts)),
day(last_day(ts))
)
) as adjustedMonth
FROM
(
SELECT ts
FROM
(
SELECT '2015-01-01'::TIMESTAMP as tm
UNION
SELECT '2018-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 month' OVER (ORDER BY tm)
) as temp
The only problem I have is that I have no control over the initial day of the first record of the series; it's set automatically by Vertica to the current day. So if I run this query on the 31st of the month, I wonder how it'll behave. I guess I'll just have to wait for December to find out, unless someone knows how to get TIMESERIES to behave in a way that would let me test it.
EDIT: Okay, so after trying out many different date combinations, I was able to determine that the day on which the series starts changes depending on the date you specify. This caused a whole lot of problems... until we decided to go the simple way. Instead of using a month interval, we used a day interval and selected only one specific day per month. WAY simpler, and it works all the time. Here's the final query:
SELECT ts as originalMonth
FROM
(
SELECT ts
FROM
(
SELECT '2000-02-01'::TIMESTAMP as tm
UNION
SELECT '2018-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 day' OVER (ORDER BY tm)
) as temp
where day(ts) = 1
I think it boils down to this statement from the doc: http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/TIMESERIESClause.htm
TIME_SLICE can return the start or end time of a time slice, depending
on the value of its fourth input parameter (start_or_end). TIMESERIES,
on the other hand, always returns the start time of each time slice.
When you define a time interval with some start date (2015-01-01, for example), then TIMESERIES ts AS '1 month' will create, as its first time slice, a slice whose start falls before that first data point (here 2014-12-01). When you do DATE_TRUNC('MONTH', ts), that of course sets the first date value to 2014-12-01, even if your start date is 2015-01-03 or whatever.
Edit: I want to throw out one more warning. Your use of DATE_TRUNC achieves what you need, I think. But, from the doc: "Unlike TIME_SLICE, the time slice length and time unit expressed in [TIMESERIES] length_and_time_unit_expr must be constants so gaps in the time slices are well-defined." This means that '1 month' is actually 30 days exactly. This obviously causes problems if you're going for more than a couple of years.
I have a database table with two columns - date (incl. time) and minutes - as follows:
Open_Time Minute
2013-01-01 09:00:00.000 1
2013-01-01 09:01:00.000 1
2013-01-01 09:02:00.000 1
2013-01-01 09:03:00.000 1
2013-01-01 09:04:00.000 1
2013-01-01 09:05:00.000 1
How do I count the minutes between the first and last datetime?
select COUNT(Minute)
from test_table
where open_time between '2013-01-01 09:00:00.000' and '2013-01-01 09:05:00.000'
does not work for me.
In the future, I will also need to count the minutes as current time - open time.
Thank you for any feedback!
For MySQL, maybe you can use:
SELECT TIMESTAMPDIFF(MINUTE, '2013-01-01 09:00:00.000', '2013-01-01 09:05:00.000'); -- returns the result in minutes
Read here: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html
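To answer the first-to-last part of the question directly, a minimal sketch (assuming the test_table from the question):
SELECT TIMESTAMPDIFF(MINUTE, MIN(Open_Time), MAX(Open_Time)) AS minutes
FROM test_table;
With the sample data this returns 5 (six rows, but five one-minute gaps).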
The SQL looks fine although it returns 6. Did you want that or did you want 5? You could always just start your SELECT criteria from 09:01:00.000 if that's what you want.
SQLfiddle here: http://sqlfiddle.com/#!2/d3f13/2
I am not getting exactly what you want, but if you want to extract the minute from your date then use the following query.
SELECT EXTRACT(MINUTE FROM Open_Time) FROM test_table;
SELECT EXTRACT(MINUTE FROM Open_Time)-10 FROM test_table;
The above query will only give the difference in minutes, so check DAY, HOUR, MINUTE, SECOND based on your criteria.
In SQL Server you can use the DATEDIFF function, which accepts the following parameters:
DATEDIFF(datepart, startdate, enddate). It returns the count (int) between the specified date boundaries for the selected datepart, so you could use something like this:
select DATEDIFF(mi,MIN(opentime),MAX(opentime)) AS 'minutes'
from test_table
where open_time between '2013-01-01 09:00:00.000' and '2013-01-01 09:05:00.000'
You could also count minutes from the "Minute" column, but it will be very hard to count them against the current date (unless, every time you count, you insert the current time into the table or create a temp table)!
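For the "current time - open time" case mentioned in the question, a hedged SQL Server sketch (same test_table assumed):
select DATEDIFF(mi, MIN(open_time), GETDATE()) AS minutes_since_first_open
from test_table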
I'm using the Vertica database. I am trying to get the total seconds in each hour from the following example session data. Any sample SQL code would be very helpful - thanks!
start time           end time             session length (secs)
2010-02-21 20:30:00  2010-02-21 23:30:00  10800
2010-02-21 21:30:00  2010-02-21 22:30:00  3600
2010-02-21 21:45:00  2010-02-21 21:59:00  840
2010-02-21 22:00:00  2010-02-21 22:20:00  1200
2010-02-21 22:30:00  2010-02-21 23:30:00  3600
Desired Output
hour  secs_in_that_hour
20    1800
21    6240
22    8400
23    3600
You would need a table containing every hour so that you could join it in. The join would be based on the hour overlapping the start and end time, and then you can extract the time within each hour as (min(hour_end, end_time) - max(hour_start, start_time)). Then group on the hour and sum.
Since I don't know Vertica, I have no complete answer to this.
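A hedged sketch of that approach in generic PostgreSQL-style SQL, assuming a helper table hours(hour_start, hour_end) with one row per hour and a sessions(start_time, end_time) table (both table names are hypothetical):
SELECT h.hour_start,
       SUM(EXTRACT(EPOCH FROM
               LEAST(s.end_time, h.hour_end) - GREATEST(s.start_time, h.hour_start)
       )) AS secs_in_that_hour
FROM hours h
JOIN sessions s
  ON s.start_time < h.hour_end   -- session begins before the hour ends
 AND s.end_time > h.hour_start   -- and ends after the hour begins
GROUP BY h.hour_start
ORDER BY h.hour_start;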
Vertica is based on PostgreSQL, especially language-wise. The best thing you can do is look up Postgres's date/time functions and related tutorials. I haven't found an instance where a Postgres time function does not work in Vertica.
http://www.postgresql.org/docs/8.0/interactive/functions-datetime.html
There is probably a datediff-type function you can use. (Sorry, I don't have the time to look it up.)
See the Vertica documentation:
TIMESERIES Clause
Provides gap-filling and interpolation (GFI) computation, an important component of time series analytics computation. See Using Time Series Analytics in the Programmer's Guide for details and examples.
Syntax
TIMESERIES slice_time AS 'length_and_time_unit_expression' OVER (
... [ window_partition_clause [ , ... ] ]
... ORDER BY time_expression )
... [ ORDER BY table_column [ , ... ] ]
The simplest way is to just extract the epoch (number of seconds) from the interval (the difference between the timestamps).
As for the overlapping sums, you'll need to first break it out by hour. Some of these hours don't exist so you'll need to generate them using a TIMESERIES clause.
The idea is to first create your hourly time slices, then theta-join to find (and fan out to) all possible matches on this. This is basically looking for any and all overlaps of the time range. Luckily, this is pretty simple: it is just anywhere the start time is before the end of the slice and the end time is after the start of the slice.
Then you use GREATEST and LEAST to find the actual start and stop times within the slice, subtract them, convert the interval to seconds, and you're done.
See below for the example.
with slices as (
    select slice_time as slice_time_start,
           slice_time + interval '1 hour' as slice_time_end
    from (
        select min(start_time) as time_range from mytest
        union all
        select max(end_time) from mytest
    ) range
    timeseries slice_time as '1 HOUR' over (order by range.time_range)
)
select slice_time_start as "hour",
       extract(epoch from sum(least(end_time, slice_time_end) - greatest(slice_time_start, start_time))) as secs_in_that_hour
from slices
join mytest on (start_time < slice_time_end and end_time > slice_time_start)
group by 1
order by 1
There may be some edge cases, or some additional filtering needed, if your data isn't so clean.