I have a pretty simple table with 2 columns. The first one shows the time (timestamp), the second one shows the speed of the car at that time (float8).
| DATE_TIME | SPEED |
|---------------------|-------|
| 2018-11-09 00:00:00 | 256 |
| 2018-11-09 01:00:00 | 659 |
| 2018-11-09 02:00:00 | 256 |
| other dates | xxx |
| 2018-11-21 21:00:00 | 651 |
| 2018-11-21 22:00:00 | 515 |
| 2018-11-21 23:00:00 | 849 |
Let's say we have a period from 9 November to 21 November. How can I group that period by week? In fact, I want a result like this:
| DATE_TIME | AVG_SPEED |
|---------------------|-----------|
| 9-11 November | XXX |
| 12-18 November | YYY |
| 19-21 November | ZZZ |
I use PostgreSQL 10.4.
I use the following SQL statement to find the week number of a certain date:
SELECT EXTRACT(WEEK FROM TIMESTAMP '2018-11-09 00:00:00');
EDIT:
@tim-biegeleisen when I set the period from '2018-11-01' to '2018-11-13', your SQL statement returns 2 results.
In fact, I need this result:
2018-11-01 00:00:00 | 2018-11-04 23:00:00
2018-11-05 00:00:00 | 2018-11-11 23:00:00
2018-11-12 00:00:00 | 2018-11-13 05:00:00
As you can see in the calendar, there are 3 weeks in that period.
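We can do this using a calendar table. The query below groups by the ISO week number that EXTRACT(week ...) returns, so weeks run from Monday to Sunday, and the first and last groups are partial when the period does not start on a Monday or end on a Sunday. You could also define the week differently, e.g. as starting on the first date in your data set.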
WITH dates AS (
SELECT date_trunc('day', dd)::date AS dt
FROM generate_series
( '2018-11-09'::timestamp
, '2018-11-21'::timestamp
, '1 day'::interval) dd
),
cte AS (
SELECT t1.dt, t2.DATE_TIME, t2.SPEED,
EXTRACT(week from t1.dt) week
FROM dates t1
LEFT JOIN yourTable t2
ON t1.dt = t2.DATE_TIME::date
)
SELECT
MIN(dt)::text || '-' || MAX(dt) AS DATE_TIME,
AVG(SPEED) AS AVG_SPEED
FROM cte
GROUP BY
week
ORDER BY
MIN(dt);
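The grouping logic can be checked outside the database. The following is a minimal sketch in Python rather than SQL, assuming ISO weeks (Monday to Sunday); the function name `average_speed_by_iso_week` and the sample readings are made up for illustration and are not the asker's data.

```python
from collections import defaultdict
from datetime import datetime


def average_speed_by_iso_week(rows):
    """Group (timestamp, speed) pairs by ISO week and average the speed.

    Returns a list of (first_day, last_day, avg_speed) tuples, ordered by week.
    """
    buckets = defaultdict(list)
    for ts, speed in rows:
        iso_year, iso_week, _ = ts.isocalendar()
        # Key on (ISO year, ISO week) so weeks from different years never merge.
        buckets[(iso_year, iso_week)].append((ts.date(), speed))

    result = []
    for key in sorted(buckets):
        days = [day for day, _ in buckets[key]]
        speeds = [speed for _, speed in buckets[key]]
        result.append((min(days), max(days), sum(speeds) / len(speeds)))
    return result


# Illustrative readings only (hypothetical values).
sample = [
    (datetime(2018, 11, 9, 0), 256.0),
    (datetime(2018, 11, 11, 1), 300.0),
    (datetime(2018, 11, 12, 2), 100.0),
    (datetime(2018, 11, 18, 21), 200.0),
    (datetime(2018, 11, 21, 23), 849.0),
]
weekly = average_speed_by_iso_week(sample)
for first_day, last_day, avg_speed in weekly:
    print(first_day, last_day, avg_speed)
```

With these readings the sketch yields three groups: 9-11 November, 12-18 November and 21 November, matching the partial-week behaviour the question asks for.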
Question: Find the appointments that have no follow-up appointment within the following 7 days for a particular patient.
My query:
select *, DATEDIFF(DAY, (APPOINTMENT_DATE - LAG(APPOINTMENT_DATE)
over (ORDER BY PATIENT_ID)), APPOINTMENT_DATE) as DIFFERENCE from [dbo].
[Appointment Data]
Problems:
1. DIFFERENCE comes out in a strange format, maybe because of the datetime type.
2. Is my query right? How do I find the difference for each patient? I know I have to apply GROUP BY, but I am a little confused.
Please help!
Dataset:
APPOINTMENT_DATE PATIENT_ID DIFFERENCE
2010-05-06 00:00:00.000 00051101 NULL
2010-04-11 00:00:00.000 00101005 40302
2010-05-06 00:00:00.000 00130521 40277
2010-02-07 00:00:00.000 00130521 40302
It seems that you have several mistakes in your query:
1) The LAG window should partition by PATIENT_ID and order by APPOINTMENT_DATE
2) The subtraction inside the DATEDIFF function is unnecessary
So, your query should be something like:
select
*, datediff(dd, lag(APPOINTMENT_DATE) over (partition by PATIENT_ID order by APPOINTMENT_DATE), APPOINTMENT_DATE)
from
[dbo].[Appointment Data]
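To see what partitioning by patient and ordering by date produces, here is a minimal sketch in Python rather than T-SQL. The function name `days_since_previous` is hypothetical; the sample rows are the four rows from the question's dataset.

```python
from collections import defaultdict
from datetime import date


def days_since_previous(appointments):
    """Mimic LAG(...) OVER (PARTITION BY patient ORDER BY date).

    Returns (patient_id, appointment_date, days_since_previous) tuples,
    with None for each patient's first appointment.
    """
    by_patient = defaultdict(list)
    for patient_id, appointment_date in appointments:
        by_patient[patient_id].append(appointment_date)

    rows = []
    for patient_id in sorted(by_patient):
        previous = None
        for current in sorted(by_patient[patient_id]):
            gap = None if previous is None else (current - previous).days
            rows.append((patient_id, current, gap))
            previous = current
    return rows


appointments = [
    ("00051101", date(2010, 5, 6)),
    ("00101005", date(2010, 4, 11)),
    ("00130521", date(2010, 5, 6)),
    ("00130521", date(2010, 2, 7)),
]
gaps = days_since_previous(appointments)
for row in gaps:
    print(row)
```

Only patient 00130521 has two appointments, so only that patient gets a non-null difference (88 days); every other row is the first appointment in its partition.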
select *,
DATEDIFF(DAY, LAG(APPOINTMENT_DATE) over (ORDER BY PATIENT_ID), APPOINTMENT_DATE) as DIFFERENCE
from [dbo].[Appointment Data]
Result:
+-----------------------+------------+------------+
| APPOINTMENT_DATE | PATIENT_ID | DIFFERENCE |
+-----------------------+------------+------------+
| 5/6/2010 12:00:00 AM | 00051101 | null |
| 4/11/2010 12:00:00 AM | 00101005 | -25 |
| 5/6/2010 12:00:00 AM | 00130521 | 25 |
| 2/7/2010 12:00:00 AM | 00130521 | -88 |
+-----------------------+------------+------------+
If you swap the two date arguments, the sign of the result flips:
select *,
DATEDIFF(DAY, APPOINTMENT_DATE, LAG(APPOINTMENT_DATE) over (ORDER BY PATIENT_ID)) as DIFFERENCE
from [dbo].[Appointment Data]
Result:
+-----------------------+------------+------------+
| APPOINTMENT_DATE | PATIENT_ID | DIFFERENCE |
+-----------------------+------------+------------+
| 5/6/2010 12:00:00 AM | 00051101 | null |
| 4/11/2010 12:00:00 AM | 00101005 | 25 |
| 5/6/2010 12:00:00 AM | 00130521 | -25 |
| 2/7/2010 12:00:00 AM | 00130521 | 88 |
+-----------------------+------------+------------+
I have a structure like this
+-----+-----+------------+----------+------+----------------------+---+
| Row | id | date | time | hour | description | |
+-----+-----+------------+----------+------+----------------------+---+
| 1 | foo | 2018-03-02 | 19:00:00 | 8 | across single day | |
| 2 | bar | 2018-03-02 | 23:00:00 | 1 | end at midnight | |
| 3 | qux | 2018-03-02 | 10:00:00 | 3 | inside single day | |
| 4 | quz | 2018-03-02 | 23:15:00 | 2 | with minutes | |
+-----+-----+------------+----------+------+----------------------+---+
(I added the description column only to give context; it is not needed for the analysis.)
Here is the statement to generate the table:
WITH table AS (
SELECT "foo" as id, CURRENT_dATE() AS date, TIME(19,0,0) AS time,8 AS hour
UNION ALL
SELECT "bar", CURRENT_dATE(), TIME(23,0,0), 1
UNION ALL
SELECT "qux", CURRENT_dATE(), TIME(10,0,0), 3
UNION ALL
SELECT "quz", CURRENT_dATE(), TIME(23,15,0), 2
)
SELECT * FROM table
After adding the hour value to the given time, I need to split the row into multiple rows if the result falls on the next day.
Jumps over multiple days do NOT need to be considered, e.g. +27 hours (this should simplify the scenario).
My initial idea was to add the hours value to a datetime, in order to obtain the start and end limits of the interval:
SELECT
id,
DATETIME(date, time) AS date_start,
DATETIME_ADD(DATETIME(date, time), INTERVAL hour HOUR) AS date_end
FROM table
Here is the result:
+-----+-----+---------------------+---------------------+---+
| Row | id | date_start | date_end | |
+-----+-----+---------------------+---------------------+---+
| 1 | foo | 2018-03-02T19:00:00 | 2018-03-03T03:00:00 | |
| 2 | bar | 2018-03-02T23:00:00 | 2018-03-03T00:00:00 | |
| 3 | qux | 2018-03-02T10:00:00 | 2018-03-02T13:00:00 | |
| 4 | quz | 2018-03-02T23:15:00 | 2018-03-03T01:15:00 | |
+-----+-----+---------------------+---------------------+---+
But now I'm stuck on how to proceed with the resulting interval.
Starting from this table, the rows should be split if the day changes, like this:
+-----+-----+------------+-------------+----------+-------+--+
| Row | id | date | hour_start | hour_end | hours | |
+-----+-----+------------+-------------+----------+-------+--+
| 1 | foo | 2018-03-02 | 19:00:00 | 00:00:00 | 5 | |
| 2 | foo | 2018-03-03 | 00:00:00 | 03:00:00 | 3 | |
| 3 | bar | 2018-03-02 | 23:00:00 | 00:00:00 | 1 | |
| 4 | qux | 2018-03-02 | 10:00:00 | 13:00:00 | 3 | |
| 5 | quz | 2018-03-02 | 23:15:00 | 00:00:00 | 0.75 | |
| 6 | quz | 2018-03-03 | 00:00:00 | 01:15:00 | 1.25 | |
+-----+-----+------------+-------------+----------+-------+--+
I tried to adapt the approach from a similar, already answered question, but I was unable to make it handle the day component as well.
My final scenario will include both this approach and the one from the other question (split into single days, then split on given hour breaks), but I can tackle the two parts separately: first the split by day (this question), then the split on time breaks (the other question).
Interesting problem ... I tried the following:
Create a second table containing all the new rows that start at midnight
UNION ALL it with the source table while correcting the hours of the original rows accordingly
Commented result:
WITH table AS (
SELECT "foo" as id, CURRENT_dATE() AS date, TIME(19,0,0) AS time,8 AS hour
UNION ALL
SELECT "bar", CURRENT_dATE(), TIME(23,0,0), 1
UNION ALL
SELECT "qux", CURRENT_dATE(), TIME(10,0,0), 3
)
,table2 AS (
SELECT
id,
-- create datetime, add hours, then cast as date again
CAST( datetime_add( datetime(date, time), INTERVAL hour HOUR) AS date) date,
time(0,0,0) AS time -- losing minutes and seconds
-- substract hours to midnight
,hour - (24-EXTRACT(HOUR FROM time)) hour
FROM
table
WHERE
date != CAST( datetime_add( datetime(date,time), INTERVAL hour HOUR) AS date) )
SELECT
id
,date
,time
-- correct hour if midnight split
,IF(EXTRACT(hour from time)+hour > 24,24-EXTRACT(hour from time),hour) hour
FROM
table
UNION ALL
SELECT
*
FROM
table2
Hope it makes sense.
Of course, if you need to consider jumps over multiple days, the correction fails :)
Here is a possible solution I came up with, starting from @Martin Weitzmann's approach.
I handled 2 different cases:
ids where there is a "jump" to the next day
ids which stay within the same day
and a final UNION ALL of the result sets
I forgot to mention at first that the hours value in the input can be a float (a fraction of an hour), so I handled that too.
#standardSQL
WITH
input AS (
-- change of day
SELECT "bap" as id, CURRENT_dATE() AS date, TIME(19,0,0) AS time, 8.0 AS hour UNION ALL
-- end at midnight
SELECT "bar", CURRENT_dATE(), TIME(23,0,0), 1.0 UNION ALL
-- inside single day
SELECT "foo", CURRENT_dATE(), TIME(10,0,0), 3.0 UNION ALL
-- change of day with minutes and float hours
SELECT "qux", CURRENT_dATE(), TIME(23,15,0), 2.5 UNION ALL
-- start from midnight
SELECT "quz",CURRENT_dATE(), TIME(0,0,0), 4.5
),
-- Calculate end_date and end_time summing hours value
table AS (
SELECT
id,
date AS start_date,
time AS start_time,
EXTRACT(DATE FROM DATETIME_ADD(DATETIME(date,time), INTERVAL CAST(hour*3600 AS INT64) SECOND)) AS end_date,
EXTRACT(TIME FROM DATETIME_ADD(DATETIME(date,time), INTERVAL CAST(hour*3600 AS INT64) SECOND)) AS end_time
FROM input
),
-- portion that start from start_time and end at midnight
start_to_midnight AS (
SELECT
id,
start_time,
start_date,
TIME(23,59,59) as end_time,
start_date as end_date
FROM
table
WHERE end_date > start_date
),
-- portion that starts at midnight and ends at end_time
midnight_to_end AS (
SELECT
id,
TIME(0,0,0) as start_time,
end_date as start_date,
end_time,
end_date
FROM
table
WHERE
end_date > start_date
-- Avoid rows that starts from 0:0:0 and ends to 0:0:0 (original row ends at 0:0:0)
AND end_time != TIME(0,0,0)
)
-- Union of the 3 tables
SELECT
id,
start_date,
start_time,
end_time
FROM (
SELECT id, start_time, end_time, start_date FROM table WHERE start_date = end_date
UNION ALL
SELECT id, start_time, end_time, start_date FROM start_to_midnight
UNION ALL
SELECT id, start_time, end_time, start_date FROM midnight_to_end
)
ORDER BY id,start_date,start_time
Here is the resulting output:
+-----+-----+------------+------------+----------+---+
| Row | id | start_date | start_time | end_time | |
+-----+-----+------------+------------+----------+---+
| 1 | bap | 2018-03-03 | 19:00:00 | 23:59:59 | |
| 2 | bap | 2018-03-04 | 00:00:00 | 03:00:00 | |
| 3 | bar | 2018-03-03 | 23:00:00 | 23:59:59 | |
| 4 | foo | 2018-03-03 | 10:00:00 | 13:00:00 | |
| 5 | qux | 2018-03-03 | 23:15:00 | 23:59:59 | |
| 6 | qux | 2018-03-04 | 00:00:00 | 01:45:00 | |
| 7 | quz | 2018-03-03 | 00:00:00 | 04:30:00 | |
+-----+-----+------------+------------+----------+---+
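The split at midnight can also be expressed as a small loop. The following is a minimal sketch in Python rather than BigQuery SQL; the function name `split_at_midnight` is hypothetical. Unlike the queries above, it reports the end of the first piece as 00:00:00 (as in the table requested in the question) and also returns the hours of each piece. The loop happens to handle jumps over several days as well, although the question does not require that.

```python
from datetime import date, datetime, time, timedelta


def split_at_midnight(start, hours):
    """Split the interval [start, start + hours) into one piece per calendar day.

    Returns (day, start_time, end_time, hours_in_piece) tuples; an end_time of
    00:00:00 means the piece runs up to the following midnight.
    """
    end = start + timedelta(hours=hours)
    pieces = []
    current = start
    while current < end:
        next_midnight = datetime.combine(current.date() + timedelta(days=1), time(0, 0))
        piece_end = min(end, next_midnight)
        length_hours = (piece_end - current).total_seconds() / 3600
        pieces.append((current.date(), current.time(), piece_end.time(), length_hours))
        current = piece_end
    return pieces


# Rows taken from the question: "quz" crosses midnight, "bar" ends exactly at midnight.
quz = split_at_midnight(datetime(2018, 3, 2, 23, 15), 2)
bar = split_at_midnight(datetime(2018, 3, 2, 23, 0), 1)
print(quz)
print(bar)
```

A row that ends exactly at midnight produces a single piece, so no extra zero-length row has to be filtered out afterwards.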
Any hints on how to get the row with the minimum difference between start time and end time per guid, with the following data in Microsoft SQL Server 2014?
id| start time | guid | end time
1 | 2015-04-05 12:00 | a | 2015-04-05 12:30
2 | 2015-04-05 12:10 | a | 2015-04-05 12:15
3 | 2015-04-05 12:20 | a | 2015-04-05 12:30
4 | 2015-04-05 12:30 | b | 2015-04-05 12:35
5 | 2015-04-05 12:40 | b | 2015-04-05 12:55
6 | 2015-04-05 12:50 | c | 2015-04-05 12:55
7 | 2015-04-05 13:00 | c | 2015-04-05 13:25
the output I am looking for is:
id | start time | guid | end time
2 | 2015-04-05 12:10 | a | 2015-04-05 12:15
4 | 2015-04-05 12:30 | b | 2015-04-05 12:35
6 | 2015-04-05 12:50 | c | 2015-04-05 12:55
I have tried grouping by guid and using the DateDiff function, but it didn't work.
Try the query below:
;with CTE as (
select id, sttime, guid, endtime,
row_number() over (partition by guid order by datediff(ss, sttime, endtime)) as rowid
from tablename
) select * from CTE where rowid = 1
This answer looks a bit like Indra's answer, but there is a significant difference: it does not use DATEDIFF, which fails with an overflow if the two values are more than approximately 68 years (2147483647 seconds) apart. It also fixes some issues.
;WITH CTE as
(
SELECT
id, start_time, guid, end_time,
row_number() over (partition by guid order by end_time - start_time) rn
FROM
tablename
)
SELECT
id, start_time, guid, end_time
FROM CTE
WHERE rn = 1
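The overflow limit mentioned in this answer comes from DATEDIFF returning a 32-bit signed integer. A quick arithmetic check in Python, assuming an average year of 365.25 days:

```python
# Largest value a 32-bit signed integer can hold.
INT32_MAX = 2**31 - 1

# Seconds in an average (Julian) year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

years_until_overflow = INT32_MAX / SECONDS_PER_YEAR
print(INT32_MAX, round(years_until_overflow, 2))
```

So a difference counted in seconds overflows after roughly 68 years, which is why ordering by a direct subtraction of the two values avoids the problem.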
WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY GUID ORDER BY DATEDIFF(SS,STARTTIME,ENDTIME) ASC) AS RN
FROM YOURTABLE
)
SELECT * FROM CTE WHERE RN=1
Use NOT EXISTS to return a row if no other row with the same guid has a smaller datediff:
select id, start_time, guid, end_time
from tablename t1
where not exists
(select 1 from tablename t2
where t2.guid = t1.guid
and datediff(second, t2.start_time, t2.end_time) < datediff(second, t1.start_time, t1.end_time))
Note, I don't know SQL Server, so you may have to adjust the DATEDIFF calls above.
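The selection rule shared by these answers (keep, per guid, the row with the smallest end minus start) can be sketched in Python rather than T-SQL. The function name `shortest_per_guid` is hypothetical; the rows are the seven rows from the question. On ties this sketch keeps the first row encountered, whereas the NOT EXISTS query would return all tied rows.

```python
from datetime import datetime


def shortest_per_guid(rows):
    """Return, for each guid, the (id, start, guid, end) row with the smallest duration."""
    best = {}
    for row in rows:
        _, start, guid, end = row
        duration = end - start
        # Strict comparison: the first row seen wins when durations are equal.
        if guid not in best or duration < best[guid][0]:
            best[guid] = (duration, row)
    return [best[guid][1] for guid in sorted(best)]


rows = [
    (1, datetime(2015, 4, 5, 12, 0), "a", datetime(2015, 4, 5, 12, 30)),
    (2, datetime(2015, 4, 5, 12, 10), "a", datetime(2015, 4, 5, 12, 15)),
    (3, datetime(2015, 4, 5, 12, 20), "a", datetime(2015, 4, 5, 12, 30)),
    (4, datetime(2015, 4, 5, 12, 30), "b", datetime(2015, 4, 5, 12, 35)),
    (5, datetime(2015, 4, 5, 12, 40), "b", datetime(2015, 4, 5, 12, 55)),
    (6, datetime(2015, 4, 5, 12, 50), "c", datetime(2015, 4, 5, 12, 55)),
    (7, datetime(2015, 4, 5, 13, 0), "c", datetime(2015, 4, 5, 13, 25)),
]
shortest = shortest_per_guid(rows)
print([row[0] for row in shortest])
```

For the question's data this selects ids 2, 4 and 6, the output the asker is looking for.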
I figured this would be easy, but the only way I can work it out is with a temp table. Basically, I have one column called `myDate`, which is a datetime column, and I want to know the average difference in days between consecutive rows.
So basically the results are this:
1/1/2014
1/14/2014
1/20/2014
So basically I want to find out that the average is 9.5 days: 1/1 to 1/14 is 13 days and 1/14 to 1/20 is 6 days, so 19 / 2 is 9.5.
My basic query is select myDate from myTable
The average is the maximum minus the minimum, divided by one less than the number of rows, because the consecutive differences of sorted dates add up to the total span. So, you can get it as:
select datediff(day, min(myDate), max(myDate)) / cast(count(*) - 1 as float)
from temp;
If you want to be really careful, you can guard against the potential divide-by-zero error:
select (case when count(*) > 1
then datediff(day, min(myDate), max(myDate)) / cast(count(*) - 1 as float)
else 0
end)
from temp;
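The shortcut works because the gaps between sorted dates telescope: their sum equals the maximum minus the minimum. A minimal sketch in Python rather than SQL, with hypothetical function names, using the three dates from the question:

```python
from datetime import date


def average_gap_days(dates):
    """Average of the gaps between consecutive dates after sorting."""
    ordered = sorted(dates)
    if len(ordered) < 2:
        return 0.0  # same fallback as the CASE expression above
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def span_average_days(dates):
    """(max - min) / (count - 1), the shortcut used in the SQL."""
    if len(dates) < 2:
        return 0.0
    return (max(dates) - min(dates)).days / (len(dates) - 1)


my_dates = [date(2014, 1, 1), date(2014, 1, 14), date(2014, 1, 20)]
print(average_gap_days(my_dates), span_average_days(my_dates))
```

Both functions return 9.5 for the question's data, and they agree whenever the gaps are measured between dates in sorted order.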
You probably don't need a temp table, but I'm not convinced the above method holds in all circumstances. Your requirement is to compare the date of one row to the date of the next row, but nothing indicates that the next row's date is always later than the previous one. If a pair of dates produces a negative result from DATEDIFF(), the duration between those dates should ignore the sign, i.e. use the absolute value. For example, DATEDIFF(day, '2014-01-02', '2013-01-03') returns -364, but the true duration is 364 days, because the two dates should have been passed in the opposite order.
| MYDATE | NXTDATE | RAWDIFF | DAYDIFF |
|------------|------------|---------|---------|
| 2013-01-01 | 2014-01-02 | 366 | 366 |
| 2014-01-02 | 2013-01-03 | -364 | 364 |
| 2013-01-03 | 2014-01-27 | 389 | 389 |
| 2014-01-27 | 2013-01-28 | -364 | 364 |
| 2013-01-28 | 2014-01-29 | 366 | 366 |
| 2014-01-29 | 2014-06-30 | 152 | 152 |
| 2014-06-30 | (null) | (null) | (null) |
So, with dates possibly going backward and forward, measuring the average duration by a total span could be misleading.
| MIN_DT | MAX_DT | MAX_MIN_SPAN | SPAN_AVG | SUM_DAYDIFFS | COUNT | TRUE_AVG |
|------------|------------|--------------|----------|--------------|-------|----------|
| 2013-01-01 | 2014-06-30 | 545 | 90.83333 | 2001 | 6 | 333.5 |
Queries used for this:
SELECT
MIN(mydate) min_dt
, MAX(mydate) max_dt
, DATEDIFF(DAY, MIN(mydate), MAX(mydate)) max_min_span
, DATEDIFF(DAY, MIN(mydate), MAX(mydate)) / (COUNT(daydiff) * 1.0) span_avg
, SUM(daydiff) sum_daydiffs
, COUNT(daydiff) count_daydiffs
, SUM(daydiff) / (COUNT(daydiff) * 1.0) true_avg
FROM (
SELECT
mydate
, ABS(DATEDIFF(DAY, mydate, LEAD(mydate) OVER (ORDER BY (SELECT 1)) )) AS daydiff
FROM mytable
) sq
;
SELECT
mydate
, lead(mydate) over(order by (select 1)) nxtdate
, DATEDIFF(DAY, mydate, LEAD(mydate) OVER (ORDER BY (SELECT 1)) ) AS rawdiff
, ABS(DATEDIFF(DAY, mydate, LEAD(mydate) OVER (ORDER BY (SELECT 1)) )) AS daydiff
FROM mytable
;
see: http://sqlfiddle.com/#!6/a7cdc/4
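The difference between the two averages in this answer can be reproduced without a database. A minimal sketch in Python rather than SQL, assuming the rows are read in exactly the order shown in the MYDATE column of the first table; the variable names are made up for illustration.

```python
from datetime import date

# Dates in the row order of the MYDATE column, not sorted.
dates_in_row_order = [
    date(2013, 1, 1),
    date(2014, 1, 2),
    date(2013, 1, 3),
    date(2014, 1, 27),
    date(2013, 1, 28),
    date(2014, 1, 29),
    date(2014, 6, 30),
]

# Absolute gap between each row and the next one (the DAYDIFF column).
abs_gaps = [
    abs((nxt - cur).days)
    for cur, nxt in zip(dates_in_row_order, dates_in_row_order[1:])
]

true_avg = sum(abs_gaps) / len(abs_gaps)
span_days = (max(dates_in_row_order) - min(dates_in_row_order)).days
span_avg = span_days / len(abs_gaps)

print(abs_gaps, true_avg, span_days, round(span_avg, 5))
```

When the rows are not in date order, the span-based figure (about 90.83) and the row-to-row figure (333.5) diverge, which is the point the answer makes.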