Time difference between two rows for specified ID - sql-server-2012

I'm trying to find the time difference in seconds between two rows that have the same ID.
Here's a simple table.
The table is ordered by myid and timestamp. I'm trying to get the total seconds between two rows that have the same myid.
Here's what I have come up with. The only problem with this query is that it calculates the time difference across all records rather than only between rows with the same ID.
SELECT DATEDIFF(second, pTimeStamp, TimeStamp), q.*
FROM (
SELECT *,
LAG(TimeStamp) OVER (ORDER BY TimeStamp) pTimeStamp
FROM data
) q
WHERE pTimeStamp IS NOT NULL
This is the output.
I only want the output highlighted in yellow.
Any suggestions?
SQLFIDDLE

The fix is simply a matter of narrowing the window, with PARTITION BY, to rows with the same ID:
SELECT DATEDIFF(second, pTimeStamp, TimeStamp), q.*
FROM (
SELECT *,
LAG(TimeStamp) OVER (PARTITION BY ID ORDER BY TimeStamp) pTimeStamp
FROM data
) q
WHERE pTimeStamp IS NOT NULL
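To see the effect of PARTITION BY on a small data set, here is a minimal sketch that runs the same windowing logic in SQLite through Python's sqlite3 module. The rows are made up, the ID column is called myid as in the question's text, the timestamp column is called ts, and SQLite's strftime('%s', ...) stands in for SQL Server's DATEDIFF. It assumes SQLite 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (myid INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        (1, "2014-01-01 10:00:00"),
        (1, "2014-01-01 10:00:30"),
        (2, "2014-01-01 10:00:45"),
        (2, "2014-01-01 10:02:45"),
    ],
)

query = """
SELECT myid, ts,
       CAST(strftime('%s', ts) AS INTEGER)
         - CAST(strftime('%s', prev_ts) AS INTEGER) AS diff_seconds
FROM (
    -- LAG only looks at earlier rows that share the same myid
    SELECT myid, ts,
           LAG(ts) OVER (PARTITION BY myid ORDER BY ts) AS prev_ts
    FROM data
) q
WHERE prev_ts IS NOT NULL
ORDER BY myid, ts
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Without the PARTITION BY, the first row of myid 2 would also be compared with the last row of myid 1, which is the unwanted output described in the question.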

Related

How do I select a data every second with PostgreSQL?

I've got a SQL query that selects all the data between two dates, and now I would like to add a time scale factor so that instead of returning all the data it returns one row every second, minute or hour.
Do you know how I can achieve this?
My query:
"SELECT received_on, $1 FROM $2 WHERE $3 <= received_on AND received_on <= $4", [data_selected, table_name, date_1, date_2]
The table input:
As you can see, there are several rows within the same second. I would like to select only one per second.
If you want to select data every second, you may use the ROW_NUMBER() function partitioned by 'received_on' as follows:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY received_on ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
If you want to select data every minute or hour, you may use the extract function to get the number of seconds in 'received_on' and divide it by 60 to get the minutes or divide it by 3600 to get the hours.
epoch: For date and timestamp values, the number of seconds since 1970-01-01 00:00:00-00 (can be negative); for interval values, the total number of seconds in the interval
Group by minutes:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / 60) ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
Group by hours:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / (60*60)) ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
See a demo.
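The bucketing idea above (divide the epoch by the bucket size and number the rows inside each bucket) can be tried out with a small sketch. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, so strftime('%s', ...) with integer division replaces floor(extract(epoch from ...) / 60). The table name readings, the helper one_row_per_bucket and the sample rows are made up for illustration. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (received_on TEXT, adc_v INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2022-07-29 15:52:01", 2510),
        ("2022-07-29 15:52:01", 2509),
        ("2022-07-29 15:52:40", 2511),
        ("2022-07-29 15:53:05", 2512),
    ],
)

def one_row_per_bucket(bucket_seconds):
    """Return one row per time bucket: the row with the lowest adc_v."""
    query = """
    SELECT received_on, adc_v
    FROM (
        SELECT received_on, adc_v,
               ROW_NUMBER() OVER (
                   -- integer division maps every timestamp to its bucket number
                   PARTITION BY CAST(strftime('%s', received_on) AS INTEGER) / ?
                   ORDER BY adc_v
               ) AS rn
        FROM readings
    ) g
    WHERE rn = 1
    ORDER BY received_on
    """
    return conn.execute(query, (bucket_seconds,)).fetchall()

per_second = one_row_per_bucket(1)
per_minute = one_row_per_bucket(60)
print(per_second)
print(per_minute)
```

Passing 3600 as the bucket size gives one row per hour in the same way.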
When there are several rows per second, and you only want one result row per second, you can decide to pick one of the rows for each second. This can be a randomly chosen row or you pick the row with the greatest or least value in a column as shown in Ahmed's answer.
It would be more typical, though, to aggregate your data per second. The columns show figures and you are interested in those figures. Your sample data shows the value 2509 twice and the value 2510 three times for the adc_v column at 2022-07-29, 15:52. Consider what you would like to see. Maybe you don't want this value to go below some boundary, so you show the minimum value MIN(adc_v) to see how low it went in the second. Or you want to see the value that occurred most often in the second, MODE() WITHIN GROUP (ORDER BY adc_v). Or you'd like to see the average value AVG(adc_v). Make this decision for every column, so as to get the information most vital to you.
select
received_on,
min(adc_v),
avg(adc_i),
...
from mytable
group by received_on
order by received_on;
If you want this for another interval, say an hour instead of a second, truncate your received_on column accordingly. E.g.:
select
date_trunc('hour', received_on) as received_hour,
min(adc_v),
avg(adc_i),
...
from mytable
group by date_trunc('hour', received_on)
order by date_trunc('hour', received_on);
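As a rough illustration of aggregating per hour, here is a sketch that runs in SQLite through Python's sqlite3 module. SQLite has no date_trunc, so strftime('%Y-%m-%d %H:00:00', ...) is used as a stand-in for date_trunc('hour', ...). The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (received_on TEXT, adc_v INTEGER, adc_i REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2022-07-29 15:52:01", 2509, 1.0),
        ("2022-07-29 15:52:01", 2510, 2.0),
        ("2022-07-29 15:59:30", 2512, 3.0),
        ("2022-07-29 16:03:10", 2508, 4.0),
    ],
)

query = """
SELECT strftime('%Y-%m-%d %H:00:00', received_on) AS received_hour,
       MIN(adc_v) AS min_adc_v,   -- lowest reading in the hour
       AVG(adc_i) AS avg_adc_i    -- mean reading in the hour
FROM mytable
GROUP BY received_hour
ORDER BY received_hour
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike the ROW_NUMBER() approach, each output value here can come from a different input row, so the result is a summary of the hour rather than one of its original rows.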

How to get records from SQL Server for every hour only first record in a day in a month?

I have the code below. I want to fetch records for the last 30 days from SQL Server, taking only the first record of every hour of each day. There are about 10 records per minute, and I want only one record per hour, that is, 24 records per day.
SELECT *
FROM table
WHERE ordertime >= '2019-07-21 12:00' AND ordertime <= '2019-08-21 12:00' ;
Using Row_Number
;with cte as
(
SELECT *,rn = row_number() over (partition by datepart(hour,getdate()) order by [key])
FROM table WHERE
ordertime >='2019-07-21 12:00' AND ordertime <= '2019-08-21 12:00'
)
select *
from cte where rn =1
You can get the first record for each hour using row_number(). The correct expression is:
SELECT t.*
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY CONVERT(DATE, ordertime), DATEPART(HOUR, ordertime)
ORDER BY ordertime
) seqnum
FROM table t
) t
WHERE ordertime >= '2019-07-21 12:00' AND
ordertime < '2019-08-21 12:00' AND
seqnum = 1;
Note that the PARTITION BY clause has both the full date and hour. This uniquely identifies each hour. ROW_NUMBER() itself is enumerating the rows, based on the ORDER BY. So, "1" is for the first record.
I also changed your second comparison to < rather than <=. That makes more sense to me.
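The point about partitioning by both the date and the hour can be checked with a small sketch. It uses SQLite through Python's sqlite3 module, so date(ordertime) and strftime('%H', ordertime) stand in for CONVERT(DATE, ordertime) and DATEPART(HOUR, ordertime). The table name orders, the id column, the helper first_per_partition and the rows are hypothetical. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, ordertime TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2019-07-21 12:05:00"),
        (2, "2019-07-21 12:40:00"),
        (3, "2019-07-21 13:10:00"),
        (4, "2019-07-22 12:01:00"),
        (5, "2019-07-22 12:30:00"),
    ],
)

def first_per_partition(partition_by):
    """Return the ids of the earliest row in each partition."""
    query = f"""
    SELECT id
    FROM (
        SELECT id, ordertime,
               ROW_NUMBER() OVER (
                   PARTITION BY {partition_by}
                   ORDER BY ordertime
               ) AS seqnum
        FROM orders
    ) t
    WHERE seqnum = 1
    ORDER BY ordertime
    """
    return [row[0] for row in conn.execute(query)]

# Date plus hour identifies each hour of each day.
by_date_and_hour = first_per_partition("date(ordertime), strftime('%H', ordertime)")
# Hour alone merges 12:xx on different days into one partition.
by_hour_only = first_per_partition("strftime('%H', ordertime)")
print(by_date_and_hour, by_hour_only)
```

With the hour alone, row 4 (the first record of 12:xx on the second day) is lost because it shares a partition with row 1.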
I think this is what you are looking for:
select * from (
select *,
ROW_NUMBER() OVER(
PARTITION BY datepart(hour, ordertime ),datepart(day,ordertime ) ORDER BY ordertime
) rn from table
) a
WHERE rn=1 and ordertime >='2019-07-21 12:00' AND ordertime <= '2019-08-21 12:00';
This uses the ROW_NUMBER function, which numbers the rows within each specified partition. In this case the partition is made by day and hour, so records in the same day and hour are numbered together. In the outer query I'm returning the records that are assigned row number 1, which means that only one record is returned for each hour of each day.
Bear in mind that you can add another field in the ORDER BY part of the window to specify which record is returned (for example, if you want the last record of each hour, you would ORDER BY ordertime DESC).

Delete duplicated rows

I've got duplicated rows in a temp table, mainly because there are some date values which are seconds/milliseconds different from each other.
For example:
2018-08-30 12:30:19.000
2018-08-30 12:30:20.000
This is what causes the duplication.
How can I keep only one of those values? Let's say the higher one?
Thank you.
Well, one method is to use lead():
select t.*
from (select t.*, lead(ts) over (order by ts) as next_ts
from t
) t
where next_ts is null or
datediff(second, ts, next_ts) >= 60; -- keep a row only if no later row follows within the threshold; adjust as needed
You could assign a Row_Number to each value, as follows:
Select *
, Row_Number() over
(partition by ObjectID, cast(date as date)... ---whichever criteria you want to consider duplicates
order by date desc) --assign the latest date to row 1, may want other order criteria if you might have ties on this field
as RN
from MyTable
Then retain only the rows where RN = 1 to remove duplicates. See this answer for examples of how to round your dates to the nearest hour, minute, etc. as needed; I used truncating to the day above as an example.
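A minimal sketch of the Row_Number approach, run in SQLite through Python's sqlite3 module. The elided partition criteria are not known, so this example assumes, purely for illustration, that two rows are duplicates when they share an ObjectID and fall in the same minute. The table contents are made up around the two timestamps from the question. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ObjectID INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        (1, "2018-08-30 12:30:19.000"),
        (1, "2018-08-30 12:30:20.000"),
        (2, "2018-08-30 12:45:00.000"),
    ],
)

query = """
SELECT ObjectID, date
FROM (
    SELECT ObjectID, date,
           ROW_NUMBER() OVER (
               -- assumed duplicate criteria: same object, same minute
               PARTITION BY ObjectID, strftime('%Y-%m-%d %H:%M', date)
               ORDER BY date DESC   -- the latest value becomes row 1
           ) AS RN
    FROM MyTable
) t
WHERE RN = 1
ORDER BY ObjectID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One caveat of truncating to a fixed unit: two values that straddle a boundary, such as 12:30:59 and 12:31:00, land in different partitions and are both kept.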

Postgres windowing (determine contiguous days)

Using Postgres 9.3, I'm trying to count the number of contiguous days of a certain weather type. If we assume we have a regular time series and weather report:
date|weather
"2016-02-01";"Sunny"
"2016-02-02";"Cloudy"
"2016-02-03";"Snow"
"2016-02-04";"Snow"
"2016-02-05";"Cloudy"
"2016-02-06";"Sunny"
"2016-02-07";"Sunny"
"2016-02-08";"Sunny"
"2016-02-09";"Snow"
"2016-02-10";"Snow"
I want something to count the contiguous days of the same weather. The results should look something like this:
date|weather|contiguous_days
"2016-02-01";"Sunny";1
"2016-02-02";"Cloudy";1
"2016-02-03";"Snow";1
"2016-02-04";"Snow";2
"2016-02-05";"Cloudy";1
"2016-02-06";"Sunny";1
"2016-02-07";"Sunny";2
"2016-02-08";"Sunny";3
"2016-02-09";"Snow";1
"2016-02-10";"Snow";2
I've been banging my head on this for a while trying to use windowing functions. At first, it seems like it should be a no-brainer, but then I found out it's much harder than expected.
Here is what I've tried...
Select date, weather, Row_Number() Over (partition by weather order by date)
from t_weather
Would it be better, or just easier, to compare the current row to the next? How would you do that while maintaining a count? Any thoughts, ideas, or even solutions would be helpful!
-Kip
You need to identify the contiguous days where the weather is the same. You can do this by adding a grouping identifier. There is a simple method: subtract a sequence of increasing numbers from the dates, and the result is constant for contiguous dates.
Once you have the grouping, the rest is row_number():
Select date, weather,
Row_Number() Over (partition by weather, grp order by date)
from (select w.*,
(date - row_number() over (partition by weather order by date) * interval '1 day') as grp
from t_weather w
) w;
The SQL Fiddle is here.
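The date-minus-row_number trick can be checked against the sample data from the question with a short sketch. It runs in SQLite through Python's sqlite3 module, so the Julian day number, CAST(julianday(date) AS INTEGER), replaces PostgreSQL's date and interval arithmetic; that substitution is an assumption of this sketch, not part of the answer. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_weather (date TEXT, weather TEXT)")
conn.executemany(
    "INSERT INTO t_weather VALUES (?, ?)",
    [
        ("2016-02-01", "Sunny"), ("2016-02-02", "Cloudy"),
        ("2016-02-03", "Snow"), ("2016-02-04", "Snow"),
        ("2016-02-05", "Cloudy"), ("2016-02-06", "Sunny"),
        ("2016-02-07", "Sunny"), ("2016-02-08", "Sunny"),
        ("2016-02-09", "Snow"), ("2016-02-10", "Snow"),
    ],
)

query = """
SELECT date, weather,
       ROW_NUMBER() OVER (PARTITION BY weather, grp ORDER BY date)
           AS contiguous_days
FROM (
    -- day number minus per-weather row number is constant
    -- within a run of consecutive days with the same weather
    SELECT date, weather,
           CAST(julianday(date) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY weather ORDER BY date) AS grp
    FROM t_weather
) w
ORDER BY date
"""
rows = conn.execute(query).fetchall()
contiguous_days = [row[2] for row in rows]
print(contiguous_days)
```

For example, the Sunny rows fall on days 1, 6, 7 and 8 of the month with row numbers 1, 2, 3 and 4, so the differences are 0, 4, 4 and 4: the first day is its own group and the last three form one run.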
I'm not sure what the query engine is going to do when scanning multiple times across the same data set (kinda like calculating area under a curve), but this works...
WITH v(date, weather) AS (
VALUES
('2016-02-01'::date,'Sunny'::text),
('2016-02-02','Cloudy'),
('2016-02-03','Snow'),
('2016-02-04','Snow'),
('2016-02-05','Cloudy'),
('2016-02-06','Sunny'),
('2016-02-07','Sunny'),
('2016-02-08','Sunny'),
('2016-02-09','Snow'),
('2016-02-10','Snow') ),
changes AS (
SELECT date,
weather,
CASE WHEN lag(weather) OVER () = weather THEN 1 ELSE 0 END change
FROM v)
SELECT date
, weather
,(SELECT count(weather) -- number of times the weather didn't change
FROM changes v2
WHERE v2.date <= v1.date AND v2.weather = v1.weather
AND v2.date >= ( -- bounded between changes of weather
SELECT max(date)
FROM changes v3
WHERE change = 0
AND v3.weather = v1.weather
AND v3.date <= v1.date) --<-- here's the expensive part
) curve
FROM changes v1
Here is another approach based off of this answer.
First we add a change column that is 1 or 0 depending on whether the weather is different or not from the previous day.
Then we introduce a group_nr column by summing the change over an order by date. This produces a unique group number for each sequence of consecutive same-weather days since the sum is only incremented on the first day of each sequence.
Finally we do a row_number() over (partition by group_nr order by date) to produce the running count per group.
select date, weather, row_number() over (partition by group_nr order by date)
from (
select *, sum(change) over (order by date) as group_nr
from (
select *, (weather != lag(weather,1,'') over (order by date))::int as change
from tmp_weather
) t1
) t2;
sqlfiddle (uses equivalent WITH syntax)
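To make the intermediate change and group_nr columns visible, here is a sketch of the same three steps run in SQLite through Python's sqlite3 module on the sample data from the question. SQLite has no ::int cast, but a comparison there already yields 0 or 1, so the cast is simply dropped. It assumes SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp_weather (date TEXT, weather TEXT)")
conn.executemany(
    "INSERT INTO tmp_weather VALUES (?, ?)",
    [
        ("2016-02-01", "Sunny"), ("2016-02-02", "Cloudy"),
        ("2016-02-03", "Snow"), ("2016-02-04", "Snow"),
        ("2016-02-05", "Cloudy"), ("2016-02-06", "Sunny"),
        ("2016-02-07", "Sunny"), ("2016-02-08", "Sunny"),
        ("2016-02-09", "Snow"), ("2016-02-10", "Snow"),
    ],
)

query = """
SELECT date, weather, change, group_nr,
       ROW_NUMBER() OVER (PARTITION BY group_nr ORDER BY date)
           AS contiguous_days
FROM (
    -- running total of the change flags numbers the runs 1, 2, 3, ...
    SELECT *, SUM(change) OVER (ORDER BY date) AS group_nr
    FROM (
        -- 1 on the first day of each run, 0 when the weather repeats
        SELECT *,
               (weather != LAG(weather, 1, '') OVER (ORDER BY date)) AS change
        FROM tmp_weather
    ) t1
) t2
ORDER BY date
"""
rows = conn.execute(query).fetchall()
changes = [row[2] for row in rows]
group_nrs = [row[3] for row in rows]
contiguous_days = [row[4] for row in rows]
print(changes, group_nrs, contiguous_days)
```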
You can accomplish this with a recursive CTE as follows:
WITH RECURSIVE CTE_ConsecutiveDays AS
(
SELECT
my_date,
weather,
1 AS consecutive_days
FROM My_Table T
WHERE
NOT EXISTS (SELECT * FROM My_Table T2 WHERE T2.my_date = T.my_date - INTERVAL '1 day' AND T2.weather = T.weather)
UNION ALL
SELECT
T.my_date,
T.weather,
CD.consecutive_days + 1
FROM
CTE_ConsecutiveDays CD
INNER JOIN My_Table T ON
T.my_date = CD.my_date + INTERVAL '1 day' AND
T.weather = CD.weather
)
SELECT *
FROM CTE_ConsecutiveDays
ORDER BY my_date;
Here's the SQL Fiddle to test: http://www.sqlfiddle.com/#!15/383e5/3

SQL Statement Only latest entry of the day

It seems it has been too long since I last needed to write my own SQL statements. I have a table (GAS_COUNTER) with timestamps (TS) and values (VALUE).
There are hundreds of entries per day, but I only need the latest of the day. I tried different ways but never get what I need.
Edit
Thanks for the fast replies, but some do not meet my needs (I need the latest value of each day in the table) and some don't work. My best own statement was:
select distinct (COUNT)
from
(select
extract (DAY_OF_YEAR from TS) as COUNT,
extract (YEAR from TS) as YEAR,
extract (MONTH from TS) as MONTH,
extract (DAY from TS) as DAY,
VALUE as VALUE
from GAS_COUNTER
order by COUNT)
but the value is missing. If I put it in the first select, all rows are returned (logically correct, as every line is distinct).
Here is an example of the table content:
TS VALUE
2015-07-25 08:47:12.663 0.0
2015-07-25 22:50:52.155 2.269999999552965
2015-08-10 11:18:07.667 52.81999999284744
2015-08-10 20:29:20.875 53.27999997138977
2015-08-11 10:27:21.49 54.439999997615814
2nd Edit and solution
select TS, VALUE from GAS_COUNTER
where TS in (
select max(TS) from GAS_COUNTER group by extract(DAY_OF_YEAR from TS)
)
This one would give you the very last record:
select top 1 * from GAS_COUNTER order by TS desc
Here is one that would give you last records for every day:
select VALUE from GAS_COUNTER
where TS in (
select max(TS) from GAS_COUNTER group by to_date(TS,'yyyy-mm-dd')
)
Depending on the database you are using you might need to replace/adjust to_date(TS,'yyyy-mm-dd') function. Basically it should extract date-only part from the timestamp.
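As one example of adjusting the date-only expression for a specific database, here is a sketch for SQLite, run through Python's sqlite3 module, where date(TS) extracts the date part. It uses the sample rows from the question and returns TS alongside VALUE so the result is easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GAS_COUNTER (TS TEXT, VALUE REAL)")
conn.executemany(
    "INSERT INTO GAS_COUNTER VALUES (?, ?)",
    [
        ("2015-07-25 08:47:12.663", 0.0),
        ("2015-07-25 22:50:52.155", 2.269999999552965),
        ("2015-08-10 11:18:07.667", 52.81999999284744),
        ("2015-08-10 20:29:20.875", 53.27999997138977),
        ("2015-08-11 10:27:21.49", 54.439999997615814),
    ],
)

query = """
SELECT TS, VALUE
FROM GAS_COUNTER
WHERE TS IN (
    -- latest timestamp of each calendar day
    SELECT MAX(TS) FROM GAS_COUNTER GROUP BY date(TS)
)
ORDER BY TS
"""
rows = conn.execute(query).fetchall()
latest_timestamps = [row[0] for row in rows]
print(rows)
```

Grouping by the full date, rather than by the day of the year, keeps the same calendar day in different years apart.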
Select the max value for the timestamp.
select MAX(TS), value -- or whatever other columns you want from the record
from GAS_COUNTER
group by value
Something like this would window the data and give you the last value on the day - but what happens if you get two TS the same? Which one do you want?
select *
from ( select distinct cast( TS as date ) as dt
from GAS_COUNTER ) as gc1 -- distinct days
cross apply (
select top 1 VALUE -- last value on the date.
from GAS_COUNTER as gc2
where gc2.TS < dateadd( day, 1, gc1.dt )
and gc2.TS >= gc1.dt
order by gc2.TS desc
) as x