Select Every Nth Record From SQL

Select Every Nth Record From SQL - sql

I am having a Database in which data is been logged in regular interval of time i e for 5 minutes say it is been logged for 24 hours as shown in below table.
Date and Time Value
2016-09-17 14:00:00 25.26
2016-09-17 14:05:00 24.29
2016-09-17 14:10:00 25.22
2016-09-17 14:20:00 25.10
2016-09-17 23:55:00 20.21
I want To display Every 1 hour reading using SQL query There are chances the some reading may be missing The expected Output should be.
Date and Time Value
2016-09-17 14:00:00 25.26
2016-09-17 15:00:00 27.29
2016-09-17 16:00:00 28.12
2016-09-17 17:00:00 22.11
There are chances my be that some reading may be missing. like
Date and Time Value
2016-09-17 14:35:00 25.26
This reading may be missing
Please Suggest SQL query for the same

SELECT t1.DateCol,
t1.Value
FROM yourTable t1
INNER JOIN
(
SELECT MIN(DateCol) AS firstDate
FROM yourTable
GROUP BY FORMAT(DateCol, 'dd/MM/yyyy hh')
) t2
ON t1.DateCol = t2.firstDate
If you instead wanted to group by every 15 minutes, you could try:
GROUP BY CONCAT(FORMAT(DateCol, 'dd/MM/yyyy hh'),
FLOOR(DATEPART(MINUTE, DateCol) / 15))

Related

SQL Server datetime ranges between records

What would be the best way to get datetime ranges between records in SQL Server? I think it would be easiest to explain with an example.
I have the following data - these records start and end datetime ranges would never overlap:
ID
Start
End
1
1/27/2021 06:00:00
1/27/2021 09:00:00
2
1/27/2021 10:00:00
1/27/2021 14:00:00
3
1/27/2021 21:00:00
1/28/2021 04:00:00
4
1/28/2021 06:00:00
1/28/2021 09:00:00
I need to get the date time range between records. So the resulting SQL query would return the following result set (ID doesn't matter):
ID
Start
End
1
1/27/2021 09:00:00
1/27/2021 10:00:00
2
1/27/2021 14:00:00
1/27/2021 21:00:00
3
1/28/2021 04:00:00
1/28/2021 06:00:00
Thanks for any help in advance.

Use lead():
select t.*
from (select id, end as start, lead(start) over (order by start) as end
from t
) t
where end is not null;
Note: end is a lousy name for a column, given that it is a SQL keyword. I assume it is for illustrative purposes only.
Here is a SQL Fiddle.

Google Bigquery - Create time series of number of active records

I'm trying to create a timeseries in google bigquery SQL. My data is a series of time ranges covering the period of activity for that record. Here is an example:
Start End
2020-11-01 21:04:00 UTC 2020-11-02 07:15:00 UTC
2020-11-01 21:45:00 UTC 2020-11-02 04:00:00 UTC
2020-11-01 22:00:00 UTC 2020-11-02 09:48:00 UTC
2020-11-01 22:00:00 UTC 2020-11-02 06:00:00 UTC
I wish to create a new table to total the number of active records within a 15 minute block. "21:00:00" would for example be 21:00 to 21:14.59. My desired output for the above would be:
Period Active_Records
2020-11-01 21:00:00 1
2020-11-01 21:15:00 1
2020-11-01 21:30:00 1
2020-11-01 21:45:00 2
2020-11-01 22:00:00 4
2020-11-01 22:15:00 4
etc until the end of the last active range.
I would also like to be able to generate this on the fly by querying a date range and having it return every 15 minute block in the range and how many active records there was in that period.
Any assistance would be greatly appreciated.

Below is for BigQuery Standard SQL
#standardSQL
select ts as period, count(1) as Active_Records
from unnest((
select generate_timestamp_array(timestamp_trunc(min(start), hour), max(`end`), interval 15 minute)
from `project.dataset.table`
)) ts
join `project.dataset.table`
on not (`end` < ts or start > timestamp_add(ts, interval 15 * 60 - 1 second))
group by ts
if to apply to sample data from your question - output is

SAS - PROC SQL: two tables: each one column distinct value, left join

I have a table with distinct dates YYYYMMDD from 20000101 until 20001231 and a table with distinct time points (HH:MM:SS) from 09:30:00 until 16:00:00.
I would like to create a (left) join where every day gets repeated 391 times assigned with each time point. That looks to me like a left join, however, I do not have any id's for joining.
date time
20000101 09:30:00
20000101 09:31:00
20000101 ...
20000101 ...
20000101 15:59:00
20000101 16:00:00
20000102 09:30:00
20000102 ...
20000102 16:00:00
how would the respective code look like (if there is no explicit common primary key to join on)?
PROC SQL;
SELECT DISTINCT a.date, b.time
FROM table_1 a, table_1 b (both information are in the same table)
;
QUIT;
Just as background: there are days that are "shorter" / less than 391 observation points. However, I would like to make sure every day has 391 observation points, just filled up with missing values.

You need Cartesian Product since you want to generate all combinations of date and time. So to produce such result you need CROSS JOIN in which you don't have to give any JOIN Condition.
Try the below query:
PROC SQL;
SELECT a.date, b.time
FROM table_1 a
CROSS JOIN
table_1 b
GROUP BY a.date, b.time
;
QUIT;
OR
PROC SQL;
SELECT a.date, b.time
FROM (SELECT date FROM table_1) a
CROSS JOIN
(SELECT time FROM table_1) b
GROUP BY a.date, b.time
;
QUIT;
For more info on CROSS JOIN Follow the below link:
http://support.sas.com/documentation/cdl/en/fedsqlref/67364/HTML/default/viewer.htm#p1q7agzgxs9ik5n1p7k3sdft0u9u.htm

You can do either a Left Join or Join and add Where 1=1 this will create the Cartesian Product for you:
Code:
proc sql;
create table want as
select t1.date, t2.time
from t1 left join t2 on 1=1
order by date, time;
quit;

To show all observed times (over all dates) for each date, as well as maintaining original satellite information I would use a reflexive cross join of the combinatoric columns for the basis of a reflexive left join.
Consider this sample data generator. It simulates the case of data being gathered at different intervals (every 10 or 20 minutes) on different days.
data have;
do i = 1 to 5;
date = '01-apr-2018'd + (i-1);
do j = 0 to 4;
time = '12:00't + (mod(i,2)+1) * 600 * j; * every other day sample at 1o or 20 minute interval;
x = ceil ( 25 * ranuni(123) );
OUTPUT;
end;
end;
format date yymmdd10. time time8.;
keep date time x;
run;
SQl is used to cross join the distinct dates and times and then the original data is left joined to the cross join.
proc sql;
create table cross_as_left_basis
as
select
cross.date
, cross.time
, have.x
from
( select distinct dates.date, times.time
from have as dates
cross join have as times
) as
cross
left join
have
on
cross.date = have.date
and cross.time = have.time
;
Have is
date time x
2018-04-01 12:00:00 19
12:20:00 9
12:40:00 5
13:00:00 23
13:20:00 9
2018-04-02 12:00:00 6
12:10:00 20
12:20:00 10
12:30:00 4
12:40:00 5
2018-04-03 12:00:00 20
12:20:00 11
12:40:00 25
13:00:00 7
13:20:00 18
2018-04-04 12:00:00 14
12:10:00 14
12:20:00 22
12:30:00 4
12:40:00 22
2018-04-05 12:00:00 17
12:20:00 20
12:40:00 18
13:00:00 9
13:20:00 14
The join result is
date time x
2018-04-01 12:00:00 19
12:10:00 .
12:20:00 9
12:30:00 .
12:40:00 5
13:00:00 23
13:20:00 9
2018-04-02 12:00:00 6
12:10:00 20
12:20:00 10
12:30:00 4
12:40:00 5
13:00:00 .
13:20:00 .
2018-04-03 12:00:00 20
12:10:00 .
12:20:00 11
12:30:00 .
12:40:00 25
13:00:00 7
13:20:00 18
2018-04-04 12:00:00 14
12:10:00 14
12:20:00 22
12:30:00 4
12:40:00 22
13:00:00 .
13:20:00 .
2018-04-05 12:00:00 17
12:10:00 .
12:20:00 20
12:30:00 .
12:40:00 18
13:00:00 9
13:20:00 14

Total time calculation in a sql query for a day where time in 24 hour format as hhmm

I have a table with date(date), left time(varchar2(4)) and arrival time(varchar2(4)). Time taken is in 24 hour format as hhmm. If a person travel 3 times a day, what will be the query to calculate total travel time in a day?
I am using oracle 11g. Kindly help. Thank you.

Convert the value to a number and report in minutes:
select to_number(substring(time, 1, 2))*60 + to_number(substring(time, 3, 2)) as minutes
Your query would look something like:
select person, sum(to_number(substring(time, 1, 2))*60 + to_number(substring(time, 3, 2))) as minutes
from t
group by person;
I see no reason to convert this back to a string -- or to even store the value as a string instead of as a number. But if you need to, you can reverse the process to get a string.

There are 2 answers, If you want to sum time only on date then it can be done as:-
select curr_date,
sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by curr_date,arrival_time,left_time;
The sample output is as follows:-
select curr_date,left_time,arrival_time from sql_prac;
CURR_DATE LEFT_TIME ARRIVAL_TIME
--------- -------------------- --------------------
30-JUN-17 00:00:00 15:00:00
30-JUL-17 03:30:00 11:30:00
30-AUG-17 03:00:00 12:30:00
30-SEP-17 04:00:00 17:00:00
30-JUN-17 00:00:00 15:00:00
30-JUL-17 03:30:00 11:30:00
30-AUG-17 03:00:00 12:30:00
30-SEP-17 04:00:00 17:00:00
30-SEP-17 04:00:00 17:00:00
9 rows selected
select curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by curr_date,arrival_time,left_time;
CURR_DATE DIFFERENCE
--------- ----------
30-JUN-17 30
30-JUL-17 16
30-SEP-17 39
30-AUG-17 19

If you want to sum it by person and date then it can be done as:-
select dept,curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by dept,curr_date,arrival_time,left_time order by Dept;
The sample output is as follows:-
Data in table is:-
select dept,curr_date,left_time,arrival_time from sql_prac;
DEPT CURR_DATE LEFT_TIME ARRIVAL_TIME
-------------------- --------- -------------------- --------------------
A 30-SEP-17 04:00:00 17:00:00
B 30-SEP-17 04:00:00 17:00:00
C 30-AUG-17 03:00:00 12:30:00
D 30-DEC-17 04:00:00 17:00:00
A 30-SEP-17 04:00:00 17:00:00
B 30-JUL-17 03:30:00 11:30:00
C 30-AUG-17 03:00:00 12:30:00
D 30-SEP-17 04:00:00 17:00:00
R 30-SEP-17 04:00:00 17:00:00
Data fetched using the query
select dept,curr_date,sum(24 * (to_date(arrival_time, 'HH24:mi:ss')- to_date(left_time, 'HH24:mi:ss'))) as difference
from sql_prac group by dept,curr_date,arrival_time,left_time order by Dept;
DEPT CURR_DATE DIFFERENCE
-------------------- --------- ----------
A 30-SEP-17 26
B 30-JUL-17 8
B 30-SEP-17 13
C 30-AUG-17 19
D 30-SEP-17 13
D 30-DEC-17 13
R 30-SEP-17 13

PostgreSQL: How do I join two tables based on same start and end time (timestamp without time zone)?

Okay, I came across this relevant question but it is slightly different than my case.
Problem
I have two similar type of tables in my PostgreSQL 9.5 database tbl1 and tbl2 both containing 1,274 rows. The structure and layout of table 1 is as follows:
Table 1:
id (integer) start_time end_time my_val1 (numeric)
51 1994-09-26 16:50:00 1994-10-29 13:30:00 3.7
52 1994-10-29 13:30:00 1994-11-27 12:30:00 2.4
53 1994-11-27 12:30:00 1994-12-29 09:25:00 7.6
54 1994-12-29 09:25:00 1994-12-31 23:59:59 2.9
54 1995-01-01 00:00:00 1995-02-05 13:50:00 2.9
55 1995-02-05 13:50:00 1995-03-12 11:10:00 1.6
56 1995-03-12 11:10:00 1995-04-11 09:05:00 2.2
171 1994-10-29 16:15:00 1994-11-27 19:10:00 6.9
172 1994-11-27 19:10:00 1994-12-29 11:40:00 4.2
173 1994-12-29 11:40:00 1994-12-31 23:59:59 6.7
173 1995-01-01 00:00:00 1995-02-05 15:30:00 6.7
174 1995-02-05 15:30:00 1995-03-12 09:45:00 3.2
175 1995-03-12 09:45:00 1995-04-11 11:30:00 1.2
176 1995-04-11 11:30:00 1995-05-11 15:30:00 2.7
321 1994-09-26 14:40:00 1994-10-30 14:30:00 0.2
322 1994-10-30 14:30:00 1994-11-27 14:45:00 7.8
323 1994-11-27 14:45:00 1994-12-29 14:20:00 4.6
324 1994-12-29 14:20:00 1994-12-31 23:59:59 4.1
324 1995-01-01 00:00:00 1995-02-05 14:35:00 4.1
325 1995-02-05 14:35:00 1995-03-12 11:30:00 8.2
326 1995-03-12 11:30:00 1995-04-11 09:45:00 1.2
.....
In some rows, start_time and end_time may look similar but whole time window may not be equal. For example,
id (integer) start_time end_time my_val1 (numeric)
54 1994-12-29 09:25:00 1994-12-31 23:59:59 2.9
173 1994-12-29 11:40:00 1994-12-31 23:59:59 6.7
Start_time and end_time are timestamp without time zone. The start_time and end_time have to be in one year window thus whenever there was a change of year from 1994 to 1995 then that row was divided into two rows therefore, there are repeating IDs in the column id. Table 2 tbl2 contains the similar start_time and end_time (timestamp without time zone) and column my_val2 (numeric). For each row in table 1 I need to join corresponding row of table 2 where start_time and end_time are similar.
What I have tried,
Select
a.id,
a.start_time, a.end_time,
a.my_val1,
b.my_val2
from tbl1 a
left join tbl2 b on
b.start_time = a.start_time
order by a.id;
The query returned 3,802 rows which is not desired. The desired result is 1,274 rows of table 1 joined with my_val2. I am aware of Postgres Distinct on clause but I need to keep all repeating ids of tbl1 and only need to join my_val2 of tbl2. Do I need to use Postgres Window function here. Can someone suggest that how to join these two tables?

why you don't add to the ON part the condition
ON b.start_time = a.start_time AND a.id = b.id

For each row in table 1 I need to join corresponding row of table 2
where start_time and end_time are similar.
SQL query should include end_time
SELECT a.id,
a.start_time,
a.end_time,
a.my_val1,
b.my_val2
FROM tbl1 a
LEFT JOIN tbl2 b
ON b.start_time = a.start_time
AND b.end_time = a.end_time
ORDER BY a.id;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Select Every Nth Record From SQL - sql

Related

SQL Server datetime ranges between records

Google Bigquery - Create time series of number of active records

SAS - PROC SQL: two tables: each one column distinct value, left join

Total time calculation in a sql query for a day where time in 24 hour format as hhmm

PostgreSQL: How do I join two tables based on same start and end time (timestamp without time zone)?

Categories

Resources