How to use date range in window function in SQL Server

How to use a date range using window function in SQL Server?
I have this table:
id date item
----------------------
123 07/01/2018 anf
123 31/12/2017 sh
123 01/01/2018 ab
123 12/03/2018 fhy
123 02/01/2018 fg
124 10/12/2017 ab
124 03/03/2017 sh
125 21/11/2017 ab
125 31/12/2017 sh
125 01/03/2017 ab
126 31/12/2017 ab
I want all the information for the ids from the latest date back through the previous 30 days. My data has missing dates, so I cannot use a ROWS frame with OVER (PARTITION BY ...).
I need logic similar to a date RANGE frame in a window function, but that is not supported in SQL Server.

SELECT * FROM YourTable
WHERE DATEDIFF(DAY,date,GETDATE())<=30

"I want all the information of ids from the latest date to the previous 30 days."
Your question is unclear on what you actually want. If you mean the latest date in the data, then you can use:
select . . .
from (select t.*, max(date) over (partition by id) as max_date
from t
) t
where date > dateadd(day, -30, max_date);
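The per-id filter above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (for window functions), dates rewritten as ISO strings, and a column renamed to event_date; none of these come from the question, and SQLite's date(max_date, '-30 days') stands in for SQL Server's dateadd(day, -30, max_date).

```python
import sqlite3

# Sample rows from the question; dates converted from dd/mm/yyyy to ISO
# strings (an assumption) so that text comparison matches date order.
ROWS = [
    (123, "2018-01-07", "anf"), (123, "2017-12-31", "sh"),
    (123, "2018-01-01", "ab"), (123, "2018-03-12", "fhy"),
    (123, "2018-01-02", "fg"), (124, "2017-12-10", "ab"),
    (124, "2017-03-03", "sh"), (125, "2017-11-21", "ab"),
    (125, "2017-12-31", "sh"), (125, "2017-03-01", "ab"),
    (126, "2017-12-31", "ab"),
]


def rows_within_30_days_of_latest(rows):
    """Keep rows dated within 30 days before each id's own latest date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, event_date TEXT, item TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, event_date, item
        FROM (SELECT t.*, MAX(event_date) OVER (PARTITION BY id) AS max_date
              FROM t) AS latest
        WHERE event_date > date(max_date, '-30 days')
        ORDER BY id, event_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_within_30_days_of_latest(ROWS):
        print(row)
```

With this sample data each id keeps only its latest row, because no other row for the same id falls inside its 30-day window.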

Related

Select the latest NEW rows by Date from the Snapshot Table

I have a snapshot table like the following
id   name            value  date
123  ABC Corp        500    yesterday
123  ABC Corp        500    today
456  XYZ Ltd.        700    today
123  ABC Corp        500    tomorrow
456  XYZ Ltd.        700    tomorrow
789  PQR Consulting  100    tomorrow
I would like to get only the new rows, as in the following table, from the snapshot table above using SQL:
id   name            value  date
456  XYZ Ltd.        700    today
789  PQR Consulting  100    tomorrow
I need a pointer on whether to use a window function (such as LAG()) to get the new table, or whether there is a simpler solution. Thanks in advance!
There are a few options here. One is to use a CTE or a derived table to add a ROW_NUMBER based on the date column; another is to use the FIRST_VALUE window function. I'm pretty sure the derived table solution would perform better, but I don't have the time to test.
Here's what I would do:
;WITH cte AS
(
SELECT id, name, value, date, ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) as rn
FROM snapshotTable
)
SELECT id, name, value, date
FROM cte
WHERE rn = 1;
To get the earliest records all you need to do is remove the DESC from the order by clause.
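As a rough illustration of the ascending variant, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later assumed). The concrete dates standing in for yesterday, today and tomorrow, and the column name snap_date, are hypothetical. Note that it returns the first-seen row for every id, including 123; dropping ids that were already present in the earliest snapshot, as the expected output in the question does, would need an extra condition that is not shown here.

```python
import sqlite3

# Hypothetical dates standing in for yesterday / today / tomorrow.
SNAPSHOT = [
    (123, "ABC Corp", 500, "2024-01-01"),
    (123, "ABC Corp", 500, "2024-01-02"),
    (456, "XYZ Ltd.", 700, "2024-01-02"),
    (123, "ABC Corp", 500, "2024-01-03"),
    (456, "XYZ Ltd.", 700, "2024-01-03"),
    (789, "PQR Consulting", 100, "2024-01-03"),
]


def first_seen_rows(rows):
    """Return the earliest snapshot row for each id using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snapshotTable "
        "(id INTEGER, name TEXT, value INTEGER, snap_date TEXT)"
    )
    conn.executemany("INSERT INTO snapshotTable VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT id, name, value, snap_date,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY snap_date) AS rn
            FROM snapshotTable
        )
        SELECT id, name, value, snap_date
        FROM cte
        WHERE rn = 1
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_seen_rows(SNAPSHOT):
        print(row)
```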

How to retrieve trips from historical data?

I have the following table mytable in Hive:
id radar_id car_id datetime
1 A21 123 2017-03-08 17:31:19.0
2 A21 555 2017-03-08 17:32:00.0
3 A21 777 2017-03-08 17:33:00.0
4 B15 123 2017-03-08 17:35:22.0
5 B15 555 2017-03-08 17:34:05.0
5 B15 777 2017-03-08 20:50:12.0
6 A21 123 2017-03-09 11:00:00.0
7 C11 123 2017-03-09 11:10:00.0
8 A21 123 2017-03-09 11:12:00.0
9 A21 555 2017-03-09 11:12:10.0
10 B15 123 2017-03-09 11:14:00.0
11 C11 555 2017-03-09 11:20:00.0
I want to get the routes of cars passing through radars A21 and B15 within the same trip. For example, if the date is different for the same car_id, then it is not the same trip. Basically, I want to consider that the maximum time difference between radars A21 and B15 for the same vehicle should be 30 minutes. If it's bigger, then the trip is not the same, like for example for the car_id 777.
My final goal is to count the average number of trips per day (non-unique, so if the same car passed 2 times by the same route, then it should be calculated 2 times).
The expected result is the following one:
radar_start radar_end avg_tripscount_per_day
A21 B15 1.5
On the date 2017-03-08 there are 2 trips between radars A21 and B15 (car 777 is not considered due to the 30-minute limit), while on the date 2017-03-09 there is only 1 trip. The average is (2+1)/2 = 1.5 trips per day.
How can I get this result? Basically, I do not know how to introduce 30 minutes limit in the query and how to group rides by radar_start and radar_end.
Thanks.
Update:
The trip is registered at the date it started.
If the car was triggered by radar A21 at 2017-03-08 23:55 and by radar B15 at 2017-03-09 00:15, then it should be considered as the same trip registered for the date 2017-03-08.
In the case of ids 6 and 8, the same car 123 passed A21 twice and then went on to B15 (id 10). Only the last A21 record (id 8) should be used, giving the trip 8-10; that is, the A21 record closest before the B15 record. The interpretation is that the car passed A21 twice and turned toward B15 the second time.
select count(*) / count(distinct to_date(datetime)) as trips_per_day
from (select radar_id
,datetime
,lead(radar_id) over w as next_radar_id
,lead(datetime) over w as next_datetime
from mytable
where radar_id in ('A21','B15')
window w as
(
partition by car_id
order by datetime
)
) t
where radar_id = 'A21'
and next_radar_id = 'B15'
and datetime + interval '30' minutes >= next_datetime
;
+----------------+
| trips_per_day |
+----------------+
| 1.5 |
+----------------+
P.S.
If your version does not support intervals, the last condition could be replaced by:
and to_unix_timestamp(datetime) + 30*60 >= to_unix_timestamp(next_datetime)
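The same LEAD-based idea can be checked outside Hive with a small sketch. It assumes Python's sqlite3 module (SQLite 3.25 or later), a column renamed to seen_at, timestamps without the trailing ".0", and a seconds difference computed with SQLite's strftime('%s', ...) in place of the Hive interval arithmetic; the function name and parameters are hypothetical.

```python
import sqlite3

# Radar readings from the question (fractional seconds dropped).
READINGS = [
    (1, "A21", 123, "2017-03-08 17:31:19"),
    (2, "A21", 555, "2017-03-08 17:32:00"),
    (3, "A21", 777, "2017-03-08 17:33:00"),
    (4, "B15", 123, "2017-03-08 17:35:22"),
    (5, "B15", 555, "2017-03-08 17:34:05"),
    (5, "B15", 777, "2017-03-08 20:50:12"),
    (6, "A21", 123, "2017-03-09 11:00:00"),
    (7, "C11", 123, "2017-03-09 11:10:00"),
    (8, "A21", 123, "2017-03-09 11:12:00"),
    (9, "A21", 555, "2017-03-09 11:12:10"),
    (10, "B15", 123, "2017-03-09 11:14:00"),
    (11, "C11", 555, "2017-03-09 11:20:00"),
]


def average_trips_per_day(readings, start="A21", end="B15", max_seconds=30 * 60):
    """Count start->end hops per car within max_seconds, averaged over start days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable "
        "(id INTEGER, radar_id TEXT, car_id INTEGER, seen_at TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", readings)
    query = """
        SELECT COUNT(*) * 1.0 / COUNT(DISTINCT date(seen_at))
        FROM (SELECT radar_id, seen_at,
                     LEAD(radar_id) OVER (PARTITION BY car_id ORDER BY seen_at)
                         AS next_radar_id,
                     LEAD(seen_at) OVER (PARTITION BY car_id ORDER BY seen_at)
                         AS next_seen_at
              FROM mytable
              WHERE radar_id IN (?, ?)) AS hops
        WHERE radar_id = ?
          AND next_radar_id = ?
          AND strftime('%s', next_seen_at) - strftime('%s', seen_at) <= ?
    """
    (average,) = conn.execute(
        query, (start, end, start, end, max_seconds)
    ).fetchone()
    conn.close()
    return average


if __name__ == "__main__":
    print(average_trips_per_day(READINGS))
```

Because only A21 and B15 rows are kept before LEAD is applied, a repeated A21 reading (ids 6 and 8) pairs only its last occurrence with the following B15 reading, which matches the rule in the update.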
I missed that you're using Hive and started writing a query for SQL Server, but maybe it will still help. Try something like this:
QUERY
select radar_start,
radar_end,
convert(decimal(6,3), count(*)) / convert(decimal(6,3), count(distinct dt)) as avg_tripscount_per_day
from (
select
t1.radar_id as radar_start,
t2.radar_id as radar_end,
convert(date, t1.[datetime]) dt,
row_number() over (partition by t1.radar_id, t1.car_id, convert(date, t1.[datetime]) order by t1.[datetime] desc) rn1,
row_number() over (partition by t2.radar_id, t2.car_id, convert(date, t2.[datetime]) order by t2.[datetime] desc) rn2
from trips as t1
join trips as t2 on t1.car_id = t2.car_id
and datediff(minute,t1.[datetime], t2.[datetime]) between 0 and 30
and t1.radar_id = 'A21'
and t2.radar_id = 'B15'
)x
where rn1 = 1 and rn2 = 1
group by radar_start, radar_end
OUTPUT
radar_start radar_end avg_tripscount_per_day
A21 B15 1.5000000000
SAMPLE DATA
create table trips
(
id int,
radar_id char(3),
car_id int,
[datetime] datetime
)
insert into trips values
(1,'A21',123,'2017-03-08 17:31:19.0'),
(2,'A21',555,'2017-03-08 17:32:00.0'),
(3,'A21',777,'2017-03-08 17:33:00.0'),
(4,'B15',123,'2017-03-08 17:35:22.0'),
(5,'B15',555,'2017-03-08 17:34:05.0'),
(5,'B15',777,'2017-03-08 20:50:12.0'),
(6,'A21',123,'2017-03-09 11:00:00.0'),
(7,'C11',123,'2017-03-09 11:10:00.0'),
(8,'A21',123,'2017-03-09 11:12:00.0'),
(9,'A21',555,'2017-03-09 11:12:10.0'),
(10,'B15',123,'2017-03-09 11:14:00.0'),
(11,'C11',555,'2017-03-09 11:20:00.0')

max data in one column based on another column in sql

Hello I am very new to SQL programming, started last week. I am trying to select a userID and Maxdate from a table that looks like this for example:
Key USERID Date
1 111 12/1/2014
2 202 4/1/2014
3 111 3/8/2014
4 111 2/5/2014
5 202 2/10/2014
I want to make a query that would end up with the following results:
USERID DATE
111 12/1/2014
202 4/1/2014
Simply use a GROUP BY clause with the aggregate function MAX to achieve this:
SELECT USERID, MAX(Date) AS Date
FROM tableA
GROUP BY USERID
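A quick way to check the GROUP BY query against the sample rows is the sketch below. It assumes Python's sqlite3 module, dates rewritten from m/d/yyyy to ISO strings so that MAX compares them in date order, and the hypothetical column names row_key and event_date.

```python
import sqlite3

# Sample rows from the question; dates converted to ISO strings (assumption).
TABLE_A = [
    (1, 111, "2014-12-01"),
    (2, 202, "2014-04-01"),
    (3, 111, "2014-03-08"),
    (4, 111, "2014-02-05"),
    (5, 202, "2014-02-10"),
]


def max_date_per_user(rows):
    """Return (userid, latest date) pairs, one per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableA (row_key INTEGER, userid INTEGER, event_date TEXT)"
    )
    conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", rows)
    query = """
        SELECT userid, MAX(event_date) AS max_date
        FROM tableA
        GROUP BY userid
        ORDER BY userid
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(max_date_per_user(TABLE_A))
```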

How to find most recent date given a set of values that fulfill condition

I've been trying to build an sql query that finds from (table) the most recent date for selected id's that fulfill the condition where 'type' is in hierarchy 'vegetables'. My goal is to be able to get the whole row once max(date) and hierarchy conditions are met for each id.
Example values
ID DATE PREFERENCE AGE
123 1/3/2013 carrot 14
123 1/3/2013 apple 12
123 1/2/2013 carrot 14
124 1/5/2013 carrot 13
124 1/3/2013 apple 13
124 1/2/2013 carrot 14
125 1/4/2013 carrot 13
125 1/3/2013 apple 14
125 1/2/2013 carrot 13
I tried the following
SELECT *
FROM table
WHERE date in
(SELECT max(date) FROM (table) WHERE id in (123,124,125))
and preference in
(SELECT preference FROM (hierarchy_table)
WHERE hierarchy = 'vegetables')
and id in (123,124,125)
but it doesn't give me the most recent date for each id that meets the hierarchy conditions. (ex. in this scenario I would only get id 124)
Thank you in advance!
SELECT max(date) FROM (table) WHERE id in (123,124,125)
is giving you the max date across all rows; you need to group them by id.
Try replacing with:
SELECT max(date) FROM (table) GROUP BY id
This way you will get the max date for each id
I figured this out. Please see the query below as an example:
SELECT * FROM (table) t
WHERE t.date in
(SELECT max(date) FROM (table) sub_t where t.ID = sub_t.ID and date <= (currentdate))
and preference in
(SELECT preference FROM (hierarchy_table) WHERE hierarchy = 'vegetables')
and ID in ('124')
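For comparison, here is a sketch of one way to return the whole latest row per id among rows whose preference is in the 'vegetables' hierarchy. It is not the poster's query: it assumes Python's sqlite3 module, ISO date strings, hypothetical table names prefs and hierarchy_table, and hypothetical hierarchy contents (carrot as a vegetable, apple as a fruit). Unlike the query above, the hierarchy filter is applied inside the correlated sub-query as well, so the maximum is taken over vegetable rows only.

```python
import sqlite3

# Sample rows from the question; dates converted to ISO strings (assumption).
PREFS = [
    (123, "2013-01-03", "carrot", 14), (123, "2013-01-03", "apple", 12),
    (123, "2013-01-02", "carrot", 14), (124, "2013-01-05", "carrot", 13),
    (124, "2013-01-03", "apple", 13), (124, "2013-01-02", "carrot", 14),
    (125, "2013-01-04", "carrot", 13), (125, "2013-01-03", "apple", 14),
    (125, "2013-01-02", "carrot", 13),
]
# Hypothetical hierarchy contents; the question does not list them.
HIERARCHY = [("carrot", "vegetables"), ("apple", "fruits")]


def latest_rows_in_hierarchy(prefs, hierarchy, wanted="vegetables"):
    """Return each id's most recent row whose preference is in `wanted`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prefs (id INTEGER, pref_date TEXT, preference TEXT, age INTEGER)"
    )
    conn.execute("CREATE TABLE hierarchy_table (preference TEXT, hierarchy TEXT)")
    conn.executemany("INSERT INTO prefs VALUES (?, ?, ?, ?)", prefs)
    conn.executemany("INSERT INTO hierarchy_table VALUES (?, ?)", hierarchy)
    query = """
        SELECT p.id, p.pref_date, p.preference, p.age
        FROM prefs p
        JOIN hierarchy_table h ON h.preference = p.preference
        WHERE h.hierarchy = ?
          AND p.pref_date = (SELECT MAX(s.pref_date)
                             FROM prefs s
                             JOIN hierarchy_table hs ON hs.preference = s.preference
                             WHERE hs.hierarchy = ?
                               AND s.id = p.id)
        ORDER BY p.id
    """
    result = conn.execute(query, (wanted, wanted)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_rows_in_hierarchy(PREFS, HIERARCHY):
        print(row)
```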
Change:
max(date)
To:
-- if your date data is in mm/dd/yyyy
max( str_to_date( date, '%m/%d/%Y' ) )
OR
-- if your date data is in dd/mm/yyyy
max( str_to_date( date, '%d/%m/%Y' ) )
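The reason parsing matters can be seen in a small sketch, assuming the dates are stored as text: taking the maximum of unparsed strings compares characters, not calendar dates. The sample values and the helper name latest_date are hypothetical, and Python's datetime.strptime plays the role that str_to_date plays in the SQL above.

```python
from datetime import datetime


def latest_date(date_strings, fmt="%m/%d/%Y"):
    """Parse each string with `fmt` and return the most recent datetime."""
    return max(datetime.strptime(text, fmt) for text in date_strings)


# Hypothetical values: October 1 is later than September 5, but "9..." sorts
# after "1..." when the values are compared as plain text.
samples = ["9/5/2013", "10/1/2013"]

if __name__ == "__main__":
    print(max(samples))          # text comparison picks the wrong value
    print(latest_date(samples))  # parsed comparison picks October 1
```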

SQL return latest date for each week

I have a query that is returning the following data:
Number WeekNumber Date
1111 23 9/11/12 11:01 AM
1111 23 9/11/12 11:58 AM
2222 24 9/17/12 10:14 AM
2222 24 9/18/12 9:52 AM
2222 24 9/19/12 9:46 AM
2222 24 9/20/12 9:42 AM
However what I want is to get the latest date for each week, the result should be:
Number WeekNumber Date
1111 23 9/11/12 11:58 AM
2222 24 9/20/12 9:42 AM
What could I use to obtain this? I have tried MAX(Date), but that returns the single latest date, not the latest date for each week. I have also tried DISTINCT, but I couldn't make it work with the WHERE clause.
Thank you very much.
SELECT Number, WeekNumber, MAX([Date])
FROM TableName
GROUP BY Number, WeekNumber
You're close with the MAX(Date), however you then need to GROUP BY the other details.
You say the apparent 1:1 relationship between Number and WeekNumber is an artefact of the data you posted and doesn't really apply in your actual data. If that is the case a straightforward GROUP BY won't provide your answer.
If you are using a flavour of RDBMS which supports analytic functions this will work:
select distinct number
, weeknumber
, max(date) over (partition by weeknumber) as weekly_max_date
from yourtable;
If you don't have analytics you will need to join to an aggregating sub-query:
select distinct t.number
, t.weeknumber
, q.weekly_max_date
from yourtable t
join ( select weeknumber, max(date) as weekly_max_date
from yourtable
group by weeknumber) q
on (q.weeknumber = t.weeknumber)
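To see that the two forms agree, here is a sketch that runs both against the sample rows. It assumes Python's sqlite3 module (SQLite 3.25 or later for the analytic form), timestamps rewritten as zero-padded ISO strings, and the hypothetical column names num, week_number and ts.

```python
import sqlite3

# Sample rows from the question; timestamps converted to ISO text (assumption).
ROWS = [
    (1111, 23, "2012-09-11 11:01"),
    (1111, 23, "2012-09-11 11:58"),
    (2222, 24, "2012-09-17 10:14"),
    (2222, 24, "2012-09-18 09:52"),
    (2222, 24, "2012-09-19 09:46"),
    (2222, 24, "2012-09-20 09:42"),
]

ANALYTIC = """
    SELECT DISTINCT num, week_number,
           MAX(ts) OVER (PARTITION BY week_number) AS weekly_max_date
    FROM yourtable
    ORDER BY num, week_number
"""

JOINED = """
    SELECT DISTINCT t.num, t.week_number, q.weekly_max_date
    FROM yourtable t
    JOIN (SELECT week_number, MAX(ts) AS weekly_max_date
          FROM yourtable
          GROUP BY week_number) q
      ON q.week_number = t.week_number
    ORDER BY t.num, t.week_number
"""


def weekly_latest(rows, query):
    """Run one of the two queries above and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (num INTEGER, week_number INTEGER, ts TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(weekly_latest(ROWS, ANALYTIC))
    print(weekly_latest(ROWS, JOINED))
```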