SQL aggregate based on date range - sql

I have the following data in an MSSQL table. The requirement is to group each user's records whose start/end time ranges fall within one another, and sum up the Rate field.
Is there any way to achieve this with a query, on the fly?
Row Data
-----------------------------------------------------
RawId  Start Time      End Time        User    Rate
1      1/9/2021 14:29  1/9/2021 14:40  User-1  10
2      1/9/2021 10:37  1/9/2021 14:00  User-2  20
3      1/9/2021 14:03  1/9/2021 14:59  User-2  30
4      1/9/2021 8:51   1/9/2021 14:39  User-1  40
5      1/9/2021 14:02  1/9/2021 14:59  User-2  50
Expected Output
-----------------------------------------------------
ProID  Start Time      End Time        User    RateTotal
xx1    1/9/2021 14:29  1/9/2021 14:40  User-1  50
xx2    1/9/2021 14:02  1/9/2021 14:59  User-2  80
xx3    1/9/2021 10:37  1/9/2021 14:00  User-2  20
Business logic
ProID xx1: RawID 1 & 4, belong to User-1 and RawID 1 start & end time (14:29-14:40) falls within RawID 4 (08:51-14:39). In this case rates have to be added up and show only one record.
ProID xx2: RawID 3 & 5, belong to User-2 and RawID 3 start & end time (14:03-14:59) falls within RawID 5 (14:02-14:59). In this case rates have to be added up and show only one record.
ProID xx3: RawID 2 also belongs to User-2, but its start & end time (10:37-14:00) doesn't fall within any other User-2 record. Hence it is treated as a separate row.
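The business logic above can be sketched outside the database to make the grouping rule concrete. This is a minimal Python sketch, not a T-SQL solution, and it rests on two assumptions that are not stated in the question: "falls within" is read as "overlaps" (RawID 1 ends at 14:40, one minute after RawID 4), and the dates are month/day/year. The names `ROWS`, `parse` and `group_overlapping` are made up for illustration.

```python
from datetime import datetime

# Sample rows from the question: (RawId, start, end, user, rate).
ROWS = [
    (1, "1/9/2021 14:29", "1/9/2021 14:40", "User-1", 10),
    (2, "1/9/2021 10:37", "1/9/2021 14:00", "User-2", 20),
    (3, "1/9/2021 14:03", "1/9/2021 14:59", "User-2", 30),
    (4, "1/9/2021 8:51", "1/9/2021 14:39", "User-1", 40),
    (5, "1/9/2021 14:02", "1/9/2021 14:59", "User-2", 50),
]


def parse(text):
    # Assumption: month/day/year with 24-hour times.
    return datetime.strptime(text, "%m/%d/%Y %H:%M")


def group_overlapping(rows):
    """Merge each user's overlapping intervals and sum their rates."""
    # Sort by user, then by start time, so overlapping rows become neighbours.
    ordered = sorted(
        (user, parse(start), parse(end), raw_id, rate)
        for raw_id, start, end, user, rate in rows
    )
    groups = []
    for user, start, end, raw_id, rate in ordered:
        last = groups[-1] if groups else None
        if last and last["user"] == user and start <= last["end"]:
            # Overlaps the current group: extend it and add the rate.
            last["end"] = max(last["end"], end)
            last["raw_ids"].append(raw_id)
            last["rate_total"] += rate
        else:
            groups.append({"user": user, "start": start, "end": end,
                           "raw_ids": [raw_id], "rate_total": rate})
    return groups


for group in group_overlapping(ROWS):
    print(group["user"], group["raw_ids"], group["rate_total"])
```

Under these assumptions the totals match the expected output (50, 20 and 80), but each group reports the merged span (earliest start, latest end), which differs from the Start Time and End Time shown for xx1.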

with cte as
(
select Rate as Rate,dateadd(hour,datediff(HOUR,0,StartTime),0) as starttime,
dateadd(HOUR,DATEDIFF(hour,0,endtime),0) as EndTime
from Row_Data
)
select sum(rate) as Rate,StartTime,Endtime from cte
group by StartTime,EndTime
order by starttime desc

Something like (I'm assuming you made a typo in the sample expected data and the starts/ends are meant to be the 0th minutes)
SELECT SUM(Rate),
Trunc_Start,
Trunc_End
FROM (
SELECT dateadd(hour, datediff(hour, 0, Start_Time), 0) AS Trunc_Start,
dateadd(hour, datediff(hour, 0, End_Time) + 1, 0) AS Trunc_End,
Rate
FROM SOME_TABLE
) AS t -- SQL Server requires an alias on a derived table
GROUP BY Trunc_Start,
Trunc_end

select sum(Rate),StartTime ,EndTime from table
group by StartTime ,EndTime

Related

Price Change History in SQL Server [duplicate]

I have a table in SQL Server with sales price data of items on different dates like this:
Item  Date        Price
1     2021-05-01  200
1     2021-06-11  210
1     2021-06-27  225
1     2021-08-01  250
2     2021-02-10  600
2     2021-04-21  650
2     2021-06-17  675
2     2021-07-23  700
I'm creating a table that specifies the start and end date of prices as below:
Item  DateStart   Price  DateEnd
1     2021-05-01  200    2021-06-10
1     2021-06-11  210    2021-06-26
1     2021-06-27  225    2021-07-31
1     2021-08-01  250    Today date
2     2021-02-10  600    2021-04-20
2     2021-04-21  650    2021-06-16
2     2021-06-17  675    2021-07-22
2     2021-07-23  700    Today date
As you can see, the end date is one day less than the next price change date. I also have a calendar table called "DimDates" with one row per day. I had hoped to use joins but it doesn't do what I thought it would do. Any suggestions on how to write the query? I'm using SQL Server 2016.
We can use LEAD() here along with DATEADD():
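WITH cte AS (
-- the default is tomorrow, so that after subtracting one day the latest price ends today
SELECT *, DATEADD(day, -1, LEAD(Date, 1, DATEADD(day, 1, CAST(GETDATE() AS date)))
OVER (PARTITION BY Item
ORDER BY Date)) AS LastDate
FROM yourTable
)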
SELECT Item, Date AS DateStart, Price, LastDate AS DateEnd
FROM cte
ORDER BY Item, Date;
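To make the LEAD() idea concrete, here is a minimal Python sketch of the same logic: within each item, the end date is the next change date minus one day, and the latest price ends "today". The names `PRICES` and `price_periods`, and the fixed `today` argument, are assumptions made for illustration only.

```python
from datetime import date, timedelta
from itertools import groupby

# Item 1 from the question: (item, change_date, price).
PRICES = [
    (1, date(2021, 5, 1), 200),
    (1, date(2021, 6, 11), 210),
    (1, date(2021, 6, 27), 225),
    (1, date(2021, 8, 1), 250),
]


def price_periods(rows, today):
    """Return (item, date_start, price, date_end) for every price change."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for item, group in groupby(ordered, key=lambda r: r[0]):
        changes = list(group)
        for i, (_, start, price) in enumerate(changes):
            if i + 1 < len(changes):
                # Same role as LEAD(Date) minus one day.
                end = changes[i + 1][1] - timedelta(days=1)
            else:
                # No later change: the price is still valid today.
                end = today
            result.append((item, start, price, end))
    return result


for row in price_periods(PRICES, today=date(2021, 9, 1)):
    print(row)
```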

Subtract /Loop through rows in HIVE Query

I have a data in table like below
ID status timestamp
ABC login 1/1/2020 12:00
ABC lock 1/1/2020 13:19
ABC unlock 1/1/2020 13:52
ABC Disconnect 1/1/2020 15:52
ABC Reconnect 1/1/2020 15:55
ABC lock 1/1/2020 16:25
ABC unlock 1/1/2020 16:30
ABC logoff 1/1/2020 17:00
ABC login 2/1/2020 12:00
ABC lock 2/1/2020 13:19
ABC unlock 2/1/2020 13:52
ABC lock 2/1/2020 16:22
ABC logoff 2/1/2020 17:00
I need to find an employee's effective working hours on a particular date, i.e. the time he has really worked: the total time minus the periods when the status was lock or disconnect.
Example: for employee ABC on 01-JAN-2020, his system was idle between 13:19 - 13:52 (33 minutes) and again from 15:52 - 15:55 (3 minutes).
Hence, out of the total working time of 5 hrs (the time between login and logoff), his effective time would be 5 hr - 36 minutes = 4 hr 24 minutes.
Similarly for 01-FEB-2020.
You can use window functions, then aggregation:
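select
    id,
    to_date(timestamp) timestamp_day,
    -- count only intervals that begin with an active status; idle intervals contribute 0
    sum(case when lower(status) in ('lock', 'disconnect') then 0 else duration end) / 60 / 60 hours_worked
from (
    select t.*,
        -- seconds until the next event of the same employee on the same day
        unix_timestamp(lead(timestamp) over(partition by id, to_date(timestamp) order by timestamp))
            - unix_timestamp(timestamp) duration
    from mytable t
) t
group by id, to_date(timestamp)
order by id, to_date(timestamp)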
In the subquery, we use lead() to retrieve the timestamp of the "next" action, so we can compute the duration of the current step. The outer query aggregates by employee and day, and does the final computation of working hours according to your business rule.
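The lead-then-aggregate idea can be checked by hand with a small Python sketch. It is an illustration under stated assumptions, not Hive code: timestamps are taken to be month/day/year (the question calls the second day 01-FEB-2020), each event lasts until the next event of the same employee on the same day, and only intervals that start with a non-idle status count as worked time. `EVENTS`, `IDLE` and `effective_minutes` are made-up names.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (id, status, timestamp).
EVENTS = [
    ("ABC", "login", "1/1/2020 12:00"), ("ABC", "lock", "1/1/2020 13:19"),
    ("ABC", "unlock", "1/1/2020 13:52"), ("ABC", "Disconnect", "1/1/2020 15:52"),
    ("ABC", "Reconnect", "1/1/2020 15:55"), ("ABC", "lock", "1/1/2020 16:25"),
    ("ABC", "unlock", "1/1/2020 16:30"), ("ABC", "logoff", "1/1/2020 17:00"),
    ("ABC", "login", "2/1/2020 12:00"), ("ABC", "lock", "2/1/2020 13:19"),
    ("ABC", "unlock", "2/1/2020 13:52"), ("ABC", "lock", "2/1/2020 16:22"),
    ("ABC", "logoff", "2/1/2020 17:00"),
]

# Statuses after which the employee is not working.
IDLE = {"lock", "disconnect", "logoff"}


def effective_minutes(events):
    """Sum, per (employee, day), the minutes spent in a non-idle status."""
    parsed = sorted(
        (emp, datetime.strptime(ts, "%m/%d/%Y %H:%M"), status.lower())
        for emp, status, ts in events
    )
    totals = defaultdict(float)
    # Pair every event with the following one, like lead() over the ordered rows.
    for (emp, start, status), (next_emp, end, _) in zip(parsed, parsed[1:]):
        same_day = emp == next_emp and start.date() == end.date()
        if same_day and status not in IDLE:
            totals[(emp, start.date())] += (end - start).total_seconds() / 60
    return dict(totals)


for key, minutes in sorted(effective_minutes(EVENTS).items()):
    print(key, minutes)
```

For the first day this gives 259 minutes (4 hr 19 min) rather than the 4 hr 24 min in the question, because it also excludes the 16:25 - 16:30 lock that the worked example leaves out.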

Replace of self join in SQL Server 2012

I have a scenario on following tables:
SampleData: (has one hour periodic values of a Prod.)
TST_DATE CLK_LITRE_WT
---------------------------------
09/15/2019 17:15 1280 <-- current time value
09/15/2019 16:15 1300
09/15/2019 15:15 1190
09/15/2019 14:15 1200
09/15/2019 13:15 1200
CLK_LITRE_WT is the name of one of 13 products, so there are 14 columns in total, but that doesn't matter here. Note that the number of rows (hours) to return is specified by the user.
SettingMaster:
UserCode LastRunDate
------------------------------------
aa 2019-09-15 15:18:01.350
LastRunDate is simply the last time the user reached the DB. I need a query that produces the following expected result:
TST_DATE CLK_LITRE_WT_Real CLK_LITRE_WT_Seen
-----------------------------------------------------------
09/15/2019 17:15 1280 1190 <-- value of previous live record.
09/15/2019 16:15 1300 1190 <-- value of previous live record.
09/15/2019 15:15 1190 1190 <-- Last seen record by User.
09/15/2019 14:15 1200 1200
09/15/2019 13:15 1200 1200
I tried a self join and LEAD/LAG (I will share the query soon), but I did not get the expected result. So I need your help on how to get it.
Edit 1:
I recreated the LEAD/LAG query that I tried yesterday.
select TST_DATE, CLK_LITRE_WT CLK_LITRE_WT_Real, lead (CLK_LITRE_WT) over (order by convert(datetime,sd.TST_DATE + ':15.00') desc) CLK_LITRE_WT_Seen
from SettingMaster sm left join SampleData sd
on convert(datetime,sd.TST_DATE + ':15.00') between dateadd (hour, -(4), '2019-09-15 17:20:02.733') and '2019-09-15 17:20:02.733'
where sm.UserCode = 'aa'
order by convert(datetime,sd.TST_DATE + ':15.00') desc
Here yesterday's date and time are passed as static values, because the given data is from yesterday. The result is:
TST_DATE CLK_LITRE_WT_Real CLK_LITRE_WT_Seen
09/15/2019 17:15 1280 1300
09/15/2019 16:15 1300 1190 <-- value of previous live record
09/15/2019 15:15 1190 1200 <-- here should be '1190' as Real
09/15/2019 14:15 1200 NULL
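The rule implied by the expected result can be written down as a short Python sketch. This is one reading of the question, not a confirmed answer: rows at or before LastRunDate show their own value, and later rows show the value of the latest row the user had already seen. `SAMPLES` and `seen_values` are hypothetical names.

```python
from datetime import datetime

# SampleData rows from the question: (TST_DATE, CLK_LITRE_WT).
SAMPLES = [
    (datetime(2019, 9, 15, 17, 15), 1280),
    (datetime(2019, 9, 15, 16, 15), 1300),
    (datetime(2019, 9, 15, 15, 15), 1190),
    (datetime(2019, 9, 15, 14, 15), 1200),
    (datetime(2019, 9, 15, 13, 15), 1200),
]


def seen_values(samples, last_run):
    """Return (tst_date, real, seen) rows, newest first."""
    already_seen = [row for row in samples if row[0] <= last_run]
    # Value of the latest record at or before the user's last run, if any.
    last_seen = max(already_seen, key=lambda row: row[0])[1] if already_seen else None
    return [
        (ts, value, value if ts <= last_run else last_seen)
        for ts, value in sorted(samples, key=lambda row: row[0], reverse=True)
    ]


for row in seen_values(SAMPLES, last_run=datetime(2019, 9, 15, 15, 18, 1)):
    print(row)
```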

Find first and last trip with same source-destination by day

I have a table with the source, dest, and time of each trip. I want to find a list of all the sources that had the same destination for the first and last trip of the day. The table looks like this:
Source Dest Trip_Time
1 2 2/1/2019 6:00
2 3 2/1/2019 7:00
4 2 2/1/2019 7:00
1 3 2/1/2019 8:00
2 1 2/1/2019 9:00
3 1 2/1/2019 9:00
4 1 2/1/2019 9:00
1 4 2/1/2019 15:00
2 1 2/1/2019 17:30
3 5 2/1/2019 17:30
4 5 2/1/2019 17:30
2 3 2/1/2019 19:45
3 1 2/1/2019 19:45
5 2 2/1/2019 19:45
1 4 2/2/2019 17:00
1 3 2/2/2019 21:00
I have figured out a query to get what I wanted, but I was wondering if there is more optimal way of achieving the result, especially the one that'll work with millions of rows.
select * from
(select source, max(first_trip) ft, max(last_trip) lt from
(select source, case when a.rn_asc = 1 then dest end as first_trip,
case when a.rn_desc = 1 then dest end as last_trip from (select source, dest, trip_time,
Row_Number() Over (partition by source order by trip_time desc) as rn_desc,
Row_Number() Over (partition by source order by trip_time asc) as rn_asc from trips) a
where a.rn_desc = 1 or a.rn_asc = 1) b
group by b.source) c where ft = lt
Expected result:
caller fc lc
2 3 3
3 1 1
5 2 2
One method is to use first_value() and last_value(). Date functions are notoriously dependent on the database, but you need to extract the date from the trip_time. The following illustrates the idea but the date function could differ on your database:
select distinct source, cast(trip_time as date)
from (select t.*,
first_value(dest) over (partition by source, cast(trip_time as date) order by trip_time asc) as first_dest,
first_value(dest) over (partition by source, cast(trip_time as date) order by trip_time desc) as last_dest
from trips t
) t
where first_dest = last_dest;
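As a cross-check of the first/last logic, here is a minimal single-pass Python sketch that keeps only the earliest and latest trip per (source, day). It assumes month/day/year timestamps; `TRIPS` and `same_first_last_dest` are names invented for this example.

```python
from datetime import datetime

# Sample rows from the question: (source, dest, trip_time).
TRIPS = [
    (1, 2, "2/1/2019 6:00"), (2, 3, "2/1/2019 7:00"), (4, 2, "2/1/2019 7:00"),
    (1, 3, "2/1/2019 8:00"), (2, 1, "2/1/2019 9:00"), (3, 1, "2/1/2019 9:00"),
    (4, 1, "2/1/2019 9:00"), (1, 4, "2/1/2019 15:00"), (2, 1, "2/1/2019 17:30"),
    (3, 5, "2/1/2019 17:30"), (4, 5, "2/1/2019 17:30"), (2, 3, "2/1/2019 19:45"),
    (3, 1, "2/1/2019 19:45"), (5, 2, "2/1/2019 19:45"), (1, 4, "2/2/2019 17:00"),
    (1, 3, "2/2/2019 21:00"),
]


def same_first_last_dest(trips):
    """Return (source, day, dest) where the day's first and last dest match."""
    first_last = {}
    for source, dest, text in trips:
        when = datetime.strptime(text, "%m/%d/%Y %H:%M")
        key = (source, when.date())
        entry = (when, dest)
        first, last = first_last.get(key, (entry, entry))
        # Track only the earliest and the latest trip seen so far.
        first_last[key] = (min(first, entry), max(last, entry))
    return sorted(
        (source, day, first[1])
        for (source, day), (first, last) in first_last.items()
        if first[1] == last[1]
    )


for source, day, dest in same_first_last_dest(TRIPS):
    print(source, day, dest)
```

On the sample data this returns sources 2, 3 and 5, matching the expected result in the question.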

Adding a reference data to the table column from different table line

I have an event table with following columns:
sequence (int)
DeviceID (varchar(8))
time_start (datetime)
DeviceState (smallint)
time_end (datetime)
All columns except time_end are populated with data (my current time_end column is NULL throughout the table). What I need to do is populate the time_end column with the event closure time, which is the time when the next event from the same device occurred.
Here is an example data model how it should work at the end:
sequence DeviceID time_start DeviceState time_end
--------------------------------------------------------------------------------------
1 000012A7 2010-10-31 12:00 14 2010-10-31 12:10
2 000012A7 2010-10-31 12:10 18 2010-10-31 12:33
3 000012A8 2010-10-31 12:20 16 2010-10-31 13:01
4 000012A7 2010-10-31 12:33 13 2010-10-31 12:47
5 000012A7 2010-10-31 12:47 18 2010-10-31 13:20
6 000012A8 2010-10-31 13:01 20 2010-10-31 13:23
7 000012A7 2010-10-31 13:20 05 2010-10-31 14:12
8 000012A8 2010-10-31 13:23 32 2010-10-31 14:15
9 000012A7 2010-10-31 14:12 12
10 000012A8 2010-10-31 14:15 35
The idea is that for each record in the table I need to select the record with the next higher sequence for the same device and update time_end with the time_start of that record.
With this I'll be able to track the time period of each event.
I was thinking of doing this with a function call, but I have two main difficulties:
1. getting the data from e.g. sequence=2 and updating the time_end of sequence=1
2. creating a function which will do this continuously as new records are added to the table
I'm quite new to SQL and I'm quite lost on what else is possible. As far as I understand, I should use a function that links the records together, but my current knowledge is limiting me in doing that.
I hope someone can give me some guidance on which direction to go and some feedback on whether I'm on the right track. Any supporting articles would be very much appreciated.
View:
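CREATE VIEW tableview AS
with timerank AS
(
SELECT mytable.*, ROW_NUMBER() OVER (PARTITION BY DeviceID ORDER BY time_start) as row
FROM THE_TABLE mytable
)
-- list the columns explicitly: tstart.* would also return the table's own (NULL) time_end column
SELECT tstart.sequence, tstart.DeviceID, tstart.time_start, tstart.DeviceState, tend.time_start AS time_end
FROM timerank tstart
LEFT JOIN timerank tend ON tstart.row = tend.row - 1
AND tstart.DeviceID = tend.DeviceID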
Edit: I see your deviceID requirement now.
@OMG Ponies: I think the formatting will be a bit better here:
UPDATE YOUR_TABLE
SET time_end = (SELECT TOP 1
t.time_start
FROM YOUR_TABLE t
WHERE t.DeviceID = YOUR_TABLE.DeviceID
AND t.time_start > YOUR_TABLE.time_start
ORDER BY t.time_start ASC)
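The effect of both the view and the UPDATE can be illustrated with a short Python sketch: each event's time_end is the time_start of the next event from the same device, and the latest event per device stays open. It assumes time_start values are distinct per device and sort correctly as ISO-style strings; `EVENTS` and `fill_time_end` are hypothetical names.

```python
# A few rows from the question: sequence, DeviceID, time_start.
EVENTS = [
    {"sequence": 1, "DeviceID": "000012A7", "time_start": "2010-10-31 12:00"},
    {"sequence": 2, "DeviceID": "000012A7", "time_start": "2010-10-31 12:10"},
    {"sequence": 3, "DeviceID": "000012A8", "time_start": "2010-10-31 12:20"},
    {"sequence": 4, "DeviceID": "000012A7", "time_start": "2010-10-31 12:33"},
    {"sequence": 6, "DeviceID": "000012A8", "time_start": "2010-10-31 13:01"},
]


def fill_time_end(events):
    """Return copies of the events with time_end set to the device's next time_start."""
    next_start = {}  # DeviceID -> time_start of the following event
    filled = []
    # Walk from newest to oldest so the "next" event is always already known.
    for event in sorted(events, key=lambda e: e["time_start"], reverse=True):
        filled.append({**event, "time_end": next_start.get(event["DeviceID"])})
        next_start[event["DeviceID"]] = event["time_start"]
    return sorted(filled, key=lambda e: e["sequence"])


for event in fill_time_end(EVENTS):
    print(event["sequence"], event["DeviceID"], event["time_start"], event["time_end"])
```

The most recent event of each device keeps time_end as None, which corresponds to the NULL left by the LEFT JOIN and by the correlated subquery.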