How to calculate time between rows with condition?

How to calculate time between rows with condition? - sql

I have a table in SQL Server about how people going in and out of building.
user_id
datetime
direction
1
27.09.2022 10:30
in
1
27.09.2022 12:30
out
1
27.09.2022 14:30
in
1
27.09.2022 15:35
out
2
27.09.2022 11:30
in
2
27.09.2022 13:20
out
2
27.09.2022 15:00
in
2
27.09.2022 15:40
out
3
27.09.2022 11:45
in
3
27.09.2022 11:46
in
3
27.09.2022 15:40
out
3
27.09.2022 15:47
in
3
27.09.2022 18:00
out
I need to calculate how much time each user spent inside the building by days.
For example, on 27th Sep user #1 spent 3 hours 5 minutes. User #2 spent 2 hours 30 minutes.
There is also a bug that may spoil the results - sometimes I may have two 'in' or two 'out' in a row, like in case of user #3. I understand the nature of such bug, and know I only have to keep last of two same rows (in fact user #3 entered in 11:46, not 11:45). Does anyone have an idea how to solve that?

select user_id
,sum(time_spent) as time_spent_minutes
from (
select *
,datediff(minute, lag(case when direction = 'in' then datetime end) over(partition by user_id order by datetime), datetime) as time_spent
from t
) t
group by user_id
user_id
time_spent_minutes
1
185
2
150
Fiddle

The window functions would be a nice fit here.
Example or Updated dbFiddle
Select user_id
,Duration = convert(time(0),dateadd(second,sum(Secs),0))
From (
Select user_id
,Secs = datediff(second,case when direction ='in'
and lead([direction],1) over (partition by user_id order by datetime)='out'
then [datetime]
end
,lead([datetime],1) over (partition by user_id order by datetime))
From YourTable
) A
Group By user_id
Results
user_id Duration
1 03:05:00 -- << Check your desired results
2 02:30:00
3 06:07:00

Related

How do I calculate the amount of time between multiple datetimes in multiple rows in sql

I've done a search but I can't find any that are exactly what I need. I need to be able to calculate the amount of time that someone has been in the building over time in a sql query (T-SQL on SQL Server). The data looks like this:
UserId Clocking Status
------------------------------
1 01/12/2020 09:00 In
2 01/12/2020 09:12 In
1 01/12/2020 09:25 Out
3 01/12/2020 10:00 In
2 01/12/2020 10:45 Out
3 01/12/2020 13:11 Out
1 03/12/2020 11:14 In
2 03/12/2020 15:56 In
1 03/12/2020 16:04 Out
2 03/12/2020 17:00 Out
I want the output to look like this:
UserId TimeInBuilding
----------------------
1 03:35
2 05:25
3 03:11

Assuming that the ins/outs are perfectly interleaved, you can do this by assigning the next "out" time to the "in" time and aggregating:
select userid,
sum(datediff(second, clocking, out_time)) / (60.0 * 60) as decimal_hours
from (select t.*,
lead(clocking) over (partition by userid order by clocking) as out_time
from t
) t
where status = 'In'
group by userid;
You can convert this to HH:MM format using:
select userid,
convert(varchar(5),
convert(time,
dateadd(second,
sum(datediff(second, clocking, out_time),
0)
)
) as hhmm
from (select t.*,
lead(clocking) over (partition by userid order by clocking) as out_time
from t
) t
where status = 'In'
group by userid;
Here is a db<>fiddle.

Join sum to closest timestamp once up to interval cap

I am trying to join a site_interactions table with a store_transactions table. For this, I want that the store_transactions.sales_amount for a given username gets attached to the closest site_interactions.timestamp match, at most one time and up to 7 days of the site_interactions.timestamp variable.
site_interaction table:
username timestamp
John 01.01.2020 15:00:00
John 02.01.2020 11:30:00
Sarah 03.01.2020 12:00:00
store_transactions table:
username timestamp sales_amount
John 02.01.2020 16:00:00 45
John 03.01.2020 16:00:00 70
John 09.01.2020 16:00:00 15
Sarah 02.01.2020 09:00:00 35
Tim 02.01.2020 10:00:00 60
Desired output:
username timestamp sales_amount
John 01.01.2020 15:00:00 NULL
John 02.01.2020 11:30:00 115
Sarah 03.01.2020 12:00:00 NULL
Explanation:
John has 3 entries/transactions in the store_transactions table. The first and the second purchase were realized within the 7 days interval/limit, and the sum of these two transactions (45 + 70 = 115) were attached/joined to the closest and nearest match only once - i.e. to John's second interaction (timestamp = 02.01.2020 11:30:00). John's third transactions was not attached to any site interaction, because it exceeds the 7 days interval (including the time).
Sarah has one transaction realized before her interaction with the site. Thus her sales_amount of 35 was not attached to the site_interaction table.
Last, Tim's transaction was not attached anywhere - because this username does not show in the site_interaction table.
Here a link of the tables: https://rextester.com/RKSUK73038
Thanks in advance!

Below is for BigQuery Standard SQL
#standardSQL
select i.username, i.timestamp,
sum(sales_amount) as sales_amount
from (
select username, timestamp,
ifnull(lead(timestamp) over(partition by username order by timestamp), timestamp_add(timestamp, interval 7 day)) next_timestamp
from `project.dataset.site_interaction`
) i
left join `project.dataset.store_transactions` t
on i.username = t.username
and t.timestamp >= i.timestamp
and t.timestamp < least(next_timestamp, timestamp_add(i.timestamp, interval 7 day))
group by username, timestamp
if to apply to sample data from your question - output is

I need to calculate the time between dates in different lines. (PLSQL)

I have a table where I store all status changes and the time that it has been made. So, when I search the order number on the table of times I get all the dates of my changes, but what I realy want is the time (hours/minutes) that the order was in each status.
The table of time seems like this
ID_ORDER | Status | Date
1 Waiting 27/09/2017 12:00:00
1 Late 27/09/2017 14:00:00
1 In progress 28/09/2017 08:00:00
1 Validating 30/09/2017 14:00:00
1 Completed 30/09/2017 14:00:00
Thanks!

Use lead():
select t.*,
(lead(date) over (partition by id_order order by date) - date) as time_in_order
from t;

SQL Server Query to Get Available Employee based on Schedule

I have two tables, parent table Employees and child table Employees_Availability, like this:
Employees table:
EmployeesID Name Group Availability_Order Available
--------------------------------------------------------------
1 Steve Sales 1 TRUE
2 Ann Sales 2 TRUE
3 Jack Sales 3 FALSE
4 Sandy Support 4 TRUE
5 Bill Support 5 TRUE
6 John Support 6 TRUE
Employees_Schedule table:
EmployeesID Day From To
----------------------------------------------
1 Monday 8:00 12:00
1 Monday 13:00 17:00
2 Monday 12:00 13:00
3 Tuesday 7:30 11:30
3 Wednesday 7:30 11:30
3 Friday 14:30 16:30
4 Tuesday 11:30 17:00
5 Wednesday 8:00 12:00
5 Wednesday 13:00 17:00
5 Thursday 12:00 13:00
5 Friday 7:30 11:30
6 Friday 12:00 13:00
How can I create a query that given date/time and Group return first available employee? I am using SQL Server 2012. Here is what I started doing but got stuck:
Select top 1
Name
from
Empolyees e join? Employees_Schedule s
on
e.employeesID = s.EmployeesID
where
e.group = 'Sales'
and DATENAME(Weekday,'5/24/2016 10:00') = s.Day
and CAST('5/24/2016 10:00' AS TIME) 'hh:mm' >= CAST(s.from AS TIME)
and CAST('5/24/2016 10:00' AS TIME) 'hh:mm' <= CAST(s.to AS TIME)
order by
e.availability_order
Thanks

Have you looked into Window Function and CTE? You could easily achieve this with, for example..
Row_Number() OVER(PARTITION BY day ORDER BY starttime ASC) as ColumnName
Combined with predicate
WHERE columnName = 1 AND groupName = 'groupname'
For detail, read BOL on OVER()Clause here, and CTE here.

It looks like you're close. If you wrap the main part of your SQL in a Common Table Expression and use the row_number() window function then you can find the first available:
;with cte as (
Select top 1
Name,
row_number() over (order by ea.From) PrioritySequence
from
Empolyees e join? Employees_Schedule s
on
e.employeesID = s.EmployeesID
where
e.group = 'Sales'
and DATENAME(Weekday,'5/24/2016 10:00') = s.Day
and CAST('5/24/2016 10:00' AS TIME) 'hh:mm' >= CAST(s.from AS TIME)
and CAST('5/24/2016 10:00' AS TIME) 'hh:mm' <= CAST(s.to AS TIME)
)
select *
from cte
where PrioritySequence = 1

SQL Find latest date where condition doesn't exist

I'm quite new to SQL and working on a query that has been thoroughly defeating me for a while now. I come to this site often - it's a terrific resource thanks to all of your expertise, and generally I find what I need, but this time I think my query is a bit too specific and I've not found something applicable. Could someone give me a hand, please?
I have two tables: one Client table and one Contact (aka appointment) table. What I need to find are all of the clients' most recent appointment days (before a certain date, in this case '11/08/2015') where the Outcome for any appointment on that day is NOT a '2'. Each client may have more than one appointment on a single day, and an Outcome of '2' for any of those appointments means that we have to ignore the whole day and move back to the next most recent day..
For example, Client '3' should have a returned appointment date of '01/07/2015' (and just one row) and not the two rows for '16/07/2015', because one of the appointments on '16/07/2015' had an Outcome of '2'. All other values for Outcome are acceptable (including NULL), just not '2'.
The multiple appointment on the same day bit is the part that I'm finding tricky - I can find the latest appointment day using a Select MAX (or TOP 1) statement, but when I add on a "<> '2'" it still continues to return the same days that may have an Outcome '2' because other appointments on that same day have another Outcome. I've been trying to play around with my tables and GROUP BY and NOT EXIST, but I don't seem to be making any headway.
Contact
ClientID AppDate Outcome
1 30/07/2015 17:00 2
1 01/07/2015 17:00 3
2 03/03/2015 16:00 NULL
2 01/03/2015 16:00 NULL
3 16/07/2015 15:40 6
3 16/07/2015 15:40 2
3 01/07/2015 15:40 3
4 05/08/2015 12:30 6
4 05/08/2015 12:30 2
4 01/08/2015 12:30 3
5 23/07/2015 15:30 2
5 23/07/2015 15:30 NULL
5 01/07/2015 15:30 4
6 20/07/2015 10:10 NULL
6 20/07/2015 10:10 2
6 01/07/2015 10:10 6
7 23/07/2015 15:40 2
7 01/07/2015 15:40 1
7 23/06/2015 15:40 8
8 13/07/2015 11:30 2
8 13/07/2015 11:30 6
8 01/07/2015 11:30 2
8 01/06/2015 11:30 3
9 29/07/2015 17:00 3
9 29/07/2015 17:00 6
10 14/07/2015 11:00 NULL
10 01/07/2015 11:00 5
Client
ClientID Forename Surname
1 I B
2 J B
3 S C
4 S T
5 P C
6 K D
7 P E
8 P H
9 S F
10 A G
Apologies if I'm missing something glaringly obvious! Thanks for reading and for any responses. I attach my truncated query for your general amusement...
SELECT
cli.ClientID ,
cli.Forename ,
cli.Surname ,
con.AppDate ,
con.Outcome
FROM
Client AS cli
INNER JOIN
Contact AS con
ON cli.ClientID = con.ClientID
AND con.AppDate =
(SELECT MAX(con1.AppDate)
FROM Contact AS con1
WHERE con.ClientID = con1.ClientID
AND con1.AppDate < '11/08/2015 00:00:00'
AND con1.Outcome <> '2')
ORDER BY
cli.ClientID
EDIT:
Thank you to Mr Linoff for the Cross Apply query, it worked perfectly.
Sorry that I didn't include the expected output earlier. For reference (for anyone else working with a similar problem in future) I was looking to obtain:
Appointments
Client ID Act Date and Time Outcome
1 01/07/2015 17:00 3
2 03/03/2015 16:00 NULL
3 01/07/2015 15:40 3
4 01/08/2015 12:30 3
5 01/07/2015 15:30 4
6 01/07/2015 10:10 6
7 01/07/2015 15:40 1
8 01/06/2015 11:30 3
9 29/07/2015 17:00 3
9 29/07/2015 17:00 6
10 14/07/2015 11:00 NULL

I think cross apply is the best approach to this:
select c.*, con.*
from client c cross apply
(select top 1 con.*
from (select con.*,
sum(case when Outcome = 2 then 1 else 0 end) over (partition by ClientId, AppDate) as num2s
from contact con
where con.ClientId = c.ClientId and
con.AppDate < '2015-11-08'
) con
where num2s = 0
order by AppDate desc
) con;
In this case, cross apply works a lot like a correlated subquery, but you can return multiple values. The subquery uses window functions to count the number of "2" on a given day and the rest of the logic should be pretty obvious.
This returns one row from the most recent date with appropriate appointments. If you want multiple such rows, use with ties.

What you need to do is to have a condition which picks out all the Dates where Outcome is two and filter them out.
Something like this:
WITH ClientCTE AS
(
SELECT MAX(con1.AppDate) AS AppDate ,ClientID
FROM Contact AS con1
WHERE con.ClientID = con1.ClientID
AND con1.AppDate < '11/08/2015 00:00:00'
AND con1.Outcome <> '2'
AND con1.AppDate NOT IN (SELECT AppDate FROM Contact WHERE Outcome = '2'))
SELECT
* FROM Client C
INNER JOIN
Contact AS con
ON cli.ClientID = con.ClientID
INNER JOIN ClientCTE CTE
ON cli.ClientID = CTE.ClientCTE
AND CTE.AppDate = con.AppDate
Let me know if it works

Please try the below query.
It uses the MAX(case...) OVER( partition BY CAST(AppDate as DATE) windowed function to flag all appointments which have a 2 outcome on same day along with WHERE clause to filter out dates.
This flag is then used in outer WHERE clause to remove unwanted data and joined on contacts and client tables
SELECT
cli.ClientID ,
cli.Forename ,
cli.Surname ,
con.AppDate ,
con.Outcome
from
client cli right join contact con
on con.ClientID=cli.ClientID
inner join
(
select ClientID,MAX(AppDate)as lastDate from
( select *,
MAX(CASE when ISNULL(Outcome,1) =2 then 1 else 0 end) OVER(PARTITION BY CAST(AppDate as DATE),ClientID ORDER BY AppDate) as flag
from Contact
where AppDate< '2015-11-08'
) p
where flag =0 group by ClientID
) s
on s.ClientId=con.ClientID and s.lastDate=con.AppDate
Also here's the link to sql fiddle for demo:
http://sqlfiddle.com/#!6/c519d/2

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas