Exclude Sub Intervals from parent interval in postgresql - sql

I have 2 tables:
Table 1 has user session data:
user_id login_time              logout_time
a1      2022-01-10 09:02:54.927 2022-01-10 18:07:42.876
s1      2022-01-10 09:07:51.104 2022-01-10 18:44:23.053
Table 2 has user chat data:
user_id user_connected_time     user_disconnected_time
a1      2022-02-10 13:10:49.975 2022-02-10 13:25:22.26
a1      2022-02-10 17:35:12.34  2022-02-10 17:55:05.283
s1      2022-02-15 14:08:34.39  2022-02-15 14:09:28.627
s1      2022-02-15 14:00:47.261 2022-02-15 14:16:29.339
I need the agent idle time, i.e. the time when the agent wasn't connected to any chat.
Expected output (agent idle duration):
agent_ideal_duration agent_id
08:44:55.006         a1
09:20:49.871         s1
For agent s1, the overlapping chat duration is subtracted only once from the idle time.
Note: one agent can handle multiple chats at the same time.
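One way to reason about this is to merge the overlapping chat intervals first and subtract their union from the session length. The following is a minimal Python sketch of that logic, not a PostgreSQL query. The names `idle_time` and `ts` are made up for illustration, and it assumes the chats fall on the same calendar day as the session (the sample rows show different dates, so the example moves s1's chats onto 2022-01-10).

```python
from datetime import datetime, timedelta

def ts(text):
    # Parse a timestamp such as '2022-01-10 09:07:51.104'
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")

def idle_time(session, chats):
    """Session length minus the union of the chat intervals (hypothetical helper)."""
    start, end = session
    busy = timedelta(0)
    cur_start = cur_end = None
    for c_start, c_end in sorted(chats):
        # Clip each chat to the session window
        c_start, c_end = max(c_start, start), min(c_end, end)
        if c_start >= c_end:
            continue
        if cur_end is None or c_start > cur_end:
            # Gap before this chat: close the previous merged interval
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = c_start, c_end
        else:
            # Overlap: extend the current merged interval
            cur_end = max(cur_end, c_end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return (end - start) - busy

# Agent s1, with the chat dates moved onto the session day (assumption)
s1_session = (ts("2022-01-10 09:07:51.104"), ts("2022-01-10 18:44:23.053"))
s1_chats = [
    (ts("2022-01-10 14:08:34.390"), ts("2022-01-10 14:09:28.627")),
    (ts("2022-01-10 14:00:47.261"), ts("2022-01-10 14:16:29.339")),
]
s1_idle = idle_time(s1_session, s1_chats)
print(s1_idle)
```

Because the shorter s1 chat lies entirely inside the longer one, only the longer chat reduces the idle time, which is the behaviour the question asks for.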

Related

How to index match with conditions in sql

I have tables like this:
regist table
userID registDate
1      2022-01-22
2      2022-01-23
session table
userID date_key   traffic
null   2022-01-02 facebook
1      2021-01-03 facebook
1      2021-01-04 google
1      2021-01-05 linkedin
2      2021-01-15 facebook
2      2021-01-25 facebook
3      2021-01-20 facebook
Output
userID date_key   traffic  regist date
1      2021-01-03 facebook 2022-01-22
1      2021-01-04 google   2022-01-22
1      2021-01-05 linkedin 2022-01-22
2      2021-01-15 facebook 2022-01-23
How do I merge the tables so that I can return the regist date? Do I need a right join?
Is this correct?
select *
from sessiontables st
left join registtable rt on st.userID = rt.userID
where st.userID is not null
How do I keep only the rows whose userID exists in the regist table?
If I understand correctly, you can join the regist table to an aggregated subquery of the session table:
select rt.userID,
       st.date_key,
       st.traffic,
       rt.registDate
from (
    SELECT userID, min(date_key) AS date_key, traffic
    FROM sessiontables
    GROUP BY traffic, userID
) st
JOIN registtable rt
  ON st.userID = rt.userID
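To see what this query returns on the sample rows, here is a small sketch that loads them into an in-memory SQLite database from Python. SQLite is only a stand-in for whatever database the question uses; the table and column names are taken from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registtable (userID INTEGER, registDate TEXT);
CREATE TABLE sessiontables (userID INTEGER, date_key TEXT, traffic TEXT);
INSERT INTO registtable VALUES (1, '2022-01-22'), (2, '2022-01-23');
INSERT INTO sessiontables VALUES
  (NULL, '2022-01-02', 'facebook'),
  (1, '2021-01-03', 'facebook'),
  (1, '2021-01-04', 'google'),
  (1, '2021-01-05', 'linkedin'),
  (2, '2021-01-15', 'facebook'),
  (2, '2021-01-25', 'facebook'),
  (3, '2021-01-20', 'facebook');
""")

# Earliest session per (userID, traffic), kept only for registered users
rows = conn.execute("""
SELECT rt.userID, st.date_key, st.traffic, rt.registDate
FROM (
    SELECT userID, MIN(date_key) AS date_key, traffic
    FROM sessiontables
    GROUP BY traffic, userID
) st
JOIN registtable rt ON st.userID = rt.userID
ORDER BY rt.userID, st.date_key
""").fetchall()

for row in rows:
    print(row)
```

The inner join drops the session with a null userID and the sessions of user 3, who is not in the regist table, and the `MIN(date_key)` collapses user 2's two facebook sessions into the earlier one.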

snowflake: counting no.of rows present in an hour as single row

I have a record for every login a user makes. I need to count how many times each user has logged in, but no matter how many times a user logged in within the same half hour, it should count as 1.
USER_ID TIMESTAMP
A1 2021-03-10 10:00:00
A1 2021-03-10 10:01:00
A1 2021-03-10 10:05:00
A1 2021-03-10 10:15:00
A1 2021-03-10 10:32:00
A1 2021-03-10 11:02:00
A1 2021-03-11 12:00:00
A2 2021-03-10 10:01:00
Expected output:
USER_ID COUNT
A1 4
A2 1
I am not able to figure out how to use lag and lead in this situation. Any help would be appreciated.
SELECT user_id, count(distinct(date_trunc('hour',timestamp)::text||iff(minute(timestamp)>=30,'_1','_0'))) as count
FROM table
GROUP BY 1 ORDER BY 1;
This works by truncating the timestamp to the hour, casting it to a string, and appending a suffix for the half hour. Not the cleanest, but it should work.
Another question asked how to truncate times to 30-minute intervals, and time_slice was a nice answer there:
SELECT user_id, count(distinct(time_slice(timestamp, 30, 'MINUTE'))) as count
FROM table
GROUP BY user_id ORDER BY user_id;
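The bucketing idea behind both queries can be sketched outside Snowflake as follows. This is a plain Python illustration, not Snowflake code; `count_half_hour_logins` is a made-up name, and the rounding only mirrors on-the-hour slices for slice widths that divide 60.

```python
from datetime import datetime

def count_half_hour_logins(rows, minutes=30):
    """Count distinct half-hour slots per user (hypothetical helper)."""
    slots = {}
    for user_id, text in rows:
        t = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        # Round down to the start of the slot, e.g. 10:32 -> 10:30
        slot = t.replace(minute=t.minute - t.minute % minutes, second=0, microsecond=0)
        slots.setdefault(user_id, set()).add(slot)
    return {user_id: len(s) for user_id, s in slots.items()}

logins = [
    ("A1", "2021-03-10 10:00:00"),
    ("A1", "2021-03-10 10:01:00"),
    ("A1", "2021-03-10 10:05:00"),
    ("A1", "2021-03-10 10:15:00"),
    ("A1", "2021-03-10 10:32:00"),
    ("A1", "2021-03-10 11:02:00"),
    ("A1", "2021-03-11 12:00:00"),
    ("A2", "2021-03-10 10:01:00"),
]
login_counts = count_half_hour_logins(logins)
print(login_counts)
```

Note that this counts fixed half-hour slots, so two logins at 10:29 and 10:31 land in different slots; a rolling 30-minute window would need the lag-based approach the question mentions.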

How to insert into SQL table with previous data check

I'm creating a table to store bookmakers' odds changes for sports events over time (it will have hundreds of thousands of rows).
I want to create an update function in PHP that inserts data into the table only if current_odd_value differs from the most recent odd_value stored in the table.
Using a simple INSERT I created this table for 1 match (8483075) from two companies (66 and 22) for the same market (1), which has 3 selections (1001, 1002, 1003), with the odds I got today at 17:00:
internal_id match_id company_id market_id selection_id odd_value update_date
1           8483075  66         1         1001         9,60      2021-01-04 17:00:00
2           8483075  66         1         1002         18,00     2021-01-04 17:00:00
3           8483075  66         1         1003         1,09      2021-01-04 17:00:00
4           8483075  22         1         1001         8,40      2021-01-04 17:00:00
5           8483075  22         1         1002         16,00     2021-01-04 17:00:00
6           8483075  22         1         1003         1,08      2021-01-04 17:00:00
At 17:05 I checked the odds again and noticed 2 changes (for internal_id 2 and 6):
2 / 8483075 / 66 / 1 / 1002 / 18,00 ==> 15,00
6 / 8483075 / 22 / 1 / 1003 / 1,08 ==> 1,18
These should be added to the table as new rows, like this:
internal_id match_id company_id market_id selection_id odd_value update_date
7           8483075  66         1         1002         15,00     2021-01-04 17:05:00
8           8483075  22         1         1003         1,18      2021-01-04 17:05:00
My idea was to:
1. get a table of the most recent odd values for each match_id + company_id + market_id + selection_id
2. compare it with the current odd value and, only if it differs from the value from point 1, insert a new record with the proper data
MY QUESTIONS:
1. What SELECT query gets what I need for point 1? I think I can use internal_id (higher means more recent) or update_date, but I don't know how. I know how to do it for a specific match_id + company_id + market_id + selection_id, but I need the whole table in one SELECT, not one by one.
2. Is my approach correct, or should I try a different one? (I think that retrieving the whole table of most recent odds at the beginning of the update should be faster than comparing each value one by one.)
Additional info:
All my data comes from XML/JSON files that I receive from different sources (so in different formats, which I unify in my database).
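The two-step idea (fetch the latest odds per key, then insert only the values that differ) could look roughly like the sketch below. It uses Python with an in-memory SQLite database rather than PHP, purely as an illustration. The table name `odds`, the helper `store_changes`, and the shape of the `feed` dictionary are assumptions, and odds are written with a decimal point instead of a comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE odds (
    internal_id INTEGER PRIMARY KEY AUTOINCREMENT,
    match_id INTEGER, company_id INTEGER, market_id INTEGER,
    selection_id INTEGER, odd_value REAL, update_date TEXT)""")

# Point 1: the most recent row per match/company/market/selection,
# using the highest internal_id in each group as "most recent"
LATEST_SQL = """
SELECT o.match_id, o.company_id, o.market_id, o.selection_id, o.odd_value
FROM odds o
JOIN (SELECT MAX(internal_id) AS internal_id
      FROM odds
      GROUP BY match_id, company_id, market_id, selection_id) last
  ON last.internal_id = o.internal_id
"""
INSERT_SQL = """INSERT INTO odds
    (match_id, company_id, market_id, selection_id, odd_value, update_date)
    VALUES (?, ?, ?, ?, ?, ?)"""

def store_changes(conn, feed, update_date):
    """Insert only odds that differ from the latest stored value (hypothetical helper)."""
    latest = {tuple(r[:4]): r[4] for r in conn.execute(LATEST_SQL)}
    changed = [(*key, value, update_date)
               for key, value in feed.items() if latest.get(key) != value]
    conn.executemany(INSERT_SQL, changed)
    return len(changed)

# feed maps (match_id, company_id, market_id, selection_id) -> current odd value
feed_1700 = {
    (8483075, 66, 1, 1001): 9.60, (8483075, 66, 1, 1002): 18.00,
    (8483075, 66, 1, 1003): 1.09, (8483075, 22, 1, 1001): 8.40,
    (8483075, 22, 1, 1002): 16.00, (8483075, 22, 1, 1003): 1.08,
}
feed_1705 = dict(feed_1700)
feed_1705[(8483075, 66, 1, 1002)] = 15.00
feed_1705[(8483075, 22, 1, 1003)] = 1.18

inserted_first = store_changes(conn, feed_1700, "2021-01-04 17:00:00")
inserted_second = store_changes(conn, feed_1705, "2021-01-04 17:05:00")
print(inserted_first, inserted_second)
```

Loading the latest values once per update keeps the comparison in memory, which matches the intuition in question 2; with hundreds of thousands of rows an index on the four key columns would likely matter, though that is untested here.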

Subtract /Loop through rows in HIVE Query

I have data in a table like below:
ID status timestamp
ABC login 1/1/2020 12:00
ABC lock 1/1/2020 13:19
ABC unlock 1/1/2020 13:52
ABC Disconnect 1/1/2020 15:52
ABC Reconnect 1/1/2020 15:55
ABC lock 1/1/2020 16:25
ABC unlock 1/1/2020 16:30
ABC logoff 1/1/2020 17:00
ABC login 2/1/2020 12:00
ABC lock 2/1/2020 13:19
ABC unlock 2/1/2020 13:52
ABC lock 2/1/2020 16:22
ABC logoff 2/1/2020 17:00
I need to find the effective working hours of an employee on a particular date, i.e. the time he really worked: the total time minus the periods when the status was lock or disconnect.
Example: for employee ABC on 01-JAN-2020, his system was idle between 13:19 - 13:52 (33 minutes), 15:52 - 15:55 (3 minutes), and 16:25 - 16:30 (5 minutes).
Hence, out of the total working time of 5 hours (the time between login and logoff), his effective time would be 5 hr - 41 minutes = 4 hr 19 minutes.
Similarly for 01-FEB-2020.
You can use window functions, then aggregation:
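-- assumes timestamp is a Hive timestamp or a 'yyyy-MM-dd HH:mm:ss' string
select
    id,
    to_date(timestamp) as timestamp_day,
    -- count only time spent in working statuses; idle statuses contribute 0
    sum(case when lower(status) in ('lock', 'disconnect', 'logoff') then 0 else duration end) / 60 / 60 as hours_worked
from (
    select t.*,
        -- seconds from this event to the next event of the same id
        unix_timestamp(lead(timestamp) over(partition by id order by timestamp))
            - unix_timestamp(timestamp) as duration
    from mytable t
) t
group by id, to_date(timestamp)
order by id, to_date(timestamp)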
In the subquery, we use lead() to retrieve the timestamp of the "next" action, so we can compute the duration of the current step. The outer query aggregates by employee and day, and does the final computation of working hours according to your business rule.
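The lead-then-aggregate idea can be checked on the sample rows with a short Python sketch. This is an illustration of the logic rather than Hive code: `worked_minutes` is a made-up name, the timestamps are assumed to be in month/day/year order (so 2/1/2020 is 01-FEB-2020, as in the question), and the time after a logoff is treated as idle.

```python
from collections import defaultdict
from datetime import datetime

# Statuses whose following interval does not count as work (assumption: logoff included)
IDLE = {"lock", "disconnect", "logoff"}

def worked_minutes(events):
    """Minutes worked per (id, day): pair each event with the next one, like lead()."""
    by_id = defaultdict(list)
    for emp_id, status, text in events:
        by_id[emp_id].append((datetime.strptime(text, "%m/%d/%Y %H:%M"), status.lower()))
    totals = defaultdict(float)
    for emp_id, rows in by_id.items():
        rows.sort()
        for (t, status), (t_next, _) in zip(rows, rows[1:]):
            if status not in IDLE:
                totals[(emp_id, t.date())] += (t_next - t).total_seconds() / 60
    return dict(totals)

events = [
    ("ABC", "login", "1/1/2020 12:00"), ("ABC", "lock", "1/1/2020 13:19"),
    ("ABC", "unlock", "1/1/2020 13:52"), ("ABC", "Disconnect", "1/1/2020 15:52"),
    ("ABC", "Reconnect", "1/1/2020 15:55"), ("ABC", "lock", "1/1/2020 16:25"),
    ("ABC", "unlock", "1/1/2020 16:30"), ("ABC", "logoff", "1/1/2020 17:00"),
    ("ABC", "login", "2/1/2020 12:00"), ("ABC", "lock", "2/1/2020 13:19"),
    ("ABC", "unlock", "2/1/2020 13:52"), ("ABC", "lock", "2/1/2020 16:22"),
    ("ABC", "logoff", "2/1/2020 17:00"),
]
minutes = worked_minutes(events)
print(minutes)
```

On the rows for 1/1/2020 this counts three idle spans (33, 3, and 5 minutes, the last being the 16:25 to 16:30 lock), leaving 259 of the 300 minutes between login and logoff.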

Calculate Average between columns by comparing two rows in SQL Server

I have the below table
BidID AppID AppStatus StatusTime
1 1 In Review 2019-01-02 12:00:00
1 1 Approved 2019-01-02 13:00:00
1 2 In Review 2019-01-04 13:00:00
1 2 Approved 2019-01-04 14:00:00
2 2 In Review 2019-01-07 15:00:00
2 2 Approved 2019-01-07 17:00:00
3 1 In Review 2019-01-09 13:00:00
4 1 Approved 2019-01-09 13:00:00
What I am trying to do is calculate the average StatusTime difference in minutes, using the following logic:
First group by BidID and then by AppID, then calculate the time difference between the StatusTime values of the In Review and Approved AppStatus rows.
E.g. first group by BidID, then group by AppID, then check for the In Review status, find the next Approved status, and calculate the difference in minutes between the two dates.
BidID AppID AppStatus                                                    Diff    BidAverage
1     1     App ID 1 (2019-01-02 15:48:42.000 - 2019-01-02 12:33:36.000) 1 hour  1.5
1     2     App ID 2 (2019-01-04 10:33:12.000 - 2019-01-04 10:33:12.000) 2 hours
2     2     App ID 2 (2019-01-04 10:33:12.000 - 2019-01-04 10:33:12.000) 1 hour  1
3     1     No calculation since no Approved
4     1     No calculation since no In Review before Approved
Final average: (1.5 + 1) / 2 = 1.25 for the table
I have already figured out the time difference excluding Saturday (Time Difference Excluding Weekend) using David's suggestion.
I am not sure how to check that AppStatus is first In Review and then Approved, and only then calculate the time difference. If there is no Approved status, as in BidID 3, that pair should be left out of the average. The result should then be averaged across the AppIDs and then across the BidIDs.
Thanks
I think you can just use min() and max() for simplicity to get the times for the bid/app pairs. The rest is just aggregation and more aggregation.
The processing you describe seems to be:
-- average of the per-bid averages; diff is in seconds
select avg(avg_bid_diff)
from (select BidID, avg(diff * 1.0) as avg_bid_diff
      from (select BidID, AppID,
                   datediff(second, min(StatusTime), max(StatusTime)) as diff
            from t
            where AppStatus in ('In Review', 'Approved')
            group by BidID, AppID
            having count(*) = 2  -- keep only pairs that have both statuses
           ) ba
      group by BidID
     ) b;
This makes assumptions that are consistent with the provided data: that the statuses don't have duplicates for the bid/app pairs and that approval is always after review.
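The nested aggregation can be traced on the eight sample rows with a small Python sketch. It is an illustration of the logic, not SQL Server code: `average_review_hours` is a made-up name, it reports hours rather than seconds, and, unlike the query, it also skips pairs where Approved comes before In Review.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def average_review_hours(rows):
    """Per-bid average of In Review -> Approved hours, then the average of those."""
    pairs = defaultdict(dict)
    for bid, app, status, text in rows:
        pairs[(bid, app)][status] = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    per_bid = defaultdict(list)
    for (bid, _app), st in pairs.items():
        # Keep only bid/app pairs that have both statuses in the right order
        if "In Review" in st and "Approved" in st and st["Approved"] >= st["In Review"]:
            per_bid[bid].append((st["Approved"] - st["In Review"]).total_seconds() / 3600)
    bid_avgs = {bid: mean(diffs) for bid, diffs in per_bid.items()}
    return bid_avgs, mean(list(bid_avgs.values()))

rows = [
    (1, 1, "In Review", "2019-01-02 12:00:00"), (1, 1, "Approved", "2019-01-02 13:00:00"),
    (1, 2, "In Review", "2019-01-04 13:00:00"), (1, 2, "Approved", "2019-01-04 14:00:00"),
    (2, 2, "In Review", "2019-01-07 15:00:00"), (2, 2, "Approved", "2019-01-07 17:00:00"),
    (3, 1, "In Review", "2019-01-09 13:00:00"), (4, 1, "Approved", "2019-01-09 13:00:00"),
]
bid_avgs, overall = average_review_hours(rows)
print(bid_avgs, overall)
```

Bids 3 and 4 drop out because each has only one of the two statuses. The per-bid values come from the timestamps in the sample table, which differ from the illustrative figures in the question's expected-output sketch.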