How can I calculate user session time from heart beat data in Presto SQL?

I'm currently recording when users are active via a heartbeat. It's stored in a table like so:
User ID  Minute of Day
1        3
1        4
1        5
1        8
1        9
2        2
2        3
2        4
User ID 1 is active from 3 to 5 but then is inactive from 6 to 7 and then becomes active again from 8 to 9.
User ID 1 was active for 3 minutes: (5-3 + 9-8) = 3
User ID 2 was active for 2 minutes: 4-2 = 2
How can I calculate this using a SQL (Presto) query?
Output should be like so:
User ID  Total Minutes
1        3
2        2

You may try the following, which uses the LAG function to find consecutive active minutes (diff = 1) and then sums them:
SELECT
    UserId,
    SUM(diff) AS TotalMinutes
FROM (
    SELECT
        UserId,
        (MinuteofDay - LAG(MinuteofDay, 1, MinuteofDay) OVER (PARTITION BY UserId ORDER BY MinuteofDay)) AS diff
    FROM
        my_table
) t
WHERE
    diff = 1
GROUP BY
    UserId;
userid  TotalMinutes
1       3
2       2
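If you want to check the logic without a Presto cluster, here is a minimal sketch that runs the same query in SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or newer (for window functions), and SQLite is only a stand-in here, not Presto itself. The table and column names are the ones used in the query above.

```python
import sqlite3

# In-memory SQLite database standing in for the Presto table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (UserId INTEGER, MinuteofDay INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, 3), (1, 4), (1, 5), (1, 8), (1, 9), (2, 2), (2, 3), (2, 4)],
)

# Same shape as the query above: the difference to the previous heartbeat
# is 1 only when the two minutes are adjacent.
query = """
SELECT UserId, SUM(diff) AS TotalMinutes
FROM (
    SELECT
        UserId,
        MinuteofDay - LAG(MinuteofDay, 1, MinuteofDay)
            OVER (PARTITION BY UserId ORDER BY MinuteofDay) AS diff
    FROM my_table
) t
WHERE diff = 1
GROUP BY UserId
ORDER BY UserId
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Note that a heartbeat with no heartbeat in the preceding minute contributes nothing to the sum, so a user with only isolated minutes would not appear in the result at all.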

Related

sql snowflake, aggregate over window or sth

I have a table below
days        balance  user_id  wanted column
2022/08/01  10       1        1
2022/08/02  11       1        1
2022/08/03  10       1        1
2022/08/03  0        2        1
2022/08/05  3        2        2
2022/08/06  3        2        2
2022/08/07  3        3        3
2022/08/08  0        2        3
Since I'm new to SQL, I couldn't get the aggregation over the window clauses right.
What I want is the number of unique users that have balance > 0 per day.
Thanks.
update:
exact output wanted:
days        unique users
2022/08/01  1
2022/08/02  1
2022/08/03  1
2022/08/05  2
2022/08/06  2
2022/08/07  3
2022/08/08  3
Update: what if I want to accumulate the number of unique users over time, counting only new users (users who didn't exist before) with balance > 0?
Everyone's help is deeply appreciated :)

SELECT
    *,
    COUNT(DISTINCT CASE WHEN balance > 0 THEN user_id END) OVER (ORDER BY days) AS unique_users
FROM
    your_table
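For checking the expected numbers locally, here is a rough sketch in SQLite via Python's sqlite3 module. SQLite does not accept DISTINCT inside a window aggregate, so this is a different formulation from the COUNT(DISTINCT ...) OVER query above: it finds each user's first day with a positive balance, counts those new users per day, and takes a running sum. The CTE names first_seen and per_day are made up for illustration, and SQLite 3.25+ is assumed.

```python
import sqlite3

# In-memory SQLite stand-in for the Snowflake table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (days TEXT, balance INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("2022/08/01", 10, 1),
        ("2022/08/02", 11, 1),
        ("2022/08/03", 10, 1),
        ("2022/08/03", 0, 2),
        ("2022/08/05", 3, 2),
        ("2022/08/06", 3, 2),
        ("2022/08/07", 3, 3),
        ("2022/08/08", 0, 2),
    ],
)

query = """
WITH first_seen AS (
    -- first day on which each user had a positive balance
    SELECT user_id, MIN(days) AS first_day
    FROM your_table
    WHERE balance > 0
    GROUP BY user_id
),
per_day AS (
    -- number of users seen for the first time on each day (0 if none)
    SELECT d.days, COUNT(f.user_id) AS new_users
    FROM (SELECT DISTINCT days FROM your_table) d
    LEFT JOIN first_seen f ON f.first_day = d.days
    GROUP BY d.days
)
SELECT days, SUM(new_users) OVER (ORDER BY days) AS unique_users
FROM per_day
ORDER BY days
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

Unlike the window query above, this returns one row per day rather than one row per input row.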

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.

SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum with lag():
select t.*,
       sum(case when prev_id = id and num_of_days % 5 <> 0
                then 0 else 1
           end) over (order by id, num_of_days) as row_num
from (select t.*,
             lag(id) over (order by id, num_of_days) as prev_id
      from t
     ) t;
If you have a different ordering column, then just use that in the order by clauses.
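A quick way to confirm the result against the sample data is to run the query in SQLite through Python's sqlite3 module. This is only a sketch: it assumes SQLite 3.25+ and renames the subquery alias to s (a hypothetical name) to keep it apart from the table t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, num_of_days INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)],
)

# The counter goes up by 1 when the id changes (or on the very first row)
# or when num_of_days is divisible by 5; otherwise it stays the same.
query = """
SELECT id, num_of_days,
       SUM(CASE WHEN prev_id = id AND num_of_days % 5 <> 0 THEN 0 ELSE 1 END)
           OVER (ORDER BY id, num_of_days) AS row_num
FROM (
    SELECT t.*, LAG(id) OVER (ORDER BY id, num_of_days) AS prev_id
    FROM t
) s
ORDER BY id, num_of_days
"""
rows = conn.execute(query).fetchall()
row_nums = [r[2] for r in rows]
print(row_nums)
```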

Assign Unique Group Id To Sets of Rows with Same Column Value Separated by Other value

I have some data that looks like this:
uid radius
1 10
2 10
3 10
4 2
5 4
6 10
7 10
8 10
What I want is for each group which has the same radius value to have its own unique id, for example:
uid radius GroupId
1 10 1
2 10 1
3 10 1
4 2 2
5 4 3
6 10 4
7 10 4
8 10 4
What I don't want is the second group with radius 10 to have the same groupid as the first group (not 1).
I'm working on SQL Server but the solution should be the same across all databases.
(I've done this before, but for the life of me, I can't remember how I did it.)

Try this:
with t as
(
    select
        uid,
        radius,
        lag(radius, 1) over (order by uid) as prev_rad
    from
        radtable
)
select
    uid,
    radius,
    sum(case when radius = coalesce(prev_rad, radius) then 0 else 1 end)
        over (order by uid) + 1 as GroupID
from
    t
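To see the group ids this produces for the sample rows, here is a small sketch that runs the query in SQLite through Python's sqlite3 module. SQLite (3.25+) is an assumption used only for checking; the question is about SQL Server, where the same lag plus running-sum pattern applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE radtable (uid INTEGER, radius INTEGER)")
conn.executemany(
    "INSERT INTO radtable VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 2), (5, 4), (6, 10), (7, 10), (8, 10)],
)

# A row starts a new group whenever its radius differs from the previous
# row's radius; the running sum of those "change" flags is the group id.
query = """
WITH t AS (
    SELECT uid, radius, LAG(radius, 1) OVER (ORDER BY uid) AS prev_rad
    FROM radtable
)
SELECT uid, radius,
       SUM(CASE WHEN radius = COALESCE(prev_rad, radius) THEN 0 ELSE 1 END)
           OVER (ORDER BY uid) + 1 AS GroupID
FROM t
ORDER BY uid
"""
groups = conn.execute(query).fetchall()
group_ids = [g[2] for g in groups]
print(group_ids)
```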

sql best strategy to partition same values based on temporal sequence

I have data that looks like this, where there are multiple values for each ID that correspond to an ascending date variable:
ID LEVEL DATE
1 10 10/1/2000
1 10 11/20/2001
1 10 12/01/2001
1 30 02/15/2002
1 30 02/15/2002
1 20 05/17/2002
1 20 01/04/2003
1 30 07/20/2003
1 30 03/16/2004
1 30 04/15/2004
I want to acquire a count per each ID/LEVEL/DATE block that looks like this:
ID LEVEL COUNT
1 10 3
1 30 2
1 20 2
1 30 3
The problem is that if I use the count windows function and partition by level, it groups 30 together regardless of the temporal sequence. I want the count for level 30 both before and after 20 to be distinct. Does anyone know how to do that?

A standard gaps and islands solution using ROW_NUMBER(), if it's available on your particular DBMS...
WITH
ordered AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS set_ordinal,
        ROW_NUMBER() OVER (PARTITION BY id, level ORDER BY date) AS grp_ordinal
    FROM
        yourData
)
SELECT
    id,
    level,
    set_ordinal - grp_ordinal,
    MIN(date),
    COUNT(*)
FROM
    ordered
GROUP BY
    id,
    level,
    set_ordinal - grp_ordinal
ORDER BY
    id,
    MIN(date)
Visualising the effect of the two row numbers...
ID  LEVEL  DATE        set_ordinal  grp_ordinal  set-grp  GROUP
--  -----  ----------  -----------  -----------  -------  ------
 1     10  10/01/2000            1            1        0  1,10,0
 1     10  11/20/2001            2            2        0  1,10,0
 1     10  12/01/2001            3            3        0  1,10,0
 1     30  02/15/2002            4            1        3  1,30,3
 1     30  02/15/2002            5            2        3  1,30,3
 1     20  05/17/2002            6            1        5  1,20,5
 1     20  01/04/2003            7            2        5  1,20,5
 1     30  07/20/2003            8            3        5  1,30,5
 1     30  03/16/2004            9            4        5  1,30,5
 1     30  04/15/2004           10            5        5  1,30,5
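The row-number difference can be tried out with a short sketch in SQLite through Python's sqlite3 module (SQLite 3.25+ assumed). Two things here are assumptions that are not in the question: the dates are stored as ISO strings so they sort correctly as text, and a hypothetical seq column is added as a tiebreaker. The sample has two rows with the same date, and ROW_NUMBER() gives no guaranteed order among ties, so without a tiebreaker the two orderings could disagree and split a block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical unique column used only to break ties on date.
conn.execute(
    "CREATE TABLE yourData (seq INTEGER, id INTEGER, level INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO yourData VALUES (?, ?, ?, ?)",
    [
        (1, 1, 10, "2000-10-01"),
        (2, 1, 10, "2001-11-20"),
        (3, 1, 10, "2001-12-01"),
        (4, 1, 30, "2002-02-15"),
        (5, 1, 30, "2002-02-15"),
        (6, 1, 20, "2002-05-17"),
        (7, 1, 20, "2003-01-04"),
        (8, 1, 30, "2003-07-20"),
        (9, 1, 30, "2004-03-16"),
        (10, 1, 30, "2004-04-15"),
    ],
)

# Within one unbroken run of the same level, both row numbers advance
# together, so their difference is constant and identifies the run.
query = """
WITH ordered AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY date, seq) AS set_ordinal,
        ROW_NUMBER() OVER (PARTITION BY id, level ORDER BY date, seq) AS grp_ordinal
    FROM yourData
)
SELECT id, level, COUNT(*) AS cnt
FROM ordered
GROUP BY id, level, set_ordinal - grp_ordinal
ORDER BY id, MIN(date)
"""
blocks = conn.execute(query).fetchall()
print(blocks)
```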

Select Query to Get Unique Cells in Two Columns

I have an SQL Server database, that logs weather device sensor data.
The table looks like this:
Id DeviceId SensorId Value
1 1 1 42
2 1 1 3
3 1 2 30
4 2 2 0
5 2 1 1
6 3 1 26
7 3 1 23
8 3 2 1
In return the query should return the following:
Id DeviceId SensorId Value
2 1 1 3
3 1 2 30
4 2 2 0
5 2 1 1
7 3 1 23
8 3 2 1
For each device the sensor should be unique. i.e. Values in Columns DeviceId and SensorId should be unique (row-wise).
Apologies if I'm not clear enough.

If you don't want to sum Value, as your desired result suggests, and just want to take an "arbitrary" row from each "DeviceId + SensorId" group:
WITH CTE AS
(
    SELECT Id, DeviceId, SensorId, Value,
           RN = ROW_NUMBER() OVER (PARTITION BY DeviceId, SensorId ORDER BY ID DESC)
    FROM dbo.TableName
)
SELECT Id, DeviceId, SensorId, Value
FROM CTE
WHERE RN = 1
ORDER BY ID
This returns the row with the highest ID per group. You need to change ORDER BY ID DESC if you want a different result. Demo: http://sqlfiddle.com/#!6/8e31b/2/0 (your result)
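If you'd like to reproduce this outside SQL Server, here is a minimal sketch in SQLite through Python's sqlite3 module (SQLite 3.25+ assumed). SQLite does not understand the T-SQL "RN = ..." alias form or the dbo schema prefix, so the sketch uses "... AS RN" and a plain table name; otherwise the query is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName (Id INTEGER, DeviceId INTEGER, SensorId INTEGER, Value INTEGER)"
)
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 42),
        (2, 1, 1, 3),
        (3, 1, 2, 30),
        (4, 2, 2, 0),
        (5, 2, 1, 1),
        (6, 3, 1, 26),
        (7, 3, 1, 23),
        (8, 3, 2, 1),
    ],
)

# Number the rows of each (DeviceId, SensorId) group from highest Id down,
# then keep only the first row of every group.
query = """
WITH CTE AS (
    SELECT Id, DeviceId, SensorId, Value,
           ROW_NUMBER() OVER (PARTITION BY DeviceId, SensorId ORDER BY Id DESC) AS RN
    FROM TableName
)
SELECT Id, DeviceId, SensorId, Value
FROM CTE
WHERE RN = 1
ORDER BY Id
"""
latest = conn.execute(query).fetchall()
print(latest)
```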
