SQL Query to find the activity time of each user - sql

How can we use this data to calculate the activity of each user in each period?
user_id date status
101 2018-01-1 1
101 2018-01-2 0
101 2018-01-3 1
101 2018-01-4 1
101 2018-01-5 1
Output:
user_id start_date end_date status length
101 2018-01-1 2018-01-1 1 1
101 2018-01-2 2018-01-2 0 1
101 2018-01-3 2018-01-5 1 3

This is a gaps-and-islands problem. You can group consecutive rows together by subtracting a sequence number from the date:
select user_id, min(date) as start_date, max(date) as end_date, status,
julianday(max(date)) - julianday(min(date)) + 1 as length
from (select t.*,
row_number() over (partition by user_id, status order by date) as seqnum
from t
) t
group by user_id, status, date(date, '-' || seqnum || ' day')
order by user_id, start_date;
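As a rough check, the query can be run against the sample rows from Python's built-in sqlite3 module. This is only a sketch: it assumes a SQLite build with window functions (3.25 or later), and the sample dates are rewritten as zero-padded YYYY-MM-DD because SQLite's date functions expect that form. The length is cast to an integer for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id INTEGER, date TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (101, "2018-01-01", 1),
        (101, "2018-01-02", 0),
        (101, "2018-01-03", 1),
        (101, "2018-01-04", 1),
        (101, "2018-01-05", 1),
    ],
)

# Within one (user_id, status) partition, consecutive dates minus their
# row number collapse to the same anchor date, which identifies the island.
query = """
SELECT user_id,
       MIN(date) AS start_date,
       MAX(date) AS end_date,
       status,
       CAST(julianday(MAX(date)) - julianday(MIN(date)) + 1 AS INTEGER) AS length
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY user_id, status ORDER BY date) AS seqnum
      FROM t) AS s
GROUP BY user_id, status, date(date, '-' || seqnum || ' day')
ORDER BY user_id, start_date
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

With the five sample rows this should print three islands, matching the expected output in the question.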

Related

How to use SQL to get column count for a previous date?

I have the following table,
id status price date
2 complete 10 2020-01-01 10:10:10
2 complete 20 2020-02-02 10:10:10
2 complete 10 2020-03-03 10:10:10
3 complete 10 2020-04-04 10:10:10
4 complete 10 2020-05-05 10:10:10
Required output,
id status_count price ratio
2 0 0 0
2 1 10 0
2 2 30 0.33
I want each row to show the sum of the prices of the previous rows for the same id. Row 1 is 0 because it has no previous row.
I also want to find the ratio, i.e. 10/30 = 0.33.
You can use the analytic functions ROW_NUMBER and SUM as follows:
SELECT
id,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) - 1 AS status_count,
COALESCE(SUM(price) OVER (PARTITION BY id ORDER BY date), 0) - price as price
FROM yourTable;
I think you want something like this:
SELECT
id,
COUNT(*) OVER (PARTITION BY id ORDER BY date) - 1 AS status_count,
COALESCE(SUM(price) OVER (PARTITION BY id
ORDER BY date ROWS BETWEEN
UNBOUNDED PRECEDING AND 1 PRECEDING), 0) price
FROM yourTable;
Please also check another method:
with cte
as (select *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) - 1 AS status_count,
SUM(price) OVER (PARTITION BY id ORDER BY date) ss from yourTable)
select id, status_count, isnull(ss, 0) - price price
from cte;
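The window-frame variant above (ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) can be tried on the sample rows with Python's sqlite3 module. This is a sketch under assumptions: SQLite 3.25 or later for window functions, and the ratio column is left out because the question does not fully define it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (id INTEGER, status TEXT, price INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (2, "complete", 10, "2020-01-01 10:10:10"),
        (2, "complete", 20, "2020-02-02 10:10:10"),
        (2, "complete", 10, "2020-03-03 10:10:10"),
        (3, "complete", 10, "2020-04-04 10:10:10"),
        (4, "complete", 10, "2020-05-05 10:10:10"),
    ],
)

# The frame stops one row before the current row, so the sum covers only
# earlier rows; COALESCE turns the empty frame of the first row into 0.
query = """
SELECT id,
       COUNT(*) OVER (PARTITION BY id ORDER BY date) - 1 AS status_count,
       COALESCE(SUM(price) OVER (PARTITION BY id ORDER BY date
                                 ROWS BETWEEN UNBOUNDED PRECEDING
                                          AND 1 PRECEDING), 0) AS price
FROM yourTable
ORDER BY id, date
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

For id 2 this should give the counts 0, 1, 2 and the running prices 0, 10, 30 shown in the required output.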

GROUP BY ignore groups containing NULL value and fetch recent record by date from each group

I have a user_details table as below.
id user_id start_date end_date
1 55 5-1-2017 NULL
2 55 3-1-2017 4-30-2017
3 66 1-1-2018 1-31-2018
4 66 2-1-2018 4-12-2018
5 77 11-1-2016 11-30-2016
6 77 12-1-2016 NULL
7 99 8-1-2016 1-31-2017
8 99 7-1-2016 7-31-2016
I have to fetch the latest record by start_date for each user but fetch only those users having end_date set for all records of that user.
The output should be as below:
id user_id start_date end_date
4 66 2-1-2018 4-12-2018
7 99 8-1-2016 1-31-2017
How can I achieve this result?
You can use DISTINCT ON and an ORDER BY clause to get the row with the latest start_date per group.
Then eliminate the results with end_date IS NULL.
SELECT id, user_id, start_date, end_date
FROM (SELECT DISTINCT ON (user_id)
id, user_id, start_date, end_date
FROM user_details
ORDER BY user_id, start_date DESC, end_date, id) AS q
WHERE end_date IS NOT NULL;
One approach is to aggregate by user_id, and then identify which users have an end_date from the latest record which is not NULL. We use a CTE to find the max start_date value for each user. In the HAVING clause we assert that when the start_date is the latest starting date, the end_date is also not NULL.
WITH cte AS (
SELECT id, user_id, start_date, end_date,
MAX(start_date) OVER (PARTITION BY user_id) max_start_date
FROM yourTable
)
SELECT
user_id,
MIN(start_date) AS start_date,
MAX(end_date) AS end_date
FROM cte
GROUP BY
user_id
HAVING
COUNT(CASE WHEN start_date = max_start_date AND
end_date IS NOT NULL THEN 1 END) > 0;
With NOT EXISTS:
select u.*
from user_details u
where not exists (
select 1 from user_details
where user_id = u.user_id and (start_date > u.start_date or end_date is null)
)
Results:
| id | user_id | start_date | end_date |
| --- | ------- | ---------- | --------- |
| 4 | 66 | 2018-02-01 | 2018-04-12|
| 7 | 99 | 2016-08-01 | 2017-01-31|
DISTINCT ON with the right index might be the most efficient method. But that index is quite specific: (user_id, start_date DESC, end_date, id). The following should have similar performance but with a simpler index:
select ud.*
from user_details ud
where ud.id = (select ud2.id
from user_details ud2
where ud2.user_id = ud.user_id
order by ud2.start_date desc
limit 1
) and
ud.end_date is not null;
For this, you want an index on user_details(user_id, start_date desc, id).
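The NOT EXISTS variant is the easiest one to try outside PostgreSQL. A minimal sketch using Python's sqlite3 module follows; the sample dates are rewritten as YYYY-MM-DD text (an assumption, so that string comparison orders them correctly).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_details "
    "(id INTEGER, user_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO user_details VALUES (?, ?, ?, ?)",
    [
        (1, 55, "2017-05-01", None),
        (2, 55, "2017-03-01", "2017-04-30"),
        (3, 66, "2018-01-01", "2018-01-31"),
        (4, 66, "2018-02-01", "2018-04-12"),
        (5, 77, "2016-11-01", "2016-11-30"),
        (6, 77, "2016-12-01", None),
        (7, 99, "2016-08-01", "2017-01-31"),
        (8, 99, "2016-07-01", "2016-07-31"),
    ],
)

# Keep a row only if the same user has no later row and no row at all
# with a NULL end_date.
query = """
SELECT u.id, u.user_id, u.start_date, u.end_date
FROM user_details u
WHERE NOT EXISTS (
    SELECT 1
    FROM user_details d
    WHERE d.user_id = u.user_id
      AND (d.start_date > u.start_date OR d.end_date IS NULL)
)
ORDER BY u.id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Only users 66 and 99 should remain, each with their latest record.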

SQL to find difference of dates - Oracle

I have a table:
table1
tran_id user_id start_date end_date
1 100 01-04-2018 02-04-2018
2 100 14-06-2018 14-06-2018
4 100 12-06-2018 12-06-2018
7 101 05-01-2018 05-01-2018
9 101 08-01-2018 08-01-2018
3 101 03-01-2018 03-01-2018
Date format is DD-MM-YYYY
I need to find the day difference, with rows ordered by user_id and start_date. The day difference for a record is the start_date of the next record for the same user minus the end_date of the current record; the last record for each user gets 0.
Here is the expected output:
tran_id user_id start_date end_date day_diff
1 100 01-04-2018 02-04-2018 71
4 100 12-06-2018 12-06-2018 2
2 100 14-06-2018 14-06-2018 0
3 101 03-01-2018 03-01-2018 2
7 101 05-01-2018 05-01-2018 3
9 101 08-01-2018 08-01-2018 0
How to get this in an SQL query?
Use the lead() function to find the next value, then subtract the dates:
select tran_id, user_id, start_date, end_date, nvl(nsd - start_date, 0) diff
from (
select t.*, lead(start_date) over (partition by user_id
order by start_date) nsd
from table1 t)
You can use lead() for this. Because lead() accepts a third argument, a default used when there is no next row, you don't need a subquery or null handling:
select t.*,
(lead(start_date, 1, end_date) over (partition by user_id order by start_date) -
end_date
) as diff
from t;
I think this: DATEDIFF function in Oracle in combination with the LAG function should help you out.
A minor correction to the answer given by Ponder Stibbons: subtract end_date, not start_date, from nsd.
select tran_id, user_id, start_date, end_date, nvl(nsd - end_date, 0) diff
from (
select t.*, lead(start_date) over (partition by user_id
order by start_date) nsd
from date_test t);
Regards
Akash
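To check the corrected formula against the expected output, here is a sketch in Python's sqlite3 module rather than Oracle. The assumptions are SQLite 3.25 or later, dates stored as YYYY-MM-DD text, and julianday() standing in for Oracle's date subtraction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 "
    "(tran_id INTEGER, user_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, 100, "2018-04-01", "2018-04-02"),
        (2, 100, "2018-06-14", "2018-06-14"),
        (4, 100, "2018-06-12", "2018-06-12"),
        (7, 101, "2018-01-05", "2018-01-05"),
        (9, 101, "2018-01-08", "2018-01-08"),
        (3, 101, "2018-01-03", "2018-01-03"),
    ],
)

# LEAD's third argument defaults the "next start" to the row's own
# end_date, so the last row of each user yields a difference of 0.
query = """
SELECT tran_id, user_id, start_date, end_date,
       CAST(julianday(LEAD(start_date, 1, end_date)
                      OVER (PARTITION BY user_id ORDER BY start_date))
            - julianday(end_date) AS INTEGER) AS day_diff
FROM table1
ORDER BY user_id, start_date
"""
diffs = conn.execute(query).fetchall()
for row in diffs:
    print(row)
```

The day_diff column should come out as 71, 2, 0 for user 100 and 2, 3, 0 for user 101.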

MS SQL get aggregate datetime difference by status

I have below table in sql.
======================================================
UnitID Status DateTime Value
======================================================
101 A 01/12/2017 00:02:10 10
101 A 01/12/2017 00:02:40 25
101 A 01/12/2017 00:03:20 18
101 B 01/12/2017 00:03:55 30
101 B 01/12/2017 00:04:05 10
101 B 01/12/2017 00:04:30 20
101 B 01/12/2017 00:04:50 10
101 A 01/12/2017 00:05:00 28
101 A 01/12/2017 00:05:50 18
101 A 01/12/2017 00:06:20 18
102 A 01/12/2017 00:02:10 10
102 A 01/12/2017 00:02:40 25
102 A 01/12/2017 00:03:20 18
102 B 01/12/2017 00:03:55 30
102 B 01/12/2017 00:04:05 10
102 B 01/12/2017 00:04:30 20
102 B 01/12/2017 00:04:50 10
102 A 01/12/2017 00:05:00 28
102 A 01/12/2017 00:05:50 18
102 A 01/12/2017 00:06:20 18
From this table I need the output below.
===========================================
UnitID StatusA StatusB MaxValue
===========================================
101 02:30 00:55 30
102 02:30 00:55 30
What I need is the total time difference by status; for example, 02:30 is the total duration of status "A" in the table. How can I achieve this in an MS SQL query?
Thank you in advance.
As far as I know you cannot have status in different columns, only by row.
SELECT [UnitID], [Status], MAX([DateTime]) - MIN([DateTime]), MAX([Value])
FROM [theTable]
GROUP BY [UnitID], [Status]
Output would be like
101 A 02:30 30
101 B 00:55 30
102 A 02:30 30
102 B 00:55 30
If you have fixed states of A and B you can go messy and do this:
SELECT UnitID, A, B, MaxValue
FROM
(
SELECT [UnitID], MAX([DateTime]) - MIN([DateTime]) AS A, null AS B, MAX([Value]) AS MaxValue
FROM [theTable]
WHERE Status = 'A'
GROUP BY [UnitID]
UNION ALL
SELECT [UnitID], null, MAX([DateTime]) - MIN([DateTime]), MAX([Value])
FROM [theTable]
WHERE Status = 'B'
GROUP BY [UnitID]
) x
You can do what you need with the following query. I separated each step into its own CTE so you can see step by step how to get to your result. LAG retrieves the previous row's value (splitting by the PARTITION BY columns and ordering by the ORDER BY columns).
;WITH LaggedValues AS
(
SELECT
M.UnitID,
M.Status,
M.DateTime,
LaggedDateTime = LAG(M.DateTime) OVER (PARTITION BY M.UnitID ORDER BY M.DateTime ASC),
LaggedStatus = LAG(M.Status) OVER (PARTITION BY M.UnitID ORDER BY M.DateTime ASC)
FROM
Measures AS M
),
TimeDifferences AS
(
SELECT
T.*,
SecondDifference = CASE
WHEN T.Status = T.LaggedStatus THEN DATEDIFF(SECOND, T.LaggedDateTime, T.DateTime) END
FROM
LaggedValues AS T
),
TotalsByUnitAndStatus AS
(
SELECT
T.UnitID,
T.Status,
SecondDifference = SUM(T.SecondDifference)
FROM
TimeDifferences AS T
GROUP BY
T.UnitID,
T.Status
),
TotalsByUnit AS -- Conditional aggregation (alternative to PIVOT)
(
SELECT
T.UnitID,
StatusA = MAX(CASE WHEN T.Status = 'A' THEN T.SecondDifference END),
StatusB = MAX(CASE WHEN T.Status = 'B' THEN T.SecondDifference END)
FROM
TotalsByUnitAndStatus AS T
GROUP BY
T.UnitID
)
SELECT
T.UnitID,
StatusA = CONVERT(VARCHAR(10), T.StatusA / 60) + ':' + CONVERT(VARCHAR(10), T.StatusA % 60),
StatusB = CONVERT(VARCHAR(10), T.StatusB / 60) + ':' + CONVERT(VARCHAR(10), T.StatusB % 60)
FROM
TotalsByUnit AS T
You can get the difference for each group:
select unitid, status, min(datetime) as mindt, max(datetime) as maxdt, max(value) as maxvalue
from (select t.*,
row_number() over (partition by unitid order by datetime) as seqnum,
row_number() over (partition by unitid, status order by datetime) as seqnum_s
from t
) t
group by unitid, status, (seqnum - seqnum_s);
This solves the "gaps-and-islands" problem. Now you can get the information you want using conditional aggregation:
with islands as (
select unitid, status, min(datetime) as mindt, max(datetime) as maxdt, max(value) as maxvalue
from (select t.*,
row_number() over (partition by unitid order by datetime) as seqnum,
row_number() over (partition by unitid, status order by datetime) as seqnum_s
from t
) t
group by unitid, status, (seqnum - seqnum_s)
)
select unitid,
sum(case when status = 'A' then datediff(minute, mindt, maxdt) end) as a_minutes,
sum(case when status = 'B' then datediff(minute, mindt, maxdt) end) as b_minutes,
max(maxvalue) as maxvalue
from islands
group by unitid;
I'll leave it up to you to convert the minutes back to times.
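The same idea can be followed through to the MM:SS strings with a sketch in Python's sqlite3 module instead of SQL Server. Several things here are assumptions: the table name measures, the column names, reading 01/12/2017 as 2017-12-01, and summing seconds rather than minutes so that no precision is lost. Only unit 101 is loaded, to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measures (unit_id INTEGER, status TEXT, dt TEXT, value INTEGER)"
)
samples = [
    ("A", "00:02:10", 10), ("A", "00:02:40", 25), ("A", "00:03:20", 18),
    ("B", "00:03:55", 30), ("B", "00:04:05", 10), ("B", "00:04:30", 20),
    ("B", "00:04:50", 10), ("A", "00:05:00", 28), ("A", "00:05:50", 18),
    ("A", "00:06:20", 18),
]
conn.executemany(
    "INSERT INTO measures VALUES (101, ?, ?, ?)",
    [(s, "2017-12-01 " + t, v) for s, t, v in samples],
)

# The difference of the two row numbers is constant within each run of
# identical statuses, so it labels the islands.
query = """
WITH islands AS (
    SELECT unit_id, status,
           MIN(dt) AS mindt, MAX(dt) AS maxdt, MAX(value) AS maxvalue
    FROM (SELECT m.*,
                 ROW_NUMBER() OVER (PARTITION BY unit_id ORDER BY dt) AS seqnum,
                 ROW_NUMBER() OVER (PARTITION BY unit_id, status ORDER BY dt) AS seqnum_s
          FROM measures m) AS s
    GROUP BY unit_id, status, seqnum - seqnum_s
)
SELECT unit_id,
       SUM(CASE WHEN status = 'A' THEN CAST(strftime('%s', maxdt) AS INTEGER)
                                     - CAST(strftime('%s', mindt) AS INTEGER) END) AS a_seconds,
       SUM(CASE WHEN status = 'B' THEN CAST(strftime('%s', maxdt) AS INTEGER)
                                     - CAST(strftime('%s', mindt) AS INTEGER) END) AS b_seconds,
       MAX(maxvalue) AS max_value
FROM islands
GROUP BY unit_id
"""
totals = conn.execute(query).fetchone()


def mmss(seconds):
    """Format a number of seconds as zero-padded MM:SS."""
    return f"{int(seconds) // 60:02d}:{int(seconds) % 60:02d}"


unit_id, a_seconds, b_seconds, max_value = totals
print(unit_id, mmss(a_seconds), mmss(b_seconds), max_value)
```

For unit 101 the two A islands last 70 and 80 seconds and the B island 55 seconds, which is where the 02:30 and 00:55 in the requested output come from.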

Oracle SQL overlap between begin date and end date in 2 or more records

Database my_table:
id seq start_date end_date
1 1 01-01-2017 02-01-2017
1 2 07-01-2017 09-01-2017
1 3 11-01-2017 11-01-2017
2 1 20-01-2017 20-01-2017
3 1 01-02-2017 02-02-2017
3 2 03-02-2017 04-02-2017
3 3 08-02-2017 09-02-2017
3 4 09-02-2017 10-02-2017
3 5 10-02-2017 12-02-2017
My requirement is to get the first date (normally seq 1 start date) and end date (normally last seq end date) and the number of dates occurred during all seq for each unique ID.
Date occurred:
id     1           2           3
       01-01-2017  20-01-2017  01-02-2017
       02-01-2017              02-02-2017
       07-01-2017              03-02-2017
       08-01-2017              04-02-2017
       09-01-2017              08-02-2017
       11-01-2017              09-02-2017
                               10-02-2017
                               11-02-2017
                               12-02-2017
total  6           1           9
Here is the result I want:
id start_date end_date num_date
1 01-01-2017 11-01-2017 6
2 20-01-2017 20-01-2017 1
3 01-02-2017 12-02-2017 9
I have tried
SELECT id
, MIN(start_date)
, MAX(end_date)
, SUM(end_date - start_date + 1)
FROM my_table
GROUP BY id
This SQL statement works fine for id 1 and 2, since their date ranges do not overlap. For id 3, however, the resulting num_date is 11. Could you please suggest an SQL statement that solves this problem? Thank you.
One more question: the dates in the database are stored in datetime format. How do I convert them to dates? I tried the TRUNC function, but it sometimes converts the date to the previous day instead.
You need to count how many times an end_date equals the following start_date. For this you need to use the lag() or the lead() analytic function. You can use a case expression for the comparison, but alas you can't wrap the case expression within a COUNT or SUM in the same query; you need a subquery and an outer query.
Something like this; not tested, since you didn't provide CREATE TABLE and INSERT statements to recreate your sample data.
select id, min(start_date) as start_date, max(end_date) as end_date,
sum(end_date - start_date + 1 - flag) as num_days
from ( select id, start_date, end_date,
case when start_date = lag(end_date)
over (partition by id order by end_date) then 1
else 0 end as flag
from my_table
)
group by id;
SELECT id,
MIN( start_date ) AS start_date,
MAX( end_date ) AS end_date,
SUM( end_date - start_date + 1 ) AS num_days
FROM (
SELECT id,
GREATEST(
start_date,
COALESCE(
LAG( end_date ) OVER ( PARTITION BY id ORDER BY seq ) + 1,
start_date
)
) AS start_date,
end_date
FROM your_table
)
WHERE start_date <= end_date
GROUP BY id;
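The clipping approach in the last query can be sketched with Python's sqlite3 module as a stand-in for Oracle. The assumptions: SQLite 3.25 or later, dates stored as YYYY-MM-DD text and converted with julianday(), the two-argument scalar MAX() playing the role of GREATEST, and the start dates of id 3, seq 3 to 5 taken as February, which is what the listed dates and the reported total of 11 imply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, seq INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-01-01", "2017-01-02"),
        (1, 2, "2017-01-07", "2017-01-09"),
        (1, 3, "2017-01-11", "2017-01-11"),
        (2, 1, "2017-01-20", "2017-01-20"),
        (3, 1, "2017-02-01", "2017-02-02"),
        (3, 2, "2017-02-03", "2017-02-04"),
        (3, 3, "2017-02-08", "2017-02-09"),
        (3, 4, "2017-02-09", "2017-02-10"),
        (3, 5, "2017-02-10", "2017-02-12"),
    ],
)

# Step 1: fetch the previous row's end date per id.
# Step 2: push each start date past that end date, so shared days are
#         counted once; rows that end up empty are filtered out.
query = """
WITH prev AS (
    SELECT id, start_date, end_date,
           julianday(start_date) AS s,
           julianday(end_date)   AS e,
           LAG(julianday(end_date)) OVER (PARTITION BY id ORDER BY seq) AS prev_e
    FROM my_table
),
clipped AS (
    SELECT id, start_date, end_date,
           MAX(s, COALESCE(prev_e + 1, s)) AS new_s,
           e
    FROM prev
)
SELECT id,
       MIN(start_date) AS start_date,
       MAX(end_date)   AS end_date,
       CAST(SUM(e - new_s + 1) AS INTEGER) AS num_days
FROM clipped
WHERE new_s <= e
GROUP BY id
ORDER BY id
"""
day_counts = conn.execute(query).fetchall()
for row in day_counts:
    print(row)
```

On the sample data this should give 6, 1 and 9 days. Note that LAG only looks at the immediately preceding row, so a range that is overlapped by a much earlier, longer range would need a running maximum of the previous end dates instead.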