MSSQL - Running sum with reset after gap - sql

I have been trying to solve a problem for a few days now, but I just can't get it solved. Hence my question today.
I would like to calculate the running sum in the following table. My result so far looks like this:
PersonID  Visit_date  Medication_intake  Previous_date  Date_diff  Running_sum
1         2012-04-26  1
1         2012-11-16  1                  2012-04-26     204        204
1         2013-04-11  0
1         2013-07-19  1
1         2013-12-05  1                  2013-07-19     139        343
1         2014-03-18  1                  2013-12-05     103        585
1         2014-06-24  0
2         2014-12-01  1
2         2015-03-09  1                  2014-12-01     98         98
2         2015-09-28  0
This is my desired result. The running sum should only be calculated over contiguous blocks of rows with Medication_intake = 1, and it should restart after each gap.
PersonID  Visit_date  Medication_intake  Previous_date  Date_diff  Running_sum
1         2012-04-26  1
1         2012-11-16  1                  2012-04-26     204        204
1         2013-04-11  0
1         2013-07-19  1
1         2013-12-05  1                  2013-07-19     139        139
1         2014-03-18  1                  2013-12-05     103        242
1         2014-06-24  0
2         2014-12-01  1
2         2015-03-09  1                  2014-12-01     98         98
2         2015-09-28  0
I work with Microsoft SQL Server 2019 Express.
Thank you very much for your tips!

This is a gaps and islands problem, and one approach uses the difference in row numbers method:
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY PersonID
                              ORDER BY Visit_date) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY PersonID, Medication_intake
                              ORDER BY Visit_date) AS rn2
    FROM yourTable
)
SELECT PersonID, Visit_date, Medication_intake, Previous_date, Date_diff,
       CASE WHEN Date_diff IS NOT NULL AND Medication_intake = 1
            THEN SUM(Date_diff) OVER (PARTITION BY PersonID, rn1 - rn2
                                      ORDER BY Visit_date) END AS Running_sum
FROM cte
ORDER BY PersonID, Visit_date;
The CASE expression in the outer query computes the running sum of Date_diff along islands of records that have a Medication_intake value of 1. For all other records, and for records where Date_diff is null, the generated value is simply null.
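To see why `rn1 - rn2` labels the islands, it can help to trace the same logic outside the database. The following is a minimal Python sketch, not T-SQL. The `rows` list copies the sample data from the question, and the function name `running_sum_per_island` is made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question:
# (PersonID, Visit_date, Medication_intake, Date_diff)
rows = [
    (1, "2012-04-26", 1, None),
    (1, "2012-11-16", 1, 204),
    (1, "2013-04-11", 0, None),
    (1, "2013-07-19", 1, None),
    (1, "2013-12-05", 1, 139),
    (1, "2014-03-18", 1, 103),
    (1, "2014-06-24", 0, None),
    (2, "2014-12-01", 1, None),
    (2, "2015-03-09", 1, 98),
    (2, "2015-09-28", 0, None),
]


def running_sum_per_island(rows):
    """Mirror the query: rn1 - rn2 labels an island, then sum inside it."""
    rn1 = defaultdict(int)     # row number per PersonID
    rn2 = defaultdict(int)     # row number per (PersonID, Medication_intake)
    totals = defaultdict(int)  # running total per (PersonID, rn1 - rn2)
    result = []
    for person, visit, intake, diff in sorted(rows, key=lambda r: (r[0], r[1])):
        rn1[person] += 1
        rn2[(person, intake)] += 1
        island = (person, rn1[person] - rn2[(person, intake)])
        totals[island] += diff or 0  # SUM() skips NULLs
        # The CASE expression: only show the sum on intake rows with a diff.
        show = diff is not None and intake == 1
        result.append((person, visit, totals[island] if show else None))
    return result


for row in running_sum_per_island(rows):
    print(row)
```

Within one person, consecutive rows with the same Medication_intake advance both counters by one, so their difference stays constant. As soon as a row with the other value appears, only `rn1` advances and the difference changes, which starts a new island.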

Related

Joins and/or Sub queries or Ranking functions

I have a table as follows:
Order_ID  Ship_num  Item_code  Qty_to_pick  Qty_picked  Pick_date
1111      1         1          3000         0           Null
1111      1         2          2995         1965        2021-05-12
1111      2         1          3000         3000        2021-06-24
1111      2         2          1030         0           Null
1111      3         2          1030         1030        2021-08-23
2222      1         3          270          62          2021-03-18
2222      1         4          432          0           Null
2222      2         3          208          0           Null
2222      2         4          432          200         2021-05-21
2222      3         3          208          208         2021-08-23
2222      3         4          232          200         2021-08-25
From this table, I only want to show the rows that have the latest ship_num for each order, not the latest pick_date (I was directed to a question that returned the rows with the latest entry time; that is not what I am looking for). That is, I want the following:
Order_ID  Ship_num  Item_code  Qty_to_pick  Qty_picked  Pick_date
1111      3         2          1030         1030        2021-08-23
2222      3         3          208          208         2021-08-23
2222      3         4          232          200         2021-08-25
I tried the following query,
select order_id, max(ship_num), item_code, qty_to_pick, qty_picked, pick_date
from table1
group by order_id, item_code, qty_to_pick, qty_picked, pick_date
Any help would be appreciated.
Thanks in advance.
Using max(ship_num) is a good idea, but you should use the analytic version (with an OVER clause).
select *
from
(
    select t.*, max(ship_num) over (partition by order_id) as orders_max_ship_num
    from table1 t
) with_max
where ship_num = orders_max_ship_num
order by order_id, item_code;
You can get this using the DENSE_RANK().
Query
;with cte as (
select rnk = dense_rank()
over (Partition by order_id order by ship_num desc)
, *
from table_name
)
Select *
from cte
Where rnk =1;
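The attempted GROUP BY query lists every other column in the GROUP BY, so almost every row forms its own group and max(ship_num) just returns that row's own value. Both answers instead compute the maximum per order and then filter on it. Below is a rough Python sketch of that two-step idea, assuming the sample rows from the question; the function name `latest_shipment_rows` is invented for this example.

```python
# Sample rows from the question:
# (Order_ID, Ship_num, Item_code, Qty_to_pick, Qty_picked, Pick_date)
rows = [
    (1111, 1, 1, 3000, 0, None),
    (1111, 1, 2, 2995, 1965, "2021-05-12"),
    (1111, 2, 1, 3000, 3000, "2021-06-24"),
    (1111, 2, 2, 1030, 0, None),
    (1111, 3, 2, 1030, 1030, "2021-08-23"),
    (2222, 1, 3, 270, 62, "2021-03-18"),
    (2222, 1, 4, 432, 0, None),
    (2222, 2, 3, 208, 0, None),
    (2222, 2, 4, 432, 200, "2021-05-21"),
    (2222, 3, 3, 208, 208, "2021-08-23"),
    (2222, 3, 4, 232, 200, "2021-08-25"),
]


def latest_shipment_rows(rows):
    """Keep only the rows whose Ship_num is the highest for their order."""
    # Step 1: MAX(ship_num) OVER (PARTITION BY order_id)
    max_ship = {}
    for order_id, ship_num, *_rest in rows:
        max_ship[order_id] = max(ship_num, max_ship.get(order_id, ship_num))
    # Step 2: WHERE ship_num = orders_max_ship_num
    return [r for r in rows if r[1] == max_ship[r[0]]]


for row in latest_shipment_rows(rows):
    print(row)
```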

SQL - get device continous uptime

Device uptime time series table
There is a device monitor table recording if a device is up (STATE 1) or down for each day.
DEVICE_ID, STATE, DATE
1 0 2017-10-09
1 1 2017-10-10
1 1 2017-10-11
1 1 2017-10-12
1 0 2017-10-13
1 1 2017-10-14
1 1 2017-10-15
1 0 2017-10-16
1 1 2017-10-17
1 0 2017-10-18
...
2 0 2017-10-10
...
Question
How can I get the number of days during which each device was continuously up? Device 1 went up on 2017-10-10 and went down on 2017-10-13, hence it was up for 3 days (10, 11, 12). It was then up for 2 days, from 2017-10-14 to 2017-10-15.
The expected result should look like below.
DEVICE_ID, STATE, DATE
1 3 2017-10-10
1 2 2017-10-14
1 1 2017-10-17
Please advise.
This is a gaps-and-islands problem. You can solve this version with the difference of row numbers:
select device_id, min(date), max(date), count(*) as num_days
from (select t.*,
row_number() over (partition by device_id order by date) as seqnum,
row_number() over (partition by device_id, state order by date) as seqnum_2
from t
) t
where state = 1
group by device_id, (seqnum - seqnum_2), state;
Why this works is a little tricky to explain. If you stare at the results of the subquery, you will see how the difference between the two row number values defines the adjacent values that you want.
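Since the explanation is easier to follow with the intermediate values in front of you, here is a small Python sketch that reproduces the subquery for the device 1 rows shown in the question and prints the difference of the two row numbers. It is an illustration only, not the SQL itself, and the function names `label_islands` and `uptime_periods` are made up.

```python
from collections import defaultdict
from itertools import groupby

# Device 1 rows from the question: (DEVICE_ID, STATE, DATE)
rows = [
    (1, 0, "2017-10-09"),
    (1, 1, "2017-10-10"),
    (1, 1, "2017-10-11"),
    (1, 1, "2017-10-12"),
    (1, 0, "2017-10-13"),
    (1, 1, "2017-10-14"),
    (1, 1, "2017-10-15"),
    (1, 0, "2017-10-16"),
    (1, 1, "2017-10-17"),
    (1, 0, "2017-10-18"),
]


def label_islands(rows):
    """Return (device, state, date, seqnum - seqnum_2) like the subquery."""
    seqnum = defaultdict(int)    # row number per device
    seqnum_2 = defaultdict(int)  # row number per (device, state)
    labelled = []
    for device, state, day in sorted(rows, key=lambda r: (r[0], r[2])):
        seqnum[device] += 1
        seqnum_2[(device, state)] += 1
        diff = seqnum[device] - seqnum_2[(device, state)]
        labelled.append((device, state, day, diff))
    return labelled


def uptime_periods(rows):
    """Group the STATE = 1 rows by (device, difference) and count days."""
    up = [r for r in label_islands(rows) if r[1] == 1]
    periods = []
    for (device, _diff), grp in groupby(up, key=lambda r: (r[0], r[3])):
        days = [r[2] for r in grp]
        periods.append((device, days[0], days[-1], len(days)))
    return periods


for row in label_islands(rows):
    print(row)
print(uptime_periods(rows))
```

In the printed rows, the three up days 10 to 12 all get the difference 1, the two days 14 to 15 get 2, and the single day 17 gets 3. Each down day in between bumps only the per-device counter, which is what shifts the difference for the next run of up days.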

SQL sum by month with the previous values

I have following data:
cohort activity counter
-----------------------------
2010-12 0 470
2010-12 1 2
2010-12 2 1
2010-12 3 1
2010-12 6 1
2011-01 0 550
2011-01 1 1
2011-01 6 1
I want to sum counter of different activities by month, so the final table looks like:
cohort activity counter sumResult
-------------------------------------------
2010-12 0 470 470
2010-12 1 2 472
2010-12 2 1 473
2010-12 3 1 474
2010-12 6 1 475
2011-01 0 550 550
2011-01 1 1 551
2011-01 6 1 552
I've tried to do it like this:
select
a.activity, a.counter, a.cohort,
(
select sum(b.counter)
from data_table as b
where b.cohort = a.cohort and b.counter >= a.counter
) as sumResult
from data_table as a;
GO;
but it gave me strange results as:
cohort activity counter sumResult
-------------------------------------------
2010-12 0 470 470
2010-12 1 2 472
2010-12 2 1 475
2010-12 3 1 475
2010-12 6 1 475
2011-01 0 550 550
2011-01 1 1 552
2011-01 6 1 552
What could be a problem?
It depends on your RDBMS. Some of them (SQL Server, Oracle, PostgreSQL) accept SUM() OVER():
SELECT t.*,
SUM(t.counter) OVER(PARTITION BY t.cohort ORDER BY t.activity) as sumResult
FROM YourTable t
If you are using a different one, it is a bit more complicated and can be handled with joins.
The normal way to do this uses the ANSI standard cumulative sum function:
select dt.*,
sum(dt.counter) over (partition by dt.cohort order by dt.counter desc)
from data_table dt
order by cohort, counter desc;
If you want to use a subquery, then you need a stable sort, and activity can give you one. You can use this in the cumulative sum syntax:
select dt.*,
sum(dt.counter) over (partition by dt.cohort order by dt.counter desc, dt.activity)
from data_table dt
order by cohort, counter desc, activity;
Or using a subquery:
select dt.*,
(select sum(dt2.counter)
from data_table dt2
where dt2.cohort = dt.cohort and
(dt2.counter > dt.counter or
dt2.counter = dt.counter and dt2.activity <= dt.activity)
)
from data_table dt
order by cohort, counter desc, activity;
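The "strange results" in the question come from ties: several rows share counter = 1, so the condition `b.counter >= a.counter` pulls all of them into each other's sum. A short Python sketch, using the sample data from the question and two illustrative function names of my own, shows the difference a tie-breaker on activity makes.

```python
# Sample rows from the question: (cohort, activity, counter)
rows = [
    ("2010-12", 0, 470),
    ("2010-12", 1, 2),
    ("2010-12", 2, 1),
    ("2010-12", 3, 1),
    ("2010-12", 6, 1),
    ("2011-01", 0, 550),
    ("2011-01", 1, 1),
    ("2011-01", 6, 1),
]


def cumulative_by_counter(rows):
    """The asker's subquery: every row with counter >= the current one."""
    return [
        sum(c2 for co2, _a2, c2 in rows if co2 == co and c2 >= c)
        for co, _a, c in rows
    ]


def cumulative_with_tiebreak(rows):
    """Order by counter desc, then activity, so tied rows are separated."""
    return [
        sum(
            c2
            for co2, a2, c2 in rows
            if co2 == co and (c2 > c or (c2 == c and a2 <= a))
        )
        for co, a, c in rows
    ]


print(cumulative_by_counter(rows))     # ties all receive the same total
print(cumulative_with_tiebreak(rows))  # one step per row
```

The first function reproduces the asker's output, and the second one reproduces the desired sumResult column.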

How many Days each item was in each State, the full value of the period

This post is really similar to my question:
SQL Server : how many days each item was in each state
but I don't have the Revision column to see which is the previous state, and I also want to get the full time of a status, I b
....
I want to get how long an item has been in each status overall. My table looks like this:
ID DATE STATUS
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 2016-04-05 11:30:00.000 1
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 2016-04-08 11:30:00.000 13
274C5DA9-9C38-4A54-A697-009933BB7B7F 2016-04-29 08:00:00.000 5
274C5DA9-9C38-4A54-A697-009933BB7B7F 2016-05-04 08:00:00.000 4
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-14 07:50:00.000 1
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-21 14:00:00.000 2
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-23 12:15:00.000 3
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2016-04-23 16:15:00.000 1
BF122AE1-CB39-4967-8F37-012DC55E92A7 2016-04-05 10:30:00.000 1
BF122AE1-CB39-4967-8F37-012DC55E92A7 2016-04-20 17:00:00.000 5
I want to get this
Column 1: ID
Column 2: Status
Column 3: Time with the status = NextDate - PreviousDate + 1
If it is the last status, it counts as 1.
If there is more than one status on the same day, I take the last one (you could say that only the last status of the day matters).
Each status must appear only once per ID.
It should look like this:
ID STATUS TIME
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 1 3
3D56B7B1-FCB3-4897-BAEB-004796E0DC8D 13 1
274C5DA9-9C38-4A54-A697-009933BB7B7F 5 5
274C5DA9-9C38-4A54-A697-009933BB7B7F 4 1
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 1 8
A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2 2 2
BF122AE1-CB39-4967-8F37-012DC55E92A7 1 15
BF122AE1-CB39-4967-8F37-012DC55E92A7 5 1
Thanks to @ConradFrix's comments, this is how it works:
WITH CTE
AS
(
SELECT
ID,
STATUS,
DATE,
LEAD(DATE, 1) over (partition by ID order by DATE) LEAD,
ISNULL(DATEDIFF(DAYOFYEAR, DATE,
LEAD(DATE, 1) over (partition by ID order by DATE)), 1) DIF_BY_LEAD
FROM TABLE_NAME
)
SELECT ID, STATUS, SUM(DIF_BY_LEAD) AS TIME_STATUS
FROM CTE GROUP BY ID, STATUS
ORDER BY ID, STATUS
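To check what the LEAD-based query produces on the sample data, here is a hedged Python sketch of the same steps: look at the next row per ID, take the difference in calendar days, fall back to 1 for the last row, then sum per (ID, STATUS). It assumes DATEDIFF with DAYOFYEAR counts crossed day boundaries and ignores the time part; the function name `days_per_status` is invented for this example.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (ID, DATE, STATUS)
rows = [
    ("3D56B7B1-FCB3-4897-BAEB-004796E0DC8D", "2016-04-05 11:30:00.000", 1),
    ("3D56B7B1-FCB3-4897-BAEB-004796E0DC8D", "2016-04-08 11:30:00.000", 13),
    ("274C5DA9-9C38-4A54-A697-009933BB7B7F", "2016-04-29 08:00:00.000", 5),
    ("274C5DA9-9C38-4A54-A697-009933BB7B7F", "2016-05-04 08:00:00.000", 4),
    ("A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2", "2016-04-14 07:50:00.000", 1),
    ("A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2", "2016-04-21 14:00:00.000", 2),
    ("A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2", "2016-04-23 12:15:00.000", 3),
    ("A70A66DC-9D9E-49BE-93CF-00F9E3E06CE2", "2016-04-23 16:15:00.000", 1),
    ("BF122AE1-CB39-4967-8F37-012DC55E92A7", "2016-04-05 10:30:00.000", 1),
    ("BF122AE1-CB39-4967-8F37-012DC55E92A7", "2016-04-20 17:00:00.000", 5),
]


def days_per_status(rows):
    """Sum, per (ID, STATUS), the days until the next row of the same ID."""
    by_id = defaultdict(list)
    for item_id, stamp, status in rows:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        by_id[item_id].append((parsed, status))
    totals = defaultdict(int)
    for item_id, events in by_id.items():
        events.sort()  # ORDER BY DATE within the partition
        for i, (stamp, status) in enumerate(events):
            if i + 1 < len(events):
                # LEAD(DATE): compare calendar dates only, like DATEDIFF
                days = (events[i + 1][0].date() - stamp.date()).days
            else:
                days = 1  # ISNULL(..., 1) for the last row of each ID
            totals[(item_id, status)] += days
    return dict(totals)


for key, days in days_per_status(rows).items():
    print(key, days)
```

On this data the sketch matches the expected TIME values. It also yields an extra entry for status 3 of the third ID with 0 days, because that status was replaced on the same day; the expected output in the question leaves that row out.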

List the last two records for each id

Good Afternoon!
I'm having trouble listing the last two records for each idmicro.
Ex:
idhist idmicro idother room unit Dtmov
100 1102 0 8 coa 2009-10-23 10:40:00.000
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-23 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
105 1201 0 4 dimel 2008-10-22 10:40:00.000
The result would look like this:
idhist idmicro idoutros room unit Dtmov
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-22 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
I'm just starting to delve into SQL and am having trouble finding a solution for this.
Sorry, and thank you.
EDIT: I am using SQL Server, and I have not written a query yet.
Yes, it is based on the date and time.
You can do the same thing with a nested SELECT statement.
SELECT *
FROM (
SELECT row_number() OVER (
PARTITION BY idmicro ORDER BY idhist
) AS ind
,*
FROM data
) AS initialResultSet
WHERE initialResultSet.ind < 3
Here is a sample SQLFiddle with how this query works.
WITH etc
AS (
SELECT *
,row_number() OVER (
PARTITION BY idmicro ORDER BY idhist
) AS r
,count(*) OVER (
PARTITION BY idmicro
) AS c
FROM data
)
SELECT *
FROM etc
WHERE r > c - 2
Use row_number and over partition
SELECT *
FROM (
SELECT *, row_number() OVER (PARTITION BY idmicro ORDER BY idhist desc) AS rownum
FROM data
) AS initialResultSet
WHERE initialResultSet.rownum<=2
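As a rough illustration of what the descending row_number filter does, the Python sketch below ranks the sample rows per idmicro by idhist descending and keeps the first two of each group. It follows the answers in ranking by idhist; if "last" should mean the latest Dtmov instead, the sort key would need to change. The function name `last_two_per_idmicro` is made up, and only three of the columns are carried along.

```python
from itertools import groupby

# Sample rows from the question, reduced to (idhist, idmicro, Dtmov)
rows = [
    (100, 1102, "2009-10-23 10:40:00.000"),
    (101, 1102, "2009-10-28 10:40:00.000"),
    (102, 1102, "2008-10-24 10:40:00.000"),
    (103, 1201, "2008-10-23 10:40:00.000"),
    (104, 1201, "2009-10-21 10:40:00.000"),
    (105, 1201, "2008-10-22 10:40:00.000"),
]


def last_two_per_idmicro(rows):
    """row_number() OVER (PARTITION BY idmicro ORDER BY idhist DESC) <= 2."""
    # Sort by idmicro, then by idhist descending within each idmicro.
    ranked = sorted(rows, key=lambda r: (r[1], -r[0]))
    keep = []
    for _idmicro, grp in groupby(ranked, key=lambda r: r[1]):
        keep.extend(list(grp)[:2])  # rownum 1 and 2 of the partition
    return keep


for row in last_two_per_idmicro(rows):
    print(row)
```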