Reset cumulative sum column after threshold with groups

I need to calculate a cumulative sum per group (column GroupNr) that resets after it exceeds a threshold, 330 in this example.
Can this be done using a function or a CTE? If so, how?
Current Table
GroupNr Name Sum Cumsum
1 Mary 0.00 0.00
1 Jane 179.00 179.00
1 Tom 106.00 285.00
1 Joseph 175.00 460.00
1 Arthur 253.00 713.00
2 Mary 0.00 0.00
2 Jane 365.00 365.00
2 Tom 365.00 730.00
2 Joseph 365.00 1095.00
2 Arthur 365.00 1460.00
Expected Table
GroupNr Name Sum Cumsum Resetcumsum
1 Mary 0.00 0.00 0.00
1 Jane 179.00 179.00 179.00
1 Tom 106.00 285.00 285.00
1 Joseph 175.00 460.00 460.00 -- Reset point
1 Arthur 253.00 713.00 253.00
2 Mary 0.00 0.00 0.00
2 Jane 365.00 365.00 365.00
2 Tom 365.00 730.00 365.00
2 Joseph 365.00 1095.00 365.00
2 Arthur 365.00 1460.00 365.00
Code for tables
CREATE TABLE Table1 (
GroupNr int,
Name varchar(7),
Sum numeric(14, 2),
Cumsum numeric(14, 2)
)
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (1, 'Mary', 0, 0);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (1, 'Jane', 179, 179);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (1, 'Tom', 106, 285);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (1, 'Joseph', 175, 460);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (1, 'Arthur', 253, 713);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (2, 'Mary', 0, 0);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (2, 'Jane', 365, 365);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (2, 'Tom', 365, 730);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (2, 'Joseph', 365, 1095);
INSERT INTO Table1 (GroupNr, Name, Sum, Cumsum)
VALUES (2, 'Arthur', 365, 1460);

A plain SUM() OVER() cannot do this, because whether the total resets at a given row depends on the running total computed for the previous row. One way to get the result is a recursive CTE:
WITH cte_r AS (
SELECT t.*, ROW_NUMBER() OVER(PARTITION BY GroupNr ORDER BY (SELECT 1)) AS rn
FROM Table1 t
), cte AS (
SELECT GroupNr, Name, [Sum], [CumSum],
CAST([Sum] AS NUMERIC(14, 2)) AS ResetCumSum, -- same type as [Sum], so decimals are kept
rn
FROM cte_r
WHERE rn = 1
UNION ALL
SELECT cte_r.GroupNr, cte_r.Name, cte_r.[Sum], cte_r.[CumSum],
CAST(CASE WHEN cte.ResetCumSum >= 330 THEN 0 ELSE cte.ResetCumSum END + cte_r.[Sum] AS NUMERIC(14, 2))
AS ResetCumSum,
cte_r.rn
FROM cte
JOIN cte_r
ON cte.rn = cte_r.rn-1
AND cte.GroupNr = cte_r.GroupNr
)
SELECT GroupNr, Name, [Sum], [CumSum], ResetCumSum
FROM cte
ORDER BY GroupNr, rn;
Output:
db<>fiddle demo
Warning: A table is by design an unordered set, so a stable result requires an ordering column (such as a unique id or a timestamp). Here ROW_NUMBER() OVER(PARTITION BY GroupNr ORDER BY (SELECT 1)) AS rn was used to emulate insert order, but it is not stable.
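The rule the recursive CTE applies row by row can be sketched procedurally. This is a minimal illustration in Python rather than T-SQL, assuming the rows are already in the intended order; the function name reset_cumsum and the tuple layout are made up for the example.

```python
def reset_cumsum(rows, threshold=330):
    """rows: iterable of (group_nr, name, amount), already in the desired order."""
    out = []
    prev_group = None
    running = 0
    for group_nr, name, amount in rows:
        # Restart from zero on a new group, or once the previous
        # running total has reached the threshold (same >= test as the CTE).
        if group_nr != prev_group or running >= threshold:
            running = 0
        running += amount
        out.append((group_nr, name, running))
        prev_group = group_nr
    return out


rows = [
    (1, "Mary", 0), (1, "Jane", 179), (1, "Tom", 106),
    (1, "Joseph", 175), (1, "Arthur", 253),
    (2, "Mary", 0), (2, "Jane", 365), (2, "Tom", 365),
    (2, "Joseph", 365), (2, "Arthur", 365),
]
result = [total for _, _, total in reset_cumsum(rows)]
print(result)  # [0, 179, 285, 460, 253, 0, 365, 365, 365, 365]
```

Each value depends on the value computed for the previous row, which is why a single windowed SUM cannot express it.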
Related:
Conditional SUM and the same using MATCH_RECOGNIZE - in my opinion the cleanest way
Extra:
Quirky UPDATE: Running Total until specific condition is true
Disclaimer: "DO NOT USE IT AT PRODUCTION!!!"
-- source table to be extended with id and Resetcumsum columns
CREATE CLUSTERED INDEX IX_ROW_NUM ON Table1(GroupNr, id);
DECLARE @running_total NUMERIC(14,2) = 0
,@prev_running_total NUMERIC(14,2) = 0
,@prev_GroupNr INT = 0;
UPDATE Table1
SET
@prev_running_total = @running_total
,@running_total = Resetcumsum = IIF(@prev_GroupNr != GroupNr
OR @running_total >= 330, 0, @running_total)
+ [Sum]
,@prev_GroupNr = GroupNr
FROM Table1 WITH(INDEX(IX_ROW_NUM))
OPTION (MAXDOP 1);
SELECT *
FROM Table1
ORDER BY id;
db<>fiddle demo - 2

Related

excluding dups which are lower than max values in SQL

I have the following simple table (Table1), where each row holds a student's ID (student_id), their name, and one win (wins); each student has one or more rows. I would like to output student_id, student_name and the count of wins, sorted by count of wins (descending) and then student_id (ascending), excluding students who share the same count of wins when that count is less than the maximum count (here 5). In other words, Lizzy and Mark have the same count of wins, and 3 is lower than 5, so the output should exclude both of them.
From comments: "Betty, David and Cathy should be excluded", also.
Table1:
student_id student_name wins
1 John YES
1 John YES
1 John YES
1 John YES
1 John YES
2 Brandon YES
2 Brandon YES
2 Brandon YES
2 Brandon YES
2 Brandon YES
3 Lizzy YES
3 Lizzy YES
3 Lizzy YES
4 Mark YES
4 Mark YES
4 Mark YES
5 Betty YES
6 David YES
7 Cathy YES
8 Joe YES
8 Joe YES
Desired output:
student_id student_name cnt_wins
1 John 5
2 Brandon 5
8 Joe 2
Here is my SQL in Oracle. I can't figure out what went wrong. The log says "(SELECT b.cnt_wins, count(b.student_id) has too many values".
WITH st_cte AS
(SELECT student_id, student_name, count(wins) cnt_wins
FROM Table1
GROUP BY student_id, student_name
ORDER BY count(wins) DESC, student_id)
SELECT *
FROM st_cte a
WHERE a.cnt_wins not in
(SELECT b.cnt_wins, count(b.student_id)
FROM st_cte b
WHERE b.cnt_wins <
(SELECT max(c.cnt_wins) FROM st_cte c)
GROUP BY b.cnt_wins
HAVING count(b.student_id) > 1);

There are too many values selected inside the 'in' select:
WHERE a.cnt_wins -- 1 value
not in
(SELECT b.cnt_wins, count(b.student_id) -- 2 values
FROM st_cte b
You should either select a single value:
WHERE a.cnt_wins not in
(SELECT b.cnt_wins
FROM st_cte ...
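or compare a pair of values on the left as well (some_count is a placeholder for a column or scalar expression; an aggregate such as count(...) is not allowed directly in WHERE):
WHERE (a.cnt_wins, some_count) not in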
(SELECT b.cnt_wins, count(b.student_id)
FROM st_cte ...

Updated based on updated requirements...
The requirement was ambiguous in that Betty, David, and Cathy seem to also meet the criteria to be removed from the result. This requirement was clarified and those rows should have been removed.
Logic has been added to allow only all max_cnt rows, plus any students with a unique count.
Also note that if wins can be any other non-null value, COUNT(wins) is not correct.
Given all that, maybe something like this is a starting point:
Fiddle
WITH cte AS (
SELECT student_id, student_name
, COUNT(wins) cnt_wins
, MAX(COUNT(wins)) OVER () AS max_cnt
FROM Table1
GROUP BY student_id, student_name
)
, cte2 AS (
SELECT cte.*
, COUNT(*) OVER (PARTITION BY cnt_wins) AS cnt_students
FROM cte
)
SELECT student_id, student_name, cnt_wins
FROM cte2
WHERE max_cnt = cnt_wins
OR cnt_students = 1
ORDER BY cnt_wins DESC, student_id
;
and to handle wins that can be other non-null values:
WITH cte AS (
SELECT student_id, student_name
, COUNT(CASE WHEN wins = 'YES' THEN 1 END) cnt_wins
, MAX(COUNT(CASE WHEN wins = 'YES' THEN 1 END)) OVER () AS max_cnt
FROM Table1
GROUP BY student_id, student_name
)
, cte2 AS (
SELECT cte.*
, COUNT(*) OVER (PARTITION BY cnt_wins) AS cnt_students
FROM cte
)
SELECT student_id, student_name, cnt_wins
FROM cte2
WHERE max_cnt = cnt_wins
OR cnt_students = 1
ORDER BY cnt_wins DESC, student_id
;
Result (with data to test the new requirement, one student (Joe) with unique counts (2)):
STUDENT_ID STUDENT_NAME CNT_WINS
1 John 5
2 Brandon 5
8 Joe 2
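The filter in the two CTEs (keep every student at the maximum count, plus any student whose count is shared with nobody) can be checked against the sample data with a small Python sketch. This is only an illustration of the logic, not Oracle code; filter_students and the input layout (one tuple per win) are assumptions made for the example.

```python
from collections import Counter


def filter_students(wins):
    """wins: list of (student_id, student_name), one tuple per win row."""
    per_student = Counter(wins)                # cnt_wins per student
    per_count = Counter(per_student.values())  # cnt_students per cnt_wins
    max_cnt = max(per_student.values())        # max_cnt over all students
    kept = [
        (sid, name, cnt)
        for (sid, name), cnt in per_student.items()
        if cnt == max_cnt or per_count[cnt] == 1
    ]
    # ORDER BY cnt_wins DESC, student_id
    return sorted(kept, key=lambda r: (-r[2], r[0]))


wins = (
    [(1, "John")] * 5 + [(2, "Brandon")] * 5
    + [(3, "Lizzy")] * 3 + [(4, "Mark")] * 3
    + [(5, "Betty"), (6, "David"), (7, "Cathy")]
    + [(8, "Joe")] * 2
)
result = filter_students(wins)
print(result)  # [(1, 'John', 5), (2, 'Brandon', 5), (8, 'Joe', 2)]
```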
Setup:
CREATE TABLE table1 (
Student_ID int
, Student_Name VARCHAR2(20)
, Wins VARCHAR2(10)
);
BEGIN
-- Assume only wins are stored.
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 5, 'Betty', 'YES');
INSERT INTO table1 VALUES ( 6, 'David', 'YES');
INSERT INTO table1 VALUES ( 7, 'Cathy', 'YES');
INSERT INTO table1 VALUES ( 8, 'Joe', 'YES');
INSERT INTO table1 VALUES ( 8, 'Joe', 'YES');
END;
/
Correction to the original query in the question:
WITH st_cte AS
(SELECT student_id, student_name, count(wins) cnt_wins
FROM Table1
GROUP BY student_id, student_name
ORDER BY count(wins) DESC, student_id
)
SELECT *
FROM st_cte a
WHERE a.cnt_wins not in
(SELECT b.cnt_wins
FROM st_cte b
WHERE b.cnt_wins < (SELECT max(c.cnt_wins) FROM st_cte c)
GROUP BY b.cnt_wins
HAVING count(b.student_id) > 1
)
;

Accumulating previous rows with grouping

I have this table on MS SQL Server
Customer Month Amount
-----------------------------
Tom 1 10
Kate 1 60
Ali 1 70
Tom 2 50
Kate 2 40
Tom 3 80
Ali 3 20
I want the select to get accumulation of the customer for each month
Customer Month Amount
-----------------------------
Tom 1 10
Kate 1 60
Ali 1 70
Tom 2 60
Kate 2 100
Ali 2 70
Tom 3 140
Kate 3 100
Ali 3 90
Note that Ali has no data for month 2 and Kate has no data for month 3.
My query below accumulates correctly, but it returns no row for a customer's missing month.
Kate should appear in month 3 with amount 100, and Ali in month 2 with amount 70.
declare @myTable as TABLE (Customer varchar(50), Month int, Amount int)
;
INSERT INTO @myTable
(Customer, Month, Amount)
VALUES
('Tom', 1, 10),
('Kate', 1, 60),
('Ali', 1, 70),
('Tom', 2, 50),
('Kate', 2, 40),
('Tom', 3, 80),
('Ali', 3, 20);
select * from @myTable
select
SUM(b.Amount),a.Customer, a.Month
from
@myTable a
inner join
@myTable b
on a.Customer = b.Customer and
a.Month >= b.Month
group by
a.Customer, a.Month

Use a window function:
select Customer, Month,
sum(Amount) over (partition by Customer order by Month) Amount
from myTable t
To include the months in which a customer has no rows, you also need a lookup table that pairs every customer with every possible month:
with cte as
(
select * from (
select Customer from myTable
group by Customer)c
cross join (values (1),(2),(3))a(Months)
) -- look-up table
select c.Customer, c.Months,
sum(t.Amount) over (partition by c.Customer order by c.Months) Amount
from cte c left join myTable t
on t.Month = c.Months and t.Customer = c.Customer
Result :
Customer Months Amount
Tom 1 10
Kate 1 60
Ali 1 70
Tom 2 60
Ali 2 70
Kate 2 100
Ali 3 90
Kate 3 100
Tom 3 140
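The two steps in this answer (pair every customer with every month, then accumulate per customer) can be sketched in Python to see why the missing months get carried-forward totals. This is an illustration only; cumulative_by_month and its parameters are names invented for the example.

```python
from itertools import accumulate


def cumulative_by_month(rows, months):
    """rows: iterable of (customer, month, amount); months: ordered list of months."""
    amounts = {}
    for customer, month, amount in rows:
        key = (customer, month)
        amounts[key] = amounts.get(key, 0) + amount
    out = []
    for customer in sorted({c for c, _, _ in rows}):
        # The "cross join": one slot per month, 0 where the customer has no row.
        monthly = [amounts.get((customer, m), 0) for m in months]
        # The windowed SUM ... ORDER BY month: a running total per customer.
        for month, total in zip(months, accumulate(monthly)):
            out.append((customer, month, total))
    return out


rows = [
    ("Tom", 1, 10), ("Kate", 1, 60), ("Ali", 1, 70),
    ("Tom", 2, 50), ("Kate", 2, 40),
    ("Tom", 3, 80), ("Ali", 3, 20),
]
result = cumulative_by_month(rows, months=[1, 2, 3])
for line in result:
    print(line)
```

For the sample data this yields 70, 70, 90 for Ali, 60, 100, 100 for Kate and 10, 60, 140 for Tom, matching the result table above.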

with cte as
(select *
from
(select distinct customer from myTable ) c
cross join ( values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) t(month))
select cte.customer, cte.month,
sum(myTable.amount) over (partition by cte.customer order by cte.month) as cumamount
from cte left join myTable
on cte.customer = myTable.customer and cte.month = myTable.month
order by cte.month, cte.customer desc

To be clear (the result has both Amount and AmountSum):
DECLARE @myTable TABLE(Customer varchar(50), Month int, Amount int);
INSERT INTO @myTable(Customer, Month, Amount)
VALUES
('Tom', 1, 10),
('Kate', 1, 60),
('Ali', 1, 70),
('Tom', 2, 50),
('Kate', 2, 40),
('Tom', 3, 80),
('Ali', 3, 20);
DECLARE @FullTable TABLE(Customer varchar(50), Month int, Amount int);
INSERT INTO @FullTable(Customer, Month, Amount)
SELECT c.Customer, m.Month, ISNULL(mt.Amount, 0)
FROM (SELECT DISTINCT [Month] FROM @myTable) AS m
CROSS JOIN (SELECT DISTINCT Customer FROM @myTable) AS c
LEFT JOIN @myTable AS mt ON m.Month = mt.Month AND c.Customer = mt.Customer
SELECT t1.Customer, t1.Month, t1.Amount, (t1.Amount + ISNULL(t2.sm, 0)) AS AmountSum
FROM @FullTable AS t1
CROSS APPLY (SELECT SUM(Amount) AS sm FROM @FullTable AS t WHERE t.Customer = t1.Customer AND t.Month < t1.Month) AS t2
ORDER BY Month, Customer

Do you want the cumulative amount for every month for each customer, whether or not the customer has a transaction in that month?
In the following script, if you have a customer table, you can join to it instead of using (SELECT DISTINCT Customer FROM @myTable).
declare @myTable as TABLE (Customer varchar(50), Month int, Amount int);
INSERT INTO @myTable(Customer, Month, Amount)
VALUES
('Tom', 1, 10),
('Kate', 1, 60),
('Ali', 1, 70),
('Tom', 2, 50),
('Kate', 2, 40),
('Tom', 3, 80),
('Ali', 3, 20),
('Jack', 3, 90);
SELECT c.Customer,sv.number AS Month ,SUM(CASE WHEN t.Month<=sv.number THEN t.Amount ELSE 0 END ) AS Amount
FROM master.dbo.spt_values AS sv
INNER JOIN (SELECT DISTINCT Customer FROM @myTable) AS c ON 1=1
LEFT JOIN @myTable AS t ON t.Customer=c.Customer
WHERE sv.type='P' AND sv.number BETWEEN 1 AND MONTH(GETDATE())
GROUP BY sv.number,c.Customer
ORDER BY c.Customer,sv.number
----------
Customer Month Amount
-------------------------------------------------- ----------- -----------
Ali 1 70
Ali 2 70
Ali 3 90
Jack 1 0
Jack 2 0
Jack 3 90
Kate 1 60
Kate 2 100
Kate 3 100
Tom 1 10
Tom 2 60
Tom 3 140

Try this; the table name is "a".
It uses a combination of a CTE and a subquery. Tested on SQL Server 2008 R2.
with cte as
(
select * from (
select Customer from a
group by Customer)c
cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9), (10),(11),(12))a(Months)
)
select Customer,Months,
(select SUM(total) from
(select customer , month , sum(amount)as total from a group by customer,
month) as GroupedTable
where GroupedTable.customer= cte.customer and GroupedTable.month<= cte.Months) as total
from cte
Group by Customer,Months
order by Customer,Months

try this:
create table #tmp (Customer VARCHAR(10), [month] INT ,Amount INT)
INSERT INTO #tmp
SELECT 'Tom',1,10
union all
SELECT 'Kate',1,60
union all
SELECT 'Ali',1,70
union all
SELECT 'Tom',2,50
union all
SELECT 'Kate',2,40
union all
SELECT 'Tom',3,80
union all
SELECT 'Ali',3,20
;WITH cte1 AS (
SELECT [month], ROW_NUMBER() OVER(order by [month] desc) rn
FROM (SELECT DISTINCT [month] as [month] FROM #tmp) a
)
, cte2 AS (
SELECT customer, ROW_NUMBER() OVER(order by customer desc) rn
FROM (SELECT DISTINCT customer as customer FROM #tmp) b
)
SELECT t2.Customer,t2.[month],ISNULL(t1.Amount,0) As Amount
into #tmp2
from #tmp t1
RIGHT JOIN
(select [month],customer from cte1
cross apply
cte2) t2 ON t1.customer=t2.customer and t1.[month]=t2.[month]
order by t2.[month]
SELECT Customer,[Month] ,SUM (Amount) OVER(partition by customer order by [month] ROWS UNBOUNDED PRECEDING ) as Amount
FROM #tmp2
order by [month]
drop table #tmp
drop table #tmp2

I think this does what you want
declare @myTable as TABLE (Customer varchar(50), Month int, Amount int);
INSERT INTO @myTable (Customer, Month, Amount)
VALUES
('Tom', 1, 10),
('Kate', 1, 60),
('Ali', 1, 70),
('Tom', 2, 50),
('Kate', 2, 40),
('Tom', 3, 80),
('Ali', 3, 20);
select dts.Month, cts.Customer, isnull(t.Amount, 0) as Amount
, sum(isnull(t.Amount, 0)) over(partition by cts.Customer order by dts.Month) as CumAmt
from ( select distinct customer
from @myTable
) cts
cross join ( select distinct Month
from @myTable
) dts
left join @myTable t
on t.Customer = cts.Customer
and t.Month = dts.Month
order by dts.Month, cts.Customer;
Month Customer Amount CumAmt
----------- -------------------------------------------------- ----------- -----------
1 Ali 70 70
1 Kate 60 60
1 Tom 10 10
2 Ali 0 70
2 Kate 40 100
2 Tom 50 60
3 Ali 20 90
3 Kate 0 100
3 Tom 80 140

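Try SUM() OVER (PARTITION BY ... ORDER BY ...)
https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql
This will help you get the idea of how to accumulate. In PostgreSQL the code looks like this; note that the month goes in ORDER BY rather than PARTITION BY, otherwise each month is summed on its own and nothing accumulates:
Select sum(amount) over(partition by customer order by month)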

This should do it for you. Also here is a link to the Microsoft docs regarding aggregation functions.
https://learn.microsoft.com/en-us/sql/t-sql/functions/aggregate-functions-transact-sql
Example:
SELECT
Customer, Month, SUM(Amount) as Amount
FROM myTable
GROUP BY Customer, Month
ORDER BY Customer, Month

Select start date and end date form records with subsequent date field in SQL Server 2008 R2

I have a table in SQL Server 2008 R2 called ReserveLog. This is an existing table that stores the reserve date of each room in a complex.
It is like this:
RoomNumber ReserveDate
----------------------
1 2017-07-01
1 2017-07-02
1 2017-07-03
1 2017-07-06
1 2017-07-07
1 2017-07-08
2 2017-01-02
2 2017-01-03
2 2017-01-04
2 2017-01-09
2 2017-01-10
I want to query this table so that I get the following result:
RoomNumber ReserveStartDate ReserveEndDate
------------------------------------------
1 2017-07-01 2017-07-03
1 2017-07-06 2017-07-08
2 2017-01-02 2017-01-04
2 2017-01-09 2017-01-10
Is it possible? I can't figure out how to do it. Any help is appreciated in advance.

create table #reservs
(
roomnumber INT, ReserveDate DATE
)
INSERT INTO #reservs VALUES (1, '2017-07-01');
INSERT INTO #reservs VALUES (1, '2017-07-02');
INSERT INTO #reservs VALUES (1, '2017-07-03');
INSERT INTO #reservs VALUES (1, '2017-07-06');
INSERT INTO #reservs VALUES (1, '2017-07-07');
INSERT INTO #reservs VALUES (1, '2017-07-08');
INSERT INTO #reservs VALUES (2, '2017-01-02');
INSERT INTO #reservs VALUES (2, '2017-01-03');
INSERT INTO #reservs VALUES (2, '2017-01-04');
INSERT INTO #reservs VALUES (2, '2017-01-09');
INSERT INTO #reservs VALUES (2, '2017-01-10');
select roomnumber, MIN(reservedate) as mn, MAX(reservedate) as mx
FROM (
SELECT *
, DATEDIFF(day, ROW_NUMBER() OVER(partition by roomnumber order by reservedate) ,reservedate) as ind
FROM #reservs
) a
group by roomnumber, ind
order by 1, 2
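The query above relies on the fact that, within one room, a date minus its row number stays constant across a run of consecutive days, so that difference can serve as a group key. A minimal Python sketch of the same idea follows; reservation_ranges is a name chosen for the example, and unlike the SQL it drops duplicate rows first.

```python
from datetime import date, timedelta
from itertools import groupby


def reservation_ranges(rows):
    """rows: iterable of (room_number, reserve_date); returns (room, start, end) tuples."""
    ranges = []
    ordered = sorted(set(rows))  # order by room, then date; duplicates removed
    for room, group in groupby(ordered, key=lambda r: r[0]):
        dates = [d for _, d in group]
        # date - position is identical for all dates in one consecutive run
        keyed = [(d - timedelta(days=i), d) for i, d in enumerate(dates)]
        for _, run in groupby(keyed, key=lambda k: k[0]):
            run_dates = [d for _, d in run]
            ranges.append((room, run_dates[0], run_dates[-1]))
    return ranges


rows = [
    (1, date(2017, 7, 1)), (1, date(2017, 7, 2)), (1, date(2017, 7, 3)),
    (1, date(2017, 7, 6)), (1, date(2017, 7, 7)), (1, date(2017, 7, 8)),
    (2, date(2017, 1, 2)), (2, date(2017, 1, 3)), (2, date(2017, 1, 4)),
    (2, date(2017, 1, 9)), (2, date(2017, 1, 10)),
]
result = reservation_ranges(rows)
for room, start, end in result:
    print(room, start, end)
```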

Try this using common table expressions, comments in line and SQL Fiddle link:
SQL Fiddle
create table Reservelog
(
RoomNumber INT,
ReserveDate Date
)
INSERT INTO ReserveLog
VALUES
(1, '2017-07-01'),
(1, '2017-07-02'),
(1, '2017-07-03'),
(1, '2017-07-06'),
(1, '2017-07-07'),
(1, '2017-07-08'),
(2, '2017-01-02'),
(2, '2017-01-03'),
(2, '2017-01-04'),
(2, '2017-01-09'),
(2, '2017-01-10')
Query 1:
;WITH CTE
As
(
SELECT *,
(
-- Get Previous Reserve Date for this room
SELECT TOP 1 ReserveDate
FROM ReserveLog R2
WHERE R1.RoomNumber = R2.RoomNumber AND
R1.ReserveDate > R2.ReserveDate
ORDER BY ReserveDate DESC
) As PrevReserveDate,
(
-- Get next reserve date for this room
SELECT TOP 1 ReserveDate
FROM ReserveLog R2
WHERE R1.RoomNumber = R2.RoomNumber AND
R1.ReserveDate < R2.ReserveDate
ORDER BY ReserveDate
) As NextReserveDate
FROM ReserveLog R1
),
CTE2
AS
(
SELECT *,
CASE
WHEN PrevReserveDate IS NULL OR
DATEDIFF(D, PrevReserveDate, ReserveDate ) > 1
THEN 1 -- Flag as a StartDate
ELSE 0
END As DateStart,
CASE
WHEN NextReserveDate IS NULL OR
DATEDIFF(D, ReserveDate, NExtReserveDate) > 1
THEN 1 -- Flag as an end date
ELSE 0
END As DateEnd,
ROW_NUMBER() OVER
(PARTITION BY RoomNumber ORDER BY ReserveDate) AS RN
FROM CTE
-- only select rows which have no previous or next reservation or
-- ones where the difference between consecutive reservations > 1 day
WHERE PrevReserveDate IS NULL OR
NextReserveDate IS NULL OR
DATEDIFF(D, PrevReserveDate, ReserveDate ) > 1 OR
DATEDIFF(D, ReserveDate, NExtReserveDate) > 1
)
SELECT startRows.RoomNumber,
startRows.ReserveDate As ReserveStartDate,
endRows.ReserveDate As ReserveEndDate
FROM CTE2 startRows
INNER JOIN CTE2 endRows
ON startRows.RN + 1 = endRows.RN AND
startRows.RoomNumber = endRows.RoomNumber AND
endRows.DateEnd = 1
WHERE startRows.DateStart = 1
Results:
| RoomNumber | ReserveStartDate | ReserveEndDate |
|------------|------------------|----------------|
| 1 | 2017-07-01 | 2017-07-03 |
| 1 | 2017-07-06 | 2017-07-08 |
| 2 | 2017-01-02 | 2017-01-04 |
| 2 | 2017-01-09 | 2017-01-10 |

Use this query (it returns the rows in a fixed date range from the ReserveLog table):
SELECT * FROM ReserveLog WHERE ReserveDate BETWEEN '2017-07-01' AND '2017-07-03';

Selecting rows when there is a change in the value of column from the previous row

I have the following table:-
Name Status Timestamp
Ben 1 2015-01-01
Ben 1 2015-01-02
Joe 1 2015-11-12
Joe 2 2015-11-13
Joe 2 2016-12-14
Joe 2 2016-12-15
Paul 1 2015-08-16
Paul 1 2015-08-17
Paul 3 2015-08-18
Paul 3 2015-08-19
Mark 2 2015-09-20
Mark 2 2015-09-25
Mark 2 2015-09-26
Mark 3 2015-10-27
I need a query that returns only the rows where there is a change in the 'Status'. It should return the row when the 'Status' is changed and also the previous row.
For instance the result should be like the below:-
Name Status Timestamp
Joe 1 2015-11-12
Joe 2 2015-11-13
Paul 1 2015-08-17
Paul 3 2015-08-18
Mark 2 2015-09-26
Mark 3 2015-10-27
How can I achieve this result?

You can use a CTE with a CASE expression and LAG and LEAD to work out which rows to select. This will work on SQL Server 2012 and higher:
Create and populate sample table (Please save us this step in your future questions)
DECLARE @T as TABLE
(
Name varchar(4),
[Status] int,
[Timestamp] date
)
INSERT INTO @T VALUES
('Joe', 1, '2015-11-12'),
('Joe', 2, '2015-11-13'),
('Joe', 2, '2016-12-14'),
('Joe', 2, '2016-12-15'),
('Paul' ,1, '2015-08-16'),
('Paul' ,1, '2015-08-17'),
('Paul' ,3, '2015-08-18'),
('Paul' ,3, '2015-08-19'),
('Mark' ,2, '2015-09-20'),
('Mark' ,2, '2015-09-25'),
('Mark' ,2, '2015-09-26'),
('Mark' ,3, '2015-10-27')
The cte - Note that I use both lag and lead inside the case expression.
;WITH CTE AS
(
SELECT Name,
[Status],
[Timestamp],
CASE WHEN LAG([Status]) OVER(PARTITION BY Name ORDER BY [Timestamp]) <> [Status] OR
LEAD([Status]) OVER(PARTITION BY Name ORDER BY [Timestamp]) <> [Status] THEN
1
END As Filter
FROM @T
)
)
The query:
SELECT Name,
[Status],
[Timestamp]
FROM CTE
WHERE Filter = 1
Results:
Name Status Timestamp
Joe 1 12.11.2015 00:00:00
Joe 2 13.11.2015 00:00:00
Mark 2 26.09.2015 00:00:00
Mark 3 27.10.2015 00:00:00
Paul 1 17.08.2015 00:00:00
Paul 3 18.08.2015 00:00:00
See a live demo on rextester
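The LAG/LEAD test keeps a row when its status differs from the previous or the next row of the same name. Here is a small Python sketch of that comparison, run against the full table from the question (including Ben); status_changes is a name invented for the example and timestamps are kept as ISO strings, which sort chronologically.

```python
from itertools import groupby


def status_changes(rows):
    """rows: iterable of (name, status, timestamp) with ISO-formatted timestamps."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # PARTITION BY name ORDER BY timestamp
    for _, group in groupby(ordered, key=lambda r: r[0]):
        g = list(group)
        for i, row in enumerate(g):
            prev_status = g[i - 1][1] if i > 0 else None           # LAG
            next_status = g[i + 1][1] if i + 1 < len(g) else None  # LEAD
            differs_from_prev = prev_status is not None and prev_status != row[1]
            differs_from_next = next_status is not None and next_status != row[1]
            if differs_from_prev or differs_from_next:
                out.append(row)
    return out


rows = [
    ("Ben", 1, "2015-01-01"), ("Ben", 1, "2015-01-02"),
    ("Joe", 1, "2015-11-12"), ("Joe", 2, "2015-11-13"),
    ("Joe", 2, "2016-12-14"), ("Joe", 2, "2016-12-15"),
    ("Paul", 1, "2015-08-16"), ("Paul", 1, "2015-08-17"),
    ("Paul", 3, "2015-08-18"), ("Paul", 3, "2015-08-19"),
    ("Mark", 2, "2015-09-20"), ("Mark", 2, "2015-09-25"),
    ("Mark", 2, "2015-09-26"), ("Mark", 3, "2015-10-27"),
]
result = status_changes(rows)
for row in result:
    print(row)
```

Ben never changes status, so none of his rows are returned; the None checks mirror how a comparison with a NULL from LAG or LEAD is never true in SQL.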

This can be used on versions starting with 2005:
declare @t table (Name varchar(100), S int, T date);
insert into @t values
('Joe', 1 ,'2015-11-12'),
('Joe', 2 ,'2015-11-13'),
('Joe', 2 ,'2016-12-14'),
('Joe', 2 ,'2016-12-15'),
('Paul', 1 ,'2015-08-16'),
('Paul', 1 ,'2015-08-17'),
('Paul', 3 ,'2015-08-18'),
('Paul', 3 ,'2015-08-19'),
('Mark', 2 ,'2015-09-20'),
('Mark', 2 ,'2015-09-25'),
('Mark', 2 ,'2015-09-26'),
('Mark', 3 ,'2015-10-27');
with cte as
(
select *, ROW_NUMBER() over(partition by Name order by T) as rn
from @t
)
,cte1 as
(
select c1.Name, c1.S as S1, c1.T as T1, c2.S as S2, c2.T as T2
from cte c1 join cte c2
on c1.rn + 1 = c2.rn
and c1.Name = c2.Name
where c1.S <> c2.S
)
select Name,
case n
when 1 then S1
when 2 then S2
end as Status,
case n
when 1 then T1
when 2 then T2
end as Timestamp
from cte1 cross join (select 1 n union all select 2) nums;
And this is the same but for versions starting with 2012:
declare @t table (Name varchar(100), S int, T date);
insert into @t values
('Joe', 1 ,'2015-11-12'),
('Joe', 2 ,'2015-11-13'),
('Joe', 2 ,'2016-12-14'),
('Joe', 2 ,'2016-12-15'),
('Paul', 1 ,'2015-08-16'),
('Paul', 1 ,'2015-08-17'),
('Paul', 3 ,'2015-08-18'),
('Paul', 3 ,'2015-08-19'),
('Mark', 2 ,'2015-09-20'),
('Mark', 2 ,'2015-09-25'),
('Mark', 2 ,'2015-09-26'),
('Mark', 3 ,'2015-10-27');
with cte as
(
select name,
S as S1,
lead(S) over(partition by Name order by T) S2,
T as T1,
lead(T) over(partition by Name order by T) T2
from @t
)
,cte1 as
(
select *
from cte
where S1 <> S2
)
select Name,
case n
when 1 then S1
when 2 then S2
end as Status,
case n
when 1 then T1
when 2 then T2
end as Timestamp
from cte1 cross join (select 1 n union all select 2) nums;

Windowed Functions in SQL Server

I have a table called Orders in which the data looks like this:
EMpID OrderValue OrderID
1 100 1
2 167 89
....
There are multiple orders for each empID.
What I want is to get output in this form
EMPID RANK VALUETOTAL VALUETHISEMPID
1 1 300 100
4 2 300 50
.....
If there are multiple EmpID(s) With same ValueThisEmpID then it should get same rank.
I tried
SELECT EmpID,SUM(val) OVER() as VALUETOTAL,SUM(val) OVER(PARTITION BY EmpID)
How can I obtain rank and order it by ValueThisEmpID?

First some test data:
insert into #t values (1, 10, 100)
insert into #t values (1, 20, 101)
insert into #t values (2, 30, 120)
insert into #t values (3, 10, 130)
insert into #t values (3, 10.5, 131)
insert into #t values (4, 100, 140)
You need two steps: the first gets each EmpId with its summed order value; the second gets the grand total and the rank:
; with Step1 (EmpId, ValueThisEmpId) as
(select empId, sum(OrderValue)
from #t
group by empId)
select EmpId,
rank() over(order by ValueThisEmpId desc) as "rank",
sum(ValueTHisEmpId) over() as ValueTotal,
ValueThisEmpId
from Step1
This will give output of:
4 1 180.50 100.00
1 2 180.50 30.00
2 2 180.50 30.00
3 4 180.50 20.50
If you don't want gaps in the ranking, use dense rank:
; with Step1 (EmpId, ValueThisEmpId) as
(select empId, sum(OrderValue)
from #t
group by empId)
select EmpId,
dense_rank() over(order by ValueThisEmpId desc) as "rank",
sum(ValueTHisEmpId) over() as TotalValue,
ValueThisEmpId
from Step1
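The difference between rank() and dense_rank() on the test data can be seen side by side in a short Python sketch. It is an illustration of the two numbering rules only; rank_totals and the tuple layout are assumptions for the example, and ties are ordered by EmpId here purely to make the output deterministic.

```python
def rank_totals(orders):
    """orders: iterable of (emp_id, order_value).

    Returns (emp_id, rank, dense_rank, grand_total, emp_total) tuples.
    """
    totals = {}
    for emp_id, value in orders:              # step 1: sum per employee
        totals[emp_id] = totals.get(emp_id, 0) + value
    grand_total = sum(totals.values())        # SUM(...) OVER ()
    ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    out = []
    rank = dense = 0
    prev = None
    for position, (emp_id, total) in enumerate(ordered, start=1):
        if total != prev:
            rank = position  # RANK: skips numbers after a tie
            dense += 1       # DENSE_RANK: no gaps
            prev = total
        out.append((emp_id, rank, dense, grand_total, total))
    return out


orders = [(1, 10), (1, 20), (2, 30), (3, 10), (3, 10.5), (4, 100)]
result = rank_totals(orders)
for row in result:
    print(row)
```

Employee 3 gets rank 4 but dense rank 3, because employees 1 and 2 tie on 30.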
