Burndown analysis in SQL Server Management Studio

I'm trying to prepare my data to create a burndown visual. The Rate column I need isn't simply A - B: when B is NULL, it carries forward the previous row's value.
I've tried some CASE expressions using LAG and SUM, but to no avail.
Some direction on the CASE expression, or a better solution, would be ideal.
For example, this is how my data looks:
ID  A   B
--------------
1   20  NULL
2   20  3
3   20  NULL
4   20  7
5   20  NULL
6   20  NULL
7   20  NULL
8   20  5
9   20  7
And I want a rate column that looks like this.
ID  A   B     Rate
--------------------
1   20  NULL  20
2   20  3     17
3   20  NULL  17
4   20  7     10
5   20  NULL  10
6   20  NULL  10
7   20  NULL  10
8   20  5     5
9   20  7     -2

Thanks to @Larnu for the guidance.
Here is the solution for data that is partitioned by some group ID and ordered by a date or row ID. It subtracts a running total of COL_B (with NULL counted as 0) from COL_A:
SELECT
    GROUP_ID,
    ROW_ID,
    COL_A,
    COL_B,
    COL_A - SUM(ISNULL(COL_B, 0)) OVER (PARTITION BY GROUP_ID ORDER BY ROW_ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Rate
FROM your_table;
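To check the running-sum idea against the sample data from the question, here is a small sketch that runs the same window expression in SQLite through Python's sqlite3 module. SQLite only stands in for SQL Server here, so COALESCE replaces ISNULL, and the table name burndown is made up for the example. It assumes SQLite 3.25 or later (for window functions) and omits the PARTITION BY because the sample has no group column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows from the question.
conn.execute("CREATE TABLE burndown (id INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO burndown VALUES (?, ?, ?)",
    [(1, 20, None), (2, 20, 3), (3, 20, None), (4, 20, 7), (5, 20, None),
     (6, 20, None), (7, 20, None), (8, 20, 5), (9, 20, 7)],
)

# Rate = A minus the running total of B, with NULL treated as 0.
rates = [
    row[0]
    for row in conn.execute(
        """
        SELECT a - SUM(COALESCE(b, 0)) OVER (
                   ORDER BY id
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS rate
        FROM burndown
        ORDER BY id
        """
    )
]
print(rates)
```

The printed list should match the Rate column requested in the question.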

Related

SQL: Selecting Random Sample based on ID with multiple rows for each ID

My data has the following structure:
ID  Month  Year  Revenue
--------------------------
1   1      20    860
1   2      20    22
1   5      20    339
2   3      20    12098
3   3      20    12
3   4      20    10
3   6      20    9
3   7      20    122
3   8      20    11
There are 1000s of IDs and I want to select a random sample of 100 IDs. So if I randomly select ID 3, I need all rows of data for ID 3. I have to use SQL for this. I welcome any suggestions.
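You can use the following queries. They pick 100 random IDs first and then return every row for those IDs.
For SQL Server:
SELECT t.* FROM table_name AS t
WHERE t.ID IN (SELECT TOP (100) d.ID FROM (SELECT DISTINCT ID FROM table_name) AS d ORDER BY NEWID());
For MySQL:
SELECT t.* FROM table_name AS t
JOIN (SELECT d.ID FROM (SELECT DISTINCT ID FROM table_name) AS d ORDER BY RAND() LIMIT 100) AS s ON s.ID = t.ID;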
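Sampling IDs rather than rows means choosing the distinct IDs at random first and then fetching all rows that belong to them. The sketch below tries that on the sample data using SQLite through Python's sqlite3 module. SQLite is a stand-in engine, so RANDOM() and LIMIT replace NEWID()/TOP and RAND(); the table name revenue and the sample size of 2 (instead of 100) are assumptions made for the demo.

```python
import sqlite3

SAMPLE_SIZE = 2  # the question asks for 100; 2 suits the tiny sample

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows from the question.
conn.execute(
    "CREATE TABLE revenue (id INTEGER, month INTEGER, year INTEGER, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?, ?, ?)",
    [(1, 1, 20, 860), (1, 2, 20, 22), (1, 5, 20, 339), (2, 3, 20, 12098),
     (3, 3, 20, 12), (3, 4, 20, 10), (3, 6, 20, 9), (3, 7, 20, 122),
     (3, 8, 20, 11)],
)

# Inner query: random sample of distinct IDs. Outer query: all their rows.
rows = conn.execute(
    """
    SELECT r.*
    FROM revenue AS r
    WHERE r.id IN (
        SELECT d.id
        FROM (SELECT DISTINCT id FROM revenue) AS d
        ORDER BY RANDOM()
        LIMIT ?
    )
    ORDER BY r.id, r.month
    """,
    (SAMPLE_SIZE,),
).fetchall()

sampled_ids = {row[0] for row in rows}
print(sampled_ids, len(rows))
```

Whichever IDs are drawn, every row for each of them comes back, so the row count varies with the sample.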

Sum Multiple Rows But Retain the Number of Rows in a Result

In my SQL Server 2008 stored procedure, I have a table variable with RecordID, TotalMinutes, and ProcessID columns.
DECLARE @tblSum TABLE (RecordID int, TotalMinutes int, ProcessID int)
RecordID is my primary key, TotalMinutes is the number of minutes for that record, and each ProcessID is repeated across multiple rows of my data.
Here is an example of my data:
RecordID TotalMinutes ProcessID
--------------------------------------------
1 10 1
2 20 1
3 30 1
4 10 2
5 40 2
6 10 2
7 10 3
8 55 3
9 60 3
10 15 4
My plan is to total TotalMinutes across all rows with the same ProcessID and put the result in a new table variable with a FinalMinutes column, like the table below:
RecordID TotalMinutes ProcessID FinalMinutes
-----------------------------------------------------
1 10 1 60
2 20 1 60
3 30 1 60
4 10 2 80
5 60 2 80
6 10 2 80
7 10 3 125
8 55 3 125
9 60 3 125
10 15 4 15
I cannot use GROUP BY since it would collapse the result into 4 rows. I need to keep every row and all of its data, and just add a FinalMinutes column in a new table variable.
Here is one way, using the SUM() OVER() windowed aggregate function:
Select *,
FinalMinutes = sum(TotalMinutes)over(partition by ProcessID)
From yourtable
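A windowed SUM keeps every input row and repeats the per-partition total on each of them. As a rough check, the sketch below runs the same expression in SQLite through Python's sqlite3 module (a stand-in for SQL Server, assuming SQLite 3.25 or later). The table name tbl_sum is hypothetical, and the TotalMinutes values are taken from the expected-output table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the table variable in the question.
conn.execute(
    "CREATE TABLE tbl_sum (RecordID INTEGER, TotalMinutes INTEGER, ProcessID INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_sum VALUES (?, ?, ?)",
    [(1, 10, 1), (2, 20, 1), (3, 30, 1), (4, 10, 2), (5, 60, 2),
     (6, 10, 2), (7, 10, 3), (8, 55, 3), (9, 60, 3), (10, 15, 4)],
)

# No GROUP BY: the window total is attached to every row of its ProcessID.
result = conn.execute(
    """
    SELECT RecordID, TotalMinutes, ProcessID,
           SUM(TotalMinutes) OVER (PARTITION BY ProcessID) AS FinalMinutes
    FROM tbl_sum
    ORDER BY RecordID
    """
).fetchall()

final_minutes = [row[3] for row in result]
print(final_minutes)
```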

SQL - Select rows after reaching minimum value/threshold

I'm using SQL Server Management Studio. My data set is shown below.
ID Days Value Threshold
A 1 10 30
A 2 20 30
A 3 34 30
A 4 25 30
A 5 20 30
B 1 5 15
B 2 10 15
B 3 12 15
B 4 17 15
B 5 20 15
I want to run a query so only rows after the threshold has been reached are selected for each ID. Also, I want to create a new days column starting at 1 from where the rows are selected. The expected output for the above dataset will look like
ID Days Value Threshold NewDayColumn
A 3 34 30 1
A 4 25 30 2
A 5 20 30 3
B 4 17 15 1
B 5 20 15 2
It doesn't matter if the value drops below the threshold in later rows; I want to treat the first row where the threshold is crossed as day 1 and keep counting rows for that ID.
Thank you!
You can use window functions for this. Here is one method:
select t.*, row_number() over (partition by id order by days) as newDayColumn
from (select t.*,
             min(case when value > threshold then days end) over (partition by id) as threshold_days
      from t
     ) t
where days >= threshold_days;
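The query works in two steps: the inner window MIN finds, per ID, the first day on which the value exceeds the threshold, and the outer ROW_NUMBER renumbers the rows from that day onward. Here is a sketch that runs it on the sample data in SQLite through Python's sqlite3 module (a stand-in for SQL Server, assuming SQLite 3.25 or later). The table name readings is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows from the question.
conn.execute(
    "CREATE TABLE readings (id TEXT, days INTEGER, value INTEGER, threshold INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [("A", 1, 10, 30), ("A", 2, 20, 30), ("A", 3, 34, 30), ("A", 4, 25, 30),
     ("A", 5, 20, 30), ("B", 1, 5, 15), ("B", 2, 10, 15), ("B", 3, 12, 15),
     ("B", 4, 17, 15), ("B", 5, 20, 15)],
)

selected = conn.execute(
    """
    SELECT id, days, value, threshold,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY days) AS new_day
    FROM (
        SELECT r.*,
               -- first day on which the value exceeds the threshold
               MIN(CASE WHEN value > threshold THEN days END)
                   OVER (PARTITION BY id) AS threshold_days
        FROM readings AS r
    ) AS s
    WHERE days >= threshold_days
    ORDER BY id, days
    """
).fetchall()

for row in selected:
    print(row)
```

Because the WHERE filter is applied before the outer window function is evaluated, the numbering restarts at 1 on the first qualifying row of each ID.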

Using temporary extended table to make a sum

From a given table I want to be able to sum values having the same number (should be easy, right?)
Problem: A given value can be assigned from 2 to n consecutive numbers.
For some reason, this information is stored in a single row giving the value, the starting number, and the ending number, as below.
TABLE A
id | starting_number | ending_number | value
----+-----------------+---------------+-------
1 2 5 8
2 0 3 5
3 4 6 6
4 7 8 10
For instance the first row means:
value '8' is assigned to numbers: 2, 3 and 4 (5 is excluded)
So, I would like the following intermediate result table:
TABLE B
id | number | value
----+--------+-------
1 2 8
1 3 8
1 4 8
2 0 5
2 1 5
2 2 5
3 4 6
3 5 6
4 7 10
So I can sum 'value' for elements having identical 'number'
SELECT number, sum(value)
FROM B
GROUP BY number
TABLE C
number | sum(value)
--------+------------
2 13
3 8
4 14
0 5
1 5
5 6
7 10
I don't know how to do this and didn't find any answer on the web (maybe I wasn't searching with the right keywords...)
Any idea?
You can do what you want with generate_series(), which is available in PostgreSQL. So, TableB is basically:
select id, generate_series(starting_number, ending_number - 1, 1) as n, value
from tableA;
Your aggregation is then:
select n, sum(value)
from (select id, generate_series(starting_number, ending_number - 1, 1) as n, value
      from tableA
     ) a
group by n;
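Where generate_series() is not available, the same expansion can be approximated with a recursive CTE. This is a swapped-in technique, not the one in the answer above. The sketch runs it on the sample data in SQLite through Python's sqlite3 module; the table name table_a and the CTE name expanded are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding TABLE A from the question.
conn.execute(
    "CREATE TABLE table_a "
    "(id INTEGER, starting_number INTEGER, ending_number INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?)",
    [(1, 2, 5, 8), (2, 0, 3, 5), (3, 4, 6, 6), (4, 7, 8, 10)],
)

sums = dict(
    conn.execute(
        """
        WITH RECURSIVE expanded(id, n, ending_number, value) AS (
            -- one seed row per range, starting at starting_number
            SELECT id, starting_number, ending_number, value
            FROM table_a
            WHERE starting_number < ending_number
            UNION ALL
            -- step forward until just before ending_number (exclusive)
            SELECT id, n + 1, ending_number, value
            FROM expanded
            WHERE n + 1 < ending_number
        )
        SELECT n, SUM(value)
        FROM expanded
        GROUP BY n
        ORDER BY n
        """
    ).fetchall()
)
print(sums)
```

The resulting mapping of number to summed value should agree with TABLE C in the question.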

sql server 2008 - calculated and ordered list needs to return only 2 entries per supplier

I have a dataset like the one below, but longer. I want to pick the 'fleet_id' rows with the highest 'StarDriver' values overall, but return at most two results for each 'supplier_id' and at most 20 in total.
(I'm sorry, I didn't work out how to paste the data below with proper formatting; I couldn't find it in the toolbar above, and Google results were about copying data. I would also be grateful if someone would point out how.)
fleet_id supplier_id Ratings Driver Punctuality Car StarDriver
19442 151 10 5 5 5 5
19634 151 11 5 5 5 5
19437 151 12 5 5 5 5
12832 10 14 5 4.92857142857143 5 4.97619047619048
12217 111 10 5 5 4.9 4.96666666666667
21135 158 19 5 4.89473684210526 5 4.96491228070175
19436 151 14 4.85714285714286 5 5 4.95238095238095
12239 111 12 4.91666666666667 5 4.91666666666667 4.94444444444445
10520 92 12 4.91666666666667 5 4.91666666666667 4.94444444444445
19997 151 12 5 5 4.83333333333333 4.94444444444444
To limit the results to the top 2 for each supplier, use row_number(). This enumerates the rows within each supplier, and you can keep just two with where seqnum <= 2.
The rest of the query just selects the 20 rows with the highest StarDriver:
select top 20 t.*
from (select t.*,
             row_number() over (partition by supplier_id order by StarDriver desc) as seqnum
      from your_table t
     ) t
where seqnum <= 2
order by StarDriver desc;
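As a rough check of the top-2-per-supplier approach, the sketch below runs it on the sample rows in SQLite through Python's sqlite3 module (a stand-in for SQL Server, so LIMIT replaces TOP; SQLite 3.25 or later is assumed). The table name fleet is made up, only three of the columns are loaded, and fleet_id is added as a tie-breaker purely so the demo output is deterministic; the question does not say how ties should be resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a subset of the columns from the question.
conn.execute(
    "CREATE TABLE fleet (fleet_id INTEGER, supplier_id INTEGER, star_driver REAL)"
)
conn.executemany(
    "INSERT INTO fleet VALUES (?, ?, ?)",
    [(19442, 151, 5.0), (19634, 151, 5.0), (19437, 151, 5.0),
     (12832, 10, 4.97619047619048), (12217, 111, 4.96666666666667),
     (21135, 158, 4.96491228070175), (19436, 151, 4.95238095238095),
     (12239, 111, 4.94444444444445), (10520, 92, 4.94444444444445),
     (19997, 151, 4.94444444444444)],
)

top_rows = conn.execute(
    """
    SELECT fleet_id, supplier_id, star_driver
    FROM (
        SELECT f.*,
               -- rank rows within each supplier, best StarDriver first;
               -- fleet_id is an assumed tie-breaker
               ROW_NUMBER() OVER (
                   PARTITION BY supplier_id
                   ORDER BY star_driver DESC, fleet_id
               ) AS seqnum
        FROM fleet AS f
    ) AS ranked
    WHERE seqnum <= 2
    ORDER BY star_driver DESC, fleet_id
    LIMIT 20
    """
).fetchall()

fleet_ids = [row[0] for row in top_rows]
print(fleet_ids)
```

With this sample, supplier 151 contributes only two of its five rows, and the overall limit of 20 is never reached.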