I have data (support ticket statistics, to be exact), that I am trying to get the median of, over time.
Right now, I have a query that calculates the difference between the date the ticket was opened and the date it was closed. I take that data and pivot the average of the days to close over a series of months and years, using SQL Server's built-in AVG function. This works well; however, I am finding that the metrics are skewed by outliers in the data.
What I really want is the median of the data, pivoted over months and years. I am having trouble achieving what I am after, and I am not positive this is even possible.
The query I have right now, using the AVG function, is:
SELECT 'Support - Days To Close Escalation', *
FROM
(
SELECT
DATEDIFF(HOUR, e.CreatedDate, e.Escalation_Close_Date_Time__c) AS DaysToCloseEscalation,
LEFT(CONVERT(CHAR(10), e.CreatedDate,126), 7) AS EscalationCreateDate
FROM [dbo].[Escalations] AS e WITH(NOLOCK)
LEFT JOIN [dbo].[Case] AS c WITH(NOLOCK)
ON e.Case__c = c.Id
WHERE e.Escalation_Queue__c IN ('PM 10 Tier 2 Support', 'PM 11 Tier 2 Support')
AND e.CreatedDate BETWEEN '2017-04-01 00:00:00.000' AND '2018-04-01 00:00:00.000'
AND e.Escalation_Close_Date_Time__c IS NOT NULL
) AS SupportEscalationVolume
PIVOT
(
AVG(SupportEscalationVolume.DaysToCloseEscalation) FOR SupportEscalationVolume.EscalationCreateDate IN ([2017-04],[2017-05],[2017-06],[2017-07],[2017-08],[2017-09],[2017-10],[2017-11],[2017-12],[2018-01],[2018-02],[2018-03])
) AS SupportEscalationVolumePivot
The result of this query is something along the lines of (except all in a single row, since the data is pivoted):
StatDescription | Support - Days To Close Escalation
----------------------------------------------------
2017-04 | 107
2017-05 | 52
2017-06 | 101
2017-07 | 106
2017-08 | 69
2017-09 | 54
2017-10 | 49
2017-11 | 42
2017-12 | 51
2018-01 | 31
2018-02 | 23
2018-03 | 15
After some research on how to pull off a median in SQL, I have resorted to using DENSE_RANK(), as shown in the query below. I started with ROW_NUMBER(), but that gave me a counter for ALL records, where what I really want is a median of the time to close the ticket for each month/year grouping.
;
WITH SupportDaysToClose(HoursToCloseEscalation, EscalationCreateDate, RowNumber)
AS
(
SELECT
DATEDIFF(HOUR, e.CreatedDate, e.Escalation_Close_Date_Time__c) AS HoursToCloseEscalation,
LEFT(CONVERT(CHAR(10), e.CreatedDate,126), 7) AS EscalationCreateDate,
DENSE_RANK() OVER(ORDER BY LEFT(CONVERT(CHAR(10), e.CreatedDate,126), 7) ASC) AS RowNumber
FROM [dbo].[Escalations] AS e WITH(NOLOCK)
LEFT JOIN [dbo].[Case] AS c WITH(NOLOCK)
ON e.Case__c = c.Id
WHERE e.Escalation_Queue__c IN ('PM 10 Tier 2 Support', 'PM 11 Tier 2 Support')
AND e.CreatedDate BETWEEN '2017-04-01 00:00:00.000' AND '2018-04-01 00:00:00.000'
AND e.Escalation_Close_Date_Time__c IS NOT NULL
)
SELECT *
FROM SupportDaysToClose
ORDER BY RowNumber,HoursToCloseEscalation
A sample of this data looks like
HoursToClose|CreateDate|RowNumber
---------------------------------
0 |2017-04 |1
7 |2017-08 |5
27 |2017-12 |9
Each RowNumber corresponds to a given month and year, the maximum being 12.
At this point, I am not really sure where to go.
Has anyone ever done anything like this before? I am not sure if I am on the right track or if I need to rethink the whole strategy. I apologize in advance if the syntax is difficult to follow.
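For reference, SQL Server 2012 and later ship PERCENTILE_CONT, which can compute a per-month median directly; here is a minimal sketch against the same Escalations query, assuming that SQL Server version is available (the PIVOT step and the join to [Case] are left out for brevity):
SELECT DISTINCT
    LEFT(CONVERT(CHAR(10), e.CreatedDate, 126), 7) AS EscalationCreateDate,
    -- PERCENTILE_CONT(0.5) is the continuous median, computed per month via the PARTITION BY
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY DATEDIFF(HOUR, e.CreatedDate, e.Escalation_Close_Date_Time__c))
        OVER (PARTITION BY LEFT(CONVERT(CHAR(10), e.CreatedDate, 126), 7)) AS MedianHoursToClose
FROM [dbo].[Escalations] AS e
WHERE e.Escalation_Queue__c IN ('PM 10 Tier 2 Support', 'PM 11 Tier 2 Support')
  AND e.CreatedDate >= '2017-04-01' AND e.CreatedDate < '2018-04-01'
  AND e.Escalation_Close_Date_Time__c IS NOT NULL;
The DISTINCT collapses the repeated per-row window values down to one row per month, and that result could then be pivoted in the same way as the AVG query (using MAX or AVG over the already-computed median, since PIVOT needs an aggregate).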
My table is currently looking like this:
+---------+---------------+------------+------------------+
| Segment | Product | Pre_Date | ON_Prepaid |
+---------+---------------+------------+------------------+
| RB | 01. Auto Loan | 2020-01-01 | 10645976180.0000 |
| RB | 01. Auto Loan | 2020-01-02 | 4489547174.0000 |
| RB | 01. Auto Loan | 2020-01-03 | 1853117000.0000 |
| RB | 01. Auto Loan | 2020-01-04 | 9350258448.0000 |
+---------+---------------+------------+------------------+
I'm trying to sum values of 'ON_Prepaid' over the course of 7 days, let's say from '2020-01-01' to '2020-01-07'.
Here is what I've tried
drop table if exists ##Prepay_summary_cash
select *,
[1W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 1 following and 7 following),
[2W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 8 following and 14 following),
[3W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 15 following and 21 following),
[1M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 22 following and 30 following),
[1.5M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 31 following and 45 following),
[2M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 46 following and 60 following),
[3M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 61 following and 90 following),
[6M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 91 following and 181 following)
into ##Prepay_summary_cash
from ##Prepay1
Things should be fine if the dates are continuous; however, there are some missing days in 'Pre_Date' (you know banks don't work on Sundays, etc.).
So I'm trying to work on something like
[1W] = SUM(ON_Prepaid) over (where Pre_date between dateadd(d,1,Pre_date) and dateadd(d,7,Pre_date))
something like that. So if, say, there's no record on 2020-01-05, the result should only sum the dates on the 1st, 2nd, 3rd, 4th, 6th, and 7th of Jan 2020, instead of the 1st through the 8th (8 because of "rows 7 following"). Or, for example, if I have missing records over a span of 30 days, then all those 30 days should be summed as 0s, so 45 days should return only the value of 15 days.
I've tried looking all over the forum and the answers did not suffice. Can you guys please help me out? Or link me to a thread where the problem has already been solved.
Thank you so much.
Things should be fine if the dates are continuous
Then make them continuous. Left join your real data (grouped up so it is one row per day) onto your calendar table (make one, or use a recursive CTE to generate a list of 360 dates from X hence) and your query will work.
WITH d as
(
SELECT *
FROM
(
SELECT *
FROM cal
CROSS JOIN
(SELECT DISTINCT segment s, product p FROM ##Prepay1) x
) c
LEFT JOIN ##Prepay1 p
ON
c.d = p.pre_date AND
c.segment = p.segment AND
c.product = p.product
WHERE
c.d BETWEEN '2020-01-01' AND '2021-01-01' -- date range on c.d not c.pre_date
)
--use d.d/s/p not d.pre_date/segment/product in your query (sometimes the latter are null)
select *,
[1W_Prepaid] = sum(ON_Prepaid) over (partition by s, p order by d.d rows between 1 following and 7 following),
...
CAL is just a table with a single column of dates, one per day, no time, extending for n thousand days into the past/future
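If you don't have a calendar table yet, a minimal sketch of generating one with a recursive CTE (the table name cal, the column name d, and the date range are assumptions chosen to match the query above):
WITH dates AS
(
    SELECT d = CAST('2020-01-01' AS date)
    UNION ALL
    SELECT DATEADD(DAY, 1, d)
    FROM dates
    WHERE d < '2021-01-01'
)
SELECT d
INTO cal                    -- one row per day, single date column, no time component
FROM dates
OPTION (MAXRECURSION 0);    -- lift the default 100-level recursion limit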
I wish to note that months have a variable number of days, so 6M is a bit of a misnomer; it might be better to call the month ones 180D, 90D, etc.
Also, your query performs a per-row division of your data into groups. If you want to perform sums up to 180 days after the date of the row, you need to pull a year's worth of data, so that on a June row you have the December data available to sum (December being 6 months from June).
If you then want to restrict your query to only showing up to June (but including data summed from the 6 months after June), you need to wrap it all again in a subquery. You cannot "where between jan and jun" in the query that does the sum over, because WHERE clauses are evaluated before window functions (doing so would remove the December data before it is summed).
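Roughly, that wrapping might look like this (a sketch reusing the d CTE from the query above; the June cut-off date is an assumption):
SELECT *
FROM
(
    SELECT d.d, s, p, ON_Prepaid,
           -- the window sums see the full Jan-Dec range here
           [1W_Prepaid] = SUM(ON_Prepaid) OVER (PARTITION BY s, p ORDER BY d.d
                                                ROWS BETWEEN 1 FOLLOWING AND 7 FOLLOWING)
           -- ...remaining window sums as above...
    FROM d
) q
WHERE q.d <= '2020-06-30';   -- trim the output only after the sums are computed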
Some other databases make this easier; Oracle and Postgres spring to mind. They can perform a sum over a range where the other rows' values are within some distance of the current row's value. SQL Server only usefully supports distancing based on a row's index rather than its values (the distancing-based-on-values support is limited to "rows that have the same value", rather than "rows that have values n higher or lower than the current row"). I suppose the requirement could be met with a cross apply, or a correlated subquery in the select, though I'd be careful to check the performance.
SELECT *,
       (SELECT SUM(tt.a)
        FROM x tt
        WHERE tt.x = t.x
          AND tt.y = t.y
          AND tt.z BETWEEN DATEADD(d, 1, t.z) AND DATEADD(d, 7, t.z)) AS [1W]
FROM x t
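And the cross apply variant might look something like this (same placeholder names x, t, a, y, z as the subquery above; untested, just a sketch):
SELECT t.*, w.[1W]
FROM x t
CROSS APPLY
(
    -- sum a over the 7 days following the current row, for matching x/y values;
    -- SUM over an empty set still returns one NULL row, so no rows are lost
    SELECT [1W] = SUM(tt.a)
    FROM x tt
    WHERE tt.x = t.x
      AND tt.y = t.y
      AND tt.z BETWEEN DATEADD(d, 1, t.z) AND DATEADD(d, 7, t.z)
) w;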
Right now I'm in the testing phase of this query, so I'm only testing it on two queries. I've gotten stuck on the final part where I want to left join everything (this will have to be extended to 12 separate queries). The problem is basically as the title suggests: I want to join 12 queries on the created Row_Num column using WITH (a CTE), instead of creating 12 separate tables and saving them in the database.
WITH Jan_Table AS
(SELECT ROW_NUMBER() OVER (ORDER BY a.SALE_DATE) as Row_ID, a.SALE_DATE, sum(a.revenue) as Jan_Rev
FROM ba.SALE_TABLE a
WHERE a.SALE_DATE BETWEEN '2015-01-01' and '2015-01-31'
GROUP BY a.SALE_DATE)
SELECT ROW_NUMBER() OVER (ORDER BY a.SALE_DATE) as Row_ID, a.SALE_DATE, sum(a.revenue) as Jun_Rev, j.Jan_Rev
FROM ba.SALE_TABLE a
LEFT JOIN Jan_Table j
on "j.Row_ID" = a.Row_ID
WHERE a.SALE_DATE BETWEEN '2015-06-01' and '2015-06-30'
GROUP BY a.SALE_DATE
And then I get this error message:
ERROR: column "j.Row_ID" does not exist
I put in the "j.Row_ID" because the previous message was:
ERROR: column a.row_id does not exist Hint: Perhaps you meant to
reference the column "j.row_id".
Each query works individually without the JOIN and the WITH clause. I have one for every month of the year and want to join 12 of these together eventually.
The output should be a single table with a ROW_NUM column and 12 monthly revenue columns. Each row should be a day of the month. I know not every month has 31 days; for example, February only has 28 days, meaning I'd want days 29, 30, and 31 as NULLs. The query above still has the dates, but I will remove the SALE_DATE column once I can get these two queries to join.
My initial thought was just to create 12 tables, but I think that'd be a really bad use of space and not the most logical solution to this problem if I were to extend it.
edit
Below are the separate outputs of the two queries above, and the third table is what I'm trying to make. I can't give you the raw data. Everything above has been altered from the actual column names and purposes of the data that I'm using. And I don't know how to create a dataset; that's above my head in SQL.
Jan_Table (first five lines)
Row_Num Date Jan_Rev
1 2015-01-01 20
2 2015-01-02 20
3 2015-01-03 20
4 2015-01-04 20
5 2015-01-05 20
Jun_Table (first five lines)
Row_Num Date Jun_Rev
1 2015-06-01 30
2 2015-06-02 30
3 2015-06-03 30
4 2015-06-04 30
5 2015-06-05 30
JOINED_TABLE (first five lines)
Row_Num Date Jun_Rev Date Jan_Rev
1 2015-06-01 30 2015-01-01 20
2 2015-06-02 30 2015-01-02 20
3 2015-06-03 30 2015-01-03 20
4 2015-06-04 30 2015-01-04 20
5 2015-06-05 30 2015-01-05 20
It seems like you can just use group by and conditional aggregation for your full query:
select day(sale_date),
max(case when month(sale_date) = 1 then sale_date end) as jan_date,
max(case when month(sale_date) = 1 then revenue end) as jan_revenue,
max(case when month(sale_date) = 2 then sale_date end) as feb_date,
max(case when month(sale_date) = 2 then revenue end) as feb_revenue,
. . .
from sale_table s
group by day(sale_date)
order by day(sale_date);
You haven't specified the database you are using. DAY() is a common function to get the day of the month; MONTH() is a common function to get the month of the year. However, those particular functions might be different in your database.
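If the table holds more than one year of data, a hedged usage sketch for the 2015 data in the question (names taken from the question; SUM is used instead of MAX here on the assumption that there can be several rows per day, as in the original GROUP BY queries):
select day(sale_date) as row_num,
       sum(case when month(sale_date) = 1 then revenue end) as jan_rev,
       sum(case when month(sale_date) = 6 then revenue end) as jun_rev
       -- ...repeat for the other ten months...
from ba.SALE_TABLE
where sale_date >= '2015-01-01' and sale_date < '2016-01-01'   -- keep a single year
group by day(sale_date)
order by day(sale_date);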
Hello I'm stuck trying to calculate the difference in time between each transaction for each ID.
The data looks like
Customer_ID | Transaction_Time
1 00:30
1 00:35
1 00:37
1 00:38
2 00:20
2 00:21
2 00:23
I'm trying to get the result to look something like
Customer_ID | Time_diff
1 5
1 2
1 1
2 1
2 2
I would really appreciate any help.
Thanks
Most databases support the LAG() function. However, the date/time functions can depend on the database. Here is an example for SQL Server:
select t.*
from (select t.*,
datediff(second,
lag(transaction_time) over (partition by customer_id order by transaction_time),
transaction_time
) as diff
from t
) t
where diff is not null;
The logic would be similar in most databases, although the function for calculating the time difference varies.
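To make it easy to try, here is a small self-contained version (the temp table and the time data type are assumptions about how the data is stored; the division by 60 converts seconds to the minutes shown in the expected output):
CREATE TABLE #t (customer_id int, transaction_time time);
INSERT INTO #t VALUES
    (1, '00:30'), (1, '00:35'), (1, '00:37'), (1, '00:38'),
    (2, '00:20'), (2, '00:21'), (2, '00:23');

SELECT t.customer_id, t.diff / 60 AS time_diff
FROM (SELECT t.*,
             DATEDIFF(second,
                      LAG(transaction_time) OVER (PARTITION BY customer_id ORDER BY transaction_time),
                      transaction_time) AS diff
      FROM #t t
     ) t
WHERE t.diff IS NOT NULL;
-- customer 1: 5, 2, 1    customer 2: 1, 2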
I was searching the web and Stack Overflow but didn't find an answer. :( So please help me; I am still learning and reading, but I am not yet thinking correctly; there are no IFs and FOR loops to do stuff. :)
I have table1:
id| date |state_on_date|year_quantity
1|30.12.2013|23 |100
1|31.12.2013|25 |100
1|1.1.2014 |35 |150
1|2.1.2014 |12 |150
2|30.12.2013|34 |200
2|31.12.2013|65 |200
2|1.1.2014 |43 |300
I am trying to get:
table2:
id| date |state_on_date|year_quantity|state_on_date_compare
1|30.12.2013| 23 |100 |23
1|31.12.2013| 25 |100 |-2
1|1.1.2014 | 35 |150 |-10
1|2.1.2014 | 12 |150 |23
2|30.12.2013| 34 |200 |34
2|31.12.2013| 65 |200 |-31
2|1.1.2014 | 43 |300 |22
Rules to get numbers:
id|date |state_on_date|year_quantity|state_on_date_compare
1|30.12.2013| 23 |100| 23 (lowest state_on_date for id 1)
1|31.12.2013| 25 |100| -2 (23-25)
1| 1.1.2014| 35 |150|-10 (25-35)
1| 2.1.2014| 12 |150| 23 (35-12)
2|30.12.2013| 34 |200| 34 (lowest state_on_date for id 2)
2|31.12.2013| 65 |200|-31 (34-65)
2| 1.1.2014| 43 |300| 22 (65-43)
Thanks in advance for every suggestion or solution you make.
You have to understand that SQL is misleading because of presentation issues. Like in The Matrix ("there is no spoon"), in a query there is no previous record.
SQL is based on set theory, for which there IS NO ORDER of records. All records are just set members. The theory behind SQL is that anything you do normally should be considered as though you are doing it to ALL RECORDS AT THE SAME TIME! The fact that a datasheet view of a SELECT query shows record A before record B is an artifact of presentation - not of actual record order.
In fact, the records returned by a query are in the same order as they appear in a table UNLESS you have included a GROUP BY or ORDER BY clause. And the order of record appearance in a table is usually the order in which they were created UNLESS there is a functional primary key on that table.
However, both of these statements leave you with the same problem. There is no SYNTAX for the concepts of NEXT and PREVIOUS because it is the CONCEPT of order that doesn't exist in SQL.
VBA recordsets, though based on SQL as recordsources, create an extra context that encapsulates the SQL context. That is why VBA can do what you want and SQL itself cannot. It is the "extra" context in which VBA can define variables holding what you wanted to remember until another record comes along.
Having now rained on your parade, here are some thoughts that MIGHT help.
When you want to see "previous record" data, there MUST be a way for Access to find what you consider to be the "previous record." Therefore, if you have not allowed for this situation, it is a design flaw. (Based on you not realizing the implications of SET theory, which is eminently forgivable for new Access users, so don't take it too hard.) This is based on the "Old Programmer's Rule" that says "Access can't tell you anything you didn't tell it first." Which means - in practical terms - that if order means something to you, you must give Access the data required to remember and later impose that order. If you have no variable to identify proper order with respect to your data set, you cannot impose the desired order later. In this case, it looks like a combination of id and date together will give you an ordering variable.
You can SOMETIMES do something like a DLookup in a query where you look for the record that would precede the current one based on some order identifier.
e.g. if you were ordering by date/time fields and meant "previous" to imply the record with the next earlier time than the record in focus, you would choose the record with the maximum date less than the date in focus. Look at the DMax function. Also notice I said "record in focus" not "current record." This is a fine point, but "Current" also implies ordering by connotation. ("Previous" and "Next" imply order by denotation, a stronger definition.)
Anyway, contemplate this little jewel:
DLookup( "[myvalue]", "mytable", "[mytable]![mydate] = #" & CStr( DMax( "[mydate]", "mytable", "[mytable]![mydate] < #" & CStr( [mydate] ) & "# )" ) & "#" )
I don't guarantee that the parentheses are balanced for the functions and I don't guarantee that the syntax is exactly right. Use Access Help on DLookup, DMax, Cstr, and on strings (in functions) in order to get the exact syntax. The idea is to use a query (implied by DMax) to find the largest date less than the date in focus in order to feed a query (implied by DLookup) to find the value for the record having that date. And the CStr converts the date/time variable to a string so you can use the "#" signs as date-string brackets.
IF you are dealing with different dates for records with different qualifiers, you will also have to include the rest of the qualifiers in BOTH the DMax and DLookup functions. That syntax gets awfully nasty awfully fast, which is why folks take up VBA in the first place.
Johnny Bones makes some good points in his answer, but in fact there is a way to have Access SQL perform the required calculations in this case. Our sample data is in a table named [table1]:
id date state_on_date year_quantity
-- ---------- ------------- -------------
1 2013-12-30 23 100
1 2013-12-31 25 100
1 2014-01-01 35 150
1 2014-01-02 12 150
2 2013-12-30 34 200
2 2013-12-31 65 200
2 2014-01-01 43 300
Step 1: Determining the initial rows for each [id]
We start by creating a saved query in Access named [StartDatesById] to give us the earliest date for each [id]
SELECT id, MIN([date]) AS MinOfDate
FROM table1
GROUP BY id
That gives us
id MinOfDate
-- ----------
1 2013-12-30
2 2013-12-30
Now we can use that in another query to give us the initial rows for each [id]
SELECT
table1.id,
table1.date,
table1.state_on_date,
table1.year_quantity,
table1.state_on_date AS state_on_date_compare
FROM
table1
INNER JOIN
StartDatesById
ON table1.id = StartDatesById.id
AND table1.date = StartDatesById.MinOfDate
which gives us
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-30 23 100 23
2 2013-12-30 34 200 34
Step 2: Calculating the subsequent rows
This step begins with creating a saved query named [PreviousDates] that uses a self-join on [table1] to give us the previous dates for each row in [table1] that is not the first row for that [id]
SELECT
t1a.id,
t1a.date,
MAX(t1b.date) AS previous_date
FROM
table1 AS t1a
INNER JOIN
table1 AS t1b
ON t1a.id = t1b.id
AND t1a.date > t1b.date
GROUP BY
t1a.id,
t1a.date
That query gives us
id date previous_date
-- ---------- -------------
1 2013-12-31 2013-12-30
1 2014-01-01 2013-12-31
1 2014-01-02 2014-01-01
2 2013-12-31 2013-12-30
2 2014-01-01 2013-12-31
Once again, we can use that query in another query to derive the subsequent records for each [id]
SELECT
curr.id,
curr.date,
curr.state_on_date,
curr.year_quantity,
prev.state_on_date - curr.state_on_date AS state_on_date_compare
FROM
(
table1 AS curr
INNER JOIN
PreviousDates
ON curr.id = PreviousDates.id
AND curr.date = PreviousDates.date
)
INNER JOIN
table1 AS prev
ON prev.id = PreviousDates.id
AND prev.date = PreviousDates.previous_date
which returns
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-31 25 100 -2
1 2014-01-01 35 150 -10
1 2014-01-02 12 150 23
2 2013-12-31 65 200 -31
2 2014-01-01 43 300 22
Step 3: Combining the results of steps 1 and 2
To combine the results from the previous two steps we just include them both in a UNION query and sort by the first two columns
SELECT
table1.id,
table1.date,
table1.state_on_date,
table1.year_quantity,
table1.state_on_date AS state_on_date_compare
FROM
table1
INNER JOIN
StartDatesById
ON table1.id = StartDatesById.id
AND table1.date = StartDatesById.MinOfDate
UNION ALL
SELECT
curr.id,
curr.date,
curr.state_on_date,
curr.year_quantity,
prev.state_on_date - curr.state_on_date AS state_on_date_compare
FROM
(
table1 AS curr
INNER JOIN
PreviousDates
ON curr.id = PreviousDates.id
AND curr.date = PreviousDates.date
)
INNER JOIN
table1 AS prev
ON prev.id = PreviousDates.id
AND prev.date = PreviousDates.previous_date
ORDER BY 1, 2
returning
id date state_on_date year_quantity state_on_date_compare
-- ---------- ------------- ------------- ---------------------
1 2013-12-30 23 100 23
1 2013-12-31 25 100 -2
1 2014-01-01 35 150 -10
1 2014-01-02 12 150 23
2 2013-12-30 34 200 34
2 2013-12-31 65 200 -31
2 2014-01-01 43 300 22
I hope this would be helpful
http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/calculating-mean-median-and-mode-with-sq
You can use SELECT * INTO table2 FROM table1 WHERE <your conditions>; I am not sure whether this would work.
I have a database table containing time-periods and amounts. Think of them as contracts with a duration and a price per day:
start | end | amount_per_day
2013-01-01 | 2013-01-31 | 100
2013-02-01 | 2013-06-30 | 200
2013-01-01 | 2013-06-30 | 100
2013-05-01 | 2013-05-15 | 50
2013-05-16 | 2013-05-31 | 50
I would like to make a query that will display the totals for each period, i.e.:
From 2013-01-01 to 2013-01-31, the first and third contracts are active, so the total amount per day is 200. From 2013-02-01 to 2013-04-30, the second and third rows are active, so the total is 300. From 2013-05-01 to 2013-05-15, the second, third, and fourth rows are active, so the total is 350. From 2013-05-16 to 2013-05-31, the second, third, and fifth rows are active, so the total is again 350. Finally, from 2013-06-01 to 2013-06-30, only the second and third are active, so the total is back to 300.
start | end | total_amount_per_day
2013-01-01 | 2013-01-31 | 200
2013-02-01 | 2013-04-30 | 300
2013-05-01 | 2013-05-31 | 350
2013-06-01 | 2013-06-30 | 300
(It is not necessary to detect that the intervals 2013-05-01 -> 2013-05-15 and 2013-05-16 -> 2013-05-31 have the same totals and merge them, but it would be nice).
I would prefer a portable solution, but if that is not possible, a SQL Server-specific one will work, too.
I can make small changes to the structure of the table, so if it would make the query simpler to e.g. notate the time-periods with the end-date exclusive (so the first period would be start = 2013-01-01, end = 2013-02-01) feel free to make such suggestions.
I'll start with the full query and then break it down and explain it. This is SQL Server-specific, but with minor tweaks it could be adapted to any DBMS that supports analytical functions.
WITH Data AS
( SELECT Start, [End], Amount_Per_Day
FROM (VALUES
('20130101', '20130131', 100),
('20130201', '20130630', 200),
('20130101', '20130630', 100),
('20130501', '20130515', 50),
('20130516', '20130531', 50)
) t (Start, [End], Amount_Per_Day)
), Numbers AS
( SELECT Number
FROM Master..spt_values
WHERE Type = 'P'
), DailyData AS
( SELECT [Date] = DATEADD(DAY, Number, Start),
[AmountPerDay] = SUM(Amount_Per_Day)
FROM Data
INNER JOIN Numbers
ON Number BETWEEN 0 AND DATEDIFF(DAY, Start, [End])
GROUP BY DATEADD(DAY, Number, Start)
), GroupedData AS
( SELECT [Date],
AmountPerDay,
[GroupByValue] = DATEADD(DAY, -ROW_NUMBER() OVER(PARTITION BY AmountPerDay ORDER BY [Date]), [Date])
FROM DailyData
)
SELECT [Start] = MIN([Date]),
[End] = MAX([Date]),
AmountPerDay
FROM GroupedData
GROUP BY AmountPerDay, GroupByValue
ORDER BY [Start], [End];
The Data CTE is just your sample data.
The Numbers CTE is just a sequence of numbers from 0 to 2047 (if your start and end dates are more than 2047 days apart, this will fail and will need adapting slightly).
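If a wider range is needed, one option (a sketch, not part of the original answer) is to swap the Numbers CTE for a stacked CTE; 10,000 numbers covers roughly 27 years of days:
WITH E1 AS (SELECT n = 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(n)), -- 10 rows
     E2 AS (SELECT n = 1 FROM E1 a CROSS JOIN E1 b),                                  -- 100 rows
     E4 AS (SELECT n = 1 FROM E2 a CROSS JOIN E2 b),                                  -- 10,000 rows
     Numbers AS
     (  SELECT Number = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
        FROM E4
     )
SELECT Number
FROM Numbers;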
The next CTE, DailyData, simply uses the numbers to expand your ranges into their individual dates, so
20130101, 20130131, 100
Becomes
20130101, 100
20130102, 100
20130103, 100
....
20130131, 100
Then it is just a case of grouping the data by the amount per day with the help of the ROW_NUMBER function to find when it changes and define ranges of similar amounts per day, then getting the MIN and MAX date for each range.
I always struggle to explain/demonstrate the exact workings of this method of grouping ranges; if it doesn't make sense, it is perhaps easiest seen for yourself if you just use SELECT * FROM DailyData at the end to see the raw, unaggregated data.
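As a small illustration of the ROW_NUMBER trick on its own (the values below are made up for demonstration, not taken from the data above): subtracting each row's per-amount ROW_NUMBER from its date yields a constant within each unbroken run of days with the same amount, so grouping by that constant splits the runs apart.
WITH DailyData AS
(   SELECT [Date] = CAST(v.[Date] AS date), v.AmountPerDay
    FROM (VALUES
        ('20130101', 200), ('20130102', 200), ('20130103', 200),
        ('20130104', 300), ('20130105', 300),
        ('20130106', 200), ('20130107', 200)
    ) v ([Date], AmountPerDay)
)
SELECT [Date],
       AmountPerDay,
       -- constant per unbroken run: the two runs of 200 get different values here
       GroupByValue = DATEADD(DAY, -ROW_NUMBER() OVER(PARTITION BY AmountPerDay ORDER BY [Date]), [Date])
FROM DailyData
ORDER BY [Date];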