I have the following table on SQL Server:
ID  FROM        TO          OFFER NUMBER
----------------------------------------
1   2022.01.02  9999.12.31  1
1   2022.01.02  2022.02.10  2
2   2022.01.05  2022.02.15  1
3   2022.01.02  9999.12.31  1
3   2022.01.15  2022.02.20  2
3   2022.02.03  2022.02.25  3
4   2022.01.16  2022.02.05  1
5   2022.01.17  2022.02.13  1
5   2022.02.05  2022.02.13  2
The range includes the start date but excludes the end date.
The date 9999.12.31 is given (comes from another system), but we could use the last day of the current quarter instead.
I need to find a way to determine the number of days on which the customer sees exactly one, two, or three offers. The following picture shows the method for ID 3:
The expected results should look like this (without substituting the last day of the quarter):
ID | # of days when the customer sees only 1 offer | # of days when the customer sees 2 offers | # of days when the customer sees 3 offers
1  | 2913863 | 39 | 0
2  | 41      | 0  | 0
3  | 2913861 | 24 | 17
4  | 20      | 0  | 0
5  | 19      | 8  | 0
I've found this article but it did not enlighten me.
Also, I have limited privileges (for example, I am not able to declare a variable), so I need to use "basic" T-SQL.
Please provide a detailed explanation besides the code.
Thanks in advance!
The following will (for each ID) extract all distinct dates, construct non-overlapping date ranges to test, and will count up the number of offers per range. The final step is to sum and format.
The fact that the start dates are inclusive and the end dates are exclusive while sometimes non-intuitive for the human, actually works well in algorithms like this.
DECLARE @Data TABLE (Id INT, FromDate DATETIME, ToDate DATETIME, OfferNumber INT)
INSERT @Data
VALUES
    (1, '2022-01-02', '9999-12-31', 1),
    (1, '2022-01-02', '2022-02-10', 2),
    (2, '2022-01-05', '2022-02-15', 1),
    (3, '2022-01-02', '9999-12-31', 1),
    (3, '2022-01-15', '2022-02-20', 2),
    (3, '2022-02-03', '2022-02-25', 3),
    (4, '2022-01-16', '2022-02-05', 1),
    (5, '2022-01-17', '2022-02-13', 1),
    (5, '2022-02-05', '2022-02-13', 2)
;
WITH Dates AS ( -- Gather distinct dates
    SELECT Id, Date = FromDate FROM @Data
    UNION --(distinct)
    SELECT Id, Date = ToDate FROM @Data
),
Ranges AS ( -- Construct non-overlapping ranges (the ToDate = NULL case will be ignored later)
    SELECT Id, FromDate = Date, ToDate = LEAD(Date) OVER(PARTITION BY Id ORDER BY Date)
    FROM Dates
),
Counts AS ( -- Calculate days and count offers per date range
    SELECT R.Id, R.FromDate, R.ToDate,
        Days = DATEDIFF(DAY, R.FromDate, R.ToDate),
        Offers = COUNT(*)
    FROM Ranges R
    JOIN @Data D ON D.Id = R.Id
        AND D.FromDate <= R.FromDate
        AND D.ToDate >= R.ToDate
    GROUP BY R.Id, R.FromDate, R.ToDate
)
SELECT Id
    ,[Days with 1 Offer] = SUM(CASE WHEN Offers = 1 THEN Days ELSE 0 END)
    ,[Days with 2 Offers] = SUM(CASE WHEN Offers = 2 THEN Days ELSE 0 END)
    ,[Days with 3 Offers] = SUM(CASE WHEN Offers = 3 THEN Days ELSE 0 END)
FROM Counts
GROUP BY Id
The WITH clause introduces Common Table Expressions (CTEs) which progressively build up intermediate results until a final select can be made.
Results:
Id  Days with 1 Offer  Days with 2 Offers  Days with 3 Offers
--------------------------------------------------------------
1   2913863            39                  0
2   41                 0                   0
3   2913861            24                  17
4   20                 0                   0
5   19                 8                   0
Alternately, the final select could use a pivot. Something like:
SELECT Id,
[Days with 1 Offer] = ISNULL([1], 0),
[Days with 2 Offers] = ISNULL([2], 0),
[Days with 3 Offers] = ISNULL([3], 0)
FROM (SELECT Id, Offers, Days FROM Counts) C
PIVOT (SUM(Days) FOR Offers IN ([1], [2], [3])) PVT
ORDER BY Id
See this db<>fiddle for a working example.
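The range-splitting idea described in this answer can also be sketched outside SQL, which may help when checking the expected numbers by hand. The following is only a minimal Python sketch under stated assumptions: the function name `days_by_offer_count` and the `OFFERS` list are made up for illustration, and the rows are copied from the question's sample data with half-open `[from, to)` ranges.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (id, from_date inclusive, to_date exclusive).
OFFERS = [
    (1, date(2022, 1, 2), date(9999, 12, 31)),
    (1, date(2022, 1, 2), date(2022, 2, 10)),
    (2, date(2022, 1, 5), date(2022, 2, 15)),
    (3, date(2022, 1, 2), date(9999, 12, 31)),
    (3, date(2022, 1, 15), date(2022, 2, 20)),
    (3, date(2022, 2, 3), date(2022, 2, 25)),
    (4, date(2022, 1, 16), date(2022, 2, 5)),
    (5, date(2022, 1, 17), date(2022, 2, 13)),
    (5, date(2022, 2, 5), date(2022, 2, 13)),
]


def days_by_offer_count(offers):
    """Return {id: {offer_count: days}} (hypothetical helper name)."""
    by_id = defaultdict(list)
    for cust_id, start, end in offers:
        by_id[cust_id].append((start, end))

    result = {}
    for cust_id, ranges in by_id.items():
        # Distinct boundary dates, like the Dates CTE (UNION removes duplicates).
        points = sorted({d for r in ranges for d in r})
        totals = defaultdict(int)
        # Adjacent boundary pairs play the role of LEAD() in the Ranges CTE.
        for lo, hi in zip(points, points[1:]):
            # An offer covers the sub-range if it starts on or before lo
            # and ends on or after hi (same test as the join in Counts).
            active = sum(1 for s, e in ranges if s <= lo and e >= hi)
            if active:
                totals[active] += (hi - lo).days
        result[cust_id] = dict(totals)
    return result


if __name__ == "__main__":
    for cust_id, totals in sorted(days_by_offer_count(OFFERS).items()):
        print(cust_id, totals)
```

Because every range is half-open, adjacent sub-ranges never share a day, so the day counts can simply be added up without any off-by-one correction.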
Find all date points for each ID. For each date point, count the offers that overlap it.
Refer to the comments within the query:
with
dates as
(
-- get all date points
select ID, theDate = FromDate from offers
union -- union to exclude any duplicate
select ID, theDate = ToDate from offers
),
cte as
(
select ID = d.ID,
Date_Start = d.theDate,
Date_End = LEAD(d.theDate) OVER (PARTITION BY ID ORDER BY theDate),
TheCount = c.cnt
from dates d
cross apply
(
-- Count no of overlapping
select cnt = count(*)
from offers x
where x.ID = d.ID
and x.FromDate <= d.theDate
and x.ToDate > d.theDate
) c
)
select ID, TheCount, days = sum(datediff(day, Date_Start, Date_End))
from cte
where Date_End is not null
group by ID, TheCount
order by ID, TheCount
Result :
ID  TheCount  days
---------------------
1   1         2913863
1   2         39
2   1         41
3   1         2913861
3   2         29
3   3         12
4   1         20
5   1         19
5   2         8
To get to the required format, use PIVOT
dbfiddle demo
I am attempting to understand the progression of my observations that are on time relative to when they were expected, regardless of the date they were expected. Therefore, I want to reindex each observation and generate a list that starts at day 0 (on the expected day) and then calculate forward for 10 more days (arbitrary).
I am testing this in BigQuery:
CREATE TABLE `db.tbl` (
id INTEGER,
expected DATE,
actual DATE
);
INSERT INTO `db.tbl`
( id , expected , actual )
VALUES
( 1 , '2022-01-01' , '2022-01-02' ),
( 2 , '2022-01-11' , '2022-01-20' ),
( 3 , '2022-01-21' , '2022-01-20' )
So, the first row represents an observation that was "missing"/"late"/"not on time" on day 0 (2022-01-01) and then "on time" from day 1 (2022-01-02) until the end of my window of interest (day 10).
The second row represents an observation that was "late" from day 0 (2022-01-11) to day 8 (2022-01-19) and "on time" after that.
The third row represents an observation that was observed early, so it should be "on time" from day 0 through day 10.
I would want the result to be:
day count fraction
0 1 0.33
1 2 0.67
2 2 0.67
3 2 0.67
4 2 0.67
5 2 0.67
6 2 0.67
7 2 0.67
8 2 0.67
9 3 1.00
10 3 1.00
Is this possible with a SELECT statement?
CREATE TEMP TABLE sample (
id INTEGER,
expected DATE,
actual DATE
);
INSERT INTO sample
( id , expected , actual )
VALUES
( 1 , '2022-01-01' , '2022-01-02' ),
( 2 , '2022-01-11' , '2022-01-20' ),
( 3 , '2022-01-21' , '2022-01-20' );
WITH observations AS (
SELECT day, COUNTIF(v = '1') AS count, (SELECT COUNT(id) FROM sample) AS total
FROM sample,
UNNEST([IF(DATE_DIFF(actual, expected, DAY) < 0, 0, DATE_DIFF(actual, expected, DAY))]) diff,
UNNEST(SPLIT(REPEAT('0', diff) || REPEAT('1', 11 - diff), '')) v WITH OFFSET day -- 11 characters cover day 0 through day 10
GROUP BY 1
)
SELECT day, count, ROUND(count / total, 2) AS fraction
FROM observations;
output:
Consider below
select day, sum(ontime) cnt, round(avg(ontime),2) fraction
from (
select day, if(dt < actual, 0, 1) ontime
from your_table,
unnest(generate_array(0,10)) day
left join unnest(generate_date_array(expected, actual)) dt with offset as day
using(day)
)
group by day
if applied to sample data in your question
with your_table as (
select 1 id, date '2022-01-01' expected, date '2022-01-02' actual union all
select 2, '2022-01-11' , '2022-01-20' union all
select 3, '2022-01-21' , '2022-01-20'
)
output is
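As a cross-check of the expected table in the question, here is a minimal Python sketch of the same counting logic. It is not either answer's method translated literally: the name `on_time_by_day` and its `window` parameter are assumptions made for illustration, and the rows are the three sample rows from the question.

```python
from datetime import date

# Sample rows from the question: (id, expected, actual).
ROWS = [
    (1, date(2022, 1, 1), date(2022, 1, 2)),
    (2, date(2022, 1, 11), date(2022, 1, 20)),
    (3, date(2022, 1, 21), date(2022, 1, 20)),
]


def on_time_by_day(rows, window=10):
    """Return [(day, count, fraction)] for day offsets 0..window."""
    # Days late per observation, clamped at 0 for early observations.
    delays = [max((actual - expected).days, 0) for _, expected, actual in rows]
    out = []
    for day in range(window + 1):
        # An observation is on time from the day offset equal to its delay.
        count = sum(1 for d in delays if d <= day)
        out.append((day, count, round(count / len(rows), 2)))
    return out


if __name__ == "__main__":
    for day, count, fraction in on_time_by_day(ROWS):
        print(day, count, fraction)
```

Re-indexing every observation to its own day 0 reduces the problem to comparing one integer delay per row against each day offset, which is what both queries above do in different ways.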
I have a table with the following entries,
ID  date          Frequency
---------------------------
1   '2012-04-30'  5
1   '2012-06-30'  4
1   '2012-07-31'  25
2   '2012-04-30'  7
2   '2012-05-31'  4
2   '2012-06-30'  1
2   '2012-07-31'  6
I need to add a row for each missing month. The date of each added row should be the last day of that month, with a frequency value of 0.
The expected output is
ID  date          Frequency
---------------------------
1   '2012-04-30'  5
1   '2012-05-31'  0
1   '2012-06-30'  4
1   '2012-07-31'  25
2   '2012-04-30'  7
2   '2012-05-31'  4
2   '2012-06-30'  1
2   '2012-07-31'  6
I would suggest recursive CTEs:
with cte as (
select id, date, frequency,
lead(date) over (partition by id order by date) as next_date
from t
union all
select id, eomonth(date, 1), 0, next_date
from cte
where eomonth(date, 1) < dateadd(day, -1, next_date)
)
select id, date, frequency
from cte
order by id, date;
The anchor part of the CTE calculates the end date for a given row. The recursive part then just keeps adding months to fill in the missing rows (and none if there are none). The use of eomonth(date, 1) is just a handy way of getting the last day of the next month.
Here is a db<>fiddle.
If you have all dates in the table, you can also use cross join to generate the rows and then left join to bring in the existing data:
select i.id, d.date, coalesce(t.frequency, 0) as frequency
from (select distinct id from t) i cross join
(select distinct date from t) d left join
t
on i.id = t.id and d.date = t.date
order by i.id, d.date;
If you have a large amount of data, you can compare performance. This may be a case where a recursive CTE is faster than alternative methods.
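The gap-filling step that the recursive CTE performs can be illustrated with a small Python sketch. This is only an illustration under assumptions: the helper names (`month_end`, `next_month_end`, `fill_missing_months`) and the `SAMPLE` list are hypothetical, and the input is assumed to contain only month-end dates, as in the question.

```python
import calendar
from datetime import date

# Sample rows from the question: (id, month-end date, frequency).
SAMPLE = [
    (1, date(2012, 4, 30), 5),
    (1, date(2012, 6, 30), 4),
    (1, date(2012, 7, 31), 25),
    (2, date(2012, 4, 30), 7),
    (2, date(2012, 5, 31), 4),
    (2, date(2012, 6, 30), 1),
    (2, date(2012, 7, 31), 6),
]


def month_end(year, month):
    """Last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def next_month_end(d):
    """Last day of the month after d, similar in spirit to EOMONTH(d, 1)."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return month_end(year, month)


def fill_missing_months(rows):
    """Return rows sorted by (id, date) with zero-frequency rows added for gaps."""
    rows = sorted(rows)
    out = []
    for i, (rid, d, freq) in enumerate(rows):
        out.append((rid, d, freq))
        # Next row of the same id, like LEAD(date) OVER (PARTITION BY id ORDER BY date).
        has_next = i + 1 < len(rows) and rows[i + 1][0] == rid
        if not has_next:
            continue
        cur = next_month_end(d)
        while cur < rows[i + 1][1]:
            out.append((rid, cur, 0))
            cur = next_month_end(cur)
    return out


if __name__ == "__main__":
    for row in fill_missing_months(SAMPLE):
        print(row)
```

Like the recursive CTE, the loop only fills gaps between existing rows of the same ID; it does not extend any ID before its first date or after its last one.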
I have the following sample table (provided with single ID for simplicity - need to perform the same logic across all IDs)
ID Visit_date
-----------------
ABC 8/7/2019
ABC 9/10/2019
ABC 9/12/2019
ABC 10/1/2019
ABC 10/1/2019
ABC 10/8/2019
ABC 10/15/2019
ABC 10/17/2019
ABC 10/24/2019
Here is the logic I need to produce the sample output:
Mark the first visit as 1 in the "new_visit" column.
Compare each subsequent date with the first date until one falls outside the 21-day window. For example, Sep 10 is compared to Aug 7 and does not fall within 21 days of Aug 7, so it is considered another new visit: mark new_visit as 1.
Then compare the subsequent dates with Sep 10 using the same 21-day criterion and mark all dates inside the window as follow-ups of the Sep 10 visit. E.g. Sep 12 and Oct 1 are within 21 days of Sep 10, so they are follow-up visits: mark "follow_up" as 1.
When a subsequent date exceeds the 21-day window of the previous new visit (e.g. Oct 8 compared to Sep 10), that date is considered a new visit: mark "new_visit" as 1. The dates after it are then compared against Oct 8.
Sample Output :
Dates        New_Visit   Follow_up
-----------------------------------
8/7/2019     1
9/10/2019    1
9/12/2019                1
10/1/2019                1
10/1/2019                1
10/8/2019    1
10/15/2019               1
10/17/2019               1
10/24/2019               1
You need a recursive query for this.
You would enumerate the rows, then walk through the dataset by ascending date, while keeping track of the first visit date of each group; when the interval since the last first visit exceeds 21 days, the date of the first visit resets, and a new group starts.
with
    data as (
        select t.*, row_number() over(partition by id order by visit_date) rn
        from mytable t
    ),
    cte as (
        -- anchor: the first visit of each id starts the first group
        select id, visit_date, visit_date first_visit_date, rn
        from data
        where rn = 1
        union all
        -- step: carry the group's first visit date forward, or reset it after 21 days
        select d.id, d.visit_date,
            case when d.visit_date > dateadd(day, 21, c.first_visit_date) then d.visit_date else c.first_visit_date end,
            d.rn
        from cte c
        inner join data d on d.id = c.id and d.rn = c.rn + 1
    )
select
    id,
    visit_date,
    case when visit_date = first_visit_date then 1 else 0 end as is_new,
    case when visit_date = first_visit_date then 0 else 1 end as is_follow_up
from cte
If a patient may have more than 100 visits, then you need to add option (maxrecursion 0) at the very end of the query.
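The walk described above (keep the first visit date of the current group, reset it once a visit falls more than 21 days later) can be sketched in a few lines of Python. This is an illustration only: `flag_visits` and its `window_days` parameter are hypothetical names, and the sketch handles a single ID, so it would be applied per ID in practice.

```python
from datetime import date, timedelta

# Visit dates for ID "ABC" from the question.
VISITS = [
    date(2019, 8, 7), date(2019, 9, 10), date(2019, 9, 12),
    date(2019, 10, 1), date(2019, 10, 1), date(2019, 10, 8),
    date(2019, 10, 15), date(2019, 10, 17), date(2019, 10, 24),
]


def flag_visits(visit_dates, window_days=21):
    """Return [(visit_date, is_new, is_follow_up)] for one ID."""
    flagged = []
    anchor = None  # first visit date of the current group
    for d in sorted(visit_dates):
        if anchor is None or d > anchor + timedelta(days=window_days):
            anchor = d  # window exceeded: this visit starts a new group
            flagged.append((d, 1, 0))
        else:
            flagged.append((d, 0, 1))
    return flagged


if __name__ == "__main__":
    for d, is_new, is_follow_up in flag_visits(VISITS):
        print(d.isoformat(), is_new, is_follow_up)
```

Note one difference from the query: the sketch flags a repeated date inside a group as a follow-up by position, whereas the query compares `visit_date` with `first_visit_date`, so a second visit on the same day as a new visit would be flagged as new there.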
You need a recursive CTE to handle this. This is the idea, although the exact syntax might vary by database:
with recursive t as (
      select id, date,
             row_number() over (partition by id order by date) as seqnum
      from yourtable
     ),
     cte as (
      select id, date, date as visit_start, 1 as is_new_visit, seqnum
      from t
      where seqnum = 1
      union all
      select cte.id, t.date,
             (case when t.date <= cte.visit_start + interval '21 day'
                   then cte.visit_start else t.date
              end) as visit_start,
             (case when t.date <= cte.visit_start + interval '21 day'
                   then 0 else 1
              end) as is_new_visit,
             t.seqnum
      from cte join
           t
           on t.id = cte.id and t.seqnum = cte.seqnum + 1
     )
select *
from cte
where is_new_visit = 1;
I need to compare each company's values side by side: current year vs. last year, and current month vs. the same month of the previous year.
I use this query to get the values:
SELECT STORE, SUM(TOTAL) as VAL, DATE FROM MYTABLE
WHERE DATE=CURRENT_DATE GROUP BY STORE ORDER BY STORE
These are the results:
STORE | VAL | DATE
1 10 CURRENT_DATE (2018-27-03)
1 20 2018-26-03
1 30 2018-25-03
2 20 CURRENT_DATE (2018-27-03)
2 20 2018-26-02
and I need this:
STORE | VALUE CURRENT YEAR | VALUE LAST YEAR
1 60 30 (CALCULATED)
2 40 50 (CALCULATED)
STORE | VALUE CURRENT MONTH | VALUE SAME MONTH OF LAST YEAR
1 60 30 (CALCULATED)
2 20 50 (CALCULATED)
Thank you
You could just join two sub-selects together.
E.g with this DDL and Data
CREATE TABLE MYTABLE (STORE int, VAL int, D DATE);
INSERT INTO MYTABLE VALUES
( 1, 10, '2018-03-27')
,( 1, 20, '2018-03-26')
,( 1, 10, '2018-02-25')
,( 1, 35, '2017-03-25')
,( 2, 20, '2018-03-27')
,( 2, 15, '2017-03-26');
This will get you the current month and same month of last year values:
SELECT C.*, LY.VAL_CURR_MONTH_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_MONTH
FROM MYTABLE WHERE INT(D)/100=INT(CURRENT_DATE)/100
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE INT(D)/100 = INT(CURRENT_DATE)/100 -100
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
Then this for years
SELECT C.*, LY.VAL_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_YEAR
FROM MYTABLE WHERE INT(D)/10000=INT(CURRENT_DATE)/10000
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_LY
FROM MYTABLE
WHERE INT(D)/10000 = INT(CURRENT_DATE)/10000 -1
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
P.S. there are many other ways to manipulate dates, but casting to INT is maybe one of the easier ways
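To make the integer trick concrete: the queries above rely on INT(D) yielding the date as a yyyymmdd number, so dividing by 100 leaves yyyymm and subtracting 100 moves back one year. The Python sketch below mirrors that arithmetic. It is an illustration only: the names `yyyymm` and `month_vs_last_year` and the explicit `today` parameter are assumptions, and unlike the LEFT JOIN version it reports 0 instead of NULL when a store has no rows in one of the two months.

```python
from datetime import date

# Rows from the answer's sample data: (store, val, d).
ROWS = [
    (1, 10, date(2018, 3, 27)),
    (1, 20, date(2018, 3, 26)),
    (1, 10, date(2018, 2, 25)),
    (1, 35, date(2017, 3, 25)),
    (2, 20, date(2018, 3, 27)),
    (2, 15, date(2017, 3, 26)),
]


def yyyymm(d):
    """Mirror INT(D)/100: 20180327 // 100 == 201803."""
    return (d.year * 10000 + d.month * 100 + d.day) // 100


def month_vs_last_year(rows, today):
    """Return {store: (current_month_total, same_month_last_year_total)}."""
    cur = yyyymm(today)
    prev = cur - 100  # same month, one year earlier
    totals = {}
    for store, val, d in rows:
        c, p = totals.get(store, (0, 0))
        if yyyymm(d) == cur:
            c += val
        elif yyyymm(d) == prev:
            p += val
        else:
            continue  # row belongs to neither month of interest
        totals[store] = (c, p)
    return totals


if __name__ == "__main__":
    print(month_vs_last_year(ROWS, date(2018, 3, 27)))
```

The same idea with a divisor of 10000 and an offset of 1 gives the year-level comparison.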
Also, here is a more flexible way to get the "Same Month of Last Year" value. A similar method can get "last Year" values.
SELECT T.*
, AVG(VAL) OVER(
PARTITION BY STORE
ORDER BY YEAR_MONTH
RANGE BETWEEN 101 PRECEDING AND 100 PRECEDING
) AS SAME_MONTH_PREV_YEAR
FROM
( SELECT STORE
, INTEGER(D)/100 AS YEAR_MONTH
, SUM(VAL) AS VAL
FROM
MYTABLE T
GROUP BY
STORE
, INTEGER(D)/100
) AS T
;
Gives
STORE YEAR_MONTH VAL SAME_MONTH_PREV_YEAR
----- ---------- --- --------------------
1 201703 35 NULL
1 201802 10 NULL
1 201803 30 35
2 201703 15 NULL
2 201803 20 15
It is better to avoid functions on table columns in WHERE clauses. Check the following SQL statements, which are based on P. Vernon's sample table.
Note: These SQLs are for DB2 LUW 11.1
For month:
SELECT STORE,
SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CURR_MONTH,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE D between first_day(current date) and last_day(current date)
or D between first_day(current date - 1 year) and last_day(current date - 1 year)
GROUP BY STORE
ORDER BY STORE
For year:
SELECT STORE, SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CY,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_LY
FROM MYTABLE
WHERE D between first_day(current date - (month(current date) - 1) months)
and last_day(current date + (12 - month(current date)) months)
or D between first_day(current date - (month(current date) - 1) months - 1 year)
and last_day(current date + (12 - month(current date)) months - 1 year)
GROUP BY STORE
ORDER BY STORE
I have a table with the two left columns shown below.
I am trying to compute the third column based on some logic.
Logic: Starting from date 1/1 and moving forward, the highest score reached before the score goes down is on 3/1, with a score of 12. So HighestAchieveScore for 1/1 is 12, and so forth.
If we are on a date where the next score goes down, HighestAchieveScore is that next score, as you can see at 3/01/2014.
date score HighestAchieveScore
1/01/2014 10 12
2/01/2014 11 12
3/01/2014 12 10
4/01/2014 10 11
5/01/2014 11 9
6/01/2014 9 8
7/01/2014 8 9
8/01/2014 9 9
I hope I explained it clear enough.
Thanks already for every input resolving the problem.
Let's make some test data:
DECLARE @Score TABLE
(
    ScoreDate DATETIME,
    Score INT
)
INSERT INTO @Score
VALUES
('01-01-2014', 10),
('01-02-2014', 11),
('01-03-2014', 12),
('01-04-2014', 10),
('01-05-2014', 11),
('01-06-2014', 9),
('01-07-2014', 8),
('01-08-2014', 9);
Now we number our rows and then link each row to the next one to see whether we are still going up:
WITH ScoreRows AS
(
    SELECT
        s.ScoreDate,
        s.Score,
        ROW_NUMBER() OVER (ORDER BY ScoreDate) RN
    FROM @Score s
),
ScoreUpDown AS
(
    SELECT p.ScoreDate,
        p.Score,
        p.RN,
        CASE WHEN p.Score < n.Score THEN 1 ELSE 0 END GoingUp,
        ISNULL(n.Score, p.Score) NextScore
    FROM ScoreRows p
    LEFT JOIN ScoreRows n
        ON n.RN = p.RN + 1
)
For each row, we then look for the next row that sits right before a fall, and take that row's score as the max for any row that is still going up. Otherwise, we use the score of the next (falling) row.
SELECT
    s.ScoreDate,
    s.Score,
    CASE WHEN s.GoingUp = 1 THEN d.Score ELSE s.NextScore END Test
FROM ScoreUpDown s
OUTER APPLY
(
    SELECT TOP 1 * FROM ScoreUpDown d
    WHERE d.ScoreDate > s.ScoreDate
        AND GoingUp = 0
    ORDER BY d.ScoreDate -- make TOP 1 pick the nearest later row deterministically
) d;
Output:
ScoreDate Score Test
2014-01-01 00:00:00.000 10 12
2014-01-02 00:00:00.000 11 12
2014-01-03 00:00:00.000 12 10
2014-01-04 00:00:00.000 10 11
2014-01-05 00:00:00.000 11 9
2014-01-06 00:00:00.000 9 8
2014-01-07 00:00:00.000 8 9
2014-01-08 00:00:00.000 9 9
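The rule this answer implements can be restated in a short Python sketch, which may make the two cases (still going up versus about to fall) easier to see. It is only an illustration: the function name `highest_achieved` is hypothetical and the input is assumed to be the scores already sorted by date.

```python
def highest_achieved(scores):
    """For each position, return the 'highest achieved score' as defined in the question.

    If the next score is higher, walk forward to the last score before a
    fall and return it. Otherwise return the next score, or the current
    score on the last row.
    """
    n = len(scores)
    result = []
    for i in range(n):
        going_up = i + 1 < n and scores[i + 1] > scores[i]
        if going_up:
            j = i + 1
            # Advance while the following score is still higher.
            while j + 1 < n and scores[j + 1] > scores[j]:
                j += 1
            result.append(scores[j])
        else:
            result.append(scores[i + 1] if i + 1 < n else scores[i])
    return result


if __name__ == "__main__":
    print(highest_achieved([10, 11, 12, 10, 11, 9, 8, 9]))
```

The inner loop corresponds to the OUTER APPLY lookup for the nearest later row with GoingUp = 0.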
Assuming you are wanting the third column to be computed, you can create the table like this (or add the column to an existing table), using a function to determine the value of the third column:
Create Function dbo.fnGetMaxScore(@Date Date)
Returns Int
As Begin
    Declare @Ret Int
    Select @Ret = Max(Score)
    From YourTable
    Where Date > @Date
    Return @Ret
End
Create Table YourTable
(
Date Date,
Score Int,
HighestAchieveScore As dbo.fnGetMaxScore(Date)
)
I'm not sure this will work.... but this is the general concept.
Self join on A.Date < B.Date to get max score, but use coalesce and a 3rd self join on a rowID assigned in a CTE to determine if the score dropped on the next record, and if it did coalesce that score in, otherwise use the max score.
NEED TO TEST but have to setup a fiddle to do so..
WITH CTE as
(SELECT Date, Score, ROW_NUMBER() OVER(ORDER BY Date ASC) AS Row FROM tableName)
SELECT A.Date, A.Score, coalesce(C.Score, Max(B.Score)) as HighestAchieveScore
FROM CTE A
LEFT JOIN CTE B
on A.Date < B.Date
LEFT JOIN CTE C
on A.Row+1=C.Row
and A.Score > C.Score
GROUP BY A.Date,
A.Score,
C.Score
This should work on SQL Server 2012 but not earlier versions:
WITH cte AS (
SELECT date,
LEAD(score) OVER (ORDER BY date) nextScore
FROM yourTable
)
SELECT t.date, score,
CASE
WHEN nextScore < score THEN nextScore
ELSE (
SELECT ISNULL(MAX(t1.score), t.score)
FROM yourTable t1
JOIN cte ON t1.date = cte.date
WHERE t1.date > t.date
AND ISNULL(nextScore, 0) < score
)
END AS HighestAchieveScore
FROM yourTable t
JOIN cte ON t.date = cte.date