Here is what my data looks like:
ID StartDate EndDate
1 1/1/2019 1/15/2019
2 1/10/2019 1/11/2019
3 2/5/2020 3/10/2020
4 3/10/2019 3/19/2019
5 5/1/2020 5/4/2020
I am trying to get a list of every date covered by my data set, along with how many IDs include that date in their range, aggregated to the date level. For example, ID 1 would be counted for 1/1/2019, 1/2/2019, and so on through 1/15/2019.
I am not sure how to do this. All help is appreciated.
If you don't have a calendar table (highly recommended), you can perform this task with an ad-hoc tally table in concert with a CROSS APPLY.
Example
-- Sample data held in a table variable
Declare @YourTable Table ([ID] varchar(50),[StartDate] date,[EndDate] date)
Insert Into @YourTable Values
 (1,'1/1/2019','1/15/2019')
,(2,'1/10/2019','1/11/2019')
,(3,'2/5/2020','3/10/2020')
,(4,'3/10/2019','3/19/2019')
,(5,'5/1/2020','5/4/2020')
-- One row per ID per day from StartDate through EndDate (inclusive)
Select A.ID
      ,B.Date
 From  @YourTable A
 Cross Apply (
        -- Ad-hoc tally: the cross join of spt_values supplies the rows that Row_Number() counts off
        Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) Date=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
         From  master..spt_values n1,master..spt_values n2
       ) B
Returns
ID Date
1 2019-01-01
1 2019-01-02
1 2019-01-03
1 2019-01-04
1 2019-01-05
1 2019-01-06
1 2019-01-07
1 2019-01-08
1 2019-01-09
1 2019-01-10
1 2019-01-11
1 2019-01-12
1 2019-01-13
1 2019-01-14
1 2019-01-15
2 2019-01-10
2 2019-01-11
....
5 2020-05-01
5 2020-05-02
5 2020-05-03
5 2020-05-04
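The query above expands each ID into one row per date but stops short of the per-date count the question asks for. In SQL Server, one way to get it would be to group that expanded output by B.Date and take Count(*). The sketch below checks the idea end to end. It is an illustration only and rests on assumptions that are not in the original: it uses SQLite through Python's sqlite3 module so it can run without a server, a recursive CTE stands in for the tally table, the column is named Day, and dates are stored as ISO strings.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; dates are ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, StartDate TEXT, EndDate TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, "2019-01-01", "2019-01-15"),
        (2, "2019-01-10", "2019-01-11"),
        (3, "2020-02-05", "2020-03-10"),
        (4, "2019-03-10", "2019-03-19"),
        (5, "2020-05-01", "2020-05-04"),
    ],
)

# Step 1: expand every ID into one row per day of its range (recursive CTE).
# Step 2: count how many IDs land on each day.
sql = """
WITH RECURSIVE expanded(ID, Day, EndDate) AS (
    SELECT ID, StartDate, EndDate FROM YourTable
    UNION ALL
    SELECT ID, date(Day, '+1 day'), EndDate
    FROM expanded
    WHERE Day < EndDate
)
SELECT Day, COUNT(*) AS IdCount
FROM expanded
GROUP BY Day
ORDER BY Day
"""
counts = dict(conn.execute(sql).fetchall())

print(counts["2019-01-09"])  # 1 (only ID 1 covers this day)
print(counts["2019-01-10"])  # 2 (IDs 1 and 2 overlap)
```

Dates that no ID covers do not appear in the result at all; a calendar table with a left join would be needed to report them with a count of zero.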
I am trying to create a dates table in SQL based on a set of inputs, but I haven't been able to figure it out.
I receive the following inputs in SQL:
This table:
Date Value
2022-01-01 5
2022-07-12 10
2022-11-15 3
A start date = 2022-01-01
A stop date = 2022-12-01
I need to get a table like the one below, running from the start date to the stop date, with each date assigned the corresponding value from the initial table:
Date Value
2022-01-01 5
2022-01-02 5
2022-01-03 5
2022-01-04 5
... 5
2022-07-09 5
2022-07-10 5
2022-07-11 5
2022-07-12 10
2022-07-13 10
2022-07-14 10
... 10
2022-11-13 10
2022-11-14 10
2022-11-15 3
2022-11-16 3
2022-11-17 3
2022-11-18 3
How can I do that?
Thanks.
You can use the window function lead() over() in concert with an ad-hoc tally table.
Example
Select Date = dateadd(DAY,N,A.Date)
,A.Value
From (
Select *
,nDays = datediff(DAY,Date,lead(Date,1,dateadd(day,1,'2022-12-01')) over (order by date))
From YourTable
) A
Join ( Select Top 1000 N=-1+Row_Number() Over (Order By (Select NULL)) From master..spt_values n1, master..spt_values n2 ) B
on N<NDays
Order by Date
Results
Date Value
2022-01-01 5
2022-01-02 5
2022-01-03 5
2022-01-04 5
2022-01-05 5
...
2022-07-10 5
2022-07-11 5
2022-07-12 10
2022-07-13 10
2022-07-14 10
...
2022-11-12 10
2022-11-13 10
2022-11-14 10
2022-11-15 3
2022-11-16 3
2022-11-17 3
...
2022-11-30 3
2022-12-01 3
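The row counts behind that result can be checked with a small runnable sketch. It follows the same two steps (lead() to measure the days until the next change, then a join to a tally of day offsets), but under assumptions that are not in the original: SQLite through Python's sqlite3 module instead of SQL Server, a recursive CTE instead of master..spt_values, julianday() arithmetic instead of datediff(), and the Date column renamed to ChangeDate.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; dates are ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ChangeDate TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("2022-01-01", 5), ("2022-07-12", 10), ("2022-11-15", 3)],
)

sql = """
WITH RECURSIVE tally(N) AS (      -- ad-hoc tally of offsets 0..999
    SELECT 0
    UNION ALL
    SELECT N + 1 FROM tally WHERE N < 999
),
spans AS (
    SELECT ChangeDate, Value,
           -- days until the next change; the last span ends on the stop date
           CAST(julianday(LEAD(ChangeDate, 1, date('2022-12-01', '+1 day'))
                          OVER (ORDER BY ChangeDate))
                - julianday(ChangeDate) AS INTEGER) AS nDays
    FROM YourTable
)
SELECT date(s.ChangeDate, '+' || t.N || ' days') AS Day, s.Value
FROM spans s
JOIN tally t ON t.N < s.nDays
ORDER BY Day
"""
rows = conn.execute(sql).fetchall()
by_day = dict(rows)

print(len(rows))             # 335 days from 2022-01-01 through 2022-12-01
print(by_day["2022-07-11"])  # 5
print(by_day["2022-07-12"])  # 10
```

The tally only needs to be as long as the longest gap between two changes (192 days here), so the cap of 1000 in both versions is a safety margin rather than a requirement.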
I have two query result tables containing records for different assessments. RAssessments and NAssessments together make up a complete review.
The aim is to eventually determine which reviews were completed. I would like to join the two tables on the ID and on the date. However, the two assessments may not be completed on the same date and may be several days apart, and some IDs may have more RAssessments than NAssessments.
Therefore, I would like to join T1 to T2 on ID and on T1Date plus or minus 7 days. Because the database is poorly designed, the date range is the only way to match and align the records between the two tables. I am stumped and would appreciate some help.
Here is some sample data:
Table #1:
ID RAssessmentDate
1 2020-01-03
1 2020-03-03
1 2020-05-03
2 2020-01-09
2 2020-04-09
3 2022-07-21
4 2020-06-30
4 2020-12-30
4 2021-06-30
4 2021-12-30
Table #2:
ID NAssessmentDate
1 2020-01-07
1 2020-03-02
1 2020-05-03
2 2020-01-09
2 2020-07-06
2 2020-04-10
3 2022-07-21
4 2021-01-03
4 2021-06-28
4 2022-01-02
4 2022-06-26
I would like my end result table to look like this:
ID RAssessmentDate NAssessmentDate
1 2020-01-03 2020-01-07
1 2020-03-03 2020-03-02
1 2020-05-03 2020-05-03
2 2020-01-09 2020-01-09
2 2020-04-09 2020-04-10
2 NULL 2020-07-06
3 2022-07-21 2022-07-21
4 2020-06-30 NULL
4 2020-12-30 2021-01-03
4 2021-06-30 2021-06-28
4 2021-12-30 2022-01-02
4 NULL 2022-06-26
Try this:
SELECT
    COALESCE(a.ID, b.ID) ID,
    a.RAssessmentDate,
    b.NAssessmentDate
FROM (
    SELECT
        -- number each ID's rows in date order so the pairing is deterministic
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RAssessmentDate) RowId, *
    FROM table1
) a
FULL OUTER JOIN (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NAssessmentDate) RowId, *
    FROM table2
) b ON a.ID = b.ID AND a.RowId = b.RowId
WHERE (a.RAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
   OR (b.NAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
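Pairing rows by position within each ID only lines up when both tables have the same assessments in the same order, which the sample data does not guarantee (ID 4 has an RAssessment with no partner at the start). A join on the date window itself is closer to what the question describes. The sketch below is one possible shape for it, not a confirmed solution: it assumes SQLite through Python's sqlite3 module, measures the gap with julianday(), and emulates the full outer join with a LEFT JOIN plus a UNION ALL of the unmatched table2 rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (ID INTEGER, RAssessmentDate TEXT)")
conn.execute("CREATE TABLE table2 (ID INTEGER, NAssessmentDate TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, "2020-01-03"), (1, "2020-03-03"), (1, "2020-05-03"),
    (2, "2020-01-09"), (2, "2020-04-09"), (3, "2022-07-21"),
    (4, "2020-06-30"), (4, "2020-12-30"), (4, "2021-06-30"), (4, "2021-12-30"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (1, "2020-01-07"), (1, "2020-03-02"), (1, "2020-05-03"),
    (2, "2020-01-09"), (2, "2020-07-06"), (2, "2020-04-10"), (3, "2022-07-21"),
    (4, "2021-01-03"), (4, "2021-06-28"), (4, "2022-01-02"), (4, "2022-06-26"),
])

sql = """
SELECT ID, RAssessmentDate, NAssessmentDate
FROM (
    -- every RAssessment, with its NAssessment if one falls within 7 days
    SELECT a.ID AS ID, a.RAssessmentDate AS RAssessmentDate,
           b.NAssessmentDate AS NAssessmentDate
    FROM table1 a
    LEFT JOIN table2 b
      ON b.ID = a.ID
     AND ABS(julianday(b.NAssessmentDate) - julianday(a.RAssessmentDate)) <= 7
    UNION ALL
    -- NAssessments that no RAssessment claimed
    SELECT b.ID, NULL, b.NAssessmentDate
    FROM table2 b
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 a
        WHERE a.ID = b.ID
          AND ABS(julianday(b.NAssessmentDate) - julianday(a.RAssessmentDate)) <= 7
    )
) m
ORDER BY ID, COALESCE(RAssessmentDate, NAssessmentDate)
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

If two assessments of the same ID can fall within 7 days of each other, this join returns every combination, so a tie-break (for example, keeping only the closest date) would have to be added.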
Suppose I have a first table like this:
tbl1:
eventid date1 date2
A 2020-06-21 2020-06-28
B 2020-05-13 2020-05-24
C 2020-07-20 2020-07-28
I also have a second table with a quantity and a date:
tbl2:
quantity date
5 2020-06-24
13 2020-07-24
8 2020-07-28
8 2020-06-20
12 2020-06-27
9 2020-06-29
10 2020-05-24
11 2020-05-12
18 2020-05-18
9 2020-05-14
7 2020-07-18
12 2020-07-21
Now I want to select only the rows from table 2 whose dates fall between the dates of table 1, AND to add a column to table 2 with each row containing A, B or C (the eventid from table 1), so that we can see which date in table 2 belongs to which eventid.
So my end result would look like:
quantity date eventid
5 2020-06-24 1
13 2020-07-24 3
8 2020-07-28 3
12 2020-06-27 1
10 2020-05-24 2
18 2020-05-18 2
9 2020-05-14 2
12 2020-07-21 3
I've been staring at it for ages now because I need an efficient way to do it.
Is there an efficient way of obtaining the desired result?
This looks like a join:
select t2.*, t1.eventid
from tbl2 t2 join
tbl1 t1
on t2.date >= t1.date1 and t2.date <= t1.date2;
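A quick runnable check of that range join is sketched below. Two things in it are assumptions rather than part of the original: it runs on SQLite through Python's sqlite3 module, and event C is given the range 2020-07-20 to 2020-07-28, which is what the desired output implies.

```python
import sqlite3

# Assumption: in-memory SQLite; event C's end date is taken as 2020-07-28.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (eventid TEXT, date1 TEXT, date2 TEXT)")
conn.execute("CREATE TABLE tbl2 (quantity INTEGER, date TEXT)")
conn.executemany("INSERT INTO tbl1 VALUES (?, ?, ?)", [
    ("A", "2020-06-21", "2020-06-28"),
    ("B", "2020-05-13", "2020-05-24"),
    ("C", "2020-07-20", "2020-07-28"),
])
conn.executemany("INSERT INTO tbl2 VALUES (?, ?)", [
    (5, "2020-06-24"), (13, "2020-07-24"), (8, "2020-07-28"), (8, "2020-06-20"),
    (12, "2020-06-27"), (9, "2020-06-29"), (10, "2020-05-24"), (11, "2020-05-12"),
    (18, "2020-05-18"), (9, "2020-05-14"), (7, "2020-07-18"), (12, "2020-07-21"),
])

# Inner join on the range: rows of tbl2 outside every event simply drop out.
sql = """
SELECT t2.quantity, t2.date, t1.eventid
FROM tbl2 t2
JOIN tbl1 t1
  ON t2.date >= t1.date1 AND t2.date <= t1.date2
ORDER BY t2.date
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

If event ranges can overlap, a tbl2 row that falls into several of them is returned once per matching event.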
I would like to take this query (see below) and add a filter such as where win = 2. I would also like to add a column showing the number of races it took to reach each row that satisfies the filter, as in the example further down for win = 2.
I've tried calculating the number of rows in between, but my results were wildly wrong.
select
date, time, raceid, win
from master
where date = @date
order by time
DATE TIME RACEID WIN
2019-01-06 00:40:00 4445 2
2019-01-06 00:50:00 4432 0
2019-01-06 01:00:00 4441 2
2019-01-06 01:10:00 4446 2
2019-01-06 01:20:00 4433 1
2019-01-06 01:30:00 4439 1
2019-01-06 01:40:00 4447 2
2019-01-06 01:50:00 4434 2
2019-01-06 02:00:00 4442 0
2019-01-06 02:10:00 4448 0
2019-01-06 02:20:00 4435 2
2019-01-06 02:30:00 4443 2
2019-01-06 02:40:00 4449 2
2019-01-06 02:50:00 4436 0
2019-01-06 02:50:00 4444 2
With win = 2, the desired output is:
DATE TIME RACEID WIN RacestoWin
2019-01-06 00:40:00 4445 2 1
2019-01-06 01:00:00 4441 2 2
2019-01-06 01:10:00 4446 2 1
2019-01-06 01:40:00 4447 2 3
2019-01-06 01:50:00 4434 2 1
2019-01-06 02:20:00 4435 2 3
2019-01-06 02:30:00 4443 2 1
2019-01-06 02:40:00 4449 2 1
2019-01-06 02:50:00 4444 2 2
Is there a simple way of doing this? I'm not the best at this, so any guidance would be greatly appreciated!
I see. You are counting the rows between the wins. Basically, you want to assign a group. This group is the cumulative number of 2s on or after that record. Then, within each group, you can use row_number() or even aggregation in this case (because you know the last row of the group is "2"):
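select date, max(time) as time, 2 as win, count(*) as racestowin
from (select m.*,
             -- running count of wins from the latest race backwards;
             -- each win and the non-winning rows before it share one value
             sum(case when m.win = 2 then 1 else 0 end) over (partition by m.date order by m.time desc) as grp
      from master m
     ) m
group by date, grp;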
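To see the grouping at work on the sample rows, here is a runnable sketch. It is not the answer's exact query: it assumes SQLite through Python's sqlite3 module, names the group column grp, and adds two things of its own, a conditional MAX to carry the winning raceid along and a grp > 0 filter to drop any trailing races that come after the last win.

```python
import sqlite3

# Assumption: in-memory SQLite; the table mirrors the question's master table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (date TEXT, time TEXT, raceid INTEGER, win INTEGER)")
races = [
    ("00:40:00", 4445, 2), ("00:50:00", 4432, 0), ("01:00:00", 4441, 2),
    ("01:10:00", 4446, 2), ("01:20:00", 4433, 1), ("01:30:00", 4439, 1),
    ("01:40:00", 4447, 2), ("01:50:00", 4434, 2), ("02:00:00", 4442, 0),
    ("02:10:00", 4448, 0), ("02:20:00", 4435, 2), ("02:30:00", 4443, 2),
    ("02:40:00", 4449, 2), ("02:50:00", 4436, 0), ("02:50:00", 4444, 2),
]
conn.executemany(
    "INSERT INTO master VALUES ('2019-01-06', ?, ?, ?)", races
)

sql = """
SELECT MAX(time) AS win_time,
       MAX(CASE WHEN win = 2 THEN raceid END) AS raceid,  -- id of the winning row
       COUNT(*) AS racestowin
FROM (
    SELECT m.*,
           -- wins counted from the latest race backwards: a win and the
           -- losing races leading up to it end up with the same number
           SUM(CASE WHEN m.win = 2 THEN 1 ELSE 0 END)
               OVER (PARTITION BY m.date ORDER BY m.time DESC) AS grp
    FROM master m
) g
WHERE grp > 0            -- rows after the last win belong to no win
GROUP BY date, grp
ORDER BY win_time
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The two races at 02:50:00 share a timestamp, so they are peers in the window ordering and receive the same running total; that is why they land in one group and count as 2.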
I have a table that looks like:
id code date1 date2 block
--------------------------------------------------
20 1234 2017-07-01 2017-07-31 1
15 1234 2017-06-01 2017-06-30 1
13 1234 2017-05-01 2017-05-31 0
11 1234 2017-03-01 2017-03-31 0
9 1234 2017-02-01 2017-02-28 1
8 1234 2017-01-01 2017-01-31 0
7 1234 2016-11-01 2016-11-30 0
6 1234 2016-10-01 2016-10-31 1
2 1234 2016-09-01 2016-09-30 1
I need to rank the rows according to the blocks of 0's and 1's, like:
id code date1 date2 block desired_rank
-------------------------------------------------------------------
20 1234 2017-07-01 2017-07-31 1 1
15 1234 2017-06-01 2017-06-30 1 1
13 1234 2017-05-01 2017-05-31 0 2
11 1234 2017-03-01 2017-03-31 0 2
9 1234 2017-02-01 2017-02-28 1 3
8 1234 2017-01-01 2017-01-31 0 4
7 1234 2016-11-01 2016-11-30 0 4
6 1234 2016-10-01 2016-10-31 1 5
2 1234 2016-09-01 2016-09-30 1 5
I've tried to use rank() and dense_rank(), but the result I end up with is:
id code date1 date2 block dense_rank()
-------------------------------------------------------------------
20 1234 2017-07-01 2017-07-31 1 1
15 1234 2017-06-01 2017-06-30 1 2
13 1234 2017-05-01 2017-05-31 0 1
11 1234 2017-03-01 2017-03-31 0 2
9 1234 2017-02-01 2017-02-28 1 3
8 1234 2017-01-01 2017-01-31 0 3
7 1234 2016-11-01 2016-11-30 0 4
6 1234 2016-10-01 2016-10-31 1 4
2 1234 2016-09-01 2016-09-30 1 5
In the last table, the rank doesn't care about the rows, it just takes all the 1's and 0's as a unit and sets an ascending count starting at the first 1 and 0.
My query goes like this:
CREATE TEMP TABLE data (id integer,code text, date1 date, date2 date, block integer);
INSERT INTO data VALUES
(20,'1234', '2017-07-01','2017-07-31',1),
(15,'1234', '2017-06-01','2017-06-30',1),
(13,'1234', '2017-05-01','2017-05-31',0),
(11,'1234', '2017-03-01','2017-03-31',0),
(9, '1234', '2017-02-01','2017-02-28',1),
(8, '1234', '2017-01-01','2017-01-31',0),
(7, '1234', '2016-11-01','2016-11-30',0),
(6, '1234', '2016-10-01','2016-10-31',1),
(2, '1234', '2016-09-01','2016-09-30',1);
SELECT *,dense_rank() OVER (PARTITION BY code,block ORDER BY date2 DESC)
FROM data
ORDER BY date2 DESC;
By the way, the database is PostgreSQL.
I hope there's a workaround... Thanks :)
Edit: Note that the blocks of 0's and 1's aren't equal.
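There's no way to get this result using a single window function. You need two nested ones: the inner one flags the first row of each block, and the outer one takes a running sum of those flags: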
SELECT *,
   Sum(flag) -- now sum the 0/1 to create the "rank"
   Over (PARTITION BY code
         ORDER BY date2 DESC) AS desired_rank
FROM
 (
   SELECT *,
      CASE
         WHEN Lag(block) -- check if this is the 1st row of a new block
              Over (PARTITION BY code
                    ORDER BY date2 DESC) = block
         THEN 0
         ELSE 1
      END AS flag
   FROM data
 ) AS dt
ORDER BY date2 DESC;
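The intermediate flag column makes the technique easier to follow, so the sketch below prints it next to the running sum for the sample rows. It assumes SQLite through Python's sqlite3 module rather than PostgreSQL (Lag and Sum as window functions behave the same way for this query), and the output column name desired_rank is borrowed from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; dates are ISO text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (id INTEGER, code TEXT, date1 TEXT, date2 TEXT, block INTEGER)"
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", [
    (20, "1234", "2017-07-01", "2017-07-31", 1),
    (15, "1234", "2017-06-01", "2017-06-30", 1),
    (13, "1234", "2017-05-01", "2017-05-31", 0),
    (11, "1234", "2017-03-01", "2017-03-31", 0),
    (9,  "1234", "2017-02-01", "2017-02-28", 1),
    (8,  "1234", "2017-01-01", "2017-01-31", 0),
    (7,  "1234", "2016-11-01", "2016-11-30", 0),
    (6,  "1234", "2016-10-01", "2016-10-31", 1),
    (2,  "1234", "2016-09-01", "2016-09-30", 1),
])

sql = """
SELECT id, block, flag,
       -- running total of block starts = number of the current block
       SUM(flag) OVER (PARTITION BY code ORDER BY date2 DESC) AS desired_rank
FROM (
    SELECT *,
           -- 1 when block differs from the previous row (or there is none)
           CASE WHEN LAG(block) OVER (PARTITION BY code ORDER BY date2 DESC) = block
                THEN 0 ELSE 1 END AS flag
    FROM data
) AS dt
ORDER BY date2 DESC
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)  # (id, block, flag, desired_rank)
```

On the first row Lag returns NULL, the comparison is therefore not true, and the CASE falls through to ELSE 1, which is what starts the count at 1.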