link a value from one table to another and slice one table based on columns from another table in sql - sql

Suppose I have a first table like this:
tbl1:
eventid date1 date2
A 2020-06-21 2020-06-28
B 2020-05-13 2020-05-24
C 2020-07-20 2020-06-28
I also have a second table with a quantity and a date:
tbl2:
quantity date
5 2020-06-24
13 2020-07-24
8 2020-07-28
8 2020-06-20
12 2020-06-27
9 2020-06-29
10 2020-05-24
11 2020-05-12
18 2020-05-18
9 2020-05-14
7 2020-07-18
12 2020-07-21
Now I want select only the rows from table 2 where the dates fall between the dates of table 1 AND to add a column to table with each row containing A, B or C (eventid from table 1) so that we can see which date in table 2 belongs to which eventid.
So my end result would look like:
quantity date eventid
5 2020-06-24 1
13 2020-07-24 3
8 2020-07-28 3
12 2020-06-27 1
10 2020-05-24 2
18 2020-05-18 2
9 2020-05-14 2
12 2020-07-21 3
I've been starring at it for ages now because I need an efficient way to do it..
Is there an efficient way of obtaining the desired result?

This looks like a join:
select t2.*, t1.eventid
from tbl2 t2 join
tbl1 t1
on t2.date >= t1.date1 and t2.date <= t2.date2;

Related

SQL - Calculate the average of a value in a table B from date range in table A

I am constructing a table in SQL like this
TABLE A
obj_id start_date end_date
1 2021-03-01 2022-08-02
1 2020-06-01 2021-07-02
2 2021-05-03 2022-08-04
3 2021-04-21 2022-06-05
And I have another table
TABLE B
obj_id date value
1 2021-04-12 21.45
3 2022-06-15 19.02
1 2020-11-02 3.11
2 2022-05-23 45.20
1 2022-07-31 32.45
3 2021-09-01 22.56
2 2021-10-10 34.04
I want to add to TABLE A a column with average value of TABLE B for corresponding obj_id of values where TABLE B date falls between TABLE A date range.
Expected result
TABLE A
obj_id start_date end_date average value
1 2021-03-01 2022-08-02 26.95 <-- Average value of 21.45 and 32.45 excluding 3.11 from average because date in table B is outside date range in table A
1 2020-06-01 2021-07-02 etc.
2 2021-05-03 2022-08-04 etc.
3 2021-04-21 2022-06-05 etc.
Sample query:
select
a.obj_id,
a.start_date,
a.end_date,
avg(b.value) as average
from table_a a
inner join table_b b
on a.obj_id = b.obj_id
and b.date >= a.start_date
and b.date <= a.end_date
group by
a.obj_id,
a.start_date,
a.end_date
order by
a.obj_id

How can I join two tables on an ID and a DATE RANGE in SQL

I have 2 query result tables containing records for different assessments. There are RAssessments and NAssessments which make up a complete review.
The aim is to eventually determine which reviews were completed. I would like to join the two tables on the ID, and on the date, HOWEVER the date each assessment is completed on may not be identical and may be several days apart, and some ID's may have more of an RAssessment than an NAssessment.
Therefore, I would like to join T1 on to T2 on ID & on T1Date(+ or - 7 days). There is no other way to match the two tables and to align the records other than using the date range, as this is a poorly designed database. I hope for some help with this as I am stumped.
Here is some sample data:
Table #1:
ID
RAssessmentDate
1
2020-01-03
1
2020-03-03
1
2020-05-03
2
2020-01-09
2
2020-04-09
3
2022-07-21
4
2020-06-30
4
2020-12-30
4
2021-06-30
4
2021-12-30
Table #2:
ID
NAssessmentDate
1
2020-01-07
1
2020-03-02
1
2020-05-03
2
2020-01-09
2
2020-07-06
2
2020-04-10
3
2022-07-21
4
2021-01-03
4
2021-06-28
4
2022-01-02
4
2022-06-26
I would like my end result table to look like this:
ID
RAssessmentDate
NAssessmentDate
1
2020-01-03
2020-01-07
1
2020-03-03
2020-03-02
1
2020-05-03
2020-05-03
2
2020-01-09
2020-01-09
2
2020-04-09
2020-04-10
2
NULL
2020-07-06
3
2022-07-21
2022-07-21
4
2020-06-30
NULL
4
2020-12-30
2021-01-03
4
2021-06-30
2021-06-28
4
2021-12-30
2022-01-02
4
NULL
2022-01-02
Try this:
SELECT
COALESCE(a.ID, b.ID) ID,
a.RAssessmentDate,
b.NAssessmentDate
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RowId, *
FROM table1
) a
FULL OUTER JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RowId, *
FROM table2
) b ON a.ID = b.ID AND a.RowId = b.RowId
WHERE (a.RAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
OR (b.NAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')

Get all rows from one table stream and the row before in time from an other table

Suppose I have one table (table_1) and one table stream (stream_1) that gets changes made to table_1, in my case only inserts of new rows. And once I have acted on these changes, the rowes will be removed from stream_1 but remain in table_1.
From that I would like to calculate delta values for var1 (var1 - lag(var1) as delta_var1) partitioned on a customer and just leave var2 as it is. So the data in table_1 could look something like this:
timemessage
customerid
var1
var2
2021-04-01 06:00:00
1
10
5
2021-04-01 07:00:00
2
100
7
2021-04-01 08:00:00
1
20
10
2021-04-01 09:00:00
1
40
3
2021-04-01 15:00:00
2
150
5
2021-04-01 23:00:00
1
50
6
2021-04-02 06:00:00
2
180
2
2021-04-02 07:00:00
1
55
9
2021-04-02 08:00:00
2
200
4
And the data in stream_1 that I want to act on could looks like this:
timemessage
customerid
var1
var2
2021-04-01 23:00:00
1
50
6
2021-04-02 06:00:00
2
180
2
2021-04-02 07:00:00
1
55
9
2021-04-02 08:00:00
2
200
4
But to be able to calculate delta_var1 for all customers I would need the previous row in time for each customer before the ones in stream_1.
For example: To be able to calculate how much var1 has increased for customerid = 1 between 2021-04-01 09:00:00 and 2021-04-01 23:00:00 I want to include the 2021-04-01 09:00:00 row for customerid = 1 in my output.
So I would like to create a select containing all rows in stream_1 + the previous row in time for each customerid from table_1: The wanted output is the following in regard to the mentioned table_1 and stream_1.
timemessage
customerid
var1
var2
2021-04-01 09:00:00
1
40
3
2021-04-01 15:00:00
2
150
5
2021-04-01 23:00:00
1
50
6
2021-04-02 06:00:00
2
180
2
2021-04-02 07:00:00
1
55
9
2021-04-02 08:00:00
2
200
4
So given you have the "last value per day" in your wanted output, you are want a QUALIFY to keep only the wanted rows and using ROW_NUMBER partitioned by customerid and timemessage. Assuming the accumulator it positive only you can order by accumulatedvalue thus:
WITH data(timemessage, customerid, accumulatedvalue) AS (
SELECT * FROM VALUES
('2021-04-01', 1, 10)
,('2021-04-01', 2, 100)
,('2021-04-02', 1, 20)
,('2021-04-03', 1, 40)
,('2021-04-03', 2, 150)
,('2021-04-04', 1, 50)
,('2021-04-04', 2, 180)
,('2021-04-05', 1, 55)
,('2021-04-05', 2, 200)
)
SELECT * FROM data
QUALIFY ROW_NUMBER() OVER (PARTITION BY customerid,timemessage ORDER BY accumulatedvalue DESC) = 1
ORDER BY 1,2;
gives:
TIMEMESSAGE CUSTOMERID ACCUMULATEDVALUE
2021-04-01 1 10
2021-04-01 2 100
2021-04-02 1 20
2021-04-03 1 40
2021-04-03 2 150
2021-04-04 1 50
2021-04-04 2 180
2021-04-05 1 55
2021-04-05 2 200
if you can trust your data and data in table2 starts right after data in table1 then you can just get the last records for each customer from table1 and union with table2:
select * from table1
qualify row_number() over (partitioned by customerid order by timemessage desc) = 1
union all
select * from table2
if not
select a.* from table1 a
join table2 b
on a.customerid = b.customerid
and a.timemessage < b.timemessage
qualify row_number() over (partitioned by a.customerid order by a.timemessage desc) = 1
union all
select * from table2
also you can add a condition to not look to data for more than 1 day (or 1 hour or whatever safe interval is to look at) for better performance

Select statement for overlapping dates

I need a SELECT query that returns the RoomID's of rows in which the dates overlap each other, ex.
Client ID 10 and 6 arrive on different days, but they are assigned to the same room during their stay at the hotel.
RoomID ArrivalDate DepartureDate ClientID
2 2020-11-02 2021-11-10 10
2 2021-11-01 2021-11-11 6
4 2021-10-18 2021-10-20 4
4 2021-12-13 2021-12-21 11
4 2021-12-14 2021-12-21 12
8 2021-12-10 2021-12-19 8
9 2021-09-20 2021-09-25 2
9 2021-09-21 2021-09-25 1
9 2021-12-10 2021-12-15 7
10 2021-10-19 2021-10-26 5
11 2021-10-02 2021-10-10 3
11 2021-12-12 2021-12-18 9
12 2021-10-04 2021-10-09 2
CREATE DATABASE Hotel;
CREATE TABLE reservations (
roomID INT NOT NULL,
ArrivalDate DATE NOT NULL,
DepartureDate DATE NOT NULL,
clientID INT NOT NULL,
PRIMARY KEY (roomID, ArrivalDate),
CHECK (ArrivalDate <= DepartureDate)
);
I appreciate any help.
You can get overlaps using exists:
select t.*
from t
where exists (select 1
from t t2
where t2.RoomID = t.RoomId and
t2.ClientID <> t.ClientId and
t2.ArrivalDate < t.DepartureDate and
t2.DepartureDate > t.ArrivalDate
);

SQL - Count total IDs each day between dates

Here is what my data looks like
ID StartDate EndDate
1 1/1/2019 1/15/2019
2 1/10/2019 1/11/2019
3 2/5/2020 3/10/2020
4 3/10/2019 3/19/2019
5 5/1/2020 5/4/2020
I am trying to get a list of every date in my data set,and how many IDs fall in that time range, aggregated to the date level. So for ID-1, it would be in the records for 1/1/2019, 1/2/2019...through 1/15/2019.
I am not sure how to do this. All help is appreciated.
If you don't have a calendar table (highly recommended), you can perform this task with an ad-hoc tally table in concert with a CROSS APPLY
Example
Declare #YourTable Table ([ID] varchar(50),[StartDate] date,[EndDate] date)
Insert Into #YourTable Values
(1,'1/1/2019','1/15/2019')
,(2,'1/10/2019','1/11/2019')
,(3,'2/5/2020','3/10/2020')
,(4,'3/10/2019','3/19/2019')
,(5,'5/1/2020','5/4/2020')
Select A.ID
,B.Date
From #YourTable A
Cross Apply (
Select Top (DateDiff(DAY,A.[StartDate],A.[EndDate])+1) Date=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.[StartDate])
From master..spt_values n1,master..spt_values n2
) B
Returns
ID Date
1 2019-01-01
1 2019-01-02
1 2019-01-03
1 2019-01-04
1 2019-01-05
1 2019-01-06
1 2019-01-07
1 2019-01-08
1 2019-01-09
1 2019-01-15
2 2019-01-10
2 2019-01-11
....
5 2020-05-01
5 2020-05-02
5 2020-05-03
5 2020-05-04