Select max date before min date in another table - sql

I have three tables.
Order_Status
Order_ID order_status_id Timestamp
1 2 12/24/19 0:00
1 3 12/24/19 0:10
1 4 12/24/19 0:30
1 5 12/24/19 1:00
2 2 12/24/19 15:00
2 3 12/24/19 15:07
2 9 12/24/19 15:10
2 8 12/24/19 15:33
2 10 12/24/19 16:00
4 4 12/24/19 19:00
4 2 12/24/19 19:30
4 3 12/24/19 19:32
4 4 12/24/19 19:40
4 5 12/24/19 19:45
5 2 1/28/19 19:30
5 6 1/28/19 19:48
Contact
Order_id Contact_time
1 12/24/19 0:25
2 12/24/19 15:30
4 12/24/19 19:38
5 1/28/19 19:46
meta_status
order_status_id status_description
1 desc1
2 desc2
3 desc3
4 desc4
5 desc5
I am trying to retrieve the max order Timestamp that falls before the min Contact Time. The result needs to be grouped by Order_ID, and I also need the order_status_id and the status_description for that row.
This is my query so far
SELECT a.Order_ID,
a.order_status_id,
c.status_description,
MAX(CASE
WHEN a.order_timestamp < b.Contact_Time then
a.order_timestamp
ELSE
null
END) AS beforeContact
FROM Order_Status a
LEFT JOIN Contact b
ON b.Order_ID = a.Order_ID
LEFT JOIN meta_status c
ON c.order_status_id = a.order_status_id
GROUP BY a.Order_ID, a.order_status_id, c.status_description
But it still returns every row in the tables. I need only 4 rows, one for each of orders 1, 2, 4, and 5, each with the max order timestamp before the contact time.
Do I need to use a subquery or a window function for this?

This is it:
SELECT x.*
FROM (SELECT a.Order_ID,
a.order_status_id,
c.status_description,
a.order_timestamp AS beforeContact,
-- rank 1 is the latest status row before the contact time
RANK() OVER (PARTITION BY a.Order_ID ORDER BY a.order_timestamp DESC) AS rank1
FROM Order_Status a
JOIN Contact b
ON b.Order_ID = a.Order_ID
LEFT JOIN meta_status c
ON c.order_status_id = a.order_status_id
WHERE a.order_timestamp < b.Contact_Time) AS x
WHERE x.rank1 = 1;
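To check the idea against the sample data, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. Several things are assumptions rather than part of the question: the timestamps are rewritten as ISO strings so that text comparison sorts them correctly, the status timestamp column is called order_timestamp as in the query above, ROW_NUMBER() is used in place of RANK(), and the contact time is reduced with MIN() in case an order has more than one contact. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Order_Status (Order_ID INT, order_status_id INT, order_timestamp TEXT);
CREATE TABLE Contact (Order_ID INT, Contact_Time TEXT);
CREATE TABLE meta_status (order_status_id INT, status_description TEXT);
""")
# Sample rows from the question, with timestamps rewritten as ISO strings (assumption)
con.executemany("INSERT INTO Order_Status VALUES (?, ?, ?)", [
    (1, 2, "2019-12-24 00:00"), (1, 3, "2019-12-24 00:10"),
    (1, 4, "2019-12-24 00:30"), (1, 5, "2019-12-24 01:00"),
    (2, 2, "2019-12-24 15:00"), (2, 3, "2019-12-24 15:07"),
    (2, 9, "2019-12-24 15:10"), (2, 8, "2019-12-24 15:33"),
    (2, 10, "2019-12-24 16:00"),
    (4, 4, "2019-12-24 19:00"), (4, 2, "2019-12-24 19:30"),
    (4, 3, "2019-12-24 19:32"), (4, 4, "2019-12-24 19:40"),
    (4, 5, "2019-12-24 19:45"),
    (5, 2, "2019-01-28 19:30"), (5, 6, "2019-01-28 19:48"),
])
con.executemany("INSERT INTO Contact VALUES (?, ?)", [
    (1, "2019-12-24 00:25"), (2, "2019-12-24 15:30"),
    (4, "2019-12-24 19:38"), (5, "2019-01-28 19:46"),
])
con.executemany("INSERT INTO meta_status VALUES (?, ?)",
                [(i, f"desc{i}") for i in range(1, 6)])

query = """
SELECT Order_ID, order_status_id, status_description, beforeContact
FROM (
    SELECT s.Order_ID AS Order_ID,
           s.order_status_id AS order_status_id,
           m.status_description AS status_description,
           s.order_timestamp AS beforeContact,
           -- rn = 1 marks the latest status row before the first contact
           ROW_NUMBER() OVER (PARTITION BY s.Order_ID
                              ORDER BY s.order_timestamp DESC) AS rn
    FROM Order_Status s
    JOIN (SELECT Order_ID, MIN(Contact_Time) AS first_contact
          FROM Contact GROUP BY Order_ID) c
      ON c.Order_ID = s.Order_ID
    LEFT JOIN meta_status m
      ON m.order_status_id = s.order_status_id
    WHERE s.order_timestamp < c.first_contact
)
WHERE rn = 1
ORDER BY Order_ID
"""
latest_before_contact = con.execute(query).fetchall()
for row in latest_before_contact:
    print(row)
```

Order 2 comes back with an empty status_description because status 9 has no row in meta_status, which is why the join to meta_status stays a LEFT JOIN.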

SQL - Calculate the average of a value in a table B from date range in table A

I am constructing a table in SQL like this
TABLE A
obj_id start_date end_date
1 2021-03-01 2022-08-02
1 2020-06-01 2021-07-02
2 2021-05-03 2022-08-04
3 2021-04-21 2022-06-05
And I have another table
TABLE B
obj_id date value
1 2021-04-12 21.45
3 2022-06-15 19.02
1 2020-11-02 3.11
2 2022-05-23 45.20
1 2022-07-31 32.45
3 2021-09-01 22.56
2 2021-10-10 34.04
I want to add a column to TABLE A that holds the average of the TABLE B values for the same obj_id whose date falls within the TABLE A date range.
Expected result
TABLE A
obj_id start_date end_date average value
1 2021-03-01 2022-08-02 26.95 <-- Average value of 21.45 and 32.45 excluding 3.11 from average because date in table B is outside date range in table A
1 2020-06-01 2021-07-02 etc.
2 2021-05-03 2022-08-04 etc.
3 2021-04-21 2022-06-05 etc.
Sample query:
select
a.obj_id,
a.start_date,
a.end_date,
avg(b.value) as average
from table_a a
inner join table_b b
on a.obj_id = b.obj_id
and b.date >= a.start_date
and b.date <= a.end_date
group by
a.obj_id,
a.start_date,
a.end_date
order by
a.obj_id
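As a quick check of the range join, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. One deliberate change, which is my assumption about what is wanted: it uses a LEFT JOIN instead of the inner join, so a date range with no matching values would still be listed, with a NULL average.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_a (obj_id INT, start_date TEXT, end_date TEXT)")
con.execute("CREATE TABLE table_b (obj_id INT, date TEXT, value REAL)")
con.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [
    (1, "2021-03-01", "2022-08-02"),
    (1, "2020-06-01", "2021-07-02"),
    (2, "2021-05-03", "2022-08-04"),
    (3, "2021-04-21", "2022-06-05"),
])
con.executemany("INSERT INTO table_b VALUES (?, ?, ?)", [
    (1, "2021-04-12", 21.45), (3, "2022-06-15", 19.02),
    (1, "2020-11-02", 3.11), (2, "2022-05-23", 45.20),
    (1, "2022-07-31", 32.45), (3, "2021-09-01", 22.56),
    (2, "2021-10-10", 34.04),
])

query = """
SELECT a.obj_id, a.start_date, a.end_date, AVG(b.value) AS average
FROM table_a a
LEFT JOIN table_b b            -- LEFT JOIN keeps ranges that match nothing
  ON b.obj_id = a.obj_id
 AND b.date BETWEEN a.start_date AND a.end_date
GROUP BY a.obj_id, a.start_date, a.end_date
ORDER BY a.obj_id, a.start_date
"""
averages = con.execute(query).fetchall()
for row in averages:
    print(row)
```

ISO-formatted date strings compare correctly as text, which is what lets BETWEEN work here without a real date type.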

Running Sum based on two dates in SQL

I have a table with the three columns shown below.
Ord_dt - Date when the order was placed.
first_order - Date when the first order was placed (calculated over the last 52 weeks).
cnt_orders - Total orders placed on the order date.
ORD_DT first_order cnt_orders
6/19/2020 6/19/2020 2
6/22/2020 6/19/2020 1
10/8/2020 6/19/2020 2
11/20/2020 6/19/2020 1
12/1/2020 6/19/2020 1
2/4/2021 6/19/2020 1
2/12/2021 6/19/2020 1
3/7/2021 6/19/2020 1
3/30/2021 6/19/2020 1
4/7/2021 6/19/2020 1
4/30/2021 6/19/2020 1
5/11/2021 6/19/2020 1
5/31/2021 6/19/2020 2
7/28/2021 10/8/2020 2
The final output should look like this. The Running_sum column is a running sum of cnt_orders that starts at first_order. In the example below, first_order in row 14 equals Ord_dt in row 3, so row 14 should sum all orders from row 3 through row 14.
ORD_DT first_order cnt_orders Running_sum
6/19/2020 6/19/2020 2 2
6/22/2020 6/19/2020 1 3
10/8/2020 6/19/2020 2 5
11/20/2020 6/19/2020 1 6
12/1/2020 6/19/2020 1 7
2/4/2021 6/19/2020 1 8
2/12/2021 6/19/2020 1 9
3/7/2021 6/19/2020 1 10
3/30/2021 6/19/2020 1 11
4/7/2021 6/19/2020 1 12
4/30/2021 6/19/2020 1 13
5/11/2021 6/19/2020 1 14
5/31/2021 6/19/2020 2 16
7/28/2021 10/8/2020 2 15
I have tried SUM with PARTITION BY, but it gives the wrong value for the last row, where first_order changes: it should be 15, not 18.
How can I achieve this in SQL Server?
Sample table that needs the running sum, followed by a correlated subquery that computes it:
create table t
(ORD_DT date, first_order date, cnt_orders int);
go
insert into t values
('6/19/2020' , '6/19/2020' , 2),
('6/22/2020' , '6/19/2020' , 1),
('10/8/2020' , '6/19/2020' , 2),
('11/20/2020', '6/19/2020' , 1),
('12/1/2020' , '6/19/2020' , 1),
('2/4/2021' , '6/19/2020' , 1),
('2/12/2021' , '6/19/2020' , 1),
('3/7/2021' , '6/19/2020' , 1),
('3/30/2021' , '6/19/2020' , 1),
('4/7/2021' , '6/19/2020' , 1),
('4/30/2021' , '6/19/2020' , 1),
('5/11/2021' , '6/19/2020' , 1),
('5/31/2021' , '6/19/2020' , 2),
('7/28/2021' , '10/8/2020' , 2);
go
select * ,
(select sum(t1.cnt_orders)
from t t1
where t1.ord_dt >= t.first_order and
t1.ord_dt <= t.ORD_DT
) cumsum
from t
order by ord_dt;
ORD_DT first_order cnt_orders cumsum
---------- ----------- ----------- -----------
2020-06-19 2020-06-19 2 2
2020-06-22 2020-06-19 1 3
2020-10-08 2020-06-19 2 5
2020-11-20 2020-06-19 1 6
2020-12-01 2020-06-19 1 7
2021-02-04 2020-06-19 1 8
2021-02-12 2020-06-19 1 9
2021-03-07 2020-06-19 1 10
2021-03-30 2020-06-19 1 11
2021-04-07 2020-06-19 1 12
2021-04-30 2020-06-19 1 13
2021-05-11 2020-06-19 1 14
2021-05-31 2020-06-19 2 16
2021-07-28 2020-10-08 2 15
(14 row(s) affected)

How can I join two tables on an ID and a DATE RANGE in SQL

I have 2 query result tables containing records for different assessments. There are RAssessments and NAssessments which make up a complete review.
The aim is to eventually determine which reviews were completed. I would like to join the two tables on the ID and on the date. However, the dates on which the two assessments were completed may not be identical and may be several days apart, and some IDs may have more RAssessments than NAssessments.
Therefore, I would like to join T1 to T2 on ID and on T1Date (plus or minus 7 days). There is no other way to match and align the records than using the date range, as this is a poorly designed database. I would appreciate some help with this, as I am stumped.
Here is some sample data:
Table #1:
ID RAssessmentDate
1 2020-01-03
1 2020-03-03
1 2020-05-03
2 2020-01-09
2 2020-04-09
3 2022-07-21
4 2020-06-30
4 2020-12-30
4 2021-06-30
4 2021-12-30
Table #2:
ID NAssessmentDate
1 2020-01-07
1 2020-03-02
1 2020-05-03
2 2020-01-09
2 2020-07-06
2 2020-04-10
3 2022-07-21
4 2021-01-03
4 2021-06-28
4 2022-01-02
4 2022-06-26
I would like my end result table to look like this:
ID RAssessmentDate NAssessmentDate
1 2020-01-03 2020-01-07
1 2020-03-03 2020-03-02
1 2020-05-03 2020-05-03
2 2020-01-09 2020-01-09
2 2020-04-09 2020-04-10
2 NULL 2020-07-06
3 2022-07-21 2022-07-21
4 2020-06-30 NULL
4 2020-12-30 2021-01-03
4 2021-06-30 2021-06-28
4 2021-12-30 2022-01-02
4 NULL 2022-01-02
Try this:
SELECT
COALESCE(a.ID, b.ID) ID,
a.RAssessmentDate,
b.NAssessmentDate
FROM (
SELECT
-- number each ID's rows in date order so the pairing is deterministic
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RAssessmentDate) RowId, *
FROM table1
) a
FULL OUTER JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NAssessmentDate) RowId, *
FROM table2
) b ON a.ID = b.ID AND a.RowId = b.RowId
-- note: rows are paired by position within each ID, not by how close the dates are
WHERE (a.RAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
OR (b.NAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
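For the "plus or minus 7 days" condition itself, here is a minimal sketch that puts the date distance directly into the join condition. It runs through Python's sqlite3 module so it is self-contained. Two things are my assumptions: the julianday() date arithmetic is SQLite-specific (SQL Server would use DATEDIFF), and the full outer join is emulated with a LEFT JOIN plus a UNION ALL of the unmatched table2 rows, because older SQLite versions lack FULL OUTER JOIN.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (ID INT, RAssessmentDate TEXT)")
con.execute("CREATE TABLE table2 (ID INT, NAssessmentDate TEXT)")
con.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, "2020-01-03"), (1, "2020-03-03"), (1, "2020-05-03"),
    (2, "2020-01-09"), (2, "2020-04-09"), (3, "2022-07-21"),
    (4, "2020-06-30"), (4, "2020-12-30"), (4, "2021-06-30"),
    (4, "2021-12-30"),
])
con.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (1, "2020-01-07"), (1, "2020-03-02"), (1, "2020-05-03"),
    (2, "2020-01-09"), (2, "2020-07-06"), (2, "2020-04-10"),
    (3, "2022-07-21"), (4, "2021-01-03"), (4, "2021-06-28"),
    (4, "2022-01-02"), (4, "2022-06-26"),
])

query = """
SELECT ID, RAssessmentDate, NAssessmentDate
FROM (
    -- every R row, paired with an N row for the same ID within 7 days if any
    SELECT a.ID AS ID,
           a.RAssessmentDate AS RAssessmentDate,
           b.NAssessmentDate AS NAssessmentDate
    FROM table1 a
    LEFT JOIN table2 b
      ON b.ID = a.ID
     AND ABS(julianday(b.NAssessmentDate) - julianday(a.RAssessmentDate)) <= 7
    UNION ALL
    -- N rows that have no R row within 7 days
    SELECT b.ID, NULL, b.NAssessmentDate
    FROM table2 b
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 a
        WHERE a.ID = b.ID
          AND ABS(julianday(b.NAssessmentDate) - julianday(a.RAssessmentDate)) <= 7)
)
ORDER BY ID, COALESCE(RAssessmentDate, NAssessmentDate)
"""
paired = con.execute(query).fetchall()
for row in paired:
    print(row)
```

If two assessments of the same kind can fall within 7 days of each other for one ID, a row could match more than once, and the join would need an extra tie-break (for example, keeping only the nearest date). The sample data has no such case.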

I need to count the total of subscribers in a given rank over time

I have a table with the following format:
user_id user_rank rank_updated
1 A 2021-06-18
2 A 2021-06-18
3 A 2021-06-18
4 A 2021-06-18
2 B 2021-06-19
3 B 2021-06-19
1 B 2021-06-20
2 C 2021-06-20
4 B 2021-06-20
4 C 2021-06-21
I need it to be like:
date rank rank_count
2021-06-18 A 4
2021-06-18 B 0
2021-06-18 C 0
2021-06-19 A 2
2021-06-19 B 2
2021-06-19 C 0
2021-06-20 A 0
2021-06-20 B 3
2021-06-20 C 1
2021-06-21 A 0
2021-06-21 B 2
2021-06-21 C 2
That is, I need to count how many users hold each rank (A, B, or C) as of each date. So far I have the following (where my_table is the data source):
With Users_rank AS (
Select
"user_id" AS "user_id",
Cast("rank_updated" AS date) AS "rank_updated",
-- the rank letter itself; the result below shows letters in the max column
"user_rank" AS "user_rank"
FROM "my_table"
)
Select "rank_updated", "user_id", max("user_rank")
FROM Users_rank
GROUP BY "rank_updated", "user_id"
ORDER BY "rank_updated"
This give me the following result:
rank_updated user_id max
2021-06-18 1 A
2021-06-18 2 A
2021-06-18 3 A
2021-06-18 4 A
2021-06-19 2 B
2021-06-19 3 B
2021-06-20 1 B
2021-06-20 2 C
2021-06-20 4 B
2021-06-21 4 C
Now I only need to count how many users hold a given rank as of each day, but I don't know how.
You can do a cross join to get all dates with all ranks. Then bring in the existing data:
select gs.date, v.user_rank, count(t.user_rank) as on_day,
sum(count(t.user_rank)) over (partition by v.user_rank order by gs.date) as running_cnt
from generate_series('2021-06-18'::date, '2021-06-21'::date, interval '1 day') gs(date) cross join
(values ('A'), ('B'), ('C')) v(user_rank) left join
t
on t.user_rank = v.user_rank and
t.rank_updated = gs.date
group by gs.date, v.user_rank;
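The expected table in the question counts each user under the latest rank they hold as of each date. Here is a minimal sketch of that "as of" count, run through Python's sqlite3 module so it is self-contained. The assumptions are mine: the dates and ranks are taken from the distinct values present in my_table (a date with no update at all would need a generated calendar instead), and a user has at most one update per day.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (user_id INT, user_rank TEXT, rank_updated TEXT)")
con.executemany("INSERT INTO my_table VALUES (?, ?, ?)", [
    (1, "A", "2021-06-18"), (2, "A", "2021-06-18"),
    (3, "A", "2021-06-18"), (4, "A", "2021-06-18"),
    (2, "B", "2021-06-19"), (3, "B", "2021-06-19"),
    (1, "B", "2021-06-20"), (2, "C", "2021-06-20"),
    (4, "B", "2021-06-20"), (4, "C", "2021-06-21"),
])

query = """
WITH dates AS (SELECT DISTINCT rank_updated AS d FROM my_table),
     ranks AS (SELECT DISTINCT user_rank AS r FROM my_table),
     current_rank AS (
         -- for each date, keep each user's most recent update on or before it
         SELECT dates.d AS d, t.user_id AS user_id, t.user_rank AS user_rank
         FROM dates
         JOIN my_table t ON t.rank_updated <= dates.d
         WHERE t.rank_updated = (SELECT MAX(t2.rank_updated)
                                 FROM my_table t2
                                 WHERE t2.user_id = t.user_id
                                   AND t2.rank_updated <= dates.d)
     )
SELECT dates.d, ranks.r, COUNT(cr.user_id) AS rank_count
FROM dates
CROSS JOIN ranks                      -- every date paired with every rank
LEFT JOIN current_rank cr
  ON cr.d = dates.d AND cr.user_rank = ranks.r
GROUP BY dates.d, ranks.r
ORDER BY dates.d, ranks.r
"""
rank_counts = con.execute(query).fetchall()
for row in rank_counts:
    print(row)
```

COUNT(cr.user_id) ignores the NULLs produced by the LEFT JOIN, which is what turns a rank nobody holds on a given date into a 0 instead of dropping the row.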

SQL: Rank / Group a Column by Date

Using SQL Server Management Studio v17.9.1
I'm trying to rank / order / group some data by Site and Area by Date, but I'm struggling to rank the areas by the earliest date on which they appear rather than alphabetically.
Here's the data I have:
Site | Area | Space | Date
DCG X 7 02/02/2020 12:13
DCG X 5 04/02/2020 11:47
DCG X 12 10/02/2020 15:14
GNL U 0 03/03/2020 18:35
GNL A 4 04/03/2020 08:28
GNL C 4 06/03/2020 09:07
GNL B 1 16/03/2020 07:10
DPL U 0 18/03/2020 09:28
DPL A 1 18/03/2020 09:36
DPL A 1 20/03/2020 20:04
SGR F 2 21/03/2020 19:42
SGR B 2 22/03/2020 10:30
SGR C 3 24/03/2020 08:17
SGR F 1 01/04/2020 09:00
SGR E 1 02/02/2020 10:57
SGR F 1 02/02/2020 15:50
I want to add 2 columns that rank / group the site and the area in ascending order of date, like so:
Site | Area | Space | Date | Site Order | Area Order |
DCG X 7 02/02/2020 12:13 1 1
DCG X 5 04/02/2020 11:47 1 1
DCG X 12 10/02/2020 15:14 1 1
GNL U 0 03/03/2020 18:35 2 1
GNL A 4 04/03/2020 08:28 2 2
GNL C 4 06/03/2020 09:07 2 3
GNL B 1 16/03/2020 07:10 2 4
DPL U 0 18/03/2020 09:28 3 1
DPL A 1 18/03/2020 09:36 3 2
DPL A 1 20/03/2020 20:04 3 2
SGR F 2 21/03/2020 19:42 4 1
SGR B 2 22/03/2020 10:30 4 2
SGR C 3 24/03/2020 08:17 4 3
SGR F 1 01/04/2020 09:00 4 1
SGR E 1 02/02/2020 10:57 4 4
SGR F 1 02/02/2020 15:50 4 1
Apologies if I've not made it clear
You can use min() as a window function to get the minimum date for each site and site/area combo. Then use dense_rank():
select t.*,
dense_rank() over (order by min_site_date, site) as site_seqnum,
dense_rank() over (partition by site order by min_site_area_date) as area_seqnum
from (select t.*,
min(date) over (partition by site) as min_site_date,
min(date) over (partition by site, area) as min_site_area_date
from t
) t
You can use window functions:
select t.*,
dense_rank() over (order by site_date, site) as site_sequence,
dense_rank() over (partition by site order by site_area_date, area) as area_sequence
from (select t.*,
min([date]) over (partition by [site]) as site_date,
min([date]) over (partition by [site], area) as site_area_date
from t
) t;
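Here is a minimal sketch of the min-date-then-dense_rank idea, run through Python's sqlite3 module on a subset of the question's rows (the DCG, GNL and DPL sites only). The assumptions are mine: the column is named dt instead of Date, and the dd/mm/yyyy values are rewritten as ISO strings so that they sort chronologically as text. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (site TEXT, area TEXT, space INT, dt TEXT)")
# Subset of the question's rows; dd/mm/yyyy rewritten as ISO strings (assumption)
con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("DCG", "X", 7, "2020-02-02 12:13"),
    ("DCG", "X", 5, "2020-02-04 11:47"),
    ("DCG", "X", 12, "2020-02-10 15:14"),
    ("GNL", "U", 0, "2020-03-03 18:35"),
    ("GNL", "A", 4, "2020-03-04 08:28"),
    ("GNL", "C", 4, "2020-03-06 09:07"),
    ("GNL", "B", 1, "2020-03-16 07:10"),
    ("DPL", "U", 0, "2020-03-18 09:28"),
    ("DPL", "A", 1, "2020-03-18 09:36"),
    ("DPL", "A", 1, "2020-03-20 20:04"),
])

query = """
SELECT site, area, space, dt,
       -- sites ranked by their earliest date, not by name
       DENSE_RANK() OVER (ORDER BY site_first, site) AS site_order,
       -- areas ranked within a site by the earliest date they appear
       DENSE_RANK() OVER (PARTITION BY site
                          ORDER BY area_first, area) AS area_order
FROM (
    SELECT site, area, space, dt,
           MIN(dt) OVER (PARTITION BY site) AS site_first,
           MIN(dt) OVER (PARTITION BY site, area) AS area_first
    FROM t
)
ORDER BY dt
"""
ranked = con.execute(query).fetchall()
for row in ranked:
    print(row)
```

The site and area names are added as the last sort key only to break ties when two sites or two areas share the same earliest date.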