Many to many join with filter - sql

I have two tables like so -
Table 1 -
patient admit_dt discharge_dt
323 2020-01-09 2020-02-01
323 2020-02-18 2020-02-27
231 2020-02-13 2020-02-17
Table 2 -
patient admit_dt discharge_dt
323 2020-02-05 2020-02-07
231 2020-02-23 2020-02-28
The output I need is:
patient
323
The logic is - if one patient goes from table 1 into table 2 and ends up back in table 1 within 30 days, we want to count them in the output.
Patient 231 is not included in the result because they didn't go back to table 1.

If I understand correctly, you can use join:
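select distinct t1.patient              -- distinct: one row per patient
from table1 t1
join table2 t2
  on t2.patient = t1.patient
 and t2.admit_dt > t1.discharge_dt      -- moved to table 2 after a table 1 stay
join table1 tt1
  on tt1.patient = t1.patient
 and tt1.admit_dt > t2.discharge_dt;    -- came back to table 1 afterwards
-- Note: the 30-day limit is not enforced here; the date-difference syntax depends on the database.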
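A minimal sketch of the same join with the 30-day limit added, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried on the sample data. Two things here are assumptions, not part of the question: the 30 days are measured from the table 2 discharge to the return admission in table 1, and SQLite's julianday() stands in for whatever date-difference function your database offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (patient INTEGER, admit_dt TEXT, discharge_dt TEXT);
    CREATE TABLE table2 (patient INTEGER, admit_dt TEXT, discharge_dt TEXT);
    INSERT INTO table1 VALUES
        (323, '2020-01-09', '2020-02-01'),
        (323, '2020-02-18', '2020-02-27'),
        (231, '2020-02-13', '2020-02-17');
    INSERT INTO table2 VALUES
        (323, '2020-02-05', '2020-02-07'),
        (231, '2020-02-23', '2020-02-28');
""")

query = """
    SELECT DISTINCT t1.patient
    FROM table1 t1
    JOIN table2 t2
      ON t2.patient = t1.patient
     AND t2.admit_dt > t1.discharge_dt
    JOIN table1 tt1
      ON tt1.patient = t1.patient
     AND tt1.admit_dt > t2.discharge_dt
     -- assumption: 30 days counted from the table 2 discharge to the return admission
     AND julianday(tt1.admit_dt) - julianday(t2.discharge_dt) <= 30
"""
returned_patients = [row[0] for row in conn.execute(query)]
print(returned_patients)
```

Patient 231 drops out because no table 1 admission follows their table 2 stay.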

Related

running tally in SQL

Need help tallying fork truck training completions at work. Here is an example of the tables I have, and the table I need to create:
table 1:
date is_work_day
2023-01-25 1
2023-01-26 1
2023-01-27 1
2023-01-28 0
2023-01-29 1
2023-01-30 0
table 2:
employee_id training_passed test_date
001 1 2023-01-25
002 1 2023-01-26
003 0 2023-01-26
004 1 2023-01-26
005 0 2023-01-27
006 1 2023-01-29
need table:
date cumulative_passed_training
2023-01-26 2
2023-01-27 2
2023-01-29 3
The table should count the total passed trainings, but only starting on 2023-01-26 and should only show dates that are work days. Any help would be greatly appreciated.
I think I need to JOIN the two tables, and then SUM the training_passed column, but am unsure how to get it to start at a certain date, and how to make it only show work days on the final table.
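JOIN on the date column and add the passed-test check as a JOIN condition. Also GROUP BY the date so you get a count for each date. Note that this returns the number of passes per date, not a running total, and dates without a passed test do not appear.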
select t1.date, count(t2.employee_id)
from table1 t1
join table2 t2
  on t1.date = t2.test_date
 and t2.training_passed = 1
group by t1.date
It would make no difference if you put the condition t2.training_passed = 1 in a WHERE clause instead of the INNER JOIN.
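For the cumulative count the question asks for, one possible approach is a correlated subquery that counts every passed test from the start date up to each work day. The sketch below runs it on the sample data in SQLite via Python's sqlite3 module; the START_DATE constant and the :start parameter name are illustrative, and other databases may prefer a window function for the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (date TEXT, is_work_day INTEGER);
    CREATE TABLE table2 (employee_id TEXT, training_passed INTEGER, test_date TEXT);
    INSERT INTO table1 VALUES
        ('2023-01-25', 1), ('2023-01-26', 1), ('2023-01-27', 1),
        ('2023-01-28', 0), ('2023-01-29', 1), ('2023-01-30', 0);
    INSERT INTO table2 VALUES
        ('001', 1, '2023-01-25'), ('002', 1, '2023-01-26'),
        ('003', 0, '2023-01-26'), ('004', 1, '2023-01-26'),
        ('005', 0, '2023-01-27'), ('006', 1, '2023-01-29');
""")

START_DATE = "2023-01-26"  # hypothetical constant: first date to count from

query = """
    SELECT d.date,
           (SELECT COUNT(*)
              FROM table2 t
             WHERE t.training_passed = 1
               AND t.test_date >= :start
               AND t.test_date <= d.date) AS cumulative_passed_training
    FROM table1 d
    WHERE d.is_work_day = 1      -- only show work days
      AND d.date >= :start       -- only show dates from the start date on
    ORDER BY d.date
"""
running_tally = conn.execute(query, {"start": START_DATE}).fetchall()
for row in running_tally:
    print(row)
```

Because the subquery counts by test_date rather than joining on equal dates, a pass recorded on a non-work day would still be included in the total of the next work day.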

Is there a way to get the SUM() of a column without GROUP BY when joining multiple tables in SQL Server

I am computing the SUM() of Amount from the CrowdfundedUser table with GROUP BY CrowdfundID, but it is difficult to get a total because the values in every column are unique.
Crowdfund:
CrowdfundID  GoalAmount  StartedDate
9            10000       09/02/2022
5            20000       10/02/2022
55           350000      11/02/2022
444          541256      12/02/2022
54           78458       13/02/2022
CrowdfundedUser:
ID   User ID  CrowdfundID  Amount
744  12214    9            1000
745  4124     5            8422
746  12214    55           784
747  12214    444          874
748  64554    54           652
CrowdfundiPaymentTransaction:
CrowdfundedUserID  Invoice     Amount  PaymentDate
744                RA45A14124  1000    09/02/2022
745                RA45A12412  8422    10/02/2022
746                RA45U14789  784     11/02/2022
747                RA45F12457  874     12/02/2022
748                RA45M00124  652     13/02/2022
My query :
SELECT
    c.CrowdfundID,
    SUM(cu.Amount),
    SUM(cpt.Amount)
FROM
    Crowdfund c
INNER JOIN
    CrowdfundedUser cu ON c.CrowdfundID = cu.CrowdfundID
INNER JOIN
    CrowdfundiPaymentTransaction cpt ON cu.ID = cpt.CrowdfundedUserID
GROUP BY
    c.CrowdfundID
SELECT c.CrowdfundID,
       SUM(cu.Amount) OVER (ORDER BY c.CrowdfundID) AS Amount,
       SUM(cpt.Amount) OVER (ORDER BY c.CrowdfundID) AS CptAmount
FROM Crowdfund c
INNER JOIN CrowdfundedUser cu ON c.CrowdfundID = cu.CrowdfundID
INNER JOIN CrowdfundiPaymentTransaction cpt ON cu.ID = cpt.CrowdfundedUserID
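To see what the window form returns, here is a small sketch on the sample data, run in SQLite (3.25 or later, for window function support) through Python's sqlite3 module. It shows two variants side by side: SUM() OVER (ORDER BY ...) gives a running total in CrowdfundID order, while SUM() OVER () with an empty window gives the grand total on every row. The column aliases running_amount and total_amount are made up for the example, the date columns are left out, and "User ID" is written UserID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Crowdfund (CrowdfundID INTEGER, GoalAmount INTEGER);
    CREATE TABLE CrowdfundedUser (ID INTEGER, UserID INTEGER, CrowdfundID INTEGER, Amount INTEGER);
    CREATE TABLE CrowdfundiPaymentTransaction (CrowdfundedUserID INTEGER, Invoice TEXT, Amount INTEGER);
    INSERT INTO Crowdfund VALUES (9, 10000), (5, 20000), (55, 350000), (444, 541256), (54, 78458);
    INSERT INTO CrowdfundedUser VALUES
        (744, 12214, 9, 1000), (745, 4124, 5, 8422), (746, 12214, 55, 784),
        (747, 12214, 444, 874), (748, 64554, 54, 652);
    INSERT INTO CrowdfundiPaymentTransaction VALUES
        (744, 'RA45A14124', 1000), (745, 'RA45A12412', 8422), (746, 'RA45U14789', 784),
        (747, 'RA45F12457', 874), (748, 'RA45M00124', 652);
""")

query = """
    SELECT c.CrowdfundID,
           SUM(cu.Amount) OVER (ORDER BY c.CrowdfundID) AS running_amount,  -- running total
           SUM(cu.Amount) OVER ()                       AS total_amount     -- grand total
    FROM Crowdfund c
    JOIN CrowdfundedUser cu ON c.CrowdfundID = cu.CrowdfundID
    JOIN CrowdfundiPaymentTransaction cpt ON cu.ID = cpt.CrowdfundedUserID
    ORDER BY c.CrowdfundID
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Every input row is kept, which is the difference from GROUP BY: the aggregate is attached to each row instead of collapsing them.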

Query to get rows based on dates from two table in Athena

I have two tables called master_tbl and anom_tbl, as follows:
master_tbl
date id country value
2017-01-01 26 US 2
2017-01-02 26 US 4
2017-01-03 26 US 9
2017-01-04 26 US 2
2017-01-05 26 US 4
2017-01-06 26 US 1
2017-01-07 26 US 5
2017-01-08 26 US 3
2017-01-09 26 US 100
2017-01-10 26 US 4
anom_tbl
date id country anoms
2017-01-01 26 US 0
2017-01-02 26 US 0
2017-01-03 26 US 9
2017-01-04 26 US 0
2017-01-05 26 US 0
2017-01-06 26 US 0
2017-01-07 26 US 0
2017-01-08 26 US 0
2017-01-09 26 US 100
2017-01-10 26 US 0
I want to create a third table from master_tbl, joined with anom_tbl, that keeps only the rows whose date has a non-zero value in the anoms column of anom_tbl, plus the master_tbl rows for one day before and one day after that date.
Finally I want to have the following table
date id country value
2017-01-02 26 US 4
2017-01-03 26 US 9
2017-01-04 26 US 2
2017-01-08 26 US 3
2017-01-09 26 US 100
2017-01-10 26 US 4
Because the data is large, this takes a long time to run in R or Python, so I want to create the table in AWS (Athena).
I have tried the following code in Athena, but it does not work:
FROM
(SELECT t2.value,
t1.id,
t1.country AS country,
cast(t1.date AS DATE) AS orig_date
FROM
(SELECT id,
country,
date
FROM anom_tbl) t1
JOIN master_tbl t2
ON t2.id=t1.id
AND t2.country= t1.country
AND t2.date=t1.date) t3
JOIN master_tbl t2
ON t3.id=t2.id
AND t3.country=t2.country
where t2.date IN(GETDATE()-1)
Could you please help me modify the SQL code to get the proper result?
If I followed you correctly, you could do this with exists:
select m.*
from master_tbl m
where exists (
    select 1
    from anom_tbl a
    where a.anoms <> 0
      and a.id = m.id
      and a.country = m.country
      and m.date >= a.date - interval '1' day
      and m.date <= a.date + interval '1' day
)
This brings all records in the master table for which another record exists in the anom table for the same id and country, with a non-0 value, within a +/- 1 day interval.
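The same EXISTS logic can be checked on the sample data with a quick sketch. This one runs in SQLite via Python's sqlite3 module, so the Athena interval arithmetic is swapped for SQLite's date(x, '-1 day') and date(x, '+1 day') modifiers; that substitution is an assumption made only to keep the example runnable locally.

```python
import sqlite3

dates = [f"2017-01-{day:02d}" for day in range(1, 11)]
master_values = [2, 4, 9, 2, 4, 1, 5, 3, 100, 4]
anom_values = [0, 0, 9, 0, 0, 0, 0, 0, 100, 0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_tbl (date TEXT, id INTEGER, country TEXT, value INTEGER)")
conn.execute("CREATE TABLE anom_tbl (date TEXT, id INTEGER, country TEXT, anoms INTEGER)")
conn.executemany("INSERT INTO master_tbl VALUES (?, 26, 'US', ?)", zip(dates, master_values))
conn.executemany("INSERT INTO anom_tbl VALUES (?, 26, 'US', ?)", zip(dates, anom_values))

query = """
    SELECT m.date, m.value
    FROM master_tbl m
    WHERE EXISTS (
        SELECT 1
        FROM anom_tbl a
        WHERE a.anoms <> 0
          AND a.id = m.id
          AND a.country = m.country
          -- SQLite stand-in for "a.date +/- interval '1' day"
          AND m.date BETWEEN date(a.date, '-1 day') AND date(a.date, '+1 day')
    )
    ORDER BY m.date
"""
around_anomalies = conn.execute(query).fetchall()
for row in around_anomalies:
    print(row)
```

The two anomalies (2017-01-03 and 2017-01-09) each pull in their own row and the rows one day on either side, giving six rows in total.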

How to use the join keyword to join two tables in SQL Server

I do not know how to use the join keyword in the following situation. I have two tables and I need to join them into one table. This is the code:
use DEV
select top 10
Casa_de_marcat,
Numar_bon,
Data_bon
from antetBonuri
where Casa_de_marcat=1
order by Data_bon desc
use DEV
select top 10
Total,
Data,
Ora,
Vinzator
from bp
order by Data desc
These are the results from the two tables:
Casa_de_marcat Numar_bon Data_bon
-------------- ----------- -----------------------
1 NULL 2018-05-12 00:00:00.000
1 1 2018-04-13 00:00:00.000
1 NULL 2018-03-16 00:00:00.000
1 NULL 2018-03-16 00:00:00.000
1 1 2018-02-16 00:00:00.000
1 1 2018-02-05 00:00:00.000
1 NULL 2018-02-05 00:00:00.000
1 NULL 2018-02-05 00:00:00.000
1 10 2017-11-02 00:00:00.000
1 NULL 2017-09-29 00:00:00.000
(10 rows affected)
Total Data Ora Vinzator
---------------------- ----------------------- ------ ----------
12 2019-11-15 00:00:00.000 1150 naomi
12 2019-11-15 00:00:00.000 1150 naomi
82 2019-10-17 00:00:00.000 1035 MIHAI
12 2019-10-17 00:00:00.000 1038 MIHAI
12 2019-10-17 00:00:00.000 1043 MIHAI
12 2019-10-17 00:00:00.000 1044 MIHAI
12 2019-10-17 00:00:00.000 1044 MIHAI
12 2019-10-17 00:00:00.000 1053 MIHAI
12 2019-10-17 00:00:00.000 1105 MIHAI
12 2019-10-17 00:00:00.000 1108 MIHAI
(10 rows affected)
The final result should be all the above columns joined in a single table; the order does not matter.
And yes, my bad, I'm using SQL Server.
A FULL JOIN technically does what you want:
select ab.*, bp.*
from (select top 10 Casa_de_marcat, Numar_bon, Data_bon
      from antetBonuri
      where Casa_de_marcat = 1
      order by Data_bon desc
     ) ab full join
     (select top 10 Total, Data, Ora, Vinzator
      from bp
      order by Data desc
     ) bp
     on 1 = 0; -- never matches
There are no obvious JOIN keys.
This produces 20 rows with all the columns. In each row, one of the sets of columns (per subquery) will all be NULL.
That appears to be what you are asking for. I'm not sure how useful this is. Or why you would prefer using JOIN rather than UNION ALL to get this.
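For comparison, the UNION ALL form mentioned above might look like the sketch below: each branch pads the other table's columns with NULL, which gives the same shape as a FULL JOIN that never matches. It runs in SQLite via Python's sqlite3 module on two sample rows per table; the TOP 10 and ORDER BY parts are left out to keep it short, so treat it as an outline, not a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE antetBonuri (Casa_de_marcat INTEGER, Numar_bon INTEGER, Data_bon TEXT);
    CREATE TABLE bp (Total REAL, Data TEXT, Ora INTEGER, Vinzator TEXT);
    INSERT INTO antetBonuri VALUES (1, NULL, '2018-05-12'), (1, 1, '2018-04-13');
    INSERT INTO bp VALUES (12, '2019-11-15', 1150, 'naomi'), (82, '2019-10-17', 1035, 'MIHAI');
""")

query = """
    -- rows from antetBonuri, with the bp columns padded with NULL
    SELECT Casa_de_marcat, Numar_bon, Data_bon,
           NULL AS Total, NULL AS Data, NULL AS Ora, NULL AS Vinzator
    FROM antetBonuri
    WHERE Casa_de_marcat = 1
    UNION ALL
    -- rows from bp, with the antetBonuri columns padded with NULL
    SELECT NULL, NULL, NULL, Total, Data, Ora, Vinzator
    FROM bp
"""
stacked_rows = conn.execute(query).fetchall()
for row in stacked_rows:
    print(row)
```

Each output row carries values from exactly one of the two tables, just as with the never-matching FULL JOIN.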
EDIT:
It also strikes me that you might want 10 rows side-by-side. If so, then use row_number():
select ab.Casa_de_marcat, ab.Numar_bon, ab.Data_bon,
       bp.Total, bp.Data, bp.Ora, bp.Vinzator
from (select top 10 Casa_de_marcat, Numar_bon, Data_bon,
             row_number() over (order by (select null)) as seqnum
      from antetBonuri
      where Casa_de_marcat = 1
      order by Data_bon desc
     ) ab full join
     (select top 10 Total, Data, Ora, Vinzator,
             row_number() over (order by (select null)) as seqnum
      from bp
      order by Data desc
     ) bp
     on ab.seqnum = bp.seqnum;
This results in 10 rows with the results "side-by-side". The rows from the two tables are in an arbitrary order.
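If the pairing should follow the sort order instead of being arbitrary, one option is to number the rows by the same column used for sorting. The sketch below tries that idea in SQLite (3.25 or later) through Python's sqlite3 module with two rows per table. It is a variation, not the query above: it uses an inner join instead of FULL JOIN, replaces TOP 10 with a seqnum <= 10 filter, and the CTE name bp_ranked is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE antetBonuri (Casa_de_marcat INTEGER, Numar_bon INTEGER, Data_bon TEXT);
    CREATE TABLE bp (Total REAL, Data TEXT, Ora INTEGER, Vinzator TEXT);
    -- inserted out of order on purpose, to show that the numbering does the sorting
    INSERT INTO antetBonuri VALUES (1, 1, '2018-04-13'), (1, NULL, '2018-05-12');
    INSERT INTO bp VALUES (82, '2019-10-17', 1035, 'MIHAI'), (12, '2019-11-15', 1150, 'naomi');
""")

query = """
    WITH ab AS (
        SELECT Casa_de_marcat, Numar_bon, Data_bon,
               ROW_NUMBER() OVER (ORDER BY Data_bon DESC) AS seqnum
        FROM antetBonuri
        WHERE Casa_de_marcat = 1
    ),
    bp_ranked AS (
        SELECT Total, Data, Ora, Vinzator,
               ROW_NUMBER() OVER (ORDER BY Data DESC, Ora DESC) AS seqnum
        FROM bp
    )
    SELECT ab.Data_bon, bp_ranked.Data, bp_ranked.Vinzator
    FROM ab
    JOIN bp_ranked ON ab.seqnum = bp_ranked.seqnum
    WHERE ab.seqnum <= 10          -- stand-in for TOP 10
    ORDER BY ab.seqnum
"""
side_by_side = conn.execute(query).fetchall()
for row in side_by_side:
    print(row)
```

With this numbering, the newest receipt is always paired with the newest bp row, the second newest with the second newest, and so on.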

SQL - Group By Select Query is close but I need to make it slightly more unique

Updated for more clarity
SQL Server 2000. I'm trying to make this query slightly more unique.
The Query:
USE MyDatabase
GO
SELECT MAX(x.provider_entry_id) AS provider_entry_id, -- this ID is the PK
       x.provider_entry_type_id,  -- the entry for the specific provider type (the ID)
       x.provider_entry,          -- the actual provider entry (the ID)
       x.provider_entry_visit_dt  -- the date the entry was created
FROM tbl_claimant_provider_entry x
JOIN (SELECT p.provider_entry_type_id,
             p.provider_entry,
             MAX(provider_entry_visit_dt) AS max_date
      FROM tbl_claimant_provider_entry p
      WHERE provider_entry_clmnt = 4963 -- change this for your user
      GROUP BY p.provider_entry_type_id, p.provider_entry) y
  ON y.provider_entry_type_id = x.provider_entry_type_id
 AND y.max_date = x.provider_entry_visit_dt
GROUP BY x.provider_entry_type_id, x.provider_entry, x.provider_entry_visit_dt
returns:
provider_entry_id provider_entry_type_id provider_entry provider_entry_visit_dt
1052 109 1088 2013-01-22 00:00:00.000
1051 109 1665 2013-01-23 00:00:00.000
1049 130 264 2013-01-01 00:00:00.000
1050 130 1126 2013-01-02 00:00:00.000
1045 132 NULL 2013-01-22 00:00:00.000
1047 132 260 2013-01-22 00:00:00.000
1044 132 1115 2013-01-10 00:00:00.000
1048 132 1130 2013-01-22 00:00:00.000
1043 142 1356 2013-01-10 00:00:00.000
I'm looking to narrow this list so it shows only one row for each unique provider_entry_type_id, based on the most recent provider_entry_visit_dt.
So the results would be as follows (keep in mind that no tie-breaker is needed for provider_entry_visit_dt; the ties in the sample data are just an error on my part):
provider_entry_id provider_entry_type_id provider_entry provider_entry_visit_dt
1051 109 1665 2013-01-23 00:00:00.000
1050 130 1126 2013-01-02 00:00:00.000
1048 132 1130 2013-01-22 00:00:00.000
1043 142 1356 2013-01-10 00:00:00.000
I think you need to remove the outer GROUP BY clause
SELECT x.*
FROM tbl_claimant_provider_entry x
INNER JOIN
(
    SELECT p.provider_entry_type_id,
           MAX(created_date) AS max_date
    FROM tbl_claimant_provider_entry p
    WHERE provider_entry_clmnt = 4963 -- change this for your user ID
    GROUP BY p.provider_entry_type_id
) y ON y.provider_entry_type_id = x.provider_entry_type_id AND
       y.max_date = x.created_date
You need to remove created_date from the GROUP BY clause. You can wrap it in an aggregate function to keep it in the query (i.e. a function like the one you have for provider_entry_id). For example:
SELECT MAX(x.provider_entry_id) AS provider_entry_id, -- this ID is the PK
       MAX(x.created_date),
       x.provider_entry_type_id, -- the entry for the specific provider type (the ID)
       MIN(x.provider_entry)     -- the actual provider entry (the ID)
FROM tbl_claimant_provider_entry x
JOIN (SELECT p.provider_entry_type_id,
             p.provider_entry,
             MAX(created_date) AS max_date
      FROM tbl_claimant_provider_entry p
      WHERE provider_entry_clmnt = 4963 -- change this for your user ID
      GROUP BY p.provider_entry_type_id, p.provider_entry) y
  ON y.provider_entry_type_id = x.provider_entry_type_id
 AND y.max_date = x.created_date
GROUP BY x.provider_entry_type_id
Here's the solution that worked; thank you to all who tried to help:
SELECT b.*
FROM dbo.tbl_claimant_provider_entry AS b
INNER JOIN
    (SELECT provider_entry_type_id, MAX(provider_entry_visit_dt) AS maxdate
     FROM dbo.tbl_claimant_provider_entry
     GROUP BY provider_entry_type_id) AS m
    ON b.provider_entry_type_id = m.provider_entry_type_id
   AND b.provider_entry_visit_dt = m.maxdate
WHERE (b.provider_entry_clmnt = 4963)
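A small sketch of the same "latest row per type" join on the sample rows, run in SQLite via Python's sqlite3 module. It differs from the query above in two ways, both assumptions on my part: the claimant filter is also applied inside the subquery, so the maximum date is computed for that claimant only, and an outer MAX(provider_entry_id) breaks ties on the date by keeping the highest ID. All sample rows are assumed to belong to claimant 4963.

```python
import sqlite3

sample_rows = [
    # (provider_entry_id, provider_entry_type_id, provider_entry, provider_entry_visit_dt)
    (1052, 109, 1088, "2013-01-22"),
    (1051, 109, 1665, "2013-01-23"),
    (1049, 130, 264, "2013-01-01"),
    (1050, 130, 1126, "2013-01-02"),
    (1045, 132, None, "2013-01-22"),
    (1047, 132, 260, "2013-01-22"),
    (1044, 132, 1115, "2013-01-10"),
    (1048, 132, 1130, "2013-01-22"),
    (1043, 142, 1356, "2013-01-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tbl_claimant_provider_entry (
        provider_entry_id INTEGER PRIMARY KEY,
        provider_entry_type_id INTEGER,
        provider_entry INTEGER,
        provider_entry_visit_dt TEXT,
        provider_entry_clmnt INTEGER
    )
""")
conn.executemany(
    "INSERT INTO tbl_claimant_provider_entry VALUES (?, ?, ?, ?, 4963)", sample_rows
)

query = """
    SELECT MAX(b.provider_entry_id) AS provider_entry_id,   -- tie-break: highest ID wins
           b.provider_entry_type_id,
           b.provider_entry_visit_dt
    FROM tbl_claimant_provider_entry AS b
    JOIN (SELECT provider_entry_type_id, MAX(provider_entry_visit_dt) AS maxdate
          FROM tbl_claimant_provider_entry
          WHERE provider_entry_clmnt = :clmnt                -- max date for this claimant only
          GROUP BY provider_entry_type_id) AS m
      ON b.provider_entry_type_id = m.provider_entry_type_id
     AND b.provider_entry_visit_dt = m.maxdate
    WHERE b.provider_entry_clmnt = :clmnt
    GROUP BY b.provider_entry_type_id, b.provider_entry_visit_dt
    ORDER BY b.provider_entry_type_id
"""
latest_per_type = conn.execute(query, {"clmnt": 4963}).fetchall()
for row in latest_per_type:
    print(row)
```

Without the outer MAX and GROUP BY, type 132 would come back three times, because three of its rows share the latest date in the sample data.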