Join on second table if value not found in first table - sql

I would like to join on a second table only if the results of the first join are blank. Below is a subsection of Table A data:
ID Metro Submarket
1 NYC Manhattan
2 NYC Brooklyn
3 NYC Queens
4 NYC Bronx
5 NYC Newark
The tables I'm using for the joins are:
Table B
Metro Submarket A.Price B.Price C.Price
NYC Manhattan 54 32 48
NYC Queens 35 39 59
NYC Brooklyn 20 49 58
NYC Bronx 49 30 20
NYC Newark 49 50 -
Table C
Metro A.Price B.Price C.Price
NYC 50 49 69
Philly 49 48 37
Chicago 20 48 36
I'm adding the Price columns from Table B to Table A based on a Metro and Submarket match. However, Table B doesn't have all the prices. If I can't find a match in Table B then I want to look into Table C for a match only on Metro.
For ID 5, we can find the A and B prices in Table B. However, the C price is blank. In that case, I want it to retrieve the C price from Table C (69 is what it would choose).
I'm using SAS 9.4. SQL, macros, or anything else SAS can handle is welcome!

You can left join both tables to the main table and simply use COALESCE(). This will give you the value if present in Table B, otherwise it will give you the value in Table C:
PROC SQL;
CREATE TABLE Output AS
SELECT
ta.ID,
ta.Metro,
ta.Submarket,
COALESCE(tb.A_Price,tc.A_Price) AS A_Price,
COALESCE(tb.B_Price,tc.B_Price) AS B_Price,
COALESCE(tb.C_Price,tc.C_Price) AS C_Price
FROM
tablea ta
LEFT JOIN
tableb tb
ON (tb.Metro = ta.Metro)
AND (tb.Submarket = ta.Submarket)
LEFT JOIN
tablec tc
ON (tc.Metro = ta.Metro);
QUIT;
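To check the behaviour on the sample data, here is a minimal sketch that runs the same join in an in-memory SQLite database from Python. SQLite is only a stand-in for SAS PROC SQL here, and the column names A_Price, B_Price, and C_Price are taken from the answer rather than from the question's headers; the blank C price for Newark is assumed to be a NULL.

```python
import sqlite3

# In-memory SQLite stands in for SAS; table and column names follow the answer above.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tablea (ID INTEGER, Metro TEXT, Submarket TEXT);
CREATE TABLE tableb (Metro TEXT, Submarket TEXT,
                     A_Price INTEGER, B_Price INTEGER, C_Price INTEGER);
CREATE TABLE tablec (Metro TEXT, A_Price INTEGER, B_Price INTEGER, C_Price INTEGER);

INSERT INTO tablea VALUES
  (1,'NYC','Manhattan'), (2,'NYC','Brooklyn'), (3,'NYC','Queens'),
  (4,'NYC','Bronx'), (5,'NYC','Newark');
-- The blank C price for Newark is stored as NULL (an assumption about the data).
INSERT INTO tableb VALUES
  ('NYC','Manhattan',54,32,48), ('NYC','Queens',35,39,59),
  ('NYC','Brooklyn',20,49,58), ('NYC','Bronx',49,30,20),
  ('NYC','Newark',49,50,NULL);
INSERT INTO tablec VALUES
  ('NYC',50,49,69), ('Philly',49,48,37), ('Chicago',20,48,36);
""")

# Each price comes from Table B when present, otherwise from Table C.
rows = con.execute("""
SELECT ta.ID,
       COALESCE(tb.A_Price, tc.A_Price) AS A_Price,
       COALESCE(tb.B_Price, tc.B_Price) AS B_Price,
       COALESCE(tb.C_Price, tc.C_Price) AS C_Price
FROM tablea ta
LEFT JOIN tableb tb
  ON tb.Metro = ta.Metro AND tb.Submarket = ta.Submarket
LEFT JOIN tablec tc
  ON tc.Metro = ta.Metro
ORDER BY ta.ID
""").fetchall()

for row in rows:
    print(row)
```

For ID 5 the C price falls back to Table C. Note that this pattern relies on Table C having at most one row per Metro; with several rows per Metro the second join would duplicate rows of Table A.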

Related

Check for interchangeable column values in SQL

I have 2 tables. The first one contains IDs of certain airports, the second contains flights from one airport to another.
ID Airport
---- ----
12 NYC
23 LOS
21 AMS
54 SFR
33 LSA
from to cost
---- ---- ----
12 23 500
12 23 250
23 12 200
21 23 100
54 12 400
33 21 700
I'd like to return a table that contains ONLY airport pairs that are interchangeable, i.e. connected by flights in both directions (NYC - LOS in this case), with a total cost.
Please note that there are identical (from, to) rows with different costs, and the desired output needs to aggregate all costs for each unique combination.
Desired Output :
airport_1 airport_2 total_cost
---- ---- ----
NYC LOS 950
You can get the result without a subquery by using the LEAST() and GREATEST() functions together with a HAVING clause:
SELECT MIN(airport) AS airport_1, MAX(airport) AS airport_2, SUM(cost)/2 AS total_cost
FROM flights
JOIN airports
ON id IN ("from" , "to")
GROUP BY LEAST("from","to"), GREATEST("from","to")
HAVING COUNT(DISTINCT "from")*COUNT(DISTINCT "to")=4
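The HAVING condition keeps only pairs flown in both directions: such a group has two distinct "from" values and two distinct "to" values (2*2=4). SUM(cost) is divided by 2 because the join matches every flight row twice, once for each of its two airports.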
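The query can be tried on the sample data with the sketch below, which uses Python's built-in sqlite3 module. SQLite has no LEAST() or GREATEST(), so its two-argument scalar MIN() and MAX() are swapped in for them; that substitution, and the table names flights and airports, are assumptions of this sketch.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE airports (id INTEGER, airport TEXT);
CREATE TABLE flights ("from" INTEGER, "to" INTEGER, cost INTEGER);
INSERT INTO airports VALUES (12,'NYC'), (23,'LOS'), (21,'AMS'), (54,'SFR'), (33,'LSA');
INSERT INTO flights VALUES
  (12,23,500), (12,23,250), (23,12,200), (21,23,100), (54,12,400), (33,21,700);
""")

# Two-argument MIN()/MAX() are SQLite's scalar equivalents of LEAST()/GREATEST().
rows = con.execute("""
SELECT MIN(a.airport) AS airport_1,
       MAX(a.airport) AS airport_2,
       SUM(f.cost) / 2 AS total_cost
FROM flights f
JOIN airports a
  ON a.id IN (f."from", f."to")
GROUP BY MIN(f."from", f."to"), MAX(f."from", f."to")
HAVING COUNT(DISTINCT f."from") * COUNT(DISTINCT f."to") = 4
""").fetchall()

print(rows)  # [('LOS', 'NYC', 950)]
```

Because airport_1 and airport_2 come from MIN() and MAX() over the airport names, the pair is listed alphabetically (LOS, NYC) rather than in the order shown in the desired output.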

Replace Id of one column by a name from another table while using the count statement?

I am trying to get the count of patients by province for my school project. I have managed to get the count and the Id of the province in a table, but since I am using the count statement it will not let me use a join to show the ProvinceName instead of the Id (it says it's not numerical).
Here is the schema of the two tables I am talking about
The content of the Province table is as follows:
ProvinceId  ProvinceName               ProvinceShortName
1           Terre-Neuve-et-Labrador    NL
2           Île-du-Prince-Édouard      PE
3           Nouvelle-Écosse            NS
4           Nouveau-Brunswick          NB
5           Québec                     QC
6           Ontario                    ON
7           Manitoba                   MB
8           Saskatchewan               SK
9           Alberta                    AB
10          Colombie-Britannique       BC
11          Yukon                      YT
12          Territoires du Nord-Ouest  NT
13          Nunavut                    NU
And here is some sample data from the Patient table (don't worry, it's fake data!):
SS  FirstName  LastName  InsuranceNumber  InsuranceProvince  DateOfBirth  Sex  PhoneNumber
2   Doris      Patel     PATD778276       5                  1977-08-02   F    514-754-6488
3   Judith     Doe       DOEJ7712917      5                  1977-12-09   F    418-267-2263
4   Rosemary   Barrett   BARR05122566     6                  2005-12-25   F    905-638-5062
5   Cody       Kennedy   KENC047167       10                 2004-07-01   M    604-833-7712
I managed to get the patient count by province using the following statement:
select count(SS),InsuranceProvince
from Patient
full JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
group by InsuranceProvince
which gives me the following table:
PatientCount  InsuranceProvince
13            1
33            2
54            3
4             4
608           5
1778          6
25            7
209           8
547           9
649           10
6             11
35            12
24            13
How can I replace the IDs with the correct ProvinceShortName to get the following final result?
ProvinceName  PatientCount
NL            13
PE            33
NS            54
NB            4
QC            608
ON            1778
MB            25
SK            209
AB            547
BC            649
YT            6
NT            35
NU            24
Thanks in advance!
You can simply select the short name instead of the Id. Note that every non-aggregated column in the SELECT should also appear in the GROUP BY (SQL Server, for one, rejects the query otherwise), so group by ProvinceShortName:
SELECT ProvinceShortName, COUNT(SS) AS PatientsInProvince
FROM Patient
JOIN Province ON Patient.InsuranceProvince=Province.ProvinceId
GROUP BY ProvinceShortName;
I would suggest:
select pr.ProvinceShortName, count(*)
from Patient p join
Province pr
on p.InsuranceProvince = pr.ProvinceId
group by pr.ProvinceShortName
order by min(pr.ProvinceId);
Notes:
The key is including the columns you want in the select and group by.
You seem to want the results in province number order, so I included an order by.
There is no need to count the non-NULL values of SS. You might as well use count(*).
Table aliases make the query easier to write and to read.
I assume that you need to show the patient count by province.
SELECT
Province.ProvinceShortName AS [ProvinceName]
,COUNT(Patient.SS) as [PatientCount] -- counts matched patients only, so a province with none shows 0
FROM Patient
RIGHT JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
GROUP BY ProvinceShortName
Just altering your query to
select ProvinceShortName As ProvinceName,count(InsuranceProvince) As PatientCount
from Patient
full JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
group by ProvinceShortName
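The difference between the inner-join and outer-join variants above can be seen on a small sample. The following sketch uses Python's sqlite3 module with a cut-down schema (only the columns the queries need) and a four-province subset that includes Yukon, which has no patients in the sample rows; both the reduced schema and the subset are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Cut-down versions of the two tables: only the columns the queries use.
CREATE TABLE Province (ProvinceId INTEGER PRIMARY KEY, ProvinceShortName TEXT);
CREATE TABLE Patient (SS INTEGER PRIMARY KEY, InsuranceProvince INTEGER);
INSERT INTO Province VALUES (5,'QC'), (6,'ON'), (10,'BC'), (11,'YT');
INSERT INTO Patient VALUES (2,5), (3,5), (4,6), (5,10);
""")

# Inner join: provinces without patients do not appear at all.
inner_counts = con.execute("""
SELECT pr.ProvinceShortName, COUNT(*)
FROM Patient p
JOIN Province pr ON p.InsuranceProvince = pr.ProvinceId
GROUP BY pr.ProvinceShortName
ORDER BY MIN(pr.ProvinceId)
""").fetchall()

# Outer join keeping every province: COUNT(p.SS) skips the NULL-extended row,
# while COUNT(*) counts it, so an empty province gets 0 versus 1.
outer_counts = con.execute("""
SELECT pr.ProvinceShortName, COUNT(p.SS), COUNT(*)
FROM Province pr
LEFT JOIN Patient p ON p.InsuranceProvince = pr.ProvinceId
GROUP BY pr.ProvinceShortName
ORDER BY MIN(pr.ProvinceId)
""").fetchall()

print(inner_counts)  # [('QC', 2), ('ON', 1), ('BC', 1)]
print(outer_counts)  # [('QC', 2, 2), ('ON', 1, 1), ('BC', 1, 1), ('YT', 0, 1)]
```

So if provinces with no patients should be listed with a count of 0, keep the outer join and count a Patient column rather than using COUNT(*) or COUNT(1).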

MS Access SQL Query unmatched rows from two tables

Let's say I have Table_A and Table_B with the following rows:
Table_A:
ID PART_ID KIT_ID
---------------------
1 1 340
2 12 340
3 19 340
4 30 340
5 1 348
6 19 348
7 27 348
...
Table_B:
PART_ID REQ
-------------
1 Y
12 Y
19 Y
27 Y
30 Y
...
How do I get the following result in Table_C?
Table_C:
PART_ID KIT_ID
----------------
27 340
12 348
30 348
...
I've tried the Query Wizard with the Unmatched Rows option and for some reason cannot get any results that resemble what I need. E.g., a customer orders a kit and each kit contains a bunch of parts (some required and some not); how do I find the missing parts for each kit?
Generate all combinations of kits and parts, then keep only the ones that do not exist in Table_A. MS Access has no CROSS JOIN keyword, so the two sources are listed with a comma instead:
select k.kit_id, p.part_id
from (select distinct kit_id from table_a) as k,
table_b as p
where not exists (select 1
from table_a as a
where a.kit_id = k.kit_id and a.part_id = p.part_id
);
You may need the condition REQ = "Y" in the outer where clause. I'm not sure if that is important.
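As a quick check against the sample rows, the sketch below runs the same "all combinations, then NOT EXISTS" query in an in-memory SQLite database from Python. SQLite stands in for MS Access here, and the REQ = 'Y' filter is included on the assumption that only required parts matter.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table_a (id INTEGER, part_id INTEGER, kit_id INTEGER);
CREATE TABLE table_b (part_id INTEGER, req TEXT);
INSERT INTO table_a VALUES
  (1,1,340), (2,12,340), (3,19,340), (4,30,340),
  (5,1,348), (6,19,348), (7,27,348);
INSERT INTO table_b VALUES (1,'Y'), (12,'Y'), (19,'Y'), (27,'Y'), (30,'Y');
""")

# Pair every kit with every required part, then keep the pairs
# that have no matching row in table_a.
missing = con.execute("""
SELECT k.kit_id, p.part_id
FROM (SELECT DISTINCT kit_id FROM table_a) AS k,
     table_b AS p
WHERE p.req = 'Y'
  AND NOT EXISTS (SELECT 1
                  FROM table_a AS a
                  WHERE a.kit_id = k.kit_id AND a.part_id = p.part_id)
ORDER BY k.kit_id, p.part_id
""").fetchall()

print(missing)  # [(340, 27), (348, 12), (348, 30)]
```

These are the three rows listed for Table_C in the question, with the columns in the order the query selects them (kit first, then part).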

How do I change my SQL SELECT GROUP BY query to show me which records are missing a value?

I have a list of codes by area and type. I need to get the unique codes for each type, which I can do with a simple SELECT query with a GROUP BY. I now need to know which area does not have one of the codes. So how do I run a query to group by unique values and tell me which records do not have one of the values?
ID Area Type Code
1 10 A 123
2 10 A 456
3 10 B 789
4 10 B 987
5 10 C 654
6 10 C 321
7 20 A 123
8 20 B 789
9 20 B 987
10 20 C 654
11 20 C 321
12 30 A 137
13 30 A 456
14 30 B 579
15 30 B 789
16 30 B 987
17 30 C 654
18 30 C 321
I can run this query to group them by type and get the unique codes:
SELECT tblExample.Type, tblExample.Code
FROM tblExample
GROUP BY tblExample.Type, tblExample.Code
This gives me this:
Type Code
A 123
A 137
A 456
B 579
B 789
B 987
C 321
C 654
Now I need to know which areas do not have a given code. For example, code 123 does not appear for area 30 and code 137 does not appear for areas 10 and 20. How do I write a query to tell me which areas are missing a code? The format of the output doesn't matter, I just need to get the results. I'm thinking the results could be in one column or spread out in multiple columns:
Type Code Missing Areas
A 123 30
A 137 10, 20
A 456 20
B 579 10, 20
B 789
B 987
C 321
C 654
or
Type Code Missing1 Missing2
A 123 30
A 137 10 20
A 456 20
B 579 10 20
B 789
B 987
C 321
C 654
You can get a list of the missing code/area combinations by first generating all combinations of (type, code) and area, and then keeping only the ones that do not exist:
select tc.type, tc.code, a.area
from (select distinct type, code from tblExample) tc cross join
(select distinct area from tblExample) a left join
tblExample e
on tc.type = e.type and tc.code = e.code and a.area = e.area
where e.area is null;
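For the question as asked, the area has to be part of the generated combinations. The sketch below pairs every distinct (type, code) with every distinct area and keeps the pairs that have no row in tblExample, using Python's sqlite3 module as a stand-in for the asker's database (which is not named in the question).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tblExample (ID INTEGER, Area INTEGER, Type TEXT, Code INTEGER)")
con.executemany(
    "INSERT INTO tblExample VALUES (?, ?, ?, ?)",
    [
        (1, 10, "A", 123), (2, 10, "A", 456), (3, 10, "B", 789),
        (4, 10, "B", 987), (5, 10, "C", 654), (6, 10, "C", 321),
        (7, 20, "A", 123), (8, 20, "B", 789), (9, 20, "B", 987),
        (10, 20, "C", 654), (11, 20, "C", 321), (12, 30, "A", 137),
        (13, 30, "A", 456), (14, 30, "B", 579), (15, 30, "B", 789),
        (16, 30, "B", 987), (17, 30, "C", 654), (18, 30, "C", 321),
    ],
)

# Every (type, code) is paired with every area; the left join finds the
# pairs that actually occur, and the NULL test keeps the ones that do not.
missing = con.execute("""
SELECT tc.Type, tc.Code, a.Area
FROM (SELECT DISTINCT Type, Code FROM tblExample) tc
CROSS JOIN (SELECT DISTINCT Area FROM tblExample) a
LEFT JOIN tblExample e
  ON e.Type = tc.Type AND e.Code = tc.Code AND e.Area = a.Area
WHERE e.Area IS NULL
ORDER BY tc.Type, tc.Code, a.Area
""").fetchall()

for row in missing:
    print(row)
```

This returns one row per missing (type, code, area), for example ('A', 137, 10) and ('A', 137, 20). Codes that appear in every area, such as 789, produce no rows.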

select unique combination in two overlap datasets

For example, in SAS, I have 5 IDs in dataset A (below left). There is also a dataset B (below right), which could contain some of A's IDs. What I need is to find one unique combination, with A as the primary dataset, where the A and B rows have the same sex, ages within 5 years of each other, and incomes within 10000 of each other. It is possible that many b.id values could match a given a.id. But here's the kicker: I can only use each b.id once. In this case, 101 merges with 106, 102 with 111, 103 with 112, and 105 with 110. Sorry, I have a hard time describing my question. Hopefully it is clear enough. Thanks!
ID sex age income ID sex age income
101 F 30 20000 106 F 26 21000
102 M 20 10000 102 M 20 10000
103 F 38 30000 110 M 45 44000
104 M 55 35000 111 M 19 14000
105 M 43 45000 112 F 33 34000
outcome
ID_a sex_a age_a income_a ID_b sex_b age_b income_b
101 F 30 20000 106 F 26 21000
102 M 20 10000 111 M 19 14000
103 F 38 30000 112 F 33 34000
104 M 55 35000
105 M 43 45000 110 M 45 44000
select a.Id, b.Id from SetA a
left join SetB b on a.sex = b.sex and
a.age between b.age - 5 and b.age + 5 and
a.income between b.income - 10000 and b.income + 10000
You should be able to use the techniques used in the answers to this question to do your fuzzy matching while making sure that each b.id is only used once.
The idea is to load your B dataset into a temporary array / hash object, so you can keep track of which b.ids have already been used while matching it onto A.
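The bookkeeping described above can be sketched in Python, with a set playing the role of the hash object that records which b.ids are taken. This is a greedy first-fit pass, not the SAS implementation; the names Person and match_once are hypothetical, and skipping a B row that has the same ID as the A row is an assumption inferred from the desired outcome (102 is paired with 111, not with B's own 102).

```python
from typing import List, NamedTuple, Optional, Tuple


class Person(NamedTuple):
    id: int
    sex: str
    age: int
    income: int


set_a = [
    Person(101, "F", 30, 20000), Person(102, "M", 20, 10000),
    Person(103, "F", 38, 30000), Person(104, "M", 55, 35000),
    Person(105, "M", 43, 45000),
]
set_b = [
    Person(106, "F", 26, 21000), Person(102, "M", 20, 10000),
    Person(110, "M", 45, 44000), Person(111, "M", 19, 14000),
    Person(112, "F", 33, 34000),
]


def match_once(
    a_rows: List[Person],
    b_rows: List[Person],
    max_age_gap: int = 5,
    max_income_gap: int = 10000,
    allow_same_id: bool = False,  # assumption: a row is not matched to its own ID
) -> List[Tuple[int, Optional[int]]]:
    """Greedy first-fit matching in which each B id is used at most once."""
    used = set()  # b.ids that have already been assigned
    pairs = []
    for a in a_rows:
        chosen = None
        for b in b_rows:
            if b.id in used or (not allow_same_id and b.id == a.id):
                continue
            if (
                b.sex == a.sex
                and abs(a.age - b.age) <= max_age_gap
                and abs(a.income - b.income) <= max_income_gap
            ):
                chosen = b.id
                used.add(b.id)
                break
        pairs.append((a.id, chosen))
    return pairs


pairs = match_once(set_a, set_b)
print(pairs)
# [(101, 106), (102, 111), (103, 112), (104, None), (105, 110)]
```

A first-fit pass like this depends on row order and is not guaranteed to find the largest possible number of pairs; if that matters, a proper bipartite matching algorithm would be needed instead.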
SELECT
A.ID,
B.ID
FROM lefttable A
INNER JOIN righttable B
ON (A.sex = B.sex) AND (A.age BETWEEN B.age - 5 AND B.age + 5) AND (A.income BETWEEN B.income - 10000 AND B.income + 10000)