Conditional join based on value of column - sql

I have a table of addresses with one column for Canadian provinces, another for US states, and a third for country. We support other countries, but we treat Canada and the US differently.
We have a separate table that holds lists of codes, keyed by a code table id and a code value. The code table id for Canadian provinces is, say, 5, and it has 13 values covering all the provinces and territories. The US states are in the same table but with a code table id of, say, 6, and it has all 50 states.
I have been asked to write a report that references the province or state, and I am struggling to find a way to change the code table id used in the join so that I get the province or state description depending on the country code.
Table structure for address (simplified):
CNTRY_CD (value 3 is Canada, 22 is US, others exist but only these two are linked to the code table)
PROV_CD (only has values when country is 3, zero otherwise)
STATE_CD (only has values when country is 22, zero otherwise)
Table structure for code tables (simplified):
TABLE_ID (provinces are table 5 and states are table 6)
CODE (for table 5 there are 13 values, for table 6 there are 50 values)
DESC (the name of the province or state depending on above)
If you need additional details just let me know.
EDIT
Sample data follows:
ADDR_ID  PROV_CD  STATE_CD
01       3        NULL
02       NULL     25
03       NULL     NULL
TABLE_ID  CODE  DESC
5         3     Manitoba
5         4     Ontario
6         25    Michigan
6         26    Montana
Report
JURISDICTION  DESC      COUNT
Canada        Manitoba  123
US            Michigan  321
Other         Other     5
What has been causing me the most trouble is that the table id is not in the data - it is known only by the column the data comes from. So if the column is PROV_CD then I know to use code table id 5, if STATE_CD then I know to use code table id 6 but the actual data does not contain the code table id. Hope that makes sense.
This is the closest I have been so far:
Here is redacted source data:
CLIENT_ID ADDR_ID CNTRY_CSN CDN_PROV_CSN US_STE_CSN ADDR_L1_TXT PVST_NM
821 72301 104 0 0 International line 1 NULL
821 72302 148 0 1 NULL NULL
821 72303 221 0 14 NULL NULL
821 72304 36 9 0 NULL NULL
821 72305 0 0 0 NULL NULL
821 72306 221 0 44 NULL NULL
821 72307 36 9 0 NULL NULL
821 72308 0 0 0 NULL NULL
821 72309 0 0 0 NULL NULL
821 72310 0 0 0 NULL NULL
822 1481 36 9 0 NULL NULL
822 1482 36 0 0 NULL NULL
Here is redacted SQL:
SELECT CLIENT_ID, ADDR_ID, CNTRY_CSN, CD_EDESC
FROM CLNT_ADDR
LEFT OUTER JOIN TXSCT
ON CD_TBL_ID =
CASE CNTRY_CSN
WHEN 36 THEN 10
WHEN 148 THEN 538
WHEN 221 THEN 12
ELSE NULL
END
AND CSN =
CASE
WHEN CDN_PROV_CSN > 0 THEN CDN_PROV_CSN
WHEN US_STE_CSN > 0 THEN US_STE_CSN
ELSE NULL
END
WHERE CLIENT_ID IN (821, 822)
WITH UR;
Here is the result I get:
CLIENT_ID ADDR_ID CNTRY_CSN CD_EDESC
821 72301 104 NULL
821 72302 148 Aguascalientes
821 72302 148 Aguascalientes
821 72303 221 Idaho
821 72304 36 Ontario
821 72305 0 NULL
821 72306 221 Texas
821 72307 36 Ontario
821 72308 0 NULL
821 72309 0 NULL
821 72310 0 NULL
822 1481 36 Ontario
822 1482 36 NULL
Address id 72302 is repeated, and I don't know why.

Try this as is.
There are two ways to resolve it: with one inner join (as written below) or with two left joins. To switch to the two-left-join version, uncomment the two LEFT JOIN lines and the COALESCE line, then comment out the C.DESC line and every line from JOIN to the end.
WITH
ADDRESS
(
CNTRY_CD --(value 3 is Canada, 22 is US, others exist but only these two are linked to the code table)
, PROV_CD --(only has values when country is 3, zero otherwise)
, STATE_CD --(only has values when country is 22, zero otherwise)
)
AS
(
VALUES
( '3', 'P1', '')
, ( '3', 'P2', '')
, ('22', '', 'S1')
, ('22', '', 'S2')
)
, CODE
(
TABLE_ID --(provinces are table 5 and states are table 6)
, CODE --(for table 5 there are 13 values, for table 6 there 50 values)
, DESC --(the name of the province or state depending on above)
)
AS
(
VALUES
(5, 'P1', 'Canadian Province1')
, (5, 'P2', 'Canadian Province2')
-- Just to show, that the same code may be for different countries
-- but the statement works correctly
, (5, 'S1', 'Canadian Province3')
, (6, 'S1', 'US State1')
, (6, 'S2', 'US State2')
, (6, 'P1', 'US State3')
)
SELECT
A.*
--, COALESCE (C.DESC, U.DESC) AS DESC
, C.DESC
FROM ADDRESS A
--LEFT JOIN CODE C ON C.CODE = A.PROV_CD AND (C.TABLE_ID, A.CNTRY_CD) = (5, '3')
--LEFT JOIN CODE U ON U.CODE = A.STATE_CD AND (U.TABLE_ID, A.CNTRY_CD) = (6, '22')
JOIN CODE C ON
(C.CODE = A.PROV_CD AND (C.TABLE_ID, A.CNTRY_CD) = (5, '3'))
OR (C.CODE = A.STATE_CD AND (C.TABLE_ID, A.CNTRY_CD) = (6, '22'))
The result is:
CNTRY_CD  PROV_CD  STATE_CD  DESC
3         P1                 Canadian Province1
3         P2                 Canadian Province2
22                 S1        US State1
22                 S2        US State2
If it's not what you want, then please edit your question with your sample data in the form used in the WITH clause and show the desired result.
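For anyone who wants to try the two-left-join variant against the sample data from the question, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module rather than DB2, so treat it as an illustration of the join logic only. Assumptions of mine, not from the question: the lowercase table names, the CNTRY_CD values added to the sample rows (including a made-up country 99 for address 03), and the column name descr, used because DESC is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (addr_id TEXT, cntry_cd INTEGER, prov_cd INTEGER, state_cd INTEGER);
CREATE TABLE code (table_id INTEGER, code INTEGER, descr TEXT);
-- cntry_cd values are assumed; 99 is a hypothetical "other" country
INSERT INTO address VALUES ('01', 3, 3, NULL), ('02', 22, NULL, 25), ('03', 99, NULL, NULL);
INSERT INTO code VALUES (5, 3, 'Manitoba'), (5, 4, 'Ontario'),
                        (6, 25, 'Michigan'), (6, 26, 'Montana');
""")

# One LEFT JOIN per code table: the table id is fixed in each ON clause,
# and the country check keeps a province code from matching a state code.
rows = conn.execute("""
SELECT a.addr_id,
       CASE a.cntry_cd WHEN 3 THEN 'Canada' WHEN 22 THEN 'US' ELSE 'Other' END AS jurisdiction,
       COALESCE(p.descr, s.descr, 'Other') AS descr
FROM address a
LEFT JOIN code p ON p.table_id = 5 AND a.cntry_cd = 3  AND p.code = a.prov_cd
LEFT JOIN code s ON s.table_id = 6 AND a.cntry_cd = 22 AND s.code = a.state_cd
ORDER BY a.addr_id
""").fetchall()

for row in rows:
    print(row)
```

Because each address can match at most one of the two joins, COALESCE picks whichever description was found and falls back to 'Other'. Wrapping this in a GROUP BY on jurisdiction and descr with COUNT(*) would give the report shape from the question.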

Related

Getting multiple columns from TWO tables using a WHERE EXISTS statement

I'm trying to compare the name columns of two tables and find the similarities and differences using SQL Server. I would like to list the students in Table 1 that are (or are not) present in Table 2, along with their respective scores and grade. However, I am having trouble populating multiple columns with my current script: all the score fields come back as NULL, even when they are populated in Table 2. I've included the table formats and both of my scripts so far.
TABLE_1_Today's_List
Name1
Kevin
James
Roger
Bob
TABLE_2_Combined_Scores
Name2  Grade  Score_1  Score_2  Score_3  Score_4
Kevin  10     25       34       12       45
Bob    9      25       23       65       87
Roger  10     43       54       25       98
James  12     43       54       25       98
Students in Table 1 that are also in Table 2:
SELECT c.Name1, Score_1,Score_2,Score_3,Score_4
FROM TABLE_1_Today's_List c
WHERE EXISTS (SELECT c2.Name2,Score_1,Score_2,Score_3,Score_4
FROM TABLE_2_Combined_Scores c2
WHERE c2.Name2 = c.Name1
and Grade = '10');
^This script correctly finds the students present in both tables but returns NULL for all the Score fields. I would like to populate the Score fields with the values from Table2.
Students in Table 1 that are not in Table 2:
SELECT c.Name1, Score_1,Score_2,Score_3,Score_4
FROM TABLE_1_Today's_List c
WHERE NOT EXISTS (SELECT c2.Name2,Score_1,Score_2,Score_3,Score_4
FROM TABLE_2_Combined_Scores c2
WHERE c2.Name2 = c.Name1
and Grade = '10');
^This script correctly finds the students that are in table1 but not in table2, but it returns NULL for all the Score fields. I would like to populate the Score fields with the values from Table2.
Ideal Output:
Name1  Score_1  Score_2  Score_3  Score_4
Kevin  25       34       12       45
Roger  43       54       25       98
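A note on why the scores cannot come from the scripts above: EXISTS only decides whether a row of Table 1 is kept, so nothing selected inside the subquery is visible to the outer SELECT. To return the score columns, Table 2 has to be joined. The sketch below shows this with Python's built-in sqlite3 module instead of SQL Server, using the sample rows from the question. The table names todays_list and combined_scores are simplified stand-ins of mine, since the original name contains an apostrophe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE todays_list (name1 TEXT);
CREATE TABLE combined_scores (name2 TEXT, grade INTEGER,
                              score_1 INTEGER, score_2 INTEGER,
                              score_3 INTEGER, score_4 INTEGER);
INSERT INTO todays_list VALUES ('Kevin'), ('James'), ('Roger'), ('Bob');
INSERT INTO combined_scores VALUES
    ('Kevin', 10, 25, 34, 12, 45),
    ('Bob',    9, 25, 23, 65, 87),
    ('Roger', 10, 43, 54, 25, 98),
    ('James', 12, 43, 54, 25, 98);
""")

# Students on today's list with a grade-10 row: the JOIN exposes the score columns.
matched = conn.execute("""
SELECT c.name1, c2.score_1, c2.score_2, c2.score_3, c2.score_4
FROM todays_list c
JOIN combined_scores c2 ON c2.name2 = c.name1 AND c2.grade = 10
ORDER BY c.name1
""").fetchall()

# Students without a grade-10 row: there is no matching row, so only the name exists.
unmatched = [r[0] for r in conn.execute("""
SELECT c.name1
FROM todays_list c
WHERE NOT EXISTS (SELECT 1 FROM combined_scores c2
                  WHERE c2.name2 = c.name1 AND c2.grade = 10)
ORDER BY c.name1
""")]

print(matched)
print(unmatched)
```

The first query reproduces the ideal output for Kevin and Roger. For the NOT EXISTS case there are, by definition, no matching grade-10 scores to show.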

Efficient code for fetching all duplicate records on large datasets

I want to assign scores of zero to any record having duplicates and a score of 1 to all unique records. I have a set of data like this:
Table 1-
No.  City
1    null
2    null
3    null
4    Aachen
5    Berlin
6    Berlin
7    Berlin
8    Bochum
9    Bochum
10   Bristol
11   Liverpool
12   Liverpool
So, the expected result will be:
Table 2 -
No.  City       Score
1    null       0
2    null       0
3    null       0
4    Aachen     1
5    Berlin     0
6    Berlin     0
7    Berlin     0
8    Bochum     0
9    Bochum     0
10   Bristol    1
11   Liverpool  0
12   Liverpool  0
select city,
case when [City] in ( select [City] from [Table1] group by [City] having count([City]) = 1) then 1 else 0 end as [Score]
from Table1
This code works well on datasets smaller than 100k rows but if it deals with larger datasets it is too slow and the execution time runs out sometimes. It is important to recognize null values as duplicates as well. Can anyone please suggest a more efficient solution than this?
Below is the solution that I have implemented.
The reason for converting the NULLs to a placeholder character is that COUNT(city) ignores NULLs, so those rows would otherwise never be counted as duplicates.
create table temp
(id int,
city varchar(200))
insert into temp
values
(1,null),
(2,null),
(3,null),
(4,'Aachen'),
(5,'Berlin'),
(6,'Berlin'),
(7,'Berlin'),
(8,'Bochum'),
(9,'Bochum'),
(10,'Bristol'),
(11,'Liverpool'),
(12,'Liverpool')
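;with cte as (
-- replace NULL with the placeholder 'A' so that count(city) counts those rows
select id, case when city is null then 'A' else city end as city from temp)
select id, case when city ='A' then NULL else city end as city,
-- no ORDER BY inside OVER: the count is taken over the whole partition
case when count(city) over (partition by city)>1 then 0 else 1 end as score
from cte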
I'm not sure if this will be faster; it depends on how the query optimizer works in SQL Server 2008. But it feels faster to me to structure the query like this :)
select [City],
coalesce(score, 0) as score
from [Table1] t
left join
(select [City], 1 as score from [Table1]
group by [City] having count([City]) = 1
) x on x.[City] = t.[City]
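As a cross-check of both answers, here is a small sketch using Python's built-in sqlite3 module (not SQL Server) on the sample rows from the question. It relies on two things I am confident hold in standard SQL: COUNT(*) counts rows regardless of NULLs, and PARTITION BY puts all NULLs into one partition, so no placeholder value is needed. Window functions need SQLite 3.25 or later; whether this form is faster on a large SQL Server table is something only a test against the real data can tell.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (id INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO temp VALUES (?, ?)",
    [(1, None), (2, None), (3, None), (4, "Aachen"),
     (5, "Berlin"), (6, "Berlin"), (7, "Berlin"),
     (8, "Bochum"), (9, "Bochum"), (10, "Bristol"),
     (11, "Liverpool"), (12, "Liverpool")],
)

# COUNT(*) counts every row in the partition, including the NULL-city rows,
# so the three NULLs are treated as duplicates of each other.
scores = conn.execute("""
SELECT id, city,
       CASE WHEN COUNT(*) OVER (PARTITION BY city) > 1 THEN 0 ELSE 1 END AS score
FROM temp
ORDER BY id
""").fetchall()

for row in scores:
    print(row)
```

Only Aachen and Bristol receive a score of 1, which matches the expected Table 2.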

Matching multiple rows in where clause for filter

I have two tables as the below:
Table 1 : Product_Information
Information_ID  Product_Name
1               A
2               B
3               C
4               D
5               E
Table 2 : Discriptor_Values
Information_ID  Descriptor_ID  Descriptor_Value
1               1              98
1               2              142
1               3              29.66
2               1              50
2               2              11
2               3              14
3               1              17
3               2              76
3               3              85
4               1              59
4               2              48
4               3              35
5               1              48
5               2              12
5               3              19
Using the above tables, I am building a filter page like the ones on online shopping sites. For mobile phones, for example, the descriptors would be price and internal storage, each with a minimum and maximum value.
I select descriptors and give a min and max value for each, and the matching products should be the result.
If any filter range is passed, only the filtered list of products should be shown; otherwise all records should be shown.
I am trying the query below but not getting the correct output: I get the union of the rows that match any of the passed filter rows (#tblFilter).
CREATE TABLE #tblFilter(
[descriptor_id] [int] NULL,
[min_value] [decimal](18, 0) NULL,
[max_value] [decimal](18, 0) NULL
)
insert into #tblFilter values (1, 40.33, 70.33)
insert into #tblFilter values (2, 100.33, 150.33)
insert into #tblFilter values (3, 10, 60)
select p.*
from Product_Information p
inner join Discriptor_Values dv on p.Information_ID = dv.Information_ID
left join #tblFilter t1 on t1.descriptor_id = dv.Descriptor_id
WHERE ((dv.Descriptor_ID = t1.descriptor_id
and convert(decimal, dv.Descriptor_Value)
between CONVERT(decimal, t1.min_value) and CONVERT(decimal, t1.max_value))
or not exists (select 1 from #tblFilter))
drop TABLE #tblFilter
Please help me to minimize the result list by filter and show all records if there is no row in filter table (#tblFilter).
I believe you want:
select p.*
from Product_Information p join
Discriptor_Values dv
on p.Information_ID = dv.Information_ID left join
#tblFilter t1
on t1.descriptor_id = dv.Descriptor_id
where dv.Descriptor_Value between t1.min_value and t1.max_value or
-- test the left-joined filter table: dv comes from an inner join and is never null here
t1.descriptor_id is null;
I removed the conversions to decimals. You might actually need them, but in the question the values look like numbers and the question doesn't specify that they are stored as strings.
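If the goal is for a product to satisfy every filter row rather than any one of them, one possible approach is to count, per product, how many filter rows it satisfies and compare that with the total number of filter rows. The sketch below uses Python's built-in sqlite3 module rather than SQL Server. It assumes one value per product and descriptor and at most one filter row per descriptor. The filter values are hypothetical ones chosen so that the result is non-empty, and the table name tbl_filter stands in for #tblFilter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_information (information_id INTEGER, product_name TEXT);
CREATE TABLE discriptor_values (information_id INTEGER, descriptor_id INTEGER,
                                descriptor_value REAL);
CREATE TABLE tbl_filter (descriptor_id INTEGER, min_value REAL, max_value REAL);
INSERT INTO product_information VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'E');
INSERT INTO discriptor_values VALUES
    (1,1,98), (1,2,142), (1,3,29.66),
    (2,1,50), (2,2,11),  (2,3,14),
    (3,1,17), (3,2,76),  (3,3,85),
    (4,1,59), (4,2,48),  (4,3,35),
    (5,1,48), (5,2,12),  (5,3,19);
-- hypothetical filter rows, not the ones from the question
INSERT INTO tbl_filter VALUES (1, 40, 70), (3, 10, 20);
""")

def filtered_products(connection):
    """Return names of products that satisfy every row of tbl_filter."""
    sql = """
    SELECT p.product_name
    FROM product_information p
    WHERE (SELECT COUNT(*)
           FROM tbl_filter f
           JOIN discriptor_values dv
             ON dv.descriptor_id = f.descriptor_id
            AND dv.descriptor_value BETWEEN f.min_value AND f.max_value
           WHERE dv.information_id = p.information_id)
        = (SELECT COUNT(*) FROM tbl_filter)
    ORDER BY p.information_id
    """
    return [row[0] for row in connection.execute(sql)]

with_filters = filtered_products(conn)      # products passing both ranges
conn.execute("DELETE FROM tbl_filter")
without_filters = filtered_products(conn)   # empty filter table: 0 = 0 for every product
print(with_filters, without_filters)
```

With an empty filter table both counts are zero, so every product is returned, which covers the "show all records" requirement without a separate branch.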

Issue select statement on multiple rows

I use DB2.
Situation: I want to query my table RELATION to list all the companies that have RELATION 1 and also RELATION 2 or 3 assigned. In my DB design, a company can have multiple relations.
I tried a select statement with multiple AND conditions on the same column (RELATION), but when I execute it I do not get any hits.
SELECT R_ID, COMPANY_NAME from RELATION
WHERE COMPANY_GROUP = 2245
AND RELATION = 1
AND RELATION in (2,3)
When I execute this I don't get any hits.
This is my DB design.
***This is the table RELATION
R_ID, RELATION, COMPANY_NAME
121 1 Inbev
122 6 Jupiler
123 1 Unox
124 2 Unox
125 4 Lotus
126 1 Lu
127 1 Felix
128 2 Felix
129 1 Unicoresels
130 3 Unicoresels
131 4 Sporkamt
***This is the table COMPANY
COMPANY_ID, COMPANY_NAME, COMPANY_ADDRESS, COMPANY_GROUP
31 Jupiler Some address 2245
32 Unox Some address 2245
33 Lotus Some address 2245
34 Lu Some address 2245
35 Felix Some address 2245
36 Unicoresels Some address 2245
37 Sporkampt Some address 2245
This is the result I want to achieve with a query.
R_ID, COMPANY_NAME
123 Unox
124 Unox
127 Felix
128 Felix
129 Unicoresels
130 Unicoresels
How can I do this?
One approach is to use group by and having:
SELECT COMPANY_NAME
FROM RELATION
WHERE RELATION IN (1, 2, 3)
GROUP BY COMPANY_NAME
HAVING SUM(CASE WHEN RELATION = 1 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN RELATION IN (2, 3) THEN 1 ELSE 0 END) > 0 ;
Notes:
If you want to filter by company group, then you need to join in the companies table.
The relation table should be using COMPANY_ID, not COMPANY_NAME.
EDIT:
If you want the rows from the RELATION table that match, then a simple method is to use the above as a subquery:
SELECT r.*
FROM RELATION r
WHERE r.COMPANY_NAME IN (SELECT r2.COMPANY_NAME
FROM RELATION r2
WHERE r2.RELATION IN (1, 2, 3)
GROUP BY r2.COMPANY_NAME
HAVING SUM(CASE WHEN r2.RELATION = 1 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN r2.RELATION IN (2, 3) THEN 1 ELSE 0 END) > 0
);
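To see the subquery version working on the sample rows from the question, here is a short sketch using Python's built-in sqlite3 module instead of DB2. The COMPANY_GROUP filter is left out because, as noted above, it would need a join to the COMPANY table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relation (r_id INTEGER, relation INTEGER, company_name TEXT)")
conn.executemany(
    "INSERT INTO relation VALUES (?, ?, ?)",
    [(121, 1, "Inbev"), (122, 6, "Jupiler"), (123, 1, "Unox"), (124, 2, "Unox"),
     (125, 4, "Lotus"), (126, 1, "Lu"), (127, 1, "Felix"), (128, 2, "Felix"),
     (129, 1, "Unicoresels"), (130, 3, "Unicoresels"), (131, 4, "Sporkamt")],
)

# A single row can never have RELATION = 1 and RELATION IN (2, 3) at once,
# so the test is made per company with conditional sums in HAVING.
matching_rows = conn.execute("""
SELECT r.r_id, r.company_name
FROM relation r
WHERE r.company_name IN (
    SELECT r2.company_name
    FROM relation r2
    WHERE r2.relation IN (1, 2, 3)
    GROUP BY r2.company_name
    HAVING SUM(CASE WHEN r2.relation = 1 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN r2.relation IN (2, 3) THEN 1 ELSE 0 END) > 0)
ORDER BY r.r_id
""").fetchall()

for row in matching_rows:
    print(row)
```

This returns the six rows listed as the desired result in the question (Unox, Felix and Unicoresels).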

How to get SQL Join Output in the binary form

I have 2 tables:
Patient Visit Table
Visit ID Patient ID Date Disease ID
101 1 22-Feb 11
102 5 5-Apr 22
103 3 2-Jul 77
104 2 4-Feb 55
105 6 5-Jan 99
106 2 6-Jan 66
107 2 8-Jan 77
108 7 9-Jan 44
109 5 22-Jan 88
110 1 23-Jan 33
and 2nd table is,
Disease Table
Disease ID Disease Name
11 Asthama
22 TB
33 Flu
44 AIDS
55 Cancer
66 Heart Disease
77 ABC
88 XYZ
99 MNO
I want the output as follows:
The table should have Patient ID as rows and diseases as columns, with binary values indicating which patient has which disease.
What query should I use?
Try this if you are using SQL Server; I hope it helps. It uses CASE expressions:
select t1.patient_id,
-- max() collapses each patient's visits into a single row of 0/1 flags
-- 'Asthama' is spelled as it is stored in the disease table
max(case when t2.disease_name='Asthama' then 1 else 0 end) as Asthma,
max(case when t2.disease_name='TB' then 1 else 0 end) as TB,
max(case when t2.disease_name='Flu' then 1 else 0 end) as Flu,
max(case when t2.disease_name='AIDS' then 1 else 0 end) as AIDS,
max(case when t2.disease_name='Cancer' then 1 else 0 end) as Cancer,
max(case when t2.disease_name='Heart Disease' then 1 else 0 end) as [Heart Disease],
max(case when t2.disease_name='ABC' then 1 else 0 end) as ABC,
max(case when t2.disease_name='XYZ' then 1 else 0 end) as XYZ,
max(case when t2.disease_name='MNO' then 1 else 0 end) as MNO
from #table1 t1
left join #table2 t2
on t1.Disease_id=t2.Disease_id
group by t1.patient_id
order by t1.patient_id
Try this
SELECT PatientID, [Asthama],[TB],[Flu],[AIDS],[Cancer],[Heart Disease], [ABC],[XYZ],[MNO]
FROM
(SELECT P.PatientID,D.Disease from Patient P inner join Disease D on P.DiseaseID=D.DiseaseID) AS SourceTable
PIVOT
(
count(Disease)
FOR Disease IN ([Asthama],[TB],[Flu],[AIDS],[Cancer],[Heart Disease],[ABC],[XYZ],[MNO])
) AS PivotTable;
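For comparison, here is a sketch of the same patient-by-disease matrix built with conditional aggregation, run through Python's built-in sqlite3 module rather than SQL Server (SQLite has no PIVOT). The table names visit and disease and the lowercase column names are my own stand-ins for the two tables in the question. The list of disease columns is read from the disease table, so it does not have to be typed out by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visit (visit_id INTEGER, patient_id INTEGER, visit_date TEXT, disease_id INTEGER);
CREATE TABLE disease (disease_id INTEGER, disease_name TEXT);
INSERT INTO visit VALUES
    (101, 1, '22-Feb', 11), (102, 5, '5-Apr', 22), (103, 3, '2-Jul', 77),
    (104, 2, '4-Feb', 55),  (105, 6, '5-Jan', 99), (106, 2, '6-Jan', 66),
    (107, 2, '8-Jan', 77),  (108, 7, '9-Jan', 44), (109, 5, '22-Jan', 88),
    (110, 1, '23-Jan', 33);
INSERT INTO disease VALUES
    (11, 'Asthama'), (22, 'TB'), (33, 'Flu'), (44, 'AIDS'), (55, 'Cancer'),
    (66, 'Heart Disease'), (77, 'ABC'), (88, 'XYZ'), (99, 'MNO');
""")

# One output column per disease, in disease_id order.
names = [r[0] for r in conn.execute("SELECT disease_name FROM disease ORDER BY disease_id")]

# MAX(CASE ...) yields 1 if any visit of the patient had that disease, else 0.
# The names are bound as parameters rather than pasted into the SQL text.
flag_columns = ", ".join(
    "MAX(CASE WHEN d.disease_name = ? THEN 1 ELSE 0 END)" for _ in names
)
sql = f"""
SELECT v.patient_id, {flag_columns}
FROM visit v
JOIN disease d ON d.disease_id = v.disease_id
GROUP BY v.patient_id
ORDER BY v.patient_id
"""

# Map each patient id to a {disease name: 0 or 1} dictionary.
matrix = {row[0]: dict(zip(names, row[1:])) for row in conn.execute(sql, names)}

for patient_id, flags in matrix.items():
    print(patient_id, [name for name, flag in flags.items() if flag == 1])
```

Unlike count(Disease) in the PIVOT version, MAX keeps the values strictly binary even if a patient has several visits for the same disease.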