alternative solution to too many JOINs - sql

There is a table containing all names:
CREATE TABLE Names(
Name VARCHAR(20)
)
And there are multiple tables with similar schema.
Let's say:
CREATE TABLE T1
(
Name VARCHAR(20),
Description VARCHAR(30),
Version INT
)
CREATE TABLE T2
(
Name VARCHAR(20),
Description VARCHAR(30),
Version INT
)
I need to query the description for each name, in the following priority order:
any records in T1 with matching name and version = 1
any records in T1 with matching name and version = 2
any records in T2 with matching name and version = 1
any records in T2 with matching name and version = 2
I want a result from a lower-priority source only if there is no result from a higher-priority source.
So far this is what I've got:
SELECT
N.Name AS Name, Description =
CASE
WHEN (T11.Description IS NOT NULL) THEN T11.Description
WHEN (T12.Description IS NOT NULL) THEN T12.Description
WHEN (T21.Description IS NOT NULL) THEN T21.Description
WHEN (T22.Description IS NOT NULL) THEN T22.Description
ELSE NULL
END
FROM Names AS N
LEFT JOIN T1 AS T11 ON T11.Name = N.Name AND T11.Version = 1
LEFT JOIN T1 AS T12 ON T12.Name = N.Name AND T12.Version = 2
LEFT JOIN T2 AS T21 ON T21.Name = N.Name AND T21.Version = 1
LEFT JOIN T2 AS T22 ON T22.Name = N.Name AND T22.Version = 2
It's working, but are there too many JOINs here? Is there a better approach?
sqlfiddle
Sample Input:
INSERT INTO Names VALUES('name1')
INSERT INTO Names VALUES('name2')
INSERT INTO Names VALUES('name3')
INSERT INTO Names VALUES('name4')
INSERT INTO Names VALUES('name5')
INSERT INTO Names VALUES('name6')
INSERT INTO T1 VALUES ('name1','name1_T1_1', 1)
INSERT INTO T1 VALUES ('name2','name2_T1_1', 1)
INSERT INTO T1 VALUES ('name3','name3_T1_1', 1)
INSERT INTO T1 VALUES ('name3','name3_T1_2', 2)
INSERT INTO T1 VALUES ('name5','name5_T1_2', 2)
INSERT INTO T2 VALUES ('name1','name1_T2_1', 1)
INSERT INTO T2 VALUES ('name4','name4_T2_1', 1)
Expected result:
-- Name Description
-- name1 name1_T1_1
-- name2 name2_T1_1
-- name3 name3_T1_1
-- name4 name4_T2_1
-- name5 name5_T1_2
-- name6 NULL

Well, this is a solution that eliminates the CASE expression and minimizes the repetitive part of the query. It requires some joins of its own, of course, so you'd need quite a few tables and/or versions to get any real benefit out of it:
;WITH
AllDescriptions AS
(
SELECT 1 AS Rank, * FROM T1
UNION ALL SELECT 2 AS Rank, * FROM T2
-- UNION ALL SELECT 3 AS Rank, * FROM T3
-- UNION ALL SELECT 4 AS Rank, * FROM T4
-- etc
),
Ranks AS
(
SELECT
AllDescriptions.Name,
MIN(AllDescriptions.Rank) AS Rank
FROM
AllDescriptions
GROUP BY
Name
),
Versions AS
(
SELECT
AllDescriptions.Name,
AllDescriptions.Rank,
MIN(AllDescriptions.Version) AS Version
FROM
AllDescriptions
INNER JOIN Ranks
ON Ranks.Name = AllDescriptions.Name
AND Ranks.Rank = AllDescriptions.Rank
GROUP BY
AllDescriptions.Name,
AllDescriptions.Rank
),
Descriptions AS
(
SELECT
AllDescriptions.Name,
AllDescriptions.Description
FROM
AllDescriptions
INNER JOIN Versions
ON Versions.Name = AllDescriptions.Name
AND Versions.Rank = AllDescriptions.Rank
AND Versions.Version = AllDescriptions.Version
)
SELECT
Names.*,
Descriptions.Description
FROM
Names
LEFT OUTER JOIN Descriptions
ON Descriptions.Name = Names.Name

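Try this query. It gives the expected result as long as a name does not have both versions in the same table; a name that does (name3 in the sample data) comes back once per matching version.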
SELECT N.name AS Name,
Description =
CASE
WHEN ( t1.description IS NOT NULL ) THEN t1.description
WHEN ( t2.description IS NOT NULL ) THEN t2.description
ELSE NULL
END
FROM names AS N
LEFT JOIN t1
ON t1.name = N.name
AND t1.version IN( 1, 2 )
LEFT JOIN t2
ON t2.name = N.name
AND t2.version IN ( 1, 2 )

select n.name, isnull(d.description,d1.Description) description
from Names n
outer apply (select top 1 t1.Name, t1.Description
from T1
WHERE t1.Name = n.name
order by Version asc
) d
outer apply (select top 1 t2.Name, t2.Description
from T2
WHERE t2.Name = n.name
order by Version asc
) d1
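The priority rule from the question (T1 before T2, then the lower version first) can also be written as one ranking over a UNION ALL. This is only a sketch: the Src column and the CTE names are made up for illustration, and it assumes a database with window functions (SQL Server 2005 or later, for example).

```sql
-- Src is a hypothetical column that encodes table priority (1 = T1, 2 = T2).
WITH Candidates AS
(
    SELECT Name, Description, Version, 1 AS Src FROM T1 WHERE Version IN (1, 2)
    UNION ALL
    SELECT Name, Description, Version, 2 AS Src FROM T2 WHERE Version IN (1, 2)
),
Ranked AS
(
    -- rn = 1 marks the highest-priority row for each name.
    SELECT Name, Description,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Src, Version) AS rn
    FROM Candidates
)
SELECT N.Name, R.Description
FROM Names AS N
LEFT JOIN Ranked AS R
    ON R.Name = N.Name
   AND R.rn = 1;
```

Adding a T3 would then mean one more UNION ALL branch with Src = 3 rather than two more joins and two more CASE branches.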

Related

Matching multiple columns in one join

I have two tables:
Table 1
item_name | assocID_1 | assocID_2 | assocID_3
ball 123 456 789
Table 2
assoc_key assoc_value
123 red
456 white
789 blue
Am I able to create an output of:
ball red white blue
With only one join? I understand I can just join the tables multiple times to easily get this result, but in my actual tables there are many more than 3 columns, and the app I'm using apparently supports only 4 joins per query.
Many thanks for any help.
If you don't care about performance, you can do:
select t1.item_name,
max(case when t2.assoc_key = t1.assocID_1 then t2.assoc_value end),
max(case when t2.assoc_key = t1.assocID_2 then t2.assoc_value end),
max(case when t2.assoc_key = t1.assocID_3 then t2.assoc_value end)
from table1 t1 join
table2 t2
on t2.assoc_key in (t1.assocID_1, t1.assocID_2, t1.assocID_3)
group by t1.item_name;
You can also use subqueries. If we assume that there is only one matching row in table2:
select t1.item_name,
(select t2.assoc_value from table2 t2 where t2.assoc_key = t1.assocID_1),
(select t2.assoc_value from table2 t2 where t2.assoc_key = t1.assocID_2),
(select t2.assoc_value from table2 t2 where t2.assoc_key = t1.assocID_3)
from table1 t1;
If there can be more than one match, you can arbitrarily choose one of them using aggregation functions:
select t1.item_name,
(select max(t2.assoc_value) from table2 t2 where t2.assoc_key = t1.assocID_1),
(select max(t2.assoc_value) from table2 t2 where t2.assoc_key = t1.assocID_2),
(select max(t2.assoc_value) from table2 t2 where t2.assoc_key = t1.assocID_3)
from table1 t1;
I do not think you need a join here. You just need a lookup, which you can do directly in the SELECT list. Here is an implementation in SQL Server. (The sample data preparation code uses DROP TABLE IF EXISTS; if you are on a version older than SQL Server 2016, replace it with the older way of doing the same.)
DDL and Test Data:
DROP TABLE IF EXISTS Table1
SELECT item_name = 'ball'
,assocID_1 = 123
,assocID_2 = 456
,assocID_3 = 789
INTO Table1
DROP TABLE IF EXISTS Table2
SELECT assoc_key = 123
,assoc_value = 'red'
INTO Table2
UNION ALL
SELECT assoc_key = 456
,assoc_value = 'white'
UNION ALL
SELECT assoc_key = 789
,assoc_value = 'blue'
SELECT * FROM Table1
SELECT * FROM Table2
1. Brute Force Approach:
SELECT item_name = T1.item_name
,(SELECT TOP 1 assoc_value FROM Table2 WHERE assoc_key = T1.assocID_1)
,(SELECT TOP 1 assoc_value FROM Table2 WHERE assoc_key = T1.assocID_2)
,(SELECT TOP 1 assoc_value FROM Table2 WHERE assoc_key = T1.assocID_3)
FROM Table1 T1
2. Dynamically Building the Query For Ease And Then Executing It. With this approach Number of Columns Would Not Be a Concern:
DECLARE @SQL NVARCHAR(MAX) = 'SELECT item_name = T1.item_name '
SELECT @SQL += '
,(SELECT TOP 1 assoc_value FROM Table2 WHERE assoc_key = T1.'+COLUMN_NAME+')'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo' -- provide your proper schema name here
AND TABLE_NAME = 'Table1'
AND COLUMN_NAME <> 'item_name' -- provide the columns you want to avoid doing lookups
ORDER BY ORDINAL_POSITION
SET @SQL+='
FROM Table1 T1 '
PRINT @SQL
EXEC sp_executesql @statement=@SQL
3. Combination of UNPIVOT, JOIN and PIVOT
SELECT item_name, [assocID_1], [assocID_2], [assocID_3] -- you can dynamically build the select list like above example if you need
FROM
(
SELECT IQ.item_name, IQ.assocId, T2.assoc_value
FROM (
SELECT UNP.item_name, UNP.assocId, UNP.Value
FROM Table1 T1
UNPIVOT
(
Value FOR assocId IN ([assocId_1], [assocId_2], [assocId_3]) -- you can dynamically build this column list like above example if you need
) UNP
) IQ
INNER JOIN Table2 T2
ON IQ.Value = T2.assoc_key
) OQ
PIVOT
(
MAX(assoc_value)
FOR associd IN ([assocID_1], [assocID_2], [assocID_3]) -- you can dynamically build this column list like above example if you need
) PV
select item_name, decode(ASSOCID_1,(select assocID_1 from t1 ), (select assoc_value from t2 where assoc_key =aa.assocID_1),null ) ,
decode(ASSOCID_2,(select assocID_2 from t1 ) , (select assoc_value from t2 where assoc_key =aa.assocID_2),null ),
decode(ASSOCID_3,(select assocID_3 from t1 ), (select assoc_value from t2 where assoc_key =aa.assocID_3),null ) from t1 aa

Get the list of name column values which are not common in both the tables?

Recently I had an interview where the question was this:
Suppose there are two tables in a database.
Table T1 has a column named "name" and a few other columns.
Table T2 also has a column named "name" and a few other columns.
Suppose table T1 has these values in the name column:
[n1,n2,n3,n4,n5]
and values in the "name" column of table T2 are
[n2,n4]
then the output should be
[n1,n3,n5], as n2 and n4 are common to both tables.
We need to find the list of names which are not common to both tables.
The solution that I provided used a join, in the form below:
select name from table1 where name not in (select t1.name from table1 t1 join table2 t2 on t1.name=t2.name)
UNION
select name from table2 where name not in (select t1.name from table1 t1 join table2 t2 on t1.name=t2.name)
But he said there is still a better solution. I was not able to come up with a different, more efficient solution. Is there a more efficient way to get the list of names?
If the NAME column does not have NULL values, there is also
select distinct(coalesce(a.name, b.name)) name
from table1 a
full join table2 b on a.name = b.name
where a.name is null or b.name is null
(Corrected WHERE condition, sorry...)
Use FULL OUTER JOIN:
SELECT DISTINCT(COALESCE(t1.NAME, t2.NAME)) AS NAME
FROM TABLE1 t1
FULL OUTER JOIN TABLE2 t2
ON t2.NAME = t1.NAME
WHERE t1.NAME IS NULL OR
t2.NAME IS NULL
A FULL OUTER JOIN is similar to a LEFT OUTER JOIN unioned with a RIGHT OUTER JOIN: in addition to the matching rows, it returns rows where data exists in the first table but not the second, or where data exists in the second table but not the first. The WHERE clause then keeps only the unmatched rows. You could get the same effect by using
SELECT t1.NAME
FROM TABLE1 t1
LEFT OUTER JOIN TABLE2 t2
ON t2.NAME = t1.NAME
WHERE t2.NAME IS NULL
UNION
SELECT t2.NAME
FROM TABLE1 t1
RIGHT OUTER JOIN TABLE2 t2
ON t2.NAME = t1.NAME
WHERE t1.NAME IS NULL
and in fact the above is what you'd need to do if you were using a database which doesn't support the FULL OUTER JOIN syntax (e.g. MySQL, the last time I looked).
See this dbfiddle
Union the tables and return the values that don't have a count of 2:
create table t1 (
c1 int
);
create table t2 (
c1 int
);
insert into t1 values ( 1 );
insert into t1 values ( 3 );
insert into t2 values ( 2 );
insert into t2 values ( 3 );
commit;
select c1 only_in_one_table
from (
select 'T1' t, c1 from t1
union
select 'T2' t, c1 from t2
)
group by c1
having count(*) <> 2;
ONLY_IN_ONE_TABLE
1
2
I'm not a fan of not in with subqueries, because it behaves unexpectedly with null values. And the person asking would have to explain what "better" means. Your version is actually reasonable.
I might be inclined to approach this using aggregation:
select name
from ((select distinct name, 1 as in_table1, 0 as in_table2
from table1
) union all
(select distinct name, 0 as in_table1, 1 as in_table2
from table2
)
) t
group by name
having max(in_table1) <> max(in_table2);
In a real world case, you would probably have a separate table with all names. If so:
select n.*
from names n
where (not exists (select 1 from table1 t1 where t1.name = n.name) and
exists (select 1 from table2 t2 where t2.name = n.name)
) or
(exists (select 1 from table1 t1 where t1.name = n.name) and
not exists (select 1 from table2 t2 where t2.name = n.name)
);
This is usually the fastest approach because it does not involve any aggregation or duplicate removal.
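To illustrate the remark about NOT IN and NULL values, here is a small sketch. The demo_a and demo_b tables are hypothetical and exist only for this demonstration; they are not part of the question.

```sql
-- Hypothetical demo tables; demo_b contains a NULL name.
CREATE TABLE demo_a (name VARCHAR(10));
CREATE TABLE demo_b (name VARCHAR(10));
INSERT INTO demo_a VALUES ('n1');
INSERT INTO demo_a VALUES ('n2');
INSERT INTO demo_b VALUES ('n2');
INSERT INTO demo_b VALUES (NULL);

-- Returns no rows: 'n1' NOT IN ('n2', NULL) evaluates to UNKNOWN, not TRUE.
SELECT a.name
FROM demo_a a
WHERE a.name NOT IN (SELECT b.name FROM demo_b b);

-- Returns n1: NOT EXISTS only asks whether a matching row exists.
SELECT a.name
FROM demo_a a
WHERE NOT EXISTS (SELECT 1 FROM demo_b b WHERE b.name = a.name);
```

If the name columns are declared NOT NULL, the two forms return the same rows.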
If you want to use set operators, here is a solution:
CREATE TABLE TABLE1(NAME VARCHAR2(100));
CREATE TABLE TABLE2(NAME VARCHAR2(100));
INSERT INTO TABLE1 VALUES('A');
INSERT INTO TABLE1 VALUES('B');
INSERT INTO TABLE1 VALUES('C');
INSERT INTO TABLE2 VALUES('A');
INSERT INTO TABLE2 VALUES('B');
INSERT INTO TABLE2 VALUES('D');
SELECT
NAME
FROM
(
SELECT
NAME
FROM
TABLE1
UNION
SELECT
NAME
FROM
TABLE2
)
WHERE
NAME NOT IN (
SELECT
NAME
FROM
TABLE1
JOIN TABLE2 USING ( NAME )
);
Cheers!!
Yet another possible solution:
Find those that are in the first table but not in the second table using the MINUS operator (which is Oracle's implementation of the standard EXCEPT). Then UNION that with those that are in the second but not in the first.
(
select name
from t1
minus
select name
from t2
)
union all
(
select name
from t2
minus
select name
from t1
);
Given this setup:
create table t1
(
name varchar(10)
);
insert into t1 values ('Arthur');
insert into t1 values ('Zaphod');
create table t2
(
name varchar(10)
);
insert into t2 values ('Tricia');
insert into t2 values ('Zaphod');
This returns:
NAME
------
Arthur
Tricia
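On databases that spell the operator EXCEPT instead of MINUS (SQL Server and PostgreSQL, for example), the same idea might look like the sketch below. It assumes the t1 and t2 tables from the setup above.

```sql
-- Names only in t1, followed by names only in t2.
(
    select name from t1
    except
    select name from t2
)
union all
(
    select name from t2
    except
    select name from t1
);
```

UNION ALL is enough here because EXCEPT already removes duplicates within each half and the two halves cannot overlap.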
select id from((select id from table1)
union all
(select id from table2)) as t1
group by id having count(id)=1
Using the basic set operations the following query should work.
(
select name from table1
union all
select name from table2
)
minus
(
select name from table1
intersect
select name from table2
)
;
Regards
Akash

SQL Server : self join - retrieve records matching the where clause as well as when no records in alias table

I have a table like below:
Query:
select t1.*
from TABLE2 as t1
left join TABLE2 as t2 on t1.itemcode = t2.itemcode
and t1.warehouseid = '576'
and t1.flag = 'Y'
and t2.warehouseid = '276'
and t2.flag = 'Y';
I have the above query and understand this is not perfect.
For an itemcode, if these conditions are met (t1.warehouseid='576' and t1.flag='Y' and t2.warehouseid='276' and t2.flag='Y') I want to retrieve that from t1.
Also, If there is no entry for an itemcode in t2 (Ex: 456 is not available for warehouseid 276), that also I want to retrieve from t1.
Expecting the following output,
123 576 Y
456 576 Y
What is the correct query for this?
Edit:
To make the post more clear,
Warehouse id 576 is the main element.
For an itemcode, present in both warehouse id (576 , 276) with the flags being same ('Y') , I want to retrieve.
And If the itemcode is not in the other warehouse (276), that also I want to retrieve
For an itemcode, present in both warehouse id (576 , 276) with different flags ('Y' , 'N') , I don't want that.
This is a direct translation of your two conditions into the WHERE clause:
select *
from TABLE2 t
where warehouseid = 576
and (
exists -- condition 1
(
select *
from TABLE2 x
where x.itemcode = t.itemcode
and x.warehouseid = 276
and x.flag = 'Y'
)
or not exists -- condition 2
(
select *
from TABLE2 x
where x.itemcode = t.itemcode
and x.warehouseid = 276
)
)
Hope this will work for you
select t1.* from TABLE2 as t1
left join TABLE2 as t2
on t1.itemcode=t2.itemcode and t2.warehouseid='276' and t2.flag='Y'
where
t1.warehouseid='576' and t1.flag='Y'
There is another approach using row_number()
with cte as
(select t1.*,
row_number() over(partition by itemcode order by warehouseid desc
) rn
from TABLE2 t1
where not exists ( select 1 from TABLE2 t2 where t1.itemcode=t2.itemcode
and t2.flag='N'
) and t1.warehouseid=576
) select * from cte where rn=1
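Going by the clarification in the question's edit, a LEFT JOIN can also express this if the flag check on the 276 row moves into the WHERE clause. A rough sketch, assuming at most one row per itemcode and warehouse, and that warehouseid is compared as a string as in the question's own query:

```sql
SELECT t1.*
FROM TABLE2 AS t1
LEFT JOIN TABLE2 AS t2
    ON  t2.itemcode = t1.itemcode
    AND t2.warehouseid = '276'          -- look up the matching 276 row, whatever its flag
WHERE t1.warehouseid = '576'
  AND t1.flag = 'Y'
  AND (t2.itemcode IS NULL              -- no 276 row at all
       OR t2.flag = 'Y');               -- or the 276 row is also flagged 'Y'
```

Keeping t2.flag out of the ON clause matters: with the flag test inside ON, a 276 row flagged 'N' looks the same as a missing row, so the item would be returned even though it should be excluded.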

Tsql select from related table with AND condition

I've two related tables:
Table1
Id
-----
1
2
3
Table2
Id Feature
--------------
1 Car
1 Moto
1 Camper
2 Moto
2 Scooter
3 Apple
I want to select the Ids which have, for example, both 'Car' AND 'Moto'.
So in the example I want to get only Id = 1.
Use the INTERSECT operator:
select id from table2 where feature = 'Car'
intersect
select id from table2 where feature = 'Moto'
This:
WITH features AS
(
SELECT feature
FROM (
VALUES
('Car'),
('Moto')
) q (feature)
)
SELECT *
FROM table1 t1
WHERE NOT EXISTS
(
SELECT feature
FROM features
EXCEPT
SELECT feature
FROM table2 t2
WHERE t2.id = t1.id
)
or this:
SELECT *
FROM table1 t1
WHERE (
SELECT COUNT(*)
FROM table2 t2
WHERE t2.id = t1.id
AND t2.feature IN ('Car', 'Moto')
) = 2
Which query is more efficient depends on how many records you have in both tables and how many matches there are.
This select starts from the 'Car' rows of table2, does a LEFT OUTER JOIN back to table2 for 'Moto', and makes sure that the join returned a result. The DISTINCT ensures that you get each ID only once.
SELECT DISTINCT t2.id
FROM table2 t2
LEFT OUTER JOIN table2 t2_2 ON t2.id = t2_2.id AND t2_2.feature = 'Moto'
WHERE t2.feature = 'Car'
AND t2_2.id IS NOT NULL
Edit: Removed join to table1 since it really isn't needed.
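Another common pattern for this kind of "has all of these features" question is to filter to the wanted features and count them per Id. A minimal sketch, assuming the table2 layout from the question:

```sql
SELECT t2.id
FROM table2 t2
WHERE t2.feature IN ('Car', 'Moto')
GROUP BY t2.id
-- 2 is the number of features being asked for.
HAVING COUNT(DISTINCT t2.feature) = 2;
```

COUNT(DISTINCT ...) keeps the result correct even if the same (Id, Feature) pair is stored more than once; with the sample data it returns only Id = 1.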

Select distinct rows that contain a given set of data

I have the following table:
bid | data
1 | a
1 | b
1 | c
2 | a
3 | c
3 | a
I want to select all bids that contain a given set of data.
For example, all bids that contain data "a" and "b" (the result should be bid 1), or ones that contain "a" and "c" (1 and 3).
The only solution I could think of is kind of nasty, so I would appreciate some help/suggestions.
My first try:
select bid from my_table as t1 where
exists (select * from my_table t2 where
t2.bid = t1.bid and
t2.data='a'
)
and
exists (select * from my_table t2 where
t2.bid = t1.bid and
t2.data='b'
)
group by bid;
Thanks.
select t1.bid
from table_1 t1
inner join table_1 t2 on t1.bid = t2.bid
where t1.data = 'a' and t2.data = 'c'
By the way:
all bids that 'contains' data "a" and "b" (result should be bid 1)
--> bid 2 also contains data 'a' and 'b'
While I would not recommend this solution for only two lookup values, its query cost grows very slowly as you match on more values, as opposed to doing an inner join for each match. As a disclaimer, I realize that this breaks if the pipe character is a valid value in the field or if there are XML-encoded characters.
select e.bid
from myTable e
cross apply ( select '|'+ i.data + '|'
from myTable i
where e.bid = i.bid
for xml path('')) T(v)
where v like '%|A|%' and v like '%|B|%' --and v like '%|C|%'.....
group by e.bid
As a side note about other options, your answer could be simplified to:
select bid from my_table as t1 where
exists (select * from my_table t2 where
t2.bid = t1.bid and
t2.data='a'
)
and t1.data = 'c'
group by bid;
This is roughly an equivalent of christian's answer. The optimizer will most likely treat these the same.
select distinct t1.bid
from table_1 t1
inner join table_1 t2 on t1.bid = t2.bid
where t1.data = 'a' and t2.data = 'c'
With a subquery, count the number of matching occurrences you have in your table.
SELECT DISTINCT m.bid
FROM myTable m
WHERE (
SELECT COUNT(1)
FROM myTable m2
WHERE (m2.data = 'a'
OR m2.data = 'b')
AND m.bid = m2.bid
) = 2
Maybe not the best answer but:
select bid from mytable where data = 'a'
intersect
select bid from mytable where data = 'c'
Uses exists:
declare @t table(bid int, data char)
insert @t values(1,'a'),(1,'b'),(1,'c'),(2,'b'),(2,'a'),(3,'c'),(3,'a')
select distinct t1.bid
from @t t1
where exists(
select 1
from @t t2
where t2.bid = t1.bid and t2.data = 'a'
)
and exists(
select 1
from @t t2
where t2.bid = t1.bid and t2.data = 'b'
)
XML PATH and XQuery version:
select distinct t.bid
from
(
select *
, (
select *
from @t t2
where t2.bid = t1.bid
for xml path, root('root'), type
) [x]
from @t t1
) t
where t.x.exist('root[*/data[text() = "a"] and */data[. = "b"]]') = 1