Full outer join without on condition - sql

New to SQL.
I have two SQL tables, T1 and T2, which look like the following:
T1
customer_key X1 X2 X3
1000 60 10 2018-02-01
1001 42 9 2018-02-01
1002 03 1 2018-02-01
1005 15 1 2018-02-01
1002 32 2 2018-02-05
T2
customer_key A1 A2 A3
1001 20 2 2018-02-17
1002 25 2 2018-02-11
1005 04 1 2018-02-17
1009 02 0 2018-02-17
I want to get T3 as shown below by joining T1 and T2 and filtering on T1.X3 = '2018-02-01'
and T2.A3 = '2018-02-17'
T3
customer_key X1 X2
1000 60 10
1001 42 9
1005 15 1
1009 null null
I tried doing full outer join in the following way
create table T3
AS
select T1.customer_key, T3.customer_key, T1.X1, T1.X2
from T1
full outer join T2
on T1.Customer_key = T2.customer_key
where T1.X3 = '2018-02-01' and T2.A3 = '2018-02-17'
It returns fewer rows than the total number of records that satisfy the where clause. Please advise.

Full outer join with filtering is just confusing. I recommend filtering in subqueries:
select T1.customer_key, T2.customer_key, T1.X1, T1.X2
from (select t1.*
from T1
where T1.X3 = '2018-02-01'
) t1 full outer join
(select t2.*
from T2
where T2.A3 = '2018-02-17'
) t2
on T1.Customer_key = T2.customer_key ;
Your filter turns the outer join into an inner join. Moving the conditions to the on clause returns all rows in both tables -- but generally with lots of null values. Using (T1.X3 = '2018-02-01' or t1.X3 is null) and (T2.A3 = '2018-02-17' or T2.A3 is null) doesn't quite do the right thing either. Filtering first is what you are looking for.
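To see the difference on the question's data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite versions before 3.39 have no FULL OUTER JOIN, so this sketch emulates it with a LEFT JOIN plus a UNION ALL of the unmatched right-hand rows; that emulation is my assumption for portability, not part of the answer above. Note that customer 1002 shows up in the filter-first result because it has a T1 row dated 2018-02-01, even though the question's T3 leaves it out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (customer_key INTEGER, X1 INTEGER, X2 INTEGER, X3 TEXT);
CREATE TABLE T2 (customer_key INTEGER, A1 INTEGER, A2 INTEGER, A3 TEXT);
INSERT INTO T1 VALUES (1000, 60, 10, '2018-02-01'), (1001, 42, 9, '2018-02-01'),
                      (1002, 3, 1, '2018-02-01'), (1005, 15, 1, '2018-02-01'),
                      (1002, 32, 2, '2018-02-05');
INSERT INTO T2 VALUES (1001, 20, 2, '2018-02-17'), (1002, 25, 2, '2018-02-11'),
                      (1005, 4, 1, '2018-02-17'), (1009, 2, 0, '2018-02-17');
""")

# Filtering in WHERE after an outer join: rows padded with NULLs fail the
# T2.A3 comparison, so only keys matched on both sides survive.
where_after = conn.execute("""
    SELECT T1.customer_key
    FROM T1 LEFT JOIN T2 ON T1.customer_key = T2.customer_key
    WHERE T1.X3 = '2018-02-01' AND T2.A3 = '2018-02-17'
    ORDER BY T1.customer_key
""").fetchall()

# Filtering first, then joining. The UNION ALL branch adds the filtered T2
# rows that have no partner in the filtered T1 (emulated full outer join).
filter_first = conn.execute("""
    WITH f1 AS (SELECT * FROM T1 WHERE X3 = '2018-02-01'),
         f2 AS (SELECT * FROM T2 WHERE A3 = '2018-02-17')
    SELECT f1.customer_key, f1.X1, f1.X2
    FROM f1 LEFT JOIN f2 ON f1.customer_key = f2.customer_key
    UNION ALL
    SELECT f2.customer_key, NULL, NULL
    FROM f2 LEFT JOIN f1 ON f1.customer_key = f2.customer_key
    WHERE f1.customer_key IS NULL
    ORDER BY 1
""").fetchall()

print(where_after)
print(filter_first)
```

On an engine with native FULL OUTER JOIN, the subquery form shown in the answer should do the same job without the UNION ALL workaround.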

When you join tables via FULL OUTER JOIN, the query engine finds all matched records (as an INNER JOIN would) and also adds any unmatched records from both tables to the JOIN result set. The latter have NULLs in all columns for the other table.
So for example, customer = 1000 in T1 will be included in the JOIN result although there is no such customer in T2, but all columns from T2 will be NULL (because of FULL OUTER). Then, when you apply a WHERE clause to these records (WHERE is executed after JOIN operations), your result set will EXCLUDE customer = 1000 because T2.A3 = '2018-02-17' will fail (T2.A3 is NULL for customer = 1000).
I couldn't provide a query for your question because your explanation does not say what you are trying to accomplish, and I couldn't make sense of your result set: why is customer = 1000 included but not customer = 1002, for example? Please describe what you're trying to accomplish.
Depending on what you're trying to accomplish, you can move part of your WHERE clause to JOIN or use filters like T1.customer is NULL / T1.customer is NOT NULL to identify cases where records didn't match / did match and filter exactly what you need.
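A small sketch of the NULL behaviour described above, run through Python's sqlite3 module. The two-column tables are a trimmed-down, hypothetical version of the question's T1 and T2, and a LEFT JOIN stands in for the outer join so that it runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (customer_key INTEGER, X3 TEXT);
CREATE TABLE T2 (customer_key INTEGER, A3 TEXT);
INSERT INTO T1 VALUES (1000, '2018-02-01'), (1001, '2018-02-01');
INSERT INTO T2 VALUES (1001, '2018-02-17');
""")

# Comparing NULL with a value yields NULL, which WHERE treats as "not true".
null_cmp = conn.execute("SELECT NULL = '2018-02-17'").fetchone()[0]

# Customer 1000 has no partner in T2, so its T2 columns come back as NULL.
joined = conn.execute("""
    SELECT T1.customer_key, T2.A3
    FROM T1 LEFT JOIN T2 ON T1.customer_key = T2.customer_key
    ORDER BY T1.customer_key
""").fetchall()

# An IS NULL test on the other table's key picks out the unmatched rows.
unmatched = conn.execute("""
    SELECT T1.customer_key
    FROM T1 LEFT JOIN T2 ON T1.customer_key = T2.customer_key
    WHERE T2.customer_key IS NULL
""").fetchall()

print(null_cmp, joined, unmatched)
```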

Related

Full outer merge of more than two tables in ms sql server to prepare desired table as shown in the example

I have 3 tables as shown below.
t1:
id runs
001 1200
020 600
123 1500
t2:
id wickets
008 4
030 7
123 0
020 6
t3:
id catches
007 4
030
123 2
040 6
I would like to perform a FULL OUTER JOIN of all three tables and prepare the output as shown below.
Expected output:
id   runs  wickets  catches
001  1200
020  600   6
123  1500  0        2
008        4
030        7
007                 4
040                 6
I tried the code below and it did not work.
SELECT *
FROM t1
FULL OUTER JOIN t2
ON t1.id = t2.id
FULL OUTER JOIN t2.id = t3.id
I did the same using pandas using following code and it worked well.
import pandas as pd
from functools import reduce
dfl = [t1, t2, t3]
df_merged = reduce(lambda left, right: pd.merge(left, right, on=['id'], how='outer'), dfl)
You can select the expected columns you want to obtain from each table:
SELECT coalesce(t1.id,t2.id, t3.id), t1.runs, t2.wickets, t3.catches
FROM t1
FULL OUTER JOIN t2 ON t1.id = t2.id
FULL OUTER JOIN t3 ON COALESCE(t1.id, t2.id) = t3.id
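If your engine lacks FULL OUTER JOIN, one equivalent formulation is to collect every id with UNION and then LEFT JOIN each table onto that list. This is a swapped-in technique rather than the chained full join above, sketched here with Python's sqlite3 module; it assumes each id appears at most once per table, and ids are stored as text to keep the leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id TEXT, runs INTEGER);
CREATE TABLE t2 (id TEXT, wickets INTEGER);
CREATE TABLE t3 (id TEXT, catches INTEGER);
INSERT INTO t1 VALUES ('001', 1200), ('020', 600), ('123', 1500);
INSERT INTO t2 VALUES ('008', 4), ('030', 7), ('123', 0), ('020', 6);
INSERT INTO t3 VALUES ('007', 4), ('030', NULL), ('123', 2), ('040', 6);
""")

# UNION (not UNION ALL) removes duplicate ids, giving one row per player.
rows = conn.execute("""
    SELECT ids.id, t1.runs, t2.wickets, t3.catches
    FROM (SELECT id FROM t1 UNION SELECT id FROM t2 UNION SELECT id FROM t3) AS ids
    LEFT JOIN t1 ON t1.id = ids.id
    LEFT JOIN t2 ON t2.id = ids.id
    LEFT JOIN t3 ON t3.id = ids.id
    ORDER BY ids.id
""").fetchall()

for row in rows:
    print(row)
```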
You could use UNION ALL in a subquery and then GROUP BY.
This would give you zeros where there is no value. If this is a problem we could modify the presentation.
If you have another table with all the players, we could use LEFT JOIN onto the three tables with
WHERE runs <> '' OR wickets <> '' OR catches <> ''
select
id,
sum(runs) "runs",
sum(wickets) "wickets",
sum(catches) "catches"
from (
select id, runs, 0 wickets, 0 catches from t1
union all
select id, 0, wickets, 0 from t2
union all
select id, 0, 0, catches from t3
) t
group by id
order by id;
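Here is a runnable sketch of the UNION ALL plus GROUP BY idea on the question's data, using Python's sqlite3 module. The second query is my own variation, not from the answer: it pads with NULL instead of 0, so that a player with no value in a table gets NULL rather than a zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id TEXT, runs INTEGER);
CREATE TABLE t2 (id TEXT, wickets INTEGER);
CREATE TABLE t3 (id TEXT, catches INTEGER);
INSERT INTO t1 VALUES ('001', 1200), ('020', 600), ('123', 1500);
INSERT INTO t2 VALUES ('008', 4), ('030', 7), ('123', 0), ('020', 6);
INSERT INTO t3 VALUES ('007', 4), ('030', NULL), ('123', 2), ('040', 6);
""")

QUERY = """
    SELECT id, SUM(runs) AS runs, SUM(wickets) AS wickets, SUM(catches) AS catches
    FROM (
        SELECT id, runs, {pad} AS wickets, {pad} AS catches FROM t1
        UNION ALL
        SELECT id, {pad}, wickets, {pad} FROM t2
        UNION ALL
        SELECT id, {pad}, {pad}, catches FROM t3
    ) AS u
    GROUP BY id
    ORDER BY id
"""

# Padding with 0: missing values show up as zeros.
zero_padded = conn.execute(QUERY.format(pad="0")).fetchall()

# Padding with NULL: SUM over only NULLs is NULL, so gaps stay empty.
null_padded = conn.execute(QUERY.format(pad="NULL")).fetchall()

print(zero_padded)
print(null_padded)
```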

Join two tables with switch case in order to avoid one to many join

I have two tables, t1 and t2.
Table t1:
Name address id
---- ------- --
rob 32 cgr 12
mary 31 lmo 42
tom axel St 2
Table t2:
ID Flag expense
-- ---- --------
12 Shop 1200
12 Educ 14000
42 educ 4000
Now I have to create a table with the attributes from t1 plus two more attributes: the expense in shop and the expense in educ.
Table t3
Name address id Shop_ex Educ_ex
---- ------- -- ------- -------
rob 32 cgr 12 1200 14000
mary 31 lmo 42 NULL 4000
tom axel st 2 NULL NULL
How to accomplish this?
I tried doing a left join to t2 with a CASE expression, but it gives me multiple records because the join becomes one-to-many.
select
t1.name, t1.address, t1.id,
case
when t2.flag = 'shop' then t2.expense
else null
end as shop_ex,
case
when t2.flag = 'educ' then t2.expense
else null
end as educ_ex
from
t1
left join
t2 on (t1.id = t2.id)
It seems I will have to convert t2 table first before joining, to have a single record on the basis of flag. But I am not sure how to do that.
Please keep in mind that the tables are huge, so an optimized query would be nice.
Please suggest.
You only need to join the first table to the second one, twice:
SELECT t1.Name, t1.address, t1.id, t2a.expense AS Shop_ex, t2b.expense AS Educ_ex
FROM table1 t1
LEFT JOIN table2 t2a
ON t2a.ID = t1.id AND t2a.Flag = 'Shop'
LEFT JOIN table2 t2b
ON t2b.ID = t1.id AND t2b.Flag = 'Educ'
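As a sketch of the double-join idea on the question's sample rows, here it is run through Python's sqlite3 module. Two things are my assumptions: the tables are named t1 and t2 as in the question, and the flag is compared with LOWER() because the sample data mixes 'Educ' and 'educ' and SQLite compares text case-sensitively by default. It also assumes at most one row per (ID, Flag) pair; otherwise the join would fan out again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (Name TEXT, address TEXT, id INTEGER);
CREATE TABLE t2 (ID INTEGER, Flag TEXT, expense INTEGER);
INSERT INTO t1 VALUES ('rob', '32 cgr', 12), ('mary', '31 lmo', 42), ('tom', 'axel St', 2);
INSERT INTO t2 VALUES (12, 'Shop', 1200), (12, 'Educ', 14000), (42, 'educ', 4000);
""")

# Each LEFT JOIN sees only one flag, so every t1 row matches at most one
# t2 row per join and the result stays at one row per person.
rows = conn.execute("""
    SELECT t1.Name, t1.address, t1.id, s.expense AS Shop_ex, e.expense AS Educ_ex
    FROM t1
    LEFT JOIN t2 AS s ON s.ID = t1.id AND LOWER(s.Flag) = 'shop'
    LEFT JOIN t2 AS e ON e.ID = t1.id AND LOWER(e.Flag) = 'educ'
    ORDER BY t1.Name
""").fetchall()

for row in rows:
    print(row)
```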

SQL join with complex case condition

I need to build a query to get the following details.
Table 1
Code Text
1 A
2 B
3 C
Table 2
Code Min Max
1 1.00 1.75
2 1.76 2.25
3 2.26 3.00
Table 3
Eid Value
1234 1.2
3456 2.56
I am looking at a query which gives me the following output in a single SQL query.
For each row in Table 3, the query should find the row in Table 2 whose Min and Max bracket the Value, take that row's Code, and look up the matching Text in Table 1.
Final Output
Eid Text
1234 A
3456 C
Here is a way to do this:
select t3.eid,t1.text
from t3
join t2
on t3.value between t2.min and t2.max
join t1
on t2.code=t1.code
You can try the below -
select eid,text
from table3 t3 inner join table2 t2 on t3.Value>=t2.min and t3.Value<=t2.max
inner join table1 t1 on t1.code=t2.code
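A quick way to check the range join against the sample data is Python's sqlite3 module. The table names table1, table2 and table3 follow the second answer, and the column names are quoted in this sketch only as a precaution, since Min, Max, Text and Value collide with function or type names in some engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Code INTEGER, "Text" TEXT);
CREATE TABLE table2 (Code INTEGER, "Min" REAL, "Max" REAL);
CREATE TABLE table3 (Eid INTEGER, "Value" REAL);
INSERT INTO table1 VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO table2 VALUES (1, 1.00, 1.75), (2, 1.76, 2.25), (3, 2.26, 3.00);
INSERT INTO table3 VALUES (1234, 1.2), (3456, 2.56);
""")

# The join condition is a range test rather than an equality test.
rows = conn.execute("""
    SELECT t3.Eid, t1."Text"
    FROM table3 AS t3
    JOIN table2 AS t2 ON t3."Value" BETWEEN t2."Min" AND t2."Max"
    JOIN table1 AS t1 ON t1.Code = t2.Code
    ORDER BY t3.Eid
""").fetchall()

print(rows)
```

Because these are inner joins, a value that falls in a gap between ranges (for example 1.755, between 1.75 and 1.76) would match no row in Table 2 and drop out of the result.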

How to select rows not present in another table with 2 columns comparison?

I have the following tables:
T1         T2         Desired result
CA  CB     CA  CC     CA  CB
1   2      1   3      1   4
1   4      1   2      2   1
1   3      1   5      2   3
2   1      2   4
2   3
3   6
3   1
4   ...
I need to join T1 and T2 (using column CA) and return only those rows whose CB values do not exist in T2.CC.
A simple way to achieve that is using the following query:
SELECT T1.* FROM T1 INNER JOIN T2 ON t1.CA = t2.CA AND
t1.CB NOT IN (SELECT CC FROM T2 WHERE T2.CA = T1.CA)
I think the previous query is not very efficient, so I am looking for something better.
Any help will be appreciated.
Generally, a more efficient means of achieving this sort of result is finding records which fail a simpler join condition. Those can be found by doing an outer join and checking for null, as follows:
select t1.ca, t1.cb
from t1 left outer join t2 on t1.ca=t2.ca and t1.cb=t2.cc
where t2.ca is null;
I think you just want not exists:
select t1.*
from t1
where not exists (select 1
from t2
where t2.ca = t1.ca and t2.cc = t1.cb
);
For performance, you want an index on t2(ca, cc).
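Here is a sketch of the NOT EXISTS approach on the question's rows, via Python's sqlite3 module (the elided "4 ..." row is left out, and the index name is hypothetical). One thing worth noticing: NOT EXISTS on its own also returns the CA = 3 rows, since T2 has no CA = 3 at all. If those should be excluded, as the desired result in the question suggests, an extra EXISTS on CA is one way to do it; that extra condition is my assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (CA INTEGER, CB INTEGER);
CREATE TABLE T2 (CA INTEGER, CC INTEGER);
INSERT INTO T1 VALUES (1, 2), (1, 4), (1, 3), (2, 1), (2, 3), (3, 6), (3, 1);
INSERT INTO T2 VALUES (1, 3), (1, 2), (1, 5), (2, 4);
-- Composite index covering both columns used in the correlated lookup.
CREATE INDEX idx_t2_ca_cc ON T2 (CA, CC);
""")

# Plain anti-join: T1 rows with no T2 row having the same CA and CC = CB.
anti = conn.execute("""
    SELECT T1.CA, T1.CB
    FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM T2 WHERE T2.CA = T1.CA AND T2.CC = T1.CB)
    ORDER BY T1.CA, T1.CB
""").fetchall()

# Same anti-join, restricted to CA values that occur in T2 at all.
anti_with_ca = conn.execute("""
    SELECT T1.CA, T1.CB
    FROM T1
    WHERE EXISTS (SELECT 1 FROM T2 WHERE T2.CA = T1.CA)
      AND NOT EXISTS (SELECT 1 FROM T2 WHERE T2.CA = T1.CA AND T2.CC = T1.CB)
    ORDER BY T1.CA, T1.CB
""").fetchall()

print(anti)
print(anti_with_ca)
```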

How to left join on two tables on just unique ids

I have two tables
Table 1:
color_id | label
---------|------
2 | 0
3 | 0
2 | 0
1 | 0
4 | 1
4 | 1
5 | 0
Table 2:
color_id
--------
2
1
4
I want a query that just gives me results for color_ids that are present in Table 2
So, I wrote:
SELECT *
FROM table1
LEFT JOIN table2
ON table1.color_id = table2.color_id
WHERE table2.color_id IS NOT NULL
However, the above returns duplicates as well. That is, I get:
2 | 0
2 | 0
1 | 0
4 | 1
4 | 1
I don't want the duplicates in the results. I just want unique items.
I want a query that just gives me results for color_ids that are present in Table 2
So, you shouldn't use LEFT JOIN in this case:
SELECT DISTINCT a.color_id, a.label
FROM table_1 a JOIN table_2 b
ON a.color_id = b.color_id
When you add the keyword Left (or Right or full) to a join specifier, you make the join an outer join. This means that you get all the rows from one side of the join, and only those rows from the other side that match. If you only want the rows from table_1 where the color_id is in table_2, then you want an inner join, specified by writing inner join or just writing join, without a left, right or full.
To eliminate duplicates, add the keyword distinct to the select clause:
Select distinct t1.color_id, t1.label
From table1 t1
join table2 t2
on t2.color_id = t1.color_id
Try the below query
SELECT DISTINCT color_id
FROM table_1 T1
WHERE EXISTS (SELECT 1 FROM table_2 T2 where T1.color_id = T2.color_id)
Use an inner join and a distinct clause:
SELECT DISTINCT table1.color_id, table1.label
FROM table1
INNER JOIN table2
ON table1.color_id = table2.color_id
What you are looking for is an INNER JOIN combined with DISTINCT:
SELECT distinct table1.color_id, table1.label
FROM table1
INNER JOIN table2 ON table1.color_id = table2.color_id
This eliminates any item in table1 not present in table 2 and duplicated rows.
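To check this against the question's data, here is a minimal sketch with Python's sqlite3 module, using the question's table names table1 and table2. It runs the inner join with and without DISTINCT so the effect of each part is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (color_id INTEGER, label INTEGER);
CREATE TABLE table2 (color_id INTEGER);
INSERT INTO table1 VALUES (2, 0), (3, 0), (2, 0), (1, 0), (4, 1), (4, 1), (5, 0);
INSERT INTO table2 VALUES (2), (1), (4);
""")

# Inner join alone drops color_ids 3 and 5 but keeps the repeated rows.
with_dupes = conn.execute("""
    SELECT table1.color_id, table1.label
    FROM table1
    INNER JOIN table2 ON table1.color_id = table2.color_id
    ORDER BY table1.color_id
""").fetchall()

# DISTINCT collapses identical (color_id, label) pairs into one row each.
distinct_rows = conn.execute("""
    SELECT DISTINCT table1.color_id, table1.label
    FROM table1
    INNER JOIN table2 ON table1.color_id = table2.color_id
    ORDER BY table1.color_id
""").fetchall()

print(with_dupes)
print(distinct_rows)
```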
The reason is that you used LEFT JOIN, which keeps all rows from table1.
Try this:
SELECT table1.* FROM table1 Inner JOIN table2 ON table1.color_id = table2.color_id
This should work, since all table2 rows are in table1. More generally, if table2 has rows that are not in table1 and you do want to keep them, replace inner join with right join.