LEFT JOIN on 3 tables to get a value - sql

I'm trying to create an new interface for a database but I don't know how to do what I want.
I have 3 tables :
- table1(id1, time, ...)
id11 ..
id12 ..
id13 ..
- table2(id2, price, ...)
id21 ..
id22 ..
id23 ..
- table1_table2(#id1, #id2, value)
id11, id22, 6
id11, id23, 10
id13, id22, 5
So I want to have something like this :
id11, id21, 0
id11, id22, 6
id11, id23, 10
id12, id21, 0
id12, id22, 0
id12, id23, 0
id13, id21, 0
id13, id22, 5
id13, id23, 0
I've tried lots of requests but nothing efficient..
Please, help me ^^
EDIT : I'm using Access ( :'( ) 2007, and apparently, it doesn't support CROSS JOIN...
I tried to use this : http://blog.jooq.org/2014/02/12/no-cross-join-in-ms-access/
but still have a syntax error on the JOIN or the FROM..
EDIT 2 : Here is my query (I'm french, so don't take care of names please ^^)
SELECT Chantier.id_chantier, Indicateur.id_indicateur, Indicateur_chantier.valeur
FROM ((Chantier INNER JOIN Indicateur ON (Chantier.id_chantier*0 = Indicateur.id_indicateur*0))
LEFT JOIN Indicateur_chantier ON ( (Chantier.id_chantier = Indicateur_chantier.id_chantier)
AND (Indicateur.id_indicateur = Indicateur_chantier.id_indicateur) ) )

You should first cross join table1 and table2 to produce their Cartesian product and the left join to get the values where exist :
SELECT t1.id1,t2.id2,ISNULL(t12.value,0)
FROM table1 t1
CROSS JOIN table2 t2
LEFT JOIN table1_table2 t12 on t12.id1=t.id1 and t12.id2=t2.id2
Finally use ISNULL to replace null values with zeros.

Answer may vary by database, this works in SQL Server, you need a CROSS JOIN to get every combination of table1 and table2, then a LEFT JOIN to return pairs with values:
SELECT a.id1, b.id2, COALESCE(c.value,0)
FROM table1 a
CROSS JOIN table2 b
LEFT JOIN table3 c
ON a.id1 = c.id1
AND b.id2 = c.id2
Pairs without values would return NULL, so you can use COALESCE() to return 0 instead.
Demo: SQL Fiddle

In your question you say that Access "doesn't support CROSS JOIN". While it is true that Access SQL does not support
... FROM tableX CROSS JOIN tableY ...
you can perform a cross join in Access by simply using
... FROM tableX, tableY ...
In your case,
SELECT
crossjoin.id1,
crossjoin.id2,
Nz(table1_table2.value, 0) AS [value]
FROM
(
SELECT table1.id1, table2.id2
FROM table1, table2
) AS crossjoin
LEFT JOIN
table1_table2
ON table1_table2.id1 = crossjoin.id1
AND table1_table2.id2 = crossjoin.id2
ORDER BY crossjoin.id1, crossjoin.id2

Related

DISTINCT in JOIN

I have a question in Oracle SQL.
To simplify my problem, let's say I have two tables:
TAB1: TAB2:
Usr Fruit Fruit Calories
1 A A 100
1 A B 200
1 A C 150
1 C D 400
1 C E 50
2 A
2 A
2 E
It's important that there are double entries in TAB1.
Now I want to know the calories for usr 1. But by joining both tables
SELECT TAB2.calories from TAB1
JOIN TAB2 ON TAB1.Fruit = TAB2.Fruit
WHERE TAB1.Usr = 1;
I get double results for the double entries. I could of course use distinct in the header, but is there a possibility to distinct the values (to A and C) directly in the join? I am sure that would improve my (much larger) performance.
Thanks!
I'm a big fan of the semi-join. For tables this small, it won't matter, but for larger tables it can make a big difference:
select
tab2.calories
from tab2
where exists (
select null
from tab1
where tab1.fruit = tab2.fruit and tab1.usr = 1
)
Try as this:
SELECT TAB2.calories
from (select distinct usr, fruit from TAB1) as T1
JOIN TAB2 ON T1.Fruit = TAB2.Fruit
WHERE T1.Usr = 1;
You should do distinct before the join
select sum(tab2.calories) as TotalCalories
from (select distinct tab1.*
from tabl
) t1 join
tab2
on t1.fruit = tab2.fruit
where t1.user = 1;
Also, to add the values, use an aggregation function.
since you select nothing in tabA and maybe you have some usefull index, i'd go for an IN instead of the join
SELECT TAB2.calories
FROM TAB2
WHERE TAB2.Fruit IN ( SELECT TAB1.Fruit FROM TAB1 WHERE TAB1.Usr = 1)
i'm pretty sure this one will take longer but you can still try:
SELECT TAB2.calories
FROM TAB2
WHERE TAB2.Fruit IN ( SELECT DISTINCT TAB1.Fruit FROM TAB1 WHERE TAB1.Usr = 1)

SQL alternative to sub-query in SELECT Item list

I have RDBMS table and Queries which are working perfectly. I have offloaded data from RDBMS to HIVE table.To run the existing queries on HIVE, we need first to make them compatible to HIVE.
Let's take below example with sub-query in select item list. It is syntactically valid and working fine on RDBMS system. But It Will not work on HIVE As per HIVE manual , Hive supports subqueries only in the FROM and WHERE clause.
Example 1 :
SELECT t1.one
,(SELECT t2.two
FROM TEST2 t2
WHERE t1.one=t2.two) t21
,(SELECT t3.three
FROM TEST3 t3
WHERE t1.one=t3.three) t31
FROM TEST1 t1 ;
Example 2:
SELECT a.*
, CASE
WHEN EXISTS
(SELECT 1
FROM tblOrder O
INNER JOIN tblProduct P
ON O.Product_id = P.Product_id
WHERE O.customer_id = C.customer_id
AND P.Product_Type IN (2, 5, 6, 9)
)
THEN 1
ELSE 0
END AS My_Custom_Indicator
FROM tblCustomer C
INNER JOIN tblOtherStuff S
ON C.CustomerID = S.CustomerID ;
Example 3 :
Select component_location_id, component_type_code,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'AXLE'
and component_location_id = cl.component_location_id ) as AXLE,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'SIDE'
and component_location_id = cl.component_location_id ) as SIDE
from stg_dev.component_locations cl ;
I want to know the possible alternative of sub-queries in select item list to make it compatible to hive. Apparently I will be able to transform existing queries in HIVE format.
Any help and guidance is highly appreciated !
The query you provided could be transformed to a simple query with LEFT JOINs.
SELECT
t1.one, t2.two AS t21, t3.three AS t31
FROM
TEST1 t1
LEFT JOIN TEST2 t2
ON t1.one = t2.two
LEFT JOIN TEST3 t3
ON t1.one = t3.three
Since there is no limitation in the subqueries, the joins will return the same data. (The subqueries should return only one or no row for each row in TEST1.)
Please note, that your original query could not handle 1..n connections. In most DBMS, subqueries in the SELECT list should return only with a resultset with one columns and one or no row.
Based on HIVE manual:
SELECT t1.one, t2.two, t3.three
FROM TEST1 t1,TEST2 t2, TEST3 t3
WHERE t1.one=t2.two AND t1.one=t3.three;
SELECT t1.one,t2.two,t3.three FROM TEST1 t1 INNER
JOIN TEST2 t2 ON t1.one=t2.two INNER JOIN TEST3 t3
ON t1.one=t3.three WHERE t1.one=t2.two AND t1.one=t3.three;
SELECT t1.one,t2.two as t21,t3.three as t31 FROM TEST1 t1
INNER JOIN TEST2 t2 ON t1.one=t2.two
INNER JOIN TEST3 t3 ON t1.one=t3.three

Left outer join on multiple tables

I have the following sql statement:
select
a.desc
,sum(bdd.amount)
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination def a on (a.id =bds.sId)
where c.repId=1000000134
group by a.desc;
When I run it I get the following result:
desc amount
NW 12.00
SW 10
When I try to add another left outer join to get another set of values:
select
a.desc
,sum(bdd.amount)
,sum(i.amt)
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination def a on (a.id =bdd.sId)
left outer join t_ind i on (i.id=c.id)
where c.repId=1000000134
group by a.desc;
It basically doubles the amount field like:
desc amount amt
NW 24.00 234.00
SE 20.00 234.00
While result should be:
desc amount amt
NW 12.00 234.00
SE 10.00 NULL
How do I fix this?
If you really need to receive the data as you mentioned, your can use sub-queries to perform the needed calculations. In this case you code may looks like the following:
select x.[desc], x.amount, y.amt
from
(
select
c.[desc]
, sum (bdd.amount) as amount
, c.id
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination_def bdd on (bdd.id = bds.sId)
where c.repId=1000000134
group by c.id, c.[desc]
) x
left join
(
select t.id, sum (t.amt) as amt
from t_ind t
inner join t_main c
on t.id = c.id
where c.repID = 1000000134
group by t.id
) y
on x.id = y.id
In the first sub-select you will receive the aggregated data for the two first columns: desc and amount, grouped as you need.
The second select will return the needed amt value for each id of the first set.
Left join between those results will gives the needed result. The addition of the t_main table to the second select was done because of performance issues.
Another solution can be the following:
select
c.[desc]
, sum (bdd.amount) as amount
, amt = (select sum (amt) from t_ind where id = c.id)
from #t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination_def bdd on (bdd.id = bds.sId)
where c.repId = 1000000134
group by c.id, c.[desc]
The result will be the same. Basically, instead of using of nested selects the calculating of the amt sum is performing inline per each row of the result joins. In case of large tables the performance of the second solution will be worse that the first one.
Your new left outer join is forcing some rows to be returned in the result set a few times due to multiple relations most likely. Remove your SUM and just review the returned rows and work out exactly which ones you require (maybe restrict it to on certain type of t_ind record if that is applicable??), then adjust your query accordingly.
Left Outer Join - Driving Table Row Count
A left outer join may return more rows than there are in the driving table if there are multiple matches on the join clause.
Using MS SQL-Server:
DECLARE #t1 TABLE ( id INT )
INSERT INTO #t1 VALUES ( 1 ),( 2 ),( 3 ),( 4 ),( 5 );
DECLARE #t2 TABLE ( id INT )
INSERT INTO #t2 VALUES ( 2 ),( 2 ),( 3 ),( 10 ),( 11 ),( 12 );
SELECT * FROM #t1 t1
LEFT OUTER JOIN #t2 t2 ON t2.id = t1.id
This gives:
1 NULL
2 2
2 2
3 3
4 NULL
5 NULL
There are 5 rows in the driving table (t1), but 6 rows are returned because there are multiple matches for id 2.
So if an aggregate function is used, eg SUM() etc, grouped by the driving table column(s), this will give the wrong results.
To fix this, use derived tables or sub-queries to calculate the aggregate values, as already stated.
Left Outer Join - Multiple Tables
Where there are left outer joins over multiple tables, or any join for that matter, the query generates a series of derived tables in the order of joins.
SELECT * FROM t1
LEFT OUTER JOIN t2 ON t2.col2 = <...>
LEFT OUTER JOIN t3 ON t3.col3 = <...>
This is equivalent to:
SELECT * FROM
(
SELECT * FROM t1
LEFT OUTER JOIN t2 ON t2.col2 = <...>
) dt1
LEFT OUTER JOIN t3 ON t3.col3 = <...>
Here, for both queries, the results of the 1st left outer join are put into a derived table (dt1) which is then left outer joined to the 3rd table (t3).
For left outer joins over multiple tables, the order of the tables in the join clauses is critical.

Sql NOT IN optimization

I'm having trouble optimizing a query. Here are two example tables I am working with:
Table 1:
UID
A
B
Table 2:
UID Parent
A 2
B 2
C 3
D 2
E 3
F 2
Here is what I am doing now:
Select Table1.UID
FROM Table1 R
INNER JOIN Table2 T ON
R.UID = T.UID
INNER JOIN Table2 E ON
T.PARENT = E.PARENT
AND E.UID NOT IN (SELECT UID FROM Table1)
I'm trying to avoid using the NOT IN clause because of obvious hindrances in performance for large numbers of records.
I know the typical ways to avoid NOT IN clauses like the LEFT JOIN where the other table is null, but can't seem to get what I want with all of the other Joins going on.
I will continue working and post if I find a solution.
EDIT: Here is what I am trying to end up with
After the first Inner Join I would have
A
B
AFter the second Inner join I would have:
A D
A F
B D
B F
The second column above is just to represent that it is matching to the other UIDs with the same parent, but I still need the As and Bs as the UID.
EDIT: RDBMS is SQL server 2005, 2008r2, 2012
Table1 is declared in the query with no index
DECLARE #Table1 TABLE ( [UNIQUE_ID] INT PRIMARY KEY )
Table2 has a clustered index on Unique ID
The general approach to this is to use a LEFT JOIN with a where clause that only selects the non-matching rows:
Select Table1.UID
FROM Table1 R
JOIN Table2 T ON R.UID = T.UID
JOIN Table2 E ON T.PARENT = E.PARENT
LEFT JOIN Table3 E2 ON E.UID = R.UID
WHERE E2.UID IS NULL
SELECT Table2.*
FROM Table2
INNER JOIN (
SELECT id FROM Table2
EXCEPT
SELECT id FROM Table1
) AS Filter ON (Table2.id = Filter.id)

SQL joining 4 tables issue

I have four tables:
T1
ID ID1 TITLE
1 100 TITLE1
2 100 TITLE2
3 100 TITLE3
T2
ID TEXT
1 LONG1
2 LONG2
T3
ID1 ID2
100 200
T4
ID4 ID2 SUBJECT
1 200 A
2 200 B
3 200 C
4 200 D
5 200 E
I want output in this result format:
TITLE TEXT SUBJECT
TITLE1 LONG1 A
TITLE2 LONG2 B
TITLE3 null C
null null D
null null E
So I made this query but it gives me much more results than it should be.On example titles asre displayed more times than just once etc.
SELECT
t1.title,
t2.text,
t4.subject
FROM t1
LEFT OUTER JOIN t2 ON t1.id=t2.id
INNER JOIN t3 ON t1.id1=t3.id1
LEFT OUTER JOIN t4 ON t4.id2=t3.id2
WHERE
t1.id1=100
Thanks for help
Disclaimer: I don't work with DB2. After some browsing through documentation I have found that DB2 supports row_number() and full outer join, but I might easily be wrong.
To get rid of n:m relationship one has to build additional key. In this case simple solution is to add row number to each record in t1 and t4 and use it as join condition. Row_number does just that, produces numbers for groups of data defined by partition by in ascending sequence in order defined by order by.
As there is difference in number of records in t1 and t4, and it is unknown which one always has more records, I use full outer join to join them.
You can see the test (Sql Server version) # Sql Fiddle.
select t1_rn.title,
t2.[text],
t4_rn.subject
from
(
select t1.id,
t1.title,
t1.id1,
t3.id2,
row_number() over(partition by t1.id1
order by id) rn
from t1
inner join t3
on t1.id1 = t3.id1
) t1_rn
full outer join
(
select t4.subject,
t3.id1,
t4.id2,
row_number() over(partition by t4.id2
order by id4) rn
from t4
inner join t3
on t4.id2 = t3.id2
) t4_rn
on t1_rn.id1 = t4_rn.id1
and t1_rn.id2 = t4_rn.id2
and t1_rn.rn = t4_rn.rn
left join t2
on t1_rn.id = t2.id
This kind of work should definitely be done on presentation side of an application, but I believe that software you are using requires already prepared data.
try this :
select t1.title,t2.text,t4.subject
from t4
left join t3
on t4.id2=t3.id2
left join t1
on t1.id1=t3.id1
left join t2
on t1.id=t2.id
where t1.id=100
You should change your tables. Your last join does that to your output -just analyze your query. for every record from T1 you have every record from T4.
Outer joins are guaranteed to replicate rows, instead of matching only the ones you need. You may want to look at this:
http://blog.sqlauthority.com/2009/04/13/sql-server-introduction-to-joins-basic-of-joins/
To understand what the join types are, and how you can use them.
You are looking for a list of subjects, with associated text and title, but this may not be unique; more than one null exist for each of the titles. You want to drive the join from table 4, and get a list of subjects, with associated titles for each.
Looking at your ouput it appears you want all subjects displayed. Knowing this you should first off build everything off this table.
SELECT columns
FROM T4
Next build up your inner joins.
SELECT columns
FROM T4 subjectTable
INNER JOIN T3 mapTable
ON mapTable.ID2 = subjectTable.ID2
When happy with them, add on your optional columns with the outer join.
SELECT columns
FROM T4 subjectTable
INNER JOIN T3 mapTable
ON mapTable.ID2 = subjectTable.ID2
LEFT OUTER JOIN T2 textTable
ON textTable.ID = subjectTable.ID4
LEFT OUTER JOIN T1 titleTable
ON titleTable.ID1 = mapTable.ID1
WHERE
subjectTable.ID = 100;