SQL LEFT JOIN behaviour - sql

Trying to get a SQL LEFT JOIN to return NULLs where there are no corresponding rows in the other table.
Table 1 - T1
id n
1 aaa
2 bbb
3 ccc
Table 2 - T2
t1_id t3_id
1 1
2 1
3 1
1 2
3 2
2 3
3 3
In T2, note that there is no combination of 2 - 2 or 1 - 3.
select *
from t1
left join t2
on t1.id = t2.t1_id
order by t2.t3_id, t1_id
Output:
id n t1_id t3_id
1 aaa 1 1
2 bbb 2 1
3 ccc 3 1
1 aaa 1 2
3 ccc 3 2
2 bbb 2 3
3 ccc 3 3
I was expecting there to be two additional rows
1 aaa null null
2 bbb null null
...corresponding to the previously mentioned missing combinations in T2.
Note the ORDER BY is only there for convenience - it makes no difference to the rows returned.
Please help me understand why this is happening, and how to get around it.

If you want all rows in t2 in the result set, it should be the first table referenced in the left join:
select *
from t2 left join
t1
on t1.id = t2.t1_id
order by t2.t3_id, t2.t1_id;
EDIT:
You seem to want to generate rows that are not in the original data. Use a cross join to generate every combination and then a left join to bring in the matching rows:
select t3.t3_id, t1.id, t1.n
from (select distinct t2.t3_id from t2) as t3 cross join
t1 left join
t2
on t3.t3_id = t2.t3_id and t2.t1_id = t1.id
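To make the difference concrete, here is a minimal sketch that loads the question's sample data into an in-memory database. It assumes Python's built-in sqlite3 module as a stand-in engine (the question does not name one), and the extra t2.t1_id column in the second select is added only to show where the NULLs land.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, n TEXT);
    CREATE TABLE t2 (t1_id INTEGER, t3_id INTEGER);
    INSERT INTO t1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
    INSERT INTO t2 VALUES (1, 1), (2, 1), (3, 1), (1, 2), (3, 2), (2, 3), (3, 3);
""")

# Plain LEFT JOIN: every t1 row has at least one match in t2,
# so no NULL-padded rows are produced.
plain = conn.execute("""
    SELECT t1.id, t1.n, t2.t1_id, t2.t3_id
    FROM t1
    LEFT JOIN t2 ON t1.id = t2.t1_id
""").fetchall()

# CROSS JOIN builds all 3 x 3 (t3_id, id) pairs first;
# the LEFT JOIN then leaves NULL where t2 has no such pair.
filled = conn.execute("""
    SELECT t3.t3_id, t1.id, t1.n, t2.t1_id
    FROM (SELECT DISTINCT t3_id FROM t2) AS t3
    CROSS JOIN t1
    LEFT JOIN t2 ON t2.t3_id = t3.t3_id AND t2.t1_id = t1.id
    ORDER BY t3.t3_id, t1.id
""").fetchall()

# Keep only the combinations that had no matching t2 row.
missing = [(t3_id, t1_id, n) for t3_id, t1_id, n, matched in filled if matched is None]
print(len(plain), len(filled), missing)
```

The plain left join returns the seven matched rows only, while the cross join version returns nine rows, two of which (2 - 2 and 1 - 3) carry a NULL from t2.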

Alternatively, if you do want to reference t1 first in the FROM, you would use a RIGHT JOIN instead (personally I find these less intuitive but some do prefer them, and they do have their uses when more tables are involved):
SELECT *
FROM t1
RIGHT JOIN t2 ON t1.id = t2.t1_id
ORDER BY t2.t3_id,
t1_id;

table1:
-------------
| id | name |
-------------
| 1 | john |
-------------
| 2 | mark |
-------------
| 3 | will |
-------------
table2:
-----------------
| t1_id | t3_id |
-----------------
| 1 | 3 |
-----------------
| 1 | 2 |
-----------------
| 3 | 1 |
-----------------
If I want to get table1 through table2:
SELECT t1.*
FROM table2 as t2
LEFT JOIN table1 as t1
ON t1.id = t2.t1_id
WHERE 1
ORDER BY t1.id ASC;
you'll get:
-------------
| id | name |
-------------
| 1 | john |
-------------
| 1 | john |
-------------
| 3 | will |
-------------
edited:
The query returns every row in table2: (t1_id: 1, t3_id: 3), (t1_id: 1, t3_id: 2), (t1_id: 3, t3_id: 1). The left join then matches the id of t1 against t1_id in t2 for each of those rows and returns all columns of t1, because I selected t1.*.

Related

behavior of filters in outer join

I understand that a filter in the JOIN clause behaves differently from one in the WHERE clause when using an outer join. Let's say I have these 2 tables.
table1
id | value
---+------
1 | 11
2 | 12
table2
id | value
---+------
1 | 101
Now if I query for
select a.id as id1, a.value as value1, b.value as value2
from table1 as a
left join table2 as b on a.id=b.id and a.value=11
The result is this, an extra row with value1=12
id1 | value1 | value2
----+--------+--------
1 | 11 | 101
2 | 12 | NULL
However, if I put the filter in the where clause, it gives me what I want. Why does it behave like this?
The second condition in the ON clause of your left join example limits which rows will be considered for joining; it does not remove rows of the left table from the result.
select t1.id as id1, t1.value as value1, t2.value as value2
from t1
left join t2 on t1.id=t2.id AND T1.VALUE=11
t1
id | value
---+------
1 | 11 ONLY join on this row because t1.value=11
2 | 12
t2
id | value
---+------
1 | 101 this has t1.id=t2.id, so it does get joined
which would produce this final result:
id1 | value1 | value 2
----+--------+--------
1 | 11 | 101
2 | 12 | NULL
Moving the predicate T1.VALUE=11 to the where clause produces a different series of events, as follows:
select t1.id as id1, t1.value as value1, t2.value as value2
from t1
left join t2 on t1.id=t2.id
WHERE T1.VALUE=11
t1
id | value
---+------
1 | 11 this row does meet t1.id=t2.id, so it gets joined
2 | 12 this row does NOT meet t1.id=t2.id, FAILS to join
t2
id | value
---+------
1 | 101 this row does meet t1.id=t2.id, so it gets joined
which would produce this INTERIM result:
id1 | value1 | value 2
----+--------+--------
1 | 11 | 101
2 | 12 | NULL
NOW the where clause is considered
id1 | value1 | value 2
----+--------+--------
1 | 11 | 101 T1.VALUE does equal 11 so this row will be returned
2 | 12 | NULL T1.VALUE does NOT = 11 so this row is NOT returned
Thus the final result is:
id1 | value1 | value 2
----+--------+--------
1 | 11 | 101
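The two series of events above can be checked with a small sketch. It assumes Python's built-in sqlite3 module as the engine and uses the question's table1/table2 data with the aliases a and b; the variable names are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, value INTEGER);
    CREATE TABLE table2 (id INTEGER, value INTEGER);
    INSERT INTO table1 VALUES (1, 11), (2, 12);
    INSERT INTO table2 VALUES (1, 101);
""")

# Filter in the ON clause: it only decides which rows get a match.
# Rows of table1 that fail it are still returned, padded with NULL.
on_filter = conn.execute("""
    SELECT a.id, a.value, b.value
    FROM table1 AS a
    LEFT JOIN table2 AS b ON a.id = b.id AND a.value = 11
    ORDER BY a.id
""").fetchall()

# Filter in the WHERE clause: it runs after the join and drops rows.
where_filter = conn.execute("""
    SELECT a.id, a.value, b.value
    FROM table1 AS a
    LEFT JOIN table2 AS b ON a.id = b.id
    WHERE a.value = 11
    ORDER BY a.id
""").fetchall()

print(on_filter)
print(where_filter)
```

The first query keeps the row with value1=12 (with NULL for value2), and the second returns only the row with value1=11.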

Join three tables with counts

I have these three tables (Soknad, Prognose and Did) in the SQL Server database:
Table Soknad has columns: S_ID (key), S_REFNR
Table Prognose has columns: P_ID (key), P_S_ID
Table Did has columns: D_ID (key), D_S_ID, Did_Something
Prognose.P_S_ID is foreign key to Soknad.S_ID.
Did.D_S_ID is foreign key to Soknad.S_ID.
The tables are like this:
SOKNAD
S_ID | S_REFNR |
1 | abc |
2 | cbc |
3 | sdf |
PROGNOSE
P_ID | P_S_ID |
10 | 1 |
11 | 2 |
DID
D_ID | D_S_ID | D_Did_Something |
100 | 1 | 1 |
101 | 1 | 1 |
102 | 1 | 0 |
103 | 2 | 1 |
104 | 2 | 1 |
I want to join these tables (like a view or select statement). From the Did table a count of column Did_Something should be returned, as well as a count of the same column where the value is 1 (one).
The result should be:
S_ID | S_REFNR | P_ID | Count_D_Did_Something | Count_D_Did_Something_Is_One |
1 | abc | 10 | 3 | 2 |
2 | cbc | 11 | 2 | 2 |
3 | sdf | | | |
Any help would be appreciated!
I believe what you want to do is join two tables and put the counts where the rows match the left table in separate columns for each table.
This would accomplish that.
select t1.id, count(distinct t2.id) t2_count, count(distinct t3.id) t3_count -- distinct, because joining both tables multiplies the rows
from table1 as t1
left outer join table2 as t2 on t2.table1_id = t1.id
left outer join table3 as t3 on t3.table1_id = t1.id
group by t1.id;
To get counts based on criteria from one of the outer joined tables, you can use a derived table:
select t1.id, count(distinct t2.id) t2_count, max(tt2.mCount) Did_SomethingCount, count(distinct t3.id) t3_count
from table1 as t1
left outer join table2 as t2 on t2.table1_id = t1.id
left outer join (select count(*) as mCount, table1_id from table2 where Did_Something = 1 group by table1_id) as tt2 on tt2.table1_id = t1.id
left outer join table3 as t3 on t3.table1_id = t1.id
group by t1.id;
Here you go:
select s.s_id,
s.S_REFNR,
p.p_id,
count(d.Did_Something) as Count_D_Did_Something, -- nulls won't be counted
sum(CASE WHEN d.Did_Something = 1 THEN 1 ELSE 0 END) as Count_D_Did_Something_is_one
from Soknad as s
left join Prognose as p on p.P_S_ID = s.s_id
left join Did as d on d.D_S_ID = s.s_id
group by s.s_id, s.S_REFNR, p.p_id
This should give the results you need. Note that I simply sum the column to count the rows whose value is 1, which works because the column only holds 0 and 1. The joins are LEFT JOINs so that S_ID 3, which has no Prognose or Did rows, is still returned.
SELECT S.S_ID, S.S_REFNR, P.P_ID, COUNT(D.D_DID_SOMETHING) AS COUNT_D_DID_SOMETHING,
SUM(D.D_DID_SOMETHING) AS COUNT_D_DID_SOMETHING_IS_ONE
FROM SOKNAD AS S
LEFT JOIN PROGNOSE AS P
ON P.P_S_ID = S.S_ID
LEFT JOIN DID AS D
ON D.D_S_ID = S.S_ID
GROUP BY S.S_ID, S.S_REFNR, P.P_ID
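As a rough check of the conditional-count approach, the sketch below runs it against the question's sample rows. It assumes Python's built-in sqlite3 module rather than SQL Server, and it assumes the column is named D_Did_Something as in the table listing (the question's prose also calls it Did_Something).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Soknad (S_ID INTEGER PRIMARY KEY, S_REFNR TEXT);
    CREATE TABLE Prognose (P_ID INTEGER PRIMARY KEY, P_S_ID INTEGER);
    CREATE TABLE Did (D_ID INTEGER PRIMARY KEY, D_S_ID INTEGER, D_Did_Something INTEGER);
    INSERT INTO Soknad VALUES (1, 'abc'), (2, 'cbc'), (3, 'sdf');
    INSERT INTO Prognose VALUES (10, 1), (11, 2);
    INSERT INTO Did VALUES (100, 1, 1), (101, 1, 1), (102, 1, 0), (103, 2, 1), (104, 2, 1);
""")

# COUNT(column) skips NULLs, so a Soknad row without Did rows counts 0.
# SUM(CASE ...) counts only the rows where the flag equals 1.
rows = conn.execute("""
    SELECT s.S_ID, s.S_REFNR, p.P_ID,
           COUNT(d.D_Did_Something) AS cnt,
           SUM(CASE WHEN d.D_Did_Something = 1 THEN 1 ELSE 0 END) AS cnt_one
    FROM Soknad AS s
    LEFT JOIN Prognose AS p ON p.P_S_ID = s.S_ID
    LEFT JOIN Did AS d ON d.D_S_ID = s.S_ID
    GROUP BY s.S_ID, s.S_REFNR, p.P_ID
    ORDER BY s.S_ID
""").fetchall()

for row in rows:
    print(row)
```

Note that S_ID 3 comes back with counts of 0 rather than blanks, and that this shape relies on each Soknad row having at most one Prognose row; with several, the Did rows would be counted once per Prognose row.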

Working with 2 different tables and getting the latest/most recent values

Been struggling in Microsoft SQL a couple of hours with this.
I have 2 Tables.
Table1
ID | STOCK | STATUS
-----------------------------
1 | 1 | Out
2 | 1 | In
3 | 1 | Out
4 | 2 | Out
5 | 2 | In
Table2
ID | DATE
---------------
1 | 2013-07-01
2 | 2013-07-02
3 | 2013-07-03
4 | 2013-07-01
5 | 2013-07-02
I want to get the latest STOCK with the latest DATE and STATUS
-> Result must be
Result Table
ID| STOCK | STATUS | DATE
-------------------------------
3 | 1 | Out | 2013-07-03
5 | 2 | In | 2013-07-02
I have done the following:
SELECT Table1.*, Table2.* FROM Table1, Table2 WHERE Table1.ID=Table2.ID
This joins the table but gives all 5 records. So I thought I would use the MAX() function like so
SELECT Table1.*, MAX(Table2.ID),Table2.Date FROM Table1, Table2 WHERE Table1.ID=Table2.ID GROUP BY Table2.Date
But this does not run in the query windows.
SELECT t1.id, t1.stock, t1.status, t5.mdate
FROM Table1 t1
inner join Table2 t2 on t1.id = t2.id
inner join
(
select t3.stock, max(t4.date) as mdate
from table1 t3
inner join Table2 t4 on t3.ID = t4.ID
group by t3.stock
) t5 on t5.stock = t1.stock and t5.mdate = t2.date
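Use this (note that LIMIT is MySQL syntax, so on SQL Server you would write SELECT TOP 1 instead, and that it returns only the single most recent row rather than one row per STOCK):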
Select t1.id,t1.stock,t1.status,t2.date
FROM Table1 as t1,Table2 as t2
Where t1.id=t2.id
Order By t2.date DESC
LIMIT 1
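The derived-table answer above (latest DATE per STOCK, joined back to the detail rows) can be tried out with a short sketch. It assumes Python's built-in sqlite3 module instead of Microsoft SQL Server and stores the dates as ISO strings, which sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, STOCK INTEGER, STATUS TEXT);
    CREATE TABLE Table2 (ID INTEGER, DATE TEXT);
    INSERT INTO Table1 VALUES (1, 1, 'Out'), (2, 1, 'In'), (3, 1, 'Out'),
                              (4, 2, 'Out'), (5, 2, 'In');
    INSERT INTO Table2 VALUES (1, '2013-07-01'), (2, '2013-07-02'), (3, '2013-07-03'),
                              (4, '2013-07-01'), (5, '2013-07-02');
""")

# The inner query finds the latest date per stock; the outer join keeps
# only the detail row whose stock and date match that maximum.
rows = conn.execute("""
    SELECT t1.ID, t1.STOCK, t1.STATUS, t2.DATE
    FROM Table1 AS t1
    JOIN Table2 AS t2 ON t2.ID = t1.ID
    JOIN (
        SELECT t3.STOCK, MAX(t4.DATE) AS mdate
        FROM Table1 AS t3
        JOIN Table2 AS t4 ON t4.ID = t3.ID
        GROUP BY t3.STOCK
    ) AS t5 ON t5.STOCK = t1.STOCK AND t5.mdate = t2.DATE
    ORDER BY t1.STOCK
""").fetchall()

for row in rows:
    print(row)
```

If two rows of the same stock shared the latest date, both would be returned, so a tie-breaker would be needed in that case.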

When joining a table to itself to identify duplicate data in a column, how do you keep it from returning the inverse in the results?

I am trying to write a SQL statement that returns the list of duplicate items I find in a table. For the sake of simplicity imagine a table named TEST with a rowid column and a text column called column1 with the following data:
rowid | column1
---------------
1 | A
2 | B
3 | C
4 | A
5 | B
6 | C
7 | D
The query I currently have is:
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid <> t2.rowid
It gives me the following results, as I would expect it to do:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
4 | A | 1 | A
5 | B | 2 | B
6 | C | 3 | C
What I really want is just:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
What black SQL magic do I need to call upon in order to get my desired result?
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid < t2.rowid
Another approach to produce results in the same form as the original table:
SELECT t.rowid, t.column1
FROM (SELECT column1
FROM test
GROUP BY column1
HAVING COUNT(*) > 1) q
INNER JOIN test t
ON q.column1 = t.column1
ORDER BY t.column1, t.rowid
Have you tried this?
select min(rowid), column1, max(rowid), column1
from test
group by column1
having count(*)>1
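This saves doing self-joins or subqueries, so it should be faster. Note that it only reports the lowest and highest rowid per value, so it matches the self-join output only when a value appears at most twice.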
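Here is a small sketch comparing the three variants on the sample data. It assumes Python's built-in sqlite3 module; rowid is declared as an explicit INTEGER PRIMARY KEY column so that the name behaves like an ordinary column there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (rowid INTEGER PRIMARY KEY, column1 TEXT);
    INSERT INTO test VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'A'),
                            (5, 'B'), (6, 'C'), (7, 'D');
""")

# "<>" matches each duplicate pair twice: (1, 4) and (4, 1).
both_ways = conn.execute("""
    SELECT t1.rowid, t1.column1, t2.rowid, t2.column1
    FROM test t1
    INNER JOIN test t2 ON t1.column1 = t2.column1 AND t1.rowid <> t2.rowid
    ORDER BY t1.rowid
""").fetchall()

# "<" keeps only the ordering where the first rowid is the smaller one.
pairs = conn.execute("""
    SELECT t1.rowid, t1.column1, t2.rowid, t2.column1
    FROM test t1
    INNER JOIN test t2 ON t1.column1 = t2.column1 AND t1.rowid < t2.rowid
    ORDER BY t1.rowid
""").fetchall()

# GROUP BY variant: lowest and highest rowid per duplicated value.
grouped = conn.execute("""
    SELECT MIN(rowid), column1, MAX(rowid), column1
    FROM test
    GROUP BY column1
    HAVING COUNT(*) > 1
    ORDER BY column1
""").fetchall()

print(len(both_ways), pairs, grouped)
```

On this data the "<" join and the GROUP BY query return the same three rows, which is half of what the "<>" join produces.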

MySQL get data from another table with duplicate ID/data

How do I query the rows of table_1 whose ID does not appear in table_2, given that table_2 contains duplicate IDs? See the example below.
I want to get IDs 5 and 6 from Table 1, since they are the ones missing from Table 2.
Table 1
-------------
| ID | Name |
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
| 5 | e |
| 6 | f |
-------------
Table 2
-------------
| Table 1 ID |
| 1 |
| 1 |
| 2 |
| 2 |
| 2 |
| 3 |
| 4 |
-------------
Thanks!
A MINUS query would be very helpful here (see this link: minus query replacement).
For your data it would look like this:
SELECT table_1.id FROM table_1 LEFT JOIN table_2 ON table_2.id = table_1.id WHERE table_2.id IS NULL
Use:
SELECT t1.id
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
WHERE t2.id IS NULL
Using NOT EXISTS:
SELECT t1.id
FROM TABLE_1 t1
WHERE NOT EXISTS(SELECT NULL
FROM TABLE_2 t2
WHERE t2.id = t1.id)
Using NOT IN:
SELECT t1.id
FROM TABLE_1 t1
WHERE t1.id NOT IN (SELECT t2.id
FROM TABLE_2 t2)
Because there shouldn't be NULL values in table2's id column, the LEFT JOIN/IS NULL is the fastest means: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
If I am understanding you correctly you want to do an outer join. In this case it would be:
SELECT * FROM
table_1 LEFT JOIN table_2
ON table_1.id = table_2.id
WHERE table_2.id is NULL
This one does what you want:
Select t1.id
From table1 t1
Left Join table2 t2
On t2.id = t1.id
Where t2.id Is Null
Result:
id
--
5
6
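The three anti-join forms above can be compared side by side with a short sketch. It assumes Python's built-in sqlite3 module rather than MySQL, assumes the "Table 1 ID" column of table_2 is simply named id, and uses a hypothetical helper run_all. The extra NULL row at the end is not part of the question's data; it is added only to illustrate why the remark about NULL values matters for NOT IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, name TEXT);
    CREATE TABLE table_2 (id INTEGER);  -- assumed name for the "Table 1 ID" column
    INSERT INTO table_1 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f');
    INSERT INTO table_2 VALUES (1), (1), (2), (2), (2), (3), (4);
""")

queries = {
    "left_join": """SELECT t1.id FROM table_1 t1
                    LEFT JOIN table_2 t2 ON t2.id = t1.id
                    WHERE t2.id IS NULL ORDER BY t1.id""",
    "not_exists": """SELECT t1.id FROM table_1 t1
                     WHERE NOT EXISTS (SELECT NULL FROM table_2 t2 WHERE t2.id = t1.id)
                     ORDER BY t1.id""",
    "not_in": """SELECT t1.id FROM table_1 t1
                 WHERE t1.id NOT IN (SELECT t2.id FROM table_2 t2)
                 ORDER BY t1.id""",
}

def run_all():
    # Hypothetical helper: run every variant and collect the returned ids.
    return {name: [row[0] for row in conn.execute(sql)] for name, sql in queries.items()}

before = run_all()

# Illustration only: a NULL in table_2.id makes NOT IN return nothing,
# because "id NOT IN (..., NULL)" is never true.
conn.execute("INSERT INTO table_2 VALUES (NULL)")
after = run_all()

print(before)
print(after)
```

With the original data all three forms return 5 and 6, and the duplicate IDs in table_2 make no difference. Once a NULL is present, only the LEFT JOIN and NOT EXISTS forms still do.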