Left join - two tables with the same data - sql

Lets say that that I have two simple tables with the following columns and data:
Table 1 Table 2
year month year month
2017 01 2017 01
2016 12 2016 12
The primary key is a composite key that consists of the year and the month.
So a classical left join, gives me all the data in the left table with the matching rows in the right table.
If I do a left join like this:
select
t1.year, t2.month
from
table1 t1
left join table 2 t2 on (t1.year = t2.year and t1.month = t2.month)
Why do I get only two rows?? Shouldn't I get 4 rows??
Tnx,
Tom

A classical left join will give you the number of rows in the "Left Table" (the one in from) multiplied by the number of matches in the "Right Table" (the one in LEFT JOIN in this case), plus all the rows in the LEFT Table that have no match in the first table.
Number of rows in LEFT Table = 2
Number of matches in Right Table = 1
Number of rows in LEFT Table withouth matches = 0
2 x 1 + 0 = 2
Edit: Actually the multiplication is given for each row. Would be something like
Sum (row_i x matches_i) + unmatched
Where row_i is means each row, and matches_i to the matches for the i row in the first table. The difference with this is that each row could have different number of matches (the previous formula is only adapted to your case)
This will result in
1 (row1) x 1 (matches for row 1) + 1 (row2) x 1 (matches for row 2) +
0 (unmatched rows in table 1) = result
1x1 + 1x1 + 0 = result
1 + 1 = 2 = result
If you expected 4 rows maybe you wanted to get a Cartesian Product. As the comment stated, you can use Cross Join in that case

When you join tables together, you're essentially asking the database to combine data from two different tables and display it as a single record. When you perform a left join, you are saying:
Give me all the rows from Table1, as well as any associated data from
Table2 (if it exists).
In this sense, the data from Table2 doesn't represent separate or additional records to Table1 (even though they are stored as separate records in a separate table), it represents associated data. You are linking the data between the tables, not appending rows from each table.
Imagine that Table1 stored people, and Table2 stored phone numbers.
Table1 Table2
+------+-------+--------+ +------+-------+-------------+
| Year | Month | Person | | Year | Month | Phone |
+------+-------+--------+ +------+-------+-------------+
| 2017 | 12 | Bob | | 2017 | 12 | 555-123-4567|
| 2016 | 01 | Frank | | 2016 | 01 | 555-234-5678|
+------+-------+-------+ +------+-------+--------------+
You could join them together to get a list of people and their corresponding phone numbers. But you wouldn't expect to get a combination of rows from each table (two rows of people and two rows of phone numbers).

You will get two rows as both the columns have 2 rows that match exactly the sam and its a composite key.
It will make the same way if you had 4 rows in each you will only get 4 rows in total.

The Left Join takes Table1 (t1) as the Left table.
It searches for and retrieves all values from the Right ie:- from Table 2 (t2) matching the criteria T1.Year&Month = T2.Year&Month (alias GOD/s) as well as the additional join condition T1.Month=T2.Month. The result is that only 2 rows from T1 match the join criteria as well as the additional join criteria
Another takeaway : The AND T1.Month=T2.Month condition on the left join is redundant as the composite GOD key takes care of it explicitly.

cross join returns every row you can make by combining a row from each argument. (inner) join on returns the rows from cross join that satisfy its condition. Ie (inner) join on returns every row you can make that combines a row from each argument and that satisfies its condition.
left join on returns the rows from (inner) join on plus the rows you can make by extending unjoined left argument rows by null for columns of the right argument.
Notice that this is regardless of primary keys, unique column sets, foreign keys or any other constraints.
Here there are 2 rows in each argument so there are 2 X 2 = 4 rows in the cross join. But only 2 meet the condition--the ones where a row is combined with itself.
(If you left join a table with itself where the condition is the conjunction of one or more equalities of the left and right versions of a column and there are no nulls in those columns then every left argument row gets joined with at least itself from the right argument. So there are no unjoined left argument rows. So only the rows of the (inner) join on are returned.)

Related

Do a specific query for each row of a table, one by one

Let's say I have a table:
| key1 | key2 | value |
+------+------+-------+
| 1 | 1 | 1337 |
| 1 | 2 | 6545 |
| 2 | 1 | 213 |
| 3 | 1 | 131 |
What I would like to do is traverse this table row by row, then using the key two values in further queries (all other tables contain the unique combination of these two keys + other data)
How do I do this kind of thing in SQL?
EDIT: I would want to extract key1, key2 from row 1 (1,1) then do a query on it, which would result in a number.
Then I would move to the second row, an identical query which would again result in a number.
All of these numbers would be then inserted into a pre-prepared view.
EDIT2: I need to traverse it because the specific use of my database.
It is a database of planets which contains sectors (the keys are the IDs of these two). All of these sectors contain resources, turrets and walls.
The table I have in my post is an example of table of sectors, with the value being enemy force.
Table of resources, turrets etc. contain these two keys so they are linked to and only to a specific sector.
I need to go row by row so I can use this keys to select only specific resources/turrets/walls from my tables, aggregate them and then subtract them from the value in my sector table. Resulting number would then be inserted into a pre-prepared view (again, into the row which matches the combination of my two keys)
This sounds like a correlated subquery or lateral join. You don't have that much explanation, but something like this:
select t1.*, t2.*
from table1 t1 cross join lateral
(select . . .
from table2 t2 . . .
where t2.key1 = t1.key1 and t2.key2 = t1.key2
) t2
You are not clear on what the second query looks like. The where clause is called a correlation clause. It connects the subquery to the outer query. A correlation clause is not strictly needed for this to work.
The columns from the outer query can be used elsewhere in the subquery. I am just assuming that an equality condition connects the two (lacking other information).

SQL select from two tabels wrong result

I can not understand what I am doing wrong.
My array:
ps_product
id_product
active
1
1
2
1
and
ps_product_sync
id_product
status
1
0
2
1
and my SQL code
SELECT pr_product.id_product, pr_product.active
FROM pr_product, pr_product_sync
WHERE pr_product.active = pr_product_sync.status
I get a result like this:
id_product
status
2
1
2
1
2
1
...
..
24 rows
I try the same with inner but result is the same, I don't have duplicates in the arrays... I don't understand why I get one row 24 times
PS. all tables looks good before posting/saving
If you query two tables and include both in the FROM clause, you create a Cartesian product of these tables. In other words, if one table has 4 rows and the other 6 rows, the result is 24 rows.
It is better to create an INNER JOIN using the key of the first table and the foreign key of the second table.
Change your query accordingly
SELECT pr_product.id_product, pr_product.active
FROM pr_product
INNER JOIN pr_product_sync
ON pr_product.id_product = ps_product_sync.id_product
WHERE pr_product.active = pr_product_sync.status
Of course, you could also compare the Keys in the WHERE clause or eliminate duplicates using DISTINCT. IMHO the most understandable solution is an INNER JOIN.
I hope this solves your problem.
Missing the primary key join.
Add:
WHERE
pr_product.id_product=pr_product_sync.id_product
AND pr_product.active=pr_product_sync.status

Microsoft Access Query - Merging two queries into one

Using the query wizard, I built two different queries doing similar functionalities that I am trying to combine into one query. I have two tables (same structure) that I am matching to find duplicates:
Query #1 is as follows (Include ALL records from Table 1 and only those records from Table 2 where the joined fields are equal are applied to all the columns below):
Match Table 1 Column 3 to Table 2 Column 3
Match Table 1 Column 4 to Table 2 Column 4
Match Table 1 Column 5 to Table 2 Column 5
Match Table 1 Column 7 to Table 2 Column 7
If all of those columns from Table 1 match what’s in Table 2, it will identify the duplicates (I bring in Table 2 Column 7 which will show the duplicates I am looking for).
Query #2 is as follows (Include ALL records from Table 1 and only those records from Table 2 where the joined fields are equal are applied to all the columns below):
Match Table 1 Column 3 to Table 2 Column 3
Match Table 1 Column 4 to Table 2 Column 4
Match Table 1 Column 5 to Table 2 Column 5
Match Table 1 Column 8 to Table 2 Column 8
My second query has the same 3 columns, except the last column is different.
If all of those columns from Table 1, match what’s in Table 2, it will identify the duplicates (I bring in Table 2 Column 7/8 which will show the duplicates I am looking for).
What I am trying to do:
Add an OR statement for the query to show both duplicates on matches for Columns 8 and Columns 7. Such as if Table 1 Column 7 matches Table 2 Column 7 OR Table 1 Column 8 matches Table 2 Column 8, show the duplicates.
Would this require a UNION query?
Here is the query for one of them:
SELECT TABLE1.COLUMN_3, TABLE1.COLUMN_4, TABLE1. COLUMN_7,
TABLE2.COLUMN_7, TABLE1.COLUMN_5
FROM TABLE1
LEFT JOIN TABLE2
ON (TABLE1.COLUMN_7 = TABLE2.COLUMN_7)
AND (TABLE1.COLUMN_3 = TABLE2.COLUMN_3)
AND (TABLE1.COLUMN_4 = TABLE2.COLUMN_4)
AND (TABLE1.COLUMN_5 = TABLE2.COLUMN_5);
The given SQL will not run because of spaces in table and column names.
This example eliminates those spaces.
SELECT
Table1.Column3 , Table2.Column3
,Table1.Column4 , Table2.Column4
,Table1.Column5 , Table2.Column5
,Table1.Column7 , Table2.Column7
,Table1.Column8 , Table2.Column8
From Table1
Left Join Table2
On
( Table1.Column3 = Table2.Column3
AND Table1.Column4 = Table2.Column4
AND Table1.Column5 = Table2.Column5
AND ( Table1.Column7 = Table2.Column7
OR Table1.Column8 = Table2.Column8
)
)
Create query 3; add Q1 and Q2 - join them on F7. This will result in all matches of that field.
Create query 4; add Q1 and Q2 - join them on F8. This will result in all matches of that field.
You now have 2 data sets with those matches (you call duplicates). Then the decision is how to present/display. If you need them in a single record set - write those into a single common temp table. But otherwise you can easily present them together as sub reports/forms.

For MS Access SQL, want to use EXCEPT in Access for three columns in table1 to 1 column in table 2

I have 3 columns in table A. I am trying to design a query that will call out all the values (in the three columns) that do not apepar in the 1 column I have in table B. If it helps to make it more clear, table B is a list of currencies in ISO codes and table A is three columns of currencies being used, I am identifying all those values that are NOT using ISO codes to denote their currency.
Currently, I can't seem to get them all to match to the one column, so I made 2 more columns in table B so I can match them individually. My constraints are, I cannot change table A and I must do this in one query. What I got so far is below.
SELECT m.Currency1, i.ISO_Code, m.Currency2 , i.ISO_Code1, m.Currency3, i.ISO_Code2
FROM A AS m
LEFT JOIN B AS i
ON m.Currency=i.ISO_Code
AND m.Currency2=i.ISO_Code1
AND m.Currency3=i.ISO_Code2
WHERE i.ISO_Code is NULL
OR i.ISO_Code1 is NULL
OR i.ISO_Code2 is NULL;
I wouldn't bother making multiple columns in 'B'. I played with this in SQLFiddle and got it to work.
Something like this:
SELECT
m.Currency1, i.ISO_Code,
m.Currency2, j.ISO_Code AS ISO_Code1,
m.Currency3, k.ISO_Code AS ISO_Code2
FROM A AS m
LEFT JOIN B as i
ON m.Currency1 = i.ISO_Code
LEFT JOIN B as j
ON m.Currency2 = j.ISO_Code
LEFT JOIN B as k
ON m.Currency3 = k.ISO_Code
WHERE
i.ISO_Code IS NULL OR
j.ISO_Code IS NULL OR
k.ISO_Code IS NULL

Combining tables where some keys are in common and some are missing

I have a table of landcover values in my postgres db (version 9.3). The key values are "huc12code", which represent a spatial area. This is the first entry of the landcover table (it has 1700+ rows):
huc12code | open_water | developed | bare_earth | managed | heritage
030402080205 | 0.107027 | 0.0215444 | 0.406911 | |
The area of managed and heritage land is empty in this table but are stored in separate tables. They also have huc12code as the first column:
From the temp_heritage table:
huc12code | heritage
-------------+-------------
030101020801 | 0.0402684
From temp_managed table:
huc12code | managed
-------------+-------------
030101020802 | 0.000385026
I'd like to incorporate the managed & heritage fields into my landcover table, but I see two problems:
Most of the huc12code fields in the landcover table are found in temp_managed and temp_heritage, but some are not. This is because the "temp" tables don't include places where the heritage and managed fields equal 0.
I do want the landcover table to include a 0 value where the heritage and managed fields are 0, but since these values aren't actually present in the "temp" tables, I need some logic that inserts a zero for the heritage field for each huc12code that is found in landcover but not in temp_heritage (and the same for the manage field).
Assuming huc12code to be unique in all tables.
Only update rows where values are available
To do it all in a single UPDATE join the two source tables with a FULL [OUTER] JOIN to preserve all rows. Then use this in the FROM clause of the UPDATE command:
UPDATE landcover l
SET managed = COALESCE(t.managed, 0)
, heritage = COALESCE(t.heritage, 0)
FROM (
SELECT huc12code, m.managed, h.heritage
FROM temp_managed m
FULL JOIN temp_heritage h USING (huc12code)
) t
WHERE l.huc12code = t.huc12code;
Every row in landcover is updated exactly one time if there is a (combined) row in the source. The rest is not touched at all.
Update all rows
Use two LEFT JOIN instead:
UPDATE landcover l
SET managed = COALESCE(t.managed, 0)
, heritage = COALESCE(t.heritage, 0)
FROM (
SELECT l.huc12code, m.managed, h.heritage
FROM landcover l
LEFT JOIN temp_managed m USING (huc12code)
LEFT JOIN temp_heritage h USING (huc12code)
) t
WHERE l.huc12code = t.huc12code;
Columns without a source or source NULL value are set to 0.