How can I JOIN tables conditionally? - sql

I am working with a badly designed database and have run into a problem.
I have two tables that I need to join on a 'not unique ID'.
The software that currently works with the database sets a '0 value' in the id if there is an error. This means that if I try to join them, a massive number of records are joined on the 0 values.
Below is an example of the two tables and the non-unique id fields I want to join them on:
tbl1 tbl2
-----------
2 2
3 6
4 5
0 3
0 4
6 0
5 0
----------
What I want to achieve is this
tbl1 tbl2
-----------
2 2
3 3
4 4
0 * (no join)
0 * (no join)
6 6
5 5
----------
In other words, I don't want the '0 values' from tbl1 to join with all the '0 values' from tbl2. I still want to keep the tbl1 record in the result, just without a joined row.
Is this possible in one query?
Extra information: this is SQL Server 2005, and there is no option to make the IDs unique.

You did not show us the full structure of the tables so I need to make up some columns, but it basically goes like this:
select *
from tbl1
join tbl2
on tbl1.id = tbl2.t1_id
and tbl1.that_flag <> 0
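Note that the and condition is part of the join condition rather than a where clause. Because this is an inner join, tbl1 rows that fail the condition are dropped entirely; use a left join instead if those rows must still appear without a match.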

This should do the trick:
SELECT *
FROM tbl1
LEFT JOIN tbl2 ON tbl1.value = tbl2.value AND tbl2.value <> 0
This will give NULL values for the two rows that you marked with *.

SELECT tbl1.id, tbl2.id
FROM dbo.tbl1
LEFT OUTER JOIN dbo.tbl2
ON tbl1.id = NULLIF(tbl2.id, 0);
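To compare the approaches above on the sample data from the question, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server 2005 here, and the single id column per table is an assumption, since the real table structure was never shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (2), (3), (4), (0), (0), (6), (5);
    INSERT INTO tbl2 VALUES (2), (6), (5), (3), (4), (0), (0);
""")

# Plain join: each 0 in tbl1 pairs with each 0 in tbl2 (2 x 2 extra rows).
plain = conn.execute(
    "SELECT tbl1.id, tbl2.id FROM tbl1 "
    "JOIN tbl2 ON tbl1.id = tbl2.id"
).fetchall()

# Excluding 0 inside the ON clause keeps the tbl1 rows but leaves them unmatched.
on_filter = conn.execute(
    "SELECT tbl1.id, tbl2.id FROM tbl1 "
    "LEFT JOIN tbl2 ON tbl1.id = tbl2.id AND tbl2.id <> 0 "
    "ORDER BY tbl1.id"
).fetchall()

# NULLIF(tbl2.id, 0) turns 0 into NULL, and NULL never compares equal.
with_nullif = conn.execute(
    "SELECT tbl1.id, tbl2.id FROM tbl1 "
    "LEFT JOIN tbl2 ON tbl1.id = NULLIF(tbl2.id, 0) "
    "ORDER BY tbl1.id"
).fetchall()

print(len(plain))       # 9 rows: 5 real matches plus 4 pairings of zeros
print(on_filter)        # the two 0 rows come back with None on the tbl2 side
print(with_nullif == on_filter)
```

Both left join variants return seven rows, one per tbl1 record, which is the shape the question asks for.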

Related

Multiple table join query - IDs and data tables

SOLVED: it turns out my where clause was throwing off the results. I removed it and moved its condition into the ON clause.
I need some help.
I have a table with 25 million IDs and 4 tables with IDs and data. I need to create a new table with these 25 million IDs as well as the associated data from the 4 tables. None of the data tables contains all 25 million IDs. For example:
ID Table:
ID
A
B
Table 1:
ID  measure_a  measure_b
B   1          3
Table 2:
ID  measure_f  measure_g
A   3          4
etc.
Expected output:
ID  measure_a  measure_b  measure_f  measure_g
A   NULL       NULL       3          4
B   1          3          NULL       NULL
The most important thing is the 25 million IDs are in the final table. I've tried multiple joins but end up with a hugely reduced number of IDs which I believe is due to the IDs which don't match on the join condition being filtered out.
Any help is greatly appreciated.
You would use left joins:
select ids.id, t1.measure_a, t1.measure_b, t2.measure_f, t2.measure_g
from ids left join
table1 t1
on ids.id = t1.id left join
table2 t2
on ids.id = t2.id;
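The SOLVED note in the question says the fix was moving a condition from the where clause into the ON clause. Here is a minimal sketch of why that matters, run against in-memory SQLite via Python's sqlite3 module. The condition measure_a > 0 and the extra ID 'C' are hypothetical; the original where clause was never shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ids (id TEXT);
    CREATE TABLE table1 (id TEXT, measure_a INTEGER, measure_b INTEGER);
    CREATE TABLE table2 (id TEXT, measure_f INTEGER, measure_g INTEGER);
    INSERT INTO ids VALUES ('A'), ('B'), ('C');
    INSERT INTO table1 VALUES ('B', 1, 3);
    INSERT INTO table2 VALUES ('A', 3, 4);
""")

# Condition in ON: every row of ids survives; unmatched measures are NULL.
in_on = conn.execute("""
    SELECT ids.id, t1.measure_a, t2.measure_f
    FROM ids
    LEFT JOIN table1 t1 ON ids.id = t1.id AND t1.measure_a > 0
    LEFT JOIN table2 t2 ON ids.id = t2.id
    ORDER BY ids.id
""").fetchall()

# Same condition in WHERE: rows whose t1.measure_a is NULL are filtered out,
# so the left join effectively behaves like an inner join.
in_where = conn.execute("""
    SELECT ids.id, t1.measure_a, t2.measure_f
    FROM ids
    LEFT JOIN table1 t1 ON ids.id = t1.id
    LEFT JOIN table2 t2 ON ids.id = t2.id
    WHERE t1.measure_a > 0
    ORDER BY ids.id
""").fetchall()

print(in_on)     # three rows, one per ID
print(in_where)  # only the row for 'B'
```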

Duplicate rows in left join

I have 2 tables. One column in the first table has about 100,000 NULLs; its other values are integers, for about 200,000 values in total. The other table has only integer values. When I left join on this column, I get a lot of duplicate rows. Is it OK to use a left join here?
Table 1:
Column 1
2
3
5
null
null
Table 2:
Column 1
1
2
3
so on
Your example is really odd: why would anyone have NULL values in an ID field? But anyway.
If you only need rows that have a match in table 2, use an INNER JOIN rather than a LEFT JOIN.
Something like:
SELECT DISTINCT a.id, a.name, b.someOtherField
FROM Table1 a
INNER JOIN Table2 b ON a.id = b.id
Please note: NULL never compares equal to anything, so the table 1 rows where id IS NULL have no match in table 2 and are not returned by the inner join. The DISTINCT keyword removes duplicate rows in case the query would still produce them, for example when an id occurs more than once in table 2.
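To see where duplicates can come from, here is a small sketch on in-memory SQLite through Python's sqlite3 module. The column name col and the repeated value 2 in table2 are assumptions made for illustration; the question does not say which values repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# table2 holds the value 2 twice (assumed) to provoke duplicate output rows.
conn.executescript("""
    CREATE TABLE table1 (col INTEGER);
    CREATE TABLE table2 (col INTEGER);
    INSERT INTO table1 VALUES (2), (3), (5), (NULL), (NULL);
    INSERT INTO table2 VALUES (1), (2), (2), (3);
""")

# LEFT JOIN keeps every table1 row; the NULL rows appear once each, unmatched.
left_rows = conn.execute(
    "SELECT a.col, b.col FROM table1 a "
    "LEFT JOIN table2 b ON a.col = b.col ORDER BY a.col"
).fetchall()

# INNER JOIN drops the NULL rows and the unmatched 5, but 2 is still doubled.
inner_rows = conn.execute(
    "SELECT a.col, b.col FROM table1 a "
    "INNER JOIN table2 b ON a.col = b.col ORDER BY a.col"
).fetchall()

print(left_rows)
print(inner_rows)
```

In this sketch the NULLs contribute exactly one output row each under the left join, and the doubled row comes from the repeated key in table2, which is the case DISTINCT addresses.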

SQL comparing two sets of complex data

Sorry, the title is probably not the best one, but I hope you will understand what problem I have.
I need to compare and analyse two sets of data and I'm using MS-Access for that. My data is organized in two tables. The following is not the real data I'm working with, but it will serve as an example:
TABLE 1
ID Name
1 Zoie
2 Rohan
2 Simon
3 Jerome
4 Jakob
4 Mathew
4 Cora
6 Keely
7 Aiyana
7 Jake
8 Reid
9 Emerson
TABLE 2
ID Name
1 Michael
2 Rohan
2 Simon
3 Jill
4 Jakob
4 Cora
5 Charlie
7 John
8 Reid
9 Yadiel
9 Emerson
9 Paris
So, I need to select only those IDs that fully correspond in both tables (all names under a given ID are the same). Those are: 2 and 8.
I would also like a separate select statement that returns IDs 2 and 8 plus the IDs whose table 1 names all appear in table 2 (everything from table 1, possibly with some extra names in table 2 under the same ID). That would be: 2, 8, 9.
I would also like a separate select statement that returns IDs 2 and 8 plus the IDs whose table 2 names all appear in table 1 (everything from table 2, possibly with some extra names in table 1 under the same ID). That would be: 2, 4, 8.
Finally, I would like a separate select statement that combines the last two.
The result would be: 2, 4, 8, 9.
I would appreciate any suggestions.
Thanks in advance!
Best regards,
Mario
Q#1:
select id
from table1
group by id
having count(*) =
    (
        select count(*)
        from table2
        group by table2.id
        having table2.id = table1.id
    )
and count(*) =
    (
        select count(*)
        from table1 table1_1
        inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
        group by table1_1.id
        having table1_1.id = table1.id
    )
Explanation of this query:
It groups table1 by ID.
For each group (each ID), it counts the rows in table1 that have this ID.
For each group, it counts the rows in table2 that have this ID.
For each group, it counts the rows where the name appears in both tables for this ID. It does that by inner joining table1 and table2 on both ID and Name, so only rows where both match are counted.
It then returns the IDs (from table1) for which all of the above counts are equal. The result is the IDs whose names are all in both tables (no more, no less).
Q#2 - In this case you don't care that table2 has the same number of names per ID. So remove the first sub-query (that counts matching rows in table2).
select id
from table1
group by id
having count(*) =
    (
        select count(*)
        from table1 table1_1
        inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
        group by table1_1.id
        having table1_1.id = table1.id
    )
Although the query above is easy to follow because it uses the same logic as Q#1, the following version is more straightforward and probably more efficient. The difference only matters if the first version runs too slowly for your data (which is subjective and context dependent).
select table1.id
from table1
left join table2 on table1.id = table2.id and table1.name = table2.name
group by table1.id
having count(table1.id) = count(table2.id)
Here, the two tables are LEFT (outer) joined, which means all records from table1 are kept and the records in table2 that match by ID and Name are included alongside them. Then we group by ID and compare the number of table1 rows in each group with the number that found a matching name in table2. This works because count(table2.id) skips the NULLs produced for unmatched rows.
Q#3 - This case is the same as Q#2 except table1 and table2 are swapped.
Q#4 - In this case you only care about IDs that have at least one name that appears in both tables. So join the tables and return the distinct IDs:
select distinct table1.id
from table1
inner join table2 on table1.id = table2.id and table1.name = table2.name
Here is a SQLFiddle to play with containing the four queries: http://www.sqlfiddle.com/#!18/3fc71/22
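As a quick check of the left join form on the sample data, here is a sketch using in-memory SQLite through Python's sqlite3 module (an assumption; the question uses MS-Access). The helper covered_ids is a hypothetical name. Q#1 and Q#4 are derived here as the intersection and union of the Q#2 and Q#3 results, which is a reformulation of mine rather than the queries from the answer, and it assumes a name is not repeated under the same ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES
        (1,'Zoie'),(2,'Rohan'),(2,'Simon'),(3,'Jerome'),(4,'Jakob'),(4,'Mathew'),
        (4,'Cora'),(6,'Keely'),(7,'Aiyana'),(7,'Jake'),(8,'Reid'),(9,'Emerson');
    INSERT INTO table2 VALUES
        (1,'Michael'),(2,'Rohan'),(2,'Simon'),(3,'Jill'),(4,'Jakob'),(4,'Cora'),
        (5,'Charlie'),(7,'John'),(8,'Reid'),(9,'Yadiel'),(9,'Emerson'),(9,'Paris');
""")

def covered_ids(left, right):
    """IDs in `left` whose every name also appears under the same ID in `right`."""
    # Table names are interpolated only because they are fixed literals here.
    sql = f"""
        SELECT l.id
        FROM {left} l
        LEFT JOIN {right} r ON l.id = r.id AND l.name = r.name
        GROUP BY l.id
        HAVING COUNT(l.id) = COUNT(r.id)
    """
    return {row[0] for row in conn.execute(sql)}

q2 = covered_ids("table1", "table2")  # all table1 names found in table2
q3 = covered_ids("table2", "table1")  # all table2 names found in table1
q1 = q2 & q3                          # names identical in both tables
q4 = q2 | q3                          # combination of the last two

print(sorted(q1), sorted(q2), sorted(q3), sorted(q4))
# [2, 8] [2, 8, 9] [2, 4, 8] [2, 4, 8, 9]
```

Note that the union used for q4 is stricter than "at least one shared name": an ID with only a partial overlap in both directions would be returned by the inner join query for Q#4 but not by this union. On the sample data the two agree.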

SQL - using JOIN while filling missing values with NULL

The most closely related question I found was this, but sadly it did not give me a solution to my problem.
I have two tables with a similar column. The only difference is that the column in one table is missing a few values. I want to join the tables so that, for a value missing from one table, the result still shows a row with NULL in its place.
I'll provide an example, since this might be confusing:
table 1          table 2
ID   count       ID   count
1    9           1    2
2    2           2    1
3    1
I want the result to be
table 3
ID   count2   count1
1    2        9
2    1        2
3    NULL     1
However, using LEFT OUTER JOIN I could only achieve the table "table 3" without the row for id 3, because it has no representation in table 2.
Can you help me with my problem?
A left join would work for your sample data. I'm guessing you want to know what to do if the row with id 3 is in table 2 instead, so that your query still shows all ids. To show all rows from both tables, use a FULL OUTER JOIN:
SELECT CASE WHEN t1.id IS NULL THEN t2.id ELSE t1.id END AS id,
t2.count as count2, t1.count as count1
FROM t1
FULL OUTER JOIN t2 ON t2.id = t1.id
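Some engines lack FULL OUTER JOIN (MySQL, and SQLite before version 3.39). For those, a common workaround is a left join combined with the unmatched rows from the other side. The sketch below shows that emulation on in-memory SQLite via Python's sqlite3 module; it is a swapped-in technique, not the query above. The column is named cnt instead of count, and the row with id 4 in t2 is a hypothetical addition so that both sides have an unmatched row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, cnt INTEGER);
    CREATE TABLE t2 (id INTEGER, cnt INTEGER);
    INSERT INTO t1 VALUES (1, 9), (2, 2), (3, 1);
    INSERT INTO t2 VALUES (1, 2), (2, 1), (4, 7);
""")

rows = conn.execute("""
    -- every t1 row, with t2 data where it exists
    SELECT t1.id AS id, t2.cnt AS count2, t1.cnt AS count1
    FROM t1
    LEFT JOIN t2 ON t2.id = t1.id
    UNION ALL
    -- plus the t2 rows that found no partner in t1
    SELECT t2.id, t2.cnt, NULL
    FROM t2
    LEFT JOIN t1 ON t1.id = t2.id
    WHERE t1.id IS NULL
    ORDER BY 1
""").fetchall()

print(rows)
# [(1, 2, 9), (2, 1, 2), (3, None, 1), (4, 7, None)]
```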

SQL Join query to return matching and non-matching record

I am trying to write a JOIN query.
I couldn't get the expected output, so I am asking for expert help.
Table: Table_1
ScriptNumber  DateFilled  RefillsLeft
100           01/02/2014  1
200           01/03/2014  0
300           01/22/2014  3
Table: Table_2
ScriptNumber  DateFilled  RefillsLeft
100           02/02/2014  0
Expected output
ScriptNumber  DateFilled  RefillsLeft
100           02/02/2014  0
300           null        null
SQL Statement
SELECT Table_2.ScriptNumber
,Table_2.DateFilled
,Table_2.RefillsLeft
FROM Table_1
LEFT JOIN Table_2
ON Table_1.ScriptNumber = Table_2.ScriptNumber
Your problem comes from selecting columns from Table_2, which have no values for rows that exist only in Table_1. You need to change SELECT Table_2.ScriptNumber to SELECT Table_1.ScriptNumber.
For future reference, always select the key columns from the LEFT table and only the columns you need from the RIGHT table. Otherwise you end up with NULLs instead of the data that is present in the LEFT table.
A left join could be useful here to get the records you want from Table_1 and any relevant details that may exist in Table_2:
select Table_1.ScriptNumber
, Table_2.DateFilled
, Table_2.RefillsLeft
from Table_1
left join Table_2 on Table_1.ScriptNumber = Table_2.ScriptNumber
where Table_1.RefillsLeft > 0
Such a description is helpful in your questions, too.
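To show the difference between selecting the key from Table_1 and from Table_2, here is a minimal sketch on in-memory SQLite through Python's sqlite3 module (SQLite is an assumption; the question does not name the engine). Dates are stored as plain text for simplicity. The query returns both ScriptNumber columns side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_1 (ScriptNumber INTEGER, DateFilled TEXT, RefillsLeft INTEGER);
    CREATE TABLE Table_2 (ScriptNumber INTEGER, DateFilled TEXT, RefillsLeft INTEGER);
    INSERT INTO Table_1 VALUES
        (100, '01/02/2014', 1), (200, '01/03/2014', 0), (300, '01/22/2014', 3);
    INSERT INTO Table_2 VALUES (100, '02/02/2014', 0);
""")

rows = conn.execute("""
    SELECT Table_1.ScriptNumber,   -- always populated
           Table_2.ScriptNumber,   -- NULL when Table_2 has no match
           Table_2.DateFilled,
           Table_2.RefillsLeft
    FROM Table_1
    LEFT JOIN Table_2 ON Table_1.ScriptNumber = Table_2.ScriptNumber
    WHERE Table_1.RefillsLeft > 0
    ORDER BY Table_1.ScriptNumber
""").fetchall()

print(rows)
# [(100, 100, '02/02/2014', 0), (300, None, None, None)]
```

The where clause filters on a Table_1 column, so it does not discard the unmatched row for script 300.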