Handling null values with join across multiple tables - sql

My mind is exploding right now.. I can't get any of this to work the way I want to! SQL is seriously such a pain in the butt. (/End Rant)
I have three tables that share some common columns to link on. I am trying to retrieve the ID from one table based on the name from the middle table, which in turn is based on the code from the farthest table. (Excuse my vocabulary, I am not skilled with SQL or its lingo.) If the farthest table has a code not found in the middle table, it should default to a certain value. The first table then returns the default ID for null values, and so on.
Example,
tblCounty table has an ID and name column. I am to return the ID from tblCounty based on the name column matching the name column of tblCode.
tblCode has two columns name and code. tblCode returns the respective name based on the matching code column with tblAddress's code column.
tblAdress has many columns, but shares in common a code field.
My attempt,
INSERT INTO vendor (CountyID, Contact)
SELECT
(SELECT a.id
FROM county a
WHERE a.name = (CASE WHEN (SELECT TOP(1) c.countyID
FROM tblAdress c
INNER JOIN tblCode d ON c.CountyID = d.CodeID
WHERE d.CodeID = b.CountyID) IS NULL THEN '**NONE**'
ELSE (SELECT a.CodeName
FROM tblCode a
WHERE a.CodeID = b.CountyID) END)),
b.Contact
FROM
tblAdress b
The error I am receiving is:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Now of course I googled this and looked at results on StackOverflow, but I was unable to apply what I found to my problem.
Vendor:
CountyID | ....
-------------------
1 | ...
2 | ...
3 | ...
2 | ...
tblCounty:
ID | Name | ...
----------------------
1 | **None**
2 | NYC
3 | Buffalo
tblCode:
Name | Code
--------------
**None** | **None**
NYC | 56A
Buffalo | 75B
tblAdress:
Code | ....
----------------
**None** | ....
56A | ......
75B | .....
56A | .....
Using the above tables, I want to transfer all data out of tblAdress into another table (vendor). In the process I will convert column Code to tblCode's column name via code comparison, then to tblCounty.ID via name comparison.
Essentially a catch all is needed. If a code in tblAddress does not exist in tblCode or the code is null in tblAddress, it will return a default value (None). Then tblCounty will convert that default value into ID = 1, then store it into the Vendor table.
Edit
(SELECT TOP(1)
c.ID
FROM
dbo.Address a
LEFT OUTER JOIN
dbo.tblCode cd ON ISNULL(CASE a.CountyID WHEN ' ' THEN '**None**' ELSE a.CountyID END, '**None**') = cd.CodeID
LEFT OUTER JOIN
dbo.tblCounty c ON c.NAME = cd.CodeName
WHERE a.CountyID = b.countyID)

Firstly, your database doesn't seem to follow best practices for database design.
Ideally, the design of the tables and relationships should spare you from null checks in joins, and most of the time a simple left join would do most of what you want. Could you use constraints and ISNULLs when the data is being added to ensure its integrity? Also, I would advise against joining tables on text like county if you can. It would be much more elegant to use an integer primary key.
I suggest that you make sure that your design is solid before progressing, as these problems may just multiply in the future.
That being said, if you are insistent on continuing the way you're going, the following query should do what you want:
SELECT tblCounty.ID,
ISNULL(tblAddress.Code, 'none')
--Whatever you want to select
FROM tblCounty
LEFT JOIN tblCode ON tblCounty.Name = tblCode.Name
LEFT JOIN tblAddress ON ISNULL(tblCode.Code, 'none') = ISNULL(tblAddress.Code, 'none')

Would this not get you the desired results?
select isnull(a.ID, '**NONE**') as CountyID, c.Contact
from tblCounty a
left join tblCode b on a.Name = b.Name
left join tblAddress c on b.Code = c.Code

OK, so let's try to build this query:
from tblCounty, you want the ID?
tblCounty and tblCode are linked via the name column? (bad idea - opens all sorts of issues - I'd rather use code or something!)
tblAdress is linked to tblCode via the code column
Right?
OK, so let's try this:
if you want to "link" two tables that have a column in common, and you want only rows that exist in both tables - use an INNER JOIN
if you want to "link" two tables that have a column in common, and you want all rows, even those that don't exist in the "right" table, use a LEFT OUTER JOIN
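To make the difference concrete, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server, and the county and code tables and their rows are made up for illustration, so treat it as a demonstration of the two join types only.

```python
import sqlite3

# Hypothetical two-row example; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE county (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE code   (name TEXT, code TEXT);
INSERT INTO county VALUES (1, 'NYC'), (2, 'Buffalo');
INSERT INTO code   VALUES ('NYC', '56A');   -- no code row for Buffalo
""")

# INNER JOIN: only rows that exist in both tables.
inner_rows = conn.execute("""
    SELECT c.name, cd.code
    FROM county c
    INNER JOIN code cd ON cd.name = c.name
    ORDER BY c.id
""").fetchall()

# LEFT OUTER JOIN: every row from the left table, NULL where the right has no match.
left_rows = conn.execute("""
    SELECT c.name, cd.code
    FROM county c
    LEFT OUTER JOIN code cd ON cd.name = c.name
    ORDER BY c.id
""").fetchall()

print(inner_rows)  # only the county that has a code row
print(left_rows)   # both counties; the missing code comes back as None (NULL)
```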
So I'd say you need something like this:
SELECT
c.ID, c.Name, ...(whatever other columns you want),
-- if there's no entry in `tblAddress`, then `a.Name` will be `NULL`
-- so just replace that `NULL` with your default value
ISNULL(a.Name, '*DEFAULT NAME*')
FROM
dbo.tblCounty c
INNER JOIN
dbo.tblCode cd ON c.Name = cd.Name
LEFT OUTER JOIN
dbo.tblAddress a ON cd.Code = a.Code
Update: OK, so I tried with your sample data - how about this query?
SELECT
c.ID, cd.Code,
a.StreetName
FROM
dbo.tblAdress a
LEFT OUTER JOIN
dbo.tblCode cd ON ISNULL(a.Code, 'None') = cd.Code
LEFT OUTER JOIN
dbo.tblCounty c ON c.NAME = cd.NAME
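
For anyone who wants to try the catch-all idea against the sample data, here is a rough sketch. It uses SQLite through Python's sqlite3 module instead of SQL Server, so ISNULL is written as COALESCE. The Contact values and the two extra rows (a NULL code and an unknown code '99Z') are assumptions added to exercise the default, and the fallback subquery for the default ID is one possible way to do it, not necessarily what the asker ended up using.

```python
import sqlite3

# Sample data modeled on the question; Contact values and the NULL / '99Z'
# rows are made up to exercise the default handling.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblCounty (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE tblCode   (Name TEXT, Code TEXT);
CREATE TABLE tblAdress (Code TEXT, Contact TEXT);
INSERT INTO tblCounty VALUES (1, '**None**'), (2, 'NYC'), (3, 'Buffalo');
INSERT INTO tblCode   VALUES ('**None**', '**None**'), ('NYC', '56A'), ('Buffalo', '75B');
INSERT INTO tblAdress VALUES ('**None**', 'Ann'), ('56A', 'Bob'), ('75B', 'Cy'),
                             ('56A', 'Dee'), (NULL, 'Eve'), ('99Z', 'Fay');
""")

rows = conn.execute("""
    SELECT COALESCE(c.ID,
                    (SELECT ID FROM tblCounty WHERE Name = '**None**')) AS CountyID,
           a.Contact
    FROM tblAdress a
    -- a NULL code is treated as the default code before the lookup
    LEFT JOIN tblCode   cd ON cd.Code = COALESCE(a.Code, '**None**')
    LEFT JOIN tblCounty c  ON c.Name  = cd.Name
    ORDER BY a.Contact
""").fetchall()

for county_id, contact in rows:
    print(county_id, contact)
```

Because the query starts from tblAdress and only uses LEFT JOINs, every address row survives, and the outer COALESCE catches codes that are missing from tblCode. The same SELECT could feed an INSERT INTO vendor (CountyID, Contact).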

Related

How to Check join value for null and lookup in the other tables

If the joined values of accomm_bk and type_bk are NULL, how can I look them up in other tables, say lookup_accomm_bk and lookup_type_bk?
Any help will be appreciated.
select accomm_bk,type_bk
from
staging.contract a
left join dim.accomm_dim b on (a.accomm_id)= b.accomm_hash
left join dim.type_dim c on (a.accomm_id)= c.type_hash
If the result is NULL, how can I look up staging.contract a against lookup_accomm_bk for column accomm_bk and lookup_type_bk for column type_bk to get the values?
Example
accomm_bk | type_bk
--------------------
NULL | NULL
You would need to add two more LEFT JOINs to your query to link the contract table to tables lookup_accomm_bk and lookup_type_bk.
Then use the COALESCE function to display the looked up values if they can't be found in accomm_dim and type_dim.
Here is a skeleton for the query (you need to define the proper ON clauses for the additional LEFT JOINs) :
select
COALESCE(b.accomm_bk, lb.accomm_bk),
COALESCE(c.type_bk, lc.type_bk)
from
staging.contract a
left join dim.accomm_dim b on (a.accomm_id)= b.accomm_hash
left join dim.type_dim c on (a.accomm_id)= c.type_hash
left join dim.lookup_accomm_bk lb on ...
left join dim.lookup_type_bk lc on ...
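
As a rough illustration of the COALESCE fallback, here is a sketch for one of the two columns. It runs on SQLite through Python's sqlite3 module, drops the staging and dim schema prefixes, and assumes the lookup table can be joined on an accomm_id column. That join key and the sample rows are hypothetical, since the real ON clauses are not given above.

```python
import sqlite3

# Hypothetical data; the lookup table's accomm_id key column is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract         (accomm_id TEXT);
CREATE TABLE accomm_dim       (accomm_hash TEXT, accomm_bk TEXT);
CREATE TABLE lookup_accomm_bk (accomm_id TEXT, accomm_bk TEXT);
INSERT INTO contract         VALUES ('h1'), ('h2'), ('h3');
INSERT INTO accomm_dim       VALUES ('h1', 'dim-1');
INSERT INTO lookup_accomm_bk VALUES ('h1', 'lookup-1'), ('h2', 'lookup-2');
""")

rows = conn.execute("""
    SELECT a.accomm_id,
           COALESCE(b.accomm_bk, lb.accomm_bk) AS accomm_bk  -- dim value first, lookup second
    FROM contract a
    LEFT JOIN accomm_dim       b  ON a.accomm_id = b.accomm_hash
    LEFT JOIN lookup_accomm_bk lb ON a.accomm_id = lb.accomm_id
    ORDER BY a.accomm_id
""").fetchall()

print(rows)
```

The dimension value wins when it exists, the lookup value fills in when it does not, and a row missing from both tables still comes back with NULL.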

WHERE clause and LEFT JOIN in SQL Server query

I have this query:
SELECT
EnrollmentID, MarketID
FROM
Contracts AS CO
LEFT JOIN
Customers AS C ON C.EnrollmentID = CO.BatchID AND MarketID = 'AB'
WHERE
C.EnrollmentID IS NULL
My question: is the MarketID = 'AB' condition in the LEFT JOIN still checked, given the WHERE condition?
The EnrollmentIDs and MarketIDs in my result are all NULL.
Note: the LEFT JOIN keyword returns all the rows from the left table (Contracts), even if there are no matches in the right table (Customers).
Now, suppose you select a right-table column and there is no matching data in the right table, like this:
SELECT CO.EnrollmentID, CO.MarketID ,C.Some_col
FROM Contracts AS CO
LEFT JOIN Customers AS C ON C.EnrollmentID = CO.BatchID
Here C.Some_col is NULL for every row that has no match in the right table. Your WHERE C.EnrollmentID IS NULL keeps only those unmatched rows, so if MarketID and EnrollmentID come from Customers, I think this is why you are getting NULL for both.
Hope this helps.
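
A small sketch of what happens, run on SQLite through Python's sqlite3 module. The rows are made up, and it assumes MarketID belongs to Customers, which the question does not state explicitly.

```python
import sqlite3

# Made-up rows; MarketID is assumed to live in Customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contracts (BatchID INTEGER);
CREATE TABLE Customers (EnrollmentID INTEGER, MarketID TEXT);
INSERT INTO Contracts VALUES (1), (2), (3);
INSERT INTO Customers VALUES (1, 'AB'), (2, 'CD');
""")

rows = conn.execute("""
    SELECT CO.BatchID, C.EnrollmentID, C.MarketID
    FROM Contracts AS CO
    LEFT JOIN Customers AS C
           ON C.EnrollmentID = CO.BatchID AND C.MarketID = 'AB'
    WHERE C.EnrollmentID IS NULL
    ORDER BY CO.BatchID
""").fetchall()

# Contract 1 has an 'AB' customer and is filtered out by the WHERE clause.
# Contract 2 has a customer, but not in market 'AB', so the join finds no match.
# Contract 3 has no customer at all.
print(rows)
```

So the MarketID = 'AB' condition is evaluated as part of the join, and the WHERE clause then keeps only contracts without an 'AB' customer. Every Customers column in the output is NULL by construction.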

Trouble with tsql not in query

I'm trying to find rows where a value in table 1 column A is not in table 2 column A.
This is the query...
SELECT contactsid
FROM contacts
WHERE (email1 NOT IN (SELECT email
FROM old_contact))
It returns 0 rows, which I know is incorrect. There are many rows in contacts.email1 that are not in old_contact.email
How should I be writing this query?
My guess is that old_contact.email contains a NULL value. When the subquery returns any NULL, NOT IN never evaluates to true, so no rows come back.
For this reason, NOT EXISTS is often a better choice:
SELECT contactsid
FROM contacts c
WHERE NOT EXISTS (SELECT 1
FROM old_contact oc
WHERE c.email1 = oc.email
) ;
You could also add where email is not null to the subquery. However, I find just using not exists is generally safer in case I forget that condition.
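
The NULL behavior is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 module with made-up email addresses; SQL Server treats NULL inside NOT IN the same way.

```python
import sqlite3

# Made-up rows; old_contact deliberately contains one NULL email.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts    (contactsid INTEGER PRIMARY KEY, email1 TEXT);
CREATE TABLE old_contact (email TEXT);
INSERT INTO contacts    VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO old_contact VALUES ('a@example.com'), (NULL);
""")

# NOT IN: comparing against the NULL yields "unknown", so no row qualifies.
not_in_rows = conn.execute("""
    SELECT contactsid
    FROM contacts
    WHERE email1 NOT IN (SELECT email FROM old_contact)
    ORDER BY contactsid
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so it does no harm.
not_exists_rows = conn.execute("""
    SELECT c.contactsid
    FROM contacts c
    WHERE NOT EXISTS (SELECT 1 FROM old_contact oc WHERE oc.email = c.email1)
    ORDER BY c.contactsid
""").fetchall()

print(not_in_rows)      # empty list
print(not_exists_rows)  # contacts 2 and 3
```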
Try:
SELECT contactsid
FROM Contacts a
LEFT JOIN old_contact b
ON a.email1 = b.email
WHERE b.email IS NULL
This will join Contacts to old_contact using a LEFT JOIN -- a type of join that, based on the join condition, returns all records from the left side (i.e. Contacts) even if no records exist on the right side. Then, the WHERE clause filters the results so that it returns only the records from the left side where the right side records don't exist.

sql server - how to modify values in a query statement?

I have a statement like this:
select lastname,firstname,email,floorid
from employee
where locationid=1
and (statusid=1 or statusid=3)
order by floorid,lastname,firstname,email
The problem is the column floorid. The result of this query is showing the id of the floors.
There is this table called floor (has like 30 rows), which has columns id and floornumber. The floorid (in above statement) values match the id of the table floor.
I want the above query to switch the floorid values into the associated values of the floornumber column in the floor table.
Can anyone show me how to do this please?
I am using Microsoft sql server 2008 r2.
I am new to sql and I need a clear and understandable method if possible.
select lastname,
firstname,
email,
floor.floornumber
from employee
inner join floor on floor.id = employee.floorid
where locationid = 1
and (statusid = 1 or statusid = 3)
order by floorid, lastname, firstname, email
You have to do a simple join that checks whether the floorid matches the id of your floor table. Then you use the floornumber from the floor table.
select a.lastname,a.firstname,a.email,b.floornumber
from employee a
join floor b on a.floorid = b.id
where a.locationid=1 and (a.statusid=1 or a.statusid=3)
order by a.floorid,a.lastname,a.firstname,a.email
You need to use a join.
This will join the two tables on a certain field.
This way you can SELECT columns from more than one table at a time.
When you join two tables you have to specify on which column you want to join them.
In your example, you'd have to do this:
from employee join floor on employee.floorid = floor.id
Since you are new to SQL, there are a few things you should know. In the other answers to this question, people use aliases instead of repeating the table name.
from employee a join floor b
means that from now on the table employee will be known as a and the table floor as b. This is really useful when you have a lot of joins to do.
Now let's say both tables have a column called name. In your select you have to say which table you want to pick the column name from. If you only write this
SELECT name from Employee a join floor b on a.id = b.id
the database engine won't know which table you want to get the column name from. You would have to specify it like this:
SELECT Employee.name from Employee join floor on Employee.id = floor.id
or, if you prefer, with aliases (once a table has an alias, you must use the alias instead of the table name):
SELECT a.name from Employee a join floor b on a.id = b.id
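
You can see the ambiguity error for yourself with a quick sketch. It runs on SQLite through Python's sqlite3 module, so the error text differs from SQL Server's, and the name column on floor and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical tables: both have a column called name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER, name TEXT, floorid INTEGER);
CREATE TABLE floor    (id INTEGER, name TEXT);
INSERT INTO employee VALUES (1, 'Alice', 10);
INSERT INTO floor    VALUES (10, 'Ground');
""")

# Unqualified name: the engine cannot tell which table is meant.
try:
    conn.execute("SELECT name FROM employee a JOIN floor b ON a.floorid = b.id")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualified with the aliases: both columns can be selected side by side.
rows = conn.execute("""
    SELECT a.name, b.name
    FROM employee a
    JOIN floor b ON a.floorid = b.id
""").fetchall()

print(error)
print(rows)
```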
Finally, there are many types of joins:
Inner join (what you are using, because simply typing JOIN means an inner join)
Left outer join
Right outer join
Self join
...
You should refer to this article about joins to learn how to use them correctly.
Hope this helps.

Filter a SQL Server table dynamically using multiple joins

I am trying to filter a single table (master) by the values in multiple other tables (filter1, filter2, filter3 ... filterN) using only joins.
I want the following rules to apply:
(A) If one or more rows exist in a filter table, then include only those rows from the master that match the values in the filter table.
(B) If no rows exist in a filter table, then ignore it and return all the rows from the master table.
(C) This solution should work for N filter tables in combination.
(D) Static SQL using JOIN syntax only, no Dynamic SQL.
I'm really trying to get rid of dynamic SQL wherever possible, and this is one of those places I truly think it's possible, but just can't quite figure it out. Note: I have solved this using Dynamic SQL already, and it was fairly easy, but not particularly efficient or elegant.
What I have tried:
Various INNER JOINS between master and filter tables - works for (A) but fails on (B) because the join removes all records from the master (left) side when the filter (right) side has no rows.
LEFT JOINS - Always returns all records from the master (left) side. This fails (A) when some filter tables have records and some do not.
What I really need:
It seems like what I need is to be able to INNER JOIN on each filter table that has 1 or more rows and LEFT JOIN (or not JOIN at all) on each filter table that is empty.
My question: How would I accomplish this without resorting to Dynamic SQL?
In SQL Server 2005+ you could try this:
WITH
filter1 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable1 f ON join_condition
),
filter2 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable2 f ON join_condition
),
…
SELECT m.*
FROM masterdata m
INNER JOIN filter1 f1 ON m.ID = f1.ID AND f1.HasMatched = f1.AllHasMatched
INNER JOIN filter2 f2 ON m.ID = f2.ID AND f2.HasMatched = f2.AllHasMatched
…
My understanding is, filter tables without any matches simply must not affect the resulting set. The output should only consist of those masterdata rows that have matched all the filters where matches have taken place.
SELECT *
FROM master_table mt
WHERE (0 = (select count(*) from filter_table_1)
OR mt.id IN (select id from filter_table_1))
AND (0 = (select count(*) from filter_table_2)
OR mt.id IN (select id from filter_table_2))
AND (0 = (select count(*) from filter_table_3)
OR mt.id IN (select id from filter_table_3))
Be warned that this could be inefficient in practice. Unless you have a specific reason to kill your existing, working, solution, I would keep it.
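
Here is a small sketch of that pattern with two filter tables, so the three situations (one filter active, both active, none active) can be checked. It runs on SQLite through Python's sqlite3 module, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table   (id INTEGER PRIMARY KEY);
CREATE TABLE filter_table_1 (id INTEGER);
CREATE TABLE filter_table_2 (id INTEGER);
INSERT INTO master_table   VALUES (1), (2), (3), (4);
INSERT INTO filter_table_1 VALUES (2), (3);   -- filter_table_2 starts empty
""")

# Each filter contributes one "(table is empty OR id is in table)" condition.
QUERY = """
SELECT mt.id
FROM master_table mt
WHERE (0 = (SELECT COUNT(*) FROM filter_table_1)
       OR mt.id IN (SELECT id FROM filter_table_1))
  AND (0 = (SELECT COUNT(*) FROM filter_table_2)
       OR mt.id IN (SELECT id FROM filter_table_2))
ORDER BY mt.id
"""

def matching_ids(connection):
    """Return the master ids that pass every non-empty filter."""
    return [row[0] for row in connection.execute(QUERY)]

one_filter_active = matching_ids(conn)      # filter 1 applies, empty filter 2 is ignored

conn.execute("INSERT INTO filter_table_2 VALUES (3), (4)")
both_filters_active = matching_ids(conn)    # only ids present in both filters

conn.execute("DELETE FROM filter_table_1")
conn.execute("DELETE FROM filter_table_2")
no_filters = matching_ids(conn)             # all filters empty: every master row

print(one_filter_active, both_filters_active, no_filters)
```

Adding another filter table means adding another AND (...) condition to the static query, so rule (C) holds for any fixed set of filter tables known in advance.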
Do an inner join to get the results for (A) only, and a left join to get the results for (B) only (you will have to put something like this in the WHERE clause: filterN.column is null). Then combine the results of the inner join and the left join with UNION.
A left outer join gives you NULLs where entries are missing for rows of the master table:
SELECT * FROM MASTER M
INNER JOIN APPRENTICE A ON A.PK = M.PK
LEFT OUTER JOIN [FOREIGN] F ON F.FK = M.PK
If MASTER has keys with no matching row in FOREIGN, you will get NULL columns where the matches are missing.
I think that is what you are looking for ...
Mike
First off, it is impossible to have "N number of Joins" or "N number of filters" without resorting to dynamic SQL. The SQL language was not designed for dynamic determination of the entities against which you are querying.
Second, one way to accomplish what you want (but would be built dynamically) would be something along the lines of:
Select ...
From master
Where Exists (
Select 1
From filter_1
Where filter_1 = master.col1
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_1
)
Intersect
Select 1
From filter_2
Where filter_2 = master.col2
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_2
)
...
Intersect
Select 1
From filter_N
Where filter_N = master.colN
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_N
)
)
I previously posted a (now deleted) answer based on wrong assumptions about your problem.
But I think you could go for a solution where you split your initial search problem into two steps: construct the set of ids from the master table, then select the data by joining on that set of ids. Here I naturally assume you have some kind of ID on your master table. The filter tables contain the filter values only. This can then be combined into the statement below, where each SELECT in the eligible subset provides a set of master ids; these are unioned to avoid duplicates, and that set of ids is joined to the table with the data.
SELECT * FROM tblData INNER JOIN
(
SELECT id FROM tblData td
INNER JOIN fa on fa.a = td.a
UNION
SELECT id FROM tblData td
INNER JOIN fb on fb.b = td.b
UNION
SELECT id FROM tblData td
INNER JOIN fc on fc.c = td.c
) eligible ON eligible.id = tblData.id
The test has been made against the tables and values shown below. These are just an appendix.
CREATE TABLE tblData (id int not null primary key identity(1,1), a varchar(40), b datetime, c int)
CREATE TABLE fa (a varchar(40) not null primary key)
CREATE TABLE fb (b datetime not null primary key)
CREATE TABLE fc (c int not null primary key)
Since you have filter tables, I am assuming that these tables are probably dynamically populated from a front-end. This would mean that you have these tables as #temp_table (or even a materialized table, doesn't matter really) in your script before filtering on the master data table.
Personally, I use the below code bit for filtering dynamically without using dynamic SQL.
SELECT *
FROM [masterdata] [m]
INNER JOIN
[filter_table_1] [f1]
ON
[m].[filter_column_1] = ISNULL(NULLIF([f1].[filter_column_1], ''), [m].[filter_column_1])
As you can see, the code neutralizes the JOIN condition when the column value in the filter table is blank: NULLIF turns the blank into NULL, and ISNULL then substitutes the master table's own column, so every master row matches. The catch is that you have to actively insert a blank value into the filter table whenever you have no filter records to restrict the master data with. This should work for all the cases you mentioned in your question.
I have found this bit of code to be faster in terms of performance.
Hope this helps. Please let me know in the comments.
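
For readers who want to test the blank-row trick from this last answer, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module, so ISNULL is written as IFNULL, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE masterdata     (id INTEGER PRIMARY KEY, filter_column_1 TEXT);
CREATE TABLE filter_table_1 (filter_column_1 TEXT);
INSERT INTO masterdata VALUES (1, 'red'), (2, 'blue'), (3, 'red');
""")

# IFNULL plays the role of T-SQL's ISNULL here.
QUERY = """
SELECT m.id
FROM masterdata m
INNER JOIN filter_table_1 f1
        ON m.filter_column_1 = IFNULL(NULLIF(f1.filter_column_1, ''), m.filter_column_1)
ORDER BY m.id
"""

def matching_ids(connection):
    """Return the master ids that survive the join."""
    return [row[0] for row in connection.execute(QUERY)]

conn.execute("INSERT INTO filter_table_1 VALUES ('red')")
with_filter = matching_ids(conn)   # a real filter value restricts the result

conn.execute("DELETE FROM filter_table_1")
conn.execute("INSERT INTO filter_table_1 VALUES ('')")
with_blank = matching_ids(conn)    # a single blank row lets every master row through

conn.execute("DELETE FROM filter_table_1")
empty_table = matching_ids(conn)   # a truly empty filter table returns nothing

print(with_filter, with_blank, empty_table)
```

The last case shows why the blank row has to be inserted explicitly: with a completely empty filter table the inner join removes everything.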