Last two joins cause duplicate rows - sql

Ok, so I have a query that is returning more rows than expected with repeating data. Here is my query:
SELECT AP.RECEIPTNUMBER
,AP.FOLDERRSN
,ABS(AP.PAYMENTAMOUNT)
,ABS(AP.PAYMENTAMOUNT - AP.AMOUNTAPPLIED)
,TO_CHAR(AP.PAYMENTDATE,'MM/DD/YYYY')
,F.REFERENCEFILE
,F.FOLDERTYPE
,VS.SUBDESC
,P.NAMEFIRST||' '||P.NAMELAST
,P.ORGANIZATIONNAME
,VAF.FEEDESC
,VAF.GLACCOUNTNUMBER
FROM ACCOUNTPAYMENT AP
INNER JOIN FOLDER F ON AP.FOLDERRSN = F.FOLDERRSN
INNER JOIN VALIDSUB VS ON F.SUBCODE = VS.SUBCODE
INNER JOIN FOLDERPEOPLE FP ON FP.FOLDERRSN = F.FOLDERRSN
INNER JOIN PEOPLE P ON FP.PEOPLERSN = P.PEOPLERSN
INNER JOIN ACCOUNTBILLFEE ABF ON F.FOLDERRSN = ABF.FOLDERRSN
INNER JOIN VALIDACCOUNTFEE VAF ON ABF.FEECODE = VAF.FEECODE
WHERE AP.NSFFLAG = 'Y'
AND F.FOLDERTYPE IN ('405B','405O')
Everything works fine until I add the bottom two Inner Joins. I'm basically trying to get all payments that had NSF. When I run the simple query:
SELECT *
FROM ACCOUNTPAYMENT
WHERE NSFFLAG = 'Y'
I get only 3 rows pertaining to 405B and 405O folders. So I'm only expecting 3 rows to be returned in the above query but I get 9 with information repeating in some columns. I need the exact feedesc and gl account number based on the fee code that can be found in both the Valid Account Fee and Account Bill Fee tables.
I can't post a picture of my output.
Note: when I run the query without the two bottom joins I get the expected output.
Can someone help me make my query more efficient? Thanks!
As requested, below are the results that my query is returning for vaf.feedesc and vaf.glaccountnumber columns:
Boiler Operator License Fee 2423809
Boiler Certificate of Operation without Manway - Revolving 2423813
Installers (Boiler License)/API Exam 2423807
Boiler Public Inspection/Certification (State or Insurance) 2423816
Boiler Certificate of Operation with Manway 2423801
Boiler Certificate of Operation without Manway 2423801
Boiler Certificate of Operation with Manway - Revolving 2423813
BPV Owner/User Program Fee 2423801
Installers (Boiler License)/API Exam Renewal 2423807

The cause is that at least one of the connections ACCOUNTBILLFEE-FOLDER or VALIDACCOUNTFEE-ACCOUNTBILLFEE is not one-to-one. It allows for one Folder to have many AccountBillFees or for one ValidAccountFee to have many AccountBillFees.
To find the cause of such a problem this is what I usually do:
Change the SELECT A, B, C part of your query to SELECT *.
Reduce the results to one of the rows that is causing you trouble (by adding a WHERE ...). That is a single row without your last two joins and a few rows after you add those two joins.
Look at the result table from left to right. The first columns will probably show the same values for all rows. Once you see a difference between the values in a column, you know that the table of the column you are currently looking at is causing your "multiple row problem".
Now create a SELECT * statement that includes only the two tables joined together that cause multiple rows with the same WHERE ... you used above.
The result should give you a clear picture of the cause.
Once you know the reason for your problem you can think of a solution ;)

Try this if it helps then those tables have additional rows which are not relevant. If it doesn't then look at the results of the subqueries I have below to see what additional filters are needed
SELECT AP.RECEIPTNUMBER
,AP.FOLDERRSN
,ABS(AP.PAYMENTAMOUNT)
,ABS(AP.PAYMENTAMOUNT - AP.AMOUNTAPPLIED)
,TO_CHAR(AP.PAYMENTDATE,'MM/DD/YYYY')
,F.REFERENCEFILE
,F.FOLDERTYPE
,VS.SUBDESC
,P.NAMEFIRST||' '||P.NAMELAST
,P.ORGANIZATIONNAME
,VAF.FEEDESC
,VAF.GLACCOUNTNUMBER
FROM ACCOUNTPAYMENT AP
INNER JOIN FOLDER F ON AP.FOLDERRSN = F.FOLDERRSN
INNER JOIN VALIDSUB VS ON F.SUBCODE = VS.SUBCODE
INNER JOIN FOLDERPEOPLE FP ON FP.FOLDERRSN = F.FOLDERRSN
INNER JOIN PEOPLE P ON FP.PEOPLERSN = P.PEOPLERSN
INNER JOIN
(
SELECT DISTINCT ABF.FEECODE, ABF.FOLDERRSN
FROM ACCOUNTBILLFEE ABF
) ABF ON F.FOLDERRSN = ABF.FOLDERRSN
INNER JOIN
(
SELECT DISTINCT VAF.FEEDESC, VAF.GLACCOUNTNUMBER, VAF.FEECODE
FROM VALIDACCOUNTFEE VAF
) VAF ON ABF.FEECODE = VAF.FEECODE
WHERE AP.NSFFLAG = 'Y'
AND F.FOLDERTYPE IN ('405B','405O')

The data for those last two tables is different in different records in the one to many relationship. Since distinct did not fix the problem, then you have to accept that 9 records is the correct return because you are returning the fields that are different or you have to determine which of the multiple records you don't want returned based on business rules that must come from someone in your company not us.
I don't think you fully understand how SQl works as 9 records is exactly what I would have expected given the information you gave in the question. The following are some queries that show how joining in a one to many relationship can affect output and ways that you can adjust the query to get rid of the duplicated output.
Note that in some of the cases, the query cannot be adjusted to get rid of the output because of the columns you want returned. So even if some of the columns are repeated, if even one of the columns you want return has differnt records and you have no approriate business rules for which of them you want to see, you can't reduce the records set. Which rules you need are based on the type of data you are querying and what the rqeuirements are. This is not a question we can answer here, only your company knows whether a min or max value would be acceptable or if you need to add a where clause and if so what field to put it on and what values to use it to exclude. Those are business rules not SQL.
create table #temp (myid int , mydescription varchar(30))
insert into #temp(myid, mydescription)
values (1, 'test') , (2, 'test2')
create table #temp2 (myid int, myotherdescription varchar(30))
insert into #temp2(myid, myotherdescription)
values (1, 'othertest') , (1, 'othertest2'), (2, 'myothertest') , (1, 'othertest3')
select *
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t2.myid, t.mydescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select distinct t2.myid, t.mydescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t.myid, t.mydescription, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select distinct t.myid, t.mydescription, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
select t.myid, min(t2.myotherdescription)
from #temp t
join #temp2 t2 on t.myid = t2.myid
group by t.myid
select t.myid, t2.myotherdescription
from #temp t
join #temp2 t2 on t.myid = t2.myid
where t2.myid = 2

Related

SQL - Table Join To Compare NULL Values Where Join is NULL

I've been asked to basically write a report that displays data in two different databases and be able to see in either database if something is missing.
IE, the invoice number may exist in database1, but not in database2 and vice versa.
I've got the following query below but it only returns all the data from the second table, with NULL values for the first. I'd like to set it up to return the NULL Values in both, but I think the problem is because my join is on the values that can be NULL, so it won't return the values that exist in the first table and not the second.
Can someone step me through how to resolve an issue like this?
As far as I'm aware, I don't necessarily have any other tables to join unless I try to join more tables from each database.
Query:
Select TC.PO_Number, TC.Invoice_Date, TC.Invoice_, H.RefPoNum, H.InvoiceNum
From Table1 TC
RIGHT JOIN [SERVERNAME].[DBNAME].[TABLE2] H ON (TC.Invoice_ = H.InvoiceNum)
Where TC.Invoice_Date Between '2018-10-31' AND '2018-10-31'
AND H.Company Like 'COMPANY'
You can do what you want with a full join. Filtering is tricky with a full join, so I recommend subqueries:
select tc.PO_Number, tc.Invoice_Date, tc.Invoice_, h.RefPoNum, h.InvoiceNum
From (select tc.*
from Table1 tc
where tc.Invoice_Date Between '2018-10-31' AND '2018-10-31'
) tc full join
(select h.*
from [SERVERNAME].[DBNAME].[TABLE2] h
where h.Company Like 'COMPANY'
) h
on TC.Invoice_ = H.InvoiceNum;
Just make sure that the column you are comparing has the same data type and you can safely use this query below:
Server01 NOT IN Server02
select t1.InvoiceNumber from server01.dbo.Invoice t1
except
select t2.InvoiceNumber from server02.dbo.Invoice t2
Server02 NOT IN Server01
select t1.InvoiceNumber from server02.dbo.Invoice t1
except
select t2.InvoiceNumber from server01.dbo.Invoice t2
P.S.
While this may not be the exact query you are looking for, but this template may help.

SQL query not returning Null-value records

I am using SQL Server 2014 on a Windows 10 PC. I am sending SQL queries directly into Swiftpage’s Act! CRM system (via Topline Dash). I am trying to figure out how to get the query to give me records even when some of the records have certain Null values in the Opportunity_Name field.
I am using a series of Join statements in the query to connect 4 tables: History, Contacts, Opportunity, and Groups. History is positioned at the “center” of it all. They all have many-to-many relationships with each other, and are thus each linked by an intermediate table that sits “between” the main tables, like so:
History – Group_History – Group
History – Contact_History – Contact
History – Opportunity_History – Opportunity
The intermediate tables consist only of the PKs in each of the main tables. E.g. History_Group is only a listing of HistoryIDs and GroupIDs. Thus, any given History entry can have multiple Groups, and each Group has many Histories associated with it.
Here’s what the whole SQL statement looks like:
SELECT Group_Name, Opportunity_Name, Start_Date_Time, Contact.Contact, Contact.Company, History_Type, (SQRT(SQUARE(Duration))/60) AS Hours, Regarding, HistoryID
FROM HISTORY
JOIN Group_History
ON Group_History.HistoryID = History.HistoryID
JOIN "Group"
ON Group_History.GroupID = "Group".GroupID
JOIN Contact_History
ON Contact_History.HistoryID = History.HistoryID
JOIN Contact
ON Contact_History.ContactID = Contact.ContactID
JOIN Opportunity_History
ON Opportunity_History.HistoryID = History.HistoryID
JOIN Opportunity
ON Opportunity_History.OpportunityID = Opportunity.OpportunityID
WHERE
( Start_Date_Time >= ('2018/02/02') AND
Start_Date_Time <= ('2018/02/16') )
ORDER BY Group_NAME, START_DATE_TIME;
The problem is that when the Opportunity table is linked in, any record that has no Opportunity (i.e. a Null value) won’t show up. If you remove the Opportunity references in the Join statement, the listing will show all history events in the Date range just fine, the way I want it, whether or not they have an Opportunity associated with them.
I tried adding the following to the WHERE part of the statement, and it did not work.
AND ( ISNULL(Opportunity_Name, 'x') = 'x' OR
ISNULL(Opportunity_Name, 'x') <> 'x' )
I also tried changing the Opportunity_Name reference up in the SELECT part of the statement to read: ISNULL(Opportunity_Name, 'x') – this didn’t work either.
Can anyone suggest a way to get the listing to contain all records regardless of whether they have a Null value in the Opportunity Name or not? Many thanks!!!
I believe this is because a default JOIN statement discards unmatched rows from both tables. You can fix this by using LEFT JOIN.
Example:
CREATE TABLE dataframe (
A int,
B int
);
insert into dataframe (A,B) values
(1, null),
(null, 1)
select a.A from dataframe a
join dataframe b ON a.A = b.A
select a.A from dataframe a
left join dataframe b ON a.A = b.A
You can see that the first query returns only 1 record, while the second returns both.
SELECT Group_Name, Opportunity_Name, Start_Date_Time, Contact.Contact, Contact.Company, History_Type, (SQRT(SQUARE(Duration))/60) AS Hours, Regarding, HistoryID
FROM HISTORY
LEFT JOIN Group_History
ON Group_History.HistoryID = History.HistoryID
LEFT JOIN "Group"
ON Group_History.GroupID = "Group".GroupID
LEFT JOIN Contact_History
ON Contact_History.HistoryID = History.HistoryID
LEFT JOIN Contact
ON Contact_History.ContactID = Contact.ContactID
LEFT JOIN Opportunity_History
ON Opportunity_History.HistoryID = History.HistoryID
LEFT JOIN Opportunity
ON Opportunity_History.OpportunityID = Opportunity.OpportunityID
WHERE
( Start_Date_Time >= ('2018/02/02') AND
Start_Date_Time <= ('2018/02/16') )
ORDER BY Group_NAME, START_DATE_TIME;
You will want to make sure you are using a LEFT JOIN with the table Opportunity. This will keep records that do not relate to records in the Opportunity table.
Also, BE CAREFUL you do not filter records using the WHERE clause for the Opportunity table being LEFT JOINED. Include those filter conditions relating to Opportunity instead in the LEFT JOIN ... ON clause.

SQL, only if matching all foreign key values to return the record?

I have two tables
Table A
type_uid, allowed_type_uid
9,1
9,2
9,4
1,1
1,2
24,1
25,3
Table B
type_uid
1
2
From table A I need to return
9
1
Using a WHERE IN clause I can return
9
1
24
SELECT
TableA.type_uid
FROM
TableA
INNER JOIN
TableB
ON TableA.allowed_type_uid = TableB.type_uid
GROUP BY
TableA.type_uid
HAVING
COUNT(distinct TableB.type_uid) = (SELECT COUNT(distinct type_uid) FROM TableB)
Join the two tables togeter, so that you only have the records matching the types you are interested in.
Group the result set by TableA.type_uid.
Check that each group has the same number of allowed_type_uid values as exist in TableB.type_uid.
distinct is required only if there can be duplicate records in either table. If both tables are know to only have unique values, the distinct can be removed.
It should also be noted that as TableA grows in size, this type of query will quickly degrade in performance. This is because indexes are not actually much help here.
It can still be a useful structure, but not one where I'd recommend running the queries in real-time. Rather use it to create another persisted/cached result set, and use this only to refresh those results as/when needed.
Or a slightly cheaper version (resource wise):
SELECT
Data.type_uid
FROM
A AS Data
CROSS JOIN
B
LEFT JOIN
A
ON Data.type_uid = A.type_uid AND B.type_uid = A.allowed_type_uid
GROUP BY
Data.type_uid
HAVING
MIN(ISNULL(A.allowed_type_uid,-999)) != -999
Your explanation is not very clear. I think you want to get those type_uid's from table A where for all records in table B there is a matching A.Allowed_type_uid.
SELECT T2.type_uid
FROM (SELECT COUNT(*) as AllAllowedTypes FROM #B) as T1,
(SELECT #A.type_uid, COUNT(*) as AllowedTypes
FROM #A
INNER JOIN #B ON
#A.allowed_type_uid = #B.type_uid
GROUP BY #A.type_uid
) as T2
WHERE T1.AllAllowedTypes = T2.AllowedTypes
(Dems, you were faster than me :) )

Converting a nested sql where-in pattern to joins

I have a query that is returning the correct data to me, but being a developer rather than a DBA I'm wondering if there is any reason to convert it to joins rather than nested selects and if so, what it would look like.
My code currently is
select * from adjustments where store_id in (
select id from stores where original_id = (
select original_id from stores where name ='abcd'))
Any references to the better use of joins would be appreciated too.
Besides any likely performance improvements, I find following much easier to read.
SELECT *
FROM adjustments a
INNER JOIN stores s ON s.id = a.store_id
INNER JOIN stores s2 ON s2.original_id = s.original_id
WHERE s.name = 'abcd'
Test script showing my original fault in ommitting original_id
DECLARE #Adjustments TABLE (store_id INTEGER)
DECLARE #Stores TABLE (id INTEGER, name VARCHAR(32), original_id INTEGER)
INSERT INTO #Adjustments VALUES (1), (2), (3)
INSERT INTO #Stores VALUES (1, 'abcd', 1), (2, '2', 1), (3, '3', 1)
/*
OP's Original statement returns store_id's 1, 2 & 3
due to original_id being all the same
*/
SELECT * FROM #Adjustments WHERE store_id IN (
SELECT id FROM #Stores WHERE original_id = (
SELECT original_id FROM #Stores WHERE name ='abcd'))
/*
Faulty first attempt with removing original_id from the equation
only returns store_id 1
*/
SELECT a.store_id
FROM #Adjustments a
INNER JOIN #Stores s ON s.id = a.store_id
WHERE s.name = 'abcd'
If you would use joins, it would look like this:
select *
from adjustments
inner join stores on stores.id = adjustments.store_id
inner join stores as stores2 on stores2.original_id = stores.original_id
where stores2.name = 'abcd'
(Apparently you can omit the second SELECT on the stores table (I left it out of my query) because if I'm interpreting your table structure correctly,
select id from stores where original_id = (select original_id from stores where name ='abcd')
is the same as
select * from stores where name ='abcd'.)
--> edited my query back to the original form, thanks to Lieven for pointing out my mistake in his answer!
I prefer using joins, but for simple queries like that, there is normally no performance difference. SQL Server treats both queries the same internally.
If you want to be sure, you can look at the execution plan.
If you run both queries together, SQL Server will also tell you which query took more resources than the other (in percent).
A slightly different approach:
select * from adjustments a where exists
(select null from stores s1, stores s2
where a.store_id = s1.id and s1.original_id = s2.original_id and s2.name ='abcd')
As say Microsoft here:
Many Transact-SQL statements that include subqueries can be
alternatively formulated as joins. Other questions can be posed only
with subqueries. In Transact-SQL, there is usually no performance
difference between a statement that includes a subquery and a
semantically equivalent version that does not. However, in some cases
where existence must be checked, a join yields better performance.
Otherwise, the nested query must be processed for each result of the
outer query to ensure elimination of duplicates. In such cases, a join
approach would yield better results.
Your case is exactly when Join and subquery gives the same performance.
Example when subquery can not be converted to "simple" JOIN:
select Country,TR_Country.Name as Country_Translated_Name,TR_Country.Language_Code
from Country
JOIN TR_Country ON Country.Country=Tr_Country.Country
where country =
(select top 1 country
from Northwind.dbo.Customers C
join
Northwind.dbo.Orders O
on C.CustomerId = O.CustomerID
group by country
order by count(*))
As you can see, every country can have different name translations so we can not just join and count records (in that case, countries with larger quantities of translations will have more record counts)
Of cource, you can can transform this example to:
JOIN with derived table
CTE
but it is an other tale-)

Filter a SQL Server table dynamically using multiple joins

I am trying to filter a single table (master) by the values in multiple other tables (filter1, filter2, filter3 ... filterN) using only joins.
I want the following rules to apply:
(A) If one or more rows exist in a filter table, then include only those rows from the master that match the values in the filter table.
(B) If no rows exist in a filter table, then ignore it and return all the rows from the master table.
(C) This solution should work for N filter tables in combination.
(D) Static SQL using JOIN syntax only, no Dynamic SQL.
I'm really trying to get rid of dynamic SQL wherever possible, and this is one of those places I truly think it's possible, but just can't quite figure it out. Note: I have solved this using Dynamic SQL already, and it was fairly easy, but not particularly efficient or elegant.
What I have tried:
Various INNER JOINS between master and filter tables - works for (A) but fails on (B) because the join removes all records from the master (left) side when the filter (right) side has no rows.
LEFT JOINS - Always returns all records from the master (left) side. This fails (A) when some filter tables have records and some do not.
What I really need:
It seems like what I need is to be able to INNER JOIN on each filter table that has 1 or more rows and LEFT JOIN (or not JOIN at all) on each filter table that is empty.
My question: How would I accomplish this without resorting to Dynamic SQL?
In SQL Server 2005+ you could try this:
WITH
filter1 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable1 f ON join_condition
),
filter2 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable2 f ON join_condition
),
…
SELECT m.*
FROM masterdata m
INNER JOIN filter1 f1 ON m.ID = f1.ID AND f1.HasMatched = f1.AllHasMatched
INNER JOIN filter2 f2 ON m.ID = f2.ID AND f2.HasMatched = f2.AllHasMatched
…
My understanding is, filter tables without any matches simply must not affect the resulting set. The output should only consist of those masterdata rows that have matched all the filters where matches have taken place.
SELECT *
FROM master_table mt
WHERE (0 = (select count(*) from filter_table_1)
OR mt.id IN (select id from filter_table_1)
AND (0 = (select count(*) from filter_table_2)
OR mt.id IN (select id from filter_table_2)
AND (0 = (select count(*) from filter_table_3)
OR mt.id IN (select id from filter_table_3)
Be warned that this could be inefficient in practice. Unless you have a specific reason to kill your existing, working, solution, I would keep it.
Do inner join to get results for (A) only and do left join to get results for (B) only (you will have to put something like this in the where clause: filterN.column is null) combine results from inner join and left join with UNION.
Left Outer Join - gives you the MISSING entries in master table ....
SELECT * FROM MASTER M
INNER JOIN APPRENTICE A ON A.PK = M.PK
LEFT OUTER JOIN FOREIGN F ON F.FK = M.PK
If FOREIGN has keys that is not a part of MASTER you will have "null columns" where the slots are missing
I think that is what you looking for ...
Mike
First off, it is impossible to have "N number of Joins" or "N number of filters" without resorting to dynamic SQL. The SQL language was not designed for dynamic determination of the entities against which you are querying.
Second, one way to accomplish what you want (but would be built dynamically) would be something along the lines of:
Select ...
From master
Where Exists (
Select 1
From filter_1
Where filter_1 = master.col1
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_1
)
Intersect
Select 1
From filter_2
Where filter_2 = master.col2
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_2
)
...
Intersect
Select 1
From filter_N
Where filter_N = master.colN
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_N
)
)
I have previously posted a - now deleted - answer based on wrong assumptions on you problems.
But I think you could go for a solution where you split your initial search problem into a matter of constructing the set of ids from the master table, and then select the data joining on that set of ids. Here I naturally assume you have a kind of ID on your master table. The filter tables contains the filter values only. This could then be combined into the statement below, where each SELECT in the eligble subset provides a set of master ids, these are unioned to avoid duplicates and that set of ids are joined to the table with data.
SELECT * FROM tblData INNER JOIN
(
SELECT id FROM tblData td
INNER JOIN fa on fa.a = td.a
UNION
SELECT id FROM tblData td
INNER JOIN fb on fb.b = td.b
UNION
SELECT id FROM tblData td
INNER JOIN fc on fc.c = td.c
) eligible ON eligible.id = tblData.id
The test has been made against the tables and values shown below. These are just an appendix.
CREATE TABLE tblData (id int not null primary key identity(1,1), a varchar(40), b datetime, c int)
CREATE TABLE fa (a varchar(40) not null primary key)
CREATE TABLE fb (b datetime not null primary key)
CREATE TABLE fc (c int not null primary key)
Since you have filter tables, I am assuming that these tables are probably dynamically populated from a front-end. This would mean that you have these tables as #temp_table (or even a materialized table, doesn't matter really) in your script before filtering on the master data table.
Personally, I use the below code bit for filtering dynamically without using dynamic SQL.
SELECT *
FROM [masterdata] [m]
INNER JOIN
[filter_table_1] [f1]
ON
[m].[filter_column_1] = ISNULL(NULLIF([f1].[filter_column_1], ''), [m].[filter_column_1])
As you can see, the code NULLs the JOIN condition if the column value is a blank record in the filter table. However, the gist in this is that you will have to actively populate the column value to blank in case you do not have any filter records on which you want to curtail the total set of the master data. Once you have populated the filter table with a blank, the JOIN condition NULLs in those cases and instead joins on itself with the same column from the master data table. This should work for all the cases you mentioned in your question.
I have found this bit of code to be faster in terms of performance.
Hope this helps. Please let me know in the comments.