Left Outer Join Throws NULL Value in the select statement - sql

I am getting null values if I use a left outer join even after mentioning t.contractid=111111 in the select statement. Please let me know how to resolve this issue.
select t.contractid,r.contractid,t.batchno
from tableA t
left join tableB r
on t.contractid=r.contractid and t.PayGrp=r.PayGrp and t.PriNo=r.PriNo
where t.contractid=111111 and t.PayGrp=0 and t.batchno=201701 and t.prino=3
and r.contractid is null
Sample Output:
null null 201701
null null 201701
null null 201701
null null 201701
null null 201701

Judging from your query alone, the results you are reporting should be impossible, which means something unusual is going on.
Why? The conditions r.contractid is null, r.contractid = t.contractid, and t.contractid = 111111 cannot all evaluate to true at once. Yet your query asks the database exactly that question, and it is returning results.
The only way to explain the result is a custom data type on tableA.contractid and tableB.contractid whose comparison treats a null value as equal to 111111. Similarly, t.PayGrp would need a data type for which comparing null to 0 evaluates to true. Of course, only the data definitions of the tables could tell us this.
That being said, I have experienced weird results when joining on columns where one side or the other was null. So before considering the other possibilities, try wrapping your join and where conditions in IsNull(), like this:
select t.contractid, r.contractid, t.batchno
from tableA t left join
tableB r
on IsNull(t.contractid,0) = IsNull(r.contractid,0) and
IsNull(t.PayGrp,0) = IsNull(r.PayGrp,0) and
IsNull(t.PriNo,0) = IsNull(r.PriNo, 0)
where IsNull(t.contractid,0) = 111111 and IsNull(t.PayGrp,0) = 0 and
IsNull(t.batchno,0) = 201701 and IsNull(t.prino,0) = 3 and
r.contractid is null;
Unless the columns use some custom data type, if the query above still produces the same result then you are dealing with some kind of system error, such as a corrupt database (or a corrupt memory cache of that database) or, even less likely, a hardware error. First try rebooting the database server and the client machine performing the query (if they are not the same). If the error still exists after a reboot, try moving the data to a different database or database server and running the same query there. At that point you will know whether you have a corrupt database or are experiencing a system error on your database server.

You are restricting your query by
r.contractid is null
therefore it is ONLY returning NULL records - the result you are seeing is expected.

This is your query:
select t.contractid, r.contractid, t.batchno
from tableA t left join
tableB r
on t.contractid = r.contractid and
t.PayGrp = r.PayGrp and
t.PriNo = r.PriNo
where t.contractid = 111111 and t.PayGrp = 0 and
t.batchno = 201701 and t.prino = 3 and
r.contractid is null;
The result set will always begin with 111111, NULL, based on the WHERE clause. t.contractid has to have a value. Are you sure the table aliases are correct? That the where clause is correct?
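To make the expected behaviour concrete, here is a minimal sketch that runs the same query against made-up data. It uses Python's built-in sqlite3 module as a stand-in for the real database server, and the table definitions and sample rows are assumptions, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (contractid INTEGER, PayGrp INTEGER, PriNo INTEGER, batchno INTEGER);
    CREATE TABLE tableB (contractid INTEGER, PayGrp INTEGER, PriNo INTEGER);
    -- Made-up sample data: the tableB row differs in PriNo, so it does not match.
    INSERT INTO tableA VALUES (111111, 0, 3, 201701);
    INSERT INTO tableB VALUES (111111, 0, 4);
""")

QUERY = """
    SELECT t.contractid, r.contractid, t.batchno
    FROM tableA t
    LEFT JOIN tableB r
      ON t.contractid = r.contractid AND t.PayGrp = r.PayGrp AND t.PriNo = r.PriNo
    WHERE t.contractid = 111111 AND t.PayGrp = 0
      AND t.batchno = 201701 AND t.prino = 3
      AND r.contractid IS NULL
"""

# No matching row in tableB: the tableA row survives, with NULL only on the r side.
unmatched = conn.execute(QUERY).fetchall()
print(unmatched)

# Once a matching row exists, "r.contractid IS NULL" filters the row out.
conn.execute("INSERT INTO tableB VALUES (111111, 0, 3)")
matched = conn.execute(QUERY).fetchall()
print(matched)
```

In this sketch the first column is always 111111 and only the second column is NULL, which is what the WHERE clause implies. A NULL in the first column, as in the reported output, does not occur here.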

Related

How did this old SQL query work without a join in the subquery

Here is the T-SQL. The code has been around for years and it was handed to me to migrate to another SQL server. It apparently works, but I don't know why. The execution plan doesn't show any predicates being used, so how does it know which rows to exclude? If I run the subquery on its own, I get 1146 rows with the value 1.
SELECT EM.PERSON_ID
FROM EMP_BEN_ELECTS EBE, EMPLOYEE_MAP EM
WHERE EBE.BW_ID = EM.BW_ID
AND CHANGE_BENEFIT_EVENT_DATE IS NULL
AND OPTION_ID <> 'WAIVE'
AND NOT EXISTS (SELECT 1 FROM EMPLOYEE_BILLING WHERE BILLING_GROUPING_ID
IN('HWMONTHLY','HWINDIVIDUALBILLED') AND END_DATE IS NULL)
I plan to rewrite it without the subquery and use a left join instead, but it boggles me that this works. The only time I have seen code written like this, without the join being qualified, was in code coming from an Oracle developer.
The subquery of NOT EXISTS is not used to return any (of the 1146) rows.
It is used to check if at least 1 row exists in the table EMPLOYEE_BILLING with the specified conditions:
BILLING_GROUPING_ID IN('HWMONTHLY','HWINDIVIDUALBILLED') AND END_DATE IS NULL
If there is such a row, then NOT EXISTS returns FALSE, and since all the conditions in the WHERE clause of the main query are linked with the operator AND, the final result is WHERE FALSE, so the query returns no rows.
Don't rewrite the query with a LEFT join.
EXISTS and NOT EXISTS usually provide better performance than joins.
What you must change, though, is the archaic comma join syntax.
Change it to a proper INNER join with an ON clause:
SELECT EM.PERSON_ID
FROM EMP_BEN_ELECTS EBE INNER JOIN EMPLOYEE_MAP EM
ON EBE.BW_ID = EM.BW_ID
WHERE CHANGE_BENEFIT_EVENT_DATE IS NULL
AND OPTION_ID <> 'WAIVE'
AND NOT EXISTS (
SELECT 1
FROM EMPLOYEE_BILLING
WHERE BILLING_GROUPING_ID IN('HWMONTHLY','HWINDIVIDUALBILLED')
AND END_DATE IS NULL
)
Also, you should qualify all the column names with the table's name/alias they belong to (CHANGE_BENEFIT_EVENT_DATE and OPTION_ID which I left unqualified because I don't know which alias to use).
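The all-or-nothing effect of the uncorrelated NOT EXISTS can be seen in a small sketch. It uses Python's built-in sqlite3 module in place of SQL Server, and the column layout and sample rows are assumptions. In particular, placing CHANGE_BENEFIT_EVENT_DATE and OPTION_ID in EMP_BEN_ELECTS is a guess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: which table owns which column is an assumption.
    CREATE TABLE EMPLOYEE_MAP (BW_ID INTEGER, PERSON_ID INTEGER);
    CREATE TABLE EMP_BEN_ELECTS (BW_ID INTEGER, OPTION_ID TEXT, CHANGE_BENEFIT_EVENT_DATE TEXT);
    CREATE TABLE EMPLOYEE_BILLING (BILLING_GROUPING_ID TEXT, END_DATE TEXT);
    INSERT INTO EMPLOYEE_MAP VALUES (1, 100), (2, 200);
    INSERT INTO EMP_BEN_ELECTS VALUES (1, 'PPO', NULL), (2, 'WAIVE', NULL);
""")

QUERY = """
    SELECT EM.PERSON_ID
    FROM EMP_BEN_ELECTS EBE INNER JOIN EMPLOYEE_MAP EM
      ON EBE.BW_ID = EM.BW_ID
    WHERE EBE.CHANGE_BENEFIT_EVENT_DATE IS NULL
      AND EBE.OPTION_ID <> 'WAIVE'
      AND NOT EXISTS (
          SELECT 1
          FROM EMPLOYEE_BILLING
          WHERE BILLING_GROUPING_ID IN ('HWMONTHLY', 'HWINDIVIDUALBILLED')
            AND END_DATE IS NULL
      )
"""

# EMPLOYEE_BILLING is empty, so NOT EXISTS is true for every outer row.
before = conn.execute(QUERY).fetchall()
print(before)

# One open billing row makes NOT EXISTS false for every outer row at once.
conn.execute("INSERT INTO EMPLOYEE_BILLING VALUES ('HWMONTHLY', NULL)")
after = conn.execute(QUERY).fetchall()
print(after)
```

Because the subquery never references EBE or EM, a single qualifying billing row empties the whole result, regardless of which employee it belongs to.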

SQL Not In vs Left Join

I ran into a problem today that I couldn't quite understand, so I was hoping for some outside knowledge. I was trying to find the number of items in a table where their id isn't referenced in another. I ran two different queries and seem to have conflicting results.
select count(*)
from TableA
where ID not in (select aID from TableB)
returns 0
select count(*)
from TableA a
left join TableB b on b.aID = a.ID
where b.aID is null
returns a few thousand.
All IDs in both TableA and TableB are unique. An ID from TableA never shows up in the aID column from TableB more than once. To me, it seems like I am querying the same thing but receiving different results. Where am I going wrong?
Do not use not in with a subquery. If any value in the subquery is NULL, then all rows are filtered out. These are the rules of how NULL is defined in SQL. The LEFT JOIN is correct.
The reason is that NULL means an unknown value. Almost any comparison with NULL returns NULL, which is treated as false. So, the only possibilities with NOT IN with NULL are that an element matches what you are looking for -- and the expression returns false -- or an element is NULL -- and the expression returns NULL which is treated as false.
I usually advise replacing the NOT IN with NOT EXISTS:
select count(*)
from TableA a
where not exists (select 1 from TableB b where b.aID = a.ID);
The LEFT JOIN performs correctly and usually has good performance.
We should always use the EXISTS operator if the columns involved are nullable. EXISTS is also often faster than IN.
Using IN/NOT IN might produce an inferior plan and can also lead to misleading results if a null value is inserted in the table, just like in your case.
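The difference between the three forms is easy to reproduce. The sketch below uses Python's built-in sqlite3 module and a made-up data set in which TableB contains one NULL aID. The NULL handling shown is standard SQL behaviour, but the tables and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER);
    CREATE TABLE TableB (aID INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3);
    -- Made-up data: a single NULL in the subquery column is enough.
    INSERT INTO TableB VALUES (1), (NULL);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]

not_in = count(
    "SELECT COUNT(*) FROM TableA WHERE ID NOT IN (SELECT aID FROM TableB)"
)
left_join = count(
    "SELECT COUNT(*) FROM TableA a "
    "LEFT JOIN TableB b ON b.aID = a.ID WHERE b.aID IS NULL"
)
not_exists = count(
    "SELECT COUNT(*) FROM TableA a "
    "WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.aID = a.ID)"
)

print(not_in, left_join, not_exists)
```

With this data, NOT IN counts 0 rows while the LEFT JOIN and NOT EXISTS forms both count the two unreferenced IDs.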

Parameters table and conditions

In my current project there is a query where a set of parameters is given and I need to check those parameters against another table. Each of these parameters can be NULL and in this case has to be ignored. What I currently do is the following:
SELECT t.col1,
t.col2,
t.col3,
t.col4,
t.col5,
t.col6,
t.col7,
t.col8
FROM table1 t
INNER JOIN #parameters p ON (p.col1 IS NULL OR p.col1 = t.col1)
AND (p.col2 IS NULL OR p.col2 = t.col2)
AND (p.col3 IS NULL OR p.col3 = t.col3)
AND (p.col4 IS NULL OR p.col4 = t.col4)
AND (p.col5 IS NULL OR p.col5 = t.col5)
AND (p.col6 IS NULL OR p.col6 = t.col6)
AND (p.col7 IS NULL OR p.col7 >= t.col7)
AND (p.col8 IS NULL OR p.col8 <= t.col8)
This means that if the column in the parameters table is NULL it will be ignored; otherwise it will be compared to the corresponding column in table1. This works but unfortunately is VERY slow. Does anybody know a better solution (other than concatenating a string query)?
It seems like you don't have any real criteria that could be used to limit the data in your table, and that kind of structure rarely performs well. As far as I know, there's not much you can do to improve it.
Is any of these columns included in the parameters often (ideally for all rows) and selective enough to limit the data a lot? If so, you could use union to do something like this:
SELECT ...
FROM table1 t
INNER JOIN #parameters p ON p.col1 = t.col1 ...
union
SELECT ...
FROM table1 t
INNER JOIN #parameters p ... where p.col1 is NULL
If you're lucky something like that might work.
The other option that comes to mind is to iterate over the rows in the #parameters table, which is probably what you meant by string concatenation. You could either build a single dynamic SQL statement that combines the rows with OR clauses or UNION, or create a temp table (maybe with an ignore dup key index) and build and run a dynamic insert statement for each row in the #parameters table.
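As a rough illustration of the dynamic SQL idea, the sketch below builds a WHERE clause from only the non-NULL parameters of one parameter row. It is written in Python with the built-in sqlite3 module. The helper name build_query, the reduced column set, and the sample data are all hypothetical, and a real T-SQL version would more likely use sp_executesql.

```python
import sqlite3

# Fixed whitelist: parameter name -> predicate on table1 (never built from user input).
PREDICATES = {
    "col1": "t.col1 = ?",
    "col2": "t.col2 = ?",
    "col7": "t.col7 <= ?",  # original condition: p.col7 >= t.col7
    "col8": "t.col8 >= ?",  # original condition: p.col8 <= t.col8
}

def build_query(param_row):
    """Return (sql, args) with one predicate per non-NULL parameter."""
    clauses, args = [], []
    for name, predicate in PREDICATES.items():
        value = param_row.get(name)
        if value is not None:
            clauses.append(predicate)
            args.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    sql = f"SELECT t.col1, t.col2, t.col7, t.col8 FROM table1 t WHERE {where}"
    return sql, args

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col7 INTEGER, col8 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, "a", 10, 50), (2, "a", 20, 60), (3, "b", 30, 70)],
)

# col1 and col8 are NULL in this parameter row, so they add no predicate.
sql, args = build_query({"col1": None, "col2": "a", "col7": 15, "col8": None})
rows = conn.execute(sql, args).fetchall()
print(sql)
print(rows)
```

Each generated statement contains only plain comparisons, which gives the optimizer a chance to use indexes. The sketch handles one parameter row at a time, so with several rows the results would still need to be combined, for example with UNION as suggested above.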

Using distinct keyword returns invalid rows in linked server

I am getting some strange results from a query in SQL server that I can't work out.
The query is
select
ref5
from
t_wholesale_history
left outer join
AP21..AP21.REFTBL on
reftbl.tblname = 'SOrder'
and reftbl.rg_ord = 5
left outer join
AP21..AP21.REFCODE on
refcode.code = t_wholesale_history.ref5
and refcode.rgidx = reftbl.rgidx
and refcode.active = 1
where
t_wholesale_history.ref5 is not null
and refcode.rcidx is null
which looks in a remote database and lists all records where REF5 doesn't exist. The above query returns 0 rows (which is correct).
Now, if I change the select ref5 to a select distinct ref5, I get 5 rows!
Why does adding the word distinct return rows? The rows it returns are not even the correct rows, so this doesn't make sense to me.
Unfortunately I can't provide a fiddle of this due to the remote server complication, but I am guessing the fact I am getting these results has something to do with the remote server.
Any ideas?

Setting the ID in a fact table from a dimension table

In my dimension table for abandoned calls I have ID 1 with code NO and ID 2 with code YES.
I want to load these IDs into the fact table, based on whether or not the call was abandoned, using a join.
However, the problem I'm having is that the Abandoned value in my database is NULL for NO and 1 for YES.
So when I join
INNER JOIN datamartend.dbo.Abandoned_Call_Dim
ON incoming_measure.Abandoned = Abandoned_Call_Dim.abandoned_code
It's pulling no results?
Any ideas around this?
Basically what is needed is:
I want the abandoned ID from the abandoned dimension to be 1 if the abandoned value in the measure is null, and 2 if it is not null.
Thanks
You can use a CASE WHEN clause to get around this (or ISNULL, but case when is more portable across different DB engines)
INNER JOIN datamartend.dbo.Abandoned_Call_Dim
ON case when incoming_measure.Abandoned is null then '0'
else incoming_measure.Abandoned end
= case when Abandoned_Call_Dim.abandoned_code is null then '0'
else Abandoned_Call_Dim.abandoned_code end
This will replace nulls with 0. As long as you don't have a 0 code, you should be fine. If you do, try -1, or some other value you know is not in the possible set of codes.
Another thing to do if you have an unknown set of codes would be to do the join and add:
OR (incoming_measure.Abandoned is null and Abandoned_Call_Dim.abandoned_code is null)
Which doesn't technically join - it cross joins the null records (and as long as there's only one null that matters on the abandoned call dim, you're fine).
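The effect of the extra OR condition can be checked with a small sketch using Python's built-in sqlite3 module. The sample rows are made up, and it assumes, as this answer does, that the dimension row for "not abandoned" carries a NULL abandoned_code, which may not match the asker's actual table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incoming_measure (call_id INTEGER, Abandoned INTEGER);
    CREATE TABLE Abandoned_Call_Dim (abandoned_id INTEGER, abandoned_code INTEGER);
    -- Assumed data: NULL means "not abandoned" on both sides.
    INSERT INTO incoming_measure VALUES (10, NULL), (11, 1);
    INSERT INTO Abandoned_Call_Dim VALUES (1, NULL), (2, 1);
""")

# Plain equality: NULL = NULL is not true, so call 10 finds no dimension row.
plain = conn.execute("""
    SELECT m.call_id, d.abandoned_id
    FROM incoming_measure m
    INNER JOIN Abandoned_Call_Dim d
      ON m.Abandoned = d.abandoned_code
    ORDER BY m.call_id
""").fetchall()

# Equality plus an explicit "both sides are NULL" branch.
null_safe = conn.execute("""
    SELECT m.call_id, d.abandoned_id
    FROM incoming_measure m
    INNER JOIN Abandoned_Call_Dim d
      ON m.Abandoned = d.abandoned_code
      OR (m.Abandoned IS NULL AND d.abandoned_code IS NULL)
    ORDER BY m.call_id
""").fetchall()

print(plain)
print(null_safe)
```

With the OR branch, the NULL measure row pairs with the single NULL dimension row. If the dimension held more than one NULL code, each NULL measure row would match all of them.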
Can you check whether you could apply the Decode function to the ID before doing the join?
Decode(value) = joining column
or try using
COALESCE(REPLACE(COL, VAL_TO_B_REPLACE_IF_NOT_NULL), VALUE_TO_REPLCE_WHEN_NULL)