Oracle "NOT IN" not returning correct result? - sql

I'm comparing two tables that share unique values between each other using the NOT IN operator in Oracle, but I'm getting an unexpected result.
select count(distinct CHARGING_ID) from BILLINGDB201908 where CDR_TYPE='GPRSO'
The output is 521254, which is the total number of unique charging IDs in BILLINGDB201908.
Now I want to find the IDs in table BILLINGDB201908 that also exist in table CBS_CHRG_ID_AUG:
select count(distinct CHARGING_ID) from BILLINGDB201908 where CDR_TYPE='GPRSO'
AND charging_id IN (select CHARGINGID from CBS_CHRG_ID_AUG);
The result is 315567, so that many charging IDs exist in both BILLINGDB201908 and CBS_CHRG_ID_AUG.
Now I want to find the charging IDs that exist in BILLINGDB201908 but do not exist in CBS_CHRG_ID_AUG:
select count(distinct CHARGING_ID) from prmdb.CDR_TAPIN_201908#prmdb where CDR_TYPE='GPRSO'
AND charging_id NOT IN (select CHARGINGID from CBS_CHRG_ID_AUG);
The result is 0!? I should get exactly 205687, because 521254 - 315567 = 205687.

NOT IN returns no rows if any value from the subquery is NULL. Hence, I strongly, strongly recommend NOT EXISTS:
SELECT count(distinct CHARGING_ID)
FROM prmdb.CDR_TAPIN_201908#prmdb ct
WHERE CDR_TYPE = 'GPRSO' AND
NOT EXISTS (SELECT 1
FROM CBS_CHRG_ID_AUG ccia
WHERE ccia.CHARGINGID = ct.CHARGING_ID
);
I also recommend changing your first query to EXISTS. In fact, just don't use IN and NOT IN with subqueries, and you won't have this problem.
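The difference between NOT IN and NOT EXISTS can be reproduced on a tiny data set. The following is a minimal sketch using Python's built-in sqlite3 module rather than Oracle; the table names billing and cbs and their contents are made up for illustration, and it assumes SQLite's NULL handling for NOT IN matches Oracle's here (both follow standard three-valued logic).

```python
import sqlite3

# In-memory database with hypothetical stand-ins for the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE billing (charging_id INTEGER);
    CREATE TABLE cbs (chargingid INTEGER);
    INSERT INTO billing VALUES (1), (2), (3);
    INSERT INTO cbs VALUES (1), (NULL);  -- one NULL key in the subquery table
""")

# NOT IN: the NULL makes the predicate unknown for every non-matching row.
not_in_count = conn.execute("""
    SELECT COUNT(DISTINCT charging_id)
    FROM billing
    WHERE charging_id NOT IN (SELECT chargingid FROM cbs)
""").fetchone()[0]

# NOT EXISTS: the NULL row simply never matches, so it is ignored.
not_exists_count = conn.execute("""
    SELECT COUNT(DISTINCT b.charging_id)
    FROM billing b
    WHERE NOT EXISTS (SELECT 1 FROM cbs c WHERE c.chargingid = b.charging_id)
""").fetchone()[0]

print(not_in_count, not_exists_count)
```

With this data the NOT IN query counts nothing, while the NOT EXISTS query counts the two IDs (2 and 3) that are missing from cbs.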

The cause is most likely a row in CBS_CHRG_ID_AUG whose CHARGINGID is null.
Please try selecting the rows where CHARGINGID is null and comparing them with the rows where it is not null.

I would recommend not exists rather than not in; it is null-safe, and usually more efficient:
select count(distinct charging_id)
from billingdb201908 b
where
b.cdr_type = 'GPRSO'
and not exists (select 1 from cbs_chrg_id_aug a where a.chargingid = b.charging_id)

You can get this count using a LEFT OUTER JOIN.
SQL to count the charging IDs that exist in BILLINGDB201908 but do not exist in CBS_CHRG_ID_AUG:
select count(distinct CHARGING_ID)
from prmdb.CDR_TAPIN_201908#prmdb a
left join CBS_CHRG_ID_AUG b on a.CHARGING_ID = b.CHARGINGID
where a.CDR_TYPE='GPRSO' and b.CHARGINGID is null;

There are two dangers with not in when the subquery key may contain nulls:
If there actually is a null value, you may not get the result you were expecting (as you have found). The database is actually correct, even though nobody in the history of SQL has ever expected this result.
Even if all key values are populated, if it is possible for the key column to be null (if it is not defined as not null) then the database has to check in case there is a null value, so queries are limited to inefficient row by row filter operations, which can perform disastrously for large volumes. (This was true historically, although these days there is a Null-aware anti-join and so the performance issue may not be so disastrous.)
create table demo (id) as select 1 from dual;
select * from demo;
ID
----------
1
create table table_with_nulls (id) as (
select 2 from dual union all
select null from dual
);
select * from table_with_nulls;
ID
----------
2
select d.id
from demo d
where d.id not in
( select id from table_with_nulls );
no rows selected
select d.id
from demo d
where d.id not in
( select id from table_with_nulls
where id is not null );
ID
----------
1
The reason is that 1 <> null is null, not true. If you substitute a fixed list for the not in subquery, it would be:
select d.id
from demo d
where d.id not in (2, null);
which is really the same thing as
select d.id
from demo d
where d.id <> 2 and d.id <> null;
Obviously d.id <> null will never be true. This is why your not in query returned no rows.
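The three-valued logic behind this can be checked one expression at a time. This is a small sketch using Python's sqlite3 module as a stand-in for Oracle (an assumption: SQLite evaluates these comparisons the same way); the helper name scalar is hypothetical. SQL NULL comes back as Python None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(expr):
    """Evaluate a single SQL expression and return its value (NULL -> None)."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

neq_null = scalar("1 <> NULL")                 # unknown
expanded = scalar("1 <> 2 AND 1 <> NULL")      # true AND unknown = unknown
not_in_with_null = scalar("1 NOT IN (2, NULL)")
not_in_without_null = scalar("1 NOT IN (2)")

print(neq_null, expanded, not_in_with_null, not_in_without_null)
```

A WHERE clause keeps a row only when its condition is true, so a condition that evaluates to NULL filters the row out just as false would.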

Related

Checking 2 tables for identical IDs and returning 1/0 in column if ID in table 1 exists in table 2

I have 2 tables, 1 containing all reservation ids, and 1 containing reservation ids for livestream reservations. I am trying to write a query that checks to see if a reservation id exists in the livestream table, and returns '1' if true & '0' if false. I figure the best way to do this is with a case statement that returns my result if the reservation id exists in the livestream table, but I am running into issues. Is there a better way to do this?
with table_name as(
select
reservation_id
from all_reservations
)
select t.*,
case when exists(l.reservation_id)
then '1'
else '0' end as is_livestream
from livestream_reservations l
left join table name t
on l.reservation_id = t.reservation_id
So long as reservation_id shows up with at most one record in livestream_reservations, this will work for you:
select r.*,
case
when l.reservation_id is null then 0
else 1
end as is_livestream
from reservations r
left join livestream_reservations l
on l.reservation_id = r.reservation_id;
The case relies on the fact that a failure to join to livestream_reservations returns null in all columns from that table.
In case there may be more than one row with the same reservation_id in the livestream_reservations table, then you could do this:
with ls_count as (
select reservation_id, count(*) as count_livestream
from livestream_reservations
group by reservation_id
)
select r.*, coalesce(lc.count_livestream, 0) as count_livestream
from reservations r
left join ls_count lc on lc.reservation_id = r.reservation_id;
I would recommend exists and using booleans:
select r.*,
(exists (select 1 from livestream_reservations lr where lr.reservation_id = r.reservation_id)
) as is_livestream
from reservations r;
There is a good chance that this is faster than other solutions. More importantly, it avoids problems with duplicates in livestream_reservations.
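To see the duplicate issue concretely, here is a minimal sketch in Python's sqlite3 (not the asker's database; the sample rows are invented). Reservation 1 appears twice in livestream_reservations, so the plain LEFT JOIN repeats it while the EXISTS version returns one row per reservation. Note that SQLite returns 1/0 for EXISTS; other databases may return a boolean or may not allow EXISTS in the select list at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reservations (reservation_id INTEGER PRIMARY KEY);
    CREATE TABLE livestream_reservations (reservation_id INTEGER);
    INSERT INTO reservations VALUES (1), (2), (3);
    INSERT INTO livestream_reservations VALUES (1), (1), (3);  -- 1 is duplicated
""")

# LEFT JOIN flag: one output row per matching livestream row.
join_rows = conn.execute("""
    SELECT r.reservation_id,
           CASE WHEN l.reservation_id IS NULL THEN 0 ELSE 1 END AS is_livestream
    FROM reservations r
    LEFT JOIN livestream_reservations l ON l.reservation_id = r.reservation_id
    ORDER BY r.reservation_id
""").fetchall()

# EXISTS flag: always exactly one output row per reservation.
exists_rows = conn.execute("""
    SELECT r.reservation_id,
           EXISTS (SELECT 1 FROM livestream_reservations lr
                   WHERE lr.reservation_id = r.reservation_id) AS is_livestream
    FROM reservations r
    ORDER BY r.reservation_id
""").fetchall()

print(join_rows)
print(exists_rows)
```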

SQL Query – records within the SQL Select statement, but NOT in the table being queried

I have a large list of CustIDs that I need to query on to find if they are within the CUSTOMER table; I want the result to tell me which CustIDs ARE on the table and which CustIDs are NOT on the table.
I provided a short list below to give an idea of what I need to do.
Oracle database
Table: Customer
Primary Key: CustID
Scenario:
Customer table only has the following (2) CustID: ‘12345’, ‘56789’
Sql:
Select * from CUSTOMERS where CUSTID in ('12345', '56789', '01234');
I want the result to tell me that both ‘12345’ and ‘56789’ are in the table, AND that ‘01234’ is NOT.
select
v.CustID,
exists (select * from Customer where Customer.CustID = v.CustID)
from (values (12345), (56789), (01234)) v (CustID);
Results:
custid exists
12345 true
56789 true
1234 false
You need a left join or subquery for this. The precise syntax varies by database. Typical syntax is:
select i.custid,
(case when c.custid is not null then 1 else 0 end) as exists_flag
from (select '12345' as custid union all
select '56789' union all
select '01234'
) i left join
customers c
on c.custid = i.custid;
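The same left-join idea can be tried end to end. This is a hedged sketch in Python's sqlite3, not Oracle: the CTE name wanted is hypothetical, and the VALUES syntax shown is SQLite's (Oracle would typically need select ... from dual). The IDs are kept as strings so that the leading zero of '01234' survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid TEXT PRIMARY KEY);
    INSERT INTO customers VALUES ('12345'), ('56789');
""")

# Build the list of IDs to check as an inline table, then left join to customers.
rows = conn.execute("""
    WITH wanted(custid) AS (VALUES ('12345'), ('56789'), ('01234'))
    SELECT w.custid,
           CASE WHEN c.custid IS NOT NULL THEN 1 ELSE 0 END AS exists_flag
    FROM wanted w
    LEFT JOIN customers c ON c.custid = w.custid
    ORDER BY w.custid
""").fetchall()

print(rows)
```

Every ID from the list appears exactly once in the output, flagged 1 if it is in the table and 0 if it is not.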

Will this left join on same table ever return data?

In SQL Server, on a re-engineering project, I'm walking through some old sprocs, and I've come across this bit. I've hopefully captured the essence in this example:
Example Table
SELECT * FROM People
Id | Name
-------------------------
1 | Bob Slydell
2 | Jim Halpert
3 | Pamela Landy
4 | Bob Wiley
5 | Jim Hawkins
Example Query
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
Please disregard formatting, style, and query efficiency issues here. This example is merely an attempt to capture the exact essence of the real query I'm working with.
After looking over the real, more complex version of the query, I burned it down to this above, and I cannot for the life of me see how it would ever return any data. The LEFT JOIN should always exclude everything that was just selected because of the b.Name IS NULL check, right? (and it being the same table). If a row from People was found where b.Name IS NULL evals to true, then shouldn't that mean that data found in People a was never found? (impossible?)
Just to be very clear, I'm not looking for a "solution". The code is what it is. I'm merely trying to understand its behavior for the purpose of re-engineering it.
If this code indeed never returns results, then I'll conclude it was written incorrectly and use that knowledge during the re-engineering.
If there is a valid data scenario where it would/could return results, then that will be news to me and I'll have to go back to the books on SQL Joins! #DrivenCrazy
Yes. There are circumstances where this query will retrieve rows.
The query
SELECT a.*
FROM (
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.PName = b.PName
WHERE b.PName IS NULL;
is roughly (maybe even exactly) equivalent to...
select distinct Id, PName
from People
where Id > 3 and PName is null;
Why?
Tested it using this code (mysql).
create table People (Id int, PName varchar(50));
insert into People (Id, Pname)
values (1, 'Bob Slydell'),
(2, 'Jim Halpert'),
(3,'Pamela Landy'),
(4,'Bob Wiley'),
(5,'Jim Hawkins');
insert into People (Id, PName) values (6,null);
Now run the query. You get
6, Null
I don't know if your schema allows null Name.
What value can a.PName have such that a.PName = b.PName finds no match and b.PName is Null?
Well it's written right there. b.PName is Null.
Can we prove that there is no other case where a row is returned?
Suppose there is a value for (Id,PName) such that PName is not null and a row is returned.
In order to satisfy the condition...
where b.PName is null
...such a value must include a PName that does not match any PName in the People table.
All a.PName and all b.PName values are drawn from People.PName ...
So a.PName would have to fail to match itself.
The only scalar value in SQL that does not equal itself is Null.
Therefore if there are no rows with Null PName this query will not return a row.
That's my proposed casual proof.
This is very confusing code. So #DrivenCrazy is appropriate.
The meaning of the query is exactly "return people with id > 3 and a null as name", i.e. it may return data but only if there are null-values in the name:
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3 and PName is null
The proof for this is rather simple, if we consider the meaning of the left join condition ... LEFT JOIN People b ON a.PName = b.PName together with the (overall) condition where b.PName is null:
Generally, a condition where PName = PName is true if and only if PName is not null, and it has exactly the same meaning as where PName is not null. Hence, the left join will match only tuples where pname is not null, but any matching row will subsequently be filtered out by the overall condition where pname is null.
Hence, the left join cannot introduce any new rows in the query, and it cannot reduce the set of rows of the left hand side (as a left join never does). So the left join is superfluous, and the only effective condition is where PName is null.
LEFT JOIN ON returns the rows that INNER JOIN ON returns plus unmatched rows of the left table extended by NULL for the right table columns. If the ON condition does not allow a matched row to have NULL in some column (like b.NAME here being equal to something) then the only NULLs in that column in the result are from unmatched left hand rows. So keeping rows with NULL for that column as the result gives exactly the rows unmatched by the INNER JOIN ON. (This is an idiom. In some cases it can also be expressed via NOT IN or EXCEPT.)
In your case the left table has distinct People rows with a.Id > 3 and the right table has all People rows. So the only a rows unmatched in a.Name = b.Name are those where a.Name IS NULL. So the WHERE returns those rows extended by NULLs.
SELECT * FROM
(SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL) a
LEFT JOIN People b ON 1=0;
But then you SELECT a.*. So the entire query is just
SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL;
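The claimed equivalence can be checked mechanically. Below is a minimal sketch in Python's sqlite3 (an assumption: SQLite's join and NULL semantics match SQL Server's for this query) that runs the original query and the simplified one against the sample data plus one row with a NULL name, and then again after that row is removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (Id INTEGER, Name TEXT);
    INSERT INTO People VALUES
        (1, 'Bob Slydell'), (2, 'Jim Halpert'), (3, 'Pamela Landy'),
        (4, 'Bob Wiley'), (5, 'Jim Hawkins'),
        (6, NULL);  -- extra row with a NULL name
""")

ORIGINAL = """
    SELECT a.*
    FROM (SELECT DISTINCT Id, Name FROM People WHERE Id > 3) a
    LEFT JOIN People b ON a.Name = b.Name
    WHERE b.Name IS NULL
"""
SIMPLIFIED = "SELECT DISTINCT Id, Name FROM People WHERE Id > 3 AND Name IS NULL"

with_null = (conn.execute(ORIGINAL).fetchall(),
             conn.execute(SIMPLIFIED).fetchall())

# Without any NULL names, neither query returns a row.
conn.execute("DELETE FROM People WHERE Id = 6")
without_null = (conn.execute(ORIGINAL).fetchall(),
                conn.execute(SIMPLIFIED).fetchall())

print(with_null, without_null)
```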
Sure. A left join will return data even if the join is done on the same table.
According to your query
"SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL"
with your sample data it returns no rows because of the final filter "b.Name IS NULL". Without that filter it would return the 2 records with Id > 3.

SELECT Statement in CASE

Please don't downvote this, as it is a bit complex for me to explain. I'm working on a data migration, so some of the structures look weird because that is how someone designed them.
For ex, I have a table Person with PersonID and PersonName as columns. I have duplicates in the table.
I have Details table where I have PersonName stored in a column. This PersonName may or may not exist in the Person table. I need to retrieve PersonID from the matching records otherwise put some hardcode value in PersonID.
I can't write below query because PersonName is duplicated in Person Table, this join doubles the rows if there is a matching record due to join.
SELECT d.Fields, PersonID
FROM Details d
JOIN Person p ON d.PersonName = p.PersonName
The below query works but I don't know how to replace "NULL" with some value I want in place of NULL
SELECT d.Fields, (SELECT TOP 1 PersonID FROM Person where PersonName = d.PersonName )
FROM Details d
So, there are some PersonNames in the Details table which are not existent in Person table. How do I write CASE WHEN in this case?
I tried below but it didn't work
SELECT d.Fields,
CASE WHEN (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) = null
THEN 123
ELSE (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) END Name
FROM Details d
This query is still showing the same output as 2nd query. Please advise me on this. Let me know, if I'm unclear anywhere. Thanks
Well, I figured out that I can wrap the SELECT in ISNULL to make it work.
SELECT d.Fields,
ISNULL((SELECT TOP 1 p.PersonID
FROM Person p where p.PersonName = d.PersonName), 124) id
FROM Details d
A simple left outer join to pull back all persons with an optional match on the details table should work with a case statement to get your desired result.
SELECT
*
FROM
(
SELECT
Instance=ROW_NUMBER() OVER (PARTITION BY p.PersonName ORDER BY p.PersonID),
PersonID=CASE WHEN d.PersonName IS NULL THEN 'XXXX' ELSE p.PersonID END,
d.Fields
FROM
Person p
LEFT OUTER JOIN Details d on d.PersonName=p.PersonName
)AS X
WHERE
Instance=1
Ooh goody, a chance to use two LEFT JOINs. The first will list the IDs where they exist, and insert a default otherwise; the second will eliminate the duplicates.
SELECT d.Fields, ISNULL(p1.PersonID, 123)
FROM Details d
LEFT JOIN Person p1 ON d.PersonName = p1.PersonName
LEFT JOIN Person p2 ON p2.PersonName = p1.PersonName
AND p2.PersonID < p1.PersonID
WHERE p2.PersonID IS NULL
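A quick way to convince yourself that the two joins behave as described is to run them on a few rows. This sketch uses Python's sqlite3 instead of SQL Server, so COALESCE replaces ISNULL; the sample names are invented, and it assumes PersonID is unique within Person (otherwise the p2.PersonID < p1.PersonID trick would not leave exactly one row per name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER, PersonName TEXT);
    CREATE TABLE Details (Fields TEXT, PersonName TEXT);
    INSERT INTO Person VALUES (1, 'Ann'), (2, 'Ann'), (3, 'Bob');  -- 'Ann' is duplicated
    INSERT INTO Details VALUES ('d1', 'Ann'), ('d2', 'Bob'), ('d3', 'Cid');  -- 'Cid' has no Person
""")

rows = conn.execute("""
    SELECT d.Fields, COALESCE(p1.PersonID, 123) AS PersonID
    FROM Details d
    LEFT JOIN Person p1 ON d.PersonName = p1.PersonName
    LEFT JOIN Person p2 ON p2.PersonName = p1.PersonName
                       AND p2.PersonID < p1.PersonID
    WHERE p2.PersonID IS NULL   -- keep only the p1 row with the lowest PersonID
    ORDER BY d.Fields
""").fetchall()

print(rows)
```

Each Details row comes back once: the duplicated name resolves to its lowest PersonID, and the unmatched name gets the default 123.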
You could use common table expressions to build up the missing datasets, i.e. your complete Person table, then join that to your Detail table as follows;
declare @n int;
-- set your default PersonID here;
set @n = 123;
-- Make sure previous SQL statement is terminated with semicolon for with clause to parse successfully.
-- First build our unique list of names from table Detail.
with cteUniqueDetailPerson
(
[PersonName]
)
as
(
select distinct [PersonName]
from [Details]
)
-- Second get unique Person entries and record the most recent PersonID value as the active Person.
, cteUniquePersonPerson
(
[PersonID]
, [PersonName]
)
as
(
select
max([PersonID]) -- if you wanted the original Person record instead of the last, change this to min.
, [PersonName]
from [Person]
group by [PersonName]
)
-- Third join unique datasets to get the PersonID when there is a match, otherwise use our default id @n.
-- NB, this would also include records when a Person exists with no Detail rows (they are filtered out with the final inner join)
, cteSudoPerson
(
[PersonID]
, [PersonName]
)
as
(
select
coalesce(upp.[PersonID],@n) as [PersonID]
, coalesce(upp.[PersonName],udp.[PersonName]) as [PersonName]
from cteUniquePersonPerson upp
full outer join cteUniqueDetailPerson udp
on udp.[PersonName] = upp.[PersonName]
)
-- Fourth, join detail to the sudo person table that includes either the original ID or our default ID.
select
d.[Fields]
, sp.[PersonID]
from [Details] d
inner join cteSudoPerson sp
on sp.[PersonName] = d.[PersonName];

if...else in sql

I'm a beginner at SQL and have this fairly easy conditional problem: Every installation number in the database has a customer. But I have been told that the customer is in either the AUDEB table or the AFORD table. I should first look in AUDEB for CUSTOMER_NO and use that if it is not NULL. If it is NULL, then take the CUSTOMER_NO from the AFORD table.
Use this if CUSTOMER_NO is not NULL
SELECT CUSTOMER_NO
FROM AUDEB
WHERE INST_NO = 2
Else use this CUSTOMER_NO
SELECT CUSTOMER_NO
FROM AFORD
WHERE INST_NO = 2
I see that there exist IF...ELSE conditions in SQL, but isn't there an easier way of selecting between the values from two queries where I want to use the first if the result is not null, else use the other?
You could union the tables using a subquery to retrieve a complete list of customers:
select CUSTOMER_NO
from (
select CUSTOMER_NO
, INST_NO
from AUDEB
union all
select CUSTOMER_NO
, INST_NO
from AFORD
) as all_customers
where INST_NO = 2
If the two tables follow the same schema and there is no overlap in customer_no you could use UNION:
SELECT T.CUSTOMER_NO
FROM (SELECT CUSTOMER_NO, INST_NO FROM AUDEB
UNION
SELECT CUSTOMER_NO, INST_NO FROM AFORD) AS T
WHERE T.INST_NO = 2
Or if the inst_no could be in both tables then join them (even if the schemas differ)
SELECT COALESCE(T1.Customer_no, T2.CUSTOMER_NO)
FROM AUDEB as T1
FULL OUTER JOIN AFORD as T2
ON T1.INST_NO = T2.INST_NO
WHERE T1.INST_NO = 2 OR T2.INST_NO = 2
COALESCE will return the first non null result
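The "first non-null wins" behaviour of COALESCE can be shown without a join at all. The sketch below is a variation, not the query above: it uses Python's sqlite3 and two scalar subqueries instead of a FULL OUTER JOIN, the sample rows are invented, and it assumes each table has at most one row per INST_NO. The helper name customer_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AUDEB (INST_NO INTEGER, CUSTOMER_NO INTEGER);
    CREATE TABLE AFORD (INST_NO INTEGER, CUSTOMER_NO INTEGER);
    INSERT INTO AUDEB VALUES (1, 100), (2, NULL);
    INSERT INTO AFORD VALUES (2, 200), (3, 300);
""")

def customer_for(inst_no):
    """Prefer AUDEB's CUSTOMER_NO; fall back to AFORD's when it is NULL or absent."""
    return conn.execute("""
        SELECT COALESCE(
            (SELECT CUSTOMER_NO FROM AUDEB WHERE INST_NO = :inst),
            (SELECT CUSTOMER_NO FROM AFORD WHERE INST_NO = :inst)
        )
    """, {"inst": inst_no}).fetchone()[0]

results = [customer_for(n) for n in (1, 2, 3, 4)]
print(results)
```

Installation 1 is answered by AUDEB, installations 2 and 3 fall through to AFORD, and installation 4 is in neither table, so the result is NULL (None in Python).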
Just join both tables and use ISNULL() to get the value from the corresponding table.
SELECT ISNULL(A.CUSTOMER_NO, B.CUSTOMER_NO) AS CUSTOMER_NO
FROM AUDEB A INNER JOIN AFORD B
ON A.INST_NO = B.INST_NO
WHERE A.INST_NO = 2
Edit: This assumes INST_NO is the primary key, but it has now been stated in comments that it is not. The OP should use the correct fields to join these 2 tables.