I am trying to write a single query that simply looks for a record based on 2 values. However, if the record doesn't exist I want to search again where 1 of the values (last name) is null. I'm trying to figure out if this is possible outside of PL/SQL through some use of EXISTS or IN keywords.
SELECT t.id
FROM table t
WHERE t.first_name = :firstName AND
EXISTS (SELECT t.id FROM table t WHERE t.first_name = :firstName AND t.last_name = :lastName)
ELSE t.last_name IS NULL;
EDIT:
I have 2 records:
(1, John, null) & (2, John, Frank)
If we search for John Jonas, we expect 1 to be returned (there is no exact match, so we fall back to the row whose last name is null). Alternatively, if we search for John Frank, we expect 2 to be returned.
You might use COALESCE:
select t.id
from my_table t
where t.first_name = :firstName
and coalesce(t.last_name, :lastName) = :lastName;
The above query returns all the rows where first_name is equal to :firstName and last_name is null or equal to :lastName. The logic you want (conditional querying) is much more complex:
with condition(do_exist) as (
select exists(
select from my_table
where first_name = :firstName
and last_name = :lastName)
)
select id
from my_table
cross join condition
where first_name = :firstName
and case when do_exist then last_name = :lastName else last_name is null end;
Test it in Db<>fiddle.
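To make the difference between the two queries concrete, here is a minimal sketch using Python's built-in sqlite3 module and the two sample rows from the question. SQLite is only a stand-in here, and the names find_ids, COALESCE_SQL and FALLBACK_SQL are made up for the illustration. Because SQLite has no boolean column to cross join against in quite the same way, the fallback is written with NOT EXISTS instead of the CTE; the intent is the same.

```python
import sqlite3

# Hypothetical in-memory table holding the two sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "John", None), (2, "John", "Frank")],
)

# COALESCE version: matches the exact last name AND the null last name.
COALESCE_SQL = """
SELECT id FROM my_table
WHERE first_name = :firstName
  AND coalesce(last_name, :lastName) = :lastName
ORDER BY id
"""

# Fallback version: the null row is returned only when no exact match exists.
FALLBACK_SQL = """
SELECT t.id FROM my_table t
WHERE t.first_name = :firstName
  AND (t.last_name = :lastName
       OR (t.last_name IS NULL
           AND NOT EXISTS (SELECT 1 FROM my_table x
                           WHERE x.first_name = :firstName
                             AND x.last_name = :lastName)))
ORDER BY t.id
"""


def find_ids(sql, first_name, last_name):
    """Run one of the queries above and return the matching ids."""
    params = {"firstName": first_name, "lastName": last_name}
    return [row[0] for row in conn.execute(sql, params)]
```

With this data, searching for John Frank through COALESCE_SQL returns both rows, while FALLBACK_SQL returns only row 2. Searching for John Jonas returns row 1 in both cases.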
I have a table as following.
id | firstname| lastname | email | homephone
-------------------------------------------------------
1 | aaa | bbb | xxx#yyy.com | 12344444
2 | aaa | bbb | null | null
3 | ccc | ddd | zzz#fff.com | null
4 | ccc | ddd | null | 34343322
The issue is that these rows are considered duplicates, so I want to keep only one record per person and merge the nulls so that the table appears as follows:
1 | aaa | bbb | xxx#yyy.com | 12344444
3 | ccc | ddd | zzz#fff.com | 34343322
So far I have managed to get the duplicates using the following code:
Select
max(a.id) as OriginalId, b.id as DuplicateId,
a.firstname, b.firstname as dup_fname,
a.lastname, b.lastname as dup_lname,
a.email, b.email
From
tbl_xxx a
join
tbl_xxx b on a.firstname = b.firstname
and a.lastname = b.lastname
and a.email is null
and a.homephone is null
and b.email is null
and b.homephone is null
and a.id < b.id
Group by
b.id, a.firstname, b.firstname, a.lastname,
b.lastname, a.homephone, b.homephone
My merge query looks like this
update tbl_xxx
SET
email = email ,
phone = phone
where
firstname = firstname
and lastname = lastname
and email is null
and phone is null
Eventually I will get distinct rows.
Is my approach correct? Please suggest how I can make my query more efficient.
update tbl_tmpdupes3 SET
email = email ,
phone = phone
where
firstname=firstname
and lastname=lastname
and email is null
and homephone is null
will do absolutely nothing, since the query compares each row against itself rather than comparing the table against itself. An update on its own won't solve the problem either, because you'll still have duplicates. What you want is to remove the duplicate rows entirely, after doing an update that compares the table to itself. So, basically, we first run a query that copies every non-null value from the duplicates into the originals, then we delete the rows with the higher primary key.
One way of tackling the problem is to update one column at a time:
update tbl_xxx
SET email = (SELECT min(d.email) FROM tbl_xxx d
WHERE d.firstname = tbl_xxx.firstname AND d.lastname = tbl_xxx.lastname
AND d.email IS NOT NULL)
WHERE email IS NULL;
update tbl_xxx
SET homephone = (SELECT min(d.homephone) FROM tbl_xxx d
WHERE d.firstname = tbl_xxx.firstname AND d.lastname = tbl_xxx.lastname
AND d.homephone IS NOT NULL)
WHERE homephone IS NULL;
For each first name and last name pair, each query looks up a non-null value of the column and copies it into the rows where that column is null (min() is there only to pick a single value deterministically). So, if the original row was missing data, it will be added. It may not be 100% correct if two different people in the DB have the same name; you'll have to take that into consideration.
That said, follow it up with this query, which should only delete the higher-pk row that is identical.
DELETE FROM tbl_xxx WHERE tbl_xxx.id IN (
SELECT max(id) FROM tbl_xxx
GROUP BY tbl_xxx.firstname,tbl_xxx.lastname,tbl_xxx.homephone,tbl_xxx.email
HAVING count(tbl_xxx.id) > 1);
Edit: if there are potentially multiple duplicates, you could do:
DELETE FROM tbl_xxx WHERE tbl_xxx.id NOT IN (
SELECT min(id) FROM tbl_xxx
GROUP BY tbl_xxx.firstname,tbl_xxx.lastname,tbl_xxx.homephone,tbl_xxx.email);
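As a rough end-to-end check of the "fill the nulls, then delete the extra rows" idea, here is a small sketch run against the four sample rows from the question, using Python's sqlite3 module as a stand-in database. The helper names fill_nulls and dedupe are invented for the example, and the fill step uses a correlated subquery with min(), which is one portable way to pick a non-null value and is an assumption on my part rather than the only option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_xxx (id INTEGER PRIMARY KEY, firstname TEXT, "
    "lastname TEXT, email TEXT, homephone TEXT)"
)
# Sample rows copied from the question (addresses kept as written there).
conn.executemany(
    "INSERT INTO tbl_xxx VALUES (?, ?, ?, ?, ?)",
    [
        (1, "aaa", "bbb", "xxx#yyy.com", "12344444"),
        (2, "aaa", "bbb", None, None),
        (3, "ccc", "ddd", "zzz#fff.com", None),
        (4, "ccc", "ddd", None, "34343322"),
    ],
)


def fill_nulls(column):
    """Copy a non-null value of `column` into rows of the same person where it is null."""
    # The column name is interpolated, so restrict it to a fixed whitelist.
    if column not in ("email", "homephone"):
        raise ValueError(column)
    conn.execute(
        f"""
        UPDATE tbl_xxx
        SET {column} = (SELECT min(d.{column}) FROM tbl_xxx d
                        WHERE d.firstname = tbl_xxx.firstname
                          AND d.lastname = tbl_xxx.lastname
                          AND d.{column} IS NOT NULL)
        WHERE {column} IS NULL
        """
    )


def dedupe():
    """Fill the nulls column by column, then keep only the lowest id per identical row."""
    fill_nulls("email")
    fill_nulls("homephone")
    conn.execute(
        """
        DELETE FROM tbl_xxx WHERE id NOT IN (
            SELECT min(id) FROM tbl_xxx
            GROUP BY firstname, lastname, homephone, email)
        """
    )
    return conn.execute("SELECT * FROM tbl_xxx ORDER BY id").fetchall()


remaining = dedupe()
```

After the run, remaining holds only rows 1 and 3, each with both email and homephone filled in. As noted above, two different people sharing a name would be merged too, so the grouping key deserves a second look on real data.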
You can use a MERGE statement for this. Try this sample (SQL Server syntax):
create table temptable (id int, firstname varchar(50), lastname varchar(50), email varchar(50), homephone varchar(50))
insert into temptable values
(1,'aaa' , 'bbb', 'xxx#yyy.com', 1234444),
(2,'aaa' , 'bbb', null, null),
(3,'ccc' , 'ddd', 'abc#ddey.com', null),
(4,'ccc' , 'ddd', null, 34343322 )
select * from temptable
;with cte as
(
select firstname, lastname
,(select top 1 id from temptable b where b.firstname = a.firstname and b.lastname = a.lastname and ( b.email is not null or b.homephone is not null)) tid
,(select top 1 email from temptable b where b.firstname = a.firstname and b.lastname = a.lastname and b.email is not null ) email
,(select top 1 homephone from temptable b where b.firstname = a.firstname and b.lastname = a.lastname and b.homephone is not null ) homephone
from temptable a
group by firstname , lastname
)
--select * from cte
merge temptable as a
using cte as b
on ( a.id = b.tid )
when matched
then
update set a.email = b.email , a.homephone = b.homephone
when not matched by source then
delete ;
select * from temptable
drop table temptable
I have this SQL statement (modified, because the real query is huge):
select tblInfo.IDNum, tblAddress.PrimaryAddress
from tblInfo
join tblAddress
on tblInfo.Agent = tblAddress.Agent
where (some stuff)
And I get a table that looks roughly like this:
|| IDNum || PrimaryAddress ||
-----------------------------
|| 01234 || 1 ||
|| 23456 || 1 ||
|| abcde || 0 ||
|| abcde || 1 ||
|| zyxwv || 0 ||
I need a way to return all records that have a PrimaryAddress of 1, as well as all records that have a PrimaryAddress of 0 whose IDNum is not already returned with a PrimaryAddress of 1. For instance, in the above example, (abcde || 0) should be excluded because (abcde || 1) exists.
Use NOT EXISTS:
SELECT tblInfo.IDNum, tblAddress.PrimaryAddress
FROM tblInfo
INNER JOIN tblAddress
ON tblInfo.Agent = tblAddress.Agent
WHERE tblAddress.PrimaryAddress = 1
OR ( tblAddress.PrimaryAddress = 0 AND NOT EXISTS
(
SELECT 1 FROM tblInfo t2 INNER JOIN tblAddress a2 ON t2.Agent = a2.Agent
WHERE t2.IDNum = tblInfo.IDNum AND a2.PrimaryAddress = 1
)
)
In this case, a simple GROUP BY should work for what you are trying to do. Effectively you are saying you want all IDNum values to appear once, with the PrimaryAddress value corresponding to the highest value (1 if it exists, 0 if it doesn't).
Assuming you need to preserve your original query because you're doing other work with it, you could use:
SELECT IDNum, MAX(PrimaryAddress) AS PrimaryAddress
FROM
(
select tblInfo.IDNum, tblAddress.PrimaryAddress
from tblInfo
join tblAddress
on tblInfo.Agent = tblAddress.Agent
where (some stuff)
) q -- derived-table alias: SQL Server requires one, and Oracle accepts it without AS
GROUP BY IDNum
This should work in MS SQL Server and Oracle, not sure about other DBMSs. If the nested query doesn't work in the DBMS you're using, you should be able to populate a temporary table with the results of your first query, then perform the grouping against that table.
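For comparison, here is a small sketch that runs both the NOT EXISTS form and the GROUP BY / MAX form against data shaped like the example output, using Python's sqlite3 module. The contents of tblInfo and tblAddress (and the Agent values a1 to a4) are invented so that the join reproduces the five rows shown in the question; the real schema is not given, so treat them as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblInfo (IDNum TEXT, Agent TEXT);
    CREATE TABLE tblAddress (Agent TEXT, PrimaryAddress INTEGER);
    -- Made-up rows whose join yields the five rows listed in the question.
    INSERT INTO tblInfo VALUES ('01234', 'a1'), ('23456', 'a2'),
                               ('abcde', 'a3'), ('zyxwv', 'a4');
    INSERT INTO tblAddress VALUES ('a1', 1), ('a2', 1),
                                  ('a3', 0), ('a3', 1), ('a4', 0);
    """
)

NOT_EXISTS_SQL = """
SELECT i.IDNum, a.PrimaryAddress
FROM tblInfo i JOIN tblAddress a ON i.Agent = a.Agent
WHERE a.PrimaryAddress = 1
   OR (a.PrimaryAddress = 0 AND NOT EXISTS (
         SELECT 1 FROM tblInfo i2 JOIN tblAddress a2 ON i2.Agent = a2.Agent
         WHERE i2.IDNum = i.IDNum AND a2.PrimaryAddress = 1))
ORDER BY i.IDNum
"""

GROUP_BY_SQL = """
SELECT IDNum, MAX(PrimaryAddress) AS PrimaryAddress
FROM (SELECT i.IDNum, a.PrimaryAddress
      FROM tblInfo i JOIN tblAddress a ON i.Agent = a.Agent) q
GROUP BY IDNum
ORDER BY IDNum
"""

not_exists_rows = conn.execute(NOT_EXISTS_SQL).fetchall()
group_by_rows = conn.execute(GROUP_BY_SQL).fetchall()
```

On this data both queries return the same four rows, with (abcde, 0) dropped. They are not interchangeable in general: if an IDNum had two rows with PrimaryAddress 1, the NOT EXISTS form would return both, while GROUP BY would collapse them into one.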
I hope this helps
select * from tblAddress mainTBL
where mainTBL.primaryaddress = (Case when EXISTS(select t1.IDNUM from tblAddress t1 where
mainTBL.IDNUM = t1.IDNUM AND t1.primaryAddress = 1) THEN 1 ELSE 0 END)
Check the SQL fiddle
I need to compare email records (by email address) between two tables (Table A holds production data, Table B holds old data) to find the differences and show the result in a column with values such as "New", "Delete", etc.
If a record exists in Table A but not in Table B, it should be marked "New".
If it exists in Table B but not in Table A, it should be marked "Delete".
If it appears in both tables, it should be marked "Maintain".
I want a result like this:
DisplayName LastName Diremail Result
==============================================
XXX XXX a#a.com New
ABC ABC 1#a.com Delete
DDD DDD 2#a.com Maintain
My code as follows:
SELECT b.DisplayName,
b.LastName,
b.diremail,
Result = CASE WHEN a.DirEmail IS NULL THEN 'New'
when b.DirEmail IS null then 'delete'
else 'Maintain'
END
FROM vHRIS_StaffDB b
LEFT JOIN HRIS_DL_Lists a
ON a.DirEmail = b.DirEmail
WHERE (
a.DirEmail IS NULL
OR a.DisplayName != b.DisplayName
)
But the data is not correct: the code does not return the records that should be marked "Delete"
(found in table b, not in table a).
Please advise. Thanks.
Sounds like you need full join instead of left join. Try this:
SELECT coalesce(b.DisplayName, a.DisplayName) DisplayName,
coalesce(b.LastName, a.LastName) LastName,
coalesce(b.diremail, a.diremail) diremail,
Result = CASE WHEN a.DirEmail IS NULL THEN 'New'
when b.DirEmail IS null then 'Delete'
else 'Maintain'
END
FROM vHRIS_StaffDB b
FULL JOIN HRIS_DL_Lists a
ON a.DirEmail = b.DirEmail
WHERE (
a.DirEmail IS NULL
OR b.DirEmail IS NULL -- keep the rows that exist only in HRIS_DL_Lists ('Delete')
OR a.DisplayName != b.DisplayName
)
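If your engine has no FULL JOIN (older SQLite builds, for instance), the same New / Delete / Maintain classification can be sketched with two LEFT JOINs glued together by UNION ALL. The example below uses Python's sqlite3 module and two made-up tables, prod_list and old_list, standing in for the production and old data; the rows are taken from the expected output in the question. It is an emulation of the full join, not the full join itself, and it classifies every row without the extra DisplayName filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-ins for the production table and the old list.
    CREATE TABLE prod_list (DisplayName TEXT, LastName TEXT, DirEmail TEXT);
    CREATE TABLE old_list  (DisplayName TEXT, LastName TEXT, DirEmail TEXT);
    INSERT INTO prod_list VALUES ('XXX', 'XXX', 'a#a.com'), ('DDD', 'DDD', '2#a.com');
    INSERT INTO old_list  VALUES ('ABC', 'ABC', '1#a.com'), ('DDD', 'DDD', '2#a.com');
    """
)

COMPARE_SQL = """
-- Rows present in production: 'New' if missing from the old list, else 'Maintain'.
SELECT p.DisplayName, p.LastName, p.DirEmail,
       CASE WHEN o.DirEmail IS NULL THEN 'New' ELSE 'Maintain' END AS Result
FROM prod_list p
LEFT JOIN old_list o ON o.DirEmail = p.DirEmail
UNION ALL
-- Rows present only in the old list: 'Delete'.
SELECT o.DisplayName, o.LastName, o.DirEmail, 'Delete'
FROM old_list o
LEFT JOIN prod_list p ON p.DirEmail = o.DirEmail
WHERE p.DirEmail IS NULL
"""

# Map each email address to its classification.
result_by_email = {row[2]: row[3] for row in conn.execute(COMPARE_SQL)}
```

Here a#a.com comes out as New, 1#a.com as Delete and 2#a.com as Maintain, which matches the table in the question.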
If I have two tables, A and B which have identical layout of:
Forename
Middlename
Surname
Date of Birth
Table A contains my data, table B contains data I wish to compare to table A.
I'd like to return all matches that are full matches (Forename, Middlename and Surname) as well as partial matches (First initial, surname, dob).
What would be the most efficient way of doing this and being able to distinguish between the two?
My initial thought is that I could do this with two passes; however, there must be a better way, as two passes could be quite inefficient over a large number of records.
You can do this:
select T1.*, T2.*, 'exact-match' as mode
from T1 inner join T2
on T1.fname = T2.fname
and T1.mname = T2.mname
and T1.lname = T2.lname
and t1.dob = T2.dob
UNION
select t1.*, t2.*, 'partial-match' as mode
from T1 inner join T2
on left(T1.fname,1) = LEFT(T2.fname,1)
and T1.lname = T2.lname
and T1.dob = T2.dob
where T1.fname <> T2.fname
The last line is there because otherwise exact matches would also satisfy the partial match test. You can get rid of that where-clause if you like. The second part of the query ignores middle name, and treats "Tim Q Jones" and "Tom X Jones" as a partial-match if they're born on the same day. That's what you asked for, right?
If you really want to avoid two queries, you could do something like this:
SELECT A.*,
CASE WHEN A.Middlename <> B.Middlename THEN 'Partial'
ELSE 'Full'
END AS MatchType
FROM A
JOIN B ON (A.Forename = B.Forename AND
A.Middlename = B.Middlename AND
A.Surname = B.Surname)
OR
(LEFT(A.Forename,1) = LEFT(B.Forename,1) AND
A.Surname = B.Surname AND
A.DoB = B.DoB)
This uses a JOIN with two different sets of JOIN criteria, and a CASE in the select that identifies which of the sets must have produced the joined record (if Middlename doesn't match, it cannot have been a "full" match that resulted in the join).
This will do it in a single pass.
The condition for recognizing a full match has to be on both forename and middlename, otherwise it will classify some matches incorrectly.
select A.Forename, A.Middlename, A.Surname, A.DateOfBirth,
Case
when A.ForeName=B.ForeName and A.Middlename = B.middlename then 'full'
Else 'partial'
end as MatchType
from A
inner join B on
-- (Forename, Middlename and Surname)
(A.ForeName=B.ForeName
and A.Middlename = B.middlename
and A.Surname = B.surname)
or
-- (First initial, surname, dob)
(A.ForeName LIKE LEFT(B.ForeName,1)+'%'
and A.Surname = B.surname
and A.DateOfBirth = B.DateOfBirth)
Select
T1.Forename
, T1.Middlename
, T1.Surname
, T1.[Date of Birth]
, Case When T1.[Forename] = T2.[Forename] and T1.Middlename = T2.Middlename
Then 'Full'
else 'Partial'
end as Match_Type
From Table1 as T1
Inner Join Table2 as T2
on Left(T1.[Forename], 1) = Left(T2.[Forename], 1)
and T1.[Date of Birth] = T2.[Date of Birth]
and T1.Surname = T2.Surname
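Here is a small runnable sketch of the single-pass idea (one join with two alternative conditions, plus a CASE that labels each pair), using Python's sqlite3 module. SQLite has no LEFT() function, so substr(x, 1, 1) is used for the first initial. The sample people and the MatchType labels are made up for the illustration, and rows with a NULL middle name are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (Forename TEXT, Middlename TEXT, Surname TEXT, DateOfBirth TEXT);
    CREATE TABLE B (Forename TEXT, Middlename TEXT, Surname TEXT, DateOfBirth TEXT);
    -- Invented sample data.
    INSERT INTO A VALUES ('Tim', 'Q', 'Jones', '1980-01-01'),
                         ('Ann', 'M', 'Lee',   '1990-05-05');
    INSERT INTO B VALUES ('Tim', 'Q', 'Jones', '1980-01-01'),
                         ('Tom', 'X', 'Jones', '1980-01-01'),
                         ('Ann', 'M', 'Lee',   '1991-01-01'),
                         ('Bob', 'Z', 'Ray',   '1970-01-01');
    """
)

MATCH_SQL = """
SELECT A.Forename, B.Forename,
       CASE WHEN A.Forename = B.Forename
             AND A.Middlename = B.Middlename
             AND A.Surname = B.Surname
            THEN 'Full' ELSE 'Partial' END AS MatchType
FROM A
JOIN B
  ON (A.Forename = B.Forename          -- full match: all three names
      AND A.Middlename = B.Middlename
      AND A.Surname = B.Surname)
  OR (substr(A.Forename, 1, 1) = substr(B.Forename, 1, 1)  -- partial match:
      AND A.Surname = B.Surname                            -- initial, surname, dob
      AND A.DateOfBirth = B.DateOfBirth)
"""

matches = sorted(conn.execute(MATCH_SQL).fetchall())
```

Tim/Tim and Ann/Ann are labelled Full (Ann matches on all three names even though the dates differ), and Tim/Tom is labelled Partial. An OR in the join condition can stop the optimizer from using indexes, so on large tables it is worth comparing the plan against the UNION version.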
Can anyone tell me the exact syntax for a NOT IN condition in SQL on two columns?
This is my query written in VBA.
strNewSql = "SELECT distinct(tblRevRelLog_Detail.PartNumber), tblRevRelLog_Detail.ChangeLevel, tblRevRelLog_Detail.ID FROM tblRevRelLog_Detail LEFT JOIN tblEventLog ON tblRevRelLog_Detail.PartNumber = tblEventLog.PartNumber"
strNewSql = strNewSql & " WHERE (tblEventLog.PartNumber) Not In(SELECT tblEventLog.PartNumber FROM tblEventLog WHERE tblEventLog.EventTypeSelected = 'pn REMOVED From Wrapper') AND tblEventLog.TrackingNumber = """ & tempTrackingNumber & """ AND tblEventLog.TrackingNumber = tblRevRelLog_Detail.RevRelTrackingNumber;"
I want to change this subquery so that it applies to the combination of two columns, as follows:
strNewSql = "SELECT tblRevRelLog_Detail.PartNumber, tblRevRelLog_Detail.ChangeLevel, tblRevRelLog_Detail.ID FROM tblRevRelLog_Detail LEFT JOIN tblEventLog ON tblRevRelLog_Detail.PartNumber = tblEventLog.PartNumber"
strNewSql = strNewSql & " WHERE (((tblEventLog.PartNumber, tblEventLog.PartNumberChgLvl) Not In(SELECT tblEventLog.PartNumber,tblEventLog.PartNumberChgLvl FROM tblEventLog WHERE tblEventLog.EventTypeSelected = 'pn REMOVED From Wrapper') AND tblEventLog.TrackingNumber = """ & tempTrackingNumber & """ AND tblEventLog.TrackingNumber = tblRevRelLog_Detail.RevRelTrackingNumber);"
But this is not working.
In Access you can't use IN with more than one column, but you can usually achieve the same effect using EXISTS:
SELECT *
FROM tbl1
WHERE NOT EXISTS
(
SELECT *
FROM tbl2
WHERE tbl2.col1 = tbl1.col1
AND tbl2.col2 = tbl1.col2
)
General syntax:
where col not in (items)
Items can be
a list of items -- (4,5,3,5,2) or ('243','3','cdds') or any other datatype.
Or a select statement (select hatefulthings from table)
Addition 6 years later
Many platforms (Oracle, PostgreSQL, MySQL and SQLite, for example) support tuples with NOT IN, although SQL Server and Access do not. For example:
SELECT *
FROM employee
WHERE (empID, @date) NOT IN
(SELECT empID, vacationDay
FROM empVacation
)
In this example we select everything from the employee table where the tuple of employee id and date is not in a table containing vacation days.
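To see that the two-column NOT EXISTS and the tuple form of NOT IN pick the same rows, here is a short sketch using Python's sqlite3 module (row values need SQLite 3.15 or later). The tables tbl1 and tbl2 and their contents are invented for the illustration. A single-column NOT IN is included to show why checking one column alone is not equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE tbl2 (col1 INTEGER, col2 TEXT);
    -- Invented sample data.
    INSERT INTO tbl1 VALUES (1, 'x'), (1, 'y'), (2, 'x');
    INSERT INTO tbl2 VALUES (1, 'x'), (2, 'y');
    """
)

# Correlated NOT EXISTS on both columns; works on engines without tuple support.
NOT_EXISTS_SQL = """
SELECT col1, col2 FROM tbl1
WHERE NOT EXISTS (SELECT 1 FROM tbl2
                  WHERE tbl2.col1 = tbl1.col1 AND tbl2.col2 = tbl1.col2)
ORDER BY col1, col2
"""

# Tuple (row value) NOT IN; only on engines that support row values.
TUPLE_NOT_IN_SQL = """
SELECT col1, col2 FROM tbl1
WHERE (col1, col2) NOT IN (SELECT col1, col2 FROM tbl2)
ORDER BY col1, col2
"""

# Single-column NOT IN ignores col2, so it filters out far too much here.
SINGLE_NOT_IN_SQL = """
SELECT col1, col2 FROM tbl1
WHERE col1 NOT IN (SELECT col1 FROM tbl2)
ORDER BY col1, col2
"""

by_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()
by_tuple = conn.execute(TUPLE_NOT_IN_SQL).fetchall()
by_single = conn.execute(SINGLE_NOT_IN_SQL).fetchall()
```

Both two-column forms return (1, 'y') and (2, 'x'), while the single-column form returns nothing because both col1 values appear somewhere in tbl2. If the subquery can produce NULLs, NOT IN and NOT EXISTS stop agreeing, which is one reason NOT EXISTS is often the safer choice.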
Your question is a bit unclear. Is this what you need?
SELECT *
FROM
MY_TABLE MT
WHERE 'Smith' NOT IN (MT.FIRST_NAME)
AND 'Smith' NOT IN (MT.LAST_NAME)
This will show you all records where the search phrase ("Smith") is in neither the first_name nor the last_name column.
Perhaps you meant
SELECT *
FROM MY_TABLE
WHERE (FIRST_NAME, LAST_NAME) NOT IN (SELECT FIRST_NAME, LAST_NAME
FROM SOME_OTHER_TABLE)
This is allowed under Oracle - not sure about SQL Server.
Share and enjoy.
Select * from some_table
where (((Some_Value) Not IN (Select Column1 & Column2 from Some_Other_Table)));
I saw where you indicated this was for Access.
It is not clear what you are asking, but I gather that you would like to have a NOT IN condition based on two columns instead of just one.
As an example, let's say you have a column called F_Name and another called L_Name (both variable in length), and that you want to exclude specific combinations of those names that appear in another table, which stores them already combined as NAME. In that case, you could do this:
select F_name
, L_name
, col1
, coln
from mytable1
where F_name -- First name (variable length)
|| ' ' -- appended to a blank space
|| L_name -- appended to the last name (variable length)
not in -- is not one of these names
( select name
from mytable2
where ...
)
The main issue with this query is that you must get the formatting just right so that they can match up exactly.
If you are dealing with a combination of fields of different types, such as numerics and timestamps, use any of the conversion functions at your disposal (DECIMAL, INTEGER, CHAR, SUBSTR ...) to convert them to equivalent text, and then match them up accordingly.
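The point about getting the formatting just right is easier to see with data. Below is a small sketch using Python's sqlite3 module, where || is also the concatenation operator; the table contents are invented. It shows the concatenated NOT IN working, and also how two different name splits can collapse into the same combined string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable1 (F_name TEXT, L_name TEXT);
    CREATE TABLE mytable2 (name TEXT);
    -- Invented sample data; the two 'Mary' rows concatenate to the same string.
    INSERT INTO mytable1 VALUES ('Ann', 'Lee'), ('Bob', 'Ray'),
                                ('Mary', 'Ann Lee'), ('Mary Ann', 'Lee');
    INSERT INTO mytable2 VALUES ('Ann Lee'), ('Mary Ann Lee');
    """
)

# Exclude every first/last name pair whose combined form appears in mytable2.
CONCAT_NOT_IN_SQL = """
SELECT F_name, L_name FROM mytable1
WHERE F_name || ' ' || L_name NOT IN (SELECT name FROM mytable2)
ORDER BY F_name, L_name
"""

# Count how many distinct pairs collapse into the single string 'Mary Ann Lee'.
COLLISION_SQL = """
SELECT count(*) FROM mytable1
WHERE F_name || ' ' || L_name = 'Mary Ann Lee'
"""

kept = conn.execute(CONCAT_NOT_IN_SQL).fetchall()
colliding = conn.execute(COLLISION_SQL).fetchone()[0]
```

Only Bob Ray survives the filter, and both Mary rows are excluded because they produce the same combined string. A separator that cannot occur in the data reduces that risk, and a NULL in either column makes the whole concatenation NULL, so those rows need separate handling.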