Find matching rows in database table using SQL where no matching key is present - sql

I have an old table with legacy data and approx 10,000 rows and a new table with about 500 rows. The columns are the same in both tables. I need to compare a few columns in the new table with the old one and report on data that is duplicated in the new table.
I've researched articles with similar issues, attempted table joins and where exists / where not exists clauses but I just can't get the SQL right. I have included my latest version.
One issue causing trouble for me, I think, is that there is no "Key" as such like a userid or similar unique identifier in either table.
What I want to do is find the rows in the "new" table where every column except "reference_number" (it doesn't matter whether that one matches or not) is duplicated, i.e. already exists in the "old" table.
I have this so far...
select
old.reference_number,
new.reference_number,
new.component,
new.privileges,
new.protocol,
new.authority,
new.score,
new.means,
new.difficulty,
new.hierarchy,
new.interaction,
new.scope,
new.conf,
new.integrity,
new.availability,
new.version
from old, new
where
old.component = new.component
and old.privileges = new.privileges
and old.protocol = new.protocol
and old.authority = new.authority
and old.score = new.score
and old.means = new.means
and old.difficulty = new.difficulty
and old.hierarchy = new.hierarchy
and old.interaction = new.interaction
and old.scope = new.scope
and old.conf = new.conf
and old.integrity = new.integrity
and old.availability = new.availability
and old.version = new.version
I have tried this here but it doesn't seem to pull out ALL of the data for some reason.
It is evident that actually there are MORE rows in the old table that are duplicated in the new table but I'm only getting a small number of rows returned from the query.
Can anyone spot why that might be, is there another way I should be approaching this?
If it matters, this is Postgresql.
Thanks for any help given.

The following should do what you want:
select distinct o.reference_number,
n.reference_number,
n.component,
n.privileges,
n.protocol,
n.authority,
n.score,
n.means,
n.difficulty,
n.hierarchy,
n.interaction,
n.scope,
n.conf,
n.integrity,
n.availability,
n.version
from new n
inner join old o
on o.component = n.component and
o.privileges = n.privileges and
o.protocol = n.protocol and
o.authority = n.authority and
o.score = n.score and
o.means = n.means and
o.difficulty = n.difficulty and
o.hierarchy = n.hierarchy and
o.interaction = n.interaction and
o.scope = n.scope and
o.conf = n.conf and
o.integrity = n.integrity and
o.availability = n.availability and
o.version = n.version
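
As a rough illustration of the inner join above, here is a minimal sketch that runs the same shape of query on a tiny data set. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table names old_t and new_t, the two compared columns and the sample rows are all made up for the demo.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for the demo).
# old_t / new_t and their rows are hypothetical, cut down to two compared columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_t (reference_number TEXT, component TEXT, protocol TEXT);
    CREATE TABLE new_t (reference_number TEXT, component TEXT, protocol TEXT);
    INSERT INTO old_t VALUES ('R1', 'kernel', 'tcp'), ('R2', 'driver', 'udp');
    INSERT INTO new_t VALUES ('N1', 'kernel', 'tcp'), ('N2', 'shell', 'ssh');
""")

# Join on every compared column except reference_number.
duplicates = conn.execute("""
    SELECT DISTINCT o.reference_number, n.reference_number, n.component, n.protocol
    FROM new_t n
    INNER JOIN old_t o
        ON o.component = n.component
       AND o.protocol = n.protocol
    ORDER BY n.reference_number
""").fetchall()

print(duplicates)  # [('R1', 'N1', 'kernel', 'tcp')]
```

Only the new row whose compared columns all match an old row comes back. Note that a plain = never matches when either side is NULL, so rows with empty columns would be missing from a result like this.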

You should use a left join and then select only the rows where the new values are null. The SQL should be something like this:
select
old.reference_number,
new.reference_number,
new.component,
new.privileges,
new.protocol,
new.authority,
new.score,
new.means,
new.difficulty,
new.hierarchy,
new.interaction,
new.scope,
new.conf,
new.integrity,
new.availability,
new.version
from old
left join new
on
old.component = new.component
and old.privileges = new.privileges
and old.protocol = new.protocol
and old.authority = new.authority
and old.score = new.score
and old.means = new.means
and old.difficulty = new.difficulty
and old.hierarchy = new.hierarchy
and old.interaction = new.interaction
and old.scope = new.scope
and old.conf = new.conf
and old.integrity = new.integrity
and old.availability = new.availability
and old.version = new.version
where new.component is null
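
To make it clearer what the left join with an IS NULL filter returns, here is a small sketch on made-up data, again using Python's sqlite3 in place of PostgreSQL. The tables old_t and new_t and their rows are hypothetical. As far as I can tell, this pattern returns the old rows that have no counterpart in new, which is the opposite of the duplicates the question asks for.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption for the demo).
# old_t / new_t and their contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_t (reference_number TEXT, component TEXT, protocol TEXT);
    CREATE TABLE new_t (reference_number TEXT, component TEXT, protocol TEXT);
    INSERT INTO old_t VALUES ('R1', 'kernel', 'tcp'), ('R2', 'driver', 'udp');
    INSERT INTO new_t VALUES ('N1', 'kernel', 'tcp'), ('N2', 'shell', 'ssh');
""")

# LEFT JOIN keeps every old row; where no new row matched, the new_t columns
# are NULL, so the WHERE clause keeps only the unmatched old rows.
unmatched_old = conn.execute("""
    SELECT o.reference_number, o.component, o.protocol
    FROM old_t o
    LEFT JOIN new_t n
        ON o.component = n.component
       AND o.protocol = n.protocol
    WHERE n.component IS NULL
    ORDER BY o.reference_number
""").fetchall()

print(unmatched_old)  # [('R2', 'driver', 'udp')]
```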

Related

Issues with Joins where joins are reliant on 2 conditions - SQL Server 2014

I have attached an image of the tables and some sample data and I am trying to create a query that returns the following in the select statement, but am getting a bit confused and don't seem to be able to quite get the joins and where clauses right.
OFF_HIRES.Off_Hire_ID,
HIRE_CONTRACTS.Contract_Number,
MOBILE_JOB.MBW_Completed_Time,
Collection Charge - (This is the sum of the Charge_Amount for each HIRE_ITEMS.Contract_Number
where the HIRE_ITEMS.Stock_Number = 'COLLECTION' ),
TOTAL WEIGHT - (This is the sum of the STOCK_ITEMS.Weight * OFF_HIRE_ITEMS.QTY_OFF_HIRED)
The Links between the tables are as follows
OFF_HIRES.Contract_Number = Hire_Contracts.Contract_Number
OFF_HIRES.Off_hire_ID = OFF_HIRE_ITEMS.Off_hire_ID
HIRE_CONTRACTS.Contract_Number = HIRE_ITEMS.Contract_Number
HIRE_ITEMS.Stock_Number = STOCK_ITEMS.Stock_Number
MOBILE_JOB.Mbw_Doc_ID = OFF_HIRE_ITEMS.Off_Hire_Contract_Number
and then there is a link between the OFF_HIRE_ITEMS and HIRE_ITEMS but requires both links to be true -
OFF_HIRE_ITEMS.Off_Hire_Contract_Number = HIRE_ITEMS.Contract_Number AND
OFF_HIRE_ITEMS.OH_ITEM_SEQ = HIRE_ITEMS.HIT_SEQ
Your data structure is a little messy, but I think I understand what you want:
SELECT
OFF_HIRES.Off_Hire_ID,
HIRE_CONTRACTS.Contract_Number,
(SELECT MOBILE_JOB.MBW_Completed_Time
FROM MOBILE_JOB
WHERE MOBILE_JOB.Mbw_Doc_ID = OFF_HIRES.Off_Hire_ID) 'MBW_Completed_Time',
(SELECT SUM(Charge_Amount)
FROM HIRE_ITEMS
WHERE HIRE_ITEMS.Contract_Number = OFF_HIRES.Contract_Number
AND HIRE_ITEMS.Stock_Number = 'COLLECTION') 'SumCollectionCharge',
(SELECT SUM(ISNULL(STOCK_ITEMS.Weight,0) * OFF_HIRE_ITEMS.QTY_OFF_HIRED)
FROM HIRE_ITEMS
INNER JOIN OFF_HIRE_ITEMS
ON OFF_HIRE_ITEMS.OH_ITEM_SEQ = HIRE_ITEMS.HIT_SEQ
AND OFF_HIRE_ITEMS.Off_Hire_Contract_Number = HIRE_ITEMS.Contract_Number
INNER JOIN STOCK_ITEMS
ON STOCK_ITEMS.Stock_Number = HIRE_ITEMS.Stock_Number
-- restrict the sum to the items of the current off-hire
WHERE OFF_HIRE_ITEMS.Off_Hire_ID = OFF_HIRES.Off_Hire_ID
) 'SumTotalWeight'
FROM OFF_Hires
INNER JOIN HIRE_CONTRACTS
ON HIRE_CONTRACTS.Contract_Number = OFF_Hires.Contract_Number

Access "find unmatched query" returns records that exist in both tables when one of the fields is empty

I am trying to list records from table N that do not have a match in table O
The result is shown in query N Without Matching O
As one can see on above image, it is returning records which exist in both tables, which it should not.
In above example table N and table O are identical (copy)
The query is as follows:
SELECT N.*
FROM N LEFT JOIN O ON (N.[F8] = O.[F8]) AND (N.[F7] = O.[F7]) AND (N.[F6] = O.[F6]) AND (N.[F5] = O.[F5]) AND (N.[F4] = O.[F4]) AND (N.[F3] = O.[F3]) AND (N.[F2] = O.[F2]) AND (N.[F1] = O.[F1])
WHERE (((O.F1) Is Null));
I guess it is because of empty values (e.g. field F5 in the example above)
Further down the table other fields may be empty as well
My question is:
How do I return unmatched records from a table which may contain empty cells
Try using Nz on that field (F5):
SELECT N.*
FROM N LEFT JOIN O ON (N.[F8] = O.[F8]) AND (N.[F7] = O.[F7]) AND (N.[F6] = O.[F6]) AND (Nz(N.[F5]) = Nz(O.[F5])) AND (N.[F4] = O.[F4]) AND (N.[F3] = O.[F3]) AND (N.[F2] = O.[F2]) AND (N.[F1] = O.[F1])
WHERE (((O.F1) Is Null));

sql query is not working properly

I am trying to get the non-matching records from two tables by comparing some columns that are common to both, using a SQL query. My first table is snd_marketvisits, which has columns such as id, pjpCode, sectionCode, popCode, pop_name and landmark. My second table shares pjpCode, sectionCode, popCode and popName and has some other fields as well. I want to get the names of the pops that are present in snd_marketvisits but not in the second table, comparing popCode, sectionCode and pjpCode in both tables.
SELECT *
FROM snd_marketvisits sm
LEFT JOIN snd_marketvisit_pops sp ON
sm.distributorCode = sp.distributor AND
sm.pjpCode = sp.pjp AND
sm.sectionCode = sp.sectionCode AND
sm.popCode = sp.popCode
WHERE
sm.sectionCode = '00016' AND
sm.pjpCode = '0001' AND
sm.distributorCode = '00190A'
It depends on the database, as far as I know, but if you test for NULL in one of the joined fields you should get only the rows without a match.
SELECT *
FROM snd_marketvisits sm
LEFT JOIN snd_marketvisit_pops sp ON
sm.distributorCode = sp.distributor AND
sm.pjpCode = sp.pjp AND
sm.sectionCode = sp.sectionCode AND
sm.popCode = sp.popCode
WHERE
sm.sectionCode = '00016' AND
sm.pjpCode = '0001' AND
sm.distributorCode = '00190A'
AND sp.distributor IS NULL

Postgres no results from linked tables where linked values are null

I have this query:
update CA_San_Francisco as p
set geo = u.geo
from parcels_union u
where u.street_number = p.street_number
and u.street_name = p.street_name
and u.street_type = p.street_type
and u.street_direction = p.street_direction
and u.street_unit = p.street_unit
However it does not update any rows where both fields are null. In other words, if there is no value in street_direction for both tables, I get no result even though they are both the same - both null values.
I get that something can't be = Null. So how do I get all of the results?
Thanks, Brad
You could check if the field is NULL and if it is then change it to something you can check for.
For instance, you can change your Where clause to the following
where coalesce(u.street_number,'') = coalesce(p.street_number,'')
and coalesce(u.street_name,'') = coalesce(p.street_name,'')
and coalesce(u.street_type,'') = coalesce(p.street_type,'')
and coalesce(u.street_direction,'') = coalesce(p.street_direction,'')
and coalesce(u.street_unit,'') = coalesce(p.street_unit,'')
But if there are multiple rows that have NULL in these columns then you will get unexpected assignments in your update...
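
To see the effect of the coalesce rewrite, here is a minimal sketch using Python's sqlite3 as a stand-in for PostgreSQL. The tables p and u, their columns and rows are made up, and the helper matched_ids is hypothetical. The sketch also tries SQLite's null-safe IS operator, which plays the role that IS NOT DISTINCT FROM plays in PostgreSQL. That operator is a swapped-in alternative, not part of the answer above.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption for the demo).
# Tables p and u and their rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (id INTEGER, street_name TEXT, street_direction TEXT);
    CREATE TABLE u (geo TEXT, street_name TEXT, street_direction TEXT);
    INSERT INTO p VALUES (1, 'Market', NULL), (2, 'Mission', 'N');
    INSERT INTO u VALUES ('geo-a', 'Market', NULL), ('geo-b', 'Mission', 'N');
""")

def matched_ids(direction_condition):
    """Return the p.id values that find a partner in u under the given condition."""
    sql = f"""
        SELECT p.id FROM p
        JOIN u ON u.street_name = p.street_name AND {direction_condition}
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql)]

# Plain "=" is never true when either side is NULL, so row 1 is lost.
plain = matched_ids("u.street_direction = p.street_direction")

# coalesce maps NULL to '' on both sides, so two NULLs compare as equal.
with_coalesce = matched_ids(
    "coalesce(u.street_direction, '') = coalesce(p.street_direction, '')"
)

# SQLite's IS treats two NULLs as equal without needing a sentinel value.
null_safe = matched_ids("u.street_direction IS p.street_direction")

print(plain, with_coalesce, null_safe)  # [2] [1, 2] [1, 2]
```

The coalesce form depends on '' never being a real value in the column, whereas a null-safe comparison avoids picking a sentinel at all.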

Query with multiple EXISTS returning too many rows

Original table has 1466303 records in it, I have inserted 1108441 of those records in to a separate table. What I would like to know is what data is left over? So I have made a query using multiple exists to find the data that was left:
SELECT SG_customer,
PHONE,
SG_Name,
SG_Secondary_Address,
SG_Primary_Address,
SG_City,
SG_State,
SG_Zip,
SG_Email
FROM FMJ_DB_VPI_EXPANDED_DATA X
WHERE NOT EXISTS (SELECT 1
FROM FMJScore
WHERE SGID = X.SG_Customer
AND Phone = X.Phone
AND Name = X.SG_Name
AND SecondAddress = X.SG_Secondary_Address
AND Address = X.SG_Primary_Address
AND City = X.SG_City
AND State = X.SG_State
AND Zip = X.SG_Zip
AND Email = X.SG_Email)
Running this returns 144391 records, but there should be a difference of 357862. I don't understand why it's returning so many records.
I assume you want null to be treated as equal to null. I also assume that '' is not used as a value; if it is, replace it with something that does not normally occur:
SELECT SG_customer,
PHONE,
SG_Name,
SG_Secondary_Address,
SG_Primary_Address,
SG_City,
SG_State,
SG_Zip,
SG_Email
FROM FMJ_DB_VPI_EXPANDED_DATA X
WHERE NOT EXISTS (SELECT 1
FROM FMJScore
WHERE coalesce(SGID,'') = coalesce(X.SG_Customer,'')
AND coalesce(Phone,'') = coalesce(X.Phone,'')
AND coalesce(Name,'') = coalesce(X.SG_Name,'')
AND coalesce(SecondAddress,'') = coalesce(X.SG_Secondary_Address,'')
AND coalesce(Address,'') = coalesce(X.SG_Primary_Address,'')
AND coalesce(City,'') = coalesce(X.SG_City,'')
AND coalesce(State,'') = coalesce(X.SG_State,'')
AND coalesce(Zip,'') = coalesce(X.SG_Zip,'')
AND coalesce(Email,'') = coalesce(X.SG_Email,''))
The optimizer might not be able to use indexes efficiently because of the function calls.
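
Here is a small sketch of why NULLs inflate a NOT EXISTS result and how the coalesce version changes it. It uses Python's sqlite3 rather than the asker's database, and the tables source and target, their two columns, the sample rows and the helper left_over are all hypothetical.

```python
import sqlite3

# In-memory SQLite is used purely for illustration (an assumption).
# source / target and their rows are hypothetical; rows A and B were "copied".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (customer TEXT, email TEXT);
    CREATE TABLE target (customer TEXT, email TEXT);
    INSERT INTO source VALUES ('A', 'a@example.com'), ('B', NULL), ('C', 'c@example.com');
    INSERT INTO target VALUES ('A', 'a@example.com'), ('B', NULL);
""")

def left_over(email_condition):
    """Return source customers with no matching target row under the condition."""
    sql = f"""
        SELECT s.customer FROM source s
        WHERE NOT EXISTS (
            SELECT 1 FROM target t
            WHERE t.customer = s.customer AND {email_condition}
        )
        ORDER BY s.customer
    """
    return [row[0] for row in conn.execute(sql)]

# With plain "=", the NULL email of B never matches, so B looks left over.
plain = left_over("t.email = s.email")

# With coalesce on both sides, B matches its copy and only C remains.
null_aware = left_over("coalesce(t.email, '') = coalesce(s.email, '')")

print(plain, null_aware)  # ['B', 'C'] ['C']
```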