Find Modified/New/Deleted Records Between Two Tables - sql

I want to find new, modified and deleted records in one table (tableA) by comparing it to another table (tableB). Both tables have the same schema and a unique ID field.
In my situation, tableA starts out identical to tableB, but it is then edited by an external organisation. Once they have finished their edits, they send the table back as a ZIP file and we re-populate tableA (truncate and insert) with that data. So I want to find out which records have changed in tableA. I am using SQL Server 2012.
I can get new and modified records with the "except" keyword:
select * from tableA
except
select * from tableB
(Let's call the above results ResultsA)
I can also get deleted and modified records:
select * from tableB
except
select * from tableA
(Let's call the above results ResultsB)
The problem is that ResultsA and ResultsB both contain the records that have been modified, so the modified records are doubled up. I can use an inner join or intersect on ResultsA and ResultsB to get just the modified records (call these ResultsC). But then I need another join/except between ResultsA and ResultsC to get just the new records, and another join/except between ResultsB and ResultsC to get just the deleted records... I tried this and this but they are not working for me.
Obviously this is not good. Is there a simpler, more elegant way to find the records that have been deleted, modified or added in tableA compared to tableB?

How about:
-- DELETED
SELECT B.*, 'DELETED' AS 'CHANGE_TYPE'
FROM TableB B
LEFT JOIN TableA A ON B.PK_ID = A.PK_ID
WHERE A.PK_ID IS NULL
UNION
-- NEW
SELECT A.*, 'NEW' AS 'CHANGE_TYPE'
FROM TableA A
LEFT JOIN TableB B ON B.PK_ID = A.PK_ID
WHERE B.PK_ID IS NULL
UNION
-- MODIFIED
SELECT B.*, 'MODIFIED' AS 'CHANGE_TYPE'
FROM (
SELECT * FROM TableA
EXCEPT
SELECT * FROM TableB
) S1
INNER JOIN TableB B ON S1.PK_ID = B.PK_ID;
Not exactly elegant, but it works.
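To see the three branches on concrete data, here is a minimal sketch. The table variables and sample rows are hypothetical, with @TableA as the edited copy and @TableB as the original. Unlike the query above, the MODIFIED branch here returns the new values from @TableA rather than the old ones from @TableB, and it uses UNION ALL since the three branches cannot overlap.

```sql
-- Hypothetical sample data: @TableA is the edited copy, @TableB the original.
DECLARE @TableA TABLE (PK_ID INT PRIMARY KEY, Number INT);
DECLARE @TableB TABLE (PK_ID INT PRIMARY KEY, Number INT);
INSERT INTO @TableA VALUES (1, 11), (2, 20), (4, 40), (5, 50);
INSERT INTO @TableB VALUES (1, 10), (2, 20), (3, 30), (4, 40);

-- DELETED: present in the original, missing from the edited copy
SELECT B.PK_ID, B.Number, 'DELETED' AS CHANGE_TYPE
FROM @TableB B
LEFT JOIN @TableA A ON B.PK_ID = A.PK_ID
WHERE A.PK_ID IS NULL
UNION ALL
-- NEW: present in the edited copy, missing from the original
SELECT A.PK_ID, A.Number, 'NEW'
FROM @TableA A
LEFT JOIN @TableB B ON B.PK_ID = A.PK_ID
WHERE B.PK_ID IS NULL
UNION ALL
-- MODIFIED: the key exists in both tables but the row content differs
SELECT S1.PK_ID, S1.Number, 'MODIFIED'
FROM (
    SELECT * FROM @TableA
    EXCEPT
    SELECT * FROM @TableB
) S1
INNER JOIN @TableB B ON S1.PK_ID = B.PK_ID;
-- Rows returned (order not guaranteed):
--   (3, 30, DELETED), (5, 50, NEW), (1, 11, MODIFIED)
```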

Based on what I understood, I came up with the following solution.
DECLARE @tableA TABLE (ID INT, Number INT)
DECLARE @tableB TABLE (ID INT, Number INT)
INSERT INTO @tableA VALUES
(1,10),
(2,20),
(3,30),
(4,40)
INSERT INTO @tableB VALUES
(1,11),
(2,20),
(4,40),
(5,50)
SELECT *,'Modified or deleted' as 'Status' FROM
(
select * from @tableA
except
select * from @tableB
)a WHERE ID NOT IN
(
select ID from @tableB
except
select ID from @tableA
)
UNION
SELECT *,'New' as 'Status' FROM
(
select * from @tableB
except
select * from @tableA
)b WHERE ID NOT IN
(
SELECT ID FROM
(
select * from @tableA
except
select * from @tableB
)a WHERE ID NOT IN
(
select ID from @tableB
except
select ID from @tableA
)
)

You can use the OUTPUT clause:
Returns information from, or expressions based on, each row affected by an INSERT, UPDATE, or DELETE statement. These results can be returned to the processing application for use in such things as confirmation messages, archiving, and other such application requirements. Alternatively, results can be inserted into a table or table variable.
See the following; sorry, I don't have practical code for you. Note that the SQL OUTPUT clause can return any value from the ‘inserted’ and ‘deleted’ tables (new value and old value) when doing an insert or update. Follow this for more info.
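As a rough illustration of what the OUTPUT clause returns, here is a minimal sketch with a hypothetical table variable and made-up values. Note that OUTPUT captures changes as a statement runs, so it does not by itself compare two tables that have already been loaded.

```sql
-- Hypothetical sample table
DECLARE @tableA TABLE (ID INT PRIMARY KEY, Number INT);
INSERT INTO @tableA VALUES (1, 10), (2, 20);

-- "deleted" exposes the row as it was before the UPDATE, "inserted" as it is after
UPDATE @tableA
SET Number = Number + 1
OUTPUT deleted.ID, deleted.Number AS OldNumber, inserted.Number AS NewNumber
WHERE ID = 1;
-- returns one row: ID = 1, OldNumber = 10, NewNumber = 11
```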

declare @DBOrderItem table
(
OrderItemGuid UniqueIdentifier default newid(),
Name VarChar(100)
);
declare @PayloadOrderItem table
(
OrderItemGuid UniqueIdentifier default newid(),
Name VarChar(100)
);
insert into @DBOrderItem (Name) values ('Phone');
insert into @DBOrderItem (Name) values ('Laptop');
insert into @PayloadOrderItem
select top 1 * from @DBOrderItem;
insert into @PayloadOrderItem (Name) values ('Tablet');
select doi.OrderItemGuid,
doi.Name,
case when poi.OrderItemGuid is null then 'Delete' else 'Update' end ActionType
from @DBOrderItem doi
left join @PayloadOrderItem poi on doi.OrderItemGuid = poi.OrderItemGuid
union
select poi.OrderItemGuid,
poi.Name,
'Add' ActionType
from @PayloadOrderItem poi
left join @DBOrderItem doi on doi.OrderItemGuid = poi.OrderItemGuid
where doi.OrderItemGuid is null;

Another solution that works quite efficiently is to use WHERE NOT EXISTS over an INTERSECT of the two tables' rows. It's very compact.
SELECT
IsNull(tableB.ID,tableA.ID) as 'ID',
IsNull(tableB.Number,tableA.Number) as 'Number',
'Action' = CASE
WHEN tableB.ID IS NULL THEN 'Deleted'
WHEN tableA.ID IS NULL THEN 'Created'
ELSE 'Updated'
END
FROM tableA
FULL OUTER JOIN tableB
ON tableB.ID = tableA.ID
WHERE
NOT EXISTS (SELECT tableB.* INTERSECT SELECT tableA.*)
This keeps the table scans to a minimum, and detects new, deleted and changed records.
I put all three from here into a fiddle, and it's surprising how differently they all compile.
http://sqlfiddle.com/#!6/b1a5a/5
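One property of the INTERSECT form worth spelling out is that set operators treat two NULLs as equal, whereas a plain <> comparison evaluates to UNKNOWN when either side is NULL. A minimal sketch with hypothetical table variables and sample rows:

```sql
-- Hypothetical sample data with a nullable column
DECLARE @a TABLE (ID INT PRIMARY KEY, Number INT NULL);
DECLARE @b TABLE (ID INT PRIMARY KEY, Number INT NULL);
INSERT INTO @a VALUES (1, NULL), (2, 20), (3, 30);
INSERT INTO @b VALUES (1, 5),    (2, 21), (3, 30);

-- Plain inequality: NULL <> 5 evaluates to UNKNOWN, so the change to ID 1 is missed.
SELECT a.ID
FROM @a a
INNER JOIN @b b ON b.ID = a.ID
WHERE a.Number <> b.Number;
-- returns ID 2 only

-- INTERSECT compares whole rows and handles NULLs, so ID 1 is reported as well.
SELECT a.ID
FROM @a a
INNER JOIN @b b ON b.ID = a.ID
WHERE NOT EXISTS (SELECT a.ID, a.Number INTERSECT SELECT b.ID, b.Number);
-- returns IDs 1 and 2
```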

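This one works without a primary key and is also a bit more elegant (in my opinion!). It is written in Oracle syntax (DUAL, MINUS); on SQL Server, EXCEPT plays the role of MINUS.
WITH A AS (SELECT 1,2,3 FROM DUAL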
UNION ALL
SELECT 1,3,2 FROM DUAL
UNION ALL
SELECT 1,3,1 FROM DUAL),
B AS (SELECT 1,3,2 FROM DUAL
UNION ALL
SELECT 1,2,3 FROM DUAL
UNION ALL
SELECT 1,3,5 FROM DUAL
)
,
C AS
(SELECT * FROM A
MINUS
SELECT * FROM B
),
D AS( SELECT * FROM b
MINUS
SELECT * FROM A)
SELECT C.* ,'Deleted' FROM c
UNION ALL
SELECT D.* ,'Added' FROM D

Related

IF table is empty insert another table

How can I check if table A is empty? And if it is empty, how can I insert the contents of table B into table A? (They are identical.) I would like to run something like this if table A is empty:
INSERT INTO tableA
SELECT * FROM tableB
You can use NOT EXISTS in the WHERE clause:
INSERT INTO tableA
SELECT * FROM tableB
WHERE NOT EXISTS (SELECT 1 FROM tableA)
or:
INSERT INTO tableA
SELECT * FROM tableB
WHERE (SELECT COUNT(*) FROM tableA) = 0
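The thread does not name the database. Assuming SQL Server (T-SQL), the check can also be written as an explicit IF block, which reads closer to the wording of the question. The table variables and sample rows below are hypothetical stand-ins for tableA and tableB.

```sql
-- Hypothetical stand-ins for tableA (empty) and tableB (populated)
DECLARE @tableA TABLE (ID INT, Name VARCHAR(50));
DECLARE @tableB TABLE (ID INT, Name VARCHAR(50));
INSERT INTO @tableB VALUES (1, 'one'), (2, 'two');

-- Copy only when the target holds no rows at all
IF NOT EXISTS (SELECT 1 FROM @tableA)
BEGIN
    INSERT INTO @tableA (ID, Name)
    SELECT ID, Name FROM @tableB;
END

SELECT ID, Name FROM @tableA;  -- both rows were copied, since @tableA started empty
```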

SQL Server - Exclude Records from other tables

I used the search function which brought me to the following solution.
The starting point is the following: I have one table A which stores all data.
From that table I select a certain number of records and store them in table B.
In a new statement I want to select new records from table A that do not appear in table B and store them in table C. I tried to solve this with an AND ... NOT IN statement.
But I still receive records in table C that are in table B.
Important: I can only work with select statements, and each statement needs to start with select as well.
Does anybody have an idea where the problem in the following statement could be:
Select *
From
(Select TOP 10000 *
FROM [table_A]
WHERE Email like '%@domain_A%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Union
(Select TOP 7500 *
FROM [table_A]
WHERE Email like '%@domain_B%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Union
(SELECT TOP 5000 *
FROM [table_A]
WHERE Email like '%@domain_C%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Try NOT EXISTS instead of NOT IN
SELECT
*
FROM TableA A
WHERE NOT EXISTS
(
SELECT 1 FROM TableB WHERE Id = A.Id
)
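One common reason NOT EXISTS is suggested over NOT IN is NULL handling: if the subquery returns even one NULL, NOT IN yields no rows at all. Whether that is what happened in the question is not known; the sketch below only shows the difference, using hypothetical table variables and sample rows.

```sql
-- Hypothetical sample data; @B contains a NULL Id
DECLARE @A TABLE (Id INT);
DECLARE @B TABLE (Id INT NULL);
INSERT INTO @A VALUES (1), (2), (3);
INSERT INTO @B VALUES (1), (NULL);

-- NOT IN: every comparison against the NULL is UNKNOWN, so nothing qualifies.
SELECT Id
FROM @A
WHERE Id NOT IN (SELECT Id FROM @B);
-- returns no rows

-- NOT EXISTS: only asks whether a matching row exists, so the NULL is harmless.
SELECT A.Id
FROM @A A
WHERE NOT EXISTS (SELECT 1 FROM @B B WHERE B.Id = A.Id);
-- returns 2 and 3
```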
So basically, the idea here is to select everything from table A that doesn't exist in table B and insert all of it into table C?
INSERT INTO Table_C
SELECT a.column1, a.column2,......
FROM [table_A] a
LEFT JOIN [table_B] b ON a.id = b.ID
WHERE a.Email like '%@domain_A%' AND b.id IS NULL
Thank you all for your feedback, from which I learned a lot.
I was able to fix the statement with your help. Below is the statement that now works with the desired results:
Select Id
From
(Select TOP 10000 * FROM Table_A
WHERE Email like '%@domain_a%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t1
Union
Select Id
From
(Select TOP 7500 * FROM Table_A
WHERE Email like '%@domain_b%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t2
Union
Select Id
From
(SELECT TOP 5000 * FROM Table_A
WHERE Email like '%@domain_c%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t3

SQL Select where condition : value 1 <> value 2

I need your help to find out whether it is possible to select values from a table with the condition below:
Table content: matching between 2 objects
(Id_obj_A; name_obj_A; country_obj_A; Id_obj_B; name_obj_B; country_obj_B)
Select *
from table
Where (only if country_obj_A <> country_obj_B)
Many thanks for your help
Yes. There are a few ways, one is to use NOT EXISTS like this:
select
*
from tableA
where NOT EXISTS (
select NULL
from tableB
where tableB.country_obj_B = tableA.country_obj_A
)
or, using NOT IN
select
*
from tableA
where country_obj_A NOT IN (
select country_obj_B
from tableB
)
or, using a LEFT JOIN then exclude the joined rows:
select
*
from tableA
left join tableB on tableA.country_obj_A = tableB.country_obj_B
where tableB.country_obj_B IS NULL
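If the question is read literally, with both country columns living in one table, a direct comparison in the WHERE clause may be all that is needed. A minimal sketch, where the table variable @matches and its rows are hypothetical:

```sql
-- Hypothetical single table holding both objects of each match
DECLARE @matches TABLE (
    Id_obj_A INT, name_obj_A VARCHAR(50), country_obj_A VARCHAR(50),
    Id_obj_B INT, name_obj_B VARCHAR(50), country_obj_B VARCHAR(50)
);
INSERT INTO @matches VALUES
    (1, 'a1', 'FR', 10, 'b1', 'FR'),
    (2, 'a2', 'FR', 20, 'b2', 'DE'),
    (3, 'a3', NULL, 30, 'b3', 'DE');

-- Keep only the matches whose two countries differ
SELECT *
FROM @matches
WHERE country_obj_A <> country_obj_B;
-- returns the row with Id_obj_A = 2 only
```

The row with a NULL country is not returned, because a comparison with NULL is neither true nor false; it would need an explicit IS NULL check if such rows should count as different.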

SQL Server : filtering though large set of data

I have 1000 rows and I want to check whether these records exist in the table, as follows:
select *
from table
where ID in ('TS145698', 'TF58964', 'TG47896', 'TS12369')
If I enter 1000 IDs and retrieve data for 786, how do I know which 214 IDs are not in the table?
You can do this with a template table.
DECLARE @Template TABLE (ID NVARCHAR(50))
INSERT INTO @Template
VALUES
('TS145698'),
('TF58964'),
('TG47896'),
('TS12369')
SELECT *
FROM
@Template A LEFT JOIN
table B ON A.ID = B.ID
WHERE
B.ID IS NULL
One way to do it is to enter these values into a table parameter, cte, or temporary table, and then use left join with the actual table.
Another way is to use the values clause:
Create and populate sample table (Please save us this step in your future questions)
DECLARE @T as TABLE
(
Id int
)
INSERT INTO @T VALUES (1), (2), (3), (4)
The query:
SELECT v.Id
FROM (VALUES (1), (2), (3), (5), (6)) AS v(Id) -- Use this instead of the IN operator
LEFT JOIN @T T ON v.Id = T.Id
WHERE T.Id IS NULL
Results:
Id
-----------
5
6
Another option is to use UNION to create your values list:
SELECT v.Id
FROM (
SELECT 1 As Id
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6
) AS v -- Use this instead of the IN operator
LEFT JOIN @T T ON v.Id = T.Id
WHERE T.Id IS NULL
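The same list of candidate values can also be checked with EXCEPT instead of a LEFT JOIN. This is a swapped-in variant rather than something taken from the answers here; the sample data mirrors the table variable used above.

```sql
-- Same sample data as the table-variable example
DECLARE @T TABLE (Id INT);
INSERT INTO @T VALUES (1), (2), (3), (4);

-- Values in the candidate list that do not occur in the table
SELECT v.Id
FROM (VALUES (1), (2), (3), (5), (6)) AS v(Id)
EXCEPT
SELECT Id FROM @T;
-- returns 5 and 6
```

EXCEPT also removes duplicates from the candidate list, which the LEFT JOIN form does not.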
You can create table for criteria values, and then make a left join to main table:
SELECT A.*, B.*
FROM [YourCriteriasTable] AS A
LEFT JOIN [table] AS B ON A.ID = B.ID
The IDs you are looking for will have NULL values in the B.* fields.
Do you mean that:
select * from table where ID not in
('TS145698'
,'TF58964'
,'TG47896'
,'TS12369'
)

SQL: how to find unused primary key

I've got a table with > 1'000'000 entries; this table is referenced from about 130 other tables. My problem is that a lot of those 1 million entries are old and unused.
What's the fastest way to find the entries not referenced by any of the other tables? I don't like to do a
select * from (
select * from table-a TA
minus
select * from table-a TA where TA.id in (
select "ID" from (
(select distinct FK-ID "ID" from table-b)
union all
(select distinct FK-ID "ID" from table-c)
...
Is there an easier, more general way?
Thank you all!
You could do this:
select * from table_a a
where not exists (select * from table_b where fk_id = a.id)
and not exists (select * from table_c where fk_id = a.id)
and not exists (select * from table_d where fk_id = a.id)
...
try :
select a.*
from table_a a
left join table_b b on a.id=b.fk_id
left join table_c c on a.id=c.fk_id
left join table_d d on a.id=d.fk_id
left join table_e e on a.id=e.fk_id
......
where b.fk_id is null
and c.fk_id is null
and d.fk_id is null
and e.fk_id is null
.....
you might also try:
select a.*
from table_a a
left join
(select b.fk_id from table_b b union
select c.fk_id from table_c c union
...) table_union on a.id=table_union.fk_id
where table_union.fk_id is null
This is more SQL oriented and it will not take forever like the above solution.
Not sure about efficiency but:
select * from table_a
where id not in (
select id from table_b
union
select id from table_c )
If your concern is allowing the database to continue normal operations while you do the housekeeping, you could split it into multiple stages:
insert into tblIds
select id from table_b
union
select id from table_c
as many times as you need and then:
delete from table_a where id not in ( select id from tblIds )
Of course sometimes doing a lot of processing takes a lot of time.
I like @Patrick's answer above, but I would like to add to that.
Rather than building the 130-step query by hand, you could build these INSERT statements by scanning sysObjects, finding key relations and generating your INSERT statements.
That would not only save you time, but should also help you to know for sure whether you've covered all the tables - maybe there are 131, or only 129.
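A minimal sketch of that generation step, assuming SQL Server catalog views, declared single-column foreign key constraints, and the hypothetical names dbo.table_a (the referenced table) and tblIds (the staging table). On Oracle the same information lives in data dictionary views such as ALL_CONSTRAINTS and ALL_CONS_COLUMNS instead.

```sql
-- Emits one INSERT statement per foreign key column that references dbo.table_a.
-- dbo.table_a and tblIds are assumed names; returns no rows if the table does not exist.
SELECT 'INSERT INTO tblIds SELECT '
       + QUOTENAME(COL_NAME(fkc.parent_object_id, fkc.parent_column_id))
       + ' FROM '
       + QUOTENAME(OBJECT_SCHEMA_NAME(fkc.parent_object_id)) + '.'
       + QUOTENAME(OBJECT_NAME(fkc.parent_object_id))
       + ';' AS stmt
FROM sys.foreign_key_columns AS fkc
WHERE fkc.referenced_object_id = OBJECT_ID(N'dbo.table_a');
```

The number of rows returned also answers the "131 or 129" question, at least for relationships that are declared as constraints.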
I'm inclined to Marcelo Cantos' answer (and have upvoted it), but here is an alternative in an attempt to circumvent the problem of not having indexes on the foreign keys...
WITH
ids_a AS
(
SELECT id FROM myTable
)
,
ids_b AS
(
SELECT id FROM ids_a WHERE NOT EXISTS (SELECT * FROM table_a WHERE fk_id = ids_a.id)
)
,
ids_c AS
(
SELECT id FROM ids_b WHERE NOT EXISTS (SELECT * FROM table_b WHERE fk_id = ids_b.id)
)
,
...
,
ids_z AS
(
SELECT id FROM ids_y WHERE NOT EXISTS (SELECT * FROM table_y WHERE fk_id = ids_y.id)
)
SELECT * FROM ids_z
All I'm trying to do is to suggest an order to Oracle to minimise its efforts. Unfortunately Oracle will compile this to something very similar to Marcelo Cantos' answer and it may not perform any differently.