I need to delete records that are present in the destination table but not in the source table. The primary key of the destination table is an auto_increment ID, which does not exist in the source table. Both tables contain a combination of columns that uniquely identifies the rows in either table. What approach should I follow? How can I delete when I have to use a multi-column combination as the unique key rather than a single primary key (which is not present in the source)?
delete from dest_table
where (uniq_key_col1,uniq_key_col2) not in (
select dest.uniq_key_col1,dest.uniq_key_col2
from dest_table dest
join source_table source
on dest.uniq_key_col1=source.uniq_key_col1
and dest.uniq_key_col2=source.uniq_key_col2
)
This is how it should ideally look (provided just for clarity; please ignore the error in the WHERE clause caused by the multiple columns).
You can use NOT EXISTS, e.g.:
delete from dest_table
where not exists (
select *
from source_table source
where dest_table.uniq_key_col1=source.uniq_key_col1
and dest_table.uniq_key_col2=source.uniq_key_col2
);
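The NOT EXISTS delete above can be tried out on throwaway data. The following is a minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for the asker's database (the question does not name one); the sample rows and the TEXT/INTEGER column types are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a scratch pad (an assumption;
# the original question does not say which RDBMS is in use).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (uniq_key_col1 TEXT, uniq_key_col2 INTEGER);
    CREATE TABLE dest_table (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        uniq_key_col1 TEXT,
        uniq_key_col2 INTEGER
    );
    INSERT INTO source_table VALUES ('a', 1), ('b', 2);
    INSERT INTO dest_table (uniq_key_col1, uniq_key_col2)
        VALUES ('a', 1), ('b', 2), ('b', 3), ('c', 1);
""")

# Delete destination rows whose key pair has no match in the source.
cur = conn.execute("""
    DELETE FROM dest_table
    WHERE NOT EXISTS (
        SELECT 1
        FROM source_table AS source
        WHERE dest_table.uniq_key_col1 = source.uniq_key_col1
          AND dest_table.uniq_key_col2 = source.uniq_key_col2
    )
""")
deleted_count = cur.rowcount

remaining = conn.execute(
    "SELECT uniq_key_col1, uniq_key_col2 FROM dest_table ORDER BY id"
).fetchall()
print(deleted_count, remaining)
```

Only the key pairs that also appear in source_table survive; the auto-increment id plays no part in the comparison.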
You can do it like this:
DELETE
FROM dbo.dest a
WHERE NOT EXISTS (
SELECT 1
FROM dbo.source1 b
WHERE a.id1 = b.ID1 and a.id2 = b.id2
)
It sounds like NOT EXISTS is what you need
DELETE d FROM dest_table d
WHERE NOT EXISTS (SELECT (PUT_APPROPRIATE_COLUMNS_HERE) from source_table s
WHERE d.col1 = s.col1
AND d.col2 = s.col2
... etc for other columns
)
Note the table aliasing, you need that. And it might be more appropriate to use an inner join, if that is possible with your data.
Another option for you
DELETE dest_table
FROM dest_table
LEFT JOIN source_table
ON dest_table.uniq_key_col1 = source_table.uniq_key_col1
AND dest_table.uniq_key_col2 = source_table.uniq_key_col2
WHERE source_table.uniq_key_col1 IS NULL
I have the following requirement:
I need to delete records from one table based on a given ID. This table is referenced by a second table, which is referenced by a third, which in turn is referenced by a fourth, so I have a chain like this:
table_1 <- table_2 <- table_3 <- table_4
I'm not that experienced with SQL so my solution involves using subqueries to perform this.
I have something like this:
DELETE FROM table_4
WHERE pk_of_table_3 IN
(SELECT id
FROM table_3
WHERE pk_of_table_2 IN
(SELECT id FROM table_2 WHERE pk_of_table_1 = ?
)
)
So this would clear records from table_4 which reference targeted records in table_3.
For table_3 I would do something like:
DELETE FROM table_3
WHERE pk_of_table_2 IN
(SELECT id FROM table_2 WHERE pk_of_table_1 = ?)
Now I move to table_2:
DELETE FROM table_2 WHERE pk_of_table_1 = ?;
In the end this lets me clean up the necessary records in table_1, since nothing references them any more.
Does this seem like a viable solution, and are there better ways to do it?
Do it with a common table expression that chains the DELETE statements:
with delete_t1 as (
delete from table_1
where pk = 42
returning pk
), delete_t2 as (
delete from table_2
where pk_of_table_1 in (select pk from delete_t1)
returning pk
), delete_t3 as (
delete from table_3
where pk_of_table_2 in (select pk from delete_t2)
returning pk
)
delete from table_4
where pk_of_table_3 in (select pk from delete_t3);
If you always do it like that, consider defining the foreign key constraints as on delete cascade, then you only need to delete from table_1 and Postgres will take care of the rest.
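The ON DELETE CASCADE suggestion can be illustrated with a small sketch. It assumes SQLite through Python's sqlite3 module rather than Postgres (which the answer refers to), shortens the chain to three tables, and uses made-up sample rows. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; Postgres needs no such switch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table_2 (
        id INTEGER PRIMARY KEY,
        pk_of_table_1 INTEGER REFERENCES table_1(id) ON DELETE CASCADE
    );
    CREATE TABLE table_3 (
        id INTEGER PRIMARY KEY,
        pk_of_table_2 INTEGER REFERENCES table_2(id) ON DELETE CASCADE
    );
    INSERT INTO table_1 VALUES (42), (43);
    INSERT INTO table_2 VALUES (1, 42), (2, 43);
    INSERT INTO table_3 VALUES (10, 1), (11, 2);
""")

# A single delete at the top of the chain; the cascades remove the dependents.
conn.execute("DELETE FROM table_1 WHERE id = ?", (42,))

left_in_table_2 = conn.execute("SELECT id, pk_of_table_1 FROM table_2").fetchall()
left_in_table_3 = conn.execute("SELECT id, pk_of_table_2 FROM table_3").fetchall()
print(left_in_table_2, left_in_table_3)
```

The trade-off is that the cleanup becomes implicit: anyone deleting from table_1 removes the dependent rows too, whether they intended to or not.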
Consider the scenario of loading a table from a flat file. The table has no constraints or indexes defined. The load was interrupted partway through, and some time later the table was loaded again from the same file, so the records already inserted during the first load are now duplicated. How can I find the duplicate rows? Assume there are 150 columns in the table, so a GROUP BY on each and every column is tedious.
A record is truly duplicate only if all the column values match. It becomes different or unique even if 1 column has a different value. If your table has no primary constraints, you must compare all columns.
An alternative way could be that you could do your 2nd load on a new temp table and populate your old table with records from this temp table where the records do not exist in the old table. In any case you have to compare all columns between the 2 tables to identify truly unique records.
You could also consider adding a primary key to your table and then running your delete query. Check the accepted answer on this link
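Comparing all columns does not have to mean typing 150 names by hand, because the column list can be read from the catalog. The sketch below assumes SQLite via Python's sqlite3 module (the thread itself is about Oracle, where all_tab_cols plays the role of PRAGMA table_info); the table name loaded, its three columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two rows from the interrupted first load, then the full three-row reload.
conn.executescript("""
    CREATE TABLE loaded (c1 TEXT, c2 INTEGER, c3 TEXT);
    INSERT INTO loaded VALUES ('x', 1, 'p'), ('y', 2, 'q');
    INSERT INTO loaded VALUES ('x', 1, 'p'), ('y', 2, 'q'), ('z', 3, 'r');
""")

# Read the column names from the catalog instead of typing them out.
columns = [row[1] for row in conn.execute("PRAGMA table_info(loaded)")]
column_list = ", ".join(f'"{name}"' for name in columns)

# Group on every column; any group with more than one row is a duplicate.
duplicates = conn.execute(f"""
    SELECT {column_list}, COUNT(*) AS copies
    FROM loaded
    GROUP BY {column_list}
    HAVING COUNT(*) > 1
    ORDER BY {column_list}
""").fetchall()
print(duplicates)
```

GROUP BY treats NULLs in the same column as equal, so rows that match on every column, including NULL ones, land in the same group.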
You can use ROWID to find the duplicate rows before deleting them:
Select * FROM table_name A
WHERE
a.rowid > ANY (
SELECT
B.rowid
FROM
table_name B
WHERE
A.col1 = B.col1
AND
A.col2 = B.col2
);
Here is a useful link:
http://www.dba-oracle.com/t_delete_duplicate_table_rows.htm
Tested, and it appears to work.
First, get the table's columns as a comma-separated list:
SELECT wm_concat(column_Name)
FROM all_tab_cols
WHERE table_name = 'TABLENAME' AND column_id IS NOT NULL;
Copy the result into the query below wherever ResultList appears,
and change TableName to your table.
WITH CTE AS (SELECT TN.*, RowNum RN from TableName TN order by ResultList)
SELECT * FROM CTE A
INNER JOIN CTE B using (ResultList)
WHERE A.RN <> B.RN
The above joins the table to itself on all of its columns (via USING), and since duplicate rows will have different row numbers, the result set lists both offending records.
I got this snippet somewhere along the line for deleting dups:
DELETE FROM TABLE_NAME
WHERE ROWID IN
(SELECT ROWID FROM TABLE_NAME
MINUS
SELECT MIN(ROWID) FROM TABLE_NAME
GROUP BY <column list> );
Note the column_list lists the columns that are used to determine uniqueness.
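The keep-the-lowest-ROWID idea can be checked on a toy table. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, where rowid exists as in Oracle but the set-difference operator is spelled EXCEPT instead of MINUS; the columns col1 and col2 stand in for the real column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (col1 TEXT, col2 INTEGER);
    INSERT INTO table_name VALUES
        ('a', 1), ('b', 2), ('a', 1), ('a', 1), ('c', 3);
""")

# Every rowid, minus the smallest rowid of each (col1, col2) group,
# leaves exactly the surplus copies to delete.
conn.execute("""
    DELETE FROM table_name
    WHERE rowid IN (
        SELECT rowid FROM table_name
        EXCEPT
        SELECT MIN(rowid) FROM table_name GROUP BY col1, col2
    )
""")

rows = conn.execute(
    "SELECT col1, col2 FROM table_name ORDER BY rowid"
).fetchall()
print(rows)
```

One copy of each distinct row survives, namely the one that was inserted first.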
Select * FROM table_name A
WHERE
a.rowid > (
SELECT
min (B.rowid)
FROM
table_name B
WHERE
A.row_id = B.row_id
);
Suppose you have a test table dummd (the table you loaded from the flat file) with many columns (say 150, and you are not sure which column is unique or the primary key) and duplicate rows. To get all the unique records you can use UNION and then create a view or a new table, as I did with test1:
create table test1
as
select * from dummd
union
select * from dummd
I have a table of accounting transactions and I am trying to copy them into a new table. I have the copy worked out, but I need to update the copy with any new transactions from my source table. The problem I have is that the source data is coming from a report that links many different sources to create these transactions and doesn't have a unique key.
If I had a unique key, I would create an update query and do a left join from the source table to the copied table and anytime the key is null in the copied table, update those fields.
Since I don't have a unique key, I don't know how to accomplish this. Any ideas?
-----Edit due to Answers-----
SourceTable
Field1 Field2 Field3
CopiedTable
Field1 Field2 Field3
So to update CopiedTable with new records I would do this??
UPDATE CopiedTable SET
CopiedTable.Field1 = SourceTable.Field1,
CopiedTable.Field2 = SourceTable.Field2,
CopiedTable.Field3 = SourceTable.Field3
WHERE (SourceTable.Field1 <> CopiedTable.Field1 AND
SourceTable.Field2 <> CopiedTable.Field2 AND
SourceTable.Field3 <> CopiedTable.Field3)
It is difficult to answer the question without seeing the source tables, the destination table and the query.
Use a compound key consisting of all the unique key fields of the different tables.
EDIT:
Every table must have a primary key. This is a basic design rule for databases. Let us assume that you have two source tables A and B
Table A
-------
A_ID, DataField1, DataField2
Table B (linked to Table A through A_ID)
-------
B_ID, A_ID, DataField3, DataField4
Now you can create table C like this
SELECT
CLng(A.A_ID) AS A_ID, CLng(B.B_ID) AS B_ID,
A.DataField1, A.DataField2, B.DataField3, B.DataField4
INTO
C
FROM
A INNER JOIN B ON A.A_ID = B.A_ID;
I would make A_ID and B_ID a primary key in C.
If A_ID and B_ID are AutoNumbers we need to do a trick with CLng in order to create regular number fields in C.
If later we want to refill C with fresh data from A and B, we can do
DELETE * FROM C;
INSERT INTO C
(A_ID, B_ID, DataField1, DataField2, DataField3, DataField4)
SELECT
A.A_ID, B.B_ID, A.DataField1, A.DataField2, B.DataField3, B.DataField4
FROM
A INNER JOIN B ON A.A_ID = B.A_ID;
If we want to update only changed records, we need to link the source with the copy and, in addition, test if we have changes in the WHERE clause
UPDATE
C
INNER JOIN (SELECT
A.A_ID, B.B_ID,
A.DataField1, A.DataField2, B.DataField3, B.DataField4
FROM
A INNER JOIN B ON A.A_ID = B.A_ID) AS Src
ON C.B_ID = Src.B_ID AND C.A_ID = Src.A_ID
SET
C.DataField1 = Src.DataField1,
C.DataField2 = Src.DataField2,
C.DataField3 = Src.DataField3,
C.DataField4 = Src.DataField4
WHERE
C.DataField1<>Src.DataField1 OR
C.DataField2<>Src.DataField2 OR
C.DataField3<>Src.DataField3 OR
C.DataField4<>Src.DataField4;
The sub-select Src could be another stored query.
I have two tables, table_a and table_b, that share a column named user_name.
I want to copy column_b_1 and column_b_2 from table_b into column_a_1 and column_a_2 of table_a, respectively, wherever the user_name is the same. How can I do this in a SQL statement?
As long as you have suitable indexes in place this should work alright:
UPDATE table_a
SET
column_a_1 = (SELECT table_b.column_b_1
FROM table_b
WHERE table_b.user_name = table_a.user_name )
, column_a_2 = (SELECT table_b.column_b_2
FROM table_b
WHERE table_b.user_name = table_a.user_name )
WHERE
EXISTS (
SELECT *
FROM table_b
WHERE table_b.user_name = table_a.user_name
)
UPDATE in sqlite3 did not support a FROM clause for a long time, which made this a little more work than in other RDBMS. UPDATE FROM was implemented in SQLite 3.33 however (2020-08-14) as mentioned at: https://stackoverflow.com/a/63079219/895245
If performance is not satisfactory, another option might be to build up new rows for table_a using a select and join with table_a into a temporary table. Then delete the data from table_a and repopulate from the temporary.
Starting with SQLite version 3.15, the syntax for UPDATE admits a column-name-list
in the SET part, so the query can be written as
UPDATE table_a
SET
(column_a_1, column_a_2) = (SELECT table_b.column_b_1, table_b.column_b_2
FROM table_b
WHERE table_b.user_name = table_a.user_name )
which is not only shorter but also faster.
The last "WHERE EXISTS" part
WHERE
EXISTS (
SELECT *
FROM table_b
WHERE table_b.user_name = table_a.user_name
)
is only necessary when some rows of table_a have no match in table_b; without it, those rows have both columns set to NULL.
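Whether the WHERE EXISTS guard matters depends on whether every row of table_a has a partner in table_b. The sketch below runs the update both ways on made-up data (users 'ann' and 'bob' are hypothetical), using Python's sqlite3 module. It is written with the scalar-subquery form from the first answer so that it does not depend on SQLite 3.15 or later.

```python
import sqlite3

def run_update(with_exists):
    """Run the correlated-subquery UPDATE, optionally guarded by WHERE EXISTS."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (user_name TEXT, column_a_1 TEXT, column_a_2 TEXT);
        CREATE TABLE table_b (user_name TEXT, column_b_1 TEXT, column_b_2 TEXT);
        INSERT INTO table_a VALUES ('ann', 'old1', 'old2'), ('bob', 'old1', 'old2');
        INSERT INTO table_b VALUES ('ann', 'new1', 'new2');  -- no row for bob
    """)
    sql = """
        UPDATE table_a
        SET column_a_1 = (SELECT table_b.column_b_1 FROM table_b
                          WHERE table_b.user_name = table_a.user_name),
            column_a_2 = (SELECT table_b.column_b_2 FROM table_b
                          WHERE table_b.user_name = table_a.user_name)
    """
    if with_exists:
        sql += """
        WHERE EXISTS (SELECT 1 FROM table_b
                      WHERE table_b.user_name = table_a.user_name)
        """
    conn.execute(sql)
    return conn.execute("SELECT * FROM table_a ORDER BY user_name").fetchall()

guarded = run_update(with_exists=True)     # bob keeps his old values
unguarded = run_update(with_exists=False)  # bob's columns become NULL
print(guarded)
print(unguarded)
```

A subquery that returns no rows evaluates to NULL, which is why the unguarded update overwrites the unmatched row.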
It could be achieved using UPDATE FROM syntax:
UPDATE table_a
SET column_a_1 = table_b.column_b_1
,column_a_2 = table_b.column_b_2
FROM table_b
WHERE table_b.user_name = table_a.user_name;
Alternatively:
UPDATE table_a
SET (column_a_1, column_a_2) = (table_b.column_b_1, table_b.column_b_2)
FROM table_b
WHERE table_b.user_name = table_a.user_name;
UPDATE FROM - SQLite version 3.33.0
The UPDATE-FROM idea is an extension to SQL that allows an UPDATE statement to be driven by other tables in the database. The "target" table is the specific table that is being updated. With UPDATE-FROM you can join the target table against other tables in the database in order to help compute which rows need updating and what the new values should be on those rows
Another way to update one table from another is to use a CTE:
;WITH a AS
(
SELECT
song_id,
artist_id
FROM
online_performance
)
UPDATE record_performance
SET
op_song_id=(SELECT song_id FROM a),
op_artist_id=(SELECT artist_id FROM a)
;
Update tbl1
Set field1 = values,
field2 = values
Where primary key in tbl1 IN ( select tbl2.primary key in tbl1
From tbl2
Where tbl2.primary key in tbl1 =
values);
The accepted answer was very slow for me, which is in contrast to the following:
CREATE TEMPORARY TABLE t1 AS SELECT c_new AS c1, table_a.c2 AS c2 FROM table_b INNER JOIN table_a ON table_b.c=table_a.c1;
CREATE TEMPORARY TABLE t2 AS SELECT t1.c1 AS c1, c_new AS c2 FROM table_b INNER JOIN t1 ON table_b.c=t1.c2;
I have tables A and B. Items of table B might exist also in table A, and I want to delete those items. What would the SQL statements to do this look like?
This is an option:
delete from a
where a.key in (select key from b)
Either:
DELETE a
WHERE a.some_field IN (SELECT some_field FROM b)
or
DELETE A
WHERE EXISTS (SELECT 1 FROM b WHERE b.field1 = a.field2)
Depending on your database, you may find one works particularly better than the other. IIRC Oracle tends to prefer WHERE EXISTS to IN but this can depend on a number of factors.
In certain DBs the rather exotic looking DELETE FROM FROM is very efficient
delete from foo from foo as f
where exists
(
select 1 from bar as b where b.field = f.field
)
Something like:
DELETE
FROM TableA as A
WHERE A.ID IN (SELECT ID
FROM TableB AS B
WHERE [your condition here])
If your tables use InnoDB, the easiest way would be to setup table A with foreign keys from table B, and use ON DELETE CASCADE. That way no code changes are necessary, and the integrity of your database is guaranteed.
This is not standard SQL, but MySQL and SQL Server accept it:
delete a from a join b on a.id = b.id;
DELETE FROM FIRST_TABLE FT
WHERE EXISTS(
SELECT 1
FROM SECOND_TABLE ST
WHERE ST.PRIMARY_KEY = FT.PRIMARY_KEY
);
delete a
--select a.*
from tablea a
join tableb b on a.someid = b.someid
Make sure that you run the select part first to ensure that you are getting the records you want.
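The "run the select first" advice can be made mechanical by sharing one predicate between the preview and the delete. The following is a sketch, assuming SQLite via Python's sqlite3 module and hypothetical sample rows; since a join-style delete is not something every database accepts, the join is expressed here as an EXISTS predicate instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (someid INTEGER, label TEXT);
    CREATE TABLE tableb (someid INTEGER);
    INSERT INTO tablea VALUES (1, 'keep'), (2, 'drop'), (3, 'drop');
    INSERT INTO tableb VALUES (2), (3), (4);
""")

# One predicate, used verbatim by both statements, so the preview
# cannot drift away from what the delete actually removes.
predicate = "EXISTS (SELECT 1 FROM tableb AS b WHERE b.someid = tablea.someid)"

preview = conn.execute(
    f"SELECT someid, label FROM tablea WHERE {predicate} ORDER BY someid"
).fetchall()
print("would delete:", preview)

cur = conn.execute(f"DELETE FROM tablea WHERE {predicate}")
deleted_count = cur.rowcount
remaining = conn.execute("SELECT someid, label FROM tablea").fetchall()
print(deleted_count, remaining)
```

Comparing the number of previewed rows with the reported delete count is a cheap sanity check before committing.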