Oracle SQL - Absurdly expensive update statement? - sql

I have this statement:
update new_table t2
set t2.creation_date_utc =
(select creation_date from old_table t1 where t2.id = t1.id)
where exists
(select 1 from old_table t1 where t2.id = t1.id);
The cost according to the explain plan is 150959919. The explain plan shows some full table access for a total cost of like 3000, and then the update has that basically infinite cost. It indeed seems to go on forever if run.
FYI, these tables have no more than 300k rows each.
In addition, this query.
select
(select creation_date from old_table t1 where t2.id = t1.id)
from new_table t2;
finishes basically instantly.
What could be the cause of this?

if you have an update statement in the format that you mentioned, then it implies that the id field is unique across the old_table (otherwise the first inner query would have raised error returning multiple values for update when in fact, only one value can be processed). So, you can modify your first query to be(removing the where clause since it is redundant) :-
update new_table t2
set t2.creation_date_utc =
(select creation_date from old_table t1 where t2.id = t1.id);
it might be possible that the above query will still take a long time owing to full table scans. So, you have two options :-
apply an index to the old_table table on the id field using the following command.
Create index index_name on old_table(id);
modify your update query to the following (untested) :-
update new_table t2
set t2.creation_date_utc=
(select creation_date from old_table t1 where t2.id=t1.id and rownum=1);
The rownum=1 should instruct oracle not to do any more searching as soon as the first match is found in old_table.
I would recommend the first approach though.

You don't have an index on old_table.id and nested loop joins are very expensive. An index on old_table(id, creation_date) would be best.

Why not use merge statement? I've found it to be more efficient in similar cases.
merge into new_table t2
using old_table t1
on (t2.id = t1.id)
when matched then
update set t2.creation_date_utc = t1.creation_date;

This may be faster and equivalent:
update new_table t2
set t2.creation_date_utc =
(select creation_date from old_table t1 where t2.id = t1.id)
where t2.id in
(select id from old_table t1);

Related

Update Table1 if no referencing row in TABLE2 exists, limit updates

The PK of table1 is a FK in table2.
I wish to update the status in table1 where no record exists in table2 and limit the number of updates. There may be no records in table2.
Something like:
UPDATE t1
SET status = 0
WHERE NOT EXISTS (
SELECT id
FROM t2
WHERE t1.id = t2.id
LIMIT 1000
)
This is a little complicated in Postgres, because there is no limit. Assuming that you have a primary key in t1 (which I'll assume is id), you can use a subquery to determine the rows to update and then match in the WHERE clause:
UPDATE t1
SET status = 0
FROM (SELECT tt1.*
FROM t1 tt1
WHERE NOT EXISTS (SELECT t2.id FROM t2 WHERE tt1.id = t2.id)
LIMIT 1000
) ttl
WHERE t1.id = tt1.id;
If you are doing this under concurrent write load, there is a race condition between the subquery (the SELECT to determine rows) and the outer UPDATE, which can lead to wrong results. To defend against this, add a row locking clause.
However, this needs to be done in a CTE to be reliable (at least in my test wit Postgres up to version 10). So:
WITH cte AS (
SELECT id -- PK
FROM t1
WHERE NOT EXISTS (SELECT FROM t2 WHERE t2.id = t1.id)
LIMIT 1000
FOR UPDATE -- SKIP LOCKED -- ?
)
UPDATE t1
SET status = 0
FROM cte
WHERE t1.id = cte.id
RETURNING id; -- optional
If you run multiple commands like this, possibly in parallel, add SKIP LOCKED, so that they don't block each other.
This only secures existing rows in t1. There is still the problem that conflicting rows might be added in t2 between SELECT and UPDATE.
You mentioned a FK constraint. I am not sure from the top of my head whether depending rows in t2 are blocked from being added by the FK constraint while there is a FOR UPDATE lock on the parent row. Would have to test, but out of time right now.
Postgres has no predicate-locking for user commands. (Users can only lock existing rows.) To be absolutely sure, you could also use the (more expensive) SERIALIZABLE transaction isolation.
See:
Postgres UPDATE … LIMIT 1

ORA-01427 Subquery returns more then one row..Oracle Update Statement

How do you write a update statement with a Sub-Select in an Oracle Environment (SQL Developer)?
Example: UPDATE table SET column = (SELECT....)
Every time I try this it gives me ORA-01427 "Sub select returns more then one row" even if there is no WHERE clause..
Based on the understanding of your question I'd suggest use Merge statement.
Merge into Table1
Using
(SELECT * from table2 where condition) Temp
On (Table1.columname condition Temp.columname)
When matched Then update Set Table1.column_name = Temp.column_name;
Table1 is the table where you want to update the records.
Table2 is the table from which you want to get the data (The sub query which you are talking about )
Using this merge statement you will be able to update n number of rows.
If you want to update multiple rows, you can either use a MERGE statement (as in #jackkds7's answer above) or you can use a filter on your subselect:
UPDATE table t1
SET column = ( SELECT column FROM table2 t2 WHERE t2.key = t1.key );
If there aren't matches in table2 for all the records in table then column will be set to NULL for the non-matches. To avoid that, add a WHERE EXISTS clause:
UPDATE table t1
SET column = ( SELECT column FROM table2 t2 WHERE t2.key = t1.key )
WHERE EXISTS ( SELECT 1 FROM table2 t2 WHERE t2.key = t1.key );
Oh and in the event that key is not unique for table2, you can aggregate (up to you to figure out which function would be best):
UPDATE table t1
SET column = ( SELECT MAX(column) FROM table2 t2 WHERE t2.key = t1.key )
WHERE EXISTS ( SELECT 1 FROM table2 t2 WHERE t2.key = t1.key );
Hope this helps.
I think it would help if you posted your actual query.
In essence, the "inner" select would be executed for each row that would be updated. This inner select query is called a correlated subquery:
UPDATE table t SET t.column = (
select ot.othercolumn from othertable ot
where ot.fk = t.id --This is the correlation part, that finds
--he right value for the row you are currently updating
)
You must ensure the subquery you use will always return just a single row and a single column for every time it runs (that is, for every row that is going to be updated). If needed, you can use MAX(), or ROWNUM to ensure you always only get 1 value
More examples:
Using Correlated Subqueries

Deleting rows in Access based on rows in another table [duplicate]

I can't seem to ever remember this query!
I want to delete all rows in table1 whose ID's are the same as in Table2.
So:
DELETE table1 t1
WHERE t1.ID = t2.ID
I know I can do a WHERE ID IN (SELECT ID FROM table2) but I want to do this query using a JOIN if possible.
DELETE t1
FROM Table1 t1
JOIN Table2 t2 ON t1.ID = t2.ID;
I always use the alias in the delete statement as it prevents the accidental
DELETE Table1
caused when failing to highlight the whole query before running it.
DELETE Table1
FROM Table1
INNER JOIN Table2 ON Table1.ID = Table2.ID
There is no solution in ANSI SQL to use joins in deletes, AFAIK.
DELETE FROM Table1
WHERE Table1.id IN (SELECT Table2.id FROM Table2)
Later edit
Other solution (sometimes performing faster):
DELETE FROM Table1
WHERE EXISTS( SELECT 1 FROM Table2 Where Table1.id = Table2.id)
PostgreSQL implementation would be:
DELETE FROM t1
USING t2
WHERE t1.id = t2.id;
Try this:
DELETE Table1
FROM Table1 t1, Table2 t2
WHERE t1.ID = t2.ID;
or
DELETE Table1
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.ID = t2.ID;
I think that you might get a little more performance if you tried this
DELETE FROM Table1
WHERE EXISTS (
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)
This will delete all rows in Table1 that match the criteria:
DELETE Table1
FROM Table2
WHERE Table1.JoinColumn = Table2.JoinColumn And Table1.SomeStuff = 'SomeStuff'
Found this link useful
Copied from there
Oftentimes, one wants to delete some records from a table based on criteria in another table. How do you delete from one of those tables without removing the records in both table?
DELETE DeletingFromTable
FROM DeletingFromTable INNER JOIN CriteriaTable
ON DeletingFromTable.field_id = CriteriaTable.id
WHERE CriteriaTable.criteria = "value";
The key is that you specify the name of the table to be deleted from as the SELECT. So, the JOIN and WHERE do the selection and limiting, while the DELETE does the deleting. You're not limited to just one table, though. If you have a many-to-many relationship (for instance, Magazines and Subscribers, joined by a Subscription) and you're removing a Subscriber, you need to remove any potential records from the join model as well.
DELETE subscribers, subscriptions
FROM subscribers INNER JOIN subscriptions
ON subscribers.id = subscriptions.subscriber_id
INNER JOIN magazines
ON subscriptions.magazine_id = magazines.id
WHERE subscribers.name='Wes';
Deleting records with a join could also be done with a LEFT JOIN and a WHERE to see if the joined table was NULL, so that you could remove records in one table that didn't have a match (like in preparation for adding a relationship.) Example post to come.
Since the OP does not ask for a specific DB, better use a standard compliant statement.
Only MERGE is in SQL standard for deleting (or updating) rows while joining something on target table.
merge table1 t1
using (
select t2.ID
from table2 t2
) as d
on t1.ID = d.ID
when matched then delete;
MERGE has a stricter semantic, protecting from some error cases which may go unnoticed with DELETE ... FROM. It enforces 'uniqueness' of match : if many rows in the source (the statement inside using) match the same row in the target, the merge must be canceled and an error must be raised by the SQL engine.
To Delete table records based on another table
Delete From Table1 a,Table2 b where a.id=b.id
Or
DELETE FROM Table1
WHERE Table1.id IN (SELECT Table2.id FROM Table2)
Or
DELETE Table1
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.ID = t2.ID;
I often do things like the following made-up example. (This example is from Informix SE running on Linux.)
The point of of this example is to delete all real estate exemption/abatement transaction records -- because the abatement application has a bug -- based on information in the real_estate table.
In this case last_update != nullmeans the account is not closed, and res_exempt != 'p' means the accounts are not personal property (commercial equipment/furnishings).
delete from trans
where yr = '16'
and tran_date = '01/22/2016'
and acct_type = 'r'
and tran_type = 'a'
and bill_no in
(select acct_no from real_estate where last_update is not null
and res_exempt != 'p');
I like this method, because the filtering criteria -- at least for me -- is easier to read while creating the query, and to understand many months from now when I'm looking at it and wondering what I was thinking.
Referencing MSDN T-SQL DELETE (Example D):
DELETE FROM Table1
FROM Tabel1 t1
INNER JOIN Table2 t2 on t1.ID = t2.ID
This is old I know, but just a pointer to anyone using this ass a reference. I have just tried this and if you are using Oracle, JOIN does not work in DELETE statements.
You get a the following message:
ORA-00933: SQL command not properly ended.
While the OP doesn't want to use an 'in' statement, in reply to Ankur Gupta, this was the easiest way I found to delete the records in one table which didn't exist in another table, in a one to many relationship:
DELETE
FROM Table1 as t1
WHERE ID_Number NOT IN
(SELECT ID_Number FROM Table2 as t2)
Worked like a charm in Access 2016, for me.
delete
table1
from
t2
where
table1.ID=t2.ID
Works on mssql

SQL Optimizer / Execution Plan - Repeated Subqueries

NOTE: I'm currently running my queries in question on a sqlite3 DB, though answers from expertise in any other DBMS will be welcome insight...
I was wondering if the query optimizer makes any attempt to identify repeated queries/subqueries and run them only once if so.
Here is my example query:
SELECT *
FROM table1 AS t1
WHERE t1.fk_id =
(
SELECT t2.fk_id
FROM table2 AS t2
WHERE t2.id = 1111
)
OR t1.fk_id =
(
SELECT local_id
FROM ID_MAP
WHERE remote_id =
(
SELECT t2.fk_id
FROM table2 AS t2
WHERE t2.id = 1111
)
);
Will the nested query
SELECT t2.fk_id
FROM table2 AS t2
WHERE t2.id = 1111
be run only once (and its results cached for further access) ?
Its not a big deal in this example, since its a simple query that executes only twice, however I need it to run
about 4-5 more times (x2, twice for each child record, so 8-10 really) in my actual program (its grabbing all child records (table1)
associated to a parent record (table2), bound by a foreign key. Its also checking an id mapping table to make sure it queries
for both a locally generated id, as well as the real/updated/new key).
I really appreciate any help with this, thank you.
SQLite has a very simple query optimizer, and does not even try to detect identical subqueries:
> create table t(x);
> explain query plan
select * from t
where x in (select x from t) or
x in (select x from t);
0|0|0|SCAN TABLE t (~500000 rows)
0|0|0|EXECUTE LIST SUBQUERY 1
1|0|0|SCAN TABLE t (~1000000 rows)
0|0|0|EXECUTE LIST SUBQUERY 2
2|0|0|SCAN TABLE t (~1000000 rows)
The same applies to CTEs and views; if the performance actually matters, your best bet is to create a temporary table for the result of the subquery.
As you asked for insight from other DBs....
In Oracle DBMS, any independent subquery will be executed only once.
SELECT t2.fk_id
FROM table2 AS t2
WHERE t2.id = 1111 -- The result will be the same for any row in t1.
Dependant subqueries will need to executed repeatedly, of course.
Example of dependent subquery:
SELECT t2.fk_id
FROM table2 AS t2
WHERE t2.id = t1.t2_id -- t1.t2_id will have different values for different rows in t1.

Delete all rows in a table based on another table

I can't seem to ever remember this query!
I want to delete all rows in table1 whose ID's are the same as in Table2.
So:
DELETE table1 t1
WHERE t1.ID = t2.ID
I know I can do a WHERE ID IN (SELECT ID FROM table2) but I want to do this query using a JOIN if possible.
DELETE t1
FROM Table1 t1
JOIN Table2 t2 ON t1.ID = t2.ID;
I always use the alias in the delete statement as it prevents the accidental
DELETE Table1
caused when failing to highlight the whole query before running it.
DELETE Table1
FROM Table1
INNER JOIN Table2 ON Table1.ID = Table2.ID
There is no solution in ANSI SQL to use joins in deletes, AFAIK.
DELETE FROM Table1
WHERE Table1.id IN (SELECT Table2.id FROM Table2)
Later edit
Other solution (sometimes performing faster):
DELETE FROM Table1
WHERE EXISTS( SELECT 1 FROM Table2 Where Table1.id = Table2.id)
PostgreSQL implementation would be:
DELETE FROM t1
USING t2
WHERE t1.id = t2.id;
Try this:
DELETE Table1
FROM Table1 t1, Table2 t2
WHERE t1.ID = t2.ID;
or
DELETE Table1
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.ID = t2.ID;
I think that you might get a little more performance if you tried this
DELETE FROM Table1
WHERE EXISTS (
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)
This will delete all rows in Table1 that match the criteria:
DELETE Table1
FROM Table2
WHERE Table1.JoinColumn = Table2.JoinColumn And Table1.SomeStuff = 'SomeStuff'
Found this link useful
Copied from there
Oftentimes, one wants to delete some records from a table based on criteria in another table. How do you delete from one of those tables without removing the records in both table?
DELETE DeletingFromTable
FROM DeletingFromTable INNER JOIN CriteriaTable
ON DeletingFromTable.field_id = CriteriaTable.id
WHERE CriteriaTable.criteria = "value";
The key is that you specify the name of the table to be deleted from as the SELECT. So, the JOIN and WHERE do the selection and limiting, while the DELETE does the deleting. You're not limited to just one table, though. If you have a many-to-many relationship (for instance, Magazines and Subscribers, joined by a Subscription) and you're removing a Subscriber, you need to remove any potential records from the join model as well.
DELETE subscribers, subscriptions
FROM subscribers INNER JOIN subscriptions
ON subscribers.id = subscriptions.subscriber_id
INNER JOIN magazines
ON subscriptions.magazine_id = magazines.id
WHERE subscribers.name='Wes';
Deleting records with a join could also be done with a LEFT JOIN and a WHERE to see if the joined table was NULL, so that you could remove records in one table that didn't have a match (like in preparation for adding a relationship.) Example post to come.
Since the OP does not ask for a specific DB, better use a standard compliant statement.
Only MERGE is in SQL standard for deleting (or updating) rows while joining something on target table.
merge table1 t1
using (
select t2.ID
from table2 t2
) as d
on t1.ID = d.ID
when matched then delete;
MERGE has a stricter semantic, protecting from some error cases which may go unnoticed with DELETE ... FROM. It enforces 'uniqueness' of match : if many rows in the source (the statement inside using) match the same row in the target, the merge must be canceled and an error must be raised by the SQL engine.
To Delete table records based on another table
Delete From Table1 a,Table2 b where a.id=b.id
Or
DELETE FROM Table1
WHERE Table1.id IN (SELECT Table2.id FROM Table2)
Or
DELETE Table1
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.ID = t2.ID;
I often do things like the following made-up example. (This example is from Informix SE running on Linux.)
The point of of this example is to delete all real estate exemption/abatement transaction records -- because the abatement application has a bug -- based on information in the real_estate table.
In this case last_update != nullmeans the account is not closed, and res_exempt != 'p' means the accounts are not personal property (commercial equipment/furnishings).
delete from trans
where yr = '16'
and tran_date = '01/22/2016'
and acct_type = 'r'
and tran_type = 'a'
and bill_no in
(select acct_no from real_estate where last_update is not null
and res_exempt != 'p');
I like this method, because the filtering criteria -- at least for me -- is easier to read while creating the query, and to understand many months from now when I'm looking at it and wondering what I was thinking.
Referencing MSDN T-SQL DELETE (Example D):
DELETE FROM Table1
FROM Tabel1 t1
INNER JOIN Table2 t2 on t1.ID = t2.ID
This is old I know, but just a pointer to anyone using this ass a reference. I have just tried this and if you are using Oracle, JOIN does not work in DELETE statements.
You get a the following message:
ORA-00933: SQL command not properly ended.
While the OP doesn't want to use an 'in' statement, in reply to Ankur Gupta, this was the easiest way I found to delete the records in one table which didn't exist in another table, in a one to many relationship:
DELETE
FROM Table1 as t1
WHERE ID_Number NOT IN
(SELECT ID_Number FROM Table2 as t2)
Worked like a charm in Access 2016, for me.
delete
table1
from
t2
where
table1.ID=t2.ID
Works on mssql