I use Oracle 10g.
I have an update statement like this:
update table1 t1
set v_value=(select v_value
from table2 t2
where t2.user_id=t1.user_id
and t2.item_id=t1.item_id )
It works but takes too much time. How can I optimize it ?
You can try a merge statement:
merge into table1 t1
using
(
select user_id,
item_id,
v_value
from table2
) t2 ON (t1.user_id = t2.user_id and t1.item_id = t2.item_id)
when matched then update
set v_value = t2.v_value;
(You might need to check the syntax: there were some changes between 10g and 11g as to which parts of a MERGE are mandatory and which are not, and I haven't used 10g for a long time.)
It is almost impossible to tune a statement just by looking at it.
Get the explain plan (set autotrace on); it will tell you whether the statement does a full table scan.
I also suggest a free tool I just posted, odbtools.com. It lets you analyze your statement and generate the explain plan, which shows whether you have a full table scan and whether the select part uses an index.
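As a rough illustration of what "check the plan" means, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere; Oracle's own plan output and autotrace look different). The index name idx_t2_user_item is hypothetical. It prints the plan for the lookup that the correlated subquery performs, before and after adding a composite index on (user_id, item_id).

```python
import sqlite3

# SQLite is only a stand-in here; Oracle reports plans in a different format.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER, item_id INTEGER, v_value INTEGER);
    CREATE TABLE table2 (user_id INTEGER, item_id INTEGER, v_value INTEGER);
""")

# The lookup that the correlated subquery runs once per table1 row.
lookup = "SELECT v_value FROM table2 WHERE user_id = ? AND item_id = ?"


def plan(sql, params):
    """Return the plan's human-readable detail strings joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # detail is the last column


plan_before = plan(lookup, (1, 1))  # no usable index yet: table2 is scanned

# Hypothetical index name; the columns match the subquery's WHERE clause.
conn.execute("CREATE INDEX idx_t2_user_item ON table2 (user_id, item_id)")
plan_after = plan(lookup, (1, 1))   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

If the "after" plan still shows a scan on your real database, the index does not match the join columns, and that is usually the first thing to fix.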
I am quite a newbie with SQL queries but I need to modify a column of a table relatively to the column of another table. For now I have the following query working:
UPDATE table1
SET date1=(
SELECT last_day(max(date2))+1
FROM table2
WHERE id=123
)
WHERE id=123
AND date1=to_date('31/12/9999', 'dd/mm/yyyy');
The problem with this structure is that, I suppose, the SELECT query will be executed for every line of the table1. So I tried to create another query but this one has a syntax error somewhere after the FROM keyword:
UPDATE t1
SET t1.date1=last_day(max(t2.date2))+1
FROM table1 t1
INNER JOIN table2 t2
ON t1.id=t2.id
WHERE t1.id=123
AND t1.date1=to_date('31/12/9999', 'dd/mm/yyyy');
AND besides that I don't even know if this one is faster than the first one...
Do you have any idea how I can handle this issue?
Thanks a lot!
Kind regards,
Julien
The first code you wrote is fine. It won't be executed for every line of the table1 as you fear. It will do the following:
It will run the subquery to find the value you want to use in your UPDATE statement, searching through table2. As you have stated the exact id, it should be as fast as possible, as long as you have created an index on that (I guess primary key) column.
It will run the outer query, finding the single row you want to update. As before, it should be as fast as possible because you have stated the exact id, as long as there is an index on that column.
To summarize: if those IDs are unique, both your subquery and your outer query should touch only one row, and the statement should execute as fast as possible. If you think execution is not fast enough (that is, it takes longer than the amount of data would justify), check whether those columns have unique values and whether they have unique indexes on them.
In fact, it would be best to add those indexes regardless of this problem, if they do not exist and the columns have unique values, as they would drastically improve the performance of every query that searches these tables by these id columns.
Please try to use MERGE
MERGE INTO (
SELECT id,
date1
FROM table1
WHERE date1 = to_date('31/12/9999', 'dd/mm/yyyy')
AND id = 123
) t1
USING (
SELECT id,
last_day(max(date2))+1 max_date
FROM table2
WHERE id=123
GROUP BY id
) t2 ON (t1.id = t2.id)
WHEN MATCHED THEN
UPDATE SET t1.date1 = t2.max_date
;
I have two tables. Table 2 contains more recent records.
Table 1 has 900K records and Table 2 about the same.
The query below takes about 10 minutes to execute. While it is running, most other queries against Table 1 fail with a timeout exception.
DELETE T1
FROM Table1 T1 WITH(NOLOCK)
LEFT OUTER JOIN Table2 T2
ON T1.ID = T2.ID
WHERE T2.ID IS NULL AND T1.ID IS NOT NULL
Could someone help me to optimize the query above or write something more efficient?
Also how to fix the problem with time out issue?
The optimizer will likely choose to lock the whole table, as that is easier when it needs to delete that many rows. In a case like this I delete in chunks.
while(1 = 1)
begin
with cte
as
(
select *
from Table1
where Id not in (select Id from Table2)
)
delete top(1000) cte
if @@rowcount = 0
break
waitfor delay '00:00:01' -- give it some rest :)
end
So the query deletes 1000 rows at a time. The optimizer will likely lock just a page to delete the rows, not the whole table.
The total time of this query execution will be longer, but it will not block other callers.
Disclaimer: assumed MS SQL.
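The same chunking idea can be tried outside SQL Server. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database and a hypothetical batch size of 20 (the answer above uses 1000). The helper name delete_orphans_in_batches is also an assumption. Each batch is committed separately, which is what keeps locks short-lived; SQLite's locking model differs from SQL Server's, so this only illustrates the loop structure, not the locking behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY);
""")
# table1 holds ids 1..100, table2 only the odd ids, so 50 rows are orphans.
conn.executemany("INSERT INTO table1 (id) VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany("INSERT INTO table2 (id) VALUES (?)", [(i,) for i in range(1, 101, 2)])
conn.commit()

BATCH_SIZE = 20  # hypothetical value chosen for the demo


def delete_orphans_in_batches(conn, batch_size):
    """Delete table1 rows with no match in table2, one transaction per batch."""
    batches = 0
    while True:
        cur = conn.execute(
            """
            DELETE FROM table1
            WHERE id IN (
                SELECT t1.id
                FROM table1 t1
                WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # end the transaction so other callers can get in
        if cur.rowcount == 0:
            break      # nothing left to delete
        batches += 1
    return batches


batches = delete_orphans_in_batches(conn, BATCH_SIZE)
remaining = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(batches, remaining)
```

A pause between batches, like the waitfor delay above, can be added with time.sleep if other callers need more room.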
Another approach is to use SNAPSHOT transaction. This way table readers will not be blocked while rows are being deleted.
Wait a second, are you trying to do this...
DELETE Table1 WHERE ID NOT IN (SELECT ID FROM Table2)
?
If so, that's how I would write it.
You could also try to update the statistics on both tables. And of course indexes on Table1.ID and Table2.ID could speed things up considerably.
EDIT: If you're getting timeouts from the designer, increase the "Designer" timeout value in SSMS (default is 30 seconds). Tools -> Options -> Designers -> "Override connection string time-out value for table designer updates" -> enter reasonable number (in seconds).
Both ID columns need an index
Then use simpler SQL
DELETE Table1 WHERE NOT EXISTS (SELECT * FROM Table2 WHERE Table1.ID = Table2.ID)
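One practical difference between the NOT IN and NOT EXISTS forms is worth checking on your own data: if Table2.ID can contain NULL, NOT IN matches nothing at all, while NOT EXISTS still finds the orphans. Here is a minimal sketch of that behaviour, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (the NULL semantics shown are standard SQL; the sample rows are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def fresh_tables():
    """Recreate both tables; Table2 deliberately contains a NULL id."""
    conn.executescript("""
        DROP TABLE IF EXISTS Table1;
        DROP TABLE IF EXISTS Table2;
        CREATE TABLE Table1 (ID INTEGER);
        CREATE TABLE Table2 (ID INTEGER);
        INSERT INTO Table1 (ID) VALUES (1), (2), (3);
        INSERT INTO Table2 (ID) VALUES (1), (NULL);
    """)


def remaining_ids():
    return [row[0] for row in conn.execute("SELECT ID FROM Table1 ORDER BY ID")]


# NOT IN: "2 NOT IN (1, NULL)" evaluates to NULL, so no row qualifies.
fresh_tables()
conn.execute("DELETE FROM Table1 WHERE ID NOT IN (SELECT ID FROM Table2)")
left_by_not_in = remaining_ids()

# NOT EXISTS: the NULL row simply never matches, so the orphans are deleted.
fresh_tables()
conn.execute(
    "DELETE FROM Table1 "
    "WHERE NOT EXISTS (SELECT 1 FROM Table2 WHERE Table2.ID = Table1.ID)"
)
left_by_not_exists = remaining_ids()

print(left_by_not_in, left_by_not_exists)
```

If the ID column is declared NOT NULL in Table2, the two forms delete the same rows.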
I have a scenario where I would like to update multiple fields in multiple tables using just one instruction. I need a syntax to perform such operations on multiple databases (Oracle and MSSQL).
At the moment I am stuck at the following statement from MSSQL:
update table1
set table1.value = 'foo'
from table1 t1 join table2 t2 on t1.id = t2.tab1_id
where t1.id = 1234
I would like to update a field in t2 as well in the same statement.
Further, I would like to perform the same update(s) on Oracle.
EDIT: It seems I cannot update multiple tables in just one statement. Is there a syntax that works for both Oracle and MSSQL when updating using a join?
Regards
"Seems like I can not update multiple Tables in just one statement.
Is there a syntax that works for Oracle and MSSql when updating using a Join?"
I assume that, when you re-posed the question, you wanted syntax that will work on both Oracle and SQL Server even though it will inevitably affect only one table.
Entry level SQL-92 Standard code is supported by both platforms, therefore the following 'scalar subqueries' SQL-92 code should work:
-- Table aliases are written without AS because Oracle rejects AS there.
UPDATE table1
SET my_value = (
SELECT t2.tab1_id
FROM table2 t2
WHERE t2.tab1_id = table1.id
)
WHERE id = 1234
AND EXISTS (
SELECT *
FROM table2 t2
WHERE t2.tab1_id = table1.id
);
Note that while using a correlation name such as t1 for table1 is valid syntax according to the SQL-92 Standard, doing so materializes a table; the UPDATE then targets the materialized table 't1' and leaves your base table 'table1' unaffected, which I assume is not the desired effect. While I'm fairly sure both Oracle and SQL Server are non-compliant in this regard and that in practice it would work as expected, there's no harm in being ultra cautious and sticking to the SQL-92 syntax by fully qualifying the target table.
Folk tend not to like the 'repeated' code in the above subqueries (even though the optimizer should be smart enough to evaluate it only once).
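To see what the repeated EXISTS subquery buys you, here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle or SQL Server, with two made-up rows. The helper name run_update is hypothetical. Without the guard, rows that have no match in table2 are overwritten with NULL; with it, they are left alone.

```python
import sqlite3


def run_update(with_exists_guard):
    """Run the scalar-subquery UPDATE with or without the EXISTS guard."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER PRIMARY KEY, my_value INTEGER);
        CREATE TABLE table2 (tab1_id INTEGER);
        INSERT INTO table1 (id, my_value) VALUES (1, 10), (2, 20);
        INSERT INTO table2 (tab1_id) VALUES (1);  -- only id 1 has a match
    """)
    sql = """
        UPDATE table1
        SET my_value = (
            SELECT t2.tab1_id FROM table2 t2 WHERE t2.tab1_id = table1.id
        )
    """
    if with_exists_guard:
        sql += " WHERE EXISTS (SELECT * FROM table2 t2 WHERE t2.tab1_id = table1.id)"
    conn.execute(sql)
    rows = conn.execute("SELECT id, my_value FROM table1 ORDER BY id").fetchall()
    conn.close()
    return rows


unguarded = run_update(False)  # row 2 has no match: my_value becomes NULL
guarded = run_update(True)     # row 2 is left untouched

print(unguarded)
print(guarded)
```

The same difference applies when comparing a plain correlated UPDATE with a MERGE ... WHEN MATCHED: the MERGE only touches matching rows.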
More recent versions of Oracle and SQL Server both support the Standard SQL:2003 MERGE syntax, so you may be able to use something close to this:
MERGE INTO table1
USING (
SELECT t2.tab1_id
FROM table2 t2
) source
ON (table1.id = source.tab1_id
AND table1.id = 1234)
WHEN MATCHED THEN
UPDATE
SET my_value = source.tab1_id;
I just noticed your example is even simpler than I first thought and merely requires a simple subquery that should run on most SQL products e.g.
UPDATE table1
SET my_value = 'foo'
WHERE EXISTS (
SELECT *
FROM table2 t2
WHERE t2.tab1_id = table1.id
);
On Oracle, you can update only one table per statement, but you could consider using a trigger to update the second one.
I've added a field to a MySQL table. I need to populate the new column with the value from another table. Here is the query that I'd like to run:
UPDATE table1 t1
SET t1.user_id =
(
SELECT t2.user_id
FROM table2 t2
WHERE t2.usr_id = t1.usr_id
)
I ran that query locally on 239K rows and it took about 10 minutes. Before I do that on the live environment I wanted to ask if what I am doing looks ok i.e. does 10 minutes sound reasonable. Or should I do it another way, a php loop? a better query?
Use an UPDATE JOIN! This will provide you a native inner join to update from, rather than run the subquery for every bloody row. It tends to be much faster.
update table1 t1
inner join table2 t2 on
t1.usr_id = t2.usr_id
set t1.user_id = t2.user_id
Ensure that you have an index on each of the usr_id columns, too. That will speed things up quite a bit.
If you have some rows that don't match up, and you want to set t1.user_id = null, you will need to do a left join in lieu of an inner join. If the column is null already, and you're just looking to update it to the values in t2, use an inner join, since it's faster.
I should mention, for posterity, that this is MySQL syntax only. Other RDBMSs have different ways of doing an update join.
There are two rather important pieces of information missing:
What type of tables are they?
What indexes exist on them?
If table2 has an index that contains user_id and usr_id as the first two columns and table1 is indexed on user_id, it shouldn't be that bad.
You don't have an index on t2.usr_id.
Create this index and run your query again, or use the multiple-table UPDATE proposed by @Eric (with LEFT JOIN, of course).
Note that MySQL has no JOIN method other than NESTED LOOPS, so it's the index that matters, not the UPDATE syntax.
However, the multiple table UPDATE is more readable.
I got a query with five joins on some rather large tables (largest table is 10 mil. records), and I want to know if rows exists. So far I've done this to check if rows exists:
SELECT TOP 1 tbl.Id
FROM table tbl
INNER JOIN ... ON ... = ... (x5)
WHERE tbl.xxx = ...
Using this query, in a stored procedure takes 22 seconds and I would like it to be close to "instant". Is this even possible? What can I do to speed it up?
I got indexes on the fields that I'm joining on and the fields in the WHERE clause.
Any ideas?
Switch to the EXISTS predicate. In general I have found it to be faster than selecting TOP 1 and the like.
So you could write it like this: IF EXISTS (SELECT * FROM table tbl INNER JOIN table tbl2 .. do your stuff
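A minimal sketch of an existence check written with EXISTS, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (T-SQL's IF EXISTS (...) is replaced here by SELECT EXISTS (...), which SQLite supports). The orders and order_lines tables and the rows_exist helper are hypothetical. The engine is free to stop at the first joined row that qualifies instead of producing a result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for the joined tables in the question.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders (id, status) VALUES (1, 'open'), (2, 'closed');
    INSERT INTO order_lines (id, order_id) VALUES (10, 1);
""")


def rows_exist(status):
    """Return True if at least one joined row has the given status."""
    sql = """
        SELECT EXISTS (
            SELECT 1
            FROM orders o
            INNER JOIN order_lines l ON l.order_id = o.id
            WHERE o.status = ?
        )
    """
    return bool(conn.execute(sql, (status,)).fetchone()[0])


print(rows_exist("open"), rows_exist("closed"))
```

Whether this is actually faster than TOP 1 depends on the plan your database picks, so measure both forms.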
Depending on your RDBMS you can check what parts of the query are taking a long time and which indexes are being used (so you can know they're being used properly).
In MSSQL, you can see a diagram of the execution plan of any query you submit.
In Oracle and MySQL you can use the EXPLAIN keyword (EXPLAIN PLAN FOR in Oracle) to get details about how the query is executed.
But it might just be that 22 seconds is the best you can do with your query. We can't answer that, only the execution details provided by your RDBMS can. If you tell us which RDBMS you're using we can tell you how to find the information you need to see what the bottleneck is.
4 options
Try COUNT(*) in place of TOP 1 tbl.id
An index per column may not be good enough: you may need to use composite indexes
Are you on SQL Server 2005? If so, you can find missing indexes. Or try the Database Engine Tuning Advisor
Also, it's possible that you don't need 5 joins.
Assuming parent-child-grandchild etc, then grandchild rows can't exist without the parent rows (assuming you have foreign keys)
So your query could become
SELECT TOP 1
tbl.Id --or count(*)
FROM
grandchildtable tbl
INNER JOIN
anothertable ON ... = ...
WHERE
tbl.xxx = ...
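The reasoning behind dropping joins can be checked on a toy schema. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server, with hypothetical parent and child tables and LIMIT 1 in place of TOP 1. Because the foreign key guarantees every child row has a parent, the existence check gives the same answer with and without the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent (id),
        xxx TEXT
    );
    INSERT INTO parent (id) VALUES (1), (2);
    INSERT INTO child (id, parent_id, xxx) VALUES (10, 1, 'a'), (11, 2, 'b');
""")

with_join = """
    SELECT c.id FROM child c
    INNER JOIN parent p ON p.id = c.parent_id
    WHERE c.xxx = ? LIMIT 1
"""
without_join = "SELECT c.id FROM child c WHERE c.xxx = ? LIMIT 1"


def found(sql, value):
    """Return True if the query yields at least one row."""
    return conn.execute(sql, (value,)).fetchone() is not None


# Same answer either way, for present and absent values alike.
results = {v: (found(with_join, v), found(without_join, v)) for v in ("a", "b", "z")}

# The guarantee comes from the constraint: an orphan child row is rejected.
try:
    conn.execute("INSERT INTO child (id, parent_id, xxx) VALUES (12, 99, 'c')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(results, orphan_rejected)
```

This shortcut only holds when the join column is NOT NULL, the foreign key is actually enforced, and the dropped tables carry no filter conditions of their own.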
Try EXISTS.
This works either for all 5 tables or for the assumed hierarchy:
SELECT TOP 1 --or count(*)
tbl.Id
FROM
grandchildtable tbl
WHERE
tbl.xxx = ...
AND
EXISTS (SELECT *
FROM
anothertable T2
WHERE
tbl.key = T2.key /* AND T2 condition*/)
-- or
SELECT TOP 1 --or count(*)
tbl.Id
FROM
mytable tbl
WHERE
tbl.xxx = ...
AND
EXISTS (SELECT *
FROM
anothertable T2
WHERE
tbl.key = T2.key /* AND T2 condition*/)
AND
EXISTS (SELECT *
FROM
yetanothertable T3
WHERE
tbl.key = T3.key /* AND T3 condition*/)
Doing a filter early on your first select will help if you can do it; as you filter the data in the first instance all the joins will join on reduced data.
Select top 1 tbl.id
From
(
Select top 1 * from
table tbl1
Where Key = Key
) tbl1
inner join ...
After that you will likely need to provide more of the query to understand how it works.
Maybe you could offload/cache this fact-finding mission. Like if it doesn't need to be done dynamically or at runtime, just cache the result into a much smaller table and then query that. Also, make sure all the tables you're querying to have the appropriate clustered index. Granted you may be using these tables for other types of queries, but for the absolute fastest way to go, you can tune all your clustered indexes for this one query.
Edit: Yes, what other people said. Measure, measure, measure! Your query plan estimate can show you what your bottleneck is.
Put the table with the most rows first in every join. If the WHERE clause has more than one condition, the order of the conditions matters: put first the condition that returns the most rows.
Use filters very carefully when optimizing a query.