The answer to another SO question was to use this SQL query:
SELECT o.Id, o.attrib1, o.attrib2
FROM table1 o
JOIN (SELECT DISTINCT Id
      FROM table1, table2, table3
      WHERE ...) T1 ON o.Id = T1.Id
Now I wonder how I can use this statement together with the keyword FOR UPDATE. If I simply append it to the query, Oracle will tell me:
ORA-02014: cannot select FOR UPDATE from view
Do I have to modify the query or is there a trick to do this with Oracle?
With MySQL, the statement works fine.
Try:

SELECT o.Id, o.attrib1, o.attrib2
FROM table1 o
WHERE o.Id IN (SELECT DISTINCT Id
               FROM table1, table2, table3
               WHERE ...)
FOR UPDATE;
EDIT: that might seem a bit counter-intuitive bearing in mind the question you linked to (which asked how to dispense with an IN), but it may still provide a benefit if your join returns a restricted set. However, there is no direct workaround: the Oracle exception is pretty self-explanatory; Oracle doesn't know which rows to lock because of the DISTINCT. You could either leave out the DISTINCT, or define everything in a view and then update that, without the explicit lock: http://www.dba-oracle.com/t_ora_02014_cannot_select_for_update.htm
It may depend on what you want to update. You can do
... FOR UPDATE OF o.attrib1
to tell it you're only interested in updating data from the main table, which seems likely to be the case; this means it will only try to lock that table and not worry about the implicit view in the join. (You can still update multiple columns within that table; naming one column still locks the whole row, though it is clearer if you specify all the columns you want to update in the FOR UPDATE OF clause.)
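For illustration, here is a sketch of the full statement that approach suggests, applied to the query from the question (untested, and the column list in the OF clause is an assumption):

SELECT o.Id, o.attrib1, o.attrib2
FROM table1 o
JOIN (SELECT DISTINCT Id
      FROM table1, table2, table3
      WHERE ...) T1 ON o.Id = T1.Id
FOR UPDATE OF o.attrib1, o.attrib2; -- lock only rows of table1, not the inline view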
Don't know if that will work with MySQL though, which brings us back to Mark Byers' point.
On Snowflake, is there an alternative way to run a DELETE SQL statement with a CTE? It seems it is not possible.
with t as (
    select * from "SNOWFLAKE_SAMPLE_DATA"."TPCDS_SF100TCL"."CALL_CENTER"
), p as (
    select t.CC_REC_END_DATE, t.CC_CALL_CENTER_ID, t.CC_REC_START_DATE
    from t
    where 1=1 and t.CC_REC_START_DATE > '2000-01-01'
)
delete from p
For example, if we use a SELECT, we get some results, but if I use a DELETE, it shows a syntax error.
The problem with this thinking is that a SELECT returns some values that might or might not match individual records in your table. In principle, it might even return combinations of values from multiple tables, multiple rows, or no rows at all. What would you want DELETE to do then?
So, for DELETE, you need to specify which rows in the table you're operating on are deleted. You can do it with a simple WHERE clause, or with a USING clause.
If you want CTEs with DELETE, the closest would be to use USING, put a subquery there, and join its result with the table. In many cases that's a very useful approach, but it causes a join, which has a performance impact. Example:
delete from CALL_CENTER t
using (
    select cc_call_center_sk
    from CALL_CENTER
    where cc_rec_start_date > '2000-01-01'
) as sub
where sub.cc_call_center_sk = t.cc_call_center_sk;
But again, for your query it doesn't make much sense, and what @cddt wrote is probably your best bet.
The expected outcome from the question is not clear, so this is my best interpretation of it.
It seems like you want to delete records from the table SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CALL_CENTER where the field CC_REC_START_DATE is greater than 2000-01-01. If that's what you want to achieve, then you don't need a CTE at all.
DELETE FROM SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CALL_CENTER t
WHERE t.CC_REC_START_DATE > '2000-01-01'
So I'm trying to wrap my head around cursors. I have a task to transfer data from one database to another, but they have slightly different schemas. Let's say I have TableOne (Id, Name, Gold) and TableTwo (Id, Name, Lvl). I want to take all records from TableTwo and insert them into TableOne, but there can be duplicate data in the Name column. So if a record from TableTwo already exists in TableOne (comparing on the Name column), I want to skip it; if it doesn't, I want to create a record in TableOne with a unique Id.
I was thinking about looping over each record in TableTwo, and for every record checking whether it exists in TableOne. So, how do I make this check without making a call to the other database every time? I wanted to first select all records from TableOne, save them into a variable, and then check against this variable inside the loop. Is this even possible in SQL? I'm not so familiar with SQL, so a code sample would help a lot.
I'm using Microsoft SQL Server Management Studio if that matters. And of course, TableOne and TableTwo exist in different databases.
Try this:

INSERT INTO TableOne (Id, Name, Gold)
SELECT Id, Name, Lvl
FROM TableTwo
WHERE TableTwo.Name NOT IN (SELECT t1.Name FROM TableOne t1)
If you want to add a new Id for every row, you can try:

INSERT INTO TableOne (Id, Name, Gold)
SELECT (SELECT MAX(m.Id) FROM TableOne m) + ROW_NUMBER() OVER (ORDER BY t2.Id), t2.Name, t2.Lvl
FROM TableTwo t2
WHERE t2.Name NOT IN (SELECT t1.Name FROM TableOne t1)
It is possible, yes, but I would not recommend it. Looping (which is essentially what a cursor does) is usually not advisable in SQL when a set-based operation will do.
At a high level, you probably want to join the two tables together (the fact that they're in different databases shouldn't make a difference). You mention one table has duplicates. You can eliminate those in a number of ways, such as using a GROUP BY or a ROW_NUMBER. Both approaches will require you to understand which rows you want to "pick" and which ones you want to "ignore". You could also do what another user posted in a comment, where you do an existence check against the target table using a correlated subquery. That essentially means that if any rows already in the target table have duplicates you're trying to insert, none of those duplicates will be put in.
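For illustration, a hedged sketch of that set-based, correlated existence check in T-SQL (the database names DatabaseA and DatabaseB are assumptions; adjust to your environment):

-- insert only rows whose Name is not already present in the target
INSERT INTO DatabaseA.dbo.TableOne (Id, Name, Gold)
SELECT t2.Id, t2.Name, t2.Lvl
FROM DatabaseB.dbo.TableTwo t2
WHERE NOT EXISTS (SELECT 1
                  FROM DatabaseA.dbo.TableOne t1
                  WHERE t1.Name = t2.Name);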
As far as cursors are concerned, to do something like this you'd be doing essentially the same thing, except on each pass of the cursor you would be temporarily assigning and using variables instead of columns. This approach is sometimes called RBAR (for "Row By Agonizing Row"). On every pass of the cursor or loop, it has to re-open the table, figure out what data it needs, and then operate on it. Even if that's efficient and it's only pulling back one row, there's still a lot of overhead in that query. So while, yes, you can force SQL to do what you've described, the database engine already has an operation for this (joins) which does it far faster than any loop you could conceivably write.
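For contrast, a minimal sketch of what the cursor (RBAR) version would look like in T-SQL - the approach being advised against, using the same assumed database names as above:

DECLARE @Id INT, @Name NVARCHAR(100), @Lvl INT;

DECLARE two_cursor CURSOR FOR
    SELECT Id, Name, Lvl FROM DatabaseB.dbo.TableTwo;

OPEN two_cursor;
FETCH NEXT FROM two_cursor INTO @Id, @Name, @Lvl;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- one existence check and possible insert per row: lots of overhead
    IF NOT EXISTS (SELECT 1 FROM DatabaseA.dbo.TableOne WHERE Name = @Name)
        INSERT INTO DatabaseA.dbo.TableOne (Id, Name, Gold)
        VALUES (@Id, @Name, @Lvl);

    FETCH NEXT FROM two_cursor INTO @Id, @Name, @Lvl;
END

CLOSE two_cursor;
DEALLOCATE two_cursor;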
I'm working with some very old legacy code, and I've seen a few queries that are structured like this
SELECT
FieldA,
FieldB,
FieldC
FROM
(
SELECT * FROM TABLE1
)
LEFT JOIN TABLE2 ON...
Is there any advantage to writing a query this way?
This is in Oracle.
There would seem to be no advantage to using a subquery like this. The reason is probably a historical relic of how the code evolved.
Perhaps once upon a time, there was a more complicated query there. The query was replaced by a table/view, and the author simply left the original structure.
Similarly, once upon a time, perhaps a column needed to be calculated (say for the outer query or select). This column was then included in the table/view, but the structure remained.
I'm pretty sure that Oracle is smart enough to ignore the subquery when optimizing the query. Not all databases are that smart, but you might want to clean up the code anyway. At the very least, such a subquery looks awkward.
As a basic good practice in SQL, you should not code a full scan of a table (SELECT * FROM table without a WHERE clause) unless necessary, for performance reasons.
In this case, it's not necessary: The same result can be obtained by:
SELECT
    FieldA,
    FieldB,
    FieldC
FROM
    TABLE1
LEFT JOIN TABLE2 ON...
I have a multi-table SELECT query which compares column values with itself like below:
SELECT * FROM table1 t1, table2 t2
WHERE t1.col1 = t2.col1 -- Different tables, so OK.
AND t1.col1 = t1.col1   -- Same table??
AND t2.col1 = t2.col1   -- Same table??
This seems redundant to me.
My question is: will removing them have any impact on logic or performance?
Thanks in advance.
These conditions seem redundant; their only effect is to remove rows that have NULL values in those columns. Make sure the columns are NOT NULL before removing those clauses.
If the columns are nullable, you can safely replace these lines with the following (easier to read, easier to maintain):
AND t1.col1 IS NOT NULL
AND t2.col1 IS NOT NULL
Update following Jeffrey's comment
You're absolutely right; I don't know how I didn't see it myself. The join condition t1.col1=t2.col1 implies that only rows whose join columns are not null will be considered. The clauses tx.col1=tx.col1 are therefore completely redundant and can be safely removed.
Don't remove them until you understand the impact. If, as others are pointing out, they have no effect on the query and are probably optimised out, there's no harm in leaving them there but there may be harm in removing them.
Don't try to fix something that's working until you're damn sure you're not breaking something else.
The reason I mention this is because we inherited a legacy reporting application that had exactly this construct, along the lines of:
where id = id
And, being a sensible fellow, I ditched it, only to discover that the database engine wasn't the only thing using the query.
It first went through a pre-processor which extracted every column that existed in a where clause and ensured they were indexed. Basically an auto-tuning database.
Well, imagine our surprise on the next iteration when the database slowed to a fraction of its former speed when users were doing ad-hoc queries on the id field :-)
Turns out this was a kludge put in by the previous support team to ensure common ad-hoc queries were using indexed columns as well, even though none of our standard queries did so.
So, I'm not saying you can't do it, just suggesting that it might be a good idea to understand why it was put in first.
Yes, the conditions are obviously redundant, since each one compares a column with itself!
SELECT * FROM table1 t1,table2 t2
WHERE t1.col1=t2.col1
But you do need at least one of them. Otherwise, you'd have a cartesian join on your hands: each row from table1 will be joined to every row in table2. If table1 had 100 rows, and table2 had 1,000 rows, the resultant query would return 100,000 results.
SELECT * FROM table1 t1, table2 t2 -- warning!
I have two tables: a source table and a target table. The target table will have a subset of the columns of the source table. I need to update a single column in the target table by joining with the source table based on another column. The update statement is as follows:
UPDATE target_table tt
SET special_id = ( SELECT source_special_id
FROM source_table st
WHERE tt.another_id = st.another_id )
For some reason, this statement seems to run indefinitely. The inner select completes almost immediately when executed by itself. The table has roughly 50,000 records and it's hosted on a powerful machine (resources are not an issue).
Am I doing this correctly? Any reasons the above wouldn't work in a timely manner? Any better way to do this?
Your initial query executes the inner subquery once for every row in the outer table. See if Oracle likes this better:
UPDATE tt
SET special_id = st.source_special_id
FROM target_table tt
INNER JOIN source_table st
    ON tt.another_id = st.another_id
(edited after posted query was corrected)
Add:
If the join syntax doesn't work on Oracle, how about:
UPDATE tt
SET special_id = st.source_special_id
FROM target_table tt, source_table st
WHERE tt.another_id = st.another_id
The point is to join the two tables rather than using the outer query syntax you are currently using.
Is there an index on source_table(another_id)? If not, source_table will be fully scanned once for each row in target_table, which will take some time if target_table is big.
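If that index is missing, a one-line fix along these lines may help (the index name is an assumption):

CREATE INDEX ix_source_another_id ON source_table (another_id);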
Is it possible for there to be no match in source_table for some target_table rows? If so, your update will set special_id to null for those rows. If you want to avoid that do this:
UPDATE target_table tt
SET special_id = ( SELECT source_special_id
FROM source_table st
WHERE tt.another_id = st.another_id )
WHERE EXISTS( SELECT NULL
FROM source_table st
WHERE tt.another_id = st.another_id );
If target_table.another_id was declared as a foreign key referencing source_table.another_id (unlikely in this scenario), this would work:
UPDATE
( SELECT tt.primary_key, tt.special_id, st.source_special_id
FROM target_table tt
JOIN source_table st ON st.another_id = tt.another_id
)
SET special_id = source_special_id;
Are you actually sure that it's running?
Have you looked for blocking locks? Indefinitely is a long time, and that's usually only achieved by something stalling execution.
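A quick way to look for blockers in Oracle (assuming you have access to v$session; the blocking_session column exists in 10g and later):

SELECT sid, serial#, blocking_session
FROM v$session
WHERE blocking_session IS NOT NULL;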
Update: OK, now that the query has been fixed -- I've done this exact thing many times, on unindexed tables well over 50K rows, and it worked fine in Oracle 10g and 9i. So something else is going on here; yes, you are calling for nested loops, but no, it shouldn't run forever even so. What are the primary keys on these tables? Do you by any chance have multiple rows from the second table matching the same row in the first table? You could be trying to rewrite the whole table over and over, throwing the locking system into a fit.
Original Response
That statement doesn't really make sense -- you are telling it to update all the rows where ids match, to the same id (meaning, no change happens!).
I imagine the real statement looks a bit different?
Please also provide table schema information (primary keys for the 2 tables, any available indexes, etc).
Not sure what Oracle has available, but MS SQL Server has a tuning advisor that you can feed your queries into, and it will give recommendations for adding indexes, etc. I would assume Oracle has something similar.
That would be the quickest way to pinpoint the issue.
I don't know Oracle, but MS SQL Server's optimizer would have no problem converting the subquery into a join for you.
It sounds like you might be doing a data import against a short-lived or newly created table. It is easy to overlook indexing these kinds of tables. I'd make sure there's an index on sourcetable.anotherid - or a covering index on sourcetable.anotherid, sourcetable.specialid (order matters: anotherid should come first).
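A hedged sketch of that composite index (the index name is an assumption; identifiers follow this answer's spelling):

-- anotherid leads so the join lookup can use it; specialid is covered
CREATE INDEX ix_sourcetable_cover ON sourcetable (anotherid, specialid);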
In cases such as these (a query running unexpectedly for longer than a second), it is best to figure out how your environment provides query plans. Examine that plan and the problem will become clear.
You see, there is no such thing as "optimized SQL code". SQL code is never executed directly - query plans are generated from the code, and those plans are what get executed.
Check that the statistics are up to date on the tables - see this question
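For example, in Oracle you can refresh a table's statistics with DBMS_STATS (the schema and table names here are placeholders):

EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'MY_SCHEMA', tabname => 'TARGET_TABLE');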
I had the same problem, and got a "SQL command not properly ended" when trying Codewerks' answer in Oracle 11g. A bit of Googling turned up the Oracle MERGE statement, which I adapted as follows:
MERGE INTO target_table tt
USING source_table st
ON (tt.another_id = st.another_id)
WHEN MATCHED THEN
UPDATE SET tt.special_id = st.source_special_id;
If you're not sure all the values of another_id will be in source_table, then you can use the WHEN NOT MATCHED THEN clause to handle that case.
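For illustration, a sketch with that clause added (note that WHEN NOT MATCHED fires for source rows that have no matching target row; the inserted column list is an assumption based on the question's schema):

MERGE INTO target_table tt
USING source_table st
ON (tt.another_id = st.another_id)
WHEN MATCHED THEN
    UPDATE SET tt.special_id = st.source_special_id
WHEN NOT MATCHED THEN
    INSERT (tt.another_id, tt.special_id)
    VALUES (st.another_id, st.source_special_id);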
If you have constraints that let Oracle know there is a 1-1 relationship when you join on "another_id", I think this might work well:
UPDATE ( SELECT tt.rowid tt_rowid, tt.another_id, tt.special_id, st.source_special_id
         FROM target_table tt, source_table st
         WHERE tt.another_id = st.another_id
         ORDER BY tt.rowid )
SET special_id = source_special_id
Ordering by ROWID is important when you are updating an updatable view with a join.