Alternatives to UPDATE statement Oracle 11g - sql

I'm currently using Oracle 11g and let's say I have a table with the following columns (more or less)
Table1
ID varchar(64)
Status int(1)
Transaction_date date
tons of other columns
And this table has about 1 Billion rows. I would want to update the status column with a specific where clause, let's say
where transaction_date = somedatehere
What other alternatives can I use rather than just the normal UPDATE statement?
Currently what I'm trying to do is using CTAS or Insert into select to get the rows that I want to update and put on another table while using AS COLUMN_NAME so the values are already updated on the new/temporary table, which looks something like this:
INSERT INTO TABLE1_TEMPORARY (
ID,
STATUS,
TRANSACTION_DATE,
TONS_OF_OTHER_COLUMNS)
SELECT
ID
3 AS STATUS,
TRANSACTION_DATE,
TONS_OF_OTHER_COLUMNS
FROM TABLE1
WHERE
TRANSACTION_DATE = SOMEDATE
So far everything seems to work faster than the normal update statement. The problem now is I would want to get the remaining data from the original table which I do not need to update but I do need to be included on my updated table/list.
What I tried to do at first was use DELETE on the same original table using the same where clause so that in theory, everything that should be left on that table should be all the data that i do not need to update, leaving me now with the two tables:
TABLE1 --which now contains the rows that i did not need to update
TABLE1_TEMPORARY --which contains the data I updated
But the delete statement in itself is also too slow or as slow as the orginal UPDATE statement so without the delete statement brings me to this point.
TABLE1 --which contains BOTH the data that I want to update and do not want to update
TABLE1_TEMPORARY --which contains the data I updated
What other alternatives can I use in order to get the data that's the opposite of my WHERE clause (take note that the where clause in this example has been simplified so I'm not looking for an answer of NOT EXISTS/NOT IN/NOT EQUALS plus those clauses are slower too compared to positive clauses)
I have ruled out deletion by partition since the data I need to update and not update can exist in different partitions, as well as TRUNCATE since I'm not updating all of the data, just part of it.
Is there some kind of JOIN statement I use with my TABLE1 and TABLE1_TEMPORARY in order to filter out the data that does not need to be updated?
I would also like to achieve this using as less REDO/UNDO/LOGGING as possible.
Thanks in advance.

I'm assuming this is not a one-time operation, but you are trying to design for a repeatable procedure.
Partition/subpartition the table in a way so the rows touched are not totally spread over all partitions but confined to a few partitions.
Ensure your transactions wouldn't use these partitions for now.
Per each partition/subpartition you would normally UPDATE, perform CTAS of all the rows (I mean even the rows which stay the same go to TABLE1_TEMPORARY). Then EXCHANGE PARTITION and rebuild index partitions.
At the end rebuild global indexes.
If you don't have Oracle Enterprise Edition, you would need to either CTAS entire billion of rows (followed by ALTER TABLE RENAME instead of ALTER TABLE EXCHANGE PARTITION) or to prepare some kind of "poor man's partitioning" using a view (SELECT UNION ALL SELECT UNION ALL SELECT etc) and a bunch of tables.
There is some chance that this mess would actually be faster than UPDATE.
I'm not saying that this is elegant or optimal, I'm saying that this is the canonical way of speeding up large UPDATE operations in Oracle.

How about keeping in the UPDATE in the same table, but breaking it into multiple small chunks?
UPDATE .. WHERE transaction_date = somedatehere AND id BETWEEN 0000000 and 0999999
COMMIT
UPDATE .. WHERE transaction_date = somedatehere AND id BETWEEN 1000000 and 1999999
COMMIT
UPDATE .. WHERE transaction_date = somedatehere AND id BETWEEN 2000000 and 2999999
COMMIT
This could help if the total workload is potentially manageable, but doing it all in one chunk is the problem. This approach breaks it into modest-sized pieces.
Doing it this way could, for example, enable other apps to keep running & give other workloads a look in; and would avoid needing a single humungous transaction in the logfile.

Related

Oracle efficient way of updating non-indexed and non-partioned table?

Is there an efficient way to update rows of a table that has no indexes and no partitions (and ~50millions rows)?
I have a date field LOAD_DTTM and values of this field for rows that require update (around 2000 distinct dates).
WIll update be faster if i specify a date in a WHERE clause along with the UNIQUE_ID of a row?
If you want to update all, or a large number, of the rows then the quickest way is:
create table my_table_copy as
select ... -- all the columns, updating values as required
from my_table;
drop table my_table;
rename my_table_copy to my_table;
If your table had any indexes, constraints or triggers you would now need to re-add them - but it seems you don't have that issue!
You could create a PL/SQL procedure that loops and update and commit the table every n row count -- Say every 20.000 rows. I do not advise to update the full table as it will create a lock for a looong time and expose you to data loss in case of external factors.
The answer is NO.
Even if you specify both conditions in your WHERE clause as you stated, it won't help you to avoid a full scan of your table.
Even if one of your criteria will uniquely identify the row, it still won't help.
There is a real example tested on Oracle 12C ver.2 similar to your case. No indexes, no partitions, nothing. Just plain table with 4 columns
I have a table with 18mn records.
I also have CUSTOMER_ID which is a UNIQUE identifier for a row.
I also have ORDER_DATE column there.
Even if I do the query that you mentioned
update hit set status = 1 where customer_id = 408518625844 and order_date = '09-DEC-19';
it won't help me to avoid a full table scan. See below Execution Plan. Therefore under conditions, you've specified, you will be always getting the slowest execution time possible. Full Table Scan on 50mn rows is actually the worst-case scenario.
And pay attention to that Cost, it is 26539 on 18mn rows.
So if you have 50mn rows we can easily expect much more Cost for your query

updating a very large oracle table

I have a very large table people with 60M rows indexed on id, wish to populate a field newid for every record based on a look up table id_conversion (1M rows) which contains id and newid, indexed on id.
when I run
update people p set p.newid=(select l.newid from id_conversion l where l.id=p.id)
it runs for an hour or so and then I get an archive error ora 00257.
Any suggestions for either running update in sections or better sql command?
To avoid writing to Oracle's undo log if your update statement hits every single row of the table then you are likely better off running a create table as select query which will bypass all undo logs, which is likely the issue you're running into as it is logging the impact across 60 million rows. You can then drop the old table and rename the new table to that of the old table's name.
Something like:
create table new_people as
select l.newid,
p.col2,
p.col3,
p.col4,
p.col5
from people p
join id_conversion l
on p.id = l.id;
drop table people;
-- rebuild any constraints and indexes
-- from old people table to new people table
alter table new_people rename to people;
For reference, read some of the tips here: http://www.dba-oracle.com/t_efficient_update_sql_dml_tips.htm
If you are basically creating a new table and not just updating some of the rows of a table it will likely prove the faster method.
I doubt you will be able to get this to run in seconds. Your query, as written, needs to update all 60 million rows.
My first advice is to add an index on id_conversion(id, newid), to make the subquery more efficient. If that doesn't help, then doing the update in batches might be the best way to go.
I should add. Because you are updating all the rows, it might be faster to take the following approach:
Copy the data into a new table with the new values.
Truncate the original table.
Insert the new data into the old table.
Inserts are faster than updates.
In addition to the answers above, which probably will work better in this case, you should know the MERGE statement
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_9016.htm
that is used for updating one table according to another table and is far faster then update according to a select statement

Updating Table Records in a Batch and Auditing it

Consider this Table:
Table: ORDER
Columns: id, order_num, order_date, order_status
This table has 1 million records. I want to update the order_status to value of '5', for a bunch (about 10,000) of order_num's that i will be reading from a input text file.
My SQL could be:
(A) update ORDER set order_status=5 where order_num in ('34343', '34454', '454545',...)
OR
(B) update ORDER set order_status=5 where order_num='34343'
I can loop over this update several times until I have covered my 10,000 order updates.
(Also note that i have few Child Tables of ORDER like ORDER_ITEMS, where similar status must be updated and information audited)
My problem is here is:
How can i Audit this update in a separate ORDER_AUDIT Table:
Order_Num: 34343 - Updated Successfully
Order_Num: 34454 - Order Not Found
Order_Num: 454545 - Updated Successfully
Order_Num: 45457 - Order Not Found
If i go for batch update as in (A), I cannot Audit at Order Level.
If i go for Single Order at at time update as in (B), I will have to loop 10,000 times - that may be quite slow - but I can Audit at Order level in this case.
Is there any other way?
First of all, build an external table over your "input text file". That way you can run a simple single UPDATE statement:
update ORDER
set order_status=5
where order_num in ( select col1 from ext_table order by col1)
Neat and efficient. (Sorting the sub-query is optional: it may improve the performance of the update but the key point is, we can treat external tables like regular tables and use the full panoply of the SELECT syntax on them.) Find out more.
Secondly use the RETURNING clause to capture the hits.
update ORDER
set order_status=5
where order_num in ( select col1 from ext_table order by col1)
returning order_num bulk collect into l_nums;
l_nums in this context is a PL/SQL collection of type number. The RETURNING clause will give you all the ORDER_NUM values for updated rows only. Find out more.
If you declare the type for l_nums as a SQL nested table object you can use it in further SQL statements for your auditing:
insert into order_audit
select 'Order_Num: '||to_char(t.column_value)||' - Updated Succesfully'
from table ( l_nums ) t
/
insert into order_audit
select 'Order_Num: '||to_char(col1)||' - Order Not Found'
from ext_table
minus
select * from table ( l_nums )
/
Notes on performance:
You don't say how many of the rows you have in the input text file will match. Perhaps you don't know (actually on re-reading it's not clear whether 10,000 is the number of rows in the file or the number of matching rows). Pl/SQL collections use private session memory, so very large collections can blow the PGA. However, you should be able to cope with ten thousand NUMBER instances without blinching.
My solution does require you to read the external table twice. This shouldn't be a problem. And it will certainly be way faster than dynamically assembling one hundred IN clauses of a thousand numbers and looping over each.
Note that update is often the slowest bulk operation known to man. There are ways of speeding them up, but those methods can get quite involved. However, if this is something you'll want to do often and performance becomes a sticking point you should read this OraFAQ article.
Use MERGE. Firstly load data into a temporary table called ORDER_UPD_TMP with only one column id. You can do it using SQLDeveloper import feature. Then use MERGE in order to udpate your base table:
MERGE INTO ORDER b
USING (
SELECT order_id
FROM ORDER_UPD_TMP
) e
ON (b.id = e.id)
WHEN MATCHED THEN
UPDATE SET b.status = 5
You can also update with a different status when records don't match. Check the documentation for more details:
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_9016.htm
I think the best way will be:
to import your file to the database first
then do few SQL UPDATE/INSERT queries in one transaction to update status for all orders and create audit records.

minus vs delete where exist in oracle

I have a CREATE TABLE query which can be done using two methods (create as select statement for thousands/million records):
First method:
create table as select some data minus (select data from other table)
OR
first i should create the table as
create table as select .....
and then
delete from ..where exist.
I guess the second method is better.For which query the cost is less?Why is minus query not as fast as the second method?
EDIT:
I forgot to mention that the create statement has join from two tables as well.
The minus is slow probably because it needs to sort the tables on disk in order to compare them.
Try to rewrite the first query with NOT EXISTS instead of MINUS, it should be faster and will generate less REDO and UNDO (as a_horse_with_no_name mentioned). Of course, make sure that all the fields involved in the WHERE clauses are indexed!
The second one will write lots of records to disk and then remove them. This will in 9 of 10 cases take way longer then filtering what you write in to begin with.
So if the first one actually isn't faster we need more information about the tables and statements involved.

SQL - renumbering a sequential column to be sequential again after deletion

I've researched and realize I have a unique situation.
First off, I am not allowed to post images yet to the board since I'm a new user, so see appropriate links below
I have multiple tables where a column (not always the identifier column) is sequentially numbered and shouldn't have any breaks in the numbering. My goal is to make sure this stays true.
Down and Dirty
We have an 'Event' table where we randomly select a percentage of the rows and insert the rows into table 'Results'. The "ID" column from the 'Results' is passed to a bunch of delete queries.
This more or less ensures that there are missing rows in several tables.
My problem:
Figuring out an sql query that will renumber the column I specify. I prefer to not drop the column.
Example delete query:
delete ItemVoid
from ItemTicket
join ItemVoid
on ItemTicket.item_ticket_id = itemvoid.item_ticket_id
where itemticket.ID in (select ID
from results)
Example Tables Before:
Example Tables After:
As you can see 2 rows were delete from both tables based on the ID column. So now I gotta figure out how to renumber the item_ticket_id and the item_void_id columns where the the higher number decreases to the missing value, and the next highest one decreases, etc. Problem #2, if the item_ticket_id changes in order to be sequential in ItemTickets, then
it has to update that change in ItemVoid's item_ticket_id.
I appreciate any advice you can give on this.
(answering an old question as it's the first search result when I was looking this up)
(MS T-SQL)
To resequence an ID column (not an Identity one) that has gaps,
can be performed using only a simple CTE with a row_number() to generate a new sequence.
The UPDATE works via the CTE 'virtual table' without any extra problems, actually updating the underlying original table.
Don't worry about the ID fields clashing during the update, if you wonder what happens when ID's are set that already exist, it
doesn't suffer that problem - the original sequence is changed to the new sequence in one go.
WITH NewSequence AS
(
SELECT
ID,
ROW_NUMBER() OVER (ORDER BY ID) as ID_New
FROM YourTable
)
UPDATE NewSequence SET ID = ID_New;
Since you are looking for advice on this, my advice is you need to redesign this as I see a big flaw in your design.
Instead of deleting the records and then going through the hassle of renumbering the remaining records, use a bit flag that will mark the records as Inactive. Then when you are querying the records, just include a WHERE clause to only include the records are that active:
SELECT *
FROM yourTable
WHERE Inactive = 0
Then you never have to worry about re-numbering the records. This also gives you the ability to go back and see the records that would have been deleted and you do not lose the history.
If you really want to delete the records and renumber them then you can perform this task the following way:
create a new table
Insert your original data into your new table using the new numbers
drop your old table
rename your new table with the corrected numbers
As you can see there would be a lot of steps involved in re-numbering the records. You are creating much more work this way when you could just perform an UPDATE of the bit flag.
You would change your DELETE query to something similar to this:
UPDATE ItemVoid
SET InActive = 1
FROM ItemVoid
JOIN ItemTicket
on ItemVoid.item_ticket_id = ItemTicket.item_ticket_id
WHERE ItemTicket.ID IN (select ID from results)
The bit flag is much easier and that would be the method that I would recommend.
The function that you are looking for is a window function. In standard SQL (SQL Server, MySQL), the function is row_number(). You use it as follows:
select row_number() over (partition by <col>)
from <table>
In order to use this in your case, you would delete the rows from the table, then use a with statement to recalculate the row numbers, and then assign them using an update. For transactional integrity, you might wrap the delete and update into a single transaction.
Oracle supports similar functionality, but the syntax is a bit different. Oracle calls these functions analytic functions and they support a richer set of operations on them.
I would strongly caution you from using cursors, since these have lousy performance. Of course, this will not work on an identity column, since such a column cannot be modified.