Bulk delete (truncate vs delete)

Bulk delete (truncate vs delete) - sql

We have a table with a 150+ million records. We need to clear/delete all rows. Delete operation would take forever due to it writing to the t-logs and we cannot change our recovery model for the whole DB. We have tested the truncate table option.
What we realized that truncate deallocates pages from the table, and if I am not wrong makes them available for reuse but doesn't shrink the db automatically. So, if we want to reduce the DB size, we really would need to do run the shrink db command after truncating the table.
Is this normal procedure? Anything we need to be careful or aware about, or are there any better alternatives?

truncate is what you're looking for. If you need to slim down the db afterwards, run a shrink.
This MSDN refernce (if you're talking T-SQL) compares the behind the scenes of deleting rows versus truncating.

"Delete all rows"... wouldn't DROP TABLE (and re-recreate an empty one with same schema / indices) be preferable ? (I personally like "fresh starts" ;-) )
This said TRUNCATE TABLE is quite OK too, and yes, DBCC SHRINKFILE may be required afterwards if you wish to recover the space.

Depending on the size of the full database, the shrink may take a while; I've found it to go faster if it is shrunk in smaller chunks, rather than trying to get it back all at once.

One thing to remember with Truncate Table (as well as drop table) is going forward this will not work if you ever have foreign keys referencing the table.

As pointed out, if you can't use truncate or drop
SELECT 1
WHILE ##ROWCOUNT <> 0
DELETE TOP (100000) MyTable

You have a normal solution (truncate + shrink db) to remove all the records from a table.
As Irwin pointed out. The TRUNCATE command won't work while being referenced by a Foreign key constraint. So first drop the constraints, truncate the table and recreate the constraints.
If your concerned about performance and this is a regular routine for your system. You might want to look into moving this table to it's own data file, then run shrink only against the target datafile!

Related

Truncation of large table in SQL Server database

I would like to completely clear one table in my SQL Server database.
Unfortunately, the table is large (> 90GB). I am going to use the TRUNCATE statement.
The question is whether I should pay attention to something before?
I am also wondering if it will somehow affect the server's disk space (currently about 110 GB free)?
After all the action, SHRINK DATABASE will probably be necessary.

TRUNCATE TABLE is faster and uses fewer system and transaction log resources
than DELETE with no WHERE clause,
but if you need even faster solution, you can create new version of the table (table1), drop the old table, and rename table1 into table.
R

Oracle performance - disabling FKs to make DELETE statement work faster?

When deleting from a large table in Oracle - let's call it table X - does it make sense to disable table X's FKs that do not have ON DELETE CASCADE? I'm not referring to disabling FKs on other tables that link to table X, but just disabling FKs on table X to improve the performance of the DELETE statements.
I'm making the indexes on table X unusable, but the DELETE still takes a while.
I think that those FKs don't matter to the performance of the DELETE statement since we're just deleting, and not inserting or updating, so the FKs don't need to be checked. What do you think?

That seems like a really bad idea. No matter what you do, you'll have a period where referential integrity is not enforced on your database. Then you go to put the FKs back in place and, oops, someone has inserted an invalid row.
Furthermore, ALTER TABLE is a DDL statement, so executing it will commit any work up to that point. You'll lose the ability to rollback if something goes wrong elsewhere in your transaction.
Can you look through the explain plan to see why your DELETE statement is taking so long?

Eventually I didn't have to disable those FKs before the archive process starts and enable them when the process ends. But instead, in order to improve performance of the DELETE statements, we had to drop indexes before the archive process starts and recreate them after the archive process finishes. We also committed more often.

Why would you Truncate immediately before Dropping a temp table?

I see some code where the author has truncated a temp table immediately before dropping the temp table. Is there a reason for doing this?
TRUNCATE TABLE #Temp
DROP TABLE #Temp

Another reason is that a DROP TABLE is a fully logged operation, so by truncating first (which is never logged) you lower transactional logging overhead.

Possibly the author was under the impression that DROP TABLE would be quicker if the table was already empty and knew that TRUNCATE would be quicker than DELETE.

On very large temp tables, it's sometimes faster to truncate first then drop because truncate simply moves a pointer. It's normally not needed since temp tables drop on their own.

Could possibly be a knucklehead. (not that I'm perfect).

Another possibility is that the coder is trying to avoid a delayed drop that will occur when temp table is larger than 8MB. I.e. truncate it then drop it. I'm not sure if the SQL engine will be fooled by this but it might force synchronous clean up. I can't see why you'd want to do this, maybe to avoid some problem with accumulating delayed drops (Temp tables for destruction)

Possibly to see if the TRUNCATE command throws an exception due to existing foreign keys?

deleting a large number of rows from a table

We have a requirement to delete rows in the order of millions from multiple tables as a batch job (note that we are not deleting all the rows, we are deleting based on a timestamp stored in an indexed column). Obviously a normal DELETE takes forever (because of logging, referential constraint checking etc.). I know in the LUW world, we have ALTER TABLE NOT LOGGED INITIALLY but I can't seem to find the an equivalent SQL statement for DB2 v8 z/OS. Any one has any ideas on how to do this really fast? Also, any ideas on how to avoid the referential checks when deleting the rows? Please let me know.

In the past I have solved this kind of problem by exporting the data and re-loading it with a replace style command. For example:
EXPORT to myfile.ixf OF ixf
SELECT *
FROM my_table
WHERE last_modified < CURRENT TIMESTAMP - 30 DAYS;
Then you can LOAD it back in, replacing the old stuff.
LOAD FROM myfile.ixf OF ixf
REPLACE INTO my_table
NONRECOVERABLE INDEXING MODE INCREMENTAL;
I'm not sure whether this will be faster or not for you (probably it depends on whether you're deleting more than you're keeping).

Do the foreign keys already have indexes as well?
How do you have your delete action set? CASCADE, NULL, NO ACTION
Use SET INTEGRITY to temporarily disable constraints on the batch process.
http://www.ibm.com/developerworks/data/library/techarticle/dm-0401melnyk/index.html
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r

We modified the tablespace so the lock would occur at the tablespace level instead of at the page level. Once we changed that DB2 only required one lock to do the DELETE and we didn't have any issues with locking. As for the logging, we just asked the customer to be aware of the amount of logging required (as there did not seem to be a solution to get around the logging issue). As for the constraints, we just dropped and recreated them after the delete.
Thanks all for your help.

Why 'delete from table' takes a long time when 'truncate table' takes 0 time?

(I've tried this in MySql)
I believe they're semantically equivalent. Why not identify this trivial case and speed it up?

truncate table cannot be rolled back, it is like dropping and recreating the table.

...just to add some detail.
Calling the DELETE statement tells the database engine to generate a transaction log of all the records deleted. In the event the delete was done in error, you can restore your records.
Calling the TRUNCATE statement is a blanket "all or nothing" that removes all the records with no transaction log to restore from. It is definitely faster, but should only be done when you're sure you don't need any of the records you're going to remove.

Delete from table deletes each row from the one at a time and adds a record into the transaction log so that the operation can be rolled back. The time taken to delete is also proportional to the number of indexes on the table, and if there are any foreign key constraints (for innodb).
Truncate effectively drops the table and recreates it and can not be performed within a transaction. It therefore required fewer operations and executes quickly. Truncate also does not make use of any on delete triggers.
Exact details about why this is quicker in MySql can be found in the MySql documentation:
http://dev.mysql.com/doc/refman/5.0/en/truncate-table.html

Your question was about MySQL and I know little to nothing about MySQL as a product but I thought I'd add that in SQL Server a TRUNCATE statement can be rolled back. Try it for yourself
create table test1 (col1 int)
go
insert test1 values(3)
begin tran
truncate table test1
select * from test1
rollback tran
select * from test1
In SQL Server TRUNCATE is logged, it's just not logged in such a verbose way as DELETE is logged. I believe it's referred to as a minimally logged operation. Effectively the data pages still contain the data but their extents have been marked for deletion. As long as the data pages still exist you can roll back the truncate. Hope this is helpful. I'd be interested to know the results if somebody tries it on MySQL.

For MySql 5 using InnoDb as the storage engine, TRUNCATE acts just like DELETE without a WHERE clause: i.e. for large tables it takes ages because it deletes rows one-by-one. This is changing in version 6.x.
see
http://dev.mysql.com/doc/refman/5.1/en/truncate-table.html
for 5.1 info (row-by-row with InnoDB) and
http://blogs.mysql.com/peterg/category/personal-opinion/
for changes in 6.x
Editor's note
This answer is clearly contradicted by the MySQL documentation:
"For an InnoDB table before version 5.0.3, InnoDB processes TRUNCATE TABLE by deleting rows one by one. As of MySQL 5.0.3, row by row deletion is used only if there are any FOREIGN KEY constraints that reference the table. If there are no FOREIGN KEY constraints, InnoDB performs fast truncation by dropping the original table and creating an empty one with the same definition, which is much faster than deleting rows one by one."

Truncate is on a table level, while Delete is on a row level. If you would translate this to sql in an other syntax, truncate would be:
DELETE * FROM table
thus deleting all rows at once, while DELETE statement (in PHPMyAdmin) goes like:
DELETE * FROM table WHERE id = 1
DELETE * FROM table WHERE id = 2
Just until the table is empty. Each query taking a number of (milli)seconds which add up to taking longer than a truncate.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Bulk delete (truncate vs delete) - sql

truncate is what you're looking for. If you need to slim down the db afterwards, run a shrink. This MSDN refernce (if you're talking T-SQL) compares the behind the scenes of deleting rows versus truncating.

"Delete all rows"... wouldn't DROP TABLE (and re-recreate an empty one with same schema / indices) be preferable ? (I personally like "fresh starts" ;-) ) This said TRUNCATE TABLE is quite OK too, and yes, DBCC SHRINKFILE may be required afterwards if you wish to recover the space.

Depending on the size of the full database, the shrink may take a while; I've found it to go faster if it is shrunk in smaller chunks, rather than trying to get it back all at once.

One thing to remember with Truncate Table (as well as drop table) is going forward this will not work if you ever have foreign keys referencing the table.

As pointed out, if you can't use truncate or drop SELECT 1 WHILE ##ROWCOUNT <> 0 DELETE TOP (100000) MyTable

Related

Truncation of large table in SQL Server database

Oracle performance - disabling FKs to make DELETE statement work faster?

Why would you Truncate immediately before Dropping a temp table?

deleting a large number of rows from a table

Why 'delete from table' takes a long time when 'truncate table' takes 0 time?

Categories

Resources