SQL Batched Delete

I have a table in SQL Server 2005 which has approx 4 billion rows in it. I need to delete approximately 2 billion of these rows. If I try and do it in a single transaction, the transaction log fills up and it fails. I don't have any extra space to make the transaction log bigger. I assume the best way forward is to batch up the delete statements (in batches of ~ 10,000?).
I can probably do this using a cursor, but is there a standard/easy/clever way of doing this?
P.S. This table does not have an identity column as a PK. The PK is made up of an integer foreign key and a date.

You can 'nibble' the deletes, which also means that you don't cause a massive load on the database. If your t-log backups run every 10 minutes, then you should be OK to run this once or twice over the same interval. You can schedule it as a SQL Agent job.
Try something like this:
DECLARE @count int
SET @count = 10000

DELETE FROM table1
WHERE table1id IN (
    SELECT TOP (@count) table1id
    FROM table1
    WHERE x = 'y'
)

What distinguishes the rows you want to delete from those you want to keep? Will this work for you:
while exists (select 1 from your_table where <your_condition>)
delete top(10000) from your_table
where <your_condition>

In addition to putting this in a batch with a statement to truncate the log, you also might want to try these tricks:
Add criteria that matches the first column in your clustered index in addition to your other criteria
Drop any indexes from the table and then put them back after the delete is done if that's possible and won't interfere with anything else going on in the DB, but KEEP the clustered index
For the first point above, for example, if your PK is clustered, then find a range which approximately matches the number of rows that you want to delete in each batch and use that:
DECLARE @max_id INT, @start_id INT, @end_id INT, @interval INT
SELECT @start_id = MIN(id), @max_id = MAX(id) FROM My_Table
SET @interval = 100000 -- You need to determine the right number here
SET @end_id = @start_id + @interval

WHILE (@start_id <= @max_id)
BEGIN
    DELETE FROM My_Table WHERE id BETWEEN @start_id AND @end_id AND <your criteria>
    SET @start_id = @end_id + 1
    SET @end_id = @end_id + @interval
END
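If the log still fills up between batches, one option specific to SQL Server 2005 (TRUNCATE_ONLY was removed in later versions) is to truncate the log after each batch. This is only a sketch reusing the variables above; MyDB and the MyDB_log file name are placeholders:
-- Sketch only: TRUNCATE_ONLY breaks the log backup chain,
-- so take a full backup once the purge is finished.
WHILE (@start_id <= @max_id)
BEGIN
    DELETE FROM My_Table WHERE id BETWEEN @start_id AND @end_id AND <your criteria>

    BACKUP LOG MyDB WITH TRUNCATE_ONLY   -- discard the inactive portion of the log
    DBCC SHRINKFILE (MyDB_log, 1024)     -- optional: shrink the log file back down (target size in MB)

    SET @start_id = @end_id + 1
    SET @end_id = @end_id + @interval
END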

Sounds like this is a one-off operation (I hope so, for your sake) and you don't need to go back to a state that's halfway through this batched delete - if that's the case, why don't you just switch to the SIMPLE recovery model before running it, and then back to FULL when you're done?
This way the transaction log won't grow as much. This might not be ideal in most situations, but I don't see anything wrong here (assuming, as above, that you don't need to go back to a state that's in between your deletes).
You can do this in your script with something like:
ALTER DATABASE myDB SET RECOVERY FULL/SIMPLE
Alternatively, you can set up a job to shrink the transaction log at a given interval - while your delete is running. This is kinda bad, but I reckon it'd do the trick.
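A sketch of that wrapper, with myDB as the placeholder database name; note that after switching back to FULL you need a full (or differential) backup before log backups can resume:
-- Sketch only: run the purge under the SIMPLE recovery model.
ALTER DATABASE myDB SET RECOVERY SIMPLE;

-- ... run the batched delete here ...

ALTER DATABASE myDB SET RECOVERY FULL;

-- Re-establish the log backup chain.
BACKUP DATABASE myDB TO DISK = 'D:\Backups\myDB.bak';   -- path is illustrative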

Well, if you were using SQL Server Partitioning, say based on the date column, you would have possibly switched out the partitions that are no longer required. A consideration for a future implementation perhaps.
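For reference, switching out a partition looks roughly like this; the partition number and the archive table name are made up for illustration, and the archive table must match the source table's structure and sit on the same filegroup as the partition being switched out:
-- Sketch only: assumes My_Table is partitioned on the date column.
ALTER TABLE My_Table
SWITCH PARTITION 3 TO My_Table_Archive;   -- metadata-only move of the old rows

TRUNCATE TABLE My_Table_Archive;          -- then discard them almost instantly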
I think the best option may be, as you say, to delete the data in smaller batches rather than in one hit, so as to avoid any potential blocking issues.
You could also consider the following method:
Copy the data to keep into a temporary table
Truncate the original table to purge all data
Move everything from the temporary table back into the original table
Your indexes would also be rebuilt as the data was added back to the original table.
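A minimal sketch of that method, assuming My_Table is not referenced by foreign keys (TRUNCATE would fail otherwise) and that tempdb has room for the rows being kept:
-- Sketch only: My_Table and the keep condition are placeholders.
SELECT *
INTO #KeepRows                 -- copy of the rows to keep, lands in tempdb
FROM My_Table
WHERE <condition for rows to keep>;

TRUNCATE TABLE My_Table;       -- deallocates pages; barely touches the log

INSERT INTO My_Table
SELECT * FROM #KeepRows;       -- fully logged reload; indexes are maintained as the data comes back

DROP TABLE #KeepRows;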

I would do something similar to the temp table suggestions but I'd select into a new permanent table the rows you want to keep, drop the original table and then rename the new one. This should have a relatively low tran log impact. Obviously remember to recreate any indexes that are required on the new table after you've renamed it.
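Something like this, say, where My_Table / My_Table_New and the keep condition are placeholder names rather than your actual schema:
-- Sketch only: build a new permanent table from the rows to keep, then swap names.
SELECT *
INTO dbo.My_Table_New
FROM dbo.My_Table
WHERE <condition for rows to keep>;

DROP TABLE dbo.My_Table;

EXEC sp_rename 'dbo.My_Table_New', 'My_Table';

-- Recreate the primary key, indexes, constraints and permissions afterwards.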
Just my two p'enneth.

Here is my example:
-- configure script
-- Script limits - transactions per commit (default 10,000)
-- and time to allow the script to run (in seconds, default 2 hours)
--
DECLARE @MAX INT
DECLARE @MAXT INT
--
-- These 4 values ($MAX, $MAXT, $TABLE, $WHERE) are substituted textually by the
-- calling shell script; $TABLE and $WHERE are spliced straight into the statements
-- below, since a table name cannot be passed through a variable.
--
SET @MAX = $MAX
SET @MAXT = $MAXT
-- step 1 - main loop
DECLARE @continue INT
-- rows deleted in one transaction
DECLARE @deleted INT
-- rows deleted in total by the script
DECLARE @total INT
SET @total = 0
DECLARE @max_id INT, @start_id INT, @end_id INT, @interval INT
SET @interval = @MAX
SELECT @start_id = MIN(id), @max_id = MAX(id) FROM $TABLE
SET @end_id = @start_id + @interval
-- timing
DECLARE @start DATETIME
DECLARE @now DATETIME
DECLARE @timee INT
SET @start = GETDATE()
--
SET @continue = 1
IF OBJECT_ID (N'EntryID', 'U') IS NULL
BEGIN
    -- persist the restart point so the script can be re-run from where it left off
    CREATE TABLE EntryID (startid INT)
    INSERT INTO EntryID (startid) VALUES (@start_id)
END
ELSE
BEGIN
    SELECT @start_id = startid FROM EntryID
END
WHILE (@continue = 1 AND @start_id <= @max_id)
BEGIN
    PRINT 'Start issued: ' + CONVERT(varchar(19), GETDATE(), 120)
    BEGIN TRANSACTION
        DELETE
        FROM $TABLE
        WHERE id BETWEEN @start_id AND @end_id AND $WHERE
        SET @deleted = @@ROWCOUNT
        UPDATE EntryID SET EntryID.startid = @end_id + 1
    COMMIT
    PRINT 'Delete issued: ' + STR(@deleted) + ' records. ' + CONVERT(varchar(19), GETDATE(), 120)
    SET @total = @total + @deleted
    SET @start_id = @end_id + 1
    SET @end_id = @end_id + @interval
    IF @end_id > @max_id
        SET @end_id = @max_id
    SET @now = GETDATE()
    SET @timee = DATEDIFF(second, @start, @now)
    IF @timee > @MAXT
    BEGIN
        PRINT 'Time limit exceeded for the script, exiting'
        SET @continue = 0
    END
    -- ELSE
    -- BEGIN
    --     SELECT @total 'Removed now', @timee 'Total time, seconds'
    -- END
END
SELECT @total 'Removed records', @timee 'Total time sec', @start_id 'Next id', @max_id 'Max id', @continue 'COMPLETED?'
SELECT * FROM EntryID next_start_id
GO

The short answer is, you can't delete 2 billion rows without incurring some kind of major database downtime.
Your best option may be to copy the data to a temp table and truncate the original table, but this will fill your tempDB and would use no less logging than deleting the data.
You will need to delete as many rows as you can until the transaction log fills up, then truncate it each time. The answer provided by Stanislav Kniazev could be modified to do this by increasing the batch size and adding a call to truncate the log file.

I agree with the people suggesting you loop over a smaller set of records; this will be faster than trying to do the whole operation in one step. You may need to experiment with the number of records to include in each loop. About 2,000 at a time seems to be the sweet spot in most of the tables I do large deletes from, although a few need smaller amounts, like 500. It depends on the number of foreign keys, the size of the record, triggers, etc., so it really will take some experimenting to find what you need. It also depends on how heavy the use of the table is. A heavily accessed table will need each iteration of the loop to run for a shorter amount of time. If you can run during off hours, or better yet in single-user mode, then you can have more records deleted in one loop.
If you don't think you can do this in one night during off hours, it might be best to design the loop with a counter and only do a set number of iterations each night until it is done.
Further, if you use an implicit transaction rather than an explicit one, you can kill the loop query at any time and records already deleted will stay deleted, except those in the current round of the loop. Much faster than trying to roll back half a million records because you've brought the system to a halt.
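A sketch of that counter-limited loop, reusing the your_table / <your_condition> placeholders from earlier; the batch size and the nightly iteration cap are values you would tune:
-- Sketch only: delete in small batches and stop after a fixed number of
-- iterations so the job fits inside a nightly maintenance window.
DECLARE @iterations INT
SET @iterations = 0

WHILE @iterations < 500                  -- nightly cap; tune to your window
      AND EXISTS (SELECT 1 FROM your_table WHERE <your_condition>)
BEGIN
    DELETE TOP (2000) FROM your_table    -- each DELETE auto-commits as its own transaction
    WHERE <your_condition>

    SET @iterations = @iterations + 1
END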
It is usually a good idea to backup a database immediately before undertaking an operation of this nature.

Related

WAITFOR DELAY doesn't act separately within each WHILE loop

I've been teaching myself to use WHILE loops and decided to try making a fun Russian Roulette simulation. That is, a query that will randomly SELECT (or PRINT) up to 6 statements (one for each of the chambers in a revolver), the last of which reads "you die!" and any prior to this reading "you survive."
I did this by first creating a table #Nums which contains the numbers 1-6 in random order. I then have a WHILE loop as follows, with a BREAK if the chamber containing the "bullet" (1) is selected (I know there are simpler ways of selecting a random number, but this is adapted from something else I was playing with before and I had no interest in changing it):
SET NOCOUNT ON

CREATE TABLE #Nums ([Num] INT)

DECLARE @Count INT = 1
DECLARE @Limit INT = 6
DECLARE @Number INT

WHILE @Count <= @Limit
BEGIN
    SET @Number = ROUND(RAND(CONVERT(varbinary, NEWID())) * @Limit, 0, 1) + 1
    IF NOT EXISTS (SELECT [Num] FROM #Nums WHERE [Num] = @Number)
    BEGIN
        INSERT INTO #Nums VALUES (@Number)
        SET @Count += 1
    END
END

DECLARE @Chamber INT

WHILE 1=1
BEGIN
    SET @Chamber = (SELECT TOP 1 [Num] FROM #Nums)
    IF @Chamber = 1
    BEGIN
        SELECT 'you die!' [Unlucky...]
        BREAK
    END
    SELECT 'you survive.' [Phew...]
    DELETE FROM #Nums WHERE [Num] = @Chamber
END

DROP TABLE #Nums
This works fine, but the results all appear instantaneously, and I want to add a delay between each one to add a bit of tension.
I tried using WAITFOR DELAY as follows:
WHILE 1=1
BEGIN
    WAITFOR DELAY '00:00:03'
    SET @Chamber = (SELECT TOP 1 [Num] FROM #Nums)
    IF @Chamber = 1
    BEGIN
        SELECT 'you die!' [Unlucky...]
        BREAK
    END
    SELECT 'you survive.' [Phew...]
    DELETE FROM #Nums WHERE [Num] = @Chamber
END
I would expect the WAITFOR DELAY to initially cause a 3 second delay, then for the first SELECT statement to be executed and for the text to appear in the results grid, and then, assuming the live chamber was not selected, for there to be another 3 second delay and so on, until the live chamber is selected.
However, before anything appears in my results grid, there is a delay of 3 seconds per number of SELECT statements that are executed, after which all results appear at the same time.
I tried using PRINT instead of SELECT but encounter the same issue.
Clearly there's something I'm missing here - can anyone shed some light on this?
It's called buffering. The server doesn't want to return an only partially filled response because, most of the time, there are all of the networking overheads to account for. Lots of very small packets are more expensive than a few larger packets.[1]
If you use RAISERROR (don't worry about the name here; at severity 10 it's just an informational message) you can specify WITH NOWAIT to say "send this immediately". There's no equivalent with PRINT or with returning result sets:
SET NOCOUNT ON

CREATE TABLE #Nums ([Num] INT)

DECLARE @Count INT = 1
DECLARE @Limit INT = 6
DECLARE @Number INT

WHILE @Count <= @Limit
BEGIN
    SET @Number = ROUND(RAND(CONVERT(varbinary, NEWID())) * @Limit, 0, 1) + 1
    IF NOT EXISTS (SELECT [Num] FROM #Nums WHERE [Num] = @Number)
    BEGIN
        INSERT INTO #Nums VALUES (@Number)
        SET @Count += 1
    END
END

DECLARE @Chamber INT

WHILE 1=1
BEGIN
    WAITFOR DELAY '00:00:03'
    SET @Chamber = (SELECT TOP 1 [Num] FROM #Nums)
    IF @Chamber = 1
    BEGIN
        RAISERROR('you die!, Unlucky', 10, 1) WITH NOWAIT
        BREAK
    END
    RAISERROR('you survive., Phew...', 10, 1) WITH NOWAIT
    DELETE FROM #Nums WHERE [Num] = @Chamber
END

DROP TABLE #Nums
As Larnu already alluded to in the comments, this isn't a good use of T-SQL.
SQL is a set-oriented language. We try not to write procedural code (do this, then do that, then run this block of code multiple times). We try to give the server as much as possible in a single query and let it work out how to process it. Whilst T-SQL does have language support for loops, we try to avoid them if possible.
[1] I'm using "packets" very loosely here. Note that the same optimizations apply no matter what networking (or no-networking, local-memory) option is actually being used to carry the connection between client and server.

Purging tables in SQL Server: deleting time

I'm in charge of an OLAP database where I noticed that some cleaning would bring some benefits.
My first analysis comes to about 500 million rows to be deleted across approximately 50 tables, and this represents in general 70% of each table.
I've come across the solution where you copy to a temp table, drop the original one, then bring it back again... but I have too many dependencies and I don't want to take the risk of going down that path.
So I went for another solution: deleting little by little so there are no table locks.
This is code I found here on Stack Overflow that I've tried to improve, deleting 4,000 rows at a time.
SET STATISTICS TIME OFF

DECLARE @BATCHSIZE INT, @WAITFORVAL VARCHAR(8), @ITERATION INT, @TOTALROWS INT,
        @MAXRUNTIME VARCHAR(8), @BSTOPATMAXTIME BIT, @MSG VARCHAR(500)

SET DEADLOCK_PRIORITY LOW;

SET @BATCHSIZE = 4000
SET @WAITFORVAL = '00:00:10'
SET @MAXRUNTIME = '18:00:00' -- 6 PM
SET @BSTOPATMAXTIME = 1 -- ENFORCE 6 PM STOP TIME
SET @ITERATION = 0 -- LEAVE THIS
SET @TOTALROWS = 0 -- LEAVE THIS

BEGIN TRY
    WHILE @BATCHSIZE > 0
    BEGIN
        -- IF @BSTOPATMAXTIME = 1, THEN WE'LL STOP THE WHOLE JOB AT A SET TIME...
        IF (CONVERT(VARCHAR(8), GETDATE(), 108) >= @MAXRUNTIME AND @BSTOPATMAXTIME = 1) OR @ITERATION > 2000
        BEGIN
            RETURN;
        END

        DELETE TOP (@BATCHSIZE)
        FROM FacY WHERE IdDimX NOT IN (SELECT IdDimX FROM vwX)

        SET @BATCHSIZE = @@ROWCOUNT
        SET @ITERATION = @ITERATION + 1
        SET @TOTALROWS = @TOTALROWS + @BATCHSIZE
        SET @MSG = 'Iteration: ' + CAST(@ITERATION AS VARCHAR) + ' Total deletes: ' + CAST(@TOTALROWS AS VARCHAR)
        RAISERROR (@MSG, 0, 1) WITH NOWAIT
        --COMMIT TRANSACTION;
    END
END TRY
BEGIN CATCH
    IF @@ERROR <> 0
       AND @@TRANCOUNT > 0
    BEGIN
        PRINT 'An error occurred. The database update failed.';
        ROLLBACK TRANSACTION;
    END;
END CATCH;
This took approximately 30 minutes to one hour to delete 4 million rows.
Instead, I then tried deleting 100,000 rows at a time and it finished in 1 minute; then I tried 1 million rows and it did it in 5-6 minutes.
Then I went for 10 million rows and it took 15 minutes (but the 50 GB log was 60% full, so I think that's the limit).
So now I'm wondering: isn't it better if I delete in big blocks after all, since the small batches would take a lot of time?
And what I don't understand is why it takes less time to delete in big blocks?
Thanks for your help.

Is this sql update guaranteed to be atomic?

I have the following sql:
UPDATE Customer SET Count=1 WHERE ID=1 AND Count=0
SELECT @@ROWCOUNT
I need to know if this is guaranteed to be atomic.
If 2 users try this simultaneously, will only one succeed and get a return value of 1? Do I need to use a transaction or something else in order to guarantee this?
The goal is to get a unique 'Count' for the customer. Collisions in this system will almost never happen, so I am not concerned with the performance if a user has to query again (and again) to get a unique Count.
EDIT:
The goal is to not use a transaction if it is not needed. Also this logic is ran very infrequently (up to 100 per day), so I wanted to keep it as simple as possible.
It may depend on the SQL server you are using, but for most the answer is yes. In effect, you are implementing a lock.
Using SQL Server (v 11.0.6020), I can confirm that this is indeed an atomic operation, as best as I can determine.
I wrote some test stored procedures to try to test this logic:
-- Attempt to update a Customer row with a new Count, returning
-- the current count (used as customer order number) and a bit
-- which determines success or failure. If @Success is 0, re-run
-- the query and try again.
CREATE PROCEDURE [dbo].[sp_TestUpdate]
(
    @Count INT OUTPUT,
    @Success BIT OUTPUT
)
AS
BEGIN
    DECLARE @NextCount INT
    SELECT @Count = Count FROM Customer WHERE ID = 1
    SET @NextCount = @Count + 1
    UPDATE Customer SET Count = @NextCount WHERE ID = 1 AND Count = @Count
    SET @Success = @@ROWCOUNT
END
And:
-- Loop (many times) trying to get a number and insert it into another
-- table. Execute this loop concurrently in several different windows
-- using SSMS.
CREATE PROCEDURE [dbo].[sp_TestLoop]
AS
BEGIN
    DECLARE @Iterations INT
    DECLARE @Counter INT
    DECLARE @Count INT
    DECLARE @Success BIT

    SET @Iterations = 40000
    SET @Counter = 0

    WHILE (@Counter < @Iterations)
    BEGIN
        SET @Counter = @Counter + 1
        EXEC sp_TestUpdate @Count = @Count OUTPUT, @Success = @Success OUTPUT
        IF (@Success = 1)
        BEGIN
            INSERT INTO TestImage (ImageNumber) VALUES (@Count)
        END
    END
END
This code ran, creating unique sequential ImageNumber values in the TestImage table. This proves that the above SQL update call is indeed atomic. Neither function guaranteed the updates were done, but they did guarantee that no duplicates were created, and no numbers were skipped.

About lock behavior when updating a row in SQL Server

I'm trying to increment a number sequentially in SQL Server, starting from a number provided by users.
I have a problem when multiple users insert a row at the same time with the same number.
I tried writing the number the user provided to a parameter table, and I expected that when I update the same table with the same condition, SQL Server would block any modification to this row until the current update finished, but it does not.
Here is the update statement I used:
UPDATE GlobalParam
SET ValueString = (CAST(ValueString as bigint) + 1)
WHERE Id = 'xxxx'
Could you tell me any way to force the other update command to wait until the current command has finished?
This is my entire command:
DECLARE @Result bigint;

UPDATE GlobalParam SET ValueString = (SELECT MAX(Code) FROM Item)

DECLARE @SelectTopStm nvarchar(MAX);
DECLARE @ExistRow int

SET @SelectTopStm = 'SELECT @ExistRow = 1 FROM (SELECT TOP 1 Code FROM Item WHERE Code = ''999'') temp'
EXEC sp_executesql @SelectTopStm, N'@ExistRow int output', @ExistRow output

IF (@ExistRow is not null)
BEGIN
    DECLARE @MaxValue bigint
    DECLARE @ReturnUpdateTbl table (ValueString nvarchar(max));

    UPDATE GlobalParam SET ValueString = (CAST(ValueString as bigint) + 1)
    OUTPUT inserted.ValueString INTO @ReturnUpdateTbl
    WHERE [Id] = '333A8E1F-16DD-E411-8280-D4BED9D726B3'

    SELECT TOP 1 @MaxValue = CAST(ValueString as bigint) FROM @ReturnUpdateTbl
    SET @Result = @MaxValue
END
ELSE
BEGIN
    SET @Result = 999
END
I write the codes above as a stored procedure.
Here is the real code when I insert one Item:
DECLARE @IncrementResult BIGINT
EXEC IncrementNumberUnique
    , (some parameters)..
    , @Result = @IncrementResult OUTPUT

INSERT INTO ITEM (Id, Code) VALUES ('xxxx', @IncrementResult)
I created 3 threads and ran them at the same time.
The result returned:
Id Code
1 999
2 1000
3 1000
Thanks
If I understood your requirements correctly, try the ROWLOCK hint to tell the optimizer to start by locking the rows one by one as the update needs them.
UPDATE GlobalParam WITH(ROWLOCK)
SET ValueString = (CAST(ValueString as bigint) + 1)
WHERE Id = 'xxxx'
By default SQL Server uses READ COMMITTED isolation, which releases shared (read) locks as soon as the read operation completes. Once the UPDATE statement below is complete, all read locks on the table Item are released.
UPDATE GlobalParam SET ValueString = (SELECT MAX(Code) FROM Item)
Since your INSERT into Item is outside the scope of your procedure, you can run the thread at the SERIALIZABLE isolation level. Something like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

DECLARE @IncrementResult BIGINT
EXEC IncrementNumberUnique
    , (some parameters)..
    , @Result = @IncrementResult OUTPUT

INSERT INTO ITEM (Id, Code) VALUES ('xxxx', @IncrementResult)
Changing the isolation level to SERIALIZABLE will increase blocking and contention for resources on the Item table.
To learn more about isolation levels, refer to this.
You should look into identity columns and remove such manual computation of incremental values if possible.
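For example, if Item.Code can be generated by the database instead of being read from GlobalParam, an identity column gives you unique, sequential values with no manual locking at all. This is only a sketch and the column types here are assumptions, not the real schema:
-- Sketch only: let the engine generate the sequential value.
CREATE TABLE Item
(
    Id   NVARCHAR(50) NOT NULL PRIMARY KEY,
    Code BIGINT IDENTITY(1000, 1) NOT NULL   -- unique and sequential, safe under concurrency
);

INSERT INTO Item (Id) VALUES ('xxxx');
SELECT SCOPE_IDENTITY() AS NewCode;          -- the Code value assigned by this insert
On SQL Server 2012 and later, a SEQUENCE object is another option if the number has to be shared across several tables.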

How to copy a large amount of data from one table to another table in SQL Server

I want to copy a large amount of data from one table to another table. I used a cursor in a stored procedure to do this, but it works only for tables with few records. If the tables contain more records, it runs for a long time and hangs. Please give some suggestions on how I can copy the data in a faster way. My SP is below:
--exec uds_shop
--select * from CMA_UDS.dbo.Dim_Shop
--select * from UDS.dbo.Dim_Shop
--delete from CMA_UDS.dbo.Dim_Shop
alter procedure uds_shop
as
begin
    declare @dwkeyshop int
    declare @shopdb int
    declare @shopid int
    declare @shopname nvarchar(60)
    declare @shoptrade int
    declare @dwkeytradecat int
    declare @recordowner nvarchar(20)
    declare @LogMessage varchar(600)

    Exec CreateLog 'Starting Process', 1

    DECLARE cur_shop CURSOR FOR
        select DW_Key_Shop, Shop_ID, Shop_Name, Trade_Sub_Category_Code, DW_Key_Source_DB, DW_Key_Trade_Category, Record_Owner
        from UDS.dbo.Dim_Shop

    OPEN cur_shop
    FETCH NEXT FROM cur_shop INTO @dwkeyshop, @shopid, @shopname, @shoptrade, @shopdb, @dwkeytradecat, @recordowner

    WHILE @@FETCH_STATUS = 0
    BEGIN
        Set @LogMessage = ''
        Set @LogMessage = 'Records insertion/updation start date and time : ''' + Convert(varchar(19), GetDate()) + ''''

        if (isnull(@dwkeyshop, '') <> '')
        begin
            if not exists (select crmshop.DW_Key_Shop from CMA_UDS.dbo.Dim_Shop as crmshop where (convert(varchar, crmshop.DW_Key_Shop) + CONVERT(varchar, crmshop.DW_Key_Source_DB)) = convert(varchar, (CONVERT(varchar, @dwkeyshop) + CONVERT(varchar, @shopdb))))
            begin
                Set @LogMessage = Ltrim(Rtrim(@LogMessage)) + ' ' + 'Record for shop table is inserting...'
                insert into CMA_UDS.dbo.Dim_Shop
                    (DW_Key_Shop, DW_Key_Source_DB, DW_Key_Trade_Category, Record_Owner, Shop_ID, Shop_Name, Trade_Sub_Category_Code)
                values
                    (@dwkeyshop, @shopdb, @dwkeytradecat, @recordowner, @shopid, @shopname, @shoptrade)
                Set @LogMessage = Ltrim(Rtrim(@LogMessage)) + ' ' + 'Record successfully inserted in shop table for shop Id : ' + Convert(varchar, @shopid)
            end
            else
            begin
                Set @LogMessage = Ltrim(Rtrim(@LogMessage)) + ' ' + 'Record for Shop table is updating...'
                update CMA_UDS.dbo.Dim_Shop
                set DW_Key_Trade_Category = @dwkeytradecat,
                    Record_Owner = @recordowner,
                    Shop_ID = @shopid,
                    Shop_Name = @shopname,
                    Trade_Sub_Category_Code = @shoptrade
                where DW_Key_Shop = @dwkeyshop and DW_Key_Source_DB = @shopdb
                Set @LogMessage = Ltrim(Rtrim(@LogMessage)) + ' ' + 'Record successfully updated for shop Id : ' + Convert(varchar, @shopid)
            end
        end

        Exec CreateLog @LogMessage, 0
        FETCH NEXT FROM cur_shop INTO @dwkeyshop, @shopid, @shopname, @shoptrade, @shopdb, @dwkeytradecat, @recordowner
    end

    CLOSE cur_shop
    DEALLOCATE cur_shop
End
Assuming targetTable and destinationTable have the same schema...
INSERT INTO targetTable
SELECT * FROM destinationTable
WHERE someCriteria
Avoid the use of cursors unless there is no other way (rare).
You can use the WHERE clause to filter out any duplicate records.
If you have an identity column, use an explicit column list that doesn't contain the identity column.
You can also try disabling constraints and removing indexes provided you replace them (and make sure the constraints are checked) afterwards.
If you are on SQL Server 2008 (onwards) you can use the MERGE statement.
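A rough MERGE sketch (SQL Server 2008 onwards); the id/col1/col2 column names are placeholders, not the actual schema from the question:
-- Sketch only: upsert rows from the source into the target in one set-based statement.
MERGE INTO targetTable AS t
USING destinationTable AS d
    ON t.id = d.id                        -- join on the business key
WHEN MATCHED THEN
    UPDATE SET t.col1 = d.col1,
               t.col2 = d.col2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, col1, col2)
    VALUES (d.id, d.col1, d.col2);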
From my personal experience, when you copy a huge amount of data from one table to another (with similar constraints), drop the constraints on the table the data is being copied into. Once the copy is done, reinstate all the constraints again.
I could reduce the copy time from 7 hours to 30 minutes in my case (100 million records with 6 constraints).
INSERT INTO targetTable
SELECT * FROM destinationTable
WHERE someCriteria   -- based on the criteria you can copy/move the records
Cursors are notoriously slow, and RAM can become a problem for very large datasets.
It does look like you are doing a good bit of logging in each iteration, so you may be stuck with the cursor, but I would instead look for a way to break the job up into multiple invocations so that you can keep your footprint small.
If you have an autonumber column, I would add a '@startIdx bigint' parameter to the procedure, and redefine your cursor statement to take the 'TOP 1000' 'WHERE [autonumberField] > @startIdx ORDER BY [autonumberField]'. Then create a new stored procedure with something like:
DECLARE @startIdx bigint = 0
WHILE (SELECT COUNT(*) FROM <sourceTable>) > @startIdx
BEGIN
    EXEC <your stored procedure> @startIdx
    SET @startIdx = @startIdx + 1000
END
Also, make sure your database files are set to auto-grow, and that they do so in large increments, so you are not spending all your time growing your data files.
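For instance, something along these lines sets a fixed, reasonably large growth increment; MyDB and the logical file names are placeholders you would look up in sys.database_files:
-- Sketch only: grow the data and log files in large fixed increments rather than small percentages.
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Data, FILEGROWTH = 1024MB);
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Log,  FILEGROWTH = 512MB);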