Using PowerShell to copy a large Oracle table to SQL Server - memory issue

We are trying to copy data from a large Oracle table (about 50M rows) to SQL Server using PowerShell and SqlBulkCopy. The issue with this particular Oracle table is that it contains a CLOB column, and unlike our other table loads, this one keeps consuming more and more OS memory, eventually starving SQL Server, which sits on the same server that PowerShell runs on. Oracle is external and the data comes over the network. The maximum CLOB size is about 6.4 million bytes, while the average is around 2,000 bytes.
Here is a snippet of the code being used. The batch size does not seem to have any bearing on what happens:
```powershell
$SourceConnection = New-Object Oracle.ManagedDataAccess.Client.OracleConnection($SourceConnectionString)
$SourceConnection.Open()

# Source command: plain text query against Oracle
$SourceCmd = $SourceConnection.CreateCommand()
$SourceCmd.CommandType = "Text"
$SourceCmd.CommandText = $queryStatement

# Bulk copy into SQL Server; each batch runs in SqlBulkCopy's internal transaction
$bulkCopy = New-Object Data.SqlClient.SqlBulkCopy($targetConnectionString, [System.Data.SqlClient.SqlBulkCopyOptions]::UseInternalTransaction)
$bulkCopy.DestinationTableName = $destTable
$bulkCopy.BulkCopyTimeout = 0
$bulkCopy.BatchSize = 500

$SourceReader = $SourceCmd.ExecuteReader()
Start-Sleep -Seconds 2
$bulkCopy.WriteToServer($SourceReader)
```
We tried different batch sizes, smaller and larger, with the same result. We also tried setting EnableStreaming to both $true and $false, and tried UseInternalTransaction (as in the code sample above) as well as the default options while still specifying a batch size. Is there anything else we can try to avoid the memory pressure?
Thank you in advance!

It turned out, after extensive research, that an obscure Oracle command property governs how data is sent for CLOBs, and that is what was saturating memory:
InitialLOBFetchSize
This property specifies the amount of data that the OracleDataReader initially fetches for LOB columns. It defaults to 0, which means "the entire CLOB".
I set it to 1M bytes, which is plenty for our data, and the process never ate into memory again.
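Roughly, the change is a single extra line on the source command before ExecuteReader is called (1M bytes is the value that worked for us; treat it as a starting point and adjust for your data):

```powershell
# Same source command as above, now capping how much LOB data the reader fetches up front.
$SourceCmd = $SourceConnection.CreateCommand()
$SourceCmd.CommandType = "Text"
$SourceCmd.CommandText = $queryStatement
$SourceCmd.InitialLOBFetchSize = 1048576   # ~1M bytes per LOB column; the default of 0 pulled entire CLOBs

$SourceReader = $SourceCmd.ExecuteReader()
$bulkCopy.WriteToServer($SourceReader)
```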

Related

Inserting a large amount of data into SQL Server in one stored procedure results in a huge amount of memory usage by the executing process

I am importing 300 MB of data into a SQL Server database with a stored procedure, using table-valued parameters. I noticed that the process/executable that executes the stored procedure ends up using about 2 GB of memory, and the usage stays at that level for up to hours before it drops again.
Is that proportion normal?
I assume that, to be able to roll back, the database has to do a lot of copying and bookkeeping? The data is also just many, many small records, so is that a reason as well?
If it's not normal, might something else be configured wrongly?

Pentaho gives Unexpected End of Stream when querying a large dataset

I have a Pentaho transformation that reads several million rows of data with a Table Input step. With a few million rows it runs fine, but at about 15 million I get an "Unexpected End of Stream, x out of y bytes" exception. When this happens, several other Table Inputs feeding Stream Lookups still work fine; only the input for my main stream gets no rows. My database is MariaDB and my timeouts are set to 8 hours (don't ask :/). Has anyone encountered anything similar?
My query was not using an index on my date range when the range was large. I've forced that index and still have the same problem. In the processlist the query is stuck at "Writing to net".
The problem was with my MariaDB connector. I changed this to a MySQL connector and it worked perfectly.

Bulk conversion of a large SQL database (100 GB stored in 10 files, 100 tables per file) to SQLite

I am converting a large SQL database (100 GB stored in 10 files, with 100 tables per file) to SQLite. Right now I am using the CodeProject C# utility, as suggested in another thread (convert sql-server *.mdf file into sqlite file). However, this approach is not entirely satisfactory, for two reasons:
1) The conversion process usually stops abruptly while converting one of my files, and I then have to go in and check which tables were converted successfully and which were not.
2) I could manually convert 10 tables at a time, but that would require 100 repetitions and my constant presence in front of the computer.
Thank you so much in advance!
It is possible a "transaction log" is being created. This is the log used to roll back changes if something goes wrong. Since your job is so large, this log file can grow too large and the process will fail.
Try this:
1) Back up the data.
2) Turn off the journal with: PRAGMA database.journal_mode = OFF; (where database is the name of the attached database, e.g. main).
Caveat: I've never tried this with SQLite, but other databases work in a similar fashion.

Moving data from one table to another in SQL Server 2005

I am moving around 10 million rows from one table to another in SQL Server 2005. The purpose of the transfer is to take the old data offline.
After some time it throws the error: "The LOG FILE FOR DATABASE 'tempdb' IS FULL."
My tempdb and templog are placed on a drive (other than the C drive) that has around 200 GB free, and my tempdb size is set to 25 GB.
As per my understanding, I will have to increase the size of tempdb from 25 GB to 50 GB and set the log file autogrowth to "unrestricted file growth (MB)".
Please let me know of any other factors. I cannot experiment much, as I am working on a production database, so please also let me know whether these changes will have any other impact.
Thanks in advance.
You know the solution; it seems you are just moving part of the data to make your queries faster. I agree with your approach:
"As per my understanding, I will have to increase the size of tempdb from 25 GB to 50 GB and set the log file autogrowth to 'unrestricted file growth (MB)'."
Go ahead.
My guess is that you're trying to move all of the data in a single batch. Can you break it up into smaller batches and commit fewer rows per insert? Also, as noted in the comments, you may be able to switch your destination database to the SIMPLE or BULK_LOGGED recovery model.
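For example, a chunked move along these lines keeps each transaction small (the table names, chunk size, and cutoff date are placeholders to adapt to your schema):

```powershell
# Hypothetical sketch: move rows in 50,000-row chunks so no single transaction
# has to log the whole 10-million-row move. Assumes dbo.ArchiveData has the
# same column layout as dbo.ActiveData.
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=.;Database=MyDb;Integrated Security=SSPI")
$conn.Open()

$cmd = $conn.CreateCommand()
$cmd.CommandTimeout = 0
$cmd.CommandText = @"
DELETE TOP (50000) FROM dbo.ActiveData
OUTPUT deleted.* INTO dbo.ArchiveData
WHERE CreatedDate < '2005-01-01';
"@

do {
    $moved = $cmd.ExecuteNonQuery()   # each chunk commits on its own
    Write-Host "Moved $moved rows"
} while ($moved -eq 50000)

$conn.Close()
```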
Why are you using the log file at all? Copy your data (data file and log file) as a backup, then set the recovery model to SIMPLE and run the transfer again.

Importing a massive text file into a SQL Server database

I am currently trying to import a text file with 180+ million records and about 300+ columns into my SQL Server database. Needless to say, the file is roughly 70 GB. I have been at it for days, and every time I get close, something happens and it craps out on me. I need the quickest and most efficient way to do this import. I have tried the wizard, which should have been the easiest, and then tried just saving it as an SSIS package. I haven't been able to figure out how to do a bulk import with the settings I think would work well. The error I keep getting is 'not enough virtual memory'. I changed my virtual memory to 36 GB; my system has 24 GB of physical memory. Please help me.
If you are using BCP (and you should be for files this large), use a batch size. Otherwise, BCP will attempt to load all records in one transaction.
By command line: bcp -b 1000
By C#:
using (System.Data.SqlClient.SqlBulkCopy bulkCopy =
    new System.Data.SqlClient.SqlBulkCopy(sqlConnection))
{
    bulkCopy.DestinationTableName = destinationTableName;
    bulkCopy.BatchSize = 1000;           // 1000 rows
    bulkCopy.WriteToServer(dataTable);   // may also pass in DataRow[]
}
Here are the highlights from this MSDN article:
Importing a large data file as a single batch can be problematic, so bcp and BULK INSERT let you import data in a series of batches, each of which is smaller than the data file. Each batch is imported and logged in a separate transaction...
Try reducing the maximum server memory for SQL Server to as low as you can get away with. (Right click the SQL instance in Mgmt Studio -> properties -> memory).
This may free up enough memory for the OS & SSIS to process such a big text file.
I'm assuming the whole process is happening locally on the server.
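If you'd rather script the change than click through Management Studio, something along these lines should work (the 4096 MB cap is only an example; size it for your server):

```powershell
# Lower SQL Server's memory cap so the OS and SSIS have headroom for the big text file.
# 'max server memory (MB)' is an advanced option, so it has to be exposed first.
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=.;Database=master;Integrated Security=SSPI")
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = @"
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 4096; RECONFIGURE;
"@
[void]$cmd.ExecuteNonQuery()
$conn.Close()
```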
I had a similar problem with SQL Server 2012 when trying to import (as a test) around 7 million records into a database. I too ran out of memory and had to cut the bulk import into smaller pieces. The one thing to note is that the import process (no matter how you run it) uses up a ton of memory and won't release that system memory until the server is rebooted. I'm not sure whether this is intended behavior by SQL Server, but it's something to note for your project.
Because I was using the SEQUENCE command with this process, I had to put the T-SQL into saved .sql scripts and then run them with SQLCMD in small pieces to lessen the memory overhead.
You'll have to play around with what works for you, and I highly recommend not running the script all at once.
It's going to be a pain in the ass to break it down into smaller pieces and import it, but in the long run you'll be happier.
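For example, the small pieces can be driven from PowerShell with sqlcmd, something like this (the folder and script names are just placeholders for however you split the work):

```powershell
# Run the pre-split scripts one at a time; -b makes sqlcmd return a non-zero
# exit code on error so a bad chunk stops the run instead of being skipped.
Get-ChildItem -Path .\chunks -Filter 'load_*.sql' | Sort-Object Name | ForEach-Object {
    Write-Host "Running $($_.Name)"
    sqlcmd -S "localhost" -d "StagingDb" -E -b -i $_.FullName
    if ($LASTEXITCODE -ne 0) { throw "sqlcmd failed on $($_.Name)" }
}
```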