I need to select some 100k+ records from a SQL table, do some processing, and then bulk insert the results into another table. I am using SqlBulkCopy to do the bulk insert, which runs quickly. For fetching the 100k+ records, I am currently using a DataReader.
Problem: sometimes I get a timeout error in the DataReader. I have increased the timeout to some manageable number.
Is there anything like SqlBulkCopy for selecting records in bulk?
Thanks!
Bala
It sounds like you should do all your processing inside SQL Server, or split the data into chunks.
A quote from this MSDN page:
Note
No special optimization techniques exist for bulk-export operations. These operations simply select the data from the source table by using a SELECT statement.
However, that same page mentions that the bcp utility can "bulk export" data from SQL Server to a file.
I suggest you try your query with bcp, and see if it's significantly faster. If it's not, I'd give up and try fiddling with your batch sizes, or look harder at moving the processing into SQL Server.
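If you do stay with the DataReader + SqlBulkCopy approach, something like the following may help with the timeout (connection strings and table/column names are placeholders, and the per-row processing step is left out): raise the command timeout on the SELECT and stream the reader straight into WriteToServer so nothing is buffered client-side.

using System.Data.SqlClient;

// Hypothetical connection strings and table/column names, for illustration only.
using (var source = new SqlConnection("..."))
using (var dest = new SqlConnection("..."))
{
    source.Open();
    dest.Open();

    using (var cmd = new SqlCommand("SELECT Col1, Col2, Col3 FROM dbo.SourceTable", source))
    {
        cmd.CommandTimeout = 600; // raise the SELECT timeout (0 = wait indefinitely)

        using (var reader = cmd.ExecuteReader())
        using (var bulk = new SqlBulkCopy(dest))
        {
            bulk.DestinationTableName = "dbo.DestTable";
            bulk.BulkCopyTimeout = 0;   // no timeout on the bulk load itself
            bulk.BatchSize = 10000;     // commit in batches rather than one huge transaction
            bulk.WriteToServer(reader); // streams rows; the full result set is never held in memory
        }
    }
}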
Similar to inserting single or multiple records at a time, how can data from one table be inserted into another table with a limited set of columns?
There are (at least) four ways:
INSERT. Pretty obvious. It supports both single rows and multiple rows supplied as literal values, as well as inserting the result of a query or stored procedure.
SELECT .. INTO inserts the results of a query into a new table.
BULK INSERT. Bulk inserts data from files. It's a little quirky and not very flexible when it comes to parsing files, but if you can get the data to line up it works well enough. Selecting data for bulk insert purposes can also be done with OPENROWSET(BULK, ...).
INSERT BULK. This is an internal command that's used under the covers by drivers that use the bulk insert protocol in TDS (the protocol used by SQL Server). You do not issue these commands yourself. Unlike BULK INSERT, this is for client-side initiated bulk inserting, for example through the SqlBulkCopy class in .NET, or SQL Server's own bcp tool.
All other interfaces and approaches to inserting data use one of these methods under the covers. Most of these will use plain old INSERT.
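As a rough illustration of the first two, issued from .NET with SqlCommand (the connection string, table names, and column names are made up; the fourth method is what SqlBulkCopy issues for you):

using System.Data.SqlClient;

// Hypothetical connection string and table/column names, for illustration only.
using (var conn = new SqlConnection("..."))
{
    conn.Open();

    // 1) Plain INSERT of a query result into an existing table.
    using (var cmd = new SqlCommand(
        "INSERT INTO dbo.TargetTable (Id, Name) " +
        "SELECT Id, Name FROM dbo.SourceTable WHERE IsActive = 1;", conn))
    {
        cmd.ExecuteNonQuery();
    }

    // 2) SELECT .. INTO creates a brand-new table from the query result.
    using (var cmd = new SqlCommand(
        "SELECT Id, Name INTO dbo.NewTable FROM dbo.SourceTable;", conn))
    {
        cmd.ExecuteNonQuery();
    }
}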
BULK INSERT (Transact-SQL): https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql
INSERT (Transact-SQL): https://learn.microsoft.com/en-us/sql/t-sql/statements/insert-transact-sql
INSERT INTO ... SELECT: https://www.w3schools.com/sql/sql_insert_into_select.asp
Hope this will help
I've written a SQL query that looks like this:
SELECT * FROM MY_TABLE WHERE ID=123456789;
When I run it in the Query Analyzer in SQL Server Management Studio, the query never returns; instead, after about ten minutes, I get the following error: System.OutOfMemoryException
My server is Microsoft SQL Server (not sure what version).
SELECT * FROM MY_TABLE; -- returns 44258086 rows
SELECT * FROM MY_TABLE WHERE ID=123456789; -- returns 5 rows
The table has over forty million rows! However, I need to fetch five specific rows!
How can I work around this frustrating error?
Edit: The server suddenly started working fine for no discernible reason, but I'll leave this question open for anyone who wants to suggest troubleshooting steps for anyone else with this problem.
According to http://support.microsoft.com/kb/2874903:
This issue occurs because SSMS has insufficient memory to allocate for large results.
Note: SSMS is a 32-bit process. Therefore, it is limited to 2 GB of memory.
The article suggests trying one of the following:
Output the results as text
Output the results to a file
Use sqlcmd
You may also want to check whether the server needs a service restart; perhaps it has gobbled up all the available memory.
Another suggestion would be to select a smaller subset of columns (if the table has many columns or includes large blob columns).
If you need specific data, use an appropriate WHERE clause. Add more details if you are stuck with this.
Alternatively, write a small application that reads the data with a cursor (or a streaming data reader) and does not try to load the whole result set into memory.
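A minimal sketch of that last suggestion, reusing the query from the question (only the connection string is a placeholder): SqlDataReader streams the result one row at a time, so the client never has to hold all of it in memory.

using System;
using System.Data.SqlClient;

// Hypothetical connection string; the table and ID come from the question.
using (var conn = new SqlConnection("..."))
using (var cmd = new SqlCommand("SELECT * FROM MY_TABLE WHERE ID = @id", conn))
{
    cmd.Parameters.AddWithValue("@id", 123456789);
    conn.Open();

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // Only the current row is held client-side; process it and move on.
            object id = reader["ID"];
            Console.WriteLine(id);
        }
    }
}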
We need to extract 54M rows from one database to another. The columns of the two tables are similar but not exactly the same, so there is some conversion work to do. I've started with a cursor, but is there a better and more performance-friendly way to insert such a big chunk of data?
Performance- and logging-wise, the best options for moving large amounts of data are SSIS or other bulk operations such as BCP export/import.
As far as performance goes, I would suggest the following:
1) Create a stored procedure to do the task; you can call the stored procedure from SSIS.
2) Add a SQL Agent job if necessary.
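If a small .NET utility is an option instead of the cursor, one possible sketch (connection strings, table names, and column names are all assumptions) does the simple conversions in the SELECT and uses SqlBulkCopy column mappings for the renamed columns:

using System.Data.SqlClient;

// Hypothetical connection strings, table names, and column names, for illustration only.
using (var source = new SqlConnection("..."))
using (var dest = new SqlConnection("..."))
{
    source.Open();
    dest.Open();

    // Do the simple conversions (casts, renames, defaults) in the SELECT itself.
    using (var cmd = new SqlCommand(
        "SELECT Id, CAST(Amount AS decimal(18,2)) AS Amount, CreatedOn FROM dbo.OldTable", source))
    {
        cmd.CommandTimeout = 0; // reading 54M rows will take a while

        using (var reader = cmd.ExecuteReader())
        using (var bulk = new SqlBulkCopy(dest))
        {
            bulk.DestinationTableName = "dbo.NewTable";
            bulk.BulkCopyTimeout = 0;
            bulk.BatchSize = 50000;

            // Map the source columns onto the differently named destination columns.
            bulk.ColumnMappings.Add("Id", "LegacyId");
            bulk.ColumnMappings.Add("Amount", "Amount");
            bulk.ColumnMappings.Add("CreatedOn", "CreatedDate");

            bulk.WriteToServer(reader);
        }
    }
}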
I have nearly 7 billion rows of data in memory (List&lt;T&gt; and SortedList&lt;TKey, TValue&gt;) in C#. I want to insert this data into tables in SQL Server. To do this, I define a separate SqlConnection for each collection and turn connection pooling off.
First, I tried to insert the data in connected mode (ExecuteNonQuery). Even though I used Parallel.Invoke and called all the insert methods for the different collections concurrently, it is too slow, and so far I have not been able to finish it (I could not see any difference between sequential and concurrent inserts).
I also tried building a DataTable: I read all the data from the collections once and added it to the DataTable. For SqlBulkCopy I set BatchSize = 10000 and BulkCopyTimeout = 0, but this was also very slow.
How can I insert a huge amount of data into SQL Server fast?
Look into BULK INSERT. The technique is available in various RDBMSs. Basically, you create a text file with one line per record and tell the server to consume that file. This is the fastest approach I can think of; I import 50 million rows in a couple of seconds that way.
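A hedged sketch of that flow from .NET (file path, table name, and connection string are assumptions): write one line per record to a delimited file, then have the server consume it with BULK INSERT. Keep in mind the path is resolved on the SQL Server machine, so it must be reachable from there.

using System.Data.SqlClient;
using System.IO;

// Hypothetical file path, table name, and connection string, for illustration only.
var rows = new[] { (Id: 1, Name: "alpha"), (Id: 2, Name: "beta") }; // stands in for your real data

// 1) One line per record, with a delimiter that never appears in the data.
using (var writer = new StreamWriter(@"\\fileshare\staging\rows.txt"))
{
    foreach (var row in rows)
        writer.WriteLine($"{row.Id}|{row.Name}");
}

// 2) Ask the server to consume the file.
using (var conn = new SqlConnection("..."))
using (var cmd = new SqlCommand(
    @"BULK INSERT dbo.TargetTable
      FROM '\\fileshare\staging\rows.txt'
      WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);", conn))
{
    conn.Open();
    cmd.CommandTimeout = 0;
    cmd.ExecuteNonQuery();
}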
You already discovered SqlBulkCopy, but you say it is slow. That can be for two reasons:
Your batches are too small. Try streaming the rows in with a custom IDataReader that you pass to WriteToServer (or just use bigger DataTables).
Your table has nonclustered indexes. Disable them before the import and rebuild them afterwards.
You can't go faster than bulk import, though.
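A sketch combining both suggestions (connection string, table, and index names are assumptions); a DataTable is used here for brevity, while a custom IDataReader would let you stream without materialising the rows first:

using System.Data;
using System.Data.SqlClient;

// Hypothetical connection string, table, and index names, for illustration only.
using (var conn = new SqlConnection("..."))
{
    conn.Open();

    // 1) Disable the nonclustered index so each inserted row doesn't pay for index maintenance.
    using (var cmd = new SqlCommand("ALTER INDEX IX_BigTable_Value ON dbo.BigTable DISABLE;", conn))
        cmd.ExecuteNonQuery();

    // 2) Load the data with a large batch size. In practice, fill the table (or an IDataReader)
    //    from your in-memory collections.
    var table = new DataTable();
    table.Columns.Add("Id", typeof(long));
    table.Columns.Add("Value", typeof(string));
    table.Rows.Add(1L, "example");

    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.BigTable";
        bulk.BatchSize = 100000;
        bulk.BulkCopyTimeout = 0;
        bulk.WriteToServer(table);
    }

    // 3) Rebuild the index once the load has finished.
    using (var cmd = new SqlCommand("ALTER INDEX IX_BigTable_Value ON dbo.BigTable REBUILD;", conn))
        cmd.ExecuteNonQuery();
}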
I have to INSERT a lot of rows (more than 1,000,000,000) into a SQL Server database. The table has an auto-increment Id, two varchar(80) columns, and a smalldatetime column with GETDATE() as the default value. The last one is just for auditing, but necessary.
I'd like to know the best (fastest) way to INSERT the rows. I've been reading about BULK INSERT, but if possible I'd like to avoid it, because the app does not run on the same server where the database is hosted and I'd like to keep them as isolated as possible.
Thanks!
Diego
Another option would be bcp.
Alternatively, if you're using .NET you can use the SqlBulkCopy class to bulk insert data. I've blogged about its performance, which you may find interesting, as I compared SqlBulkCopy with another way of bulk loading data into SQL Server from .NET (using SqlDataAdapter). In a basic example, loading 100,000 rows took 0.8229 s with SqlBulkCopy vs. 25.0729 s with the SqlDataAdapter approach.
Create an SSIS package that copies the file to the SQL Server machine, and then use a Data Flow task to import the data from the file into the SQL Server database.
There is no faster or more efficient way than BULK INSERT, and when you're dealing with such a large amount of data, do not even think about anything in .NET: thanks to the GC, managing millions of objects in memory causes massive performance degradation.