Error while inserting a large amount of data using INSERT statements in SQL Server 2008

I am trying to insert records into a table from a file that contains a large amount of data.
File description:
Size: 65.0 MB
Record count: 10,000
My sample data:
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
.......
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
INSERT INTO tbldata(col1,col2,col3)values(col1,col2,col3)
GO
.......
... and so on, up to 10,000 rows.
ERROR:
Exception of type 'System.OutOfMemoryException' was thrown.(mscorlib)
What I tried:
I checked this answer:
Under SQL Server\Properties\Memory there is a setting for Minimum Memory Per Query. You can raise this number temporarily to help increase the number of records between the GO statements. In my case I raised it to 5000 (10000 caused a system out-of-memory error, not good), so I settled for 5000. After a few tests I found that I could now import about 20,000 rows, so I placed a GO statement every 20,000 rows (this took about 10 minutes) and I was able to import over 200,000 rows in one query.

The maximum batch size for SQL Server 2005 is 65,536 * Network Packet Size (NPS), where NPS is usually 4 KB. That works out to 256 MB. That would mean that your insert statements average 5.8 KB each. That doesn't seem right, but maybe there are extraneous spaces or something unusual in there.
My first suggestion would be to put a "GO" statement after every INSERT statement. This will break your single batch of 45,000 INSERT statements into 45,000 separate batches, which should be easier to digest. Be careful: if one of those inserts fails, you may have a hard time finding the culprit, so you might want to protect yourself with a transaction. You can add those statements quickly if your editor has a good search-and-replace (one that lets you search for and replace return characters like \r\n) or a macro facility.
The second suggestion is to use a wizard to import the data straight from Excel. The wizard builds a little SSIS package for you behind the scenes and then runs it. It won't have this problem.
Reference: this answer to "Out of memory exception".
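To make the quoted suggestion concrete, here is a minimal, hypothetical T-SQL sketch of batching the inserts with GO and protecting each batch with a transaction. tbldata and its columns come from the question; the values and the batch boundaries are placeholders only:
-- One smaller batch per GO; the transaction lets you roll a failed batch back cleanly.
BEGIN TRANSACTION;
INSERT INTO tbldata (col1, col2, col3) VALUES ('a1', 'b1', 'c1');
INSERT INTO tbldata (col1, col2, col3) VALUES ('a2', 'b2', 'c2');
-- ... more INSERTs for this batch ...
COMMIT TRANSACTION;
GO
BEGIN TRANSACTION;
INSERT INTO tbldata (col1, col2, col3) VALUES ('a3', 'b3', 'c3');
-- ... next batch ...
COMMIT TRANSACTION;
GO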

Related

What is the best way to execute 100k insert statements?

I have created a set of 100k insert queries to generate data in multiple Oracle tables for my performance testing. What is the best way to execute these?
In the past, I've tried tools like Oracle SQL Developer and Toad. However, I'm not sure they can handle this large a volume.
Simple INSERT statements like:
INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
VALUES (5000001, 63, 1, '91111111', 'N/A', 107);
Inserting 100,000 rows with SQL statements is fine. It's not a huge amount of data and there are a few simple tricks that can help you keep the run time down to a few seconds.
First, make sure that your tool is not displaying something for each statement. Copying and pasting the statements into a worksheet window would be horribly slow, but saving the statements into a SQL*Plus script and running that script can be fast. Use the real SQL*Plus client if possible. That program is available on almost any system and is good at running small scripts.
If you have to use SQL Developer, save the 100K statements in a text file, and then run this as a script (F5). This method took 45 seconds on my PC.
set feedback off
@C:\temp\test1.sql
Second, batch the SQL statements to eliminate the per-statement overhead. You don't have to batch all of them; batching 100 statements at a time is enough to remove 99% of the overhead. For example, generate one thousand statements like this:
INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual union all
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual union all
...
select 5000001, 63, 1, '91111111', 'N/A', 107 from dual;
Save that in a text file, run it the same way in SQL Developer (F5). This method took 4 seconds on my PC.
set feedback off
@C:\temp\test1.sql
If you can't significantly change the format of the INSERT statements, you can simply wrap every 100 lines in a BEGIN ... END; / block, as in the sketch below. That will pass 100 statements at a time to the server and significantly reduce the network overhead.
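For instance, a minimal sketch of that wrapping, reusing the SELLING_CODE insert from the question (the values are just the question's sample row repeated as placeholders):
BEGIN
  INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
  VALUES (5000001, 63, 1, '91111111', 'N/A', 107);
  INSERT INTO SELLING_CODE (SELC_ID, VALC_ID, PROD_ID, SELC_CODE, SELC_MASK, VALC_ID_STATUS)
  VALUES (5000002, 63, 1, '91111111', 'N/A', 107);
  -- ... roughly 100 INSERT statements per block ...
END;
/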
For even faster speeds, run the script in regular SQL*Plus. On my PC it only takes 2 seconds to load the 100,000 rows.
For medium-sized data like this it's helpful to keep the convenience of SQL statements. And with a few tricks you can get the performance almost the same as a binary format.

BigQuery data using SQL "INSERT INTO" is gone after some time

Today I noticed another strange behaviour in BigQuery.
I ran a UDF in standard SQL in the BQ web UI:
CREATE TEMPORARY FUNCTION ...
INSERT INTO projectid.dataset.inserttable...
All seems good: the results of the UDF SQL are inserted into the insert table correctly, which I can tell from "Number of rows". But the table size is not correct; it still shows the table size from before the insert query ran. Furthermore, I found that all the inserted rows are gone an hour later.
Some more info I found: when I run a "DELETE FROM inserttable WHERE true" or a "SELECT ...", the deleted number of rows and the table size do seem correct for the inserted data. I just cannot preview the insert table correctly in the web UI.
So I am guessing the "Details" or "Preview" info for the table has a time delay? Do you have any idea about this behaviour?
The preview may have a delay, so SELECT * FROM YourTable; will give the most up-to-date results, or you can use COUNT(*) just to verify that the number of rows is correct. You can think of it as being similar to streaming, if you have tried that, where some rows may be in the streaming buffer for a while before they make it into regular storage.
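For example, a quick verification query along those lines; `project.dataset.inserttable` is a placeholder for the actual table from the question:
-- Queries the table directly, so it reflects rows the Preview pane may not show yet.
SELECT COUNT(*) AS row_count
FROM `project.dataset.inserttable`;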

How to solve an S-IX deadlock without using snapshot isolation?

OK, I have a weird problem: I have two server-side APIs. One selects data from table A; the other inserts a new record into table B using part of that data and the PK from the first one. Under normal use they should not conflict with each other.
Somehow, on my SQL monitor I detected someone calling my select and insert functions less than 0.005 seconds apart, which caused an S-IX deadlock. I searched the internet and found a solution that told me to enable snapshot isolation. But when I tried it on my test DB (total size about 223 MB), the ALTER DATABASE command showed no sign of finishing after an hour of execution, so running it on the production DB (whose data size is bigger than the test one) with such long downtime is intolerable.
So my question is: does anyone know another way to solve the S-IX deadlock (without lowering throughput)?
P.S.: My DB is SQL Server 2008.
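A hedged aside on the symptom described above: ALTER DATABASE typically waits on existing connections and open transactions, so the long run time may be blocking rather than work proportional to database size. A minimal sketch that forces the change through (it rolls back in-flight transactions, so it needs a quiet maintenance window; MyDb is a placeholder name):
-- Kicks out open transactions so the ALTER does not wait on them indefinitely.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;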

Speed up Python executemany

I'm inserting data from one database to another, so I have 2 connections (Conn1 and Conn2). Below is the code (using pypyodbc).
import pypyodbc

# Conn1/Conn2 and their cursors are created in setup code omitted from the question.
Conn1_Query = "SELECT column FROM Table"
Conn1_Cursor.execute(Conn1_Query)
Conn1_Data = Conn1_Cursor.fetchall()
Conn1_array = []
for row in Conn1_Data:
    Conn1_array.append(row)
The above part runs very quickly.
stmt = "INSERT INTO TABLE(column) values (?)"
Conn2_Cursor.executemany(stmt, Conn1_array)
Conn2.commit()
This part is extremely slow. I've also tried a for loop that inserts one row at a time using cursor.execute, but that is also very slow. What am I doing wrong, and is there anything I can do to speed it up? Thanks for taking a look.
I should also add that the Conn1 data is only ~50k rows. I have some more setup code at the beginning that I didn't include because it's not pertinent to the question. The insert takes about 15 minutes; as a comparison, writing the output to a CSV file takes about 25 seconds.
Yes, executemany under pypyodbc sends separate INSERT statements for each row. It acts just the same as making individual execute calls in a loop. Given that pypyodbc is no longer under active development, that is unlikely to change.
However, if you are using a compatible driver like "ODBC Driver xx for SQL Server" and you switch to pyodbc, then you can use its fast_executemany option to speed up the inserts significantly. See this answer for more details.
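A minimal sketch of that pyodbc approach, assuming "ODBC Driver 17 for SQL Server" is installed; the connection string details are placeholders, and the table/column names are carried over from the question:
import pyodbc

# Destination connection (placeholder server/database details).
conn2 = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=server_name;DATABASE=db_name;Trusted_Connection=yes;"
)
cursor = conn2.cursor()
cursor.fast_executemany = True  # send parameter batches instead of one round trip per row

stmt = "INSERT INTO TABLE(column) values (?)"
cursor.executemany(stmt, Conn1_array)  # Conn1_array built as in the question
conn2.commit()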

Is there a maximum number of inserts that can be run in a batch SQL script?

I have a series of simple "INSERT INTO" type statements, but after running about 3 or 4 of them the script stops and I get empty sets when I try selecting from the appropriate tables. Aside from my specific code, I wonder whether there is an ideal way of running multiple insert-type queries.
Right now I just have a text file saved as a.sql with normal SQL commands separated by ";".
No, there is not. However, if it stops after 3 or 4 inserts, it's a good bet there's an error in the 3rd or 4th insert. Depending on which SQL engine you use, there are different ways of making it report errors during and after operations.
Additionally, if you have lots of inserts, it's a good idea to wrap them inside a transaction. This basically buffers all the insert commands until the engine sees the end of the transaction, and then commits everything to your table. That way, if something goes wrong, your database doesn't get polluted with data that then has to be deleted again. More importantly, every insert run outside a transaction counts as its own transaction, which makes the inserts really slow; doing 100 inserts inside a transaction can be as fast as doing two or three standalone inserts.
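As a minimal sketch of that idea (the table and values are placeholders, and the exact transaction keywords vary slightly by engine, e.g. START TRANSACTION in MySQL versus BEGIN TRANSACTION in SQL Server):
START TRANSACTION;
INSERT INTO my_table (col1, col2) VALUES (1, 'a');
INSERT INTO my_table (col1, col2) VALUES (2, 'b');
-- ... the rest of the inserts ...
COMMIT;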
Maximum Capacity Specifications for SQL Server
Max Batch size = 65,536 * Network Packet Size
However, I doubt that the max batch size is your problem.