How many inserts can you have in a SQL transaction? (SQL Server 2005)

I have a task that will require using a transaction to ensure that many inserts either all complete or the entire update is rolled back.
I am concerned about the amount of data that needs to be inserted in this transaction and whether this will have a negative effect on the server.
We are looking at about 10,000 records in table1 and 600,000 records in table2.
Is this safe to do in a single transaction?

Have you thought about using a bulk data loader like SSIS or the Data Import Wizard that comes with SQL Server?
The Data Import Wizard is pretty simple.
In Management Studio, right-click the database you want to import data into, then select Tasks and Import Data, and follow the wizard prompts. If a record fails, the whole transaction will fail.
I have loaded millions of records this way (and using SSIS).

It is safe; however, keep in mind that you might be blocking other users during that time. Also take a look at bcp or BULK INSERT to make the inserts faster.
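For reference, a minimal sketch of wrapping both loads in a single transaction with error handling; the table, column, and staging names below are placeholders, not from the question:

    SET XACT_ABORT ON;  -- any runtime error aborts and rolls back the whole transaction

    BEGIN TRY
        BEGIN TRANSACTION;

        INSERT INTO dbo.Table1 (Col1, Col2)
        SELECT Col1, Col2 FROM dbo.Staging1;

        INSERT INTO dbo.Table2 (Col1, Col2)
        SELECT Col1, Col2 FROM dbo.Staging2;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;

        -- re-raise so the caller sees the failure (RAISERROR works on SQL Server 2005)
        DECLARE @ErrMsg NVARCHAR(2048);
        SET @ErrMsg = ERROR_MESSAGE();
        RAISERROR(@ErrMsg, 16, 1);
    END CATCH;

Roughly 70,000 rows is well within what a single transaction can handle; the main costs are transaction log space and the locks held until the COMMIT.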

Transaction rollback VS Delete Records

Looking for some insights on using transactions or DELETE queries when a subsequent request fails. In brief, in my application I'm inserting into two tables by calling two stored procedures, and the inserted data is then uploaded to two REST APIs. If either REST API call fails, I have to roll back the data entered into the database.
So which approach is suitable: using a SQL transaction, or deleting the inserted records through a database procedure?
This is an ideal situation to use a transaction. How do you know?
Let's say you insert some rows, then make the API call, then try to delete the inserted rows. What happens in that case?
The inserted rows are already readable (even without dirty reads enabled) - they are just normal rows in the database. So every query made before you finish your request will see these rows as well.
And what happens if you fail to delete the rows? Exactly: they just stay in the database, and now you have improper data. Bad.
Use the transaction approach: start a transaction and commit it only once the API calls have finished. This way you ensure that your database contains proper data at all times.
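For illustration only, a sketch of that pattern from the calling side, assuming two hypothetical stored procedure names and with the REST calls happening in application code before the COMMIT:

    BEGIN TRANSACTION;

    EXEC dbo.usp_InsertIntoTable1 @SomeParam = 1;  -- first insert procedure (hypothetical name)
    EXEC dbo.usp_InsertIntoTable2 @SomeParam = 1;  -- second insert procedure (hypothetical name)

    -- ... application code calls the two REST APIs here ...

    -- if both API calls succeeded:
    COMMIT TRANSACTION;

    -- if either API call failed:
    -- ROLLBACK TRANSACTION;

The trade-off is that locks are held until the COMMIT, so keep the window between BEGIN TRANSACTION and the API calls as short as possible.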

Data flow insert lock

I have an issue with my data flow task locking. The task compares a couple of tables from the same server, and the result is inserted into one of the tables being compared. The table being inserted into is compared using a NOT EXISTS clause.
When performing a fast load, the task freezes without errors; when doing a regular insert, the task gives a deadlock error.
I have two other tasks that perform the same action on the same table and they work fine, but the amount of information being inserted is a lot smaller. I am not running these tasks in parallel.
I am considering using the NOLOCK hint to get around this because this is the only task that writes to a certain table partition; however, I am only coming to this conclusion because I cannot figure out anything else, aside from using a temp table or a hashed anti join.
You probably have a so-called deadlock situation. Your Data Flow Task (DFT) has two separate connection instances to the same table: the first connection instance runs the SELECT and places a shared lock on the table, the second runs the INSERT and places a page or table lock.
A few words on the possible cause. An SSIS DFT reads table rows and processes them in batches. When the number of rows is small, the read completes within a single batch, and the shared lock is gone by the time the insert takes place. When the number of rows is substantial, SSIS splits them into several batches and processes them sequentially. This allows the steps following the DFT data source to execute before the data source has finished reading.
This design - reading and writing the same table in the same Data Flow - is not good because of the possible locking issue. Ways to work around it:
Move all the DFT logic into a single INSERT statement and get rid of the DFT. This might not be possible.
Split the DFT: move the data into an intermediate table first, and then move it into the target table with a subsequent DFT or SQL command. An additional table is needed.
Enable Read Committed Snapshot Isolation (RCSI) on the database and use Read Committed for the SELECT. Applicable to MS SQL databases only.
The most universal way is the second, with an additional table. The third is for MS SQL only; a sketch of the first and third options follows.
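A hedged sketch of the first and third options; the table, column, and database names are placeholders:

    -- Option 1: replace the Data Flow with a single set-based statement
    INSERT INTO dbo.TargetTable (KeyCol, ValueCol)
    SELECT s.KeyCol, s.ValueCol
    FROM dbo.SourceTable AS s
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.TargetTable AS t
                      WHERE t.KeyCol = s.KeyCol);

    -- Option 3: enable RCSI so readers use row versions instead of shared locks
    -- (needs exclusive access to the database while the option is changed)
    ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON;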

Why Bulk Import is faster than bunch of INSERTs?

I'm writing my graduate work on methods of importing data from a file into a SQL Server table. I have created my own program and I am now comparing it with some standard methods such as bcp, BULK INSERT, INSERT ... SELECT * FROM OPENROWSET(BULK...), etc. My program reads lines from a source file, parses them, and imports them one by one using ordinary INSERTs. The file contains 1 million lines with 4 columns each. And now I am in the situation where my program takes 160 seconds while the standard methods take 5-10 seconds.
So the question is: why are the BULK operations faster? Do they use special means or something? Can you please explain it or give me some useful links?
BULK INSERT can be a minimally logged operation (depending on various parameters like indexes, constraints on the tables, recovery model of the database, etc.). Minimally logged operations only log allocations and deallocations. In the case of BULK INSERT, only extent allocations are logged instead of the actual data being inserted. This will provide much better performance than INSERT.
Compare Bulk Insert vs Insert
The actual advantage is to reduce the amount of data written to the transaction log.
Under the BULK_LOGGED or SIMPLE recovery model the advantage is significant.
Optimizing BULK Import Performance
You should also consider reading this answer: Insert into table select * from table vs bulk insert
By the way, there are factors that will influence BULK INSERT performance:
Whether the table has constraints or triggers, or both.
The recovery model used by the database.
Whether the table into which data is copied is empty.
Whether the table has indexes.
Whether TABLOCK is being specified.
Whether the data is being copied from a single client or copied in parallel from multiple clients.
Whether the data is to be copied between two computers on which SQL Server is running.
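As a rough illustration of how those factors show up in practice, a hedged BULK INSERT sketch; the file path, table name, and terminators are placeholders:

    BULK INSERT dbo.TargetTable
    FROM 'C:\data\import.csv'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        TABLOCK,             -- a table lock is one of the prerequisites for minimal logging
        BATCHSIZE = 100000   -- commits in batches to keep each transaction small
    );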
I think you can find a lot of articles on this; just search for "why bulk insert is faster". For example, this seems to be a good analysis:
https://www.simple-talk.com/sql/performance/comparing-multiple-rows-insert-vs-single-row-insert-with-three-data-load-methods/
Generally, any database does a lot of work for a single insert: checking constraints, maintaining indexes, flushing to disk. The database can optimize this work when several rows are handled in one operation, instead of the engine being called for each row one by one.
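To make that concrete, compare one engine call per row with a single set-based statement (table names are placeholders):

    -- Row by row: parsing, constraint checks, index maintenance and log writes per statement
    INSERT INTO dbo.Target (Col1, Col2) VALUES (1, 'a');
    INSERT INTO dbo.Target (Col1, Col2) VALUES (2, 'b');
    -- ... repeated a million times ...

    -- Set-based: the engine can check constraints, maintain indexes and write to the log in bulk
    INSERT INTO dbo.Target (Col1, Col2)
    SELECT Col1, Col2 FROM dbo.Staging;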
First of all, inserting row by row is not optimal. See this article on set logic and this article on what's the fastest way to load data into SQL Server.
Second, BULK import is optimized for large loads. This has everything to do with page flushing, writing to the log, indexes, and various other things in SQL Server. There's a TechNet article on how you can optimize bulk inserts, which sheds some light on how BULK is faster. But I can't link more than twice, so you'll have to google for "Optimizing Bulk Import Performance".

Import in PL/SQL Developer using sqlldr is very slow

I have a large .sql file (with 1 million records) containing INSERT statements.
It is provided by an external system I have no control over.
I have to import this data into my database table. I thought it would be a simple job, but alas, how wrong I was.
I am using PL/SQL Developer from Allround Automations. I went to
Tools -- Import Tables -- SQL Inserts -- pointed the exe to sqlldr.exe,
and set the input to my .sql file with the INSERT statements.
But this process is very slow, inserting only around 100 records a minute; I was expecting the whole process to take no more than an hour.
Is there a better way to do this? It sounds simple to just import all the data, but it takes a hell of a lot of time.
P.S.: I am a developer, not a DBA, and not an expert on Oracle, so any help is appreciated.
When running massive numbers of INSERTs, you should first drop all indexes on the table, then disable all constraints, then run your INSERT statements. You should also modify your script to include a COMMIT after every 1,000 records or so. Afterwards, re-add your indexes, re-enable all constraints, and gather statistics on that table (DBMS_STATS.GATHER_TABLE_STATS).
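As an illustration of that sequence in Oracle; the index, constraint, and column names below are hypothetical:

    -- before the load: drop indexes and disable constraints
    DROP INDEX my_table_idx1;
    ALTER TABLE my_table DISABLE CONSTRAINT my_table_fk1;

    -- run the INSERT script, adding COMMIT; after every 1,000 rows or so

    -- after the load: re-create indexes, re-enable constraints, gather statistics
    CREATE INDEX my_table_idx1 ON my_table (col1);
    ALTER TABLE my_table ENABLE CONSTRAINT my_table_fk1;
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_TABLE');
    END;
    /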
Best of luck.

Minimally Logged Insert Into

I have an INSERT statement that is eating a hell of a lot of log space, so much so that the hard drive is actually filling up before the statement completes.
The thing is, I really don't need this to be logged as it is only an intermediate data upload step.
For argument's sake, let's say I have:
Table A: Initial upload table (populated using bcp, so no logging problems)
Table B: Populated using INSERT INTO B from A
Is there a way that I can copy between A and B without anything being written to the log?
P.S. I'm using SQL Server 2008 with the simple recovery model.
From Louis Davidson, Microsoft MVP:
There is no way to insert without logging at all. SELECT INTO is the best way to minimize logging in T-SQL; using SSIS you can do the same sort of light logging using Bulk Insert.
From your requirements, I would probably use SSIS: drop all constraints, especially unique and primary key ones, load the data in, and add the constraints back. I load about 100 GB in just over an hour like this, with fairly minimal overhead. I am using the BULK_LOGGED recovery model, which just logs the existence of new extents during the load, and you can remove them later.
The key is to start with barebones tables, and it just screams. Building the indexes once at the end leaves you with no indexes to maintain during the load - just one index build per index.
If you don't want to use SSIS, the point still applies: drop all of your constraints and use the BULK_LOGGED recovery model. This greatly reduces the logging done by INSERT INTO statements and thus should solve your issue.
http://msdn.microsoft.com/en-us/library/ms191244.aspx
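A minimal sketch of what that looks like, using the A and B tables from the question; the TABLOCK variant assumes B is an empty heap, and minimal logging also depends on the recovery model:

    -- SELECT INTO creates B and can be minimally logged under SIMPLE or BULK_LOGGED recovery
    SELECT *
    INTO dbo.B
    FROM dbo.A;

    -- alternative for an existing, empty, unindexed B (SQL Server 2008):
    INSERT INTO dbo.B WITH (TABLOCK)
    SELECT * FROM dbo.A;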
Upload the data into tempdb instead of your database, and do all the intermediate transformations in tempdb. Then copy only the final data into the destination database. Use batches to minimize individual transaction size. If you still have problems, look into deploying trace flag 610, see The Data Loading Performance Guide and Prerequisites for Minimal Logging in Bulk Import:
Trace Flag 610
SQL Server 2008 introduces trace flag 610, which controls minimally logged inserts into indexed tables.
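If you try the trace flag, it can be turned on server-wide like this (test the effect on your load first; the behaviour depends on the prerequisites in the linked guide):

    DBCC TRACEON (610, -1);   -- enable minimally logged inserts into indexed tables
    -- ... run the load ...
    DBCC TRACEOFF (610, -1);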