I have a large .sql file (with 1 million records) which contains INSERT statements.
It is provided by an external system I have no control over.
I have to import this data into my database table. I thought it would be a simple job, but alas, how wrong I was.
I am using PL/SQL Developer from Allround Automations. I went to
Tools -- Import Tables -- SQL Inserts, pointed the exe to sqlldr.exe,
and set the input to my .sql file with the INSERT statements.
But this process is very slow, inserting only around 100 records a minute; I was expecting the whole process to take no more than an hour.
Is there a better way to do this? It sounds simple to just import all the data, but it takes an awfully long time.
P.S.: I am a developer, not a DBA, and not an expert on Oracle, so any help is appreciated.
When running massive numbers of INSERTs you should first drop all indexes on the table, then disable all constraints, then run your INSERT statements. You should also modify your script to include a COMMIT after every 1,000 records or so. Afterwards, re-add your indexes, re-enable all constraints, and gather statistics on the table (DBMS_STATS.GATHER_TABLE_STATS).
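A rough sketch of that sequence in Oracle SQL, assuming a hypothetical table MY_TABLE with one index and one foreign key (the object names are placeholders, not from your schema):

-- Drop indexes and disable constraints before the load (placeholder names)
DROP INDEX my_table_idx1;
ALTER TABLE my_table DISABLE CONSTRAINT my_table_fk1;

-- ... run the .sql file here, edited to COMMIT every 1,000 rows or so ...

-- Recreate indexes, re-enable constraints and refresh statistics afterwards
CREATE INDEX my_table_idx1 ON my_table (some_column);
ALTER TABLE my_table ENABLE CONSTRAINT my_table_fk1;
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_TABLE');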
Best of luck.
Related
How to do performance tuning for a SQL Server table to speed up the inserts?
For example, in an Employee table I have 150,000 records. When I try to insert a few more records (around 20k), it takes 10-15 minutes.
Performance tuning using wait stats is a good approach in your case. Below are a few steps I would take.
Step 1:
Run the insert query.
Step 2:
Open another session and run the query below:
select * from sys.dm_exec_requests
The status and wait_type columns should give you enough info on what your next steps are.
Example:
If the status is blocked (normally inserts won't be blocked), check the blocking query and see why it is blocked.
The above is just an example; there is more info online for any wait type you might encounter.
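For instance, a slightly more targeted check might look like the sketch below (the session id 53 is just a placeholder for the spid running the insert):

-- Inspect the request for the session running the INSERT
SELECT session_id, status, command, wait_type, wait_time, blocking_session_id
FROM sys.dm_exec_requests
WHERE session_id = 53;  -- placeholder spid

-- If blocking_session_id is non-zero, see what the blocker is running
SELECT r.session_id, r.status, t.text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t;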
My suggestion for speeding up the inserts is to do a bulk insert into a temporary table and then a single insert into the final table.
This assumes that the source of the new records is an external file.
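A minimal sketch of that pattern in T-SQL, assuming a hypothetical source file C:\data\employees.csv and a staging table whose columns match the file (all names and the path are placeholders):

-- Stage the file with a bulk load
CREATE TABLE #EmployeeStage (EmployeeId INT, Name VARCHAR(100), Salary DECIMAL(10,2));

BULK INSERT #EmployeeStage
FROM 'C:\data\employees.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

-- Then one set-based insert into the real table
INSERT INTO dbo.Employee (EmployeeId, Name, Salary)
SELECT EmployeeId, Name, Salary
FROM #EmployeeStage;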
In any case, your question leaves lots of important information unexplained:
How are you doing the inserts now? It should not be taking 10-15 minutes to insert 20k records.
How many indexes are on the table?
How large is each record?
What triggers are on the table?
What other operations are taking place on the server?
What is the source of the records being inserted?
Do you have indexed views that use the table?
There are many reasons why inserts could be slow.
Some ideas in addition to checking for locks:
Disable any ON INSERT triggers if they exist and incorporate their logic into your insert. Also disable any indexes on the table and re-enable them after the bulk insert.
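A sketch of both steps in T-SQL, with placeholder trigger and index names:

-- Disable the trigger and non-clustered index before the load
DISABLE TRIGGER trg_Employee_Insert ON dbo.Employee;
ALTER INDEX IX_Employee_Name ON dbo.Employee DISABLE;

-- ... run the bulk insert here ...

-- Re-enable afterwards; a disabled index must be rebuilt to become usable again
ENABLE TRIGGER trg_Employee_Insert ON dbo.Employee;
ALTER INDEX IX_Employee_Name ON dbo.Employee REBUILD;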
I'm writing my graduate work about methods of importing data from a file to SQL Server table. I have created my own program and now I'm comparing it with some standard methods such as bcp, BULK INSERT, INSERT ... SELECT * FROM OPENROWSET(BULK...) etc. My program reads in lines from a source file, parses them and imports them one by one using ordinary INSERTs. The file contains 1 million lines with 4 columns each. And now I have the situation that my program takes 160 seconds while the standard methods take 5-10 seconds.
So the question is: why are BULK operations faster? Do they use special means, or something else? Can you please explain it or give me some useful links?
BULK INSERT can be a minimally logged operation (depending on various parameters like indexes, constraints on the tables, recovery model of the database etc). Minimally logged operations only log allocations and deallocations. In case of BULK INSERT, only extent allocations are logged instead of the actual data being inserted. This will provide much better performance than INSERT.
Compare Bulk Insert vs Insert
The actual advantage is to reduce the amount of data being logged in the transaction log.
In the case of the BULK_LOGGED or SIMPLE recovery model, the advantage is significant.
Optimizing BULK Import Performance
You should also consider reading this answer: Insert into table select * from table vs bulk insert
By the way, there are factors that will influence the BULK INSERT performance:
Whether the table has constraints or triggers, or both.
The recovery model used by the database.
Whether the table into which data is copied is empty.
Whether the table has indexes.
Whether TABLOCK is being specified.
Whether the data is being copied from a single client or copied in parallel from multiple clients.
Whether the data is to be copied between two computers on which SQL Server is running.
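As an illustration of the TABLOCK point, a hedged example (the file path and table name are made up):

-- TABLOCK takes a bulk-update table lock, which is one of the prerequisites
-- for minimal logging under the SIMPLE or BULK_LOGGED recovery model
BULK INSERT dbo.TargetTable
FROM 'C:\data\rows.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);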
I think you can find a lot of articles on it, just search for "why bulk insert is faster". For example this seems to be a good analysis:
https://www.simple-talk.com/sql/performance/comparing-multiple-rows-insert-vs-single-row-insert-with-three-data-load-methods/
Generally, any database has a lot of work to do for a single insert: checking the constraints, maintaining indexes, flushing to disk. This complex operation can be optimized by the database when it handles several rows in one operation instead of calling the engine one row at a time.
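As a trivial illustration of that difference (the table and column are made up), the second form gives the engine one statement to plan, log and commit instead of three:

-- Row by row: three statements, three autocommitted transactions to flush
INSERT INTO dbo.Numbers (n) VALUES (1);
INSERT INTO dbo.Numbers (n) VALUES (2);
INSERT INTO dbo.Numbers (n) VALUES (3);

-- Set-based: one statement, one commit
INSERT INTO dbo.Numbers (n) VALUES (1), (2), (3);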
First of all, inserting row by row is not optimal. See this article on set logic and this article on what's the fastest way to load data into SQL Server.
Second, BULK import is optimized for large loads. This has everything to do with page flushing, writing to the log, indexes and various other things in SQL Server. There's a TechNet article on how you can optimize BULK INSERTs, which sheds some light on how BULK is faster. But I can't link more than twice, so you'll have to Google for "Optimizing Bulk Import Performance".
I need to update a table in SQL Server 2008 along the lines of the MERGE statement: delete, insert, update. The table is 700k rows and I need users to still have read access to it, assuming an isolation level of read committed.
I tried things like ALTER TABLE table SET (LOCK_ESCALATION = DISABLE) to no avail. I tested by doing a SELECT TOP 50000 * from another window; obviously read uncommitted worked :). Is there any way around this without changing the users' isolation level, while retaining an 'all or nothing' transaction behaviour?
My current solution, a cursor that commits in batches of n, may allow users to work but loses the transactional behaviour. Perhaps I could just make the bulk update fast enough to always finish in under 30 seconds (the timeout). The problem is that the users' target DBs are on very slow machines with only 512 MB of RAM. I'm not sure about the processor, but I assume it is really slow, and I don't have access to them at this time!
I created a test that causes an update statement to need to run against all 700k rows:
I tried an update with a LEFT JOIN on my dev box (quite fast) and it took 17 seconds
The MERGE statement took 10 seconds
The FORWARD_ONLY cursor was slower than both
These figures are acceptable on my machine but I would feel more comfortable if I could get the query time down to less than 5 seconds before allowing locks.
Any ideas on preventing any locking on the table/rows or making it faster still?
It sounds like this table may be queried a lot but not updated much. If it really is a true read-only table for everyone else, but you want to update it extremely quickly, you could create a script that uses this method (could even wrap it in a transaction, I believe, although it's probably unnecessary):
1. Make a copy of the table named TABLENAME_COPY
2. Update the copy table
3. Rename the original table to TABLENAME_ORIG
4. Rename the copy table to TABLENAME
You would only experience downtime in between steps 3 and 4, and a scripted table rename would probably be quicker than an update of so many rows.
This does all assume that no one else can update the table while your process is running, but they will still be able to read it fully at any point except between steps 3 and 4.
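A minimal sketch of steps 3 and 4 in T-SQL, using placeholder table names (the swap itself should be very brief):

-- Swap the freshly updated copy in for the original
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.TableName', 'TableName_Orig';
EXEC sp_rename 'dbo.TableName_Copy', 'TableName';
COMMIT TRANSACTION;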
I have a task that will require me to use a transaction to ensure that many inserts are completed or the entire update is rolled back.
I am concerned about the amount of data that needs to be inserted in this transaction and whether this will have a negative effect on the server.
We are looking at about 10,000 records in table1 and 600,000 records in table2.
Is this safe to do in a single transaction?
Have you thought about using a bulk data loader like SSIS or the Data Import Wizard that comes with SQL Server?
The Data Import Wizard is pretty simple.
In Management Studio, right-click the database you want to import data into, then select Tasks and Import Data. Follow the wizard prompts. If a record fails, the whole transaction will fail.
I have loaded millions of records this way (and using SSIS).
It is safe; however, keep in mind that you might be blocking other users during that time. Also take a look at bcp or BULK INSERT to make the inserts faster.
How do we insert about 2 million rows of data into an Oracle database table that has many indexes on it?
I know that one option is disabling the indexes and then inserting the data. Can anyone tell me what the other options are?
Bulk load with presorted data in index key order.
Check SQL*Loader out (especially the paragraph about performance optimization): it is the standard bulk loading utility for Oracle, and it does a good job once you know how to use it (as always with Oracle).
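For reference, a minimal SQL*Loader sketch; the control file, data file, table and column names here are all made up:

-- load_my_table.ctl (hypothetical control file)
LOAD DATA
INFILE 'my_data.csv'
APPEND INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(id, name, created_date DATE "YYYY-MM-DD")

It would be invoked with something like sqlldr userid=scott/tiger control=load_my_table.ctl log=load.log direct=true, where direct=true requests a direct-path load, which is usually the biggest performance win.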
There are many tricks to speed up the insert; below I have written some of them.
If you use sequence.NEXTVAL for the insert, make sure the sequence has a big cache value (1,000 is usually enough).
Drop indexes before the insert and create them afterwards (make sure you get the CREATE scripts of the indexes before dropping them); while creating them you can use the PARALLEL option.
If the target table has FK dependencies, disable them before the insert and enable them again afterwards. If you are sure of your data, you can use the NOVALIDATE option (NOVALIDATE is Oracle syntax; other RDBMS systems probably have a similar option).
If you do an INSERT ... SELECT, you can give a PARALLEL hint for the SELECT statement, and for the INSERT you can use the APPEND hint (direct-path insert). (Direct-path insert is an Oracle concept; other RDBMS systems probably have a similar option.)
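A hedged Oracle sketch combining a few of these tips (all object names are placeholders):

-- Big sequence cache so NEXTVAL doesn't hit the data dictionary constantly
ALTER SEQUENCE my_seq CACHE 1000;

-- Disable the FK for the duration of the load
ALTER TABLE target_table DISABLE CONSTRAINT target_fk1;

-- Direct-path, parallel insert-as-select
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(target_table, 4) */ INTO target_table
SELECT /*+ PARALLEL(source_table, 4) */ *
FROM source_table;
COMMIT;  -- a direct-path insert must be committed before the table is queried again

-- Re-enable the constraint; NOVALIDATE skips re-checking the rows already loaded
ALTER TABLE target_table ENABLE NOVALIDATE CONSTRAINT target_fk1;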
I'm not sure how you are inserting the records, but if you can, insert the data in smaller chunks. In my experience, 50 sets of 20k records is often quicker than 1 x 1,000,000.
Make sure your database files are large enough before you start, to save you from database growth during the insert.
If you are sure about the data, then besides the indexes you can disable referential and constraint checks. You can also lower the transaction isolation level.
All these options come with a price, though. Each option increases your risk of having corrupt data, in the sense that you may end up with null FKs, etc.
As another option, one can use the more advanced and faster Oracle Data Pump utilities (expdp, impdp), available from 10g onward. Oracle still supports the old export/import utilities (exp, imp), though.
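A hedged example of the round trip (the credentials, directory object, dump file and table names are placeholders):

expdp scott/tiger directory=DATA_PUMP_DIR dumpfile=my_table.dmp logfile=exp.log tables=my_table
impdp scott/tiger directory=DATA_PUMP_DIR dumpfile=my_table.dmp logfile=imp.log tables=my_table table_exists_action=append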
Oracle provides us with many choices for data loading, some way faster than others:
Oracle10 Data Pump Oracle import utility
SQL INSERT and MERGE statements
PL/SQL bulk loads using the FORALL PL/SQL operator
SQL*Loader
The pros/cons of each can be found here.
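As an illustration of the PL/SQL bulk-load option, a minimal FORALL sketch (the table and the data are made up):

DECLARE
  TYPE t_ids IS TABLE OF my_table.id%TYPE;
  l_ids t_ids := t_ids();
BEGIN
  -- Build up a collection in memory (here just 1..100000 as dummy data)
  FOR i IN 1 .. 100000 LOOP
    l_ids.EXTEND;
    l_ids(l_ids.COUNT) := i;
  END LOOP;

  -- One context switch to the SQL engine for the whole batch
  FORALL i IN 1 .. l_ids.COUNT
    INSERT INTO my_table (id) VALUES (l_ids(i));

  COMMIT;
END;
/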