How do we insert data about 2 million rows into a oracle database table where we have many indexes on it?
I know that one option is disabling index and then inserting the data. Can anyone tell me what r the other options?
bulk load with presorted data in index key order
Check SQL*Loader out (especially the paragraph about performance optimization) : it is the standard bulk loading utility for Oracle, and it does a good job once you know how to use it (as always with Oracle).
there are many tricks to fasten de insert, below i wrote some of them
if you use sequence.nextval for insert make sure sequence has big cache value (1000 is enough usually)
drop indexes before insert and create afterwards (make sure you get the create scripts of indexes before dropping) while creating you can use parallel option
if target table has fk dependencies disable them before insert and after insert enable again. if you are sure of your data you can use novalidate option (novalidate option is valid for oracle, other rdbms systems probably have similar option)
if you select and insert you can give parallel hint for select statement and for insert you can use append hint (direct-path insert ) (direct-path insert concept is valid for oracle, other rdbms systems probably have similar option)
Not sure how you are inserting the records; if you can; insert the data in smaller chunks. In my experience 50 sets of 20k records is often quicker than 1 x 1000000
Make sure your database files are large enough before you start save you from database growth during the insert ...
If you are sure about the data, besides the index you can disable referential and constraint checks. You can also lower the transaction isolation level.
All these options come with a price, though. Each option increases your risk of having corrupt data in the sense that you may end up with null FK's etc.
As an another option, one can use oracle advanced and faster data pump (expdp, impdp) utilities availability 10 G onward. Though, Oracle still supports old export/import (exp, imp).
Oracle provides us with many choices for data loading, some way faster than others:
Oracle10 Data Pump Oracle import utility
SQL insert and merge
statements PL/SQL bulk loads for the forall PL/SQL operator
SQL*Loader
The pros/cons of each can be found here ..
Related
I would like to completely clear one table in my SQL Server database.
Unfortunately, the table is large (> 90GB). I am going to use the TRUNCATE statement.
The question is whether I should pay attention to something before?
I am also wondering if it will somehow affect the server's disk space (currently about 110 GB free)?
After all the action, SHRINK DATABASE will probably be necessary.
TRUNCATE TABLE is faster and uses fewer system and transaction log resources
than DELETE with no WHERE clause,
but if you need even faster solution, you can create new version of the table (table1), drop the old table, and rename table1 into table.
R
I'm writing my graduate work about methods of importing data from a file to SQL Server table. I have created my own program and now I'm comparing it with some standard methods such as bcp, BULK INSERT, INSERT ... SELECT * FROM OPENROWSET(BULK...) etc. My program reads in lines from a source file, parses them and imports them one by one using ordinary INSERTs. The file contains 1 million lines with 4 columns each. And now I have the situation that my program takes 160 seconds while the standard methods take 5-10 seconds.
So the question is why are BULK operations faster? Do they use special means or something? Can you please explain it or give me some useful links or something?
BULK INSERT can be a minimally logged operation (depending on various
parameters like indexes, constraints on the tables, recovery model of
the database etc). Minimally logged operations only log allocations
and deallocations. In case of BULK INSERT, only extent allocations are
logged instead of the actual data being inserted. This will provide
much better performance than INSERT.
Compare Bulk Insert vs Insert
The actual advantage, is to reduce the amount of data being logged in the transaction log.
In case of BULK LOGGED or SIMPLE recovery model the advantage is significant.
Optimizing BULK Import Performance
You should also consider reading this answer : Insert into table select * from table vs bulk insert
By the way, there are factors that will influence the BULK INSERT performance :
Whether the table has constraints or triggers, or both.
The recovery model used by the database.
Whether the table into which data is copied is empty.
Whether the table has indexes.
Whether TABLOCK is being specified.
Whether the data is being copied from a single client or copied in
parallel from multiple clients.
Whether the data is to be copied between two computers on which SQL
Server is running.
I think you can find a lot of articles on it, just search for "why bulk insert is faster". For example this seems to be a good analysis:
https://www.simple-talk.com/sql/performance/comparing-multiple-rows-insert-vs-single-row-insert-with-three-data-load-methods/
Generally, any database has a lot of work for a single insert: checking the constraints, building indices, flush to disk. This complex operation can be optimized by the database when doing several in one operation, and not calling the engine one by one.
First of all, inserting row for row is not optimal. See this article on set logic and this article on what's the fastest way to load data into SQL Server.
Second, BULK import is optimized for large loads. This has all to do with page flushing, writing to log, indexes and various other things in SQL Server. There's an technet article on how you can optimize BULK INSERTS, this sheds some light on how BULK is faster. But I cant link more than twice, so you'll have to google for "Optimizing Bulk Import Performance".
I have a large .sql file(with 1 Million records) which has insert statements.
this is provided by external system I have no control over.
I have to import this data into my database table, I thought it is a simple job, But Alas how wrong I was.
I am using plsql developer from AllroundAutomations, I went to
Tools -- Import Tables -- SQL Inserts -- pointed exe to sqlldr.exe,
and input to my .sql file with insert statements.
But this process is very slow only inserting around 100 records in a minute, I was expecting this whole process to take not more than an hour.
Is there a better way to do this, sounds simple to just import all data, but it takes hell lot of time.
P.S: I am a developer and not DBA and not an expert on Oracle, so any help appreciated.
When running massive numbers of INSERT's your should first drop all indexes on the table, then disable all constraints, then run your INSERT statements. You should also modify your script to include a COMMIT after every 1000 records or so. Afterwards re-add your indexes, re-enable all constraints, and gather statistics on that table (DBMS_STATS.GATHER_TABLE_STATS).
Best of luck.
I have to load a text file into a database on a daily basis that is about 50MB in size. I am using Perl DBI to load the file using insert statements into a SQL Server. It is not very performant, and I was wondering if there are better/faster ways of loading from DBI into SQL Server.
You should probably use the BULK INSERT statement. No reason you couldn't run that from DBI.
When doing large INSERT/UPDATE operations, it's generally useful to disable any indexes on the target table(s), make the changes, and re-enable the indexes. This way, the indexes only have to be rebuilt once instead of rebuilding them after each INSERT/UPDATE statement runs.
(This can also be applied in a zero-downtime way by copying the original table to an unindexed temp table, doing your work on the temp table, adding indexes, dropping the original table, and renaming the temp table to replace it.)
Another way to speed things up (if not already done) is to use prepared statements and bind-values.
I'm running the following SAS command:
Proc SQL;
Delete From Server003.CustomerList;
Quit;
Which is taking over 8 minutes... when it takes only a few seconds to read that file. What could be cause a delete to take so long and what can I do to make it go faster?
(I do not have access to drop the table, so I can only delete all rows)
Thanks,
Dan
Edit: I also apparently cannot Truncate tables.
This is NOT regular SQL. SAS' Proc SQL does not support the Truncate statement. Ideally, you want to figure out what's going on with the performance of the delete from; but if what you really need is truncate functionality, you could always just use pure SAS and not mess with SQL at all.
data Server003.CustomerList;
set Server003.CustomerList (obs=0);
run;
This effectively performs and operates like a Truncate would. It maintains the dataset/table structure but fails to populate it with data (due to the OBS= option).
Are there a lot of other tables which have foreign keys to this table? If those tables don't have indexes on the foreign key column(s) then it could take awhile for SQL to determine whether or not it's safe to delete the rows, even if none of the other tables actually has a value in the foreign key column(s).
Try adding this to your LIBNAME statement:
DIRECT_EXE=DELETE
According to SAS/ACCESS(R) 9.2 for Relational Databases: Reference,
Performance improves significantly by using DIRECT_EXE=, because the SQL delete statement is passed directly to the DBMS, instead of SAS reading the entire result set and deleting one row at a time.
I would also mention that in general SQL commands run slower in SAS PROC SQL. Recently I did a project and moved the TRUNCATE TABLE statements into a Stored Procedure to avoid the penalty of having them inside SAS and being handled by their SQL Optimizer and surrounding execution shell. In the end this increased the performance of the TRUNCATE TABLE substantially.
It might be slower because disk writes are typically slower than reads.
As for a way around it without dropping/truncating, good question! :)
You also could consider the elegant:
proc sql; create table libname.tablename like libname.tablename; quit;
I will produce a new table with the same name and same meta data of your previous table and delete the old one in the same operation.