sqlldr direct load=true with referential partition options - sql

So I need to do multiple bulk inserts into a table with row-level triggers. I thought it would be a good idea to gather the generated ids first, combine them with my data, and then do a direct=true SQL*Loader load. Normally this would work fine, but the table is partitioned by reference, so SQL*Loader cannot disable the foreign key constraint that would allow me to do the direct load.
Does anyone know of any way around this? My first solution of bulk collecting into a varray and inserting every 100,000 rows went moderately fast, but if I were able to do a direct load, that would be much faster.
ERROR: SQL*Loader-965: Error -1 disabling constraint client_fk on table my_table

The manual implies there is no way to have SQL*Loader use a direct path load but not disable the foreign keys.
But direct-path inserts can work on reference partitioned tables, even with the foreign keys enabled, as I demonstrated in this question and answer.
Convert the process from SQL*Loader to an external table and an INSERT statement. SQL*Loader and external tables use similar mechanisms, so the conversion shouldn't be too difficult. External tables require a little more work: you have to write the INSERT with an APPEND hint, and manually disable and re-enable triggers and perhaps other objects. But that extra control allows loading data quickly with direct-path inserts.
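
A hedged sketch of that conversion; the directory object, file name, and columns (data_dir, my_data.csv, id/client_id/payload) are assumptions, not from the original question:

-- Hypothetical external table over the flat file.
CREATE TABLE my_table_ext (
  id        NUMBER,
  client_id NUMBER,
  payload   VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('my_data.csv')
);

-- Disable the row-level triggers manually, load with a direct-path
-- insert (the FK behind the reference partitioning stays enabled),
-- then re-enable the triggers.
ALTER TABLE my_table DISABLE ALL TRIGGERS;

INSERT /*+ APPEND */ INTO my_table (id, client_id, payload)
SELECT id, client_id, payload FROM my_table_ext;

COMMIT;

ALTER TABLE my_table ENABLE ALL TRIGGERS;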

Related

Get all constraint errors when inserting data from another table

I have a staging table without any constraints in my Azure SQL database (version 12.0.2000.8). I want to insert the data from the staging table into the "real" table, on which multiple constraints are set. When inserting the data, I use a statement of the kind
INSERT INTO <someTable> SELECT <columns> FROM StagingTable;
Currently I only get the first error when constraints are violated. However, for my use case it is important to get all violations, so they can be resolved altogether.
I have tried the TRY...CATCH mechanism; however, it throws on the first error and runs the CATCH clause, but does not continue with the rest of the data. Note that correct data without violations should not be inserted either, so the whole INSERT statement can be rolled back on one error; I just want to see all violations, to be able to correct them all without having to run the INSERT statement multiple times to collect the errors.
EDIT:
The types of constraints that need to be checked are foreign key constraints, NOT NULL constraints, duplicate keys. No casting is done, so no need to check for conversions.
There are a couple of options:
If you want to catch row-level information, you have to go for a cursor or a WHILE loop, try to insert each row in a TRY...CATCH block, and log any error you get.
Alternatively, create another table similar to the main table (say, MainCheckTable) with all the constraints, disable all the constraints, and load the data.
Now you can leverage DBCC CHECKCONSTRAINTS to see all the constraint violations:
USE DBName;
DBCC CHECKCONSTRAINTS(MainCheckTable) WITH ALL_CONSTRAINTS;
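
For completeness, the disable-and-load steps that precede the check might look like this (a sketch; note that NOCHECK only covers foreign key and CHECK constraints, which are also what DBCC CHECKCONSTRAINTS reports on):

-- Assuming MainCheckTable was created with the same constraints:
ALTER TABLE MainCheckTable NOCHECK CONSTRAINT ALL;
INSERT INTO MainCheckTable SELECT <columns> FROM StagingTable;
-- then run the DBCC CHECKCONSTRAINTS command above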
First, don't look at your primary table(s). Look at the related tables, e.g. lookups, and populate these first. Once you have populated the related tables (i.e. satisfied all related constraints), then add the data.
You need to work backwards from the least constrained tables to the most constrained, if that makes sense.
You should check that your related tables have the required reference values/fields that you intend to insert. This is easy to do, since you already have a staging table.

monetdb - copy into from...requires tables without indices

I get this error with MonetDB when I try to load .tbl data into tables that have a primary key and a foreign key. What's wrong?
This is the command:
COPY INTO monet.CUSTOMER FROM '/home/nicola/Scrivania/ssb-dbgen-master/1gb/customer.tbl' USING DELIMITERS '|', '|\n' LOCKED;
It is always good to bulk-load into tables with (foreign) keys disabled. You can add them after the load with ALTER statements.
See https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData
Another part of MonetDB's documentation says: "WARNING It is advised to add integrity constraints to the table after the file has been loaded. The ALTER statements perform bulk integrity checking and perform these checks often more efficiently." https://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto
Generally, for bulk loading into an existing table, it is advised to drop the indexes/foreign keys/other constraints, load the table, and then recreate the indexes/foreign keys/other constraints.
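
A sketch of that order of operations; the key columns and referenced table (customer_pk, c_custkey, c_nationkey, monet.NATION) are hypothetical, not taken from the SSB schema:

-- Create the table without keys, bulk-load, then add the constraints.
COPY INTO monet.CUSTOMER FROM '/home/nicola/Scrivania/ssb-dbgen-master/1gb/customer.tbl'
  USING DELIMITERS '|', '|\n';
ALTER TABLE monet.CUSTOMER ADD CONSTRAINT customer_pk PRIMARY KEY (c_custkey);
ALTER TABLE monet.CUSTOMER ADD CONSTRAINT customer_fk FOREIGN KEY (c_nationkey)
  REFERENCES monet.NATION;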

Can I create Foreign Keys across Databases?

We have 2 databases - DB1 & DB2.
Can I create a table in DB1 that has a relation with one of the tables in DB2?
In other words, can I have a Foreign Key in my table from another database?
I connect to these databases with different users.
Any ideas?
Right now, I receive the error:
ORA-00942: table or view does not exist
No, Oracle does not allow you to create a foreign key constraint that references a table via a database link. You would have to use triggers to enforce the integrity.
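
A hypothetical sketch of the trigger-based approach, assuming a link named remote_db and made-up table/column names:

-- Reject child rows whose parent is missing in the remote master table.
CREATE OR REPLACE TRIGGER child_master_check
  BEFORE INSERT OR UPDATE OF master_id ON child_table
  FOR EACH ROW
DECLARE
  l_exists NUMBER;
BEGIN
  SELECT COUNT(*) INTO l_exists
  FROM master_table@remote_db
  WHERE id = :NEW.master_id;
  IF l_exists = 0 THEN
    RAISE_APPLICATION_ERROR(-20001, 'Parent row not found in remote master_table');
  END IF;
END;
/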
One way to deal with this would be to create a materialized view of the master table on the local database, then create the integrity constraint pointing to the MV.
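
A minimal sketch of the MV approach, assuming the same hypothetical names and a materialized view log on the master table (required for fast refresh):

-- The MV is created WITH PRIMARY KEY by default, so the master's
-- primary key carries over to the MV's container table.
CREATE MATERIALIZED VIEW master_mv
  REFRESH FAST
  AS SELECT id, name FROM master_table@remote_db;

-- Point the local FK at the materialized view.
ALTER TABLE child_table ADD CONSTRAINT child_master_fk
  FOREIGN KEY (master_id) REFERENCES master_mv (id);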
That works. But it can lead to some problems. First, if you ever need to do a complete refresh of the materialized view, you'll need to disable the constraint before doing so. Otherwise, Oracle won't be able to delete the rows in the MV before bringing in the new rows.
Second, you may run into some timing delays. For example, say you add a record to the master table on the remote site. Then you want to add a child record to the local table. But the MV is set to refresh daily and that hasn't happened yet. You'll get a foreign key violation, simply because the MV hasn't refreshed.
If you go this route, your safest approach is to set the MV to fast refresh on commit of the master table. That'll mean keeping a DB Link open nearly all the time. And you'll have admin work to do if you ever need to do a complete refresh.
All in all, we've generally found that a trigger is easier. In some cases, we've simply defined the FK in our logical model but implemented it manually by setting up a daily job that will check for violations and alert staff. Of course, we're pretty careful so those alerts are exceedingly rare.

deleting a large number of rows from a table

We have a requirement to delete rows on the order of millions from multiple tables as a batch job (note that we are not deleting all the rows; we are deleting based on a timestamp stored in an indexed column). Obviously a normal DELETE takes forever (because of logging, referential constraint checking, etc.). I know the LUW world has ALTER TABLE NOT LOGGED INITIALLY, but I can't seem to find an equivalent SQL statement for DB2 v8 on z/OS. Does anyone have any ideas on how to do this really fast? Also, any ideas on how to avoid the referential checks when deleting the rows? Please let me know.
In the past I have solved this kind of problem by exporting the data and re-loading it with a replace style command. For example:
EXPORT TO myfile.ixf OF IXF
SELECT *
FROM my_table
WHERE last_modified < CURRENT TIMESTAMP - 30 DAYS;
Then you can LOAD it back in, replacing the old stuff.
LOAD FROM myfile.ixf OF IXF
REPLACE INTO my_table
NONRECOVERABLE INDEXING MODE INCREMENTAL;
I'm not sure whether this will be faster or not for you (probably it depends on whether you're deleting more than you're keeping).
Do the foreign keys already have indexes as well?
How do you have your delete action set: CASCADE, SET NULL, or NO ACTION?
Use SET INTEGRITY to temporarily disable constraints during the batch process.
http://www.ibm.com/developerworks/data/library/techarticle/dm-0401melnyk/index.html
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r
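
A hedged sketch of how that combines with the LOAD approach above (DB2 LUW syntax, as in the linked articles; table and file names reuse the earlier example):

-- LOAD REPLACE leaves a table with FK or check constraints in SET
-- INTEGRITY PENDING state; the checks are then done once, in bulk.
LOAD FROM myfile.ixf OF IXF REPLACE INTO my_table NONRECOVERABLE;
SET INTEGRITY FOR my_table IMMEDIATE CHECKED;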
We modified the tablespace so that the lock would occur at the tablespace level instead of at the page level. Once we changed that, DB2 only required one lock to do the DELETE and we didn't have any issues with locking. As for the logging, we just asked the customer to be aware of the amount of logging required (as there did not seem to be a way around the logging issue). As for the constraints, we just dropped and recreated them after the delete.
Thanks all for your help.

How to efficiently insert data into index-rich oracle db?

How do we insert about 2 million rows into an Oracle database table that has many indexes on it?
I know that one option is disabling the indexes and then inserting the data. Can anyone tell me what the other options are?
Bulk load with presorted data in index key order.
Check out SQL*Loader (especially the paragraph about performance optimization): it is the standard bulk-loading utility for Oracle, and it does a good job once you know how to use it (as always with Oracle).
There are many tricks to speed up the insert; below are some of them (see the combined sketch after this list):
If you use sequence.nextval for the insert, make sure the sequence has a big cache value (1000 is usually enough).
Drop indexes before the insert and create them afterwards (make sure you save the CREATE scripts of the indexes before dropping them). While creating them, you can use the PARALLEL option.
If the target table has FK dependencies, disable them before the insert and enable them again afterwards. If you are sure of your data, you can use the NOVALIDATE option (NOVALIDATE is Oracle syntax; other RDBMSs probably have a similar option).
If you insert from a SELECT, you can give a PARALLEL hint for the SELECT statement and an APPEND hint for the INSERT, i.e. a direct-path insert (direct-path insert is an Oracle concept; other RDBMSs probably have a similar option).
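
A hedged sketch combining these tricks; all object names (my_seq, target_table, source_table, target_fk) are hypothetical:

ALTER SEQUENCE my_seq CACHE 1000;

ALTER TABLE target_table DISABLE CONSTRAINT target_fk;

-- Direct-path insert (APPEND) with a parallel read of the source.
INSERT /*+ APPEND */ INTO target_table (id, col1, col2)
SELECT /*+ PARALLEL(s, 4) */ my_seq.NEXTVAL, s.col1, s.col2
FROM source_table s;
COMMIT;

-- Re-enable without re-checking existing rows, if the data is trusted.
ALTER TABLE target_table ENABLE NOVALIDATE CONSTRAINT target_fk;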
Not sure how you are inserting the records; if you can, insert the data in smaller chunks. In my experience, 50 sets of 20k records is often quicker than 1 x 1,000,000.
Make sure your database files are large enough before you start, to save yourself from database file growth during the insert.
If you are sure about the data, then besides the indexes you can disable referential and constraint checks. You can also lower the transaction isolation level.
All these options come with a price, though. Each option increases your risk of having corrupt data, in the sense that you may end up with invalid FKs, etc.
As another option, you can use Oracle's more advanced and faster Data Pump utilities (expdp, impdp), available from 10g onward. Oracle still supports the old export/import (exp, imp) as well.
Oracle provides us with many choices for data loading, some way faster than others:
Oracle10 Data Pump import utility
SQL INSERT and MERGE statements
PL/SQL bulk loads using the FORALL operator
SQL*Loader
The pros and cons of each can be found here.
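
As an illustration of the PL/SQL bulk-load option, a minimal FORALL sketch; the table names (my_table, staging_table) and the 10,000-row batch size are assumptions:

-- Fetch from the staging table in chunks and bulk-bind the inserts.
DECLARE
  TYPE row_tab IS TABLE OF my_table%ROWTYPE;
  l_rows row_tab;
  CURSOR c IS SELECT * FROM staging_table;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_rows LIMIT 10000;
    EXIT WHEN l_rows.COUNT = 0;
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO my_table VALUES l_rows(i);
  END LOOP;
  CLOSE c;
  COMMIT;
END;
/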