How to roll back a DB2 INGEST statement for malformed data

I have a Bash shell script that runs a DB2 SQL file. The job of this SQL file is to completely replace the contents of a database table with the contents of an input file.
However, I also need that database table to have its contents preserved if errors are discovered in the ingested file. For example, supposing my table currently looks like this:
MY_TABLE
       C1   C2
row0   15   27
row1   19   20
And supposing I have an input file that looks like this:
15,28
34,90
"a string that's obviously not supposed to be here"
54,23
If I run the script with this input file, the table should stay exactly the same as it was before, not using the contents of the file at all.
However, when I run my script, this isn't the behavior I observe: instead, the contents of MY_TABLE do get replaced with all of the valid rows of the input file so the new contents of the table become:
MY_TABLE
       C1   C2
row0   15   28
row1   34   90
row2   54   23
In my script logic, I explicitly disable autocommit for the part of the script that ingests the file, and I only call commit after I've checked that the sql execution returned no errors; if it did cause errors, I call rollback instead. Nonetheless, the contents of the table get replaced when errors occur, as though the rollback command wasn't called at all, and a commit was called instead.
Where is the problem in my script?
script.ksh
SQL_FILE=/app/scripts/script.db2
LOG=/app/logs/script.log
# ...
# Boilerplate to setup the connection to the database server
# ...
# +c: autocommit off
# -v: echo commands
# -s: Stop if errors occur
# -p: Show prompt for interactivity (for debugging)
# -td#: use '#' as the statement delimiter in the file
db2 +c -s -v -td# -p < $SQL_FILE >> $LOG
if [ $? -gt 2 ];
then echo "An Error occurred; rolling back the data" >> $LOG
db2 "ROLLBACK" >> $LOG
exit 1
fi
# No errors, commit the changes
db2 "COMMIT" >> $LOG
script.db2
ingest from file '/app/temp/values.csv'
format delimited by ','
(
$C1 INTEGER EXTERNAL,
$C2 INTEGER EXTERNAL
)
restart new 'SCRIPT_JOB'
replace into DATA.MY_TABLE
(
C1,
C2
)
values
(
$C1,
$C2
)#

Adding as answer per OP's suggestion:
Per the DB2 documentation for the INGEST command, it appears that the +c (autocommit off) option has no effect here:
Updates from the INGEST command are committed at the end of an ingest
operation. The INGEST command issues commits based on the commit_period
and commit_count configuration parameters. As a result of this, the
following do not affect the INGEST command: 1) the CLP -c or +c options,
which normally affect whether the CLP automatically commits; 2) the NOT
LOGGED INITIALLY option on the CREATE TABLE statement.

You probably want to set the warningcount 1 option, which will cause the command to terminate after the first error or warning. The default behaviour is to continue processing while ignoring all errors (warningcount 0).
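Since INGEST commits on its own schedule, another way to get all-or-nothing behaviour is to check the file before ingesting it at all. The following is a minimal sketch of that pre-validation idea (it is not a rollback), assuming every row must be exactly two comma-separated integers as in the question's sample. The function name validate_csv is made up for illustration.

```shell
#!/bin/sh
# validate_csv FILE: succeed only if every line is exactly two
# comma-separated integers. Hypothetical helper; the two-integer rule
# is an assumption taken from the sample data in the question.
validate_csv() {
    awk -F',' '
        NF != 2 || $1 !~ /^-?[0-9]+$/ || $2 !~ /^-?[0-9]+$/ {
            # Report each offending line on stderr and remember the failure.
            printf "line %d is malformed: %s\n", NR, $0 > "/dev/stderr"
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

In the question's script this could run before the db2 call, for example as validate_csv /app/temp/values.csv || exit 1, so that the table is never touched when the file contains a bad row.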

Related

SQL query is returning 0 when running shell script but 6 when checking db

I'm trying to run a shell script that has a SQL query in it. Now, I can't use a SQL script in the shell script because of story requirements. I have been trying to get the shell script to return the correct count which is '6' but it is only returning '0'.
#!/bin/ksh
. /apps/path/config/setenv.ksh
DATE=`date "+%m%d%Y"`
returnMessage="`sqlplus username/password@$ORACLE_SID << EOF
WHENEVER OSERROR EXIT SQL.OSCODE ;
WHENEVER SQLERROR EXIT SQL.OSCODE ;
spool /apps/path/data/test.txt
SET HEADING OFF
SET FEEDBACK OFF
SET VERIFY OFF
SET ECHO ON
SET PAGES 0
SET LINESIZE 90
select count(*) from table where dt = to_date('06/18/2020','MM/DD/YYYY');
EOF
`"
exitCode=$?
oracleError=`echo "$returnMessage" | grep ORA-`
if [ -n "$oracleError" -o "$exitCode" -ne 0 ]; then
log "An error occurred while looking up the $COUNT"
log "SQLPlus Exit Code = $exitCode"
log "SQLPlus Message is: $returnMessage"
return 1
fi
export COUNT=`echo "$returnMessage"`
return 0
The output is also given below
SQL> SET HEADING OFF
SQL> SET FEEDBACK OFF
SQL> SET VERIFY OFF
SQL> SET ECHO ON
SQL> SET PAGES 0
SQL> SET LINESIZE 90
SQL>
SQL> select count(*) from table where dt = to_date('06/18/2020','MM/DD/YYYY');
0
SQL>
This is the output and the code I'm using. I'm not sure where it is going wrong, since the query should return 6.
UnCOMMITted data is only visible within the session that created it (and will ROLLBACK at the end of the session if it has not been COMMITted). If you can't see the data from another session (i.e. in SQL*Plus invoked from the shell) then make sure you have issued a COMMIT command in the SQL client where you INSERTed the data.
Note: even if you connect as the same user, this will create a separate session and you will not be able to see the uncommitted data in the other session.
If you have issued a COMMIT and still can't see the data then make sure that both the SQL Client and the shell program are connecting to the same server and the same database and are querying the same user's schema of that database.

connect to sqlplus only once without writing to a file in a loop

I have a requirement for which I need to write a ksh script that reads command line parameters into arrays and creates DML statements to insert records into an oracle database. I've created a script as below to achieve this. However, the user invoking the script doesn't have permission to write into the directory where the script has to run. So, is there a way we can fire multiple inserts on the database without connecting to sqlplus multiple times within the loop and at the same time, NOT create temp sql file as below? Any ideas are highly appreciated. Thanks in advance!
i=0
while (( i<$src_tbl_cnt ))
do
echo "insert into temp_table values ('${src_tbl_arr[$i]}', ${ins_row_arr[$i]}, ${rej_row_arr[$i]});" >> temp_scrpt.sql
(( i+=1 ))
done
echo "commit; disc; quit" >> temp_scrpt.sql
sqlplus user/pass@db @temp_scrpt.sql
Just use the /tmp directory.
The /tmp directory is guaranteed to be present on any Unix-family server. It is there precisely for needs like this. Definitely add something like the current process ID (available as $$ in the shell) to the file name so that multiple users don't step on each other. The full name would then be something like /tmp/temp_$$_scrpt.sql.
When done, be sure to also delete that file--say, in a line right after the sqlplus call. Thus be sure to store the file name in a variable and delete what's in that variable.
It should go without saying, but in a well run shop: 1) The admins should have put more than enough space in /tmp, 2) All the users in the community should not be deleting other's files in /tmp or overloading it so it runs out of space. 3) The admins should setup a job that deletes files from /tmp after a certain age so that if your script fails before it deletes the temporary file, it won't be there forever.
So really, this answer is more about /tmp and managing it effectively--but that really is what you need. Using temporary files is a powerful technique, so your design is good. And the reality that users often won't have rights in a directory is common, so /tmp is your answer.
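The temporary-file handling described above can be sketched as follows. This version swaps the manual PID suffix for mktemp, which picks a unique name itself, and uses a trap so the file is removed even if the script stops early. The insert values are placeholders for illustration.

```shell
#!/bin/sh
# Create a private temporary file under /tmp (or $TMPDIR if set).
sql_file=$(mktemp "${TMPDIR:-/tmp}/temp_scrpt.XXXXXX") || exit 1

# Remove the file whenever the script exits, normally or on error.
trap 'rm -f "$sql_file"' EXIT

# Write all statements with a single redirect (placeholder values).
{
    echo "insert into temp_table values ('t1', 10, 0);"
    echo "commit;"
} > "$sql_file"

# The generated file would then be handed to the SQL client in one call.
```

Keeping the name in a variable, as the answer suggests, means the cleanup always deletes exactly the file that was created.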
Instead of creating a temporary file you can directly pipe the output of an input generating block into sqlplus, in your shell script.
Example:
{
echo 'set auto off;'
for ((i=0; i<100; i++)); do
echo "insert into itest(i) values ($i);"
done
# echo 'rollback;' # for testing
echo 'commit;'
} | sqlplus -S juser/secret@db > /dev/null
This works with Ksh 93 and Bash (perhaps even with Ksh 88 modulo the (( expression syntax).
The corresponding DDL statement for the test table:
create table itest ( i number(36) ) ;
PS: Btw, even when creating a temporary file is preferred - redirecting the output is way more efficient than doing an append-style redirect for each line, e.g.:
{ for ((i=0; i<100; i++)); do echo "line $i"; done; echo end; } > foo.tmp
Will the code below connect to SQL*Plus multiple times, or only once?
{
echo 'set auto off;'
for ((i=0; i<100; i++)); do
echo "insert into itest(i) values ($i);"
done
echo 'rollback;' # for testing
echo 'commit;'
} | sqlplus -S juser/secret@db > /dev/null
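A brace group piped into a command starts that command once, and everything the group prints goes down the same pipe. The sketch below shows this with a stand-in for sqlplus, so it can be run without a database. The names fake_client and start_log are hypothetical and exist only for the demonstration.

```shell
#!/bin/sh
# Stand-in for sqlplus: records one line per start-up, then consumes stdin.
start_log=$(mktemp) || exit 1
fake_client() {
    echo started >> "$start_log"
    cat > /dev/null
}

# The whole { ...; } group shares one pipe, so the client is started once
# no matter how many lines the loop prints.
{
    i=0
    while [ "$i" -lt 100 ]; do
        echo "insert into itest(i) values ($i);"
        i=$((i + 1))
    done
    echo 'commit;'
} | fake_client

starts=$(grep -c started "$start_log")
echo "client started $starts time(s)"
rm -f "$start_log"
```

By contrast, calling the client inside the loop body would start it once per iteration.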

sqlldr - load completion not reflected

I have a bash script (load_data.sh) that invokes the sqlldr command to load data from a .csv into a table (data_import). On a recent invocation, I noticed that even though the command execution was completed, the table didn't contain the data from the .csv file. I say this because the subsequent statement (process_data.sh) in the bash script tried to run a stored procedure that threw the error
ORA-01403: no data found.
I learned that the commit happens right after the file load. So, I'm wondering what's causing this error and how I can avoid it in the future.
Here are my scripts:
load_data.sh
#!/usr/bin/bash -p
# code here #
if [[ -f .st_running ]]
then
echo "Exiting as looks like another instance of script is running"
exit
fi
touch .st_running
# ... #
# deletes existing data in the table
./clean.sh
sqlldr user/pwd@host skip=1 control=$CUR_CTL.final data=$fpath log=${DATA}_data.log rows=10000 direct=true errors=999
# accesses the newly loaded data in the table and processes it
./process_data.sh
rm -f .st_running
clean.sh/process_data.sh
# code here #
# ... #
sqlplus user/pwd@host <<EOF
set serveroutput on
begin
schema.STORED_PROC;
commit;
end;
/
exit;
EOF
# code here #
# ... #
STORED_PROC run by process_data.sh:
SELECT count(*) INTO l_num_to_import FROM data_import;
IF (l_num_to_import = 0) THEN RETURN;
END IF;
/* the error (`ORA-01403: no data found`) happens at this statement: */
SELECT upper(name) INTO name FROM data_import WHERE ROWNUM = 1;
Control file
LOAD DATA
APPEND
INTO TABLE DATA_IMPORT
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
...
...
)
Edits
The input file had 8 rows and the logs from both runs stated that 8 rows were successfully inserted.
Interesting behavior: The script ran fine (without complaining about the error) the 2nd time I ran it on the same file. So, during the first run, the sqlldr command doesn't seem to complete before the next sqlplus command is executed.
If you capture the PID of the sqlldr command and wait for it to complete, you can be sure it's finished. Alternatively, add a datestamp to the log file name (or a timestamp, if it runs multiple times a day), then loop with a sleep, checking the log until it prints its final completion line. Then run the next step.
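A command started in the foreground has already finished by the time the shell reaches the next line, so checking its exit status is often enough to decide whether to carry on. Here is a minimal sketch, assuming the SQL*Loader convention on Unix of 0 for success and 2 for warnings (worth verifying against the documentation for your version). The helper name run_step is made up for illustration.

```shell
#!/bin/sh
# run_step NAME CMD...: run a command in the foreground, report its
# exit status, and pass that status back to the caller.
# Hypothetical helper; the meaning given to status 2 is an assumption
# based on the SQL*Loader exit codes on Unix.
run_step() {
    name=$1
    shift
    "$@"
    status=$?
    case $status in
        0) echo "$name: ok" ;;
        2) echo "$name: finished with warnings (status 2)" ;;
        *) echo "$name: failed (status $status)" ;;
    esac
    return "$status"
}
```

In load_data.sh this could wrap the loader call, for example run_step load sqlldr ... || exit 1, so that process_data.sh only runs after a clean load.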

import csv error using SQL Loader and perl

Hello all, I have a question I hope you can help me with. I tried to include all the relevant info. I'm building a Perl script that will eventually loop through different SQL*Loader control files and import their respective CSV data into Oracle database tables. I'm testing multiple control loads before looping over them.
The problem is that I get an error even though the script connects to the DB and uploads all the CSV data without any problems that I can see. All the rows are accounted for, and the log doesn't really help:
================================================================================
[root@sanasr06 scripts]# perl db_upload.pl
connection made! Starting database upload...
Error: Can't open import control_general to SQL DB : at db_upload.pl line 44
================================================================================
line 44 is the system connection:
system ("sqlldr $userid\@$sid/$passwd control=@control_pools log=$log silent=all ") or $logger->logdie("Error: Can't open import control data to SQL DB :$!");
I'm including the control file output, the perl script and the control file. (the skipped file mentioned is for the csv headers:)
SQL*Loader: Release 11.2.0.1.0 - Production on Tue Aug 14 12:32:36 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Control File: /despliegue/san/project/sql_ctrl/general.ctl
Character Set UTF8 specified for all input.
Data File: /despliegue/san/project/csv/Pools.csv
Bad File: /despliegue/san/project/logs/sql_error.bad
Discard File: /despliegue/san/project/logs/sql_discard.dsc
(Allow all discards)
Number to load: ALL
Number to skip: 1
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Silent options: FEEDBACK, ERRORS and DISCARDS
Table I_GENERAL, loaded from every logical record.
Insert option in effect for this table: TRUNCATE
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
OBJECTID (FILLER FIELD) FIRST * , O(") CHARACTER
DESCRIPTION (FILLER FIELD) NEXT * , O(") CHARACTER
SERIALNUMBER NEXT * , O(") CHARACTER
PRODUCT_NAME NEXT * , O(") CHARACTER
CONTROLLER_VERSION NEXT * , O(") CHARACTER
NUMBER_OF_CONTROLLERS NEXT * , O(") CHARACTER
CAPACITY_GB NEXT * , O(") CHARACTER
PRODUCT_CODE NEXT * , O(") CHARACTER
value used for ROWS parameter changed from 64 to 15
Table I_GENERAL:
2512 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 247680 bytes(15 rows)
Read buffer bytes: 1048576
Total logical records skipped: 1
Total logical records read: 2512
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Tue Aug 14 12:32:36 2012
Run ended on Tue Aug 14 12:32:38 2012
==================================================================================
the above file is of course shortened but includes all the relevant information.
here's the perl script.
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Log::Log4perl;
#this script loads multiple saved csv files into the database using the control files
################ Initialization #############################################
my $homepath = "/despliegue/san/project";
my $log_conf = "$homepath/logs/log.conf";
Log::Log4perl->init($log_conf)or die("Error: Can't open log.config Does it exist? $!");
my $logger = Log::Log4perl->get_logger();
################ database connection variables####
my ($serial, $model);
my $host="me.notyou33.safety";
my $port="1426";
my $userid="user";
my $passwd="pass";
my $sid="sid";
my $log="$homepath/logs/sql_import.log";
#Control file location
my @control_pools= "$homepath/sql_ctrl/pools.ctl";
my @control_general = "$homepath/sql_ctrl/general.ctl";
my @control_ports= "$homepath/sql_ctrl/ports.ctl";
my @control_replication = "$homepath/sql_ctrl/replication.ctl";
#######################Database connection and data upload #################
my $dbh = DBI->connect( "dbi:Oracle:host=$host;sid=$sid;port=$port", "$userid", "$passwd",
{ RaiseError => 1}) or $logger->logdie ("Database connection not made: $DBI::errstr");
print " connection made! Starting database upload...\n";
system ("sqlldr $userid\@$sid/$passwd control=@control_general log=$log silent=all") or $logger->logdie("Error: Can't open import control_general to SQL DB :$!");
print "one done moving to next one\n";
system ("sqlldr $userid\@$sid/$passwd control=@control_pools log=$log silent=all ") or $logger->logdie("Error: Can't open import control data to SQL DB :$!");
system ("sqlldr $userid\@$sid/$passwd control=@control_ports log=$log ") or $logger->logdie("Error: Can't open import control data to SQL DB :$!");
print "three done moving to last one\n";
system ("sqlldr $userid\@$sid/$passwd control=@control_replication log=$log silent=feedback ") or $logger->logdie("Error: Can't open import control data to SQL DB :$!");
print "................Done\n";
############################################################################
$dbh->disconnect;
==================================================================================
the control file:
OPTIONS (SKIP=1)
LOAD DATA
CHARACTERSET UTF8
INFILE '/despliegue/san/project/csv/Pools.csv'
BADFILE '/despliegue/san/project/logs/sql_error.bad'
DISCARDFILE '/despliegue/san/project/logs/sql_discard.dsc'
TRUNCATE INTO TABLE I_GENERAL
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY "\""
TRAILING NULLCOLS
(
OBJECTID FILLER,
DESCRIPTION FILLER,
SERIALNUMBER,
PRODUCT_NAME,
CONTROLLER_VERSION,
NUMBER_OF_CONTROLLERS,
CAPACITY_GB,
PRODUCT_CODE,
)
system() returns the return value of the wait call, which includes the exit status of the program you executed. If everything goes right, this will be 0. This is different from almost all other functions in Perl, which return a value that evaluates to true in boolean context on success. Therefore the commonly used error handling with the or operator does not work properly here. You might want to try something like this instead:
system ("sqlldr $userid\@$sid/$passwd control=@control_pools log=$log silent=all") == 0
or $logger->logdie("Error: Can't open import control data to SQL DB :$?");
You can read more about handling the return value of system() in the documentation under perldoc -f system
There is Logdie which should be logdie, AFAIU
The problem is the system call expects a return value of 0 to be "successful". Your sqlldr job, if it skips or discards a record, will not return 0 (I've seen it return 2, check docs to be sure). So, unless you load all records successfully, your perl script (as written) will exit out.
perl system
sqlldr return codes
In my case I execute the sqlldr with backticks (similar to system), that helps me to get any feedback in a variable.
my $sqlldr = "sqlldr userid=usr/pss\@TNS control=\'$controlfile\' log=\'$logfile\' silent=header,feedback";
my $execution = `$sqlldr 2>&1`;
The trick is that $? holds the raw wait status rather than the exit code, so you have to shift it right by 8 bits to get the exit code of the program. In my case I do as follows:
# Get the returned code from the last execution
my $ret = $? >> 8;
if ($ret == 0) {
$logger->info("Class DLA_Upload: All rows were successfully loaded");
}
elsif ($ret == 1) {
die("Class DLA_Upload: Executing sqlldr returned the following error:\n$execution");
}
elsif ($ret == 2) {
$logger->info("Class DLA_Upload: SQL*Loader was executed but some or all rows were rejected or discarded, please check $logfile for further information");
}
else {
die("Class DLA_Upload: FATAL ERROR: sqlldr corrupted or not found");
}
Why, here you have a link from Perl monks that explains it properly.

Execute SQL from file in bash

I'm trying to load a sql from a file in bash and execute the loaded sql. The sql file needs to be versatile, meaning it cannot be altered in order to make things easy while being run in bash (escaping special characters like * )
So I have run into some problems:
If I read my sample.sql
SELECT * FROM SAMPLETABLE
to a variable with
ab=`cat sample.sql`
and execute it
db2 `echo $ab`
I receive an sql error because by doing a cat the * has been replaced by all the files in the directory of sample.sql.
An easy solution would be to replace "*" with "\*". But I cannot do this, because the file needs to stay executable in programs like DB Visualizer etc.
Could someone give me hint in the right direction?
The DB2 command line processor has options that accept a filename as input, so you shouldn't need to load statements from a text file into a shell variable.
This command will execute all SQL statements in the file, with newline treated as the statement terminator:
db2 -f sample.sql
This command will execute all SQL statements in the file, with semicolon treated as the statement terminator:
db2 -t -f sample.sql
Other useful CLP flags are:
-x : Suppress the column headings
-v : Echo the statement text immediately before execution
-z : Tee a copy of all CLP output to the filename immediately following this flag
Redirect stdin from the file.
db2 < sample.sql
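The expansion described in the question comes from using the variable unquoted, not from cat itself. The sketch below reproduces it in a scratch directory (the file names are made up for illustration) and shows that double quotes keep the statement intact.

```shell
#!/bin/sh
# Work in a scratch directory so the glob has known files to match.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch a.txt b.txt
printf 'SELECT * FROM SAMPLETABLE\n' > sample.sql

ab=$(cat sample.sql)

unquoted=$(echo $ab)    # the shell expands * to the file names here
quoted=$(echo "$ab")    # double quotes keep the text exactly as read

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

If the statement does have to go through a variable, passing it as one quoted argument, as in db2 "$ab", avoids the expansion.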
If your script uses shell variables that should be substituted before the statements reach DB2, use this approach:
Contents of File.sql:
cat <<xEOF
insert into ${MY_SCHEMA}.${MY_TABLE} values(1,2);
select * from ${MY_SCHEMA}.${MY_TABLE};
xEOF
At the command prompt, do:
export MY_SCHEMA='STAR'
export MY_TABLE='DIMENSION'
Then you can run it and pipe the result into DB2:
sh File.sql | db2 +p -t
The shell substitutes the exported variables, and DB2 then executes the resulting statements.
Hope it helps.