sqlldr - load completion not reflected - sql

I have a bash script (load_data.sh) that invokes the sqlldr command to load data from a .csv file into a table (data_import). On a recent run I noticed that, even though the command had completed, the table didn't contain the data from the .csv file. I say this because the next step in the bash script (process_data.sh) ran a stored procedure that threw the error
ORA-01403: no data found.
I understand that the commit happens right after the file load, so I'm wondering what's causing this error and how I can avoid it in the future.
Here are my scripts:
load_data.sh
#!/usr/bin/bash -p
# code here #
if [[ -f .st_running ]]
then
echo "Exiting as looks like another instance of script is running"
exit
fi
touch .st_running
# ... #
# deletes existing data in the table
./clean.sh
sqlldr user/pwd@host skip=1 control=$CUR_CTL.final data=$fpath log=${DATA}_data.log rows=10000 direct=true errors=999
# accesses the newly loaded data in the table and processes it
./process_data.sh
rm -f .st_running
clean.sh/process_data.sh
# code here #
# ... #
sqlplus user/pwd@host <<EOF
set serveroutput on
begin
schema.STORED_PROC;
commit;
end;
/
exit;
EOF
# code here #
# ... #
STORED_PROC run by process_data.sh:
SELECT count(*) INTO l_num_to_import FROM data_import;
IF (l_num_to_import = 0) THEN RETURN;
END IF;
/* the error (`ORA-01403: no data found`) happens at this statement: */
SELECT upper(name) INTO name FROM data_import WHERE ROWNUM = 1;
Control file
LOAD DATA
APPEND
INTO TABLE DATA_IMPORT
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
...
...
)
Edits
The input file had 8 rows and the logs from both runs stated that 8 rows were successfully inserted.
Interesting behavior: the script ran fine (without hitting the error) the second time I ran it on the same file. So, during the first run, the sqlldr command doesn't seem to have completed before the next sqlplus command was executed.

If you capture the PID of the sqlldr command and wait for it to complete, you can be sure it has finished. Alternatively, add a date or timestamp to the log file name (if the script runs multiple times a day), then loop with a sleep and check until the log prints its final completion line, and only then run the next step.
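A minimal sketch of the PID-and-wait idea, reusing the command from load_data.sh (the exit-code check assumes sqlldr's usual convention of 0 for success and 2 for a load that completed with warnings):

# run the loader in the background, record its PID, and block until it exits
sqlldr user/pwd@host skip=1 control=$CUR_CTL.final data=$fpath \
    log=${DATA}_data.log rows=10000 direct=true errors=999 &
sqlldr_pid=$!

wait $sqlldr_pid          # returns only after the sqlldr process has exited
rc=$?                     # loader exit code

if [[ $rc -ne 0 && $rc -ne 2 ]]
then
    echo "sqlldr failed with exit code $rc; skipping processing" >&2
    rm -f .st_running
    exit 1
fi

# accesses the newly loaded data in the table and processes it
./process_data.sh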

Related

How to Rollback DB2 Ingest statement for malformed data

I have a Bash Shell Script that runs a DB2 sql file. The job of this sql file is to completely replace the contents of a database table with whatever are the contents of this sql file.
However, I also need that database table to have its contents preserved if errors are discovered in the ingested file. For example, supposing my table currently looks like this:
MY_TABLE
        C1    C2
row0    15    27
row1    19    20
And supposing I have an input file that looks like this:
15,28
34,90
"a string that's obviously not supposed to be here"
54,23
If I run the script with this input file, the table should stay exactly the same as it was before, not using the contents of the file at all.
However, when I run my script, this isn't the behavior I observe: instead, the contents of MY_TABLE do get replaced with all of the valid rows of the input file so the new contents of the table become:
MY_TABLE
        C1    C2
row0    15    28
row1    34    90
row2    54    23
In my script logic, I explicitly disable autocommit for the part of the script that ingests the file, and I only call commit after I've checked that the sql execution returned no errors; if it did cause errors, I call rollback instead. Nonetheless, the contents of the table get replaced when errors occur, as though the rollback command wasn't called at all, and a commit was called instead.
Where is the problem in my script?
script.ksh
SQL_FILE=/app/scripts/script.db2
LOG=/app/logs/script.log
# ...
# Boilerplate to setup the connection to the database server
# ...
# +c: autocommit off
# -v: echo commands
# -s: Stop if errors occur
# -p: Show prompt for interactivity (for debugging)
# -td#: use '#' as the statement delimiter in the file
db2 +c -s -v -td# -p < $SQL_FILE >> $LOG
if [ $? -gt 2 ];
then echo "An Error occurred; rolling back the data" >> $LOG
db2 "ROLLBACK" >> $LOG
exit 1
fi
# No errors, commit the changes
db2 "COMMIT" >> $LOG
script.db2
ingest from file '/app/temp/values.csv'
format delimited by ','
(
$C1 INTEGER EXTERNAL,
$C2 INTEGER EXTERNAL
)
restart new 'SCRIPT_JOB'
replace into DATA.MY_TABLE
(
C1,
C2
)
values
(
$C1,
$C2
)#
Adding as answer per OP's suggestion:
Per the DB2 documentation for the INGEST command, it appears that +c (autocommit off) will have no effect:
Updates from the INGEST command are committed at the end of an ingest
operation. The INGEST command issues commits based on the commit_period
and commit_count configuration parameters. As a result of this, the
following do not affect the INGEST command: the CLP -c or +c options, which
normally affect whether the CLP automatically commits; the NOT LOGGED
INITIALLY option on the CREATE TABLE statement.
You probably want to set the warningcount 1 option, which will cause the command to terminate after the first error or warning. The default behaviour is to continue processing while ignoring all errors (warningcount 0).
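For example, the ingest statement from script.db2 with that option added might look like the sketch below (double-check the clause placement against the INGEST syntax for your DB2 version):

ingest from file '/app/temp/values.csv'
format delimited by ','
(
$C1 INTEGER EXTERNAL,
$C2 INTEGER EXTERNAL
)
restart new 'SCRIPT_JOB'
warningcount 1
replace into DATA.MY_TABLE
(
C1,
C2
)
values
(
$C1,
$C2
)#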

Calling SQL*PLUS from UNIX Shell script and print the status message about the query

I want to run SQL code using a shell script and return a message indicating whether the SQL query executed successfully or not. For this I have used the Unix script given below.
#!/bin/sh
sqlplus -S hr/hr@xe <<EOF
@emp.sql
EOF
var1=$(cat /cygdrive/d/scripts/output.txt | grep -c 'COUNT')
if [ $var1 -ge 1 ];
then
echo "success"
else
echo "failure"
fi
exit;
and emp.sql(called sql file) as
SET ECHO OFF
SPOOL D:\scripts\output.txt
SET LINESIZE 100
SET PAGESIZE 50
SELECT count(*) FROM employees;
SPOOL OFF;
EXIT 0;
When I execute the script I am getting output as
COUNT(*)
----------
107
./script1.sh: line 13: syntax error: unexpected end of file.
I don't know where exactly I should put the EOF statement. Also, I am not getting the status message (success or failure) that I want as output. Please help. Thanks in advance.
SPOOL D:\scripts\output.txt: isn't this the Windows way of referring to a file, whereas in the shell script you referred to the file as /cygdrive/d/scripts/output.txt? I assume you are using a Linux shell to execute, so I ran your script after changing the spool line in the sql file, and it worked fine.
Edit: Also, the \ that you used in the spooled output.txt path will cause sqlplus to terminate, hence the error "line 13: syntax error: unexpected end of file". Perhaps add quotes around the path, or use the same file path as you used in the shell script.
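A sketch of the corrected emp.sql along those lines, quoting the path (or swap in the /cygdrive/d/scripts/output.txt form if your sqlplus resolves Unix-style paths), so that the spooled file is the same one the shell script greps afterwards:

SET ECHO OFF
SPOOL "D:\scripts\output.txt"
SET LINESIZE 100
SET PAGESIZE 50
SELECT count(*) FROM employees;
SPOOL OFF;
EXIT 0;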

connect to sqlplus only once without writing to a file in a loop

I have a requirement for which I need to write a ksh script that reads command line parameters into arrays and creates DML statements to insert records into an Oracle database. I've created the script below to achieve this. However, the user invoking the script doesn't have permission to write into the directory where the script has to run. So, is there a way to fire multiple inserts at the database without connecting to sqlplus multiple times within the loop and, at the same time, without creating a temp sql file as below? Any ideas are highly appreciated. Thanks in advance!
i=0
while (( i<$src_tbl_cnt ))
do
echo "insert into temp_table values ('${src_tbl_arr[$i]}', ${ins_row_arr[$i]}, ${rej_row_arr[$i]});" >> temp_scrpt.sql
(( i+=1 ))
done
echo "commit; disc; quit" >> temp_scrpt.sql
sqlplus user/pass@db @temp_scrpt.sql
Just use the /tmp directory.
The /tmp directory is guaranteed to be present on any unix-family server. It is there precisely for needs like this. Definitely do something like adding the current process ID to the file name so that multiple users don't step on each other, giving a full name along the lines of /tmp/temp_${PID}_scrpt.sql.
When done, be sure to also delete that file--say, in a line right after the sqlplus call. Thus be sure to store the file name in a variable and delete what's in that variable.
It should go without saying, but in a well run shop: 1) The admins should have put more than enough space in /tmp, 2) All the users in the community should not be deleting other's files in /tmp or overloading it so it runs out of space. 3) The admins should setup a job that deletes files from /tmp after a certain age so that if your script fails before it deletes the temporary file, it won't be there forever.
So really, this answer is more about /tmp and managing it effectively--but that really is what you need. Using temporary files is a powerful technique, so your design is good. And the reality that users often won't have rights in a directory is common, so /tmp is your answer.
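A minimal sketch of that approach, reusing the loop from the question (user/pass@db is a placeholder connect string; the trap is a belt-and-braces cleanup in case the script dies before the explicit rm):

tmp_sql=/tmp/temp_$$_scrpt.sql      # $$ is the current PID, keeps the name unique per run
trap 'rm -f $tmp_sql' EXIT          # clean up even on early exit

i=0
while (( i<$src_tbl_cnt ))
do
    echo "insert into temp_table values ('${src_tbl_arr[$i]}', ${ins_row_arr[$i]}, ${rej_row_arr[$i]});" >> $tmp_sql
    (( i+=1 ))
done
echo "commit; disc; quit" >> $tmp_sql

sqlplus user/pass@db @$tmp_sql
rm -f $tmp_sql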
Instead of creating a temporary file you can directly pipe the output of an input generating block into sqlplus, in your shell script.
Example:
{
echo 'set auto off;'
for ((i=0; i<100; i++)); do
echo "insert into itest(i) values ($i);"
done
# echo 'rollback;' # for testing
echo 'commit;'
} | sqlplus -S juser/secret#db > /dev/null
This works with Ksh 93 and Bash (perhaps even with Ksh 88 modulo the (( expression syntax).
The corresponding DDL statement for the test table:
create table itest ( i number(36) ) ;
PS: Even when creating a temporary file is preferred, redirecting the output of the whole block once is far more efficient than doing an append-style redirect for each line, e.g.:
{ for ((i=0; i<100; i++)); do echo "line $i"; done; echo end; } > foo.tmp
Will the piece of code below keep connecting to SQL*Plus multiple times, or will it connect only once?
{
echo 'set auto off;'
for ((i=0; i<100; i++)); do
echo "insert into itest(i) values ($i);"
done
echo 'rollback;' # for testing
echo 'commit;'
} | sqlplus -S juser/secret#db > /dev/null

How to pass a query stored in a variable to a sql file. Shell script

I am creating a shell script where I have saved entries from a text file into an array. Those values are properly stored and show the correct contents. One of those entries contains a simple query and I want to pass it to a sql file. With that sql query I want to save the results into a text file.
Here is the part of the code that calls the sql file to run the sql script
PURGE_SITES=purge_site.txt
logmsg "USERID - $PURGES_SITE" n
QUERY=${Unix_Array[4]}
echo $QUERY
sqlplus -s $USER/$PASS <<EndSQL
@purges_sites.sql $PURGE_SITES '$QUERY'
EXIT SQL.SQLCODE
EndSQL
For now, the query stored in ${Unix_Array[4]} is "select -1 from dual".
Here is the file contents of the .sql file
set echo off ver off feed off pages 0
accept fname prompt 'Loading Sites...'
spool &1;
&2
/
spool off
It gives me an error and reads &2 literally as "&2" instead of the query saved in the variable. However, when I edit the .sql file and add something before it, it displays the correct data from the variable.
Here is the output
select -1 from dual
File Name===> results.txt
select -1 from dual
Loading Sites...SP2-0042: unknown command "&2" - rest of line ignored.
SP2-0103: Nothing in SQL buffer to run.
Here is the output if I add something before &2.
select -1 from dual
File Name===> results.txt
select -1 from dual
Loading Sites...select * from table_table select -1 from dual
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
I typed in select * from table_table before &2.
So it's actually retrieving the value from the variable, but something needs to come before it in order for it to be passed correctly.
Is there some execute command in Oracle that will run a query? &2 just by itself is not allowed.
Won't this help you?
PURGE_SITES=purge_site.txt
logmsg "USERID - $PURGES_SITE" n
QUERY=${Unix_Array[4]}
echo $QUERY
# FRAME YOUR QUERY, PROMPTING USER IN SHELL ITSELF AND SEND TO SQLPLUS DIRECTLY
# BEWARE SQL INJECTION POSSIBLE
# YOU CAN REDIRECT THE SQLPLUS OUTPUT TO A FILE LIKE THIS, NO SPOOL NEEDED
sqlplus -s $USER/$PASS <<EndSQL >> $OUTPUT_FILE
set echo off ver off feed off pages 0
$QUERY
/
EXIT SQL.SQLCODE
EndSQL
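For clarity: $QUERY is expanded by the shell before sqlplus ever reads the here-document, so with the value from the question the input that actually reaches sqlplus is simply:

set echo off ver off feed off pages 0
select -1 from dual
/
EXIT SQL.SQLCODE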

cron script to act as a queue OR a queue for cron?

I'm betting that someone has already solved this and maybe I'm using the wrong search terms for google to tell me the answer, but here is my situation.
I have a script that I want to run, but I want it to run only when scheduled and only one at a time. (can't run the script simultaneously)
Now the sticky part: say I have a table called "myhappyschedule" which has the data I need and the scheduled times. This table can have multiple scheduled times, even at the same moment, and each one should run this script. So essentially I need a queue: each run of the script has to wait for the previous one to finish. (Sometimes the script takes just a minute to execute; sometimes it takes many, many minutes.)
What I'm thinking about doing is making a script that checks myhappyschedule every 5 minutes, gathers up the entries that are due, and puts them into a queue where another script can execute each 'job' or occurrence in order. All of which sounds messy.
To make this longer - I should say that I'm allowing users to schedule things in myhappyschedule and not edit crontab.
What can be done about this? File locks and scripts calling scripts?
add a column exec_status to myhappytable (maybe also time_started and time_finished, see pseudocode)
run the following cron script every x minutes
Pseudocode of the cron script:
[create/check pid lock (optional, but see "A potential pitfall" below)]

get number of rows from myhappytable where (exec_status == executing_now)
if it is > 0, exit

begin loop
    get one row from myhappytable
        where (exec_status == not_yet_run) and (scheduled_time <= now)
        order by scheduled_time asc
    if no such row, exit

    set row exec_status to executing_now (maybe set time_started to now)
    execute whatever command the row contains
    set row exec_status to completed
        (maybe also store the command output/return as well, set time_finished to now)
end loop

[delete pid lock file (complementary to the starting pid lock check)]
This way, the script first checks if none of the commands is running, then runs first not-yet run command, until there are no more commands to be run at the given moment. Also, you can see what command is executing by querying the database.
A potential pitfall: if the cron script is killed, a scheduled task will remain in the "executing_now" state. That's what the pid lock at the beginning and end is for: to see if the cron script terminated properly.
Pseudocode of the create/check pidlock:
if exists pidlockfile then
    check if process id given in file exists
    if not exists then
        update myhappytable set exec_status = error_cronscript_died_while_executing_this
            where exec_status == executing_now
        delete pidlockfile
    else (previous instance still running)
        exit
    endif
endif
create pidlockfile containing cron script process id
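A rough bash rendering of that pid-lock check (the file location is a placeholder, and the commented update statement stands in for whatever database client call you use):

pidfile=/var/run/myhappyschedule.pid

if [ -f "$pidfile" ]; then
    if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        exit 0                      # previous instance still running
    fi
    # previous instance died mid-run: flag its rows here, e.g.
    #   update myhappytable set exec_status = 'error_cronscript_died_while_executing_this'
    #   where exec_status = 'executing_now'
    rm -f "$pidfile"
fi

echo $$ > "$pidfile"
trap 'rm -f "$pidfile"' EXIT        # complementary cleanup when the script ends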
You can use the at(1) command inside your script to schedule its next run. Before it exits, it can check myhappyschedule for the next run time. You don't need cron at all, really.
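A sketch of the at(1) idea (get_next_run_from_myhappyschedule is a hypothetical helper that queries the table and prints a time at(1) understands, such as "now + 5 minutes" or "14:05"):

# at the end of the script, look up the next scheduled time and re-queue itself
next_run=$(get_next_run_from_myhappyschedule)
echo "/path/to/myscript.sh" | at "$next_run"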
I came across this question while researching for a solution to the queuing problem. For the benefit of anyone else searching here is my solution.
Combine this with a cron that starts jobs as they are scheduled (even if they are scheduled to run at the same time) and that solves the problem you described as well.
Problem
At most one instance of the script should be running.
We want to queue up requests and process them as fast as possible.
i.e. we need a pipeline to the script.
Solution:
Create a pipeline to any script. Done using a small bash script (further down).
The script can be called as
./pipeline "<any command and arguments go here>"
Example:
./pipeline sleep 10 &
./pipeline shabugabu &
./pipeline single_instance_script some arguments &
./pipeline single_instance_script some other_arguments &
./pipeline "single_instance_script some yet_other_arguments > output.txt" &
..etc
The script creates a new named pipe for each command. So the above will create named pipes: sleep, shabugabu, and single_instance_script
In this case the initial call will start a reader and run single_instance_script with "some arguments" as its arguments. Once that call completes, the reader will grab the next request off the pipe and execute the script with "some other_arguments", complete, grab the next, and so on...
This script will block the requesting process, so call it as a background job (& at the end) or as a detached process with at (at now <<< "./pipeline some_script").
#!/bin/bash -Eue

# Using command name as the pipeline name
pipeline=$(basename $(expr "$1" : '\(^[^[:space:]]*\)')).pipe
is_reader=false

function _pipeline_cleanup {
    if $is_reader; then
        rm -f $pipeline
    fi
    rm -f $pipeline.lock

    exit
}
trap _pipeline_cleanup INT TERM EXIT

# Dispatch/initialization section, critical
lockfile $pipeline.lock

if [[ -p $pipeline ]]
then
    echo "$*" > $pipeline
    exit
fi

is_reader=true
mkfifo $pipeline
echo "$*" > $pipeline &
rm -f $pipeline.lock

# Reader section
while read command < $pipeline
do
    echo "$(date) - Executing $command"
    ($command) &> /dev/null
done