PDF to text conversion using Oracle package - SQL

I'm seeing some strange behavior in my PDF to text conversion using Oracle.
Below is the code of the SQL file.
create or replace directory pdf_dir as '&1';
create or replace directory l_curr_dir as '&3';
declare
ll_clob CLOB;
l_bfile BFILE;
l_filename VARCHAR2(100) := '&2';
begin
begin
ctx_ddl.drop_policy('test_policy');
exception
when others then
null;
end;
ctx_ddl.create_policy('test_policy','ctxsys.auto_filter');
l_bfile := bfilename('PDF_DIR',l_filename);
dbms_lob.createtemporary(ll_clob, true);
ctx_doc.policy_filter(
policy_name => 'test_policy'
, document => l_bfile
, restab => ll_clob
, plaintext => true
);
ll_clob := REPLACE(TRIM(ll_clob), chr(13), chr(10));
ll_clob := REPLACE(ll_clob, chr(10), chr(32) || '<<EOL>>' || chr(10)||'<<BOL>>');
INSERT into tempclob_op(filename, data) VALUES(l_filename, ll_clob);
DBMS_XSLPROCESSOR.clob2file (ll_clob,'L_CURR_DIR' , 'plaintext.text');
dbms_lob.freeTemporary( ll_clob );
end;
/
The problem: I have run this code for 10000 files and it gives correct results for almost all of them, but for about 10 files the output in the plaintext.text file is corrupted, and I don't know why. When I run the same SQL code for individual files, it gives correct results.

I added a delay of 2 seconds between executions in the loop over the files, and strangely that seems to have resolved the problem. No concrete answers though.

Related

Oracle APEX - Download hidden SQL query into CSV

I am trying to create a button on a page in my application that will download the full table I am referencing as a CSV file. I cannot use interactive reports > actions > download CSV because the interactive reports have hidden columns. I need all columns to populate in the CSV file.
Is there a way to create a SQL Script and reference it in the button?
I have already tried the steps referenced in this link: Oracle APEX - Export a query into CSV using a button but it does not help as my queries will contain columns that are hidden in the Interactive Report.
Welcome to StackOverflow!
One flexible option would be to use an application process, to be defined in the shared components (process point = ajax callback).
Something like this:
declare
lClob clob;
lBlob blob;
lFilename varchar2(250) := 'filename.csv';
begin
lClob := UNISTR('\FEFF'); -- was necessary for us to be able to use the files in MS Excel
lClob := lClob || 'Tablespace Name;Table Name;Number of Rows' || utl_tcp.CRLF;
for c in (select tablespace_name, table_name, num_rows from user_tables where rownum <= 5)
loop
lClob := lClob || c.tablespace_name || ';' || c.table_name || ';' || c.num_rows || utl_tcp.CRLF;
end loop;
lBlob := fClobToBlob(lClob);
sys.htp.init;
sys.owa_util.mime_header('text/csv', false);
sys.htp.p('Content-length: ' || dbms_lob.getlength(lBlob));
sys.htp.p('Content-Disposition: attachment; filename = "' || lFilename || '"');
sys.htp.p('Cache-Control: no-cache, no-store, must-revalidate');
sys.htp.p('Pragma: no-cache');
sys.htp.p('Expires: 0');
sys.owa_util.http_header_close;
sys.wpg_docload.download_file(lBlob);
end;
This is the function fClobToBlob:
create function fClobToBlob(aClob CLOB) RETURN BLOB IS
tgt_blob BLOB;
amount INTEGER := DBMS_LOB.lobmaxsize;
dest_offset INTEGER := 1;
src_offset INTEGER := 1;
blob_csid INTEGER := nls_charset_id('UTF8');
lang_context INTEGER := DBMS_LOB.default_lang_ctx;
warning INTEGER := 0;
begin
if aClob is null then
return null;
end if;
DBMS_LOB.CreateTemporary(tgt_blob, true);
DBMS_LOB.ConvertToBlob(tgt_blob, aClob, amount, dest_offset, src_offset, blob_csid, lang_context, warning);
return tgt_blob;
end fClobToBlob;
On the page, you need to set your button action to "Redirect to Page in this Application", the target Page to "0". Under "Advanced", set Request to "APPLICATION_PROCESS=downloadCSV", where downloadCSV is the name of your application process.
If you need to parameterize your process, you can do this by accessing page items or application items in your application process.
Generating the CSV data can be cumbersome, but there are several packages out there that make it easier. The alexandria packages are one of them:
https://github.com/mortenbra/alexandria-plsql-utils
An example on how to use the CSV Package is here:
https://github.com/mortenbra/alexandria-plsql-utils/blob/master/demos/csv_util_pkg_demo.sql

How to Write Blob from Oracle Column to the File System

my_images table consists of a blob column called images. I need to write these images to my image_dir which is 'C:\TEMP'.
When the following PL/SQL code is executed, only the first image is written to the directory as an image. The second blob is written as 0 byte (empty) and there is no other (Should be a total number of 8).
So the loop does not seem to work correctly. I am using Oracle 11g Express Edition (XE) and SQL Developer. Here is the error and the code:
Error starting at line : 53 in command -
BEGIN write_blob_to_file_v5; END;
Error report -
ORA-01403: no data found
ORA-06512: at "SYS.DBMS_LOB", line 1056
ORA-06512: at "SYS.WRITE_BLOB_TO_FILE_V5", line 40
ORA-06512: at line 1
01403. 00000 - "no data found"
*Cause: No data was found from the objects.
*Action: There was no data from the objects which may be due to end of fetch.
PL/SQL code
CREATE OR REPLACE PROCEDURE write_blob_to_file_v5
AS
v_lob_image_name VARCHAR (100);
v_lob_image_id NUMBER;
v_blob BLOB;
v_buffer RAW (32767);
v_buffer_size BINARY_INTEGER;
v_amount BINARY_INTEGER;
v_offset NUMBER (38) := 1;
v_chunksize INTEGER;
v_out_file UTL_FILE.file_type;
BEGIN
FOR i IN (SELECT DBMS_LOB.getlength (v_blob) v_len,
image_id v_lob_image_id,
"IMAGE_NAME" v_lob_image_name,
image v_blob
FROM sys.MY_IMAGES)
LOOP
v_chunksize := DBMS_LOB.getchunksize (i.v_blob);
IF (v_chunksize < 32767)
THEN
v_buffer_size := v_chunksize;
ELSE
v_buffer_size := 32767;
END IF;
v_amount := v_buffer_size;
DBMS_LOB.open (i.v_blob, DBMS_LOB.lob_readonly);
v_out_file :=
UTL_FILE.fopen (
location => 'IMAGE_DIR',
filename => ( ''
|| i.v_lob_image_id
|| '_'
|| i.v_lob_image_name
|| '.JPG'),
open_mode => 'wb',
max_linesize => 32767);
WHILE v_amount >= v_buffer_size
LOOP
DBMS_LOB.read (i.v_blob,
v_amount,
v_offset,
v_buffer);
v_offset := v_offset + v_amount;
UTL_FILE.put_raw (file => v_out_file,
buffer => v_buffer,
autoflush => TRUE);
UTL_FILE.fflush (file => v_out_file);
--utl_file.new_line(file => v_out_file);
END LOOP;
UTL_FILE.fflush (v_out_file);
UTL_FILE.fclose (v_out_file);
DBMS_LOB.close (i.v_blob);
END LOOP;
END;
The main problem is that v_offset is not re-initialized to 1 (its value in the declaration section) for each image. Add
v_offset := 1;
for every image id just before the
v_chunksize := dbms_lob.getchunksize(i.v_blob);
assignment.
Moreover, problems may arise from blobs that are already open or not yet closed. To prevent these,
replace
dbms_lob.open(i.v_blob,dbms_lob.lob_readonly);
with
if ( dbms_lob.isopen(i.v_blob)=0 ) then
dbms_lob.open(i.v_blob,dbms_lob.lob_readonly);
end if;
and
replace
dbms_lob.close(i.v_blob);
with
if ( dbms_lob.isopen(i.v_blob)=1 ) then
dbms_lob.close(i.v_blob);
end if;
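Putting these points together, a minimal sketch of the loop could look like the following. It is not the original poster's procedure: the name write_blob_to_file_sketch is made up, the table is assumed to be reachable as my_images with columns image_id, image_name and image, and the read loop is bounded by DBMS_LOB.getlength instead of comparing v_amount with the buffer size, so no read is attempted past the end of a blob.

```sql
-- Sketch only: procedure name, table name and column names are assumptions.
CREATE OR REPLACE PROCEDURE write_blob_to_file_sketch
AS
   c_buffer_size CONSTANT BINARY_INTEGER := 32767;
   v_buffer      RAW (32767);
   v_amount      BINARY_INTEGER;
   v_offset      INTEGER;
   v_length      INTEGER;
   v_out_file    UTL_FILE.file_type;
BEGIN
   FOR i IN (SELECT image_id, image_name, image FROM my_images)
   LOOP
      -- Reset the per-image state; a stale offset is what broke the second file.
      v_offset := 1;
      v_length := NVL (DBMS_LOB.getlength (i.image), 0);

      v_out_file := UTL_FILE.fopen (location     => 'IMAGE_DIR',
                                    filename     => i.image_id || '_' || i.image_name || '.JPG',
                                    open_mode    => 'wb',
                                    max_linesize => 32767);

      -- Read until the whole blob is consumed; never read past its end.
      WHILE v_offset <= v_length
      LOOP
         v_amount := LEAST (c_buffer_size, v_length - v_offset + 1);
         DBMS_LOB.read (i.image, v_amount, v_offset, v_buffer);
         UTL_FILE.put_raw (file => v_out_file, buffer => v_buffer, autoflush => TRUE);
         v_offset := v_offset + v_amount;
      END LOOP;

      UTL_FILE.fclose (v_out_file);
   END LOOP;
EXCEPTION
   WHEN OTHERS
   THEN
      -- Do not leave a file handle open if something fails mid-image.
      IF UTL_FILE.is_open (v_out_file)
      THEN
         UTL_FILE.fclose (v_out_file);
      END IF;
      RAISE;
END;
/
```

Bounding the loop by the blob length also avoids the ORA-01403 that DBMS_LOB.read raises when the offset is already beyond the end of the blob, which can happen with the original WHILE condition when a blob's length is an exact multiple of the buffer size.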

plsql reading text file in an array

When I run this portion of my code, which is inside a package, I get an error (specifically at the l_cnt := l_cnt + 1 line, for some reason) and the code crashes. What could I be doing wrong? I am trying to read in a file of certs. Here's what I have so far:
v_certList arr_claims_t := arr_claims_t();
v_certLst VARCHAR2(2000);
f UTL_FILE.FILE_TYPE;
s VARCHAR2(200);
-- used for looping
l_cnt simple_integer := 0;
/*cop procedure*/
PROCEDURE COP_DATALOAD_V2 AS
arr_claims arr_claims_t;
arr_sql arr_sql_t;
BEGIN
f := UTL_FILE.FOPEN('V_COP',
'certs_file.txt',
'R',
2500);
-- populate our v_certlist of arr_claims_t
loop
utl_file.get_line(f, s);
v_certList.extend();
l_cnt := l_cnt+1;
v_certList(l_cnt) := s;
end loop;
exception
when no_data_found then
utl_file.fclose(f);
I want the array to be successfully populated from a text file (I understand this is not the best practice, but this is what I will have to do for now).
I figured out the error! The s that it was reading in was too big for the array element. This was because blank spaces in the file were included.
v_certList.extend(1);
l_cnt := l_cnt + 1;
v_certList(l_cnt) := substr(s, 0, 10);
This fixed it for me.
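For anyone hitting the same thing, here is a rough, self-contained sketch of the read loop with the trimming applied. The definition of arr_claims_t is not shown in the question, so the local type cert_list_t and its VARCHAR2(10) element size are assumptions inferred from the substr fix above; the directory and file names are the ones from the question.

```sql
DECLARE
   -- Assumed stand-in for arr_claims_t (its real definition is not shown).
   TYPE cert_list_t IS TABLE OF VARCHAR2 (10);

   v_certs cert_list_t := cert_list_t ();
   f       UTL_FILE.file_type;
   s       VARCHAR2 (2500);   -- as large as max_linesize, so get_line cannot overflow it
BEGIN
   f := UTL_FILE.fopen ('V_COP', 'certs_file.txt', 'R', 2500);

   BEGIN
      LOOP
         UTL_FILE.get_line (f, s);
         -- Strip trailing blanks and carriage returns before storing.
         s := RTRIM (s, ' ' || CHR (13));

         IF s IS NOT NULL
         THEN
            v_certs.EXTEND;
            v_certs (v_certs.COUNT) := SUBSTR (s, 1, 10);
         END IF;
      END LOOP;
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         NULL;   -- get_line signals end of file this way
   END;

   UTL_FILE.fclose (f);
END;
/
```

Using v_certs.COUNT as the index removes the need for a separate counter that has to be kept in sync with the collection.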

PL/SQL error LPX-00249 while parsing XML file from web

I'm writing a program in PL/SQL for downloading XML files from the internet, parsing them and storing the values in an Oracle database. I have one large XML file, which includes links to a huge number of smaller XMLs. There are about 6253 files to be parsed. I have a function which downloads an XML file into a CLOB and saves the data into an XMLType. This value is returned to the program and processed further. This is the function:
create or replace function get_xml_by_url
( v_url VARCHAR2
)
RETURN XMLType
AS
req SYS.UTL_HTTP.REQ;
resp SYS.UTL_HTTP.RESP;
xmlClob CLOB;
x XmlType;
l_offset number := 1;
value VARCHAR2(3999); -- URL to post to
BEGIN
BEGIN
UTL_HTTP.SET_PROXY('http://10.1.250.233:8080');
req := UTL_HTTP.BEGIN_REQUEST (url=> v_url, method => 'GET');
UTL_HTTP.SET_HEADER(req, 'User-Agent', 'Mozilla/4.0');
UTL_HTTP.SET_HEADER
( r => req
, name => 'Content-Type'
, value => 'text/xml;charset=UTF-8'
);
resp := UTL_HTTP.GET_RESPONSE(req);
DBMS_LOB.CREATETEMPORARY(xmlClob, true);
-- Loading first line
UTL_HTTP.READ_LINE(resp,value,false);
DBMS_LOB.WRITE(xmlClob,length(value),l_offset,value);
l_offset := l_offset + length(value);
-- Loading and adjusting second line
UTL_HTTP.READ_LINE(resp,value,true);
value := rtrim(value,'xmlns="http://seznam.gov.cz/ovm/datafile/seznamovm/v1">')||'>';
DBMS_LOB.WRITE(xmlClob, length(value), l_offset,value);
l_offset := l_offset + length(value);
-- Filling CLOB
LOOP
UTL_HTTP.READ_LINE(resp,value,false);
DBMS_LOB.WRITE(xmlClob,length(value),l_offset,value);
l_offset := l_offset + length(value);
END LOOP;
EXCEPTION
when UTL_HTTP.END_OF_BODY
then
UTL_HTTP.END_RESPONSE(resp);
when others
then
utl_http.end_response(resp);
END;
x := XMLType.createXML(xmlClob);
DBMS_LOB.FREETEMPORARY(xmlClob);
RETURN x;
END;
I was calling this function in a loop for all 6,253 XML files and every time I got an error, but each time in a different file, and when I ran the script again for only the one XML file which raised the error, it ran fine. I think the problem has something to do with memory, but I don't know where and why it occurs.
I'm getting following error:
Error report:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00249: invalid external ID declaration
Error at line 1
ORA-06512: in "SYS.XMLTYPE", line 5
ORA-06512: in "GET_XML_BY_URL", line 47
ORA-06512: in line 29
31011. 00000 - "XML parsing failed"
*Cause: XML parser returned an error while trying to parse the document.
*Action: Check if the document to be parsed is valid.
GET_XML_BY_URL is the name of the function described above. Does anybody have any experience with this kind of problem?
Best regards, Michal

Have PL/SQL Outputs in Real Time

Is it possible to have Outputs from PL/SQL in real time? I have a pretty huge package that runs for more than an hour and I'd like to see where the package is at a particular time.
Anyways, I currently do this with a log table, which gets filled up with hundreds of log descriptions per run, I'm just curious if this is possible.
Thanks!
This is the kind of thing I use (output can be seen in v$session and v$session_longops)...
DECLARE
lv_module_name VARCHAR2(48);
lv_action_name VARCHAR2(32);
gc_MODULE CONSTANT VARCHAR2(48) := 'MY_PROC';
-- For LONGOPS
lv_rindex BINARY_INTEGER;
lv_slno BINARY_INTEGER;
lc_OP_NAME CONSTANT VARCHAR2(64) := '['||gc_MODULE||']';
lv_sofar NUMBER;
-- This is a guess as to the amount of work we will do
lv_totalwork NUMBER;
lc_TARGET_DESC CONSTANT VARCHAR2(64) := 'Tables';
lc_UNITS CONSTANT VARCHAR2(64) := 'Rows';
CURSOR tab_cur
IS
SELECT owner, table_name
FROM all_tables;
BEGIN
<<initialisation>>
BEGIN
-- To preserve the calling stack, read the current module and action
DBMS_APPLICATION_INFO.READ_MODULE( module_name => lv_module_name
, action_name => lv_action_name );
-- Set our current module and action
DBMS_APPLICATION_INFO.SET_MODULE( module_name => gc_MODULE
, action_name => NULL );
END initialisation;
<<main>>
BEGIN
DBMS_APPLICATION_INFO.SET_ACTION( action_name => 'Part 01' );
NULL;
DBMS_APPLICATION_INFO.SET_ACTION( action_name => 'Part 02' );
FOR tab_rec IN tab_cur
LOOP
DBMS_APPLICATION_INFO.SET_CLIENT_INFO( client_info => 'Rows = ['||TO_CHAR( tab_cur%ROWCOUNT, '999,999,999' )||']' );
NULL;
END LOOP;
DBMS_APPLICATION_INFO.SET_ACTION( action_name => 'Part 03' );
--Initialising longops
lv_rindex := DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS_NOHINT;
lv_sofar := 0;
lv_totalwork := 5000; -- This is a guess, but could be actual if the query is quick
FOR tab_rec IN tab_cur
LOOP
DBMS_APPLICATION_INFO.SET_CLIENT_INFO( client_info => 'Rows = ['||TO_CHAR( tab_cur%ROWCOUNT, '999,999,999' )||']' );
lv_sofar := lv_sofar + 1;
-- Update our totalwork guess
IF lv_sofar > lv_totalwork
THEN
lv_totalwork := lv_totalwork + 500;
END IF;
DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS( rindex => lv_rindex
, slno => lv_slno
, op_name => lc_OP_NAME
, sofar => lv_sofar
, totalwork => lv_totalwork
, target_desc => lc_TARGET_DESC
, units => lc_UNITS
);
END LOOP;
-- Clean up longops
DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS( rindex => lv_rindex
, slno => lv_slno
, op_name => lc_OP_NAME
, sofar => lv_sofar
, totalwork => lv_sofar
, target_desc => lc_TARGET_DESC
, units => lc_UNITS
);
END main;
<<finalisation>>
BEGIN
-- Reset the module and action to the values that may have called us
DBMS_APPLICATION_INFO.SET_MODULE( module_name => lv_module_name
, action_name => lv_action_name );
-- Clear the client info, preventing any inter process confusion for anyone looking at it
DBMS_APPLICATION_INFO.SET_CLIENT_INFO( client_info => NULL );
END finalisation;
END;
/
I don't know if this is exactly what you want but I use dbms_application_info.set_module to see where my package is.
dbms_application_info.set_module(module_name => 'Conversion job',
action_name => 'updating table_x');
A query on v$session will show you which part of the procedure is running.
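As a small illustration, a monitoring query run from a second session might look like this. It assumes the module name 'Conversion job' from the call above and that your user has SELECT access to v$session.

```sql
-- Run from another session while the job is executing.
SELECT sid,
       serial#,
       module,
       action,
       client_info
  FROM v$session
 WHERE module = 'Conversion job';
```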
You could use autonomous transactions (as suggested in this SO for example).
This would allow you to write and commit to a log table without committing the main transaction. You would then be able to follow what happens in your main script while it is running (incidentally, it will also allow you to time/tune your batch).
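A minimal sketch of such a logging procedure is below. The table job_log and the procedure log_progress are hypothetical names chosen for illustration, not something from the question.

```sql
-- Hypothetical log table.
CREATE TABLE job_log (
   logged_at TIMESTAMP DEFAULT SYSTIMESTAMP,
   message   VARCHAR2 (4000)
);

-- Hypothetical logging procedure running in its own transaction.
CREATE OR REPLACE PROCEDURE log_progress (p_message IN VARCHAR2)
AS
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO job_log (message)
        VALUES (SUBSTR (p_message, 1, 4000));

   COMMIT;   -- ends only the autonomous transaction, not the caller's
END log_progress;
/
```

The package would call log_progress('step 3 started') at the points of interest, and another session can query job_log ordered by logged_at while the job is still running.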
Use DBMS_PIPE to write a message to a named pipe. In another session you can read the messages from the pipe. Very simple, works like a charm!
procedure sendmessage(p_pipename varchar2
,p_message varchar2) is
s number(15);
begin
begin
sys.dbms_pipe.pack_message(p_message);
exception
when others then
sys.dbms_pipe.reset_buffer;
end;
s := sys.dbms_pipe.send_message(p_pipename, 0);
if s = 1
then
sys.dbms_pipe.purge(p_pipename);
end if;
end;
function receivemessage(p_pipename varchar2
,p_timeout integer) return varchar2 is
n number(15);
chr varchar2(200);
begin
n := sys.dbms_pipe.receive_message(p_pipename, p_timeout);
if n = 1
then
return null;
end if;
sys.dbms_pipe.unpack_message(chr);
return(chr);
end;
If your long-running job is processing a large number of fairly evenly sized tasks, you may find session longops a good way of monitoring the job progress, as well as allowing you to estimate how long the job will take to finish.
DBMS_APPLICATION_INFO.set_session_longops
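To watch the progress from another session, something along these lines should work. The opname value '[MY_PROC]' is an assumption taken from the op_name used in the longer example earlier in this thread, and the query needs SELECT access to v$session_longops.

```sql
-- Progress of operations registered via SET_SESSION_LONGOPS.
SELECT sid,
       opname,
       target_desc,
       sofar,
       totalwork,
       units,
       ROUND (100 * sofar / NULLIF (totalwork, 0), 1) AS pct_done,
       time_remaining
  FROM v$session_longops
 WHERE opname = '[MY_PROC]'
   AND sofar < totalwork;
```

The time_remaining estimate is only as good as the totalwork figure passed in, so it is most useful when the tasks are fairly evenly sized, as noted above.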
If you have access to shell from PL/SQL environment you can call netcat:
BEGIN RUN_SHELL('echo "'||v_msg||'" | nc '||v_host||' '||v_port||' -w 5'); END;
/
v_host is a host running python script that reads data from socket on port v_port.
I used this design when I wrote aplogr for shell and pl/sql logs monitoring.