Why does this procedure not work properly? INSTR and SUBSTR issues - sql

I made this procedure:
create or replace procedure calculate_vertices(vertices VARCHAR2) AS
pos2 INTEGER;
pos1 INTEGER := 1;
posDash INTEGER;
lat VARCHAR2(20);
lon VARCHAR2(20);
BEGIN
loop
pos2:=INSTR(vertices,'#',pos1);
exit when pos2 = 0;
posDash := INSTR (vertices,'-',pos1);
lat := SUBSTR(vertices,pos1,pos2-(posDash+1));
dbms_output.put_line(lat);
lon := SUBSTR(vertices,posDash+1,pos2-(posDash+1));
dbms_output.put_line(lon);
pos1 := pos2+1;
end loop;
END;
Then I called it as follows:
exec calculate_vertices('122.23-243.345#222.22-323#');
The expected result is:
122.23
243.345
222.22
323
But the actual output is:
122.23-
243.345
222
323
How is this possible?
EDIT: I noticed that the variable LAT receives the same number of characters as the variable LON. Why?

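I think the line
lat := SUBSTR(vertices,pos1,pos2-(posDash+1));
is wrong; it should be
lat := SUBSTR(vertices,pos1,posDash -pos1);
The position of the # has no bearing on the length of the string before the dash: the latitude starts at pos1 and ends just before posDash, so its length is posDash - pos1. The original expression is the length of the longitude, which is why LAT always received as many characters as LON.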
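For reference, here is a minimal sketch of the whole procedure with that one line corrected. It assumes, as in the question, that every pair has the form lat-lon# and that the coordinates themselves contain no dash (a negative value would break the INSTR lookup). The comments are added for illustration.

```sql
CREATE OR REPLACE PROCEDURE calculate_vertices(vertices VARCHAR2) AS
  pos1    INTEGER := 1;  -- start of the current "lat-lon#" pair
  pos2    INTEGER;       -- position of the '#' that ends the pair
  posDash INTEGER;       -- position of the '-' inside the pair
  lat     VARCHAR2(20);
  lon     VARCHAR2(20);
BEGIN
  LOOP
    pos2 := INSTR(vertices, '#', pos1);
    EXIT WHEN pos2 = 0;
    posDash := INSTR(vertices, '-', pos1);
    -- latitude: from pos1 up to, but not including, the dash
    lat := SUBSTR(vertices, pos1, posDash - pos1);
    DBMS_OUTPUT.PUT_LINE(lat);
    -- longitude: from just after the dash up to, but not including, the '#'
    lon := SUBSTR(vertices, posDash + 1, pos2 - (posDash + 1));
    DBMS_OUTPUT.PUT_LINE(lon);
    pos1 := pos2 + 1;
  END LOOP;
END;
/
```

For the input '122.23-243.345#222.22-323#' the first pass has pos1 = 1, posDash = 7 and pos2 = 15, so the latitude length is 6 and the longitude length is 7, which gives 122.23 and 243.345 as expected.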

You can use REGEXP_SUBSTR and REGEXP_COUNT in a single SELECT instead of your PL/SQL block.
WITH t (str)
AS (SELECT '122.23-243.345#222.22-323#'
FROM dual)
SELECT TRIM(REGEXP_SUBSTR(str, '[^-#]+', 1, LEVEL)) str
FROM t
CONNECT BY LEVEL <= REGEXP_COUNT (str, '[^-#]+');

Related

Oracle remove html from clob fields

I have a simple function to convert html blob to plain text
FUNCTION HTML_TO_TEXT(html IN CLOB) RETURN CLOB
IS v_return CLOB;
BEGIN
select utl_i18n.unescape_reference(regexp_replace(html, '<.+?>', ' ')) INTO v_return from dual;
return (v_return);
END;
It is called like this:
SELECT A, B, C, HTML_TO_TEXT(BLobField) FROM t1
Everything works fine until BLobField contains more than 4000 characters; then I get:
ORA-01704: string literal too long
01704. 00000 - "string literal too long"
*Cause: The string literal is longer than 4000 characters.
*Action: Use a string literal of at most 4000 characters.
Longer values may only be entered using bind variables.
I tried to avoid string literals inside the function by using variables, but nothing changes:
FUNCTION HTML_TO_TEXT(html IN CLOB) RETURN CLOB
IS v_return CLOB;
"stringa" CLOB;
BEGIN
SELECT regexp_replace(html, '<.+?>', ' ') INTO "stringa" FROM DUAL;
select utl_i18n.unescape_reference("stringa") INTO v_return from dual;
return (v_return);
END;
Do not use regular expressions to parse HTML. If you want to extract the text then use an XML parser:
SELECT a,
b,
c,
UTL_I18N.UNESCAPE_REFERENCE(
XMLQUERY(
'//text()'
PASSING XMLTYPE(blobfield, 1)
RETURNING CONTENT
).getStringVal()
) AS text
FROM t1
Which will work where the extracted text is 4000 characters or less (since XMLTYPE.getStringVal() will return a VARCHAR2 data type and UTL_I18N.UNESCAPE_REFERENCE accepts a VARCHAR2 argument).
If you want to get it to work on CLOB values then you can still use XMLQUERY and getClobVal() but UTL_I18N.UNESCAPE_REFERENCE still only works on VARCHAR2 input (and not CLOBs) so you will need to split the CLOB into segments and parse those and concatenate them once you are done.
Something like:
CREATE FUNCTION html_to_text(
i_xml IN XMLTYPE
) RETURN CLOB
IS
v_text CLOB;
v_output CLOB;
str VARCHAR2(4000);
len PLS_INTEGER;
pos PLS_INTEGER := 1;
lim CONSTANT PLS_INTEGER := 4000;
BEGIN
SELECT XMLQUERY(
'//text()'
PASSING i_xml
RETURNING CONTENT
).getClobVal()
INTO v_text
FROM DUAL;
len := LENGTH(v_text);
WHILE pos <= len LOOP
str := DBMS_LOB.SUBSTR(v_text, lim, pos);
v_output := v_output || UTL_I18N.UNESCAPE_REFERENCE(str);
pos := pos + lim;
END LOOP;
RETURN v_output;
END;
/
However, you probably want to make it more robust and check if you are going to split the string in the middle of an escaped XML character.
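One possible way to handle that boundary case is sketched below. When a full chunk ends with an & that has no closing ; after it, the chunk is cut just before the & so the whole entity moves to the next chunk. The function name unescape_clob and the constant c_tail (an assumed upper bound on the length of a single entity) are hypothetical and not part of the answer above. This is a heuristic sketch, not a complete entity parser.

```sql
CREATE OR REPLACE FUNCTION unescape_clob(p_text IN CLOB) RETURN CLOB
IS
  c_lim   CONSTANT PLS_INTEGER := 4000;  -- chunk size, as in the function above
  c_tail  CONSTANT PLS_INTEGER := 12;    -- assumption: no entity is longer than this
  v_out   CLOB;
  v_chunk VARCHAR2(4000 CHAR);
  v_amp   PLS_INTEGER;
  v_pos   PLS_INTEGER := 1;
  v_len   PLS_INTEGER := NVL(DBMS_LOB.GETLENGTH(p_text), 0);
BEGIN
  WHILE v_pos <= v_len LOOP
    v_chunk := DBMS_LOB.SUBSTR(p_text, c_lim, v_pos);
    -- position of the last '&' in the chunk (0 if there is none)
    v_amp := INSTR(v_chunk, '&', -1);
    -- a full chunk that ends in an unterminated '&...' probably splits an entity
    IF LENGTH(v_chunk) = c_lim
       AND v_amp > c_lim - c_tail
       AND INSTR(v_chunk, ';', v_amp) = 0
    THEN
      v_chunk := SUBSTR(v_chunk, 1, v_amp - 1);
    END IF;
    -- advance by the characters actually consumed, before unescaping
    v_pos := v_pos + LENGTH(v_chunk);
    v_out := v_out || UTL_I18N.UNESCAPE_REFERENCE(v_chunk);
  END LOOP;
  RETURN v_out;
END;
/
```

Because the cut is made only when the & lies within the last c_tail characters of a full chunk, the shortened chunk is never empty and the loop always advances.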

How to iterate over binary string in Oracle?

declare
str varchar2(2000) := :inputstr;
v_len number;
currChar CHAR(1);
begin
v_len := length(str);
for i in 1..v_len
Loop
currChar := substr(str,i,1);
if currChar = 1 then
dbms_output.put_line('curr index' || i);
end if;
End loop;
end;
When I pass '000111000' as input to the IN_STRING variable, it trims the string and behaves very unusually. Please suggest some good approaches to iterating over binary strings like this. I am expecting 4, 5, 6 as output from the above operation.
EDIT1:
Please don't assign the string directly, as in str varchar2(2000) := '000111000';
Instead, input it from a bind variable as I mentioned above.
Your code works so long as you pass in a VARCHAR2 data type (and not a NUMBER).
You can also tidy up the code passing in the bind variable only once and using CONSTANTs to hold the values that are constant:
VARIABLE in_string VARCHAR2;
DECLARE
c_string CONSTANT VARCHAR2(200) := :in_string;
c_length CONSTANT PLS_INTEGER := LENGTH(c_string);
v_out CHAR(1);
BEGIN
FOR i IN 1..c_length
LOOP
v_out := SUBSTR(c_string,i,1) ;
DBMS_OUTPUT.PUT_LINE(v_out);
END LOOP;
END;
/
Which outputs:
0
0
1
1
1
0
0
0
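To get the indexes the question asks for (4, 5, 6 for '000111000'), the same loop can compare each character with '1' and print the loop counter. This is a minimal SQL*Plus sketch that assumes the bind variable is declared as VARCHAR2. The EXEC line only supplies a sample value for illustration.

```sql
SET SERVEROUTPUT ON
VARIABLE in_string VARCHAR2(200)
EXEC :in_string := '000111000'

DECLARE
  c_string CONSTANT VARCHAR2(200) := :in_string;
BEGIN
  -- NVL guards against a NULL (empty) bind value
  FOR i IN 1 .. NVL(LENGTH(c_string), 0) LOOP
    -- compare with the character '1', not the number 1
    IF SUBSTR(c_string, i, 1) = '1' THEN
      DBMS_OUTPUT.PUT_LINE('curr index ' || i);
    END IF;
  END LOOP;
END;
/
```

With the sample value this prints curr index 4, curr index 5 and curr index 6.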
It shouldn't behave unusually, unless the datatype of the in_string variable is NUMBER (in which case leading zeros have no meaning); if so, switch to VARCHAR2.
Illustration:
NUMBER variable datatype
value you enter
result - really, missing leading zeros
Otherwise, it works OK (this is SQL*Plus so I used substitution variable):
SQL> DECLARE
2 v_length NUMBER (10);
3 v_out VARCHAR2 (20);
4 BEGIN
5 v_length := LENGTH ( '&&in_string');
6
7 FOR i IN 1 .. v_length
8 LOOP
9 v_out := SUBSTR ( '&&in_string', i, 1);
10 DBMS_OUTPUT.PUT_LINE (v_out);
11 END LOOP;
12 END;
13 /
Enter value for in_string: 00111000
0
0
1
1
1
0
0
0
PL/SQL procedure successfully completed.
Another option (if you're interested in it) doesn't require PL/SQL:
SQL> SELECT SUBSTR ( '&&in_string', LEVEL, 1) val
2 FROM DUAL
3 CONNECT BY LEVEL <= LENGTH ( '&&in_string');
V
-
0
0
1
1
1
0
0
0
8 rows selected.
SQL>

Oracle. Not valid ascii value of regex result

I'd like to edit a string: from each pair of adjacent digits, produce a digit and a letter (00 -> 0a, 01 -> 0b, 23 -> 2d, etc.).
For example: 111324 -> 1b1d2e.
Here is my code:
set serveroutput on size unlimited
declare
str varchar2(128);
function convr(num varchar2) return varchar2 is
begin
return chr(ascii(num)+49);
-- return chr(ascii(num)+49)||'<-'||(ascii(num)+49)||','||ascii(num)||','||num||'|';
end;
function replace_dd(str varchar2) return varchar2 is
begin
return regexp_replace(str,'((\d)(\d))','\2'||convr('\3'));
end;
begin
str := '111324';
Dbms_Output.Put_Line(str);
Dbms_Output.Put_Line(replace_dd(str));
end;
But I get the following string: '112'.
When I checked the result using the commented-out return statement, I got:
'1<-141,92,1|1<-141,92,3|2<-141,92,4|'.
ascii(num) does not depend on num. It always behaves like ascii('\'), which is 92; adding 49 gives 141, which is outside the ASCII table. But num itself is printed correctly.
How can I get the correct values? Or is there another way to solve this?
What is happening is that the replacement string is expanded first, and only after it is fully processed, any remaining backreferences like \2 are replaced by string fragments. So convr('\3') is processed first, and at this stage '\3' is a literal. ascii() returns the ascii code of the FIRST character of whatever string it receives as argument. So the 3 plays no role, you only get ascii('\') as you noticed. Then your user-defined function is evaluated and plugged back into the concatenation... by now there is no \3 left in the replacement string.
Exercise: Try to explain/understand why
regexp_replace('abcdef', '(b).*(e)', '\2' || upper('\1'))
is aebf and not aeBf. (Hint: what is the return from upper('\1') by itself, unrelated to anything else?)
You could split the input string into component characters, apply your transformation on those with even index and combine the string back (all in SQL, no need for loops and such). Something like this (done in plain SQL, you can rewrite it into your function if you like):
with
inputs ( str ) as (
select '111324' from dual union all
select '372' from dual
),
singletons ( str, idx, ch ) as (
select str, level, substr(str, level, 1)
from inputs
connect by level <= length(str)
and prior str = str
and prior sys_guid() is not null
)
select str,
listagg(case mod(idx, 2) when 1 then ch else chr(ascii(ch)+49) end, '')
within group (order by idx)
as modified_str
from singletons
group by str
;
STR MODIFIED_STR
------ --------------
111324 1b1d2e
372 3h2
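If a PL/SQL function is preferred, the same even-index transformation can be sketched with a plain character loop. This assumes, like the query above, that the input consists only of digits, so "second digit of each pair" is the same as "character at an even position". The function name replace_dd is reused from the question for illustration.

```sql
SET SERVEROUTPUT ON
DECLARE
  FUNCTION replace_dd(p_str VARCHAR2) RETURN VARCHAR2 IS
    v_out VARCHAR2(4000);
    v_ch  VARCHAR2(1 CHAR);
  BEGIN
    FOR i IN 1 .. NVL(LENGTH(p_str), 0) LOOP
      v_ch := SUBSTR(p_str, i, 1);
      -- characters at even positions are shifted: '0' -> 'a', '1' -> 'b', ...
      IF MOD(i, 2) = 0 THEN
        v_ch := CHR(ASCII(v_ch) + 49);
      END IF;
      v_out := v_out || v_ch;
    END LOOP;
    RETURN v_out;
  END;
BEGIN
  DBMS_OUTPUT.PUT_LINE(replace_dd('111324'));  -- prints 1b1d2e
  DBMS_OUTPUT.PUT_LINE(replace_dd('372'));     -- prints 3h2
END;
/
```

Here ASCII receives the actual character rather than the literal backreference text, which avoids the problem described in the answer above.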
The following code prefixes each non-digit character with 5 and resolves the issue.
set serveroutput on size unlimited
declare
str varchar2(128);
str1 varchar2(128);
function replace_a(str varchar2) return varchar2 is
begin
return regexp_replace(str,'(\D)','5\1');
end;
function convr(str varchar2) return varchar2 is
ind number;
ret varchar2(128);
begin
Dbms_Output.Put_Line(str);
--return chr(ascii(num)+49)||'<-'||(ascii(num)+49)||','||ascii(num)||','||num||'|';
ind := 1 ;
ret :=str;
loop
ind := regexp_instr(':'||ret,'(#\d#)',ind) ;
exit when ind=0;
Dbms_Output.Put_Line(ind);
ret := substr(ret,1,ind-2)||chr(ascii(substr(ret,ind,1))+49)||substr(ret,ind+2);
SYS.Dbms_Output.Put_Line(ret);
end loop;
return ret;
end;
function replace_dd(str varchar2) return varchar2 is
begin
return convr(regexp_replace(str,'((\d)(\d))','\2#\3#'));
end;
begin
str := '11a34';
Dbms_Output.Put_Line(str);
Dbms_Output.Put_Line(replace_a(str));
Dbms_Output.Put_Line(replace_dd(replace_a(str)));
end;
result:
11a34
115a34
1#1#5a3#4#
3
1b5a3#4#
7
1b5a3e
1b5a3e

Oracle retrieve only number in string

In Oracle pl/sql, how do I retrieve only numbers from string.
e.g. 123abc -> 123 (remove alphabet)
e.g. 123*-*abc -> 123 (remove all special characters too)
your_string := regexp_replace(your_string, '\D')
Several options, but this should work:
select regexp_replace('123*-*abc', '[^[:digit:]]', '') from dual
This removes all non-digits from the input.
If using in pl/sql, you could do an assignment to a variable:
declare
l_num number;
l_string varchar2(20) := '123*-*abc';
begin
l_num := regexp_replace(l_string, '[^[:digit:]]', '');
dbms_output.put_line('Num is: ' || l_num);
end;
Output:
Num is: 123
Try this:
select regexp_replace(value, '[A-Za-z]') from dual;
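The patterns suggested above do not all behave the same way on the second example from the question. A quick side-by-side check, using the literal '123*-*abc' in place of a column or variable:

```sql
SELECT regexp_replace('123*-*abc', '\D')           AS non_digits_removed,  -- 123
       regexp_replace('123*-*abc', '[^[:digit:]]') AS posix_class,         -- 123
       regexp_replace('123*-*abc', '[A-Za-z]')     AS letters_only         -- 123*-*
FROM dual;
```

The first two remove every character that is not a digit. The last one removes only letters, so special characters such as * and - remain.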

How do I expand a string with wildcards in PL/SQL using string functions

I have a column that stores a 4-character string containing 4 or fewer wildcard characters (e.g. ????, ??01, 0??1, etc.). For each such string, like 0??1, I have to insert into another table the values 0001 to 0991; for the string ??01, the values will be 0001 to 9901; for the string ????, the values will be 0000 to 9999, and so on.
How could I accomplish this using PL/SQL and string functions?
EDIT
The current code is:
declare
v_rule varchar2(50) := '????52132';
v_cc varchar2(50);
v_nat varchar2(50);
v_wild number;
n number;
begin
v_cc := substr(v_rule,1,4);
v_nat := substr(v_rule,5);
dbms_output.put_line (v_cc || ' '|| v_nat);
if instr(v_cc, '????') <> 0 then
v_wild := 4;
end if;
n := power(10,v_wild);
for i in 0 .. n - 1 loop
dbms_output.put_line(substr(lpad(to_char(i),v_wild,'0' ),0,4));
end loop;
end;
/
Would something like the following help?
BEGIN
FOR source_row IN (SELECT rule FROM some_table)
LOOP
INSERT INTO some_other_table (rule_match)
WITH numbers AS (SELECT LPAD(LEVEL - 1, 4, '0') AS num FROM DUAL CONNECT BY LEVEL <= 10000)
SELECT num FROM numbers WHERE num LIKE REPLACE(source_row.rule, '?', '_');
END LOOP;
END;
/
This assumes you have a table called some_table with a column rule, which contains text such as ??01, 0??1 and ????. It inserts into some_other_table all numbers from 0000 to 9999 that match these wild-carded patterns.
The subquery
SELECT LPAD(LEVEL - 1, 4, '0') AS num FROM DUAL CONNECT BY LEVEL <= 10000
generates all numbers in the range 0000 to 9999. We then keep only the numbers in this list that match the pattern, using LIKE. Note that _ is the single-character wildcard when using LIKE, not ?.
I set this up with the following data:
CREATE TABLE some_table (rule VARCHAR2(4));
INSERT INTO some_table (rule) VALUES ('??01');
INSERT INTO some_table (rule) VALUES ('0??1');
INSERT INTO some_table (rule) VALUES ('????');
COMMIT;
CREATE TABLE some_other_table (rule_match VARCHAR2(4));
After running the above PL/SQL block, the table some_other_table had 10200 rows in it, all the numbers that matched all three of the patterns given.
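The row-by-row loop is not strictly required. As a sketch, the same expansion can be written as a single INSERT ... SELECT that joins each rule to the generated numbers. It assumes the some_table and some_other_table definitions shown above.

```sql
INSERT INTO some_other_table (rule_match)
WITH numbers AS (
  -- all 4-digit strings from 0000 to 9999
  SELECT LPAD(LEVEL - 1, 4, '0') AS num
  FROM dual
  CONNECT BY LEVEL <= 10000
)
SELECT n.num
FROM some_table s
JOIN numbers n
  ON n.num LIKE REPLACE(s.rule, '?', '_');  -- '?' becomes LIKE's one-character wildcard
```

For the three sample rules this produces 100 + 100 + 10000 = 10200 rows, the same as the loop.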
Replace * with % and ? with _, then use a LIKE clause with the resulting values.
To expand on @Oleg Dok's answer, which uses the little-known fact that in a LIKE pattern an underscore behaves like % but matches exactly one character: using PL/SQL, I think the following is the simplest way to do it. A good description of how to use connect by is here.
declare
cursor c_min_max( Crule varchar2 ) is
select to_number(min(numb)) as min_n, to_number(max(numb)) as max_n
from ( select '0000' as numb
from dual
union
select lpad(level, 4, '0') as numb
from dual
connect by level <= 9999 )
where to_char(numb) like replace(Crule, '?', '_');
t_mm c_min_max%rowtype;
l_rule varchar2(4) := '?091';
begin
open c_min_max(l_rule);
fetch c_min_max
into t_mm;
close c_min_max;
for i in t_mm.min_n .. t_mm.max_n loop
dbms_output.put_line(lpad(i, 4, '0'));
end loop;
end;
/
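Note that looping from the minimum to the maximum visits every number in between, including values that do not match the rule (for '?091' that is 0091, 0092, 0093 and so on up to 9091). A sketch that iterates over only the matching values, using a cursor FOR loop over the same generated list:

```sql
SET SERVEROUTPUT ON
DECLARE
  l_rule VARCHAR2(4) := '?091';
BEGIN
  FOR r IN (
    SELECT num
    FROM  (SELECT LPAD(LEVEL - 1, 4, '0') AS num
           FROM dual
           CONNECT BY LEVEL <= 10000)
    WHERE num LIKE REPLACE(l_rule, '?', '_')
    ORDER BY num
  ) LOOP
    DBMS_OUTPUT.PUT_LINE(r.num);
  END LOOP;
END;
/
```

For '?091' this prints ten values: 0091, 1091, 2091 and so on up to 9091.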