Oracle - need to extract text between given strings - sql

Example - need to extract everything between "Begin begin" and "End end". I tried this way:
with phrases as (
select 'stackoverflow is awesome. Begin beginHello, World!End end It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Begin begin)([[:print:]]+)(End end[[:print:]]+)', '\2')
from phrases
;
Result: Hello, World!
However, it fails if my text contains newline characters. Any tips on how to fix this so that I can also extract text containing newlines?
[edit]How does it fail:
with phrases as (
select 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Begin begin)([[:print:]]+)(End end[[:print:]]+)', '\2')
from phrases
;
Result:
stackoverflow is awesome. Begin beginHello, World!End end It has
everything!
Should be:
Hello,
World!
[edit]
Another issue. Consider this sample:
WITH phrases AS (
SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!End endTESTESTESTES' AS phrase
FROM dual
)
SELECT REGEXP_REPLACE(phrase, '.+Begin begin(.+)End end.+', '\1', 1, 1, 'n')
FROM phrases;
Result:
Hello,
World!End end It has everything!
So it matches the last occurrence of the end string, and this is not what I want. The substring should be extracted up to the first occurrence of my label, so the result should be:
Hello,
World!
Everything after the first occurrence of the label string should be ignored. Any ideas?

I'm not that familiar with the POSIX [[:print:]] character class, but I got your query working using the . wildcard. You need to specify the n match parameter in REGEXP_REPLACE() so that . can match the newline character:
WITH phrases AS (
SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!' AS phrase
FROM dual
)
SELECT REGEXP_REPLACE(phrase, '.+Begin begin(.+)End end.+', '\1', 1, 1, 'n')
FROM phrases;
I used the \1 backreference as I didn't see the need to capture the other groups from the regular expression. It might also be a good idea to use the * quantifier (instead of +) in case there is nothing preceding or following the delimiters. If you want to capture all of the groups then you can use the following:
WITH phrases AS (
SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!' AS phrase
FROM dual
)
SELECT REGEXP_REPLACE(phrase, '(.+Begin begin)(.+)(End end.+)', '\2', 1, 1, 'n')
FROM phrases;
UPDATE - FYI, I tested with [[:print:]] and it doesn't work. This is not surprising, since [[:print:]] is supposed to match printable characters only. It doesn't match anything with an ASCII value below 32 (the space character), and the newline is ASCII 10. You need to use the . wildcard instead.
UPDATE #2 - per the update to the question - I don't think a regex will work the way you want it to. Adding the lazy quantifier to (.+) has no effect, and Oracle regular expressions don't have lookahead. There are a couple of things you might do. One is to use INSTR() and SUBSTR():
WITH phrases AS (
SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!End endTESTTESTTEST' AS phrase
FROM dual
)
SELECT SUBSTR(phrase, str_start, str_end - str_start) FROM (
SELECT INSTR(phrase, 'Begin begin') + LENGTH('Begin begin') AS str_start
, INSTR(phrase, 'End end') AS str_end, phrase
FROM phrases
);
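One caveat with the query above: INSTR(phrase, 'End end') searches from the start of the string, so an end label that happens to appear before the start label would be picked up instead. A minimal sketch of a variant that begins the search at str_start, using the optional third (position) argument of INSTR. The sample phrase and the bounds CTE name are made up for illustration:

```sql
WITH phrases AS (
  -- made-up sample: an extra 'End end' appears BEFORE the start label
  SELECT 'End end is a label. Begin beginHello,
World!End end It has everything!End endTEST' AS phrase
  FROM dual
),
bounds AS (
  -- position of the first character after the start label
  SELECT phrase,
         INSTR(phrase, 'Begin begin') + LENGTH('Begin begin') AS str_start
  FROM phrases
)
-- look for the end label only from str_start onwards
SELECT SUBSTR(phrase,
              str_start,
              INSTR(phrase, 'End end', str_start) - str_start) AS extracted
FROM bounds;
```

If the end label is missing after str_start, INSTR returns 0, the computed length is negative, and SUBSTR returns NULL.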
Another is to combine INSTR() and SUBSTR() with a regular expression:
WITH phrases AS (
SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!End endTESTTESTTEST' AS phrase
FROM dual
)
SELECT REGEXP_REPLACE(SUBSTR(phrase, 1, INSTR(phrase, 'End end') + LENGTH('End end')), '.+Begin begin(.+)End end.+', '\1', 1, 1, 'n')
FROM phrases;
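For comparison, here is a sketch of a regex-only variant, assuming Oracle 11g or later, where REGEXP_SUBSTR accepts a sixth argument that selects a subexpression. It drops the leading and trailing .+ and makes the capture lazy, so the match should stop at the first End end. Treat it as something to test on your own version rather than a guaranteed fix:

```sql
WITH phrases AS (
  SELECT 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!End endTESTTESTTEST' AS phrase
  FROM dual
)
-- arguments: source, pattern, start position, occurrence,
--            match parameter ('n' lets . match newlines), subexpression number
SELECT REGEXP_SUBSTR(phrase, 'Begin begin(.*?)End end', 1, 1, 'n', 1) AS extracted
FROM phrases;
```

Because the pattern no longer has to consume the whole string, nothing before Begin begin or after the first End end takes part in the match.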

Try this regex:
([[:print:]]+Begin begin)(.+?)(End end[[:print:]]+)
Sample usage:
SELECT regexp_replace(
phrase ,
'([[:print:]]+Begin begin)(.+?)(End end[[:print:]]+)',
'\2',
1, -- Start at the beginning of the phrase
0, -- Replace ALL occurrences
'n' -- Let the dot metacharacter match the newline character
)
FROM
(SELECT 'stackoverflow is awesome. Begin beginHello, '
|| chr(10)
|| ' World!End end It has everything!' AS phrase
FROM DUAL
)
By default the dot metacharacter (.) matches any character in the database character set except the newline character. For the dot to match newlines as well, the match_parameter passed to regexp_replace must contain the n switch.
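A small sketch of the effect of the n switch, using a three-character string made up for illustration (a, a newline, b):

```sql
SELECT
  -- without 'n' the dot does not match CHR(10), so there is no match (NULL)
  REGEXP_SUBSTR('a' || CHR(10) || 'b', 'a.b')            AS without_n,
  -- with 'n' the dot matches the newline and the whole string is returned
  REGEXP_SUBSTR('a' || CHR(10) || 'b', 'a.b', 1, 1, 'n') AS with_n
FROM dual;
```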

In order to get your second option to work you need to add [[:space:][:print:]]* as follows:
with phrases as (
select 'stackoverflow is awesome. Begin beginHello,
World!End end It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Begin begin)([[:print:]]+[[:space:][:print:]]*)(End end[[:print:]]+)', '\2')
from phrases
;
But still it will break if you have more \n, for instance it won't work for
with phrases as (
select 'stackoverflow is awesome. Begin beginHello,
World!End end
It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Begin begin)([[:print:]]+[[:space:][:print:]]*)(End end[[:print:]]+)', '\2')
from phrases
;
Then you need to add [[:space:][:print:]]* to the last group as well:
with phrases as (
select 'stackoverflow is awesome. Begin beginHello,
World!End end
It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Begin begin)([[:print:]]+[[:space:][:print:]]*)(End end[[:print:]]+[[:space:][:print:]]*)', '\2')
from phrases
;
The problem with regexes is that you have to anticipate the variations in your input and write a pattern that matches all of them. If something falls outside that scope, you'll have to revisit the regex and add the new case.
You can find extra info here.

Description.........: This is a function similar to the one that was available on PRIME Computers
back in the late '80s and '90s. This function will parse out a segment of a string
based on a supplied delimiter. The delimiter can be any character.
Usage:
Field(i_string => 'This.is.a.cool.function'
,i_delimiter => '.'
,i_occurance => 2
,i_return_instances => 2)
Return value = is.a
FUNCTION field(i_string VARCHAR2
,i_delimiter VARCHAR2
,i_occurance NUMBER DEFAULT 1
,i_return_instances NUMBER DEFAULT 1) RETURN VARCHAR2 IS
--
v_delimiter VARCHAR2(1);
n_end_pos NUMBER;
n_start_pos NUMBER := 1;
n_delimiter_pos NUMBER;
n_seek_pos NUMBER := 1;
n_tbl_index PLS_INTEGER := 0;
n_return_counter NUMBER := 0;
v_return_string VARCHAR2(32767);
TYPE tbl_type IS TABLE OF VARCHAR2(4000) INDEX BY PLS_INTEGER;
tbl tbl_type;
e_no_delimiters EXCEPTION;
v_string VARCHAR2(32767) := i_string || i_delimiter;
BEGIN
BEGIN
LOOP
----------------------------------------
-- Search for the delimiter in the
-- string
----------------------------------------
n_delimiter_pos := instr(v_string, i_delimiter, n_seek_pos);
--
IF n_delimiter_pos = length(v_string) AND n_tbl_index = 0 THEN
------------------------------------------
-- The delimiter you are looking for is
-- not in this string.
------------------------------------------
RAISE e_no_delimiters;
END IF;
--
EXIT WHEN n_delimiter_pos = 0;
n_start_pos := n_seek_pos;
n_end_pos := n_delimiter_pos - n_seek_pos;
n_seek_pos := n_delimiter_pos + 1;
--
n_tbl_index := n_tbl_index + 1;
-----------------------------------------------
-- Store the segments of the string in a tbl
-----------------------------------------------
tbl(n_tbl_index) := substr(i_string, n_start_pos, n_end_pos);
END LOOP;
----------------------------------------------
-- Prepare the results for return voyage
----------------------------------------------
v_delimiter := NULL;
FOR a IN tbl.first .. tbl.last LOOP
IF a >= i_occurance AND n_return_counter < i_return_instances THEN
v_return_string := v_return_string || v_delimiter || tbl(a);
v_delimiter := i_delimiter;
n_return_counter := n_return_counter + 1;
END IF;
END LOOP;
--
EXCEPTION
WHEN e_no_delimiters THEN
v_return_string := i_string;
END;
RETURN TRIM(v_return_string);
END;

Related

PL/SQL LOOP - Return a row with mixed capital letters

I know this question probably has an easy answer, but I can't get my head around it.
I'm trying to, inside a loop, return a string (in the SQL output) with mixed capital and non-capital letters.
Example: If a name in the row is John Doe, the output will print JoHn DoE, or MiXeD CaPiTaL.
This is my code (which I know is poorly written, but I need to use the cursor!):
declare
aa_ VARCHAR2(2000);
bb_ NUMBER:=0;
cc_ NUMBER:=0;
CURSOR cur_ IS
SELECT first_name namn, last_name efternamn FROM person_info
;
begin
FOR rec_ IN cur_ LOOP
dbms_output.put_line(rec_.namn);
FOR bb_ IN 1.. LENGTH(rec_.namn) LOOP
dbms_output.put(UPPER(SUBSTR(rec_.namn,bb_,1)));
cc_ := MOD(bb_,2);
IF cc_ = 0 THEN
dbms_output.put(UPPER(SUBSTR(rec_.namn,cc_,1)));
ELSE
dbms_output.put(LOWER(SUBSTR(rec_.namn,2)));
END IF;
end loop;
dbms_output.new_line;
end loop;
end;
Again, I know the code is really bad but yeah, trying to learn!
Thanks in advance :)
You may use plain SQL for this purpose, without any loop:
Split input text by pairs separated with some special character (that doesn't appear in the text).
Use initcap SQL function to turn each first letter to upper case.
Remove the special separator.
with a as (
select 'John Doe' as a
from dual
union all
select 'mixed capital and non-capital letters'
from dual
)
select
replace(
initcap(
/*Convert case*/
regexp_replace(a, '([a-zA-Z]{2})',
/*Add ASCII nul after each two letters*/
'\1' || chr(0)
)
),
/*Remove ASCII nul to revert changes*/
chr(0)
) as mixed_case
from a
| MIXED_CASE |
| :------------------------------------ |
| JoHn DoE |
| MiXeD CaPiTaL AnD NoN-CaPiTaL LeTtErS |
db<>fiddle here
I'd put the text transformation into a function, rather than including all the logic in the body of the loop.
declare
cursor c_people is
select 'John' as first_name, 'Doe' as last_name from dual union all
select 'Mixed', 'Capitals'
from dual;
function mixCaps(inText varchar2) return varchar2
is
letter varchar2(1);
outText varchar2(4000);
begin
for i in 1..length(inText) loop
letter := substr(inText,i,1);
outText := outText ||
case mod(i,2)
when 0 then lower(letter)
else upper(letter)
end;
end loop;
return outText;
end mixCaps;
begin
for person in c_people loop
dbms_output.put_line(mixCaps(person.first_name|| ' ' || person.last_name));
end loop;
end;
If performance was critical and you had large numbers of values, you might consider inlining the function using pragma inline (but then you wouldn't be using dbms_output anyway).
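As a rough sketch of what the inlining hint could look like: PRAGMA INLINE is placed immediately before the statement that contains the call. The l_result variable is made up for illustration, and whether the compiler actually inlines the call also depends on the PL/SQL optimization level, so treat this as a hint rather than a guarantee:

```sql
DECLARE
  -- variables must be declared before nested subprograms
  l_result VARCHAR2(4000);

  FUNCTION mixCaps(inText VARCHAR2) RETURN VARCHAR2 IS
    outText VARCHAR2(4000);
  BEGIN
    FOR i IN 1 .. LENGTH(inText) LOOP
      -- odd positions become upper case, even positions lower case
      outText := outText ||
                 CASE MOD(i, 2)
                   WHEN 0 THEN LOWER(SUBSTR(inText, i, 1))
                   ELSE UPPER(SUBSTR(inText, i, 1))
                 END;
    END LOOP;
    RETURN outText;
  END mixCaps;
BEGIN
  -- ask the compiler to inline mixCaps in the next statement
  PRAGMA INLINE (mixCaps, 'YES');
  l_result := mixCaps('John Doe');
  DBMS_OUTPUT.PUT_LINE(l_result);
END;
/
```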
For learning purposes you can use the code below (it is not efficient; it is meant to show off some Oracle features).
Steps:
1. split the word into letters using connect by level
2. get the letter at position N (level) using a regular expression ('.?')
3. convert every letter at an odd position to upper case
4. concatenate the letters back using listagg, ordered by letter position
The function is declared in the with clause, so you can apply it to any SQL table.
with
function mixed(iv_name varchar2) return varchar2 as
l_result varchar2(4000);
begin
with src_letters as
(select REGEXP_SUBSTR(iv_name, '.?', level) as letter
,level lvl
from dual
connect by level <= length(iv_name)),
mixed_letters as
(select case
when mod(lvl, 2) = 0 then
letter
else
upper(letter)
end as letter
,lvl
from src_letters
order by lvl)
select listagg(letter) within group(order by lvl)
into l_result
from mixed_letters;
return l_result;
end;
select mixed('text') from dual

Oracle SQL Developer Query on recommended password setup

I am setting up a random password for a user using:
select
dbms_random.string('L',2) || dbms_random.string('X',6) || '1!' as deflvrpwd,
'${access_request_cri_acc_cas9}' as ACNTDN
from dual
New requirement
New Hire Details:
Name :John Doe
Region: America
WDID : 876214
The WDID should be reversed and split, with the region in the middle and the letter A replaced with the # symbol.
Following that formula, the password should read:
412#meric#s678
Please suggest how to do this; the attributes are the same as mentioned above.
Thank You
Here's one option; read comments within code.
WITH
  -- sample data
  test (name, region, wdid)
  AS
    (SELECT 'John Doe', 'America', '876214' FROM DUAL),
  temp
  AS
    -- reverse WDID; don't use undocumented REVERSE function
    -- replace "A" (or "a") with "#" in REGION
    ( SELECT name,
             REPLACE (REPLACE (region, 'A', '#'), 'a', '#') new_region,
             LISTAGG (letter, '') WITHIN GROUP (ORDER BY lvl DESC) new_wdid
        FROM ( SELECT SUBSTR (wdid, LEVEL, 1) letter,
                      LEVEL lvl,
                      name,
                      region
                 FROM test
              CONNECT BY LEVEL <= LENGTH (wdid))
    GROUP BY name, region)
-- finally
SELECT SUBSTR (new_wdid, 1, 3) || new_region || SUBSTR (new_wdid, 4) AS result
  FROM temp;
RESULT
--------------------------------------------------------------------------------
412#meric#678
I don't know where s in your result comes from (this: 412#meric#s678).
There's a small cost in context switching between SQL and PL/SQL, but this doesn't sound like a high-volume or performance-critical thing, so you might find it cleaner to put the logic in a function:
create or replace function get_password (p_wdid varchar2, p_region varchar2)
return varchar2 as
l_split pls_integer;
l_password varchar2(30);
begin
-- split WDID halfway, but allow for odd lengths
l_split := floor(length(p_wdid)/2);
-- iterate over the WDID in reverse
for i in reverse 1..length(p_wdid) LOOP
-- when we reach the split point, append the modified region
if i = l_split then
l_password := l_password || translate(p_region, 'Aax', '##x');
end if;
-- append each WDID character, in reverse order
l_password := l_password || substr(p_wdid, i, 1);
end loop;
return l_password;
end get_password;
/
The WDID is reversed in a loop, and the modified region is included at the midway point, based on the length of the WDID value.
You can then do:
select get_password('876214', 'America') from dual;
GET_PASSWORD('876214','AMERICA')
--------------------------------
412#meric#678
This also doesn't have the unexplained 's' from the example in your question.
If you can't create a function but are on a recent version of Oracle then you can define an ad hoc function in a CTE:
with
function invert (p_input varchar2) return varchar2 as
l_output varchar2(30);
begin
for i in reverse 1..length(p_input) LOOP
l_output := l_output || substr(p_input, i, 1);
end loop;
return l_output;
end invert;
t (wdid, region) as (
select invert('876214'), translate('America', 'Aax', '##x')
from dual
)
select substr(wdid, 1, floor(length(wdid)/2))
|| region
|| substr(wdid, floor(length(wdid)/2) + 1)
from t;
which gets the same result. (I've called the function invert to avoid confusion with the undocumented reverse function.)
db<>fiddle showing both.

Display count of characters repeated in a string using SQL only

I want to reproduce the output of the code snippet below using a select query alone. Is it possible without using regexp?
The goal is a character count when the input string is dynamic.
DECLARE
str VARCHAR2(255);
lv_val NUMBER;
lv_char CHAR(1);
lv_unq VARCHAR2(255);
BEGIN
str:= :p_string;
FOR i IN 1..length(str)
LOOP
lv_val := 0;
lv_char := SUBSTR(str,i,1);
IF instr(lv_unq,lv_char)>0 THEN
NULL;
ELSE
lv_unq := lv_unq||lv_char;
lv_val := ((LENGTH(str) - LENGTH(REPLACE(replace(str,' ',''), lv_char, ''))) / LENGTH(lv_char));
--select ((length(str) - LENgth(REPLACE(str, lv_char, ''))) / LENgth(lv_char)) into lv_val FROM dual;
DBMS_OUTPUT.PUT_LINE('Character '||lv_char || ' is repeated :'||lv_val||' times in the string '||str);
END IF;
END LOOP;
END;
Answering the question in the title:
Display count of characters repeated in a string using SQL only
with v as (select substr('hello world', level, 1) c from dual connect by level < 12),
d as (select chr(ascii('a')+level-1) c from dual connect by level <= 26)
select d.c, count(v.c) from d left join v on d.c = v.c
group by d.c
order by d.c;
See http://sqlfiddle.com/#!4/d41d8/38321/0 for the result
The first view splits your string into characters. The second view is just the alphabet. Once you have both views, you only need a simple left join with a group by clause to count the number of matching occurrences.
Please note:
in the first view, the string and its length are hard-coded in this example
I assume all your characters are lower-case
I only take into account the 26 (lower case) letters of the ASCII encoding.
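If the hard-coded length and the fixed alphabet are too restrictive, a minimal sketch of a variant that derives both from the input itself is shown below. The input and chars CTE names are made up, and the literal stands in for a bind variable such as :p_string. Unlike the query above, it only lists characters that actually occur, and it counts spaces and upper-case letters as characters in their own right:

```sql
WITH input AS (
  -- single-row source; replace the literal with a bind variable if needed
  SELECT 'hello world' AS str FROM dual
),
chars AS (
  -- one row per character position in the string
  SELECT SUBSTR(str, LEVEL, 1) AS c
  FROM input
  CONNECT BY LEVEL <= LENGTH(str)
)
SELECT c, COUNT(*) AS cnt
FROM chars
GROUP BY c
ORDER BY c;
```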

How do I expand a string with wildcards in PL/SQL using string functions

I have a column which stores a 4-character string containing up to 4 wildcard characters (e.g. ????, ??01, 0??1, etc.). For each such string, like 0??1, I have to insert into another table the values 0001 to 0991; for the string ??01, the values will be 0001 to 9901; for the string ????, the values will be 0000 to 9999; and so on.
How could I accomplish this using PL/SQL and string functions?
EDIT
The current code is:
declare
v_rule varchar2(50) := '????52132';
v_cc varchar2(50);
v_nat varchar2(50);
v_wild number;
n number;
begin
v_cc := substr(v_rule,1,4);
v_nat := substr(v_rule,5);
dbms_output.put_line (v_cc || ' '|| v_nat);
if instr(v_cc, '????') <> 0 then
v_wild := 4;
end if;
n := power(10,v_wild);
for i in 0 .. n - 1 loop
dbms_output.put_line(substr(lpad(to_char(i),v_wild,'0' ),0,4));
end loop;
end;
/
Would something like the following help?
BEGIN
FOR source_row IN (SELECT rule FROM some_table)
LOOP
INSERT INTO some_other_table (rule_match)
WITH numbers AS (SELECT LPAD(LEVEL - 1, 4, '0') AS num FROM DUAL CONNECT BY LEVEL <= 10000)
SELECT num FROM numbers WHERE num LIKE REPLACE(source_row.rule, '?', '_');
END LOOP;
END;
/
This assumes you have a table called some_table with a column rule, which contains text such as ??01, 0??1 and ????. It inserts into some_other_table all numbers from 0000 to 9999 that match these wild-carded patterns.
The subquery
SELECT LPAD(LEVEL - 1, 4, '0') AS num FROM DUAL CONNECT BY LEVEL <= 10000
generates all numbers in the range 0000 to 9999. We then keep only the numbers from this list that match the pattern, using LIKE. Note that _ is the single-character wildcard when using LIKE, not ?.
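To see the ? to _ translation in isolation, here is a small sketch with a few made-up rule/candidate pairs:

```sql
SELECT rule,
       candidate,
       -- REPLACE turns each ? into the LIKE single-character wildcard _
       CASE WHEN candidate LIKE REPLACE(rule, '?', '_')
            THEN 'match'
            ELSE 'no match'
       END AS outcome
FROM (
  SELECT '0??1' AS rule, '0451' AS candidate FROM dual UNION ALL  -- match
  SELECT '0??1', '1451' FROM dual UNION ALL                        -- no match
  SELECT '??01', '4501' FROM dual                                  -- match
);
```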
I set this up with the following data:
CREATE TABLE some_table (rule VARCHAR2(4));
INSERT INTO some_table (rule) VALUES ('??01');
INSERT INTO some_table (rule) VALUES ('0??1');
INSERT INTO some_table (rule) VALUES ('????');
COMMIT;
CREATE TABLE some_other_table (rule_match VARCHAR2(4));
After running the above PL/SQL block, the table some_other_table had 10200 rows in it, all the numbers that matched all three of the patterns given.
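The row-by-row loop is not strictly required. As a sketch, assuming the same some_table and some_other_table as above, the same rows can be produced with a single INSERT that joins the rules to the generated numbers:

```sql
INSERT INTO some_other_table (rule_match)
WITH numbers AS (
  -- all values from 0000 to 9999
  SELECT LPAD(LEVEL - 1, 4, '0') AS num
  FROM dual
  CONNECT BY LEVEL <= 10000
)
SELECT n.num
FROM some_table s
JOIN numbers n
  ON n.num LIKE REPLACE(s.rule, '?', '_');  -- ? becomes the LIKE wildcard _
```

A number that matches several rules is inserted once per matching rule, just as in the loop version.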
Replace * with %, ? with _ and use a LIKE clause with the resulting values.
To expand on @Oleg Dok's answer, which uses the little-known fact that an underscore means the same as % but only for a single character, I think the following is the simplest way to do it in PL/SQL. A good description of how to use connect by is here.
declare
  -- every four-digit value that matches the rule; ? becomes the LIKE wildcard _
  cursor c_matches( Crule varchar2 ) is
    select numb
    from ( select lpad(level - 1, 4, '0') as numb
           from dual
           connect by level <= 10000 )
    where numb like replace(Crule, '?', '_')
    order by numb;
  l_rule varchar2(4) := '?091';
begin
  -- loop over the matching values only, not over the whole min..max range
  for r in c_matches(l_rule) loop
    dbms_output.put_line(r.numb);
  end loop;
end;
/

How to reverse a string after tokenizing it in SQL

I need to tokenize a string and reverse it in SQL. For example, if the string is 'L3:L2:L1:L0', I need to reverse it to 'L0:L1:L2:L3'. The tokenizing could be done using the delimiter ':', and the tokens then reversed. Please suggest a function in SQL for this.
Thanks in advance,
Geetha
If possible, the best solution would be to change your data so that each value is stored in a different row.
If that doesn't work, you can create a PL/SQL function.
If you want a purely SQL solution, typically you'll have to split each value into multiple rows (cross join with an object table, or connect by level <= max number of items), and then re-aggregate the data using one of a dozen different methods (listagg, collect, stragg, xml, sys_connect_by_path, etc.)
Another SQL-only way is to use regular expressions. This is probably the fastest, but it only works with up to 9 items because Oracle only supports 9 back references:
--Get everything except the extra ':' at the end.
select substr(string, 1, length(string) - 1) string from
(
select regexp_replace(
--Add a delimiter to the end so all items are the same
'L3:L2:L1:L0'||':'
--Non-greedy search for anything up to a : (I bet there's a better way to do this)
,'(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?(.*?:)?'
--Reverse the back-references
,'\9\8\7\6\5\4\3\2\1') string
from dual
);
Something like :
SELECT
REGEXP_REPLACE('L1:L2:L3',
'([[:alnum:]]{1,}):([[:alnum:]]{1,}):([[:alnum:]]{1,})',
'\3 \2 \1') "REGEXP_REPLACE"
from dual
But you might need to detail what constitutes a token.
Here is a solution using a PL/SQL pipelined function to split the elements:
create type t_str_array as table of varchar2(4000);
create or replace function split_str (p_str in varchar2,
p_separator in varchar2 := ':') return t_str_array pipelined
as
l_str varchar2(32000) := p_str || p_separator;
l_pos pls_integer;
begin
loop
l_pos := instr(l_str, p_separator);
exit when (nvl(l_pos,0) = 0);
pipe row (ltrim(rtrim(substr(l_str,1,l_pos-1))));
l_str := substr(l_str, l_pos+1);
end loop;
return;
end split_str;
Then you would use normal SQL to order the elements:
select * from table(split_str('L3:L2:L1:L0')) order by column_value
declare
s varchar2(1000) := 'L 1 0:L9:L8:L7:L6:L5:L4:L3:L2:L1:L0';
j number := length(s);
begin
for i in reverse 1..length(s) loop
if substr(s, i, 1) = ':' then
dbms_output.put(substr(s, i + 1, j - i) || ':');
j := i - 1;
end if;
end loop;
dbms_output.put_line(substr(s, 1, j));
end;
Convert elements in a CSV string into records, suppressing all NULLs:
SELECT REGEXP_SUBSTR( :csv,'[^,]+', 1, LEVEL ) AS element
FROM dual
CONNECT BY REGEXP_SUBSTR( :csv, '[^,]+', 1, LEVEL ) IS NOT NULL ;
Convert elements in a CSV string into records, preserving NULLs (but not order):
SELECT REGEXP_SUBSTR( :csv,'[^,]+', 1, LEVEL ) AS element
FROM dual
CONNECT BY LEVEL <= LENGTH( :csv ) - LENGTH( REPLACE( :CSV, ',' ) ) + 1 ;
Improving upon Kevan's answer, here is what I tried:
with t as (select 'L3:L2:L1:L0' as myStr from dual)
select listagg(TOKEN, ':') WITHIN GROUP (ORDER BY TOKEN_LEVEL DESC) AS reversed
from
(SELECT REGEXP_SUBSTR( myStr,'[^:]+', 1, LEVEL ) AS TOKEN, LEVEL TOKEN_LEVEL
FROM t
CONNECT BY REGEXP_SUBSTR( myStr, '[^:]+', 1, LEVEL ) IS NOT NULL);
Since you use Oracle, it would be easy to write a Java stored procedure that takes the string and then:
splits the string into an array
loops over the array backwards and concatenates the resulting string
returns the resulting string
This will be a small piece of Java code and no slower than PL/SQL. But if you want to use PL/SQL, you could possibly also use DBMS_UTILITY.table_to_comma/.comma_to_table. As the function names suggest, though, you have to use "," as the separator.
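A rough sketch of the DBMS_UTILITY route, assuming the tokens are legal Oracle identifiers (comma_to_table parses them as names, so tokens that start with a digit or contain spaces are expected to fail). The variable names are made up for illustration:

```sql
DECLARE
  l_tokens DBMS_UTILITY.uncl_array;  -- PL/SQL table filled by comma_to_table
  l_count  BINARY_INTEGER;           -- number of tokens found
  l_result VARCHAR2(4000);
BEGIN
  -- the input uses ":" as its separator, so swap it for "," first
  DBMS_UTILITY.comma_to_table(REPLACE('L3:L2:L1:L0', ':', ','), l_count, l_tokens);

  -- walk the tokens from last to first and join them with ":" again
  FOR i IN REVERSE 1 .. l_count LOOP
    l_result := l_result
                || CASE WHEN l_result IS NOT NULL THEN ':' END
                || l_tokens(i);
  END LOOP;

  DBMS_OUTPUT.PUT_LINE(l_result);
END;
/
```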