To extract the specific strings from the given string in Oracle - sql

Expression: BR65437812-909#-#BR12340000-990
I need to extract parts of the given expression and store them in columns like
a = BR12340000, b = 990

select
SUBSTR(s, 1, INSTR(s, '-') - 1) as a,
SUBSTR(s, INSTR(s, '-', -1) + 1) as b
from
(select 'BR65437812-909#-#BR12340000-990' as s from dual)
For A, using SUBSTR(string, start, length), we have the following arguments:
the string to search
1 as the start and
(index_of_the_first_hyphen - 1) as the length. INSTR(string, searchfor) gives us the index of the first hyphen
For B, using SUBSTR(string, start), we have the following arguments:
the string to search
(index_of_last_hyphen + 1) as the start. This time we pass INSTR the extra startindex argument, INSTR(string, searchfor, startindex), and set it to -1; this makes it search from the end of the string and work backwards, giving us the index of the last hyphen
We don't need a length argument, because SUBSTR without a length returns the rest of the string to the end
It's important to note that INSTR with a start index of -1 does search backwards but it always returns the index from the start of the string, not the end.
INSTR('dddde', 'd', -1)
       12345  -- returns 4, because the last d is 4 from the start
       54321  -- it does not return 2, even though that d is 2 from the "start" when searching backwards
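Note that the query above returns the text before the first hyphen (BR65437812) as a, while the question asks for BR12340000. A minimal sketch of one way to get that value with the same two functions, assuming the wanted segment always sits between the last # and the last hyphen:

```sql
-- Assumption: a is the text between the last '#' and the last '-'.
SELECT SUBSTR(s,
              INSTR(s, '#', -1) + 1,                     -- start just after the last '#'
              INSTR(s, '-', -1) - INSTR(s, '#', -1) - 1  -- length up to the last '-'
       ) AS a,
       SUBSTR(s, INSTR(s, '-', -1) + 1) AS b             -- everything after the last '-'
FROM (SELECT 'BR65437812-909#-#BR12340000-990' AS s FROM dual);
```

For the sample string this should give a = BR12340000 and b = 990.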

You can first use regexp_replace to keep every character after the last # sign, and then split the result at the dash.
with t(str) as
(
select regexp_replace('BR65437812-909#-#BR12340000-990','.*#','') from dual
)
select regexp_replace(str,'-.*','') as a,
regexp_replace(str,'.*-','') as b
from t;
A B
---------- ---
BR12340000 990

with s as (
select 'BR65437812-909#-#BR12340000-990' str from dual)
select regexp_substr(str, '[^-#]+', 1, 3) a, regexp_substr(str, '[^-#]+', 1, 4) b
from s;
A B
---------- ---
BR12340000 990

Related

How to pull a value in between multiple values?

I have a column named Concatenated Segments which has 12 segment values, and I'm looking to edit the formula on the column to only show the 5th segment. The segments are separated by periods.
How would I need to edit the formula to do this?
Would using a substring work?
Alternatively, use the good old SUBSTR + INSTR combination, which is possibly faster on large data sets and doesn't care about uninterrupted strings (the segments can contain anything between the dots):
WITH
-- thank you for typing, #marcothesane
indata(s) AS (
  SELECT '1201.0000.5611005.0099.211003.0000.2199.00099.00099.0000.0000.00000' FROM dual
)
select substr(s, instr(s, '.', 1, 4) + 1,
              instr(s, '.', 1, 5) - instr(s, '.', 1, 4) - 1
       ) result
from indata;
RESULT
------
211003
Use REGEXP_SUBSTR(), searching for the 5th uninterrupted string of digits, or the 5th uninterrupted string of anything but a dot (\d and [^\.]) starting from position 1 of the input string:
WITH
-- your input ... paste it as text next time, so I don't have to manually re-type it ....
indata(s) AS (
SELECT '1201.0000.5611005.0099.211003.0000.2199.00099.00099.0000.0000.00000' FROM dual
)
SELECT
REGEXP_SUBSTR(s,'\d+',1,5) AS just_digits
, REGEXP_SUBSTR(s,'[^\.]+',1,5) AS between_dots
FROM indata;
-- out just_digits | between_dots
-- out -------------+--------------
-- out 211003 | 211003
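One caveat with the [^\.]+ pattern is that it counts only non-empty runs, so an empty segment shifts every later segment by one. A hedged sketch of a commonly used workaround, assuming Oracle 11g or later (for the subexpression argument of REGEXP_SUBSTR) and an illustrative input whose second segment is empty:

```sql
-- Illustrative input: the second segment is empty.
WITH indata(s) AS (
  SELECT '1201..5611005.0099.211003' FROM dual
)
SELECT
  -- Counts only non-empty runs; there are just 4 of them, so this is NULL.
  REGEXP_SUBSTR(s, '[^.]+', 1, 5) AS skips_empty,
  -- Matches "anything up to the next dot or the end" and returns group 1,
  -- so empty segments keep their position; this yields 211003.
  REGEXP_SUBSTR(s, '(.*?)(\.|$)', 1, 5, NULL, 1) AS keeps_position
FROM indata;
```

The SUBSTR + INSTR version shown earlier also keeps positions, since it counts dots rather than runs of characters.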

Get string until character Oracle SQL

How to get string before character?
I need to get string before ; in Oracle SQL.
For example:
147739 - Blablabla ; Blublublu
Needed output:
147739 - Blablabla
My code so far:
SELECT
UPPER(CONVERT(REGEXP_REPLACE(SUBSTR(HISTORICO, INSTR(HISTORICO, 'Doc') + 4), 'S/A', 'SA'), 'US7ASCII'))
FROM
GEQ_GL_CONC_CONTABIL_FRETES_V
WHERE
periodo = '$Periodo$' AND livro = 'ESMALTEC_FISCAL'
I want the whole string up to ;
We can use a combination of SUBSTR and INSTR to achieve this:
SELECT SUBSTR(FIELD_NAME,1,INSTR(FIELD_NAME,';', 1, 1)-1) FROM TABLE_NAME;
The second argument to SUBSTR is the position in the field value from which we want to start (1 = at the beginning); the third argument is the length of the substring we want to read, which here is the position of ';' minus 1.
The third and fourth arguments to INSTR are where to start searching for ';' and which occurrence we are interested in. In our example that is from the beginning (1) and the first occurrence (again 1).
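One thing to watch for: when the value contains no ';', INSTR returns 0, the length becomes -1, and SUBSTR returns NULL. A minimal sketch that falls back to the whole string in that case and trims the trailing blank; the table alias tbl and the second sample row are only illustrative:

```sql
WITH tbl(str) AS (
  SELECT '147739 - Blablabla ; Blublublu' FROM dual UNION ALL
  SELECT 'no delimiter here' FROM dual
)
SELECT str,
       RTRIM(CASE
               WHEN INSTR(str, ';') > 0
                 THEN SUBSTR(str, 1, INSTR(str, ';') - 1)  -- text before the first ';'
               ELSE str                                    -- no ';' found: keep everything
             END) AS before_semicolon
FROM tbl;
```

For the first row this should return 147739 - Blablabla, and for the second row the unchanged string.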
You could try using substr() and instr() (Oracle treats a start position of 0 as 1):
select SUBSTR(my_col, 0, INSTR(my_col, ';')-1)
from my_table
select SUBSTR('A Blablabla ; Blublublu', 0, INSTR('A Blablabla ; Blublublu', ';')-1)
from dual
A few alternatives using REGEXP.
The result of each solution depends on how uniform your data is.
WITH tbl
AS (
SELECT '147739 - Blablabla ; Blublublu' str
FROM DUAL
)
SELECT TRIM(REGEXP_SUBSTR(str, '([[:alnum:]]|-| )*')) AS SOLUTION_1
, REGEXP_SUBSTR(str, '[[:digit:]]*( )?(-)?( )?[[:alpha:]]*') AS SOLUTION_2
, REGEXP_SUBSTR(str, '[[:digit:]]*( |-)*[[:alpha:]]*') AS SOLUTION_3
FROM tbl;

How to replace more than one character in oracle?

How to replace multiple whole characters, except those in combinations...?
The code below replaces multiple characters, but it also disturbs those in combinations.
SELECT regexp_replace('a,ca,va,ea,r,y,q,b,g','(a|y|q|g)','X') RESULT FROM dual;
Current output:
RESULT
--------------------
X,cX,vX,eX,r,X,X,b,X
Expected output:
RESULT
------------------------
X,ca,va,ea,r,X,X,b,X
I just want to replace only the separate whole characters ('a','y','q','g'), but not the ones in combinations ('ca','va','ea')...
Because you are delimiting with a comma ',', you can include it in the pattern, like ',a,',
and this will replace only single a's.
You can try the following:
with t as
(
select 'a,ca,va,ea,r,y,q,b,g' str
from dual
)
select substr(sys_connect_by_path(regexp_replace(regexp_substr(str, '[^,]+', 1, level), '^(a|y|q|g)$', 'X'), ','), 2) as str
from t
where connect_by_isleaf = 1
connect by level <= length(regexp_replace(str, '[^,]*')) + 1;
Sadly, Oracle doesn't support lookahead and lookbehind, but this is a solution I came up with.
SELECT regexp_replace
(regexp_replace
('a,ca,va,ea,r,y,q,b,g',
'^[ayqg](,)|(,)[ayqg](,)|(,)[ayqg]$',
'\2\4X\1\3'),'(,)[ayqg](,)','\1X\2')
RESULT FROM dual;
Sadly, I had to use the regexp twice, because a single pass cannot replace two matching values that directly follow each other: in ..,a,y,.. the match on ,a, consumes the comma that ,y, needs, so the result is ..,X,y,... The inner call replaces the first and last values as well as every middle value it can reach; the outer call then replaces the middle values the inner call skipped.
Maybe this could be simplified into one expression, but I am not that familiar with Oracle's regex.
As an explanation: I am grouping the commas and basically replacing every ,[ayqg], with ,X, by backreferencing the commas.
You would look for word boundaries, which is \b, and which is unfortunately not supported by Oracle's regexp_replace.
So let's look for a non-word character \W or the beginning ^ or ending $ of the text.
select
regexp_replace('a,ca,va,ea,r,y,q,b,g','(^|$|\W)(a|y|q|g)(^|$|\W)','\1X\3') as result
from dual;
In order not to remove the non-word characters, we must include them in the replace string: \1 for the expression in the first parentheses, \3 for the one in the third. Thus we only change the expression in the second parentheses, which is a, y, q or g, to X.
Unfortunately the above gives
X,ca,va,ea,r,X,q,b,X
The q was not replaced, because matching ',y,' leaves us positioned at 'q,' whereas we would need to be positioned at ',q,' to recognize q as a word, too.
So we need to replace in iterations (i.e. recursively):
with results(txt, num) as
(
select 'a,ca,va,ea,r,y,q,b,g' as txt, 0 as num from dual
union all
select regexp_replace(txt, '(^|$|\W)(a|y|q|g)(^|$|\W)','\1X\3'), num + 1 as num
from results
where txt <> regexp_replace(txt, '(^|$|\W)(a|y|q|g)(^|$|\W)','\1X\3')
)
select max(txt) keep (dense_rank last order by num) as result
from results;
EDIT: Kevin Esche is right; of course one has to do it only twice. Hence you can also do:
select
regexp_replace(txt, search_str, replace_str) as result
from
(
select
regexp_replace(txt, search_str, replace_str) as txt, search_str, replace_str
from
(
select
'a,ca,va,ea,r,y,q,y,q,b,g' as txt,
'(^|$|\W)(a|y|q|g)(^|$|\W)' as search_str,
'\1X\3' as replace_str
from dual
)
);
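The overlap problem described above, where each match consumes the comma that the next match needs, can also be avoided in a single regexp_replace pass by doubling the delimiters first. A minimal sketch, assuming the delimiter is always a comma; the CTE names t and padded are only illustrative:

```sql
WITH t(str) AS (
  SELECT 'a,ca,va,ea,r,y,q,b,g' FROM dual
),
padded AS (
  SELECT REPLACE(
           REGEXP_REPLACE(
             ',' || REPLACE(str, ',', ',,') || ',',  -- give every element its own pair of commas
             ',(a|y|q|g),',                          -- a whole element that is exactly a, y, q or g
             ',X,'),
           ',,', ',')                                -- collapse the doubled commas again
         AS s
  FROM t
)
SELECT SUBSTR(s, 2, LENGTH(s) - 2) AS result        -- drop the added outer commas
FROM padded;
```

For the sample string this should return X,ca,va,ea,r,X,X,b,X. Because every element is wrapped in its own commas before matching, adjacent elements such as y,q no longer compete for the same delimiter.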
with replaced_values as (
SELECT case when length(val)=1 then regexp_replace(val,'(a|y|q|g)','X') else val end new_val, lvl
from (
SELECT regexp_substr('a,ca,va,ea,r,y,q,b,g','[^,]+', 1, LEVEL) val, level lvl FROM dual
connect by regexp_substr('a,ca,va,ea,r,y,q,b,g','[^,]+',1, LEVEL) is not null
) all_values
)
select LISTAGG(new_val, ',') WITHIN GROUP (ORDER BY lvl) RESULT
from replaced_values
This statement splits the data into rows and replaces only the rows which contain a single character.
The rows are then aggregated back into one row.
This SQL also works with empty entries like 'a,,,b,c' and with more complex regular expressions:
with t as
(select ',a,,ca,va,ea,bbb,ba,r,y,q,b,g,,,' as str,
',' as delimiter,
'(a|y|q|g|ea|[b]*)' as regexp_expr,
'X' as replace_expr
from dual)
(select substr (sys_connect_by_path(regexp_replace(substr(str,
decode(level - 1, 0, 0, instr(str, ',', 1, level - 1)) + 1,
decode(instr(str, ',', 1, level),
0,
length(str),
instr(str, ',', 1, level) - 1) -
decode(level - 1, 0, 0, instr(str, ',', 1, level - 1))),
'^' || regexp_expr || '$',
replace_expr), ','), 2)
from t
where connect_by_isleaf = 1
connect by level <= length(regexp_replace(str, '[^'|| delimiter||']')) + 1)
Result
,X,,ca,va,X,X,ba,r,X,X,X,X,,,
I don't know much Oracle, but I would have thought something like this could work, assuming the delimiter is always a comma.
SELECT
regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace('a,ca,va,ea,r,y,q,b,g','(,a,|,y,|,q,|,g,)',',X,') ,'(,a,|,y,|,q,|,g,)',',X,'), '(^a,|^y,|^q,|^g,)','X,'), '(,a$|,y$|,q$|,g$)',',X'), '(^a$|^y$|^q$|^g$)','X')
RESULT FROM dual;
The first two parts replace a single character between commas in the middle, the third part gets those at the start of the string, the fourth is for the end of the string and the fifth is for when the string has just one character.
This answer could probably be simplified by more advanced regexp use.
How can I replace words?
RS & OS ===> D, LS & IS ==== >
SECTION_ID   Output required
1-LS-1991    1-P-1991
1-IS-1991    1-P-1991
1-RS-1991    1- D- 1991
1-OS-1991    1-D-1991
1-OS-1991 1-D-1991

Oracle RegExp_Like for Two Repeating Digits

I would like to write a regexp_like condition which identifies whether a string consists of two repeating characters. It should only match a string made up of exactly two distinct digits that strictly alternate; neither digit may appear twice in a row.
Requirement :
Regular expression should match the pattern for 787878787, but it should NOT match the pattern 787878788
It should NOT consider the pattern like 000000000
I think you want the following:
WITH t1 AS (
SELECT '787878787' AS str FROM dual
UNION
SELECT '787878788' AS str FROM dual
UNION
SELECT '7878787878' AS str FROM dual
UNION
SELECT '78' AS str FROM dual
)
SELECT * FROM t1
WHERE REGEXP_LIKE(str, '^(.)(.)(\1\2)*\1?$')
AND SUBSTR(str, 1, 1) != SUBSTR(str, 2, 1)
This will cover the case (mentioned in the requirements) where the string ends with the same character with which it begins. If you want only digits, replace the . in the regex with \d.
Update:
Here is how the regex breaks down:
^ = start of string
(.) = first character - can be anything - in parentheses to capture it and use it in a backreference
(.) = second character - can be anything
\1 = backreference to first captured group
\2 = backreference to second captured group
(\1\2)* = These should appear together zero or more times
\1? = The first captured group should appear zero or one times
$ = end of the string
Hope this helps.
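To see the breakdown at work on every case from the requirements, here is a small sketch that restricts the pattern to digits (\d instead of .) and reports a verdict per row. The 000000000 row and the column name verdict are additions for illustration only:

```sql
WITH t(str) AS (
  SELECT '787878787'  FROM dual UNION ALL
  SELECT '787878788'  FROM dual UNION ALL
  SELECT '000000000'  FROM dual UNION ALL
  SELECT '7878787878' FROM dual
)
SELECT str,
       CASE
         WHEN REGEXP_LIKE(str, '^(\d)(\d)(\1\2)*\1?$')   -- two digits repeated in the same order
          AND SUBSTR(str, 1, 1) <> SUBSTR(str, 2, 1)     -- the two digits must differ
           THEN 'match'
         ELSE 'no match'
       END AS verdict
FROM t;
```

This should report a match for 787878787 and 7878787878, and no match for 787878788 (the pattern breaks at the end) and 000000000 (the regex itself matches, but both digits are the same).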
You might do something like this -
WITH DATA AS(
  SELECT '787878787' str FROM dual UNION ALL
  SELECT '787878788' FROM dual
)
SELECT *
FROM DATA
WHERE REGEXP_LIKE(str, '(\d+?)\1')
AND SUBSTR(str, 1,1) = SUBSTR(str, -1, 1);
STR
---------
787878787
Since you are dealing only with digits, I used \d.
\d+? matches one or more digits (non-greedily), and \1 is a backreference to the digits just captured. The SUBSTR in the AND condition checks whether the first and last digits of the string are the same.
Edit : Additional requirement by OP
To avoid the numbers like 00000000, you need to add a NOT condition to the predicate.
WITH DATA AS
( SELECT '787878787' str FROM dual
  UNION ALL
  SELECT '787878788' FROM dual
  UNION ALL
  SELECT '787878788' FROM dual
)
SELECT *
FROM DATA
WHERE REGEXP_LIKE(str, '(\d+?)\1')
AND SUBSTR(str, 1,1) = SUBSTR(str, -1, 1)
AND SUBSTR(str, 2,1) <> SUBSTR(str, -1, 1);
STR
---------
787878787
You could try:
^(..)\1*$
Breakdown:
^ - assert beginning of line
(..) - capture the first 2 characters
\1* - repeat the captured group pattern zero or more times
$ - assert end of line
Untested in oracle...

Split String by delimiter position using oracle SQL

I have a string and I would like to split that string by delimiter at a certain position.
For example, my string is F/P/O and the result I am looking for is part1 = F/P and part2 = O.
Therefore, I would like to separate the string by the furthest delimiter.
Note: some of my strings are F/O also for which my SQL below works fine and returns desired result.
The SQL I wrote is as follows:
SELECT Substr('F/P/O', 1, Instr('F/P/O', '/') - 1) part1,
Substr('F/P/O', Instr('F/P/O', '/') + 1) part2
FROM dual
and the result is part1 = F and part2 = P/O.
Why is this happening and how can I fix it?
I know this is an old question, but this is a simple requirement for which SUBSTR and INSTR would suffice. Regular expression functions are still slower and more CPU-intensive than the old SUBSTR and INSTR functions.
WITH DATA AS
( SELECT 'F/P/O' str FROM dual
)
SELECT SUBSTR(str, 1, Instr(str, '/', -1, 1) -1) part1,
       SUBSTR(str, Instr(str, '/', -1, 1) +1) part2
FROM DATA;
PART1 PART2
----- -----
F/P   O
Since you want the furthest delimiter, that means the first delimiter when searching from the end.
Your approach was fine, but you were missing the start_position in INSTR. If the start_position is negative, the INSTR function counts back start_position number of characters from the end of the string and then searches towards the beginning of the string.
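A short sketch of the negative start_position on both shapes of input mentioned in the question (F/P/O and F/O); the column name last_slash is only illustrative:

```sql
WITH data(str) AS (
  SELECT 'F/P/O' FROM dual UNION ALL
  SELECT 'F/O'   FROM dual
)
SELECT str,
       INSTR(str, '/', -1)                     AS last_slash,  -- position counted from the start
       SUBSTR(str, 1, INSTR(str, '/', -1) - 1) AS part1,       -- everything before the last '/'
       SUBSTR(str, INSTR(str, '/', -1) + 1)    AS part2        -- everything after the last '/'
FROM data;
```

This should give last_slash = 4, part1 = F/P, part2 = O for F/P/O, and last_slash = 2, part1 = F, part2 = O for F/O, so strings with a single delimiter keep working.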
You want to use regexp_substr() for this. This should work for your example:
select regexp_substr(val, '[^/]+/[^/]+', 1, 1) as part1,
regexp_substr(val, '[^/]+$', 1, 1) as part2
from (select 'F/P/O' as val from dual) t
Oops. I missed the part of the question where it says the last delimiter. For that, we can use regexp_replace() for the first part:
select regexp_replace(val, '/[^/]+$', '', 1, 1) as part1,
regexp_substr(val, '[^/]+$', 1, 1) as part2
from (select 'F/P/O' as val from dual) t