Using REGEXP_SUBSTR with Strings Qualifier - sql

Getting Examples from similar Stack Overflow threads,
Remove all characters after a specific character in PL/SQL
and
How to Select a substring in Oracle SQL up to a specific character?
I would want to retrieve only the first characters before the occurrence of a string.
Example:
STRING_EXAMPLE
TREE_OF_APPLES
The Resulting Data set should only show only STRING_EXAM and TREE_OF_AP because PLE is my delimiter
Whenever i use the below REGEXP_SUBSTR, It gets only STRING_ because REGEXP_SUBSTR treats PLE as separate expressions (P, L and E), not as a single expression (PLE).
SELECT REGEXP_SUBSTR('STRING_EXAMPLE','[^PLE]+',1,1) from dual;
How can i do this without using numerous INSTRs and SUBSTRs?
Thank you.

The problem with your query is that if you use [^PLE] it would match any characters other than P or L or E. You are looking for an occurence of PLE consecutively. So, use
select REGEXP_SUBSTR(colname,'(.+)PLE',1,1,null,1)
from tablename
This returns the substring up to the last occurrence of PLE in the string.
If the string contains multiple instances of PLE and only the substring up to the first occurrence needs to be extracted, use
select REGEXP_SUBSTR(colname,'(.+?)PLE',1,1,null,1)
from tablename

Why use regular expressions for this?
select substr(colname, 1, instr(colname, 'PLE')-1) from...
would be more efficient.
with
inputs( colname ) as (
select 'FIRST_EXAMPLE' from dual union all
select 'IMPLEMENTATION' from dual union all
select 'PARIS' from dual union all
select 'PLEONASM' from dual
)
select colname, substr(colname, 1, instr(colname, 'PLE')-1) as result
from inputs
;
COLNAME RESULT
-------------- ----------
FIRST_EXAMPLE FIRST_EXAM
IMPLEMENTATION IM
PARIS
PLEONASM

Related

How to print the sequence to nth length? [duplicate]

I would like to know how to achieve the same functionality as REPEAT() in SQL*Plus. For example consider this problem: display the character '*' as many times as the value specified by an integer attribute specified for each entry in a given table.
Nitpicking: SQL*Plus doesn't have any feature for that. The database server (Oracle) provides the ability to execute SQL and has such a function:
You are looking for rpad()
select rpad('*', 10, '*')
from dual;
will output
**********
More details can be found in the manual: https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions159.htm#SQLRF06103
For single characters, the accepted answer works fine.
However, If you have multiple characters in a given string, you need to use RPAD along with length function like this.
WITH t (str) AS
(
SELECT 'a'
FROM DUAL
UNION ALL SELECT 'abc'
FROM DUAL
UNION ALL SELECT '123'
FROM DUAL
UNION ALL SELECT '#+-'
FROM DUAL
)
SELECT RPAD(str, 5*LENGTH(str), str) repeated_5_times
FROM t;
Output:
REPEATED_5_TIMES
---------------
aaaaa
abcabcabcabcabc
123123123123123
#+-#+-#+-#+-#+-

How to get first string after character Oracle SQL

I'm trying to get first string after a character.
Example is like
ABCDEF||GHJ||WERT
I need only
GHJ
I tried to use REGEXP but i couldnt do it.
Can anyone help me with please?
Thank you
Somewhat simpler:
SQL> select regexp_substr('ABCDEF||GHJ||WERT', '\w+', 1, 2) result from dual;
^
RES |
--- give me the 2nd "word"
GHJ
SQL>
which reads as: give me the 2nd word out of that string. Won't work properly if GHJ consists of several words (but that's not what your example suggests).
Something like I interpret with a separator in place, In this case it is || or | example is with oracle database
-- pattern -- > [^] represents non-matching character and + for says one or more character followed by ||
-- 3rd parameter --> starting position
-- 4th parameter --> nth occurrence
WITH tbl(str) AS
(SELECT 'ABCDEF||GHJ||WERT' str FROM dual)
SELECT regexp_substr(str
,'[^||]+'
,1
,2) output
FROM tbl;
I think the most general solution is:
WITH tbl(str) AS (
SELECT 'ABCDEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABC|DEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABClDEF||GHJ||WERT' str FROM dual
)
SELECT regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1')
FROM tbl;
Note that this works even if the individual elements contain punctuation or a single vertical bar -- which the other solutions do not. Here is a comparison.
Presumably, the double vertical bar is being used for maximum flexibility.
You should use regexp_substr function
select regexp_substr('ABCDEF||GHJ||WERT ', '\|{2}([^|]+)', 1, 1, 'i', 1) str
from dual;
STR
---
GHJ

ORACLE regexp_substr extract everything after specific char

How to get rest of string after specific char?
I have a string 'a|b|c|2|:x80|3|rr|' and I would like to get result after 3rd occurance of |. So the result should be like 2|:x80|3|rr|
The query
select REGEXP_SUBSTR('a|b|c|2|:x80|3|rr|','[^|]+$',1,4)
from dual
Returned me NULL
Use SUBSTR / INSTR combination
WITH t ( s ) AS (
SELECT 'a|b|c|2|:x80|3|rr|'
FROM dual
) SELECT substr(s,instr(s,'|',1,3) + 1)
FROM t;
Demo
REGEXP_REPLACE() will do the trick. Skip 3 groups of anything followed by a pipe, then replace with the 2nd group, which is the rest of the line (anchored to the end).
SQL> select regexp_replace('a|b|c|2|:x80|3|rr|', '(.*?\|){3}(.*)$', '\2') trimmed
2 from dual;
TRIMMED
------------
2|:x80|3|rr|
SQL>
I suggest a nice by long way by using regexp_substr, regexp_count and listagg together as :
select listagg(str) within group (order by lvl)
as "Result String"
from
(
with t(str) as
(
select 'a|b|c|2|:x80|3|rr|' from dual
)
select level-1 as lvl,
regexp_substr(str,'(.*?)(\||$)',1,level) as str
from dual
cross join t
connect by level <= regexp_count('a|b|c|2|:x80|3|rr|','\|')
)
where lvl >= 3;
Rextester Demo
If you use oracle 11g and above you can specify a subexpression to return like this:
select REGEXP_SUBSTR('a|b|c|2|:x80|3|rr|','([^|]+\|){3}(.+)$',1,1,null,2) from dual
Erkko,
You need to use the combination of SUBSTR and REGEXP_INSTR OR INSTR.
Your query will look like this. (Without Regex)
SELECT SUBSTR('a|b|c|2|:x80|3|rr|',INSTR('a|b|c|2|:x80|3|rr|','|',1,3)+1) from dual;
Your query will look like this. (With Regex as you want to use)
SELECT SUBSTR('a|b|c|2|:x80|3|rr|',REGEXP_INSTR('a|b|c|2|:x80|3|rr|','\|',1,3)+1) from dual;
Explanation:
First, you will need to find the place of the string you want as you mentioned. So in your case | comes at place 6. So that +1 would be your position to start to substring.
Second, from the original string, substring from that position+1 to unlimited.(Where your string ends)
Example:
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=6fd782db95f575201eded084493232ee

Counting word lengths in a string

I am using an Oracle regular expression to extract the first letter of each word in a string. The results are returned in a single cell, with spaces representing hard breaks. Here is an example...
input:
'I hope that some kind person
browsing stack overflow
can help me'
output:
ihtskp bso chm
What I am trying to do next is count the length of each "word" in my output, like this:
6 3 3
Alternatively, a count of the words in each line of the original string would be acceptable, as it would yield the same result.
Thanks!
Count the number of spaces and add one:
select (length(your_col) - length(replace(your_col, ' '))+1) from your_table;
It will give you the number of words per line. From there you can get all counts on one line by using listagg function:
select LISTAGG(cnt,' ') within group (order by null) from (
select (length(a)-length(replace(a,' '))+1) cnt from (
select 'apa bpa bv' a from dual
union all
select 'n bb gg' a from dual
union all
select 'ff ff rr gg' a from dual))
group by null;
Perhaps you also need to split the strings if they contain newlines or are they split already?
I tried to edit my original post but it hasn't appeared, but I figured out a way to solve my issue. I just decided to break the words into rows, since I know how to character count rows, and then reassembled the character counts into a single cell using listagg:
with my_string as (
select regexp_substr (words,'[0-9]+|[a-z]+|[A-Z]+',1,lvl) parsed
from (
select words, level lvl
from letters connect by level <= length(words) - length(replace(words,' ')) + 1)
)
select listagg(length(parsed),' ') within group (order by parsed) word_count
from my_string

sql Search a string between certain words

If the key word is "Find", is it possible to extract a string that is between the "Find"?
stackoverflow is awesome. FindHello, World!Find It has everything!
The result should be 'Hello, World!' because the string is between "Find"
My initial idea was to use Instr to locate two "Find", then locate what's between "Find".
Is there any better way to do this?
You can use either regular expressions or instr() to achieve what you're after.
I actually prefer regular expressions, if you're using version 10g or later, because I find doing multiple contortions with instr() fairly unwieldy, but it's up to you.
with phrases as (
select 'stackoverflow is awesome. FindHello, World!Find It has everything!' as phrase
from dual
)
select substr( phrase
, instr(phrase,'Find',1,1) + 4
, instr(phrase,'Find',1,2)
- instr(phrase,'Find',1,1)
- 4
)
from phrases
This gets the first and second occurrences of the string Find, starting from the first character, then uses these to work out the positions that you should be doing the sub-string on.
Alternatively, using regular expressions:
with phrases as (
select 'stackoverflow is awesome. FindHello, World!Find It has everything!' as phrase
from dual
)
select regexp_replace(phrase
, '([[:print:]]+Find)([[:print:]]+)(Find[[:print:]]+)', '\2')
from phrases
;
This takes any printable character multiple times, followed by the string Find etc. But, the main bit is the grouping (), which separates each part of the phrase. The \2 means that of the original matched string only the second group, i.e. that between the Find's is returned.
Here's a little SQL Fiddle to demonstrate.
This query suppose to handle more than two 'Find's
with SourceString as(
select 'Find123Find45345Find76876234Find87687Find' s_string
, 'Find' delimiter
from dual
)
select substr(s_string, f_f - s_f + length(delimiter), s_f-Length(delimiter ) )
from (select f_f
, s_f
from(select f_f
, f_f - lag(f_f, 1, f_f) over(order by 1) s_f
from (select Instr(s_string, delimiter , 1, level) f_f
from SourceString
connect by level <= Length(s_string))
)
where s_f > 0)
, SourceString