Match content before optional string - sql

Given the following string, I would like to produce: A-2010:
/Space/w_00123/A-2010/u_23
/Space/w_00123/A-2010 (The /u_23 from above is optional, so missing here)
So the text between the 3rd and 4th / (or up to the end of the string if there is no 4th /) is what I really need.
I tried:
select
regexp_substr('/Space/w_00123/A-2010/u_23', '/Space/w_.+/(.*?)(?:/.+|$)', 1, 1, null, 1) r
from dual; -- this results in u_23 as opposed to A-2010
What's the right matcher expression here?

Using regexp_substr with 3 as the 4th (occurrence) argument returns the third slash-delimited segment, which is what you need:
with t(str) as
(
select '/Space/w_00123/A-2010/u_23' from dual union all
select '/Space/w_00123/A-2010' from dual
)
select regexp_substr(str,'([^/]+)',1,3) as "Result Strings"
from t;
Result Strings
--------------
A-2010
A-2010
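To see why occurrence 3 is the right index, here is a rough Python sketch (not Oracle SQL) that mimics the occurrence argument with re.findall. The helper name nth_match is made up for this illustration, and Python's re module is only an approximation of Oracle's regex engine.

```python
import re

def nth_match(text, pattern, occurrence):
    """Return the occurrence-th (1-based) match of pattern in text, or None."""
    matches = re.findall(pattern, text)  # non-overlapping matches, left to right
    if occurrence <= len(matches):
        return matches[occurrence - 1]
    return None

paths = ["/Space/w_00123/A-2010/u_23", "/Space/w_00123/A-2010"]
# Runs of non-slash characters are: Space, w_00123, A-2010 (and optionally u_23)
results = [nth_match(p, r"[^/]+", 3) for p in paths]
print(results)  # ['A-2010', 'A-2010']
```

Because the pattern only counts non-empty runs between slashes, this approach assumes none of the earlier segments is empty.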

Try it like this using negated classes:
select
regexp_substr('/Space/w_00123/A-2010/u_23', '/Space/w_[^/]+/([^/]+)', 1, 1, null, 1) r
from dual
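The difference between the original attempt and the negated-class version can be checked with a small Python sketch. It is only an approximation, since Python's re module is not Oracle's regex engine, but the greedy/lazy behaviour shown here matches the result reported in the question.

```python
import re

path = "/Space/w_00123/A-2010/u_23"

# Original attempt: the greedy .+ after w_ extends to the last slash it can,
# so the lazy group ends up capturing the final segment.
greedy = re.search(r"/Space/w_.+/(.*?)(?:/.+|$)", path).group(1)

# Negated class: [^/]+ cannot cross a slash, so the group is pinned
# to the segment that directly follows the w_... segment.
pattern = r"/Space/w_[^/]+/([^/]+)"
pinned = re.search(pattern, path).group(1)
short = re.search(pattern, "/Space/w_00123/A-2010").group(1)

print(greedy, pinned, short)  # u_23 A-2010 A-2010
```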

Related

Regex - Invalid regular expression: '(?:[v=)', no argument for repetition operator:?

I have this query that I'm using in Snowflake:
Select *,
case WHEN REGEXP_SUBSTR(NAME, '(?:\[v=)') is not null THEN REGEXP_SUBSTR(NAME, '[[]v=([0-9]+)')
else null
end
from my_table;
but when I try to run it, it tells me:
Invalid regular expression: '(?:[v=)', no argument for repetition
operator: ?
but I've tested this on regex101 and it looks like it's working. I want to check for [v=
Edit: More insight into what I'm trying to find with the regex: I have rows that look like test [v=123], words [v=444], more [v=532], and I need to look through each column, find whether it contains [v=, and extract only the numbers.
You can use
SELECT *, REGEXP_SUBSTR(Name, '[[]v=([0-9]+)', 1, 1, 'e') from my_table;
Here,
[[]v=([0-9]+) matches [v= and then captures one or more digits into Group 1
1, 1, 'e' means that matching starts from the first character and takes the first occurrence, and the e makes the engine return a capture group value. Group 1 is the default; if you needed the Group 2 value (assuming the pattern had one), you would add another parameter: REGEXP_SUBSTR(Name, '[[](v|V)=([0-9]+)', 1, 1, 'e', 2).
This should do it:
SELECT REGEXP_SUBSTR(Name, 'v=([0-9]+)', 1, 1, 'e') val from values ('foo'), ('test [v=111] ') , (' anothertest [v=222]') t(Name) ;
VAL
null
111
222
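As a quick way to experiment with the capture-group idea outside Snowflake, here is a minimal Python sketch. It is an assumption-laden stand-in: it uses Python's re module, writes the literal bracket as \[ instead of [[], and the helper name first_version is made up for this example.

```python
import re

rows = [
    "foo",
    "test [v=111] ",
    " anothertest [v=222]",
    "test [v=123], words [v=444], more [v=532]",
]

# \[ is a literal opening bracket; group 1 holds the digits after v=
pattern = re.compile(r"\[v=([0-9]+)")

def first_version(text):
    """Digits of the first [v=...] marker, or None when there is none."""
    m = pattern.search(text)
    return m.group(1) if m else None

firsts = [first_version(r) for r in rows]
all_in_last = pattern.findall(rows[-1])  # every marker in one row

print(firsts)       # [None, '111', '222', '123']
print(all_in_last)  # ['123', '444', '532']
```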

Get the data from a string between double quotes in Oracle

I have a string with double quotes inside.
EG:
<cosmtio :ff "intermit"ksks>
I need the data between the ""
I have tried the regexp_substr but still couldn't get the value between double-quotes.
We could try using REGEXP_REPLACE here:
SELECT
string,
REGEXP_REPLACE(string, '.*"([^"]+)".*', '\1') AS quoted_term
FROM yourTable;
Data:
WITH yourTable AS (
SELECT '<cosmtio :ff "intermit"ksks>' AS string FROM dual
)
Another option, using REGEXP_SUBSTR:
SELECT
string,
TRIM(BOTH '"' FROM REGEXP_SUBSTR(string, '".*"'))
FROM yourTable;
But this approach requires nesting two function calls, which means it might not outperform the REGEXP_REPLACE version.
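Both patterns above rely on a greedy .*, which matters once a string holds more than one quoted term. The Python sketch below illustrates this with a hypothetical second input; the function names are invented for the example, and Python's re may not match Oracle's behaviour in every detail.

```python
import re

one = '<cosmtio :ff "intermit"ksks>'
two = '<tag "first" and "second">'  # hypothetical input with two quoted terms

def replace_style(s):
    # Mirrors the REGEXP_REPLACE pattern: the leading greedy .* pushes
    # the captured term as far right as possible.
    return re.sub(r'.*"([^"]+)".*', r"\1", s)

def trim_style(s):
    # Mirrors REGEXP_SUBSTR with '".*"' followed by trimming the outer quotes.
    m = re.search(r'".*"', s)
    return m.group(0).strip('"') if m else None

print(replace_style(one), trim_style(one))  # intermit intermit
print(replace_style(two))                   # second
print(trim_style(two))                      # first" and "second
```

With a single quoted term the two approaches agree; with several, the first returns the last term and the second spans from the first quote to the last.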
You need to use REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR('<cosmtio :ff "intermit"ksks>', '"([^"]+)"', 1, 1, NULL, 1) AS Result FROM DUAL
The regex is simple: "([^"]+)" matches ", then captures any 1+ chars other than " into Group 1 and then matches ". The last argument is 1 telling Oracle REGEXP_SUBSTR to return the Group 1 values. The first (position) and the second (occurrence) arguments are default, 1. NULL means no specific options need to be passed to the regex engine.
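The same capture-group pattern can be tried in Python as a rough sketch (Python's re rather than Oracle's engine). The second input is hypothetical and only shows how later quoted terms, which the occurrence argument would select in SQL, line up.

```python
import re

text = '<cosmtio :ff "intermit"ksks>'

# Group 1 holds the characters between the first pair of double quotes.
m = re.search(r'"([^"]+)"', text)
quoted = m.group(1) if m else None

# With several quoted terms, findall lists group 1 of each match in order;
# index 0 corresponds to occurrence 1, index 1 to occurrence 2, and so on.
terms = re.findall(r'"([^"]+)"', '<tag "first" and "second">')

print(quoted)  # intermit
print(terms)   # ['first', 'second']
```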
You can try the following:
SELECT REGEXP_REPLACE('<cosmtio :ff "intermit"ksks>', '^[^"]*("([^"]*)")?.*', '\2') FROM dual
It is possible with regexp_substr as following:
Select
regexp_substr('<cosmtio :ff "intermit"ksks>', '[^"]+', 1, 2)
from dual;
Cheers!!

Get string until character Oracle SQL

How to get string before character?
I need to get string before ; in Oracle SQL.
For example:
147739 - Blablabla ; Blublublu
Needed output:
147739 - Blablabla
My code so far:
SELECT
UPPER(CONVERT(REGEXP_REPLACE(SUBSTR(HISTORICO, INSTR(HISTORICO, 'Doc') + 4), 'S/A', 'SA'), 'US7ASCII'))
FROM
GEQ_GL_CONC_CONTABIL_FRETES_V
WHERE
periodo = '$Periodo$' AND livro = 'ESMALTEC_FISCAL'
I want the whole string up to ;
We can use a combination of SUBSTR and INSTR to achieve this:
SELECT SUBSTR(FIELD_NAME,1,INSTR(FIELD_NAME,';', 1, 1)-1) FROM TABLE_NAME;
The second argument to SUBSTR is the position in the field value from which we want to start (1 = at the beginning), and the third argument is the length of the substring we want to read; here that is the position of ';' minus 1.
The third and fourth arguments to INSTR are where to start searching for ';' and which occurrence we are interested in. In our example that is from the beginning (1) and the first occurrence (again 1).
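The position arithmetic can be sketched in Python with two small helpers that imitate the 1-based conventions of INSTR and SUBSTR. These helpers are simplified stand-ins written for this example: they ignore INSTR's start and occurrence arguments and SUBSTR's negative positions.

```python
def instr(text, sub):
    """1-based index of the first sub in text; 0 when absent (INSTR's convention)."""
    return text.find(sub) + 1

def substr(text, position, length):
    """1-based slice; None stands in for NULL when length is below 1."""
    if length < 1:
        return None
    start = max(position, 1) - 1  # position 0 is treated like 1
    return text[start:start + length]

value = "147739 - Blablabla ; Blublublu"
before = substr(value, 1, instr(value, ";") - 1)

no_delim = "no delimiter here"
missing = substr(no_delim, 1, instr(no_delim, ";") - 1)

print(repr(before))  # '147739 - Blablabla '
print(missing)       # None
```

Note the trailing space in the first result, and that a value without any ';' yields no result at all, so a TRIM and a fallback for delimiter-free rows may be worth adding.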
You could try using substr() and instr()
select SUBSTR(my_col, 0, INSTR(my_col, ';')-1)
from my_table
select SUBSTR('A Blablabla ; Blublublu', 0, INSTR('A Blablabla ; Blublublu', ';')-1)
from dual
A few alternatives using REGEXP.
The result of each solution depends on how uniform your data is.
WITH tbl
AS (
SELECT '147739 - Blablabla ; Blublublu' str
FROM DUAL
)
SELECT TRIM(REGEXP_SUBSTR(str, '([[:alnum:]]|-| )*')) AS SOLUTION_1
, REGEXP_SUBSTR(str, '[[:digit:]]*( )?(-)?( )?[[:alpha:]]*') AS SOLUTION_2
, REGEXP_SUBSTR(str, '[[:digit:]]*( |-)*[[:alpha:]]*') AS SOLUTION_3
FROM tbl;

Retrieve the characters before a matching pattern

135 ;1111776698 ;AB555678765
I have the above string and what I am looking for is to retrieve all the digits before the first occurrence of ;.
But the number of characters before the first occurrence of ; varies i.e. it may be a 4 digit number or 3 digit number.
I have played with regexp_instr and instr, but I am unable to figure this out.
The query should return all the digits before the first occurrence of ;
This answer assumes that you are using an Oracle database. I don't know of a way to do this using REGEXP_INSTR alone, but we can do it with REGEXP_REPLACE using capture groups:
SELECT REGEXP_REPLACE('135 ;1111776698 ;AB555678765', '^\s*(\d{3,4})\s*;.*', '\1')
FROM dual;
Here is the regex pattern being used:
^\s*(\d{3,4})\s*;.*
This allows, from the start of the string, any amount of leading whitespace, followed by a 3 or 4 digit number, followed again by any amount of whitespace, then a semicolon. The .* at the end of the pattern just consumes whatever remains in your string. Note (\d{3,4}), which captures the 3-4 digit number, which is then available in the replacement as \1.
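A rough Python equivalent of this replace-with-group-1 technique is sketched below. It uses Python's re module as a stand-in for Oracle, and leading_number is a name invented for the example. As with REGEXP_REPLACE, input that does not match the pattern comes back unchanged, so rows that do not start with a 3 or 4 digit number are not cleaned up.

```python
import re

PATTERN = r"^\s*(\d{3,4})\s*;.*"

def leading_number(s):
    """Replace the whole string with group 1; unmatched input is returned as-is."""
    return re.sub(PATTERN, r"\1", s)

good = leading_number("135 ;1111776698 ;AB555678765")
padded = leading_number("  1234;rest")
unmatched = leading_number("xx135 ;1111776698")

print(good)       # 135
print(padded)     # 1234
print(unmatched)  # xx135 ;1111776698
```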
Using INSTR, SUBSTR and TRIM should work (based on your comment that there are "just white spaces and digits"):
select TRIM(SUBSTR(s,1, INSTR(s,';')-1)) FROM t;
The following using regexp_substr() should work:
SELECT s, REGEXP_SUBSTR(s, '^[^;]*') FROM t;
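For illustration, the same anchored negated-class pattern can be tried in Python. This is a sketch using Python's re, not Oracle; the helper name before_semicolon and the extra strip() are additions for the example, since the pattern itself keeps any spaces in front of the semicolon.

```python
import re

def before_semicolon(s):
    """Everything before the first ';', with surrounding whitespace stripped."""
    # re.match anchors at the start; [^;]* can match zero characters,
    # so a match object is always returned.
    return re.match(r"[^;]*", s).group(0).strip()

samples = ["135 ;1111776698 ;AB555678765", ";1111776698", "no semicolon"]
results = [before_semicolon(s) for s in samples]
print(results)  # ['135', '', 'no semicolon']
```

A string with no semicolon is returned whole, and a string starting with a semicolon gives an empty result.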
Make sure you try all possible values in that first position, even those you don't expect and make sure they are handled as you want them to be. Always expect the unexpected! This regex matches the first subgroup of zero or more optional digits (allows a NULL to be returned) when followed by an optional space then a semi-colon, or the end of the line. You may need to tighten (or loosen) up the matching rules for your situation, just make sure to test even for incorrect values, especially if the input comes from user-entered data.
with tbl(id, str) as (
select 1, '135 ;1111776698 ;AB555678765' from dual union all
select 2, ' 135 ;1111776698 ;AB555678765' from dual union all
select 3, '135;1111776698 ;AB555678765' from dual union all
select 4, ';1111776698 ;AB555678765' from dual union all
select 5, ';135 ;1111776698 ;AB555678765' from dual union all
select 6, ';;1111776698 ;AB555678765' from dual union all
select 7, 'xx135 ;1111776698 ;AB555678765' from dual union all
select 8, '135;1111776698 ;AB555678765' from dual union all
select 9, '135xx;1111776698 ;AB555678765' from dual
)
select id, regexp_substr(str, '(\d*?)( ?;|$)', 1, 1, NULL, 1) element_1
from tbl
order by id;
ID ELEMENT_1
---------- ------------------------------
1 135
2 135
3 135
4
5
6
7 135
8 135
9
9 rows selected.
To get the desired result, you can use REGEXP_SUBSTR, which extracts the part of the string that matches a pattern. Here is an example query.
Solution to your example data:
SELECT REGEXP_SUBSTR('135 ;1111776698 ;AB555678765','[^;]+',1,1) FROM DUAL;
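The pattern [^;]+ matches runs of characters other than ;, so it effectively splits the string on the ; separator. You needed the first occurrence, so I passed 1,1 (start position 1, occurrence 1). Note that the match keeps the space in front of the ;, so wrap the call in TRIM if you don't want it.
If you need the second string, 1111776698, as your output, pass 1,2 instead.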
The syntax for Regexp_substr is as following:
REGEXP_SUBSTR( string, pattern [, start_position [, nth_appearance [, match_parameter [, sub_expression ] ] ] ] )
Here is the link for more examples:
https://www.techonthenet.com/oracle/functions/regexp_substr.php
Let me know if this works for you. Best luck.

How to extract value between 2 slashes

I have a string like "1490/2334/5166400411000434" from which I need to derive the value after the second slash. I tried the logic below:
select REGEXP_SUBSTR('1490/2334/5166400411000434','[^/]+',1,3) from dual;
It works fine, but when I don't have a value between the first and second slash, it returns blank.
For example, my string is "1490//5166400411000434" and I am trying
select REGEXP_SUBSTR('1490//5166400411000434','[^/]+',1,3) from dual;
It returns blank. Please suggest what I am missing.
If I understand well, you may need
regexp_substr(t, '(([^/]*/){2})([^/]*)', 1, 1, 'i', 3)
This handles the first 2 parts like 'xxx/' and then checks for a sequence of non / characters; the parameter 3 is used to get the 3rd matching subexpression, which is what you want.
For example:
with test(t) as (
select '1490/2334/5166400411000434' from dual union all
select '1490//5166400411000434' from dual union all
select '1490//5166400411000434/ramesh/3344' from dual
)
select t, regexp_substr(t, '(([^/]*/){2})([^/]*)', 1, 1, 'i', 3) as substr
from test
gives:
T SUBSTR
---------------------------------- ----------------------------------
1490/2334/5166400411000434 5166400411000434
1490//5166400411000434 5166400411000434
1490//5166400411000434/ramesh/3344 5166400411000434
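The difference between counting fields and counting non-empty runs can be seen in a short Python sketch. It uses Python's re as an approximation of Oracle's engine, and the helper names third_field and third_run are made up for this example.

```python
import re

samples = [
    "1490/2334/5166400411000434",
    "1490//5166400411000434",
    "1490//5166400411000434/ramesh/3344",
]

def third_field(t):
    """Skip two 'xxx/' parts (xxx may be empty), then capture the next field."""
    m = re.search(r"(([^/]*/){2})([^/]*)", t)
    return m.group(3) if m else None

def third_run(t):
    """Third run of non-slash characters; empty fields are not counted."""
    runs = re.findall(r"[^/]+", t)
    return runs[2] if len(runs) >= 3 else None

by_field = [third_field(t) for t in samples]
by_run = [third_run(t) for t in samples]

print(by_field)  # ['5166400411000434', '5166400411000434', '5166400411000434']
print(by_run)    # ['5166400411000434', None, 'ramesh']
```

The [^/]+ occurrence approach skips the empty field, which is why it came back blank in the question and would pick the wrong field for longer strings.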
You can REVERSE() your string and take the value before the first slash. And then reverse again to obtain the desired output.
select reverse(regexp_substr(reverse('1490//5166400411000434'), '[^/]+', 1, 1)) from dual;
It can also be done with basic substring and instr function:
select reverse(SUBSTR(reverse('1490//5166400411000434'), 0, INSTR(reverse('1490//5166400411000434'), '/')-1)) from dual;
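A Python sketch of the reverse trick shows what it actually selects: the last slash-delimited field, which equals the third field only when nothing follows it. The helper name last_field and the second input are chosen for this illustration.

```python
import re

def last_field(t):
    """Reverse, take the first run of non-slash characters, reverse back."""
    m = re.search(r"[^/]+", t[::-1])
    return m.group(0)[::-1] if m else None

short = last_field("1490//5166400411000434")
longer = last_field("1490//5166400411000434/ramesh/3344")

print(short)   # 5166400411000434
print(longer)  # 3344
```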
Use the other options of REGEXP_SUBSTR to match a pattern:
select REGEXP_SUBSTR('1490//5166400411000434','(/\d*)/(\d+)',1,1,'x',2) from dual
Basically, it looks for a / followed by optional digits, then another / followed by digits, starting from position 1 and taking the 1st occurrence. The 'x' option ignores whitespace in the pattern, and the final 2 returns the 2nd subexpression, i.e. the content of the second pair of ().
... pattern,1,1,'x',subexp2)