Oracle regexp to match only digits after certain combination of signs - sql

I have a string which roughly looks like: XXXXXXXXX - 1234567 XXXXXXXX,
where X can be either digit, string or sign (<,>,. or space).
I need to extract only these numbers after ' - '.
I have tried following:
select regexp_substr('17.12.12 <XXXXXXXXXX> - 1234567 <XXXXXXXXXX>','(- )[0-9]{1,7}') from dual
I end up with - 1234567.
How do I get rid of the '- '?
Thank you in advance

This should work with Oracle 11g.
Place the capturing group around the pattern part you are interested in first. Since you need the digits, wrap the [0-9]{1,7} with the capturing parentheses.
Then, pass all the 6 arguments to the REGEXP_SUBSTR function where the 6th one indicates the number of capturing group you want to extract:
select regexp_substr('17.12.12 <XXXXXXXXXX> - 1234567 <XXXXXXXXXX>',' - ([0-9]{1,7})', 1,1,NULL,1) from dual
Here, 1,1,NULL,1 means: start looking for a pattern match from Position 1, just for the first match, with no specific regex options, and return the contents of Group 1.
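To see what the capturing group changes, the same pattern can be tried outside the database. The following is a minimal sketch using Python's re module rather than Oracle; extract_after_dash is a hypothetical helper name, and the sketch assumes the two engines agree for a pattern this simple.

```python
import re

# Same pattern as in the answer: a literal " - " followed by a captured run of 1-7 digits.
PATTERN = re.compile(r" - ([0-9]{1,7})")

def extract_after_dash(text):
    """Return the digits captured by group 1, or None when nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

sample = "17.12.12 <XXXXXXXXXX> - 1234567 <XXXXXXXXXX>"
whole_match = PATTERN.search(sample).group(0)   # the full match keeps the " - " prefix
digits_only = extract_after_dash(sample)        # group 1 holds only the digits
print(repr(whole_match), repr(digits_only))
```

The full match corresponds to what REGEXP_SUBSTR returns without the sixth argument; group 1 corresponds to what it returns with it.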

What @Gordon Linoff was trying to say was:
select substr(regexp_substr('17.12.12 <XXXXXXXXXX> - 1234567 <XXXXXXXXXX>','(- )[0-9]{1,7}'), 3)
from dual
Use substr to strip the leading "- " from the result.

Related

Masking a query string param value using Postgres regexp_replace

I want to mask movie names with XXXXXXXX in a PostgreSQL table column. The content of the column is something like
hollywood_genre_movieTitle0=The watergate&categorey=blabla&hollywood_genre_movieTitle1=Terminator&hollywood_genre_movieTitle2=Spartacus&hollywood_genre_movieTitle3=John Wayne and the Indians&categorey=blabla&hollywood_genre_movieTitle4=Start Trek&hollywood_genre_movieTitle5=ET&categorey=blabla
And I would like to mask the titles (behind the pattern hollywood_genre_movieTitle\d) using the regexp_replace function
regexp_replace('(hollywood_genre_movieTitle\d+=)(.*?)(&?)', '\1XXXXXXXX\3', 'g')
This just replaces the first occurrence of a title and cuts the string. In short, this expression does not do what I want. I would like all movie names to be replaced with XXXXXXXX.
Can someone help me solve that?
The regex does not work as intended because (.*?) can match an empty string and (&?) is optional, so nothing in the pattern forces the match to cover exactly one title after the hollywood_genre_movieTitle\d+= part.
You need to use a negated character class [^&] and a + quantifier to match any 1 or more chars other than & after the hollywood_genre_movieTitle\d+= pattern.
SELECT regexp_replace(
'hollywood_genre_movieTitle0=The watergate&categorey=blabla&hollywood_genre_movieTitle1=Terminator&hollywood_genre_movieTitle2=Spartacus&hollywood_genre_movieTitle3=John Wayne and the Indians&categorey=blabla&hollywood_genre_movieTitle4=Start Trek&hollywood_genre_movieTitle5=ET&categorey=blabla',
'(hollywood_genre_movieTitle\d+=)[^&]+',
'\1XXXXXXXX',
'g')
See the online demo.
Details
(hollywood_genre_movieTitle\d+=) - Capturing group 1:
hollywood_genre_movieTitle - a substring
\d+= - 1 or more digits and a = after them
[^&]+ - 1 or more chars other than &.
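The effect of the negated character class can be checked quickly outside PostgreSQL. This is a small sketch using Python's re.sub, not regexp_replace itself; the shortened query string is an assumption made for readability, and re.sub replaces all matches by default, which plays the role of the 'g' flag.

```python
import re

# Shortened version of the column value from the question (assumed sample).
query = ("hollywood_genre_movieTitle0=The watergate&categorey=blabla"
         "&hollywood_genre_movieTitle1=Terminator"
         "&hollywood_genre_movieTitle3=John Wayne and the Indians&categorey=blabla")

# Group 1 keeps the key and "="; [^&]+ consumes the title up to the next "&".
masked = re.sub(r"(hollywood_genre_movieTitle\d+=)[^&]+", r"\1XXXXXXXX", query)
print(masked)
```

Because [^&]+ cannot cross an &, each title is masked separately and the categorey parameters are left alone.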

replace all occurrences of a sub string between 2 characters using sql

Input string: ["1189-13627273","89-13706681","118-13708388"]
Expected Output: ["14013627273","14013706681","14013708388"]
What I am trying to achieve is to replace any numbers till the '-' for each item with hard coded text like '140'
SELECT replace(value_to_replace, '-', '140')
FROM (
VALUES ('1189-13627273-77'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
check this
I found the right way to achieve that using the below regular expression.
SELECT REGEXP_REPLACE (string_to_change, '\\"[0-9]+\\-', '140')
You don't need a regexp for this, it's as easy as concatenation of 140 and the substring from - (or the second part when you split by -)
select '140'||substring('89-13706681' from position('-' in '89-13706681')+1 for 1000)
select '140'||split_part('89-13706681','-',2)
Also, it's important to consider whether some values might not contain a - and what the output should be in that case.
Use regexp_replace(text,text,text) function to do so giving the pattern to match and replacement string.
First argument is the value to be replaced, second is the POSIX regular expression and third is a replacement text.
Example
SELECT regexp_replace('1189-13627273', '.*-', '140');
Output: 14013627273
Sample data set query
SELECT regexp_replace(value_to_replace, '.*-', '140')
FROM (
VALUES ('1189-13627273'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
Caution! The pattern .*- is greedy: it replaces every character up to and including the last occurrence of - with the text 140.
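The difference shows up with a value that contains two dashes, such as the '1189-13627273-77' from the question. The sketch below uses Python's re.sub with count=1 as a stand-in for regexp_replace without the 'g' flag; the anchored pattern ^[0-9]+- is an assumption about what "numbers till the '-'" should mean.

```python
import re

value = "1189-13627273-77"

# Greedy: .* runs to the last "-", so everything before "77" is replaced.
greedy = re.sub(r".*-", "140", value, count=1)

# Anchored: only the leading digits and the first "-" are replaced.
anchored = re.sub(r"^[0-9]+-", "140", value, count=1)

print(greedy, anchored)
```

With single-dash values such as '89-13706681' both patterns give the same result.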

Comparing fields when a field has data in between 2 characters that match the field being compared

I have code that looks like this:
left outer join
gme_batch_header bh
on
substr(ln.lot_number,instr(ln.lot_number,'(') + 1,
instr(ln.lot_number,')') - instr(ln.lot_number,'(') - 1)
=
bh.batch_no
It works fine, but I have come across a few lot numbers that have two sections of text between parentheses. How would I compare what is between the second set of parentheses? Here is an example of the data in the lot number field:
E142059-307-SCRAP-(74055)
This one works with the code,
58LF-3-B-2-2-2 (SCRAP)-(61448)
This one tries comparing SCRAP with the batch no, which isn't correct. It needs to be the 61448.
The result is always the last item in parenthesis.
After more research, I actually got it to work with this code:
substr(ln.lot_number,instr(ln.lot_number,'(',-1) + 1, instr(ln.lot_number,')',-1) - instr(ln.lot_number,'(',-1) - 1)
Assuming SQL Server 2005+, and that it is always the last occurrence you want, I would suggest finding the last ( in the lot number and taking the substring from there. To get the text after the last ( without the closing ), you could use something like:
REVERSE(SUBSTRING(REVERSE(lot_number), 2, CHARINDEX('(', REVERSE(lot_number)) - 2))
If your version of Oracle supports regular expressions try this:
substr(regexp_substr(ln.lot_number,'[0-9]+\)$'),1,length(regexp_substr(ln.lot_number,'[0-9]+\)$'))-1)
Explanation:
regexp_substr(scrap_row,'[0-9]+\)$') ==> find the digits immediately before a ) at the end of the string. This returns the digits together with the closing parenthesis.
To remove the closing parenthesis, pass the result through substr, taking characters from position 1 up to one character short of its length.
Query for analysis:
with scrap
as (select '58LF-3-B-2-2-2 (SCRAP)-(61448)' as scrap_row from dual)
select scrap_row,
regexp_substr(scrap_row,'[0-9]+\)$') as regex_substring,
length(regexp_substr(scrap_row,'[0-9]+\)$')) as length_regex_substring,
substr(regexp_substr(scrap_row,'[0-9]+\)$'),1,length(regexp_substr(scrap_row,'[0-9]+\)$'))-1) as regex_sans_parenthesis
from scrap
If you have 11g, this will do it pretty simply by using the subgroup argument of regexp_substr() and constructing the regex appropriately:
SQL> with tbl(data) as
(
select 'E142059-307-SCRAP-(74055)' from dual
union
select '58LF-3-B-2-2-2 (SCRAP)-(61448)' from dual
)
select data from tbl
where regexp_substr(data, '\((\d+)\)$', 1, 1, NULL, 1)
= '61448';
DATA
------------------------------
58LF-3-B-2-2-2 (SCRAP)-(61448)
The regular expression can be read as:
\( - Search for a literal left paren
( - Start a remembered subgroup
\d+ - followed by 1 or more digits
) - End remembered subgroup
\) - followed by a literal right paren
$ - at the end of the line.
The regexp_substr function arguments are:
Source - the source string
Pattern - The regex pattern to look for
position - Position in the string to start looking for the pattern
occurrence - If the pattern occurs multiple times, which occurrence you want
match_params - See the docs, not used here
subexpression - which subexpression to use (the remembered group)
So in English, look for a series of 1 or more digits surrounded by parens, where it occurs at the end of the line and save the digit part only to use to compare. IMHO a lot easier to follow/maintain than nested instr(), substr().
For reusability, make a function called get_last_number_in_parens() that contains this code and takes the string to search as its argument. This way the logic is encapsulated and can be reused by people who may not be comfortable with regular expressions but can still benefit from their power. There is also only one place to maintain the code. Then call it like this:
select data from tbl
where get_last_number_in_parens(data) = '61448';
How easy is that?!
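The body of such a function is not shown in the answer. As a rough illustration of the idea only, here is a sketch in Python rather than PL/SQL; it borrows the name get_last_number_in_parens from the answer, and returning None for a non-matching string is an assumption that loosely mirrors REGEXP_SUBSTR returning NULL.

```python
import re

# Digits wrapped in parentheses at the very end of the string; group 1 is the digits.
# re.ASCII restricts \d to 0-9.
_LAST_PAREN_NUMBER = re.compile(r"\((\d+)\)$", re.ASCII)

def get_last_number_in_parens(lot_number):
    """Return the digits inside the final (...) of lot_number, or None if absent."""
    match = _LAST_PAREN_NUMBER.search(lot_number)
    return match.group(1) if match else None

lots = ["E142059-307-SCRAP-(74055)", "58LF-3-B-2-2-2 (SCRAP)-(61448)"]
for lot in lots:
    print(lot, "->", get_last_number_in_parens(lot))
```

Callers only see the function name, so the pattern can later be changed in one place.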
Hello, you can check with this code. It works whatever the condition may be, as long as the string ends with the closing parenthesis.
SELECT SUBSTR('58LF-3-B-2-2-2-(61448)',instr('58LF-3-B-2-2-2-(61448)','(',-1)+1,LENGTH('58LF-3-B-2-2-2-(61448)')-instr('58LF-3-B-2-2-2-(61448)','(',-1)-1)
FROM dual;
SELECT SUBSTR('58LF-3-B-2-2-2 (SCRAP)-(61448)',instr('58LF-3-B-2-2-2 (SCRAP)-(61448)','(',-1)+1,LENGTH('58LF-3-B-2-2-2 (SCRAP)-(61448)')-instr('58LF-3-B-2-2-2 (SCRAP)-(61448)','(',-1)-1)
FROM dual;
Output
==================================
61448
==================================

using oracle sql substr to get last digits

I have a result of a query and am supposed to get the final digits of one column say 'term'
The value of column term can be like:
'term' 'number' (output)
---------------------------
xyz012 12
xyz112 112
xyz1 1
xyz02 2
xyz002 2
xyz88 88
Note: Not limited to the above scenarios; the requirement is that the last 3 or fewer characters can be digits.
Function I used: to_number(substr(term.name,-3))
(Initially I assumed the requirement as last 3 characters are always digit, But I was wrong)
I am using to_number because if last 3 digits are '012' then number should be '12'
But as one can see, some specific cases (like 'xyz88' or 'xyz1') would give a
ORA-01722: invalid number
How can I achieve this using substr or regexp_substr ?
Did not explore regexp_substr much.
Using REGEXP_SUBSTR,
select column_name, to_number(regexp_substr(column_name,'\d+$'))
from table_name;
\d matches digits. Along with +, it becomes a group with one or more digits.
$ matches end of line.
Putting it together, this regex extracts a group of digits at the end of a string.
More details here.
Demo here.
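The \d+$ idea can be checked against the sample terms from the question. This is a sketch in Python, not Oracle: trailing_number is a hypothetical helper, int() stands in for to_number, and None stands in for the NULL that regexp_substr returns when there are no trailing digits.

```python
import re

def trailing_number(term):
    """Return the run of digits at the end of term as an int, or None if there is none."""
    match = re.search(r"[0-9]+$", term)
    return int(match.group()) if match else None

samples = ["xyz012", "xyz112", "xyz1", "xyz02", "xyz002", "xyz88"]
results = {term: trailing_number(term) for term in samples}
print(results)
```

Note that the pattern does not cap the run at three digits, so a term such as 'xyz1234' would yield 1234; a bounded quantifier like {1,3} could be used if that matters.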
Oracle has the function regexp_instr() which does what you want:
select term, cast(substr(term, 1-regexp_instr(reverse(term),'[^0-9]')) as int) as num
from table_name;
select SUBSTRING(acc_no,len(acc_no)-1,len(acc_no)) from table_name;

How to Select a substring in Oracle SQL up to a specific character?

Say I have a table column that has results like:
ABC_blahblahblah
DEFGH_moreblahblahblah
IJKLMNOP_moremoremoremore
I would like to be able to write a query that selects this column from said table, but only returns the substring up to the Underscore (_) character. For example:
ABC
DEFGH
IJKLMNOP
The SUBSTRING function doesn't seem to be up to the task because it is position-based and the position of the underscore varies.
I thought about the TRIM function (the RTRIM function specifically):
SELECT RTRIM('listofchars' FROM somecolumn)
FROM sometable
But I'm not sure how I'd get this to work since it only seems to remove a certain list/set of characters and I'm really only after the characters leading up to the Underscore character.
Using a combination of SUBSTR, INSTR, and NVL (for strings without an underscore) will return what you want:
SELECT NVL(SUBSTR('ABC_blah', 0, INSTR('ABC_blah', '_')-1), 'ABC_blah') AS output
FROM DUAL
Result:
output
------
ABC
Use:
SELECT NVL(SUBSTR(t.column, 0, INSTR(t.column, '_')-1), t.column) AS output
FROM YOUR_TABLE t
Reference:
SUBSTR
INSTR
Addendum
If using Oracle10g+, you can use regex via REGEXP_SUBSTR.
This can be done using REGEXP_SUBSTR easily.
Please use
REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1)
where STRING_EXAMPLE is your string.
Try:
SELECT
REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1)
from dual
It will solve your problem.
You need to get the position of the first underscore (using INSTR) and then get the part of the string from the 1st character to (pos-1) using SUBSTR.
select 'ABC_blahblahblah' test_string,
       instr('ABC_blahblahblah','_',1,1) position_underscore,
       substr('ABC_blahblahblah',1,instr('ABC_blahblahblah','_',1,1)-1) result
from dual;
TEST_STRING      POSITION_UNDERSCORE RES
---------------- ------------------- ---
ABC_blahblahblah                   4 ABC
Instr documentation
Substr documentation
SELECT REGEXP_SUBSTR('STRING_EXAMPLE','[^_]+',1,1) from dual
is the right answer, as posted by user1717270
INSTR returns the position of "_" only if the string actually contains one. What if it doesn't? Then it returns 0, so the SUBSTR built on that position returns NULL instead of the string.
Example: If you want to remove the domain from a "host.domain". In some cases you will only have the short name, i.e. "host". Most likely you would like to print "host". Well, with INSTR it will give you a NULL because it did not find any ".", i.e. it will print from 0 to 0. With REGEXP_SUBSTR you will get the right answer in all cases:
SELECT REGEXP_SUBSTR('HOST.DOMAIN','[^.]+',1,1) from dual;
HOST
and
SELECT REGEXP_SUBSTR('HOST','[^.]+',1,1) from dual;
HOST
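The behaviour of the [^.]+ pattern, including one edge case the answer does not mention, can be sketched with Python's re module. This is not Oracle; before_delimiter is a hypothetical helper, and the leading-delimiter result is shown because REGEXP_SUBSTR with this kind of pattern likewise skips a leading delimiter instead of returning an empty value.

```python
import re

def before_delimiter(text, delimiter):
    """Return the first run of characters that are not the delimiter, or None."""
    pattern = "[^" + re.escape(delimiter) + "]+"
    match = re.search(pattern, text)
    return match.group() if match else None

print(before_delimiter("HOST.DOMAIN", "."))       # delimiter present
print(before_delimiter("HOST", "."))              # delimiter absent
print(before_delimiter("ABC_blahblahblah", "_"))  # the question's data
print(before_delimiter("_blah", "_"))             # leading delimiter is skipped
```

If values can start with the delimiter, the SUBSTR/INSTR approach and the regex approach give different answers, so it is worth deciding which one is wanted.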
Another possibility would be the use of REGEXP_SUBSTR.
If the position of the string is not fixed, the SELECT statement below returns the expected output.
Table Structure
ID VARCHAR2(100 BYTE)
CLIENT VARCHAR2(4000 BYTE)
Data:
ID   CLIENT
1001 {"clientId":"con-bjp","clientName":"ABC","providerId":"SBS"}
1002 {"IdType":"AccountNo","Id":"XXXXXXXX3521","ToPricingId":"XXXXXXXX3521","clientId":"Test-Cust","clientName":"MFX"}
Requirement - Search ClientId string in CLIENT column and return the corresponding value. Like From "clientId":"con-bjp" --> con-bjp(Expected output)
select CLIENT,substr(substr(CLIENT,instr(CLIENT,'"clientId":"')+length('"clientId":"')),1,instr(substr(CLIENT,instr(CLIENT,'"clientId":"')+length('"clientId":"')),'"',1 )-1) cut_str from TEST_SC;
--
CLIENT cut_str
----------------------------------------------------------- ----------
{"clientId":"con-bjp","clientName":"ABC","providerId":"SBS"} con-bjp
{"IdType":"AccountNo","Id":"XXXXXXXX3521","ToPricingId":"XXXXXXXX3521","clientId":"Test-Cust","clientName":"MFX"} Test-Cust
Keep this in mind if not all strings in the column contain an underscore
(otherwise the output will be NULL for those rows):
SELECT COALESCE
(SUBSTR("STRING_COLUMN" , 0, INSTR("STRING_COLUMN", '_')-1),
"STRING_COLUMN")
AS OUTPUT FROM DUAL
To find a substring within a larger string:
string_value := 'This is String,Please search string';
Then, to extract the string 'Ple' from string_value, we can do:
select substr(string_value,instr(string_value,'Ple'),length('Ple')) from dual;
You will find result: Ple