REGEXP_SUBSTR oracle 11g - sql

All,
I have input data coming in the following format:
"CUStid":["201","217"],"HireDate":"2016-06-24","EndDate":"2016-08-23"
(or)
"CUStid":["301",""],"HireDate":"2016-06-24","EndDate":"2016-08-23"
I need to output just the customer id from the input data, and when I run the following it is returning null
(REGEXP_SUBSTR (inputdata,
'"CUStid":"([^"]*)"',
1,
1,
NULL,
1))
expected output is:
CUStid1 Custid2
201 217
301
Can someone please tell me how I can do this in Oracle 11g?
Thanks

Use
REGEXP_SUBSTR (inputdata,'"CUStid":\[(.+)\]',1,1,null,1)
to get "301","" or "201","217". The [ and ] must be escaped with \, and the pattern must not put a " between the colon and the [, because the input has none there.
Use string manipulation to get the multiple values as separate columns.
If there can be a maximum of 2 values separated by a comma, use
select
substr(replace(val,'"',''), 1, instr(replace(val,'"',''),',')-1) custid1
,substr(replace(val,'"',''), instr(replace(val,'"',''),',')+1) custid2
from (select REGEXP_SUBSTR (inputdata,'"CUStid":\[(.+)\]',1,1,null,1) val
from tablename)

The quick answer is that in your search pattern you are missing a [.
It should be:
'"CUStid":["([^"]*)"' (notice the [ after the colon : )
... and actually (TESTING HELPS!) the [ is a metacharacter in regular expressions, so it must be escaped:
'"CUStid":\["([^"]*)"'
The longer answer is that you need to be able to pick up the second customer id as well, but perhaps you already know how to do that. Good luck!
Edited
Here is the full query:
with inputs (inputdata) as (
select '"CUStid":["201","217"],"HireDate":"2016-06-24","EndDate":"2016-08-23"' from dual
union all
select '"CUStid":["301",""],"HireDate":"2016-06-24","EndDate":"2016-08-23"' from dual
)
select regexp_substr(inputdata, '"CUStid":\["([^"]*)"', 1, 1, null, 1) as custid1,
regexp_substr(inputdata, '"CUStid":\["[^"]*","([^"]*)"', 1, 1, null, 1) as custid2
from inputs;
Result:
CUSTID1 CUSTID2
---------- ----------
201 217
301
2 rows selected.

Related

Match content before optional string

Given the following string, I would like to produce: A-2010:
/Space/w_00123/A-2010/u_23
/Space/w_00123/A-2010 (The /u_23 from above is optional, so missing here)
So the text between the 3rd and 4th / (or between the 3rd / and the end of the string, if there is no 4th) is what I really need.
I tried:
select
regexp_substr('/Space/w_00123/A-2010/u_23', '/Space/w_.+/(.*?)(?:/.+|$)', 1, 1, null, 1) r
from dual; -- this results in u_23 as opposed to A-2010
What's the right matcher expression here?
Calling regexp_substr with 3 as the 4th argument (the occurrence) gives what you need:
with t(str) as
(
select '/Space/w_00123/A-2010/u_23' from dual union all
select '/Space/w_00123/A-2010' from dual
)
select regexp_substr(str,'([^/]+)',1,3) as "Result Strings"
from t;
Result Strings
--------------
A-2010
A-2010
Try it like this using negated classes:
select
regexp_substr('/Space/w_00123/A-2010/u_23', '/Space/w_[^/]+/([^/]+)', 1, 1, null, 1) r
from dual
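A quick way to check the negated-class pattern is to run it against both forms of the path. This is a minimal sketch; the inline view t and its column str are just illustrative names. As far as I know, Oracle's regular expression engine does not support the (?:...) non-capturing group used in the original attempt, which is one more reason to prefer negated classes here.

```sql
-- Sample rows: one with the optional trailing segment, one without.
with t (str) as (
  select '/Space/w_00123/A-2010/u_23' from dual union all
  select '/Space/w_00123/A-2010' from dual
)
-- [^/]+ cannot cross a slash, so group 1 stops at the next / or at end of string.
select str,
       regexp_substr(str, '/Space/w_[^/]+/([^/]+)', 1, 1, null, 1) as r
from t;
```

Both rows should come back with A-2010 in the r column.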

Retrieve the characters before a matching pattern

135 ;1111776698 ;AB555678765
I have the above string and what I am looking for is to retrieve all the digits before the first occurrence of ;.
But the number of characters before the first occurrence of ; varies i.e. it may be a 4 digit number or 3 digit number.
I have played with REGEXP_INSTR and INSTR, but I am unable to figure this out.
The query should return all the digits before the first occurrence of ;
This answer assumes that you are using Oracle Database. I don't know of a way to do this using REGEXP_INSTR alone, but we can do it with REGEXP_REPLACE using capture groups:
SELECT REGEXP_REPLACE('135 ;1111776698 ;AB555678765', '^\s*(\d{3,4})\s*;.*', '\1')
FROM dual;
Here is the regex pattern being used:
^\s*(\d{3,4})\s*;.*
This allows, from the start of the string, any amount of leading whitespace, followed by a 3 or 4 digit number, followed again by any amount of whitespace, then a semicolon. The .* at the end of the pattern just consumes whatever remains in your string. Note that (\d{3,4}) captures the 3 or 4 digit number, which is then available in the replacement as \1.
Using INSTR, SUBSTR and TRIM should work (based on your comment that there are "just white spaces and digits"):
select TRIM(SUBSTR(s,1, INSTR(s,';')-1)) FROM t;
The following using regexp_substr() should work:
SELECT s, REGEXP_SUBSTR(s, '^[^;]*') FROM t;
Make sure you try all possible values in that first position, even those you don't expect and make sure they are handled as you want them to be. Always expect the unexpected! This regex matches the first subgroup of zero or more optional digits (allows a NULL to be returned) when followed by an optional space then a semi-colon, or the end of the line. You may need to tighten (or loosen) up the matching rules for your situation, just make sure to test even for incorrect values, especially if the input comes from user-entered data.
with tbl(id, str) as (
select 1, '135 ;1111776698 ;AB555678765' from dual union all
select 2, ' 135 ;1111776698 ;AB555678765' from dual union all
select 3, '135;1111776698 ;AB555678765' from dual union all
select 4, ';1111776698 ;AB555678765' from dual union all
select 5, ';135 ;1111776698 ;AB555678765' from dual union all
select 6, ';;1111776698 ;AB555678765' from dual union all
select 7, 'xx135 ;1111776698 ;AB555678765' from dual union all
select 8, '135;1111776698 ;AB555678765' from dual union all
select 9, '135xx;1111776698 ;AB555678765' from dual
)
select id, regexp_substr(str, '(\d*?)( ?;|$)', 1, 1, NULL, 1) element_1
from tbl
order by id;
ID ELEMENT_1
---------- ------------------------------
1 135
2 135
3 135
4
5
6
7 135
8 135
9
9 rows selected.
To get the desired result, use REGEXP_SUBSTR, which extracts the part of the string that matches a pattern. Here is an example query.
Solution to your example data:
SELECT REGEXP_SUBSTR('135 ;1111776698 ;AB555678765','[^;]+',1,1) FROM DUAL;
Here the pattern [^;]+ matches a run of characters that are not ;, so each occurrence is one ;-separated piece of the string. You needed the first piece, so I passed 1,1 (start position 1, occurrence 1).
If you need the second piece (1111776698) as your output, pass 1,2 instead.
The syntax for Regexp_substr is as following:
REGEXP_SUBSTR( string, pattern [, start_position [, nth_appearance [, match_parameter [, sub_expression ] ] ] ] )
Here is the link for more examples:
https://www.techonthenet.com/oracle/functions/regexp_substr.php
Let me know if this works for you. Best of luck.
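One detail worth checking with the [^;]+ and ^[^;]* patterns above: they keep any whitespace that sits before the semicolon. The sketch below (the inline view t and column s are illustrative names) shows the raw piece, its length, and a trimmed version. The TO_NUMBER step is an optional assumption on my part; it will raise ORA-01722 if the piece is not purely numeric.

```sql
with t (s) as (
  select '135 ;1111776698 ;AB555678765' from dual
)
select regexp_substr(s, '[^;]+', 1, 1)                  as raw_piece,     -- '135 ' (trailing space kept)
       length(regexp_substr(s, '[^;]+', 1, 1))          as raw_len,       -- 4
       trim(regexp_substr(s, '[^;]+', 1, 1))            as trimmed_piece, -- '135'
       to_number(trim(regexp_substr(s, '[^;]+', 1, 1))) as as_number      -- 135
from t;
```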

regexp_substr strip text between first forward slash and second one

/abc/required_string/2/ should return abc with regexp_substr
SELECT REGEXP_SUBSTR ('/abc/blah/blah/', '/([a-zA-Z0-9]+)/', 1, 1, NULL, 1) first_val
from dual;
You might try the following:
SELECT TRIM('/' FROM REGEXP_SUBSTR(mycolumn, '^\/([^\/]+)'))
FROM mytable;
This regular expression will match the first occurrence of a pattern starting with / (I habitually escape /s in regular expressions, hence \/ which won't hurt anything) and including any non-/ characters that follow. If there are no such characters then it will return NULL.
Hope this helps.
You can search for /([^/]+)/, which says:
/ forward slash
( start of subexpression (usually called "group" in other languages)
[^/] any character other than forward slash
+ match the preceding expression one or more times
) end of subexpression
/ forward slash
You can use the 6th argument to regexp_substr to select a subexpression.
Here we pass 1 to match only the characters between the /s:
select regexp_substr(txt, '/([^/]+)/', 1, 1, null, 1)
from t1
Classic SUBSTR + INSTR offer a simple solution; I know you specified regular expressions, but - consider this too, might work better for a large data volume.
with test (col) as
(select '/abc/required_string/2/' from dual)
select substr(col, 2, instr(col, '/', 1, 2) - 2) result
from test;
RES
---
abc
Here's another way to get the 2nd occurrence of a string of characters followed by a forward slash. It handles the problem if that element happens to be NULL as well. Always expect the unexpected!
Note: If you use the regex form of [^/]+ and that element is NULL, it skips the empty element and returns the next non-empty one (required_string1 for the second row below), which is NOT what you expect! That form does NOT handle NULL elements. See here for more info: [https://stackoverflow.com/a/31464699/2543416]
with tbl(str) as (
select '/abc/required_string/2/' from dual union all
select '//required_string1/3/' from dual
)
select regexp_substr(str, '(.*?)(/)', 1, 2, null, 1)
from tbl;
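To make the difference concrete, the two forms can be run side by side on the same rows. This is only a sketch; the column aliases skips_empty and keeps_empty are names I made up for illustration.

```sql
with tbl (str) as (
  select '/abc/required_string/2/' from dual union all
  select '//required_string1/3/' from dual
)
select str,
       -- [^/]+ needs at least one character, so an empty element is skipped.
       regexp_substr(str, '[^/]+', 1, 1)             as skips_empty,
       -- (.*?)(/) also matches an empty element; occurrence 2 is the text
       -- between the first and second slash.
       regexp_substr(str, '(.*?)(/)', 1, 2, null, 1) as keeps_empty
from tbl;
```

For the first row both columns return abc. For the second row skips_empty returns required_string1, while keeps_empty returns NULL, which reflects the empty first element.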

How to extract the number from a string using Oracle?

I have a string as follows: first, last (123456) the expected result should be 123456. Could someone help me in which direction should I proceed using Oracle?
It will depend on the actual pattern you care about (I assume "first" and "last" aren't literal hard-coded strings), but you will probably want to use regexp_substr.
For example, this matches anything between a pair of parentheses (which will work for your example), but you might need more sophisticated criteria if your actual data has multiple pairs of parentheses or something.
-- [^)] needs no backslash: inside a bracket expression, ) is already literal
SELECT regexp_substr(COLUMN_NAME, '\(([^)]*)\)', 1, 1, 'i', 1)
FROM TABLE_NAME
Your question is ambiguous and needs clarification. Based on your comment it appears you want to select the six digits after the left bracket. You can use the Oracle instr function to find the position of a character in a string, and then feed that into the substr to select your text.
select substr(mycol, instr(mycol, '(') + 1, 6) from mytable
Or if there are a varying number of digits between the brackets:
select substr(mycol, instr(mycol, '(') + 1, instr(mycol, ')') - instr(mycol, '(') - 1) from mytable
Find the last ( and get the sub-string after without the trailing ) and convert that to a number:
Oracle 11g R2 Schema Setup:
CREATE TABLE test ( str ) AS
SELECT 'first, last (123456)' FROM DUAL UNION ALL
SELECT 'john, doe (jr) (987654321)' FROM DUAL;
Query 1:
SELECT TO_NUMBER(
TRIM(
TRAILING ')' FROM
SUBSTR(
str,
INSTR( str, '(', -1 ) + 1
)
)
) AS value
FROM test
Results:
| VALUE |
|-----------|
| 123456 |
| 987654321 |
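If a regular expression is preferred, the same "last parenthesised value" idea can be sketched by anchoring the pattern to the end of the string. This is a different technique from the INSTR/SUBSTR query above, and it assumes the value is made of digits only and that the closing ) is the last character of the string.

```sql
with test (str) as (
  select 'first, last (123456)' from dual union all
  select 'john, doe (jr) (987654321)' from dual
)
-- \( ... \)$ : a parenthesised run of digits that ends the string;
-- subexpression 1 is the digits themselves.
select to_number(regexp_substr(str, '\((\d+)\)$', 1, 1, null, 1)) as value
from test;
```

With the two sample rows this should return 123456 and 987654321. Rows that do not end in a parenthesised number return NULL.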

Oracle SQL split column using string functions

In a table (football_team), the values in a column (names) look like this: Andrew Luck , QB. I want to split this column into 3 columns, first_name, last_name and position, using PL/SQL functions.
I tried this:
select regexp_substr(names,'[^ ,"]+',1,1) as first_name,
regexp_substr(names,'[^ ,"]+',1,2) as last_name,
regexp_substr(names,'[^ ,"]+',1,3) as position from football_team;
but it doesn't work.
Can I do it using only the SUBSTR and INSTR functions?
Please help me. Thanks in advance.
Yes you could use string functions too but IMO regexp is much simpler here. Well as long as you can read regexps, YMMV.
The problem is in the regular expressions. Try this instead:
with names(name) as (
select 'Andrew Luck , QB' from dual
union all
select 'John Doe , WB' from dual
)
select
regexp_substr(name, '^([[:alpha:]]+)', 1, 1, '', 1) as firstname
,regexp_substr(name, '^[[:alpha:]]+[[:space:]]+([[:alpha:]]+)', 1, 1, '', 1) as lastname
,regexp_substr(name, '([[:alpha:]]+)$', 1, 1, '', 1) as position
from names
;
Returns:
FIRSTNAME LASTNAME POSITION
--------- -------- --------
Andrew Luck QB
John Doe WB
The regular expression that matches firstname, explained:
^ -- start of the string
( -- start of subexpression group that is referenced by regexp_substr parameter #6 (subexpr)
[ -- start of matching character list
[:alpha:] -- character class expression: alphabetic characters
] -- end of matching character list
+ -- matches one or more occurrences of the preceding subexpression
) -- end of subexpression group
The explanation of the other two regexps is left as an exercise for the OP.
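Since the question also asked about SUBSTR and INSTR, here is a rough sketch of that route. It assumes the first name is a single word followed by a space and that exactly one comma separates the name from the position. TRIM is used in addition to SUBSTR and INSTR to drop the spaces around the comma.

```sql
with names (name) as (
  select 'Andrew Luck , QB' from dual union all
  select 'John Doe , WB' from dual
)
select -- everything before the first space
       substr(name, 1, instr(name, ' ') - 1) as first_name,
       -- everything between the first space and the comma
       trim(substr(name, instr(name, ' ') + 1,
                   instr(name, ',') - instr(name, ' ') - 1)) as last_name,
       -- everything after the comma
       trim(substr(name, instr(name, ',') + 1)) as position
from names;
```

For the two sample rows this should give Andrew / Luck / QB and John / Doe / WB. Names with more than two words would need different rules.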