split a string on multiple delimiters - sql

I have a SQL query that returns a string like the example below:
Input: PKIND:BCMOX:10048301-
Output: BCMOX:10048301
I need to write code that first takes the part of the string before the -, then splits it on : and returns the 2nd and 3rd items (BCMOX:10048301).

If the string format is consistent and you want to extract everything after the first : until the first occurrence of -, use a combination of substr and instr.
select substr(col, instr(col,':')+1, instr(col,'-')-instr(col,':')-1)
from yourtable
where instr(col,':') > 0 and instr(col,'-') > 0 --to get the rows which have these 2 characters
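To sanity-check the position arithmetic outside the database, here is a minimal Python sketch of the same substr/instr idea. The function name is hypothetical, and note that Python's find() is 0-based and returns -1 when nothing is found, whereas Oracle's INSTR is 1-based and returns 0.

```python
def between_first_colon_and_hyphen(value):
    """Return the text after the first ':' and before the first '-', else None."""
    colon = value.find(":")    # -1 when absent (Oracle INSTR would return 0)
    hyphen = value.find("-")
    # Mirror the WHERE clause: both characters must be present.
    # A hyphen before the colon gives a negative length in SQL, i.e. NULL.
    if colon < 0 or hyphen < 0 or hyphen < colon:
        return None
    return value[colon + 1:hyphen]


print(between_first_colon_and_hyphen("PKIND:BCMOX:10048301-"))  # BCMOX:10048301
```

Rows without both delimiters come back as None here, which plays the role of the filtered-out rows in the query.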

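The REGEXP_SUBSTR version. It returns everything between the first colon and the last hyphen (the .* is greedy). With a single trailing hyphen, as in the example, that is the same as the first hyphen; if the value itself can contain hyphens, a pattern such as ':([^-]*)-' stops at the first one instead.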
select regexp_substr('PKIND:BCMOX:10048301-', ':(.*)-', 1, 1, NULL, 1) from dual;
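The effect of the greedy .* is easy to see with an input that has two hyphens. The sketch below uses Python's re module rather than Oracle, and the second input string and the [^-]* variant are assumptions added for illustration, not part of the original answer.

```python
import re


def capture(pattern, text):
    """Return capture group 1 of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


GREEDY = r":(.*)-"            # the pattern from the answer
FIRST_HYPHEN = r":([^-]*)-"   # assumed variant that stops at the first hyphen

print(capture(GREEDY, "PKIND:BCMOX:10048301-"))         # BCMOX:10048301
print(capture(GREEDY, "PKIND:BCMOX:100-48301-"))        # BCMOX:100-48301
print(capture(FIRST_HYPHEN, "PKIND:BCMOX:100-48301-"))  # BCMOX:100
```

For the single-hyphen input in the question both patterns return the same value.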

Related

How to get file name without extension with using Regular Expressions

I have a field with the following values, and I want to extract only the rows whose value contains "xyz". Can you please help?
Mydata_xyz_aug21
Mydata2_zzz_aug22
Mydata3_xyz_aug33
One more requirement:
I want to extract only "aIBM_MyProjectFile" from the string below. Can you please help me with this?
finaldata/mydata/aIBM_MyProjectFile.exe.ld
I've tried this, but it didn't work:
select
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','([^/]*)[\.]') exp
from dual;

To extract substrings between the first pair of underscores, you need to use
regexp_substr('Mydata_xyz_aug21','_([^_]+)_', 1, 1, NULL, 1)
To get the file name without the extension, you need
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','.*/([^.]+)', 1, 1, NULL, 1)
Note that each regex contains a capturing group (a pattern inside (...)) and this value is accessed with the last 1 argument to the regexp_substr function.
The _([^_]+)_ pattern finds the first _, then places 1 or more chars other than _ into Group 1 and then matches another _.
The .*/([^.]+) pattern matches the whole text up to the last /, then captures 1 or more chars other than . into Group 1 using ([^.]+).
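Both patterns can be tried outside the database. This is a small sketch with Python's re module, where group(1) plays the role of the last argument 1 in regexp_substr; the helper name is hypothetical.

```python
import re


def group_one(pattern, text):
    """Return capture group 1 of the first match, or None if there is no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Text between the first pair of underscores.
print(group_one(r"_([^_]+)_", "Mydata_xyz_aug21"))  # xyz

# File name without any extension: skip to the last '/', then stop at the first '.'.
path = "FinalProject/MyProject/aIBM_MyProjectFile.exe.ld"
print(group_one(r".*/([^.]+)", path))  # aIBM_MyProjectFile
```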

For the first requirement, LIKE is sufficient:
SELECT column
FROM table
WHERE column LIKE '%xyz%';
For your second requirement (extraction), you will have to use the REGEXP_SUBSTR function:
SELECT REGEXP_SUBSTR ('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld', '.*/([^.]+)', 1, 1, NULL, 1)
FROM DUAL
I hope it helped!

Another way to do this is to skip regexp completely:
WITH aset AS (
    SELECT 'with_extension.txt' txt FROM DUAL
    UNION ALL
    SELECT 'without_extension' FROM DUAL
)
SELECT CASE
           WHEN INSTR(txt, '.', -1) > 0
           THEN SUBSTR(txt, 1, INSTR(txt, '.', -1) - 1)
           ELSE txt
       END txt
FROM aset
The result of this is
with_extension
without_extension
A big caveat, where the regexp is better:
My method doesn't handle this case correctly, because the last dot sits in a directory name rather than in the file name:
\this\is.a\test
So, after all this effort, stay with the regexp solutions. I'll leave this here so that others may learn from it.
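The caveat is easy to reproduce. Below is a minimal Python sketch of the same "cut at the last dot" logic (the function name is hypothetical); rfind() stands in for INSTR with a position of -1.

```python
def strip_after_last_dot(txt):
    """Drop everything from the last '.' onward; return txt unchanged if no dot."""
    dot = txt.rfind(".")   # -1 when there is no dot at all
    return txt[:dot] if dot != -1 else txt


print(strip_after_last_dot("with_extension.txt"))   # with_extension
print(strip_after_last_dot("without_extension"))    # without_extension
# The problem case: the only dot belongs to a directory name.
print(strip_after_last_dot(r"\this\is.a\test"))     # \this\is
```

The last call shows the failure: the file name test is lost because the cut happens inside the directory part of the path.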

Oracle Substring a column for last index of character

I am trying to take a substring in Oracle up to the last index of the character '_'.
I can't get it right. The query below takes the substring up to the second occurrence of "_", but it does not work if the string has only one occurrence of "_".
SELECT NVL(SUBSTR('TEMP_ABC', 0, INSTR('TEMP_ABC', '_',1,2)-1), 'TEMP_ABC')
FROM DUAL
Result - TEMP_ABC
Expected Result - TEMP
SELECT NVL(SUBSTR('TEMP_ABC_XYZ', 0, INSTR('TEMP_ABC_XYZ', '_',1,2)-1), 'TEMP_ABC_XYZ')
FROM DUAL
Result - TEMP_ABC
Expected Result - TEMP_ABC
Any clue on what I am doing wrong here?

In the INSTR function, a negative position argument (the third parameter) makes the search run backward from the end of the string, so -1 finds the last occurrence of the search string.
instr(string, '_', -1) = last occurrence of _
Thus:
select substr('TEMP_ABC',1,instr('TEMP_ABC','_',-1)-1)
from dual;
Result: TEMP
select substr('TEMP_ABC_XYZ',1,instr('TEMP_ABC_XYZ','_',-1)-1)
from dual;
Result: TEMP_ABC
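For comparison, the same "everything before the last underscore" rule can be sketched in Python, where rfind() searches from the right much like INSTR with position -1. The function name is hypothetical; returning None when there is no delimiter mirrors the NULL that the Oracle expression produces in that case.

```python
def before_last(value, delimiter="_"):
    """Return the text before the last delimiter, or None if it never occurs."""
    pos = value.rfind(delimiter)   # -1 when the delimiter is absent
    return value[:pos] if pos != -1 else None


print(before_last("TEMP_ABC"))      # TEMP
print(before_last("TEMP_ABC_XYZ"))  # TEMP_ABC
```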

Oracle - Get all characters before the nth occurrence of a character

I am trying to write a query that returns all characters of a string before the nth occurrence of a character.
Say I could have the following strings:
'123456,123456,123456'
'123456'
'123456,123456,123456,123456,123456,123456'
'123456,123456,123456,123456,123456,123456,123456'
Now I want my query to always return everything before the 5th occurrence of the comma.
Result:
'123456,123456,123456'
'123456'
'123456,123456,123456,123456,123456'
'123456,123456,123456,123456,123456'
I've been trying with some substr or regexes, but I can't get my head around this.

The INSTR function has exactly what you need to find the position of the nth occurrence of a substring: see its occurrence parameter.
To get the part of the string up to that position, use SUBSTR.
To handle the case where there is no Nth occurrence (INSTR returns 0, so SUBSTR returns NULL), use NVL (or COALESCE).
For example (replace 5 with N and substitute your own column and table names):
SELECT NVL(
SUBSTR(YOUR_COLUMN, 1,
INSTR(YOUR_COLUMN,',',1,5) -1),
YOUR_COLUMN)
FROM YOUR_TABLE;
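The expected results from the question can be checked with a short Python sketch of the same rule. It uses split/join instead of position arithmetic, which is a different technique from the INSTR query but gives the same "keep at most the first n items" behaviour; the function name is hypothetical.

```python
def before_nth(value, delimiter, n):
    """Return everything before the nth delimiter, or the whole value if fewer exist."""
    parts = value.split(delimiter)
    return delimiter.join(parts[:n])   # slicing past the end is harmless


samples = [
    "123456,123456,123456",
    "123456",
    "123456,123456,123456,123456,123456,123456",
    "123456,123456,123456,123456,123456,123456,123456",
]
for sample in samples:
    print(before_nth(sample, ",", 5))
```

The first two values are returned unchanged, and the last two are cut down to five items.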

You can also do it by counting the delimiters first:
define string_value='123456,123456';
select CASE
WHEN (length('&string_value') - length(replace('&string_value',',',null))) >=5
THEN SUBSTR('&string_value',0,INSTR('&string_value',',',1,5)-1)
ELSE '&string_value'
END as output
from dual;
output:
123456,123456
define string_value='123456,123456,123456,123456,123456,123456';
select CASE
WHEN (length('&string_value') - length(replace('&string_value',',',null))) >=5
THEN SUBSTR('&string_value',0,INSTR('&string_value',',',1,5)-1)
ELSE '&string_value'
END as output
from dual;
output:
123456,123456,123456,123456,123456
This will work even if the number of characters between the commas is not always the same.

How to use regexp_substr() with group of delimiter characters?

I have a string like 'SERO02~~~NA_#ERO5'. I need to split it on the delimiter ~~~, so that I get SERO02 and NA_#ERO5 as the result.
I created a regular expression like this:
select regexp_substr('SERO02~~~NA_#ERO5' ,'[^~~~]+',1,2) from dual;
It works fine and returns NA_#ERO5.
But if I change the string to ERO02~NA_#ERO5, the result is still the same.
I expect the expression to return nothing, since the delimiter ~~~ is not found in that string. Can someone help me create the correct expression?

[^~~~] matches a single character that is not one of the characters listed after the caret inside the square brackets. Since all of those characters are identical, [^~~~] is the same as [^~].
You can match it using:
SELECT REGEXP_SUBSTR(
'SERO02~~~NA_#ERO5',
'~~~(.*?)(~~~|$)',
1,
1,
NULL,
1
)
FROM DUAL;
Which will match ~~~ then store zero-or-more characters in a capture group (the round brackets () indicates a capture group) until it finds either ~~~ or the end-of-string. It will then return the first capture group.
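The difference between the character class and the literal delimiter can be seen in a small Python sketch (Python's re module, not Oracle; the function name is hypothetical). The class is written as [^~] here, since it is equivalent to [^~~~].

```python
import re


def after_delimiter(value):
    """Return the text after the first literal '~~~', up to the next '~~~' or the end."""
    match = re.search(r"~~~(.*?)(~~~|$)", value)
    return match.group(1) if match else None


print(after_delimiter("SERO02~~~NA_#ERO5"))   # NA_#ERO5
print(after_delimiter("ERO02~NA_#ERO5"))      # None

# The character class splits on every single '~', which is why the
# original expression still returned a second element for one tilde.
print(re.findall(r"[^~]+", "ERO02~NA_#ERO5"))  # ['ERO02', 'NA_#ERO5']
```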

You can do it without regular expressions, with a bit of logic:
with test(text) as ( select 'SERO02~~~NA_#ERO5' from dual)
select case
when instr(text, '~~~') != 0 then
substr(text, instr(text, '~~~') + 3)
else
null
end
from test
This returns the part of the string after '~~~' if it exists, and null otherwise.
You can edit the ELSE part to return whatever you need when the input string does not contain '~~~'.
Even with regexp, to match the string '~~~' you need to write it literally, without []. The [] lists a set of characters, so [aaaaa] is exactly the same as [a], while [abc] means 'a' OR 'b' OR 'c'.
With regexp, even though it is not necessary, one way could be the following:
substr(regexp_substr(text, '~~~.*'), 4)

If you want all the elements, this handles NULL elements too:
SQL> with tbl(str) as (
select 'SERO02~~~NA_#ERO5' from dual
)
select regexp_substr(str, '(.*?)(~~~|$)', 1, level, null, 1) element
from tbl
connect by level <= regexp_count(str, '~~~') + 1;
ELEMENT
-----------------
SERO02
NA_#ERO5
SQL>
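As a rough cross-check of the expected rows, the same split can be sketched in Python. This uses str.split rather than the CONNECT BY technique, and maps empty elements to None to stand in for Oracle's NULL; the function name and the second sample string are assumptions for illustration.

```python
def split_elements(value, delimiter="~~~"):
    """Split on a multi-character delimiter, reporting empty elements as None."""
    return [part if part else None for part in value.split(delimiter)]


print(split_elements("SERO02~~~NA_#ERO5"))  # ['SERO02', 'NA_#ERO5']
print(split_elements("A~~~~~~B"))           # ['A', None, 'B']
```

The number of elements is always the number of delimiters plus one, which matches the regexp_count(str, '~~~') + 1 bound in the query.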

How to replace character in SQL

I want to replace the character at position 4 in SQL Server.
I know about REPLACE and CASE WHEN, but my problem is that I want to replace only the character at the 4th position.
I am trying something like
SELECT REPLACE(_NAME,0,1) AS exp FROM _EMPLOYEE
but it does not check the 4th character.
For example, if _name contains IMR002001, then it should become IMR012001.

Use stuff():
select stuff(_NAME, 4, 1, '#')
This replaces the substring starting at position 4 with length 1 with the string that is the fourth argument. The string can be longer or shorter than the string being replaced.
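For your example (note that STUFF positions are 1-based: position 4 of IMR002001 is the first 0, which gives IMR102001; to get IMR012001 as written in the question, use position 5 instead):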
select stuff(_NAME, 4, 1, '1')
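To see which character a given position touches, here is a minimal Python emulation of the delete-then-insert behaviour described above. It is only a sketch for valid, in-range arguments; the function is hypothetical and not part of SQL Server.

```python
def stuff(value, start, length, replacement):
    """Emulate STUFF for in-range arguments: 1-based start, delete length chars, insert replacement."""
    i = start - 1   # convert the 1-based SQL position to a 0-based index
    return value[:i] + replacement + value[i + length:]


print(stuff("IMR002001", 4, 1, "1"))  # IMR102001
print(stuff("IMR002001", 5, 1, "1"))  # IMR012001
print(stuff("IMR002001", 4, 1, "#"))  # IMR#02001
```

Position 4 changes the first digit after IMR, while the IMR012001 value given in the question corresponds to position 5.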
