Get the words after specific character PLSQL - sql

I want to get the words after a specific character in PL/SQL.
For example:
text = '2 - 99 - 7051B'
I want to get 7051B, that is, everything after the second '-' up to the last character.
function try ( text in varchar2 ) return varchar2
is
v_textout varchar2(100) := '';
begin
--some process
return v_textout;
end;

There is no need for PL/SQL to extract the desired part. Two alternative methods use the REGEXP_REPLACE() and REGEXP_SUBSTR() regular expression functions, respectively:
WITH t(text) AS
(
SELECT '2 - 99 - 7051B' FROM dual
)
SELECT REGEXP_REPLACE(text,'(.*- )(\S+)','\2') AS first_method,
REGEXP_SUBSTR(text,'[^- ]+$') AS second_method
FROM t;
FIRST_METHOD SECOND_METHOD
------------ -------------
7051B 7051B
Here the spaces after the dash characters are kept deliberately, matching the sample, and the plus sign (+) stands for one or more occurrences of the preceding match.

I found the solution:
select substr(text , instr(text , '-', 1, 2) + 1, length(text))
from dual;
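For completeness, the SUBSTR/INSTR approach can be wrapped in a stored function along the lines of the skeleton in the question. This is only a sketch: the function name text_after_second_dash and the parameter name p_text are made up for illustration, and it assumes that returning NULL is acceptable when the text has fewer than two dashes. TRIM is added because the plain SUBSTR result keeps the space that follows the dash.

```sql
-- Hypothetical function: name and parameter are illustrative only.
CREATE OR REPLACE FUNCTION text_after_second_dash (p_text IN VARCHAR2)
  RETURN VARCHAR2
IS
  v_pos PLS_INTEGER;
BEGIN
  -- Position of the second '-' (0 when there are fewer than two dashes).
  v_pos := INSTR(p_text, '-', 1, 2);

  -- Assumption: NULL is an acceptable result when no second dash exists.
  IF NVL(v_pos, 0) = 0 THEN
    RETURN NULL;
  END IF;

  -- Everything after that dash, with leading/trailing spaces removed.
  RETURN TRIM(SUBSTR(p_text, v_pos + 1));
END text_after_second_dash;
/
```

With the sample value, SELECT text_after_second_dash('2 - 99 - 7051B') FROM dual; should return 7051B.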

Related

How to get first string after character Oracle SQL

I'm trying to get the first string after a character.
For example, given
ABCDEF||GHJ||WERT
I need only
GHJ
I tried to use REGEXP but I couldn't do it.
Can anyone help me, please?
Thank you
Somewhat simpler:
SQL> select regexp_substr('ABCDEF||GHJ||WERT', '\w+', 1, 2) result from dual;
RES
---
GHJ
SQL>
which reads as: give me the 2nd word out of that string. Won't work properly if GHJ consists of several words (but that's not what your example suggests).
I interpret this as splitting on a separator, in this case || or |. The example uses Oracle Database.
-- pattern --> [^|]+ matches one or more characters that are not a vertical bar; [^||] is the same character set as [^|]
-- 3rd parameter --> starting position
-- 4th parameter --> nth occurrence
WITH tbl(str) AS
(SELECT 'ABCDEF||GHJ||WERT' str FROM dual)
SELECT regexp_substr(str
,'[^||]+'
,1
,2) output
FROM tbl;
I think the most general solution is:
WITH tbl(str) AS (
SELECT 'ABCDEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABC|DEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABClDEF||GHJ||WERT' str FROM dual
)
SELECT regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1')
FROM tbl;
Note that this works even if the individual elements contain punctuation or a single vertical bar, which the other solutions do not handle.
Presumably, the double vertical bar is being used for maximum flexibility.
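The same element can also be picked out without regular expressions. The following is a sketch rather than a drop-in replacement: it assumes the delimiter is exactly two vertical bars and that the wanted element sits between the first and second delimiter. The column alias second_element is made up for illustration.

```sql
WITH tbl(str) AS (
  SELECT 'ABCDEF||GHJ||WERT'  FROM dual UNION ALL
  SELECT 'ABC|DEF||GHJ||WERT' FROM dual
)
SELECT str,
       -- Start right after the first '||' and stop just before the second one.
       SUBSTR(str,
              INSTR(str, '||') + 2,
              INSTR(str, '||', 1, 2) - INSTR(str, '||') - 2) AS second_element
FROM tbl;
```

Both sample rows should give GHJ. When the string holds fewer than two delimiters the computed length is less than 1, and SUBSTR then returns NULL.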
You should use regexp_substr function
select regexp_substr('ABCDEF||GHJ||WERT ', '\|{2}([^|]+)', 1, 1, 'i', 1) str
from dual;
STR
---
GHJ

How to extract the number from a string using Oracle?

I have a string as follows: first, last (123456). The expected result should be 123456. Could someone tell me in which direction I should proceed using Oracle?
It will depend on the actual pattern you care about (I assume "first" and "last" aren't literal hard-coded strings), but you will probably want to use regexp_substr.
For example, this matches anything between two brackets (which will work for your example), but you might need more sophisticated criteria if your actual examples have multiple brackets or something.
SELECT regexp_substr(COLUMN_NAME, '\(([^\)]*)\)', 1, 1, 'i', 1)
FROM TABLE_NAME
Your question is ambiguous and needs clarification. Based on your comment it appears you want to select the six digits after the left bracket. You can use the Oracle instr function to find the position of a character in a string, and then feed that into the substr to select your text.
select substr(mycol, instr(mycol, '(') + 1, 6) from mytable
Or if there are a varying number of digits between the brackets:
select substr(mycol, instr(mycol, '(') + 1, instr(mycol, ')') - instr(mycol, '(') - 1) from mytable
Find the last ( and get the sub-string after without the trailing ) and convert that to a number:
Oracle 11g R2 Schema Setup:
CREATE TABLE test ( str ) AS
SELECT 'first, last (123456)' FROM DUAL UNION ALL
SELECT 'john, doe (jr) (987654321)' FROM DUAL;
Query 1:
SELECT TO_NUMBER(
TRIM(
TRAILING ')' FROM
SUBSTR(
str,
INSTR( str, '(', -1 ) + 1
)
)
) AS value
FROM test
Results:
| VALUE |
|-----------|
| 123456 |
| 987654321 |
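A regular-expression variant of the same idea is sketched below. It assumes that the bracketed number is the very last thing in the string and consists of digits only, and it relies on the subexpression argument of REGEXP_SUBSTR, which is available from Oracle 11g onwards.

```sql
WITH test(str) AS (
  SELECT 'first, last (123456)'       FROM dual UNION ALL
  SELECT 'john, doe (jr) (987654321)' FROM dual
)
SELECT TO_NUMBER(
         -- Capture the digits inside the brackets that end the string.
         REGEXP_SUBSTR(str, '\((\d+)\)$', 1, 1, NULL, 1)
       ) AS value
FROM test;
```

For the two sample rows this should return 123456 and 987654321; a string without a trailing bracketed number gives NULL.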

Query to remove all non-digit but only keep last period/dot

I'm struggling to design a regular expression that converts a varchar2 field value to a number: it should remove all non-digit characters and keep only the last period in the string, so that
"about 1,000.00" returns 1000.00 or 1000
"3,000,000.000" returns 3000000.000 or 3000000
"3.000.000.000" returns 3000000.000 or 3000000
"a^*3^%*(C4.5d*9" returns 34.59
Any method is fine as long as it turns the string into one that to_number() can convert accurately.
I use
SELECT REGEXP_REPLACE(field_value, '[^0-9\.]+', '') from dual;
but it can't handle the 3rd case.
Because regular expressions in Oracle are somewhat limited, I don't think it's possible using only regexp_replace. You could use a workaround like this:
SELECT
CASE
WHEN last_dot < 2 THEN digits_and_dots
ELSE REPLACE(SUBSTR(digits_and_dots, 1, last_dot - 1), '.') ||
SUBSTR(digits_and_dots, last_dot)
END
FROM (
SELECT
INSTR(digits_and_dots, '.', -1) last_dot,
digits_and_dots
FROM (
SELECT
REGEXP_REPLACE(field_value, '[^0-9\.]+', '') digits_and_dots
FROM DUAL
) t
) o
Here's a way to do it, assuming there is one decimal character. The value you are working with is a string so I think of the decimal that we want to keep as a separator of the string and split it into 2 parts based on that. The first part is all characters leading up to but not including the last decimal, the second part is the last decimal and all characters after it. Then apply the replace, getting rid of everything that is not a number from the first part, and everything that is not a number or a decimal from the second part, then concatenate them together. Needs more testing with varied inputs but you get the idea. All these regular expressions are kind of expensive though so I doubt this will be the fastest solution.
with tbl(str) as (
select 'about 1,000.00' from dual union
select '3,000,000.000' from dual union
select '3.000.000.000' from dual union
select 'a^*3^%*(C4.5d*9' from dual
)
select str original,
regexp_replace(regexp_substr(str, '^(.*)\.', 1, 1, NULL, 1), '[^0-9]+', '') ||
regexp_replace(regexp_substr(str, '.*(\..*)$', 1, 1, NULL, 1), '[^0-9\.]+', '') Converted
from tbl;
SQL> /
ORIGINAL CONVERTED
--------------- ---------------
3,000,000.000 3000000.000
3.000.000.000 3000000.000
a^*3^%*(C4.5d*9 34.59
about 1,000.00 1000.00
SQL>
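The shortest way, if the number appears as a single contiguous digits.digits run, is:
select regexp_substr('a^*3^%*(C4.5d*9s','\d+\.\d+') from dual;
which returns 4.5 for this input, or, to strip every character that is not a digit or a dot (34.59 here, but every dot is kept):
select regexp_replace('a^*3^%*(C4.5d*9s', '[^0.0-9]', '') from dual;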

How to use regexp_substr() with group of delimiter characters?

I have a string like 'SERO02~~~NA_#ERO5'. I need to split it using the delimiter ~~~, so that I get SERO02 and NA_#ERO5 as the result.
I created a regular expression like this:
select regexp_substr('SERO02~~~NA_#ERO5' ,'[^~~~]+',1,2) from dual;
It worked fine and returned: NA_#ERO5
But if I change the string to ERO02~NA_#ERO5, the result is still the same.
I expect the expression to return nothing, since the delimiter ~~~ is not found in that string. Can someone help me create the correct expression?
[^~~~] matches a single character that is not one of the characters following the caret in the square brackets. Since all those characters are identical then [^~~~] is the same as [^~].
You can match it using:
SELECT REGEXP_SUBSTR(
'SERO02~~~NA_#ERO5',
'~~~(.*?)(~~~|$)',
1,
1,
NULL,
1
)
FROM DUAL;
Which will match ~~~ then store zero-or-more characters in a capture group (the round brackets () indicates a capture group) until it finds either ~~~ or the end-of-string. It will then return the first capture group.
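To see the difference side by side, the two patterns can be run against both strings from the question. This is a small illustrative query; the column aliases char_class and delimiter_aware are made up.

```sql
WITH tbl(str) AS (
  SELECT 'SERO02~~~NA_#ERO5' FROM dual UNION ALL
  SELECT 'ERO02~NA_#ERO5'    FROM dual
)
SELECT str,
       -- Character class: any single '~' acts as a separator.
       REGEXP_SUBSTR(str, '[^~]+', 1, 2)                    AS char_class,
       -- Literal '~~~' delimiter with a capture group.
       REGEXP_SUBSTR(str, '~~~(.*?)(~~~|$)', 1, 1, NULL, 1) AS delimiter_aware
FROM tbl;
```

The character-class column should show NA_#ERO5 for both rows, while the delimiter-aware column should show NA_#ERO5 only for the first row and NULL for the second.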
You can do it without regular expressions, with a bit of logic:
with test(text) as ( select 'SERO02~~~NA_#ERO5' from dual)
select case
when instr(text, '~~~') != 0 then
substr(text, instr(text, '~~~') + 3)
else
null
end
from test
This will give the part of the string after '~~~', if it exists, null otherwise.
You can edit the ELSE part to get what you need when the input string does not contain '~~~'.
Even with regexp, to match the string '~~~' you need to write it exactly, without []; the [] lists a set of characters, so [aaaaa] is exactly the same as [a], while [abc] means 'a' OR 'b' OR 'c'.
With regexp, even if not necessary, one way could be the following:
substr(regexp_substr(text, '~~~.*'), 4)
In case you want all elements. Handles NULL elements too:
SQL> with tbl(str) as (
select 'SERO02~~~NA_#ERO5' from dual
)
select regexp_substr(str, '(.*?)(~~~|$)', 1, level, null, 1) element
from tbl
connect by level <= regexp_count(str, '~~~') + 1;
ELEMENT
-----------------
SERO02
NA_#ERO5
SQL>

PLSQL show digits from end of the string

I have the following problem.
There is a string:
There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 12125235
I need to show only the last date from this string: 2015.06.07.
I tried regexp_substr with instr, but it doesn't work.
This is just a test; if I can solve it, I will use the same solution in a CLOB query where there are multiple dates and I need only the last one. I know regexp_count would help here, but the database I use is Oracle 10g, so it won't work.
Can somebody help me?
The key to solving this problem is the idea of reversing the words in the string, presented in this answer.
Here is a possible solution:
WITH words AS
(
SELECT regexp_substr(str, '[^[:space:]]+', 1, LEVEL) word,
rownum rn
FROM (SELECT 'There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 2015.06.08 2015.06.17. 2015.07.01. 12345678999 12125235' str
FROM dual) tab
CONNECT BY LEVEL <= LENGTH(str) - LENGTH(REPLACE(str, ' ')) + 1
)
, words_reversed AS
(
SELECT *
FROM words
ORDER BY rn DESC
)
SELECT regexp_substr(word, '\d{4}\.\d{2}\.\d{2}', 1, 1)
FROM words_reversed
WHERE regexp_like(word, '\d{4}\.\d{2}\.\d{2}')
AND rownum = 1;
From the documentation on regexp_substr, I see one problem immediately:
The . (period) matches any character. You need to escape those with a backslash: \. in order to match only a period character.
For reference, I am linking this post which appears to be the approach you are taking with substr and instr.
Relevant documentation from Oracle:
INSTR(string , substring [, position [, occurrence]])
When position is negative, then INSTR counts and searches backward from the end of string. The default value of position is 1, which means that the function begins searching at the beginning of string.
The problem here is that your regular expression returns only a single value, as explained here, so in the case of multiple dates you will not be giving the instr function the match you need.
Now, because of this limitation, I recommend using the approach that was proposed in this question, namely reverse the entire string (and your regular expression, i.e. \d{2}\.\d{2}\.\d{4}) and then the first match will be the 'last match'. Then, perform another string reversal to get the original date format.
Maybe this isn't the best solution, but it should work.
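The reversal idea could look roughly like the sketch below. Note the assumptions: REVERSE is an undocumented Oracle SQL function, so it may not be suitable for production code, and it works on VARCHAR2 values rather than CLOBs; for a CLOB the text would first have to be cut into VARCHAR2-sized pieces. The \d shorthand also needs Oracle 10g Release 2 or later.

```sql
WITH tbl(str) AS (
  SELECT 'There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 12125235'
  FROM dual
)
SELECT REVERSE(
         -- In the reversed text the date reads DD.MM.YYYY digit by digit,
         -- so the first match here is the last date of the original string.
         REGEXP_SUBSTR(REVERSE(str), '\d{2}\.\d{2}\.\d{4}')
       ) AS last_date
FROM tbl;
```

For the sample string this should return 2015.06.07.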
There are three different PL/SQL functions that will get you there.
1. The INSTR function identifies where the first "period" in the date string appears.
2. SUBSTR is applied to the entire string, using the value from (1) as the start point.
3. TO_DATE with a specific date mask, YYYY.MM.DD, converts the result from (2) into an Oracle date time type.
To make this work in procedural code, the standard blocks apply:
DECLARE
v_position pls_integer;
-- ... other variables
BEGIN
-- SQL code and function calls go here
NULL;
END;
/
Oracle 11g R2 Schema Setup:
CREATE TABLE finddate
(column1 varchar2(11), column2 varchar2(39))
;
INSERT ALL
INTO finddate (column1, column2)
VALUES ('row1', '1234567 242424 2015.06.07. 12125235')
INTO finddate (column1, column2)
VALUES ('string2', '1234567 242424 2015.06.07. 12125235')
SELECT * FROM dual
;
Query 1:
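-- Step 1: position of the first period (20 for the sample row)
select instr(column2,'.',1) from finddate
where column1 = 'string2';
-- Step 2: the date starts 4 characters before that period and is 10 characters long
select substr(column2,(20-4),10) from finddate;
-- Step 3: convert the extracted text to a date
select to_date('2015.06.07','YYYY.MM.DD') from finddate;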
Results:
| TO_DATE('2015.06.07','YYYY.MM.DD') |
|------------------------------------|
| June, 07 2015 00:00:00 |
| June, 07 2015 00:00:00 |
Here's a way using regexp_replace() that should work with 10g, assuming the format of the lines will be the same:
with tbl(col_string) as
(
select 'There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 12125235'
from dual
)
select regexp_replace(col_string, '^.*(\d{4}\.\d{2}\.\d{2})\. \d*$', '\1')
from tbl;
The regex can be read as:
^ - Match the start of the line
. - followed by any character
* - followed by 0 or more of the previous character (which is any character)
( - Start a remembered group
\d{4}\.\d{2}\.\d{2} - 4 digits followed by a literal period followed by 2 digits, etc
) - End the first remembered group
\. - followed by a literal period
(a space) - followed by a space
\d* - followed by any number of digits
$ - followed by the end of the line
regexp_replace then replaces all that with the first remembered group (\1).
Basically describe the whole line as a regular expression, group around what you want to return. You will most likely need to tweak the regex for the end of the line if it could be other characters than digits but this should give you an idea.
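One possible way to loosen the end of the pattern is sketched below: the tail is matched with .* instead of a period, a space and digits, so anything may follow the last date. The REGEXP_LIKE guard is an addition of this sketch, needed because REGEXP_REPLACE returns the input unchanged when the pattern does not match. It assumes single-line values; if the text can contain line breaks, the 'n' match parameter would be needed so that the period also matches newlines.

```sql
WITH tbl(col_string) AS (
  SELECT 'There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 12125235'
  FROM dual UNION ALL
  SELECT 'no date in this line' FROM dual
)
SELECT CASE
         -- Only rewrite rows that actually contain a date.
         WHEN REGEXP_LIKE(col_string, '\d{4}\.\d{2}\.\d{2}')
         -- The greedy leading .* pushes the group to the last date in the line.
         THEN REGEXP_REPLACE(col_string, '^.*(\d{4}\.\d{2}\.\d{2}).*$', '\1')
       END AS last_date
FROM tbl;
```

The first row should give 2015.06.07 and the second NULL.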
For the sake of argument this works too ONLY IF there are 2 occurrences of the date pattern:
with tbl(col_string) as
(
select 'There is something 2015.06.06. in the air 1234567 242424 2015.06.07. 12125235' from dual
)
select regexp_substr(col_string, '\d{4}\.\d{2}\.\d{2}', 1, 2)
from tbl;
returns the second occurrence of the pattern. I expect the above regexp_replace more accurately describes the solution.