How to get a specific substring from a column in sql - sql

I have an Orders table with a column called "details" that holds values such as:
Contact ID: A18YTX7GWEJRU8 City/Site and Site Name: Orlando - Orlando (UFL4) Date of Call (MM/DD/YYYY): 01/23/2017 Time of Call (Local Time): 16:44 Order ID(s): 112-0654231-9637802 Call Summary: Cx did not receive. Order marked as delivered to doorstep at 16:27 created by flx-cstech on behalf of sssmiley.
There are different cell values in that column. Also could be like:
Short Description: Dry Ice Retrieval Please enter the following information for the site ops to pick up the dry ice from the customer: Contact ID:AD3R60PA1QCCF Order ID:112-6254812-3186644
Or anything else.
I just want to extract the Order ID(s): 112-0654231-9637802 part from it. How do I do that?

SELECT REGEXP_SUBSTR(
your_column,
'Order\s+ID(\s*\(s\))?:\s*\d{3}-\d{7}-\d{7}'
)
FROM your_table
To just get the number you can wrap the number in a capture group:
SELECT REGEXP_SUBSTR(
your_column,
'Order\s+ID(\s*\(s\))?:\s*(\d{3}-\d{7}-\d{7})',
1, -- Start from the 1st character
1, -- Get the 1st match
NULL, -- Apply default flags
2 -- Get the 2nd capture group
)
FROM your_table
Or, if you do not have anything else with the same 3-digit, dash, 7-digit, dash, 7-digit format:
SELECT REGEXP_SUBSTR(
your_column,
'\d{3}-\d{7}-\d{7}'
)
FROM your_table
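To check the capture-group variant against both layouts shown in the question, here is a minimal sketch. The sample_orders CTE and its shortened strings are made up for illustration, and the sixth argument (the capture group number) needs Oracle 11g or later.

```sql
-- Hypothetical sample rows: shortened versions of the two layouts from the
-- question, plus one row that contains no order ID at all.
WITH sample_orders (details) AS (
  SELECT 'Time of Call (Local Time): 16:44 Order ID(s): 112-0654231-9637802 Call Summary: Cx did not receive.' FROM dual UNION ALL
  SELECT 'Contact ID:AD3R60PA1QCCF Order ID:112-6254812-3186644' FROM dual UNION ALL
  SELECT 'Or anything else' FROM dual
)
SELECT REGEXP_SUBSTR(
         details,
         'Order\s+ID(\s*\(s\))?:\s*(\d{3}-\d{7}-\d{7})',
         1,     -- start from the 1st character
         1,     -- take the 1st match
         NULL,  -- default match flags
         2      -- return the 2nd capture group (the number only)
       ) AS order_id
FROM sample_orders;
```

The first two rows should yield 112-0654231-9637802 and 112-6254812-3186644. The third row has no match, so REGEXP_SUBSTR returns NULL for it.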

Your string looks like a fixed-format string, so if every row has the same layout the simplest way would be (adjust the offset and length to your data):
select substr(details, 160, 31) from orders

https://docs.oracle.com/cd/B12037_01/appdev.101/b10795/adfns_re.htm
REGEXP_LIKE This function searches a character column for a pattern. Use this
function in the WHERE clause of a query to return rows matching the
regular expression you specify.
The same page covers the related functions REGEXP_SUBSTR, REGEXP_INSTR and REGEXP_REPLACE.
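As a rough illustration of the WHERE-clause usage described in that quote, the sketch below keeps only the rows that contain an order ID. It assumes the Orders table and details column from the question and reuses the pattern from the first answer.

```sql
-- Assumes a table named orders with a details column, as in the question.
-- Rows without an "Order ID: nnn-nnnnnnn-nnnnnnn" fragment are filtered out.
SELECT details
FROM orders
WHERE REGEXP_LIKE(details, 'Order\s+ID(\s*\(s\))?:\s*\d{3}-\d{7}-\d{7}');
```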

Related

Returning a specific substring from ORACLE SQL string using REGEXP_SUBSTR

I am having a difficult time trying to return a specific section of a string from a field (BSE.NOTES) using REGEXP_SUBSTR. For my query, I have a VARCHAR2 field with a specific text that I would like to return:
Hospital Dept Name Case: SP12-34567 Authorizing Provider: D. K, MD MSCR Collected: 07/09/2021 12:49 PM Ordering Location: Hospital Received: 07/09/2021 03:23 PM Pathologist: D. L., MD Specimens: A) - Body part 1 B) - Body part 2
From this text, I need to return the string "Case: SP-***" for each record. I tried using the following code, but only get NULL values:
REGEXP_SUBSTR(BSE.NOTES, '(\S|^)(Case\:\s)([0-9]\-\[0-9])', 1, 1, NULL) AS CASE_NUMB
I am not very well versed in using regexp_substr(), so any help is greatly appreciated!
Your pattern is looking for a single numeric digit either side of the -, and doesn't allow for the 'SP' part; it will also only match after the start of the line or a non-whitespace character, and your parentheses don't look right. And you haven't specified that you want a sub-pattern (read more in the docs).
This will get the value you said you wanted:
select REGEXP_SUBSTR(BSE.NOTES, '(\s|^)(Case:\sSP[0-9]+-[0-9]+)', 1, 1, null, 2) AS CASE_NUMB
from bse
CASE_NUMB
Case: SP12-34567
Or if you actually only want the second part of that you can add more sub-patterns:
select REGEXP_SUBSTR(BSE.NOTES, '(\s|^)(Case:\s(SP[0-9]+-[0-9]+))', 1, 1, null, 3) AS CASE_NUMB
from bse
CASE_NUMB
SP12-34567
Using regular expressions to extract data from fields like this is a bit error-prone though, particularly where it's essentially free-form text that could have odd capitalisation (you can pass 'i' as the fifth argument, the match parameter, to help with that), spacing (maybe make the second \s optional, as \s?), typos or other oddities. And you may not need the anchor/whitespace at the start, but it might not hurt, particularly if you do go case-insensitive.
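A minimal sketch of those two tweaks together, with the 'i' match parameter and an optional space after the colon. The second sample row is invented here to show a lower-case, no-space variant.

```sql
-- Hypothetical sample rows: the second one is a made-up variant with
-- different capitalisation and no space after "case:".
WITH bse (notes) AS (
  SELECT 'Hospital Dept Name Case: SP12-34567 Authorizing Provider' FROM dual UNION ALL
  SELECT 'hospital dept name case:sp12-34567 authorizing provider' FROM dual
)
SELECT REGEXP_SUBSTR(
         notes,
         '(\s|^)(Case:\s?(SP[0-9]+-[0-9]+))',
         1,    -- start position
         1,    -- first occurrence
         'i',  -- match parameter: case-insensitive
         3     -- third sub-pattern: the case number only
       ) AS case_numb
FROM bse;
```

This should return SP12-34567 for the first row and sp12-34567 for the second; the match keeps whatever capitalisation the source text has.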
Do you mean this?
WITH
bse(notes) AS (
SELECT 'Hospital Dept Name Case: SP12-34567 Authorizing Provider: D. K,'
||' MD MSCR Collected: 07/09/2021 12:49 PM Ordering Location:'
||' Hospital Received: 07/09/2021 03:23 PM Pathologist: D. L.'
||', MD Specimens: A) - Body part 1 B) - Body part 2'
FROM dual
)
SELECT
REGEXP_SUBSTR(
bse.notes -- the string
, 'Case:\s*SP\d\d-\d+' -- the pattern "Case:",
-- zero, one or more whitespace (\s*),
-- "SP", two digits (\d\d), a hyphen,
-- one or more digits (\d+)
) AS extr
FROM bse;
-- out extr
-- out ------------------
-- out Case: SP12-34567

How do I extract consonants from a string field?

How do I extract only the consonants from a field in records that contain names?
For example, if I had the following record in the People table:
Field  Value
Name   Richard
How could I extract only the consonants in "Richard" to get "R,c,r,d"?
If you mean "how can I remove all vowels from the input" so that 'Richard' becomes 'Rchrd', then you can use the translate function as Boneist has shown, but with a couple more subtle additions.
First, you can completely remove a character with translate, if it appears in the second argument and it doesn't have a corresponding "translate to" character in the third argument.
Second, alas, if the third (and last) argument to translate is null the function returns null (and the same if the last argument is the empty string; there is a very small number of instances where Oracle does not treat the empty string as null, but this is not one of them). So, to make the whole thing work, you need to add an extra character to both the second and the third argument - a character you do NOT want to remove. It may be anything (it doesn't even need to appear in the input string), just not one of the characters to remove. In the illustration below I use the period character (.) but you can use any other character - just not a vowel.
Pay attention too to upper vs lower case letters. Ending up with:
with
sample_inputs (name) as (
select 'Richard' from dual union all
select 'Aliosha' from dual union all
select 'Ai' from dual union all
select 'Ng' from dual
)
select name, translate(name, '.aeiouAEIOU', '.') as consonants
from sample_inputs
;
NAME CONSONANTS
------- ----------
Richard Rchrd
Aliosha lsh
Ai
Ng Ng
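If the comma-separated form from the question is what is actually wanted, one possible follow-up (a sketch, not part of the original answer) is to append a comma to every remaining character with regexp_replace and trim the trailing one. It assumes the names themselves never end in a comma.

```sql
WITH sample_inputs (name) AS (
  SELECT 'Richard' FROM dual UNION ALL
  SELECT 'Ng' FROM dual
)
SELECT name,
       RTRIM(
         -- strip the vowels first, then turn each remaining character X into "X,"
         REGEXP_REPLACE(TRANSLATE(name, '.aeiouAEIOU', '.'), '(.)', '\1,'),
         ','   -- drop the trailing comma
       ) AS consonants
FROM sample_inputs;
```

For 'Richard' this gives R,c,h,r,d and for 'Ng' it gives N,g.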
You should be able to string a few replace functions together:
select replace(replace(Value, 'A', ''), 'E', '') -- ...and so on for the other vowels, in both cases
from People
You can easily do this with the translate() function, e.g.:
WITH people AS (SELECT 'Name' field, 'Richard' val FROM dual UNION ALL
SELECT 'Name' field, 'Siobhan' val FROM dual)
SELECT field, val, TRANSLATE(val, 'aeiou', ',,,,,') updated_val
FROM people;
FIELD VAL UPDATED_VAL
----- ------- -----------
Name Richard R,ch,rd
Name Siobhan S,,bh,n
The translate function simply takes a list of characters and - based on the second list of characters, which defines the translation - translates the input string.
So in the above example, the a (first character in the first list) becomes a , (first character in the second list), the e (second character in the first list) becomes a , (second character in the second list), etc.
N.B. I really, really hope your key-value table is just a made-up example for the situation you're trying to solve, and not an actual production table; in general, key-value tables are a terrible idea in a relational database!

fixed number format with different lengths in Oracle

I need help with an Oracle query.
I have a query:
scenario 1: select to_char('1737388250',what format???) from dual;
expected output: 173,7388250
scenario 2: select to_char('173738825034',what format??) from dual;
expected output: 173,738825034
scenario 3: select to_char('17373882',what format??) from dual;
expected output: 173,73882
I need one query that satisfies all of the above scenarios.
Can someone help, please?
It is possible to get the desired result with a customized format model given to to_char; I show one example below. However, any solution along these lines is just a hack (a solution that should work correctly in all cases, but using features of the language in ways they weren't intended to be used).
Here is one example - this will work if your "inputs" are positive integers greater than 999 (that is: at least four digits).
with
sample_data (num) as (
select 1737388250 from dual union all
select 12338 from dual
)
select num, to_char(num, rpad('fm999G', length(num) + 3, '9')) as formatted
from sample_data
;
NUM FORMATTED
---------- ------------
1737388250 173,7388250
12338 123,38
This assumes comma is the "group separator" in nls_numeric_characters; if it isn't, that can be controlled with the third argument to to_char. Note that the format modifier fm is needed so that no space is prepended to the resulting string; and the +3 in the second argument to rpad accounts for the extra characters in the format model (f, m and G).
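To make the result independent of the session settings, the third argument mentioned above can be spelled out. This is a sketch that reuses the same sample data; in the nls_numeric_characters value the first character is the decimal marker and the second is the group separator.

```sql
WITH sample_data (num) AS (
  SELECT 1737388250 FROM dual UNION ALL
  SELECT 12338 FROM dual
)
SELECT num,
       TO_CHAR(num,
               RPAD('fm999G', LENGTH(num) + 3, '9'),
               -- decimal marker '.', group separator ','
               'NLS_NUMERIC_CHARACTERS = ''.,''') AS formatted
FROM sample_data;
```

The output should match the listing above (173,7388250 and 123,38) regardless of the session's NLS settings.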
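You can try a fixed format model, but note that the comma only lands after the third digit when the number has exactly as many digits as the model (14 here), so it does not cover inputs of varying length: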
select TO_CHAR(1737388250, '999,99999999999') from dual;
Take a look here
Your requirement is different, so you can use substr and concatenation as follows:
select substr(your_number,1,3)
|| case when your_number >= 1000 then ',' end
|| substr(your_number,4)
from your_table;
Your "number" is enclosed in single-quotes. This makes it a character string, albeit a string of only numeric characters. But a character string, nonetheless. So it makes no sense to pass a character string to TO_CHAR.
Everyone's suggestions are glossing over this and using an actual number; notice the lack of single quotes in their code.
You say you always want a comma after the first three "numbers" (characters), which makes no sense numerically. So just use SUBSTR and insert the comma:
select substr('123456789',1,3)||','||substr('123456789',4) from dual;
If the source data is actually a number, then pass it to to_char, and wrap that in substr:
select substr(to_char(123456789),1,3)||','||substr(to_char(123456789),4) from dual;
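For string inputs like the three scenarios in the question, the same idea can be written as a single regexp_replace, which also leaves values of three digits or fewer untouched. This is a sketch; the sample_data CTE is only there to make it self-contained.

```sql
WITH sample_data (str) AS (
  SELECT '1737388250' FROM dual UNION ALL
  SELECT '173738825034' FROM dual UNION ALL
  SELECT '17373882' FROM dual UNION ALL
  SELECT '12' FROM dual
)
SELECT str,
       -- group 1: the first three digits; group 2: everything after them
       REGEXP_REPLACE(str, '^(\d{3})(\d+)$', '\1,\2') AS formatted
FROM sample_data;
```

This should give 173,7388250, 173,738825034 and 173,73882 for the three scenarios, and return '12' unchanged because the pattern does not match it.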

How to extract multiple dates from varchar2(4000) multiline string using sql?

I have two columns ID (NUMBER), DESCRIPTION (VARCHAR2(4000)) in original table
DESCRIPTION column has multi line strings.
I need to extract dates from each line of the string and also need to find earliest date. so the result would look like in expected result table.
Original result:
Expected Table:
Using this query:
to_date((regexp_substr(A.Description , '\d{1,2}/\d{1,2}/\d{4}')), 'MM-DD-YYYY')
I was able to extract date from the first line
Discontinued:09/10/2015:Rappaport Family Institute for Research:;
only, but not from the other two.
OK, I think I found a solution similar to the other post, but simpler. FYI, the regexp_substr() function only returns one match per call. Here is an example using a string with embedded line feeds (they do not really matter, but are included to show that the approach works in this case):
WITH A AS
(SELECT 'this is a test:12/01/2015 01/05/2018'
|| chr(13)
||chr(10)
|| ' this is the 2nd line: 07/07/2017' Description
FROM dual
)
SELECT to_date(regexp_substr(A.Description , '\d{1,2}/\d{1,2}/\d{4}',1,level),'MM/DD/YYYY')
FROM A
CONNECT BY level <= regexp_count(a.description, '\d{1,2}/\d{1,2}/\d{4}')
Output:
12/01/2015
01/05/2018
07/07/2017
If you are not familiar with hierarchical queries in Oracle, "level" is a pseudo-column. By using it as the 4th parameter (occurrence) of the regexp_substr function, each "level" returns the next occurrence of the pattern, searching on from the end of the previous match. regexp_count counts the number of times the pattern is matched, so we keep parsing the string, moving over one occurrence at a time, until the maximum number of matches is reached.
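The question also asks for the earliest date per row. One way to extend the approach to a table with several rows is sketched below: the extra CONNECT BY conditions keep each ID's occurrences separate, and MIN picks the earliest date. The sample rows are invented for illustration, and the column names follow the question.

```sql
-- Hypothetical sample data using the ID and DESCRIPTION columns from the question.
WITH a (id, description) AS (
  SELECT 1, 'Discontinued:09/10/2015:x' || CHR(10) || 'Renewed:01/05/2014:y' FROM dual UNION ALL
  SELECT 2, 'Opened:07/07/2017' FROM dual
)
SELECT id,
       MIN(TO_DATE(
             REGEXP_SUBSTR(description, '\d{1,2}/\d{1,2}/\d{4}', 1, LEVEL),
             'MM/DD/YYYY')) AS earliest_date
FROM a
CONNECT BY LEVEL <= REGEXP_COUNT(description, '\d{1,2}/\d{1,2}/\d{4}')
       AND PRIOR id = id                  -- stay within the same row
       AND PRIOR SYS_GUID() IS NOT NULL   -- avoids the connect-by loop error
GROUP BY id;
```

With this data, ID 1 should come back with 5 January 2014 and ID 2 with 7 July 2017.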

How to extract group from regular expression in Oracle?

I got this query and want to extract the value between the brackets.
select de_desc, regexp_substr(de_desc, '\[(.+)\]', 1)
from DATABASE
where col_name like '[%]';
It however gives me the value with the brackets such as "[TEST]". I just want "TEST". How do I modify the query to get it?
The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.
In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;
Where the last parameter, 1, indicates the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.
10g does not appear to have this option, but in your case you can achieve the same result with:
select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);
since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)
You need to do a replace and use a regex pattern that matches the whole string.
select regexp_replace(de_desc, '.*\[(.+)\].*', '\1') from DATABASE;
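One difference between the two approaches shows up on rows that contain no brackets at all: regexp_substr returns NULL, while regexp_replace returns the input unchanged because nothing was matched. A small sketch with made-up sample values:

```sql
WITH t (de_desc) AS (
  SELECT 'abc[def]ghi' FROM dual UNION ALL
  SELECT 'no brackets here' FROM dual
)
SELECT de_desc,
       REGEXP_SUBSTR(de_desc, '\[(.+)\]', 1, 1, NULL, 1) AS via_substr,  -- 11g and later
       REGEXP_REPLACE(de_desc, '.*\[(.+)\].*', '\1')     AS via_replace
FROM t;
```

Both columns should give def for the first row. For the second row via_substr is NULL and via_replace is the original string, which is worth keeping in mind if the WHERE clause does not already exclude such rows.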