Pull values from a string

I have a column from which I need to pull the R- value.
What would be the best way to do this?
I feel like I'm close with this, but I'm having trouble getting exactly what I need.
Basically, I need the R- value, which can vary in length, up to the Control Lab label.
See the example below.
select SUBSTR( err_desc, 57, INSTR (err_desc, 'Control Lab:', 1, 1)-1)
from error_log
where sql_err_text = 'EXCEEDED VARIANCE LIMIT'
and year = '2022';
Example would be:
Plant: 649 Order: 2HC2204018 Year: 2022 Cycle: 01 Raw: R-66-59-18 Control Lab: WH Variance Warning: 50 Variance Limit: 100
Required output: R-66-59-18
Plant: 650 Order: 9GM2202004 Year: 2022 Cycle: 03 Raw: R-401059 Control Lab: GR
Required output: R-401059
Tried SQL above. Was getting more characters than expected.

I would use REGEXP_SUBSTR() with a capture group:
SELECT err_desc, REGEXP_SUBSTR(err_desc, 'Raw: (R(-[0-9]+)+)', 1, 1, NULL, 1) AS output
FROM error_log
WHERE sql_err_text = 'EXCEEDED VARIANCE LIMIT' AND year = '2022';
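A note on why the original attempt returned too many characters: the third argument of SUBSTR is a length, not an end position, so passing the position of 'Control Lab:' asks for more characters than the value contains. The sketch below runs the REGEXP_SUBSTR expression against the two sample rows from the question. The inline error_log CTE only stands in for the real table, and the pattern assumes the value is always R followed by one or more groups of a hyphen and digits.

```sql
-- Sample rows copied from the question; this CTE stands in for the real error_log table.
WITH error_log (err_desc) AS (
  SELECT 'Plant: 649 Order: 2HC2204018 Year: 2022 Cycle: 01 Raw: R-66-59-18 Control Lab: WH Variance Warning: 50 Variance Limit: 100' FROM dual UNION ALL
  SELECT 'Plant: 650 Order: 9GM2202004 Year: 2022 Cycle: 03 Raw: R-401059 Control Lab: GR' FROM dual
)
-- The sixth argument (1) returns only the first capture group, i.e. the part after 'Raw: '.
SELECT REGEXP_SUBSTR(err_desc, 'Raw: (R(-[0-9]+)+)', 1, 1, NULL, 1) AS output
FROM   error_log;
-- For the two sample rows this yields R-66-59-18 and R-401059.
```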

If you just want the value from the first key-value pair where the key ends in Raw then you can use:
SELECT id,
CASE
WHEN spos = 0 THEN NULL
WHEN epos = 0 THEN SUBSTR(value, spos + 6)
ELSE SUBSTR(value, spos + 6, epos - spos - 6)
END AS raw_value
FROM (
SELECT id,
' ' || value AS value,
INSTR(' ' || value, ' Raw: ') AS spos,
INSTR(' ' || value, ' ', INSTR(' ' || value, ' Raw: ') + 6) AS epos
FROM table_name
)
or, less to type but regular expressions are much slower to execute:
SELECT id,
REGEXP_SUBSTR(value, '(^| )Raw: (\S+)', 1, 1, NULL, 2) AS raw_value
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( id, value ) AS
SELECT 1, 'Plant: 649 Order: 2HC2204018 Year: 2022 Cycle: 01 Raw: R-66-59-18 Control Lab: WH Variance Warning: 50 Variance Limit: 100' FROM DUAL UNION ALL
SELECT 2, 'Plant: 650 Order: 9GM2202004 Year: 2022 Cycle: 03 Raw: R-401059 Control Lab: GR' FROM DUAL UNION ALL
SELECT 3, 'Plant: 651 Not Raw: NOT-THIS Raw: XYZ-123' FROM DUAL;
Both output:
ID RAW_VALUE
-- ----------
 1 R-66-59-18
 2 R-401059
 3 NOT-THIS
Note: If you have multiple key-value pairs which end in Raw then this (and the answers by other people) will get confused as to which key you want and return the first one that matches Raw.
If you want all the key-value pairs (assuming you use : to delimit between the key-value pairs and each value ends after the next space character) then you can use:
WITH bounds (id, value, spos, delim_pos, epos) AS (
SELECT id,
value,
1,
INSTR(value, ': ', 1),
INSTR(value, ' ', INSTR(value, ': ', 1) + 2)
FROM table_name
UNION ALL
SELECT id,
value,
epos + 1,
INSTR(value, ': ', epos + 1),
INSTR(value, ' ', INSTR(value, ': ', epos + 1) + 2)
FROM bounds
WHERE epos > 0
) SEARCH DEPTH FIRST BY id SET order_id
SELECT id,
TRIM(SUBSTR(value, spos, delim_pos - spos)) AS key,
CASE epos
WHEN 0
THEN SUBSTR(value, delim_pos + 2)
ELSE SUBSTR(value, delim_pos + 2, epos - delim_pos - 2)
END AS value
FROM bounds;
Which, for the sample data, outputs:
ID KEY              VALUE
-- ---------------- ----------
 1 Plant            649
 1 Order            2HC2204018
 1 Year             2022
 1 Cycle            01
 1 Raw              R-66-59-18
 1 Control Lab      WH
 1 Variance Warning 50
 1 Variance Limit   100
 2 Plant            650
 2 Order            9GM2202004
 2 Year             2022
 2 Cycle            03
 2 Raw              R-401059
 2 Control Lab      GR
 3 Plant            651
 3 Not Raw          NOT-THIS
 3 Raw              XYZ-123
If you only want the Raw values after getting all the keys then you can use a filter:
WITH bounds (id, value, spos, delim_pos, epos) AS (
SELECT id,
value,
1,
INSTR(value, ': ', 1),
INSTR(value, ' ', INSTR(value, ': ', 1) + 2)
FROM table_name
UNION ALL
SELECT id,
value,
epos + 1,
INSTR(value, ': ', epos + 1),
INSTR(value, ' ', INSTR(value, ': ', epos + 1) + 2)
FROM bounds
WHERE epos > 0
) SEARCH DEPTH FIRST BY id SET order_id,
key_value_pairs (id, key, value) AS (
SELECT id,
TRIM(SUBSTR(value, spos, delim_pos - spos)),
CASE epos
WHEN 0
THEN SUBSTR(value, delim_pos + 2)
ELSE SUBSTR(value, delim_pos + 2, epos - delim_pos - 2)
END
FROM bounds
)
SELECT *
FROM key_value_pairs
WHERE key = 'Raw';
Which outputs:
ID KEY VALUE
-- --- ----------
 1 Raw R-66-59-18
 2 Raw R-401059
 3 Raw XYZ-123
Which will get the correct key if there are multiple ending in Raw.
fiddle

Option with the substr + instr combination: find where Raw starts, then subtract the Raw position from the Control position to get the substring length. +5 (as well as -5) accounts for "Raw", the colon and the space; the final -1 drops the space that precedes "Control".
with error_log (err_desc) as
(select 'Plant: 649 Order: 2HC2204018 Year: 2022 Cycle: 01 Raw: R-66-59-18 Control Lab: WH Variance Warning: 50 Variance Limit: 100' from dual union all
select 'Plant: 650 Order: 9GM2202004 Year: 2022 Cycle: 03 Raw: R-401059 Control Lab: GR' from dual
)
select substr(err_desc, instr(err_desc, 'Raw') + 5,
instr(err_desc, 'Control') - instr(err_desc, 'Raw') - 5 - 1
) result
from error_log;
RESULT
--------------------------------------------------------------------------------
R-66-59-18
R-401059

Related

Regex values after special character with empty values

I am struggling with a regex to split a string into columns in an Oracle database.
select (REGEXP_SUBSTR(replace('1:::9999', ' ',''), '[^: ]+', 1, 4)) from dual;
I need to obtain the 4th value from that string as a column value. Sometimes the values at positions 2 and 3 are empty, and then my query doesn't work. I am trying to figure out what regex will work.

You can use
select (REGEXP_SUBSTR(replace('1:::9999', ' ',''), '([^: ]*)(:|$)', 1, 4, 'i', 1)) from dual;
Here, the ([^: ]*)(:|$) matches
([^: ]*) - Group 1: any zero or more chars other than : and space
(:|$) - Group 2, either : or end of string.
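To see why the original pattern fails, note that '[^: ]+' requires at least one character, so empty fields are skipped and never counted as occurrences. A small side-by-side sketch on the question's sample string (the column aliases are just illustrative names):

```sql
SELECT
  -- '+' needs at least one character, so '1:::9999' has only two occurrences; the 4th is NULL.
  REGEXP_SUBSTR('1:::9999', '[^: ]+', 1, 4) AS plus_pattern,
  -- '*' allows empty fields, each ended by ':' or end of string, so the 4th occurrence is 9999.
  REGEXP_SUBSTR('1:::9999', '([^: ]*)(:|$)', 1, 4, NULL, 1) AS star_pattern
FROM dual;
```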

You do not need a (slower) regex for this task, use simple substr/instr functions:
with input_(val) as (
select '1:::9999' from dual
union all
select '1:2::' from dual
union all
select '1:2::3:5' from dual
)
, replaced as (
select input_.*, replace(val, ' ', '') as val_replaced
from input_
)
select
val,
substr(
val_replaced,
/*Locate the third occurrence of a colon and get a substring ...*/
instr(val_replaced, ':', 1, 3) + 1,
/*.. until the end, if the next colon is absent, or until the next colon*/
nvl(nullif(instr(val_replaced, ':', 1, 4), 0), length(val_replaced) + 1) - instr(val_replaced, ':', 1, 3) - 1
) as col
from replaced
VAL      COL
-------- ----
1:::9999 9999
1:2::    null
1:2::3:5 3
fiddle with performance difference.

I need to update my row with the English version

I have a column which contains rows like this: MER.Fiyatlandırma Müdür Yardımcısı.
SELECT name, SUBSTR(
name,
INSTR(name, '.', 1, 1),
INSTR(name, '.', 1, 2) + 1 - INSTR(name, '.', 1, 1)
) AS deneme
FROM HR_ALL_POSITIONS_F;
I used this code, and my rows look like .Fiyatlandırma Müdür Yardımcısı.
I also have an Excel file that contains this row, .Fiyatlandırma Müdür Yardımcısı., together with its English version.
I need to change .Fiyatlandırma Müdür Yardımcısı. to Price Vice Manager.
How can I do that?
UPDATE denememusa123
SET denememusa123.eklenecekkolon = (SELECT ENGLISHPOSITION FROM pozisyontanimlama)
WHERE (SELECT SUBSTR (name,
INSTR (name,
'.',
1,
1),
INSTR (name,
'.',
1,
2)
+ 1
- INSTR (name,
'.',
1,
1))
FROM HR_ALL_POSITIONS_F) = (Select TURKISHPOSITION FROM pozisyontanimlama);
(I tried this, but it is not working:
ORA-01427: single-row subquery returns more than one row)
Please help.

You can use a MERGE statement:
MERGE INTO denememusa123 dst
USING (
WITH split_denememusa123 (rid, prefix, position, suffix) AS (
SELECT rowid AS rid,
SUBSTR(
eklenecekkolon,
1,
INSTR(eklenecekkolon, '.', 1, 1)
) AS prefix,
SUBSTR(
eklenecekkolon,
INSTR(eklenecekkolon, '.', 1, 1) + 1,
INSTR(eklenecekkolon, '.', 1, 2) - INSTR(eklenecekkolon, '.', 1, 1) - 1
) AS position,
SUBSTR(
eklenecekkolon,
INSTR(eklenecekkolon, '.', 1, 2)
) AS suffix
FROM denememusa123
WHERE INSTR(eklenecekkolon, '.', 1, 2) > 0
)
SELECT s.rid, s.prefix || p.englishposition || s.suffix AS position
FROM split_denememusa123 s
INNER JOIN pozisyontanimlama p
ON (p.turkishposition = s.position)
) src
ON (src.rid = dst.ROWID)
WHEN MATCHED THEN
UPDATE SET dst.eklenecekkolon = src.position;
Which, for the sample data:
CREATE TABLE denememusa123 (eklenecekkolon) AS
SELECT 'MER.Fiyatlandırma Müdür Yardımcısı.XYZ' FROM DUAL;
CREATE TABLE pozisyontanimlama (turkishposition, englishposition) AS
SELECT 'Fiyatlandırma Müdür Yardımcısı', 'Price Vice Manager' FROM DUAL;
Then after the MERGE the table contains:
EKLENECEKKOLON
MER.Price Vice Manager.XYZ
fiddle

How to return a variable-length string from a string

I have the following dataset:
Ident Script
----- ----------------------------------------
ID1   Var_xxx_calc + Var_yyy_db + Var_zzz_calc
ID2   Var_xxx_calc + Var_zzz_db
Is there any way to split this up into the following table?
Ident Script                                                 Var1         Var2       Var3
----- ------------------------------------------------------ ------------ ---------- ------------
ID1   Var_xxx_calc + Var_yyy_db + Var_zzz_calc               Var_xxx_calc Var_yyy_db Var_zzz_calc
ID2   if Var_xxx_calc + Var_zzz_db > 10 then 'OK' else 'NOK' Var_xxx_calc Var_zzz_db null
Extra difficulty:
the Var_% all have different lengths, I only know they start with 'Var_'
I use Oracle Production version 19.12.0.0.0

I tried to find a solution and ended up with a limited one. I forcefully plant a delimiter into the SCRIPT in place of any character that could appear just before or just after your 'Var_something' string. You will see this in the code as the "endless" Replaces. The idea is to place a delimiter, where possible, right after the string you are trying to separate.
Here is the code, with some sample data generated by the WITH clause:
WITH
tbl AS
(
Select 'ID_1' "IDENT", 'Var_xxxxx_calculus + Var_yyy_dbase + Var_zz_calc' "SCRIPT" From Dual Union All
Select 'ID_2' "IDENT", 'Var_xxxxx_calcul + Var_yyy_dbase' "SCRIPT" From Dual Union All
Select 'ID_3' "IDENT", 'If Var_xxx_calc + Var_zzz_db* 2 > 10 then "OK" else "NOK"' "SCRIPT" From Dual Union All
Select 'ID_4' "IDENT", 'Some other text without Var followed by underscore' "SCRIPT" From Dual Union All
Select 'ID_5' "IDENT", 'And some with "Var_" Var_A, Var_B, Var_C in it' "SCRIPT" From Dual
)
SELECT
IDENT "IDENT",
SCRIPT "SCRIPT",
CASE
WHEN VAR_COUNT > 0 THEN
SubStr( SCRIPT_DELIMITED,
InStr(SCRIPT_DELIMITED, 'Var_', 1, 1),
InStr(SCRIPT_DELIMITED, ';', InStr(SCRIPT_DELIMITED, 'Var_', 1, 1), 1) - InStr(SCRIPT_DELIMITED, 'Var_', 1, 1)
)
END "VAR_1",
CASE
WHEN VAR_COUNT > 1 THEN
SubStr( SCRIPT_DELIMITED,
InStr(SCRIPT_DELIMITED, 'Var_', 1, 2),
InStr(SCRIPT_DELIMITED, ';', InStr(SCRIPT_DELIMITED, 'Var_', 1, 2), 1) - InStr(SCRIPT_DELIMITED, 'Var_', 1, 2)
)
END "VAR_2",
CASE
WHEN VAR_COUNT > 2 THEN
SubStr( SCRIPT_DELIMITED,
InStr(SCRIPT_DELIMITED, 'Var_', 1, 3),
InStr(SCRIPT_DELIMITED, ';', InStr(SCRIPT_DELIMITED, 'Var_', 1, 3), 1) - InStr(SCRIPT_DELIMITED, 'Var_', 1, 3)
)
END "VAR_3",
CASE
WHEN VAR_COUNT > 3 THEN
SubStr( SCRIPT_DELIMITED,
InStr(SCRIPT_DELIMITED, 'Var_', 1, 4),
InStr(SCRIPT_DELIMITED, ';', InStr(SCRIPT_DELIMITED, 'Var_', 1, 4), 1) - InStr(SCRIPT_DELIMITED, 'Var_', 1, 4)
)
END "VAR_4"
FROM
(
SELECT
IDENT "IDENT",
SCRIPT "SCRIPT",
CASE
WHEN InStr(SCRIPT, 'Var_', 1, 4) > 0 THEN 4
WHEN InStr(SCRIPT, 'Var_', 1, 3) > 0 THEN 3
WHEN InStr(SCRIPT, 'Var_', 1, 2) > 0 THEN 2
WHEN InStr(SCRIPT, 'Var_', 1, 1) > 0 THEN 1
ELSE 0
END "VAR_COUNT",
Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(SCRIPT, ' ', ';'), '+', ';'), '>', ';'), '<', ';'), '=', ';'), '"', ';'), '*', ';'), ',', ';') || ';' "SCRIPT_DELIMITED"
FROM
tbl
)
It is limited to the four variables selected for separation, and it assumes there are no separator characters other than those handled by the Replace functions.
The result with sample data is:
-- IDENT SCRIPT                                                    VAR_1              VAR_2         VAR_3       VAR_4
-- ----- --------------------------------------------------------- ------------------ ------------- ----------- -----
-- ID_1  Var_xxxxx_calculus + Var_yyy_dbase + Var_zz_calc          Var_xxxxx_calculus Var_yyy_dbase Var_zz_calc
-- ID_2  Var_xxxxx_calcul + Var_yyy_dbase                          Var_xxxxx_calcul   Var_yyy_dbase
-- ID_3  If Var_xxx_calc + Var_zzz_db* 2 > 10 then "OK" else "NOK" Var_xxx_calc       Var_zzz_db
-- ID_4  Some other text without Var followed by underscore
-- ID_5  And some with "Var_" Var_A, Var_B, Var_C in it            Var_               Var_A         Var_B       Var_C
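The chain of nested Replace calls can also be written with TRANSLATE, which maps each character of its second argument to the character at the same position in its third. A minimal sketch, assuming the same eight separator characters as in the query above (the one-row tbl is only sample data):

```sql
WITH tbl (ident, script) AS (
  SELECT 'ID_3', 'If Var_xxx_calc + Var_zzz_db* 2 > 10 then "OK" else "NOK"' FROM dual
)
SELECT ident,
       -- space + > < = " * , are each mapped to ';' (eight characters, eight semicolons);
       -- the appended ';' guarantees a delimiter after the last token.
       TRANSLATE(script, ' +><="*,', ';;;;;;;;') || ';' AS script_delimited
FROM   tbl;
```

Adding another separator then means adding one character to each string rather than another nested call.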
Regards...

Here's a solution using REGEXP_SUBSTR(). In the WITH clause, the 'tbl' table just sets up sample data from your original post. The select gets the first, second and third occurrences of the pattern: 'Var_', followed by one or more characters (non-greedy), another underscore, and another set of one or more characters (non-greedy), where this is followed either by a space and a character or by the end of the line. If the Var value can only contain alpha characters, it would be better to specify that, to tighten up the match and avoid picking up accidental values.
Caveat: It only selects 3 values per your original post.
WITH tbl(ident, script) AS (
SELECT 'ID1', 'Var_xxx_calc + Var_yyy_db + Var_zzz_calc' FROM dual UNION ALL
SELECT 'ID2', 'Var_xxx_calc + Var_zzz_db' FROM dual UNION ALL
SELECT 'ID3', 'Var_xxx_calc + Var_yyy_db + Var_zzz_calc' FROM dual UNION ALL
SELECT 'ID4', 'IF Var_xxx_calc + Var_zzz_db > 10 THEN ''OK'' ELSE ''NOK''' FROM dual
)
SELECT ident,
REGEXP_SUBSTR(script, '(Var_.+?_.+?)( .|$)', 1, 1, NULL, 1) AS Var1,
REGEXP_SUBSTR(script, '(Var_.+?_.+?)( .|$)', 1, 2, NULL, 1) AS Var2,
REGEXP_SUBSTR(script, '(Var_.+?_.+?)( .|$)', 1, 3, NULL, 1) AS Var3
FROM tbl;
IDENT VAR1 VAR2 VAR3
----- --------------- --------------- ---------------
ID1 Var_xxx_calc Var_yyy_db Var_zzz_calc
ID2 Var_xxx_calc Var_zzz_db
ID3 Var_xxx_calc Var_yyy_db Var_zzz_calc
ID4 Var_xxx_calc Var_zzz_db
4 rows selected.
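If the three-value limit is a problem, one possible variation is to return one row per variable instead of one column per variable. This is only a sketch: it assumes variable names consist of letters, digits and underscores (the tightening suggested above), and it assumes Oracle 12c or later for CROSS APPLY. The alias n and the column names pos and var_name are made up for illustration.

```sql
WITH tbl (ident, script) AS (
  SELECT 'ID1', 'Var_xxx_calc + Var_yyy_db + Var_zzz_calc' FROM dual UNION ALL
  SELECT 'ID2', 'IF Var_xxx_calc + Var_zzz_db > 10 THEN ''OK'' ELSE ''NOK''' FROM dual
)
SELECT t.ident,
       n.pos,
       -- Pick the pos-th match of the tightened pattern.
       REGEXP_SUBSTR(t.script, 'Var_[A-Za-z0-9_]+', 1, n.pos) AS var_name
FROM   tbl t
CROSS APPLY (
         -- Generate 1..N, where N is the number of matches in this row's script.
         -- CONNECT BY always yields at least one row, so a script with no match
         -- still produces a single row with a NULL var_name.
         SELECT LEVEL AS pos
         FROM   dual
         CONNECT BY LEVEL <= REGEXP_COUNT(t.script, 'Var_[A-Za-z0-9_]+')
       ) n
ORDER  BY t.ident, n.pos;
```

For the two sample rows this gives three rows for ID1 (Var_xxx_calc, Var_yyy_db, Var_zzz_calc) and two for ID2 (Var_xxx_calc, Var_zzz_db).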

How do I get a substring after a character when the occurrence of the character keeps changing

Example
123\.456.578.910.ABC
123\.456.578.910
Expected result
123\.456.578
123\.456.578
For both inputs I should get only the first three parts.
I tried regexp, substring and instr, but I'm not getting the results.

We can use REGEXP_SUBSTR here with a capture group that keeps the first dot-free run plus at most two further dot-separated parts:
SELECT REGEXP_SUBSTR(col, '^([^.]+(\.[^.]+){0,2})', 1, 1, NULL, 1)
FROM yourTable;
Demo

Traditional, substr + instr combination is another option:
with test (col) as
-- sample data
(select '123\.456.578.910.ABC' from dual union all
select '123\.456.578.910' from dual
)
-- query begins here
select col,
substr(col, 1, instr(col, '.', 1, 3) - 1) result
from test;
COL                  RESULT
-------------------- --------------------
123\.456.578.910.ABC 123\.456.578
123\.456.578.910     123\.456.578

If your value will always have at least 3 . characters then you can use:
SELECT value,
SUBSTR(value, 1, INSTR(value, '.', 1, 3) - 1) AS expected
FROM table_name;
If it may have fewer and you want the entire string in those cases then:
SELECT value,
CASE INSTR(value, '.', 1, 3)
WHEN 0
THEN value
ELSE SUBSTR(value, 1, INSTR(value, '.', 1, 3) - 1)
END AS expected
FROM table_name;
Which, for your sample data:
CREATE TABLE table_name (value) AS
SELECT '123\.456.578.910.ABC' FROM DUAL UNION ALL
SELECT '123\.456.578.910' FROM DUAL;
Both output:
VALUE                EXPECTED
-------------------- ------------
123\.456.578.910.ABC 123\.456.578
123\.456.578.910     123\.456.578
db<>fiddle here

Substring between characters in Oracle 9i

In later versions I can use this regexp_substr:
SELECT
ID,
regexp_substr(ID, '[^.]+', 1, 2) "DATA 1",
regexp_substr(ID, '[^.]+', 1, 3) "DATA 2"
FROM employees
Table: Employees
ID
--------------------------
2017.1.3001-ABC.01.01
2017.2.3002-ABCD.02.02
2017.303.3003-ABC.03.03
2017.404.3004-ABCD.04.04
Expected output:
ID DATA 1 DATA 2
------------------------ ------ ---------
2017.1.3001-ABC.01.01 1 3001-ABC
2017.2.3002-ABCD.02.02 2 3002-ABCD
2017.303.3003-ABC.03.03 303 3003-ABC
2017.404.3004-ABCD.04.04 404 3004-ABCD
Please help me to get the sub-string between . characters in ID column in SQL Oracle 9i.

You don't need regular expressions - just use SUBSTR and INSTR:
SELECT id,
SUBSTR( id, dot1 + 1, dot2 - dot1 - 1) AS data1,
SUBSTR( id, dot2 + 1, dot3 - dot2 - 1) AS data2
FROM (
SELECT id,
INSTR( id, '.', 1, 1 ) AS dot1,
INSTR( id, '.', 1, 2 ) AS dot2,
INSTR( id, '.', 1, 3 ) AS dot3
FROM employees
);
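As a quick check of the arithmetic, the sketch below applies the same expressions to one ID from the question, with the string supplied inline from DUAL instead of the employees table (that substitution is only for illustration). The dots sit at positions 5, 7 and 16, so the two substrings start at 6 and 8 with lengths 1 and 8.

```sql
SELECT id,
       SUBSTR(id, dot1 + 1, dot2 - dot1 - 1) AS data1,  -- SUBSTR(id, 6, 1) gives 1
       SUBSTR(id, dot2 + 1, dot3 - dot2 - 1) AS data2   -- SUBSTR(id, 8, 8) gives 3001-ABC
FROM (
  SELECT id,
         INSTR(id, '.', 1, 1) AS dot1,  -- 5
         INSTR(id, '.', 1, 2) AS dot2,  -- 7
         INSTR(id, '.', 1, 3) AS dot3   -- 16
  FROM (SELECT '2017.1.3001-ABC.01.01' AS id FROM dual)
);
```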
