Extracting certain text between two characters - sql

I have a nvarchar string from which I need to extract certain text from between characters.
Example: 1.abc.5,m001-1-Exit,822-FName-18001233321--2021-09-23 13:53:10 Thursday-m001-1-Exit-Swipe,Card NO: 822User ID: FNameName: 18001233321Dept: Read Date: 2021-09-23 13:53:10 ThursdayAddr: m001-1-ExitStatus: Swipe,07580ec2000002a52E917D0000000000372BA56E11010000
What I need:
| Name | Phone Number |
| -------- | -------------- |
| FName | 1800123321 |
My Attempt:
SELECT SUBSTRING(col, LEN(LEFT(col, CHARINDEX ('-', col))) + 1, LEN(col) - LEN(LEFT(col, CHARINDEX ('-', col))) - LEN(RIGHT(col, LEN(col) - CHARINDEX ('-', col))) - 1);

One way:
Use patindex to find "FName-"
Remove the start of the string up until and including "FName-"
Use patindex to find "--"
Remove the rest of the string from and including "--"
You can consolidate the query down to one line, but you'll find yourself repeating parts of the logic - which I like to avoid. And calculating one thing at a time makes it easier to debug.
select
A.Col
, B.StringStart
, C.NewString
, patindex('%--%',C.NewString) NewStringEnd
, substring(C.NewString,1,patindex('%--%',C.NewString)-1) -- <- Required Result
from (
values
(N'1.abc.5,m001-1-Exit,822-FName-18001233321--2021-09-23 13:53:10 Thursday-m001-1-Exit-Swipe,Card NO: 822User ID: FNameName: 18001233321Dept: Read Date: 2021-09-23 13:53:10 ThursdayAddr: m001-1-ExitStatus: Swipe,07580ec2000002a52E917D0000000000372BA56E11010000')
) A (Col)
cross apply (
values
(patindex('%FName-%',Col))
) B (StringStart)
cross apply (
values
(substring(A.Col,B.StringStart+6,len(A.Col)-B.StringStart-6))
) C (NewString);

Related

Oracle REGEXP_SUBSTR to ignore the first ocurrence of a character but include the 2nd occurence

I have a string that has this format "number - name" I'm using REGEXP_SUBSTR to split it in two separate columns one for name and one for number.
SELECT
REGEXP_SUBSTR('123 - ABC','[^-]+',1,1) AS NUM,
REGEXP_SUBSTR('123 - ABC','[^-]+',1,2) AS NAME
from dual;
But it doesn't work if the name includes a hyphen for example: ABC-Corp then the name is shown only like 'ABC' instead of 'ABC-Corp'. How can I get a regex exp to ignore everything before the first hypen and include everything after it?
You want to split the string on the first occurence of ' - '. It is a simple enough task to be efficiently performed by string functions rather than regexes:
select
substr(mycol, 1, instr(mycol, ' - ') - 1) num,
substr(mycol, instr(mycol, ' - ') + 3) name
from mytable
Demo on DB Fiddlde:
with mytable as (
select '123 - ABC' mycol from dual
union all select '123 - ABC - Corp' from dual
)
select
mycol,
substr(mycol, 1, instr(mycol, ' - ') - 1) num,
substr(mycol, instr(mycol, ' - ') + 3) name
from mytable
MYCOL | NUM | NAME
:--------------- | :-- | :---------
123 - ABC | 123 | ABC
123 - ABC - Corp | 123 | ABC - Corp
NB: #GMB solution is much better in your simple case. It's an overkill to use regular expressions for that.
tldr;
Usually it's easierr and more readable to use subexpr parameter instead of occurrence in case of such fixed masks. So you can specify full mask: \d+\s*-\s*\S+
ie numbers, then 0 or more whitespace chars, then -, again 0 or more whitespace chars and 1+ non-whitespace characters.
Then we adding () to specify subexpressions: since we need only numbers and trailing non-whitespace characters we puts them into ():
'(\d+)\s*-\s*(\S+)'
Then we just specify which subexpression we need, 1 or 2:
SELECT
REGEXP_SUBSTR(column_value,'(\d+)\s*-\s*(\S+)',1,1,null,1) AS NUM,
REGEXP_SUBSTR(column_value,'(\d+)\s*-\s*(\S+)',1,1,null,2) AS NAME
from table(sys.odcivarchar2list('123 - ABC', '123 - ABC-Corp'));
Result:
NUM NAME
---------- ----------
123 ABC
123 ABC-Corp
https://docs.oracle.com/database/121/SQLRF/functions164.htm#SQLRF06303
https://docs.oracle.com/database/121/SQLRF/ap_posix003.htm#SQLRF55544

PostgreSQL - Extract string before ending delimiter

I have a column of data that looks like this:
58,0:102,56.00
52,0:58,68
58,110
57,440.00
52,0:58,0:106,6105.95
I need to extract the character before the last delimiter (',').
Using the data above, I want to get:
102
58
58
57
106
Might be done with a regular expression in substring(). If you want:
the longest string of only digits before the last comma:
substring(data, '(\d+)\,[^,]*$')
Or you may want:
the string before the last comma (',') that's delimited at the start either by a colon (':') or the start of the string.
Could be another regexp:
substring(data, '([^:]*)\,[^,]*$')
Or this:
reverse(split_part(split_part(reverse(data), ',', 2), ':', 1))
More verbose but typically much faster than a (expensive) regular expression.
db<>fiddle here
Can't promise this is the best way to do it, but it is a way to do it:
with splits as (
select string_to_array(bar, ',') as bar_array
from foo
),
second_to_last as (
select
bar_array[cardinality(bar_array)-1] as field
from splits
)
select
field,
case
when field like '%:%' then split_part (field, ':', 2)
else field
end as last_item
from second_to_last
I went a little overkill on the CTEs, but that was to expose the logic a little better.
With a CTE that removes everything after the last comma and then splits the rest into an array:
with cte as (
select
regexp_split_to_array(
replace(left(col, length(col) - position(',' in reverse(col))), ':', ','),
','
) arr
from tablename
)
select arr[array_upper(arr, 1)] from cte
See the demo.
Results:
| result |
| ------ |
| 102 |
| 58 |
| 58 |
| 57 |
| 106 |
The following treats the source string as an "array of arrays". It seems each data element can be defined as S(x,y) and the overall string as S1:S2:...Sn.
The task then becomes to extract x from Sn.
with as_array as
( select string_to_array(S[n], ',') Sn
from (select string_to_array(col,':') S
, length(regexp_replace(col, '[^:]','','g'))+1 n
from tablename
) t
)
select Sn[array_length(Sn,1)-1] from as_array
The above extends S(x,y) to S(a,b,...,x,y) the task remains to extracting x from Sn. If it is the case that all original sub-strings S are formatted S(x,y) then the last select reduces to select Sn[1]

How to trim data with last alphanumeric key as reference

Need your help on below Data trimming or what so ever to get the result that shows also on below. I want to get the data where my reference is the last - symbol.
See example below.
FROM TO
+-------------------------------+
|ABC-1234-AR-R | ABC-1234-AR |
|ABC-1254-AR-IT | ABC-1254-AR |
|ABC-1223-AR-LTL| ABC-1223-AR |
|ABC-1234-R | ABC-1234 |
+-------------------------------+
This will give you index of last occurence of a hyphen:
LEN(data) - CHARINDEX('-', REVERSE(data)) + 1
So it's enough to take substring of this length:
SELECT
data,
SUBSTRING(data, 1, LEN(data) - CHARINDEX('-', REVERSE(data))) AS data_trimmed
FROM yourTable;
Demo
You can also use a combination of LEFT and CHARINDEX functions.
Query
select [from],
left([from], len([from]) - charindex('-', reverse([from]), 1)) as [to]
from [your_table_name];
Find a demo here

How to extract the number from a string using Oracle?

I have a string as follows: first, last (123456) the expected result should be 123456. Could someone help me in which direction should I proceed using Oracle?
It will depend on the actual pattern you care about (I assume "first" and "last" aren't literal hard-coded strings), but you will probably want to use regexp_substr.
For example, this matches anything between two brackets (which will work for your example), but you might need more sophisticated criteria if your actual examples have multiple brackets or something.
SELECT regexp_substr(COLUMN_NAME, '\(([^\)]*)\)', 1, 1, 'i', 1)
FROM TABLE_NAME
Your question is ambiguous and needs clarification. Based on your comment it appears you want to select the six digits after the left bracket. You can use the Oracle instr function to find the position of a character in a string, and then feed that into the substr to select your text.
select substr(mycol, instr(mycol, '(') + 1, 6) from mytable
Or if there are a varying number of digits between the brackets:
select substr(mycol, instr(mycol, '(') + 1, instr(mycol, ')') - instr(mycol, '(') - 1) from mytable
Find the last ( and get the sub-string after without the trailing ) and convert that to a number:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE test ( str ) AS
SELECT 'first, last (123456)' FROM DUAL UNION ALL
SELECT 'john, doe (jr) (987654321)' FROM DUAL;
Query 1:
SELECT TO_NUMBER(
TRIM(
TRAILING ')' FROM
SUBSTR(
str,
INSTR( str, '(', -1 ) + 1
)
)
) AS value
FROM test
Results:
| VALUE |
|-----------|
| 123456 |
| 987654321 |

Extract value between two characters with varying positions

NOTE: I am using TSQL
I need to be able to extract data from the middle of a string. Both the length of the data I need, and the length of the string will vary.
Here are examples of the complete string:
362 Any Rd - NewPc#:420010079274892700465647513335 - StopID:12345
362 Any Rd - NewPc#:4200644392748927004720180006426006 - StopID:12345
362 Any Rd - NewPc#:00006675214112593057 - StopID:12345
362 Random Rd - NewPc#:420063709274892700465647550149 - StopID:4567
I only need the following from the above strings:
420010079274892700465647513335
4200644392748927004720180006426006
00006675214112593057
420063709274892700465647550149
Can someone please help me figure this out?
You can use a combination of substring and charindex.
Fiddle
select
substring(somecolumn, charindex(':',somecolumn) + 1,
len(somecolumn) -
charindex('-', reverse(somecolumn)) - 1 - charindex(':',somecolumn))
from tablename
Select SUBSTRING( Note,CHARINDEX ('#:' , Note, 1 ) +2,
CHARINDEX ( ' - S' ,Note ,1 )-
CHARINDEX ('#:' , Note, 1 ) -2 )
from OER
You use 2 functions for your table oer:
1st is to find the location for ":"
CHARINDEX ('#:' , Note, 1 ) +2
The +2 is to get rid of the "#:"
You need to use this 2 times, once again for the " - S" To see how many characters you want to go.
CHARINDEX ( ' - S' ,Note ,1 )
And the 2nd function is Substring to use part of your Note
link sql fiddle:
http://sqlfiddle.com/#!3/95ce8/7/0
An example is this: You have a string and the character $
String :
aaaaa$bbbbb$ccccc
Code:
SELECT SUBSTRING('aaaaa$bbbbb$ccccc',CHARINDEX('$','aaaaa$bbbbb$ccccc')+1, CHARINDEX('$','aaaaa$bbbbb$ccccc',CHARINDEX('$','aaaaa$bbbbb$ccccc')+1) -CHARINDEX('$','aaaaa$bbbbb$ccccc')-1) as My_String
Output:
bbbbb