Regex: how to get the text between a few colons? - sql

So, i have a lot of strings like the ones below in my database:
product1:1stparty:single_aduls:android:
product2:3rdparty:married_adults:ios:
product3:3rdparty:other_adults:android:
I need a regex to get only the text after the product name and before the device category. So, in the first line I'd get 1stparty:single_aduls, in the second 3rdparty:married_adults and in the third 3rdparty:other_adults. I'm stuck and can't find a way to solve that. Could anyone help me please?

As a regular expression, you can use:
select regexp_extract('product1:1stparty:single_aduls:android:', '^[^:]*:(.*):[^:]*:$')
This returns every after the first colon and before the penultimate colon.

We can try using REGEXP_REPLACE here:
SELECT REGEXP_REPLACE(val, r"^.*?:|:[^:]+:$", "") AS output
FROM yourTable;
This approach removes either the leading ...: or trailing :...: from the column, leaving behind the content you want. Here is a demo showing that the regex replacement is working:
Demo

You can also use standard split function and access result array element by index, which is quite clear to read and understand.
with a as (
select split('product1:1stparty:single_aduls:android:', ':') as splitted
)
select splitted[ordinal(2)] || ':' || splitted[ordinal (3)] as subs
from a

Consider below example
with your_table as (
select 'product1:1stparty:single_aduls:android:' txt union all
select 'product2:3rdparty:married_adults:ios:' union all
select 'product3:3rdparty:other_adults:android:'
)
select *,
(
select string_agg(part, ':' order by offset)
from unnest(split(txt, ':')) part with offset
where offset in (1, 2)
) result
from your_table
with output

Related

How do I dynamically extract substring from string?

I’m trying to dynamically extract a substring from a very long URL. For example, I may have the following URLs:
https://www.google.com/ABCDEF Version=“0.0.00.0” GHIJK
https://www.google.com/ABCDEFGH Version=“0.0.0.0” IJKLM
https://www.google.com/ABC Version=“0.0.0.00” 12345
I am trying to extract the version code only (0.0.0.0).
This is what I have so far:
SELECT SUBSTR(col, INSTR(col, ‘Version=“‘)+9)
FROM table
This query returns the following result:
0.0.00.0” GHIJK … (url continues on)
So, I attempt to find “Version” in the link, so I can start from the same position in each row. This works fine, however I’m having a hard time dynamically locating the ending quote (“). I tried using INSTR in the third parameter of my SUBSTR function, like so:
SELECT SUBSTR(col, INSTR(col, ‘Version=“‘)+9, INSTR(col, ‘“‘))
FROM table
I figured that this would find the position of the ending quote, and then use that number for the length, but it returns a strange output. I’ve also used POSITION, CHARINDEX, LENGTH, and LOCATE. None of these functions work in Oracle.
I think maybe when I put +9 after the first INSTR function, it’s setting the query to a fixed position instead of a dynamic one, but I’m not sure how else to remove ‘Version=“‘.
Here's one option (which, actually, selects what's between double quotes - that's version in your example; if there were some other similar substring, you'd get a wrong result).
with test (col) as
(select 'https://www.google.com/ABCDEF Version="0.0.00.0" GHIJK' from dual union all
select 'https://www.google.com/ABCDEFGH Version="0.0.0.0" IJKLM' from dual union all
select 'https://www.google.com/ABC Version="0.0.0.00" 12345' from dual
)
select col,
replace(regexp_substr(col, '".+"'), '"') version
from test;
which results in
https://www.google.com/ABCDEF Version="0.0.00.0" GHIJK 0.0.00.0
https://www.google.com/ABCDEFGH Version="0.0.0.0" IJKLM 0.0.0.0
https://www.google.com/ABC Version="0.0.0.00" 12345 0.0.0.00
You can still use use INSTR to locate the second " in the string, then subtract the location of the first " to get the length that you need to get. Below is an example query:
SELECT col,
SUBSTR (col, INSTR (col, '"') + 1, INSTR (col, '"', 1, 2) - INSTR (col, '"') - 1) version
FROM test;
You can use REGEXP_SUBSTR() with Version=(\d.*\d?) pattern in order to extract the piece between Version=" and "(your quotes are presumed to be regular double quotes " ")
SELECT REGEXP_SUBSTR(url,'Version="(\d.*\d)"',1,1,null,1) AS version
FROM t
where
the third argument(1) is position,
the fourth argument(1) is occurence, and especially important to use the last one as being capture group (1)
indeed using '"(\d.*\d)"' pattern is enough for the
current data set
or
REGEXP_REPLACE() with capture group \2 as
SELECT REGEXP_REPLACE(url,'^(.*Version=")([^"]*).*','\2') AS version
FROM t
Demo

How to use Charindex for one or the other character

I have a string with a bunch of numbers but it contains one letter somewhere in the center of the string. This letter can either be 'A' or 'B'. I am trying to find out the position of this letter with the Charindex() function. However it doesn't work when you have two search parameters:
select charindex('[A,B]','190118A3700000')
I tried it out with a range and wildcards but it did not work. So what I want are these two separate queries combined in one:
select charindex('A','190118A3700000')
select charindex('B','190118A3700000')
Does anybody have an idea how to do this?
Thank you!!!
Use patindex():
select patindex('%[A,B]%', '190118A3700000')
Or, if you want the first non-digit:
select patindex('%[^0-9]%', '190118A3700000')
Here is a db<>fiddle.
charindex() does not understand wildcards.
If charindex doesn't find that character, it returns 0 so all you need to do is
select col, charindex('A', col) + charindex('B', col) as position
from your_table;
Another alternative
select col, charindex('A', replace(col, 'B', 'A')) as position
from your_table;
DEMO

Concatenate & Trim String

Can anyone help me, I have a problem regarding on how can I get the below result of data. refer to below sample data. So the logic for this is first I want delete the letters before the number and if i get that same thing goes on , I will delete the numbers before the letter so I can get my desired result.
Table:
SALV3000640PIX32BLU
SALV3334470A9CARBONGRY
TP3000620PIXL128BLK
Desired Output:
PIX32BLU
A9CARBONGRY
PIXL128BLK
You need to use a combination of the SUBSTRING and PATINDEX Functions
SELECT
SUBSTRING(SUBSTRING(fielda,PATINDEX('%[^a-z]%',fielda),99),PATINDEX('%[^0-9]%',SUBSTRING(fielda,PATINDEX('%[^a-z]%',fielda),99)),99) AS youroutput
FROM yourtable
Input
yourtable
fielda
SALV3000640PIX32BLU
SALV3334470A9CARBONGRY
TP3000620PIXL128BLK
Output
youroutput
PIX32BLU
A9CARBONGRY
PIXL128BLK
SQL Fiddle:http://sqlfiddle.com/#!6/5722b6/29/0
To do this you can use
PATINDEX('%[0-9]%',FieldName)
which will give you the position of the first number, then trim off any letters before this using SUBSTRING or other string functions. (You need to trim away the first letters before continuing with the next step because unlike CHARINDEX there is no starting point parameter in the PATINDEX function).
Then on the remaining string use
PATINDEX('%[a-z]%',FieldName)
to find the position of the first letter in the remaining string. Now trim off the numbers in front using SUBSTRING etc.
You may find this other solution helpful
SQL to find first non-numeric character in a string
Try this it may helps you
;With cte (Data)
AS
(
SELECT 'SALV3000640PIX32BLU' UNION ALL
SELECT 'SALV3334470A9CARBONGRY' UNION ALL
SELECT 'SALV3334470A9CARBONGRY' UNION ALL
SELECT 'SALV3334470B9CARBONGRY' UNION ALL
SELECT 'SALV3334470D9CARBONGRY' UNION ALL
SELECT 'TP3000620PIXL128BLK'
)
SELECT * , CASE WHEN CHARINDEX('PIX',Data)>0 THEN SUBSTRING(Data,CHARINDEX('PIX',Data),LEN(Data))
WHEN CHARINDEX('A9C',Data)>0 THEN SUBSTRING(Data,CHARINDEX('A9C',Data),LEN(Data))
ELSE NULL END AS DesiredResult FROM cte
Result
Data DesiredResult
-------------------------------------
SALV3000640PIX32BLU PIX32BLU
SALV3334470A9CARBONGRY A9CARBONGRY
SALV3334470A9CARBONGRY A9CARBONGRY
SALV3334470B9CARBONGRY NULL
SALV3334470D9CARBONGRY NULL
TP3000620PIXL128BLK PIXL128BLK

T-SQL Substring - Last 3 Characters

Using T-SQL, how would I go about getting the last 3 characters of a varchar column?
So the column text is IDS_ENUM_Change_262147_190 and I need 190
SELECT RIGHT(column, 3)
That's all you need.
You can also do LEFT() in the same way.
Bear in mind if you are using this in a WHERE clause that the RIGHT() can't use any indexes.
You can use either way:
SELECT RIGHT(RTRIM(columnName), 3)
OR
SELECT SUBSTRING(columnName, LEN(columnName)-2, 3)
Because more ways to think about it are always good:
select reverse(substring(reverse(columnName), 1, 3))
declare #newdata varchar(30)
set #newdata='IDS_ENUM_Change_262147_190'
select REVERSE(substring(reverse(#newdata),0,charindex('_',reverse(#newdata))))
=== Explanation ===
I found it easier to read written like this:
SELECT
REVERSE( --4.
SUBSTRING( -- 3.
REVERSE(<field_name>),
0,
CHARINDEX( -- 2.
'<your char of choice>',
REVERSE(<field_name>) -- 1.
)
)
)
FROM
<table_name>
Reverse the text
Look for the first occurrence of a specif char (i.e. first occurrence FROM END of text). Gets the index of this char
Looks at the reversed text again. searches from index 0 to index of your char. This gives the string you are looking for, but in reverse
Reversed the reversed string to give you your desired substring
if you want to specifically find strings which ends with desired characters then this would help you...
select * from tablename where col_name like '%190'

Extract text before third - "Dash" in SQL

Can you please help to get this code for SQL?
I have column name INFO_01 which contain info like:
D10-52247-479-245 HALL SO
and I would like to extract only
D10-52247-479
I want the part of the text before the third "-" dash.
You'll need to get the position of the third dash (using instr) and then use substr to get the necessary part of the string.
with temp as (
select 'D10-52247-479-245 HALL SO' test_string from dual)
select test_string,
instr(test_string,1,3) third_dash,
substr(test_string,1,instr(test_string,1,3)-1) result
from temp
);
Here is a simple statement that should work:
SELECT SUBSTR(column, 1, INSTR(column,'-',1,3) ) FROM table;
Using a combination of SUBSTR and INSTR will return what you want:
SELECT SUBSTR('D10-52247-479-245', 0, INSTR('D10-52247-479-245', '-', -1, 1)-1) AS output
FROM DUAL
Result:
output
-------------
D10-52247-479
Use:
SELECT SUBSTR(t.column, 0, INSTR(t.column, '-', -1, 1)-1) AS output
FROM YOUR_TABLE t
Reference:
SUBSTR
INSTR
Addendum
If using Oracle10g+, you can use regex via REGEXP_SUBSTR.
I'm assuming MySQL, let me know if I'm wrong here. But using SUBSTRING_INDEX you could do the following:
SELECT SUBSTRING_INDEX(column, '-', 3)
EDIT
Appears to be oracle. Looks like we may have to resort to REGEXP_SUBSTR
SELECT REGEXP_SUBSTR(column, '^((?.*\-){2}[^\-]*)')
Can't test, so not sure what kind of result that will have...