Extract text before third - "Dash" in SQL - sql

Can you please help me write this in SQL?
I have a column named INFO_01 which contains values like:
D10-52247-479-245 HALL SO
and I would like to extract only
D10-52247-479
I want the part of the text before the third "-" dash.

You'll need to get the position of the third dash (using instr) and then use substr to get the necessary part of the string.
with temp as (
select 'D10-52247-479-245 HALL SO' test_string from dual)
select test_string,
instr(test_string,'-',1,3) third_dash,
substr(test_string,1,instr(test_string,'-',1,3)-1) result
from temp;
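The position arithmetic behind INSTR and SUBSTR can be checked outside the database. The following is a minimal Python sketch, not SQL; the helpers `nth_index` and `before_nth` are hypothetical names introduced here to mimic "find the third dash, then keep everything before it".

```python
def nth_index(text: str, sep: str, n: int) -> int:
    """Return the 0-based index of the n-th occurrence of sep, or -1 if absent."""
    pos = -1
    for _ in range(n):
        pos = text.find(sep, pos + 1)
        if pos == -1:
            return -1
    return pos


def before_nth(text: str, sep: str, n: int) -> str:
    """Return the part of text before the n-th sep (whole text if there is none)."""
    pos = nth_index(text, sep, n)
    return text if pos == -1 else text[:pos]


print(before_nth("D10-52247-479-245 HALL SO", "-", 3))  # D10-52247-479
```

Returning the whole string when there is no third dash is a choice made for this sketch. In Oracle, INSTR returns 0 in that case, so SUBSTR with a length of -1 yields NULL instead.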

Here is a simple statement that should work:
SELECT SUBSTR(column, 1, INSTR(column,'-',1,3) - 1) FROM table;

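Using a combination of SUBSTR and INSTR will return what you want. Note that this version searches backwards from the end of the string (position -1), so it cuts at the last dash, which happens to be the third dash in the sample data: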
SELECT SUBSTR('D10-52247-479-245', 0, INSTR('D10-52247-479-245', '-', -1, 1)-1) AS output
FROM DUAL
Result:
output
-------------
D10-52247-479
Use:
SELECT SUBSTR(t.column, 0, INSTR(t.column, '-', -1, 1)-1) AS output
FROM YOUR_TABLE t
Reference:
SUBSTR
INSTR
Addendum
If using Oracle10g+, you can use regex via REGEXP_SUBSTR.

I'm assuming MySQL, let me know if I'm wrong here. But using SUBSTRING_INDEX you could do the following:
SELECT SUBSTRING_INDEX(column, '-', 3)
EDIT
Appears to be Oracle. Looks like we may have to resort to REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR(column, '^([^-]*-){2}[^-]*')
Can't test, so not sure what kind of result that will have...
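Since the regex above could not be tested, here is a small Python sketch for trying out the idea "two runs of non-dash characters each followed by a dash, then one more run of non-dash characters". The pattern and the function names are assumptions made for illustration. Python's `(?:...)` non-capturing group is used here; in Oracle a plain group `(...)` would likely be needed instead. The second function imitates what MySQL's SUBSTRING_INDEX(column, '-', 3) does.

```python
import re
from typing import Optional

# Two "non-dash run + dash" groups, then a final non-dash run.
PATTERN = re.compile(r"^(?:[^-]*-){2}[^-]*")


def before_third_dash_regex(text: str) -> Optional[str]:
    """Return the text before the third dash, or None if there are fewer than two dashes."""
    match = PATTERN.match(text)
    return match.group(0) if match else None


def before_third_dash_split(text: str) -> str:
    """Keep the first three dash-separated parts, like SUBSTRING_INDEX(text, '-', 3)."""
    return "-".join(text.split("-")[:3])


sample = "D10-52247-479-245 HALL SO"
print(before_third_dash_regex(sample))  # D10-52247-479
print(before_third_dash_split(sample))  # D10-52247-479
```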

Related

DB2 - How to retrieve the last substring starting from the end

I'm trying to retrieve the last substring of a string, counting from the end.
Here is my dataset:
Input:
BRAND_Arnette
BRAND_Persol
MODEL_CODE_DISPLAY_226781
Output:
Arnette
Persol
226781
I've managed to retrieve what I need, but my approach is not universal, because I always consider only the last 10 characters, counting from the right:
SELECT
SUBSTR(RIGHT(rtrim(cast(attrval.IDENTIFIER as char(50))), 10), LOCATE('_',RIGHT(rtrim(cast(attrval.IDENTIFIER as char(50))), 10))+1)
FROM ...
How can this SELECT be changed so that it is always valid? Thanks
Try the following expression:
SUBSTR (identifier, LOCATE_IN_STRING (identifier, '_', -1) + 1)
I would suggest regexp_substr():
select regexp_substr(identifier, '[^_]+$')
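Both suggestions come down to "take whatever follows the last underscore". A minimal Python sketch of the two approaches, run against the sample data from the question (the function names are made up for this example):

```python
import re
from typing import Optional


def last_segment(identifier: str, sep: str = "_") -> str:
    """Position-based: keep everything after the last sep (whole string if no sep)."""
    return identifier.rpartition(sep)[2]


def last_segment_regex(identifier: str) -> Optional[str]:
    """Regex-based: the trailing run of non-underscore characters, as in '[^_]+$'."""
    match = re.search(r"[^_]+$", identifier)
    return match.group(0) if match else None


for value in ["BRAND_Arnette", "BRAND_Persol", "MODEL_CODE_DISPLAY_226781"]:
    print(last_segment(value), last_segment_regex(value))
```

The two differ on a value that ends with an underscore: the position-based version returns an empty string, while the regex finds no match.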

Regex: how to get the text between a few colons?

So, I have a lot of strings like the ones below in my database:
product1:1stparty:single_aduls:android:
product2:3rdparty:married_adults:ios:
product3:3rdparty:other_adults:android:
I need a regex to get only the text after the product name and before the device category. So, in the first line I'd get 1stparty:single_aduls, in the second 3rdparty:married_adults and in the third 3rdparty:other_adults. I'm stuck and can't find a way to solve that. Could anyone help me please?
As a regular expression, you can use:
select regexp_extract('product1:1stparty:single_aduls:android:', '^[^:]*:(.*):[^:]*:$')
This returns everything after the first colon and before the penultimate colon.
We can try using REGEXP_REPLACE here:
SELECT REGEXP_REPLACE(val, r"^.*?:|:[^:]+:$", "") AS output
FROM yourTable;
This approach removes either the leading ...: or the trailing :...: from the column, leaving behind the content you want.
You can also use the standard split function and access the elements of the resulting array by index, which is quite clear to read and understand.
with a as (
select split('product1:1stparty:single_aduls:android:', ':') as splitted
)
select splitted[ordinal(2)] || ':' || splitted[ordinal (3)] as subs
from a
Consider the example below:
with your_table as (
select 'product1:1stparty:single_aduls:android:' txt union all
select 'product2:3rdparty:married_adults:ios:' union all
select 'product3:3rdparty:other_adults:android:'
)
select *,
(
select string_agg(part, ':' order by offset)
from unnest(split(txt, ':')) part with offset
where offset in (1, 2)
) result
from your_table
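The regex and split approaches above can be compared quickly in Python. This is only a sketch of the same logic, not BigQuery code, and the function names are hypothetical.

```python
import re
from typing import Optional

ROWS = [
    "product1:1stparty:single_aduls:android:",
    "product2:3rdparty:married_adults:ios:",
    "product3:3rdparty:other_adults:android:",
]


def middle_by_regex(txt: str) -> Optional[str]:
    """Capture everything after the first colon and before the penultimate colon."""
    match = re.match(r"^[^:]*:(.*):[^:]*:$", txt)
    return match.group(1) if match else None


def middle_by_split(txt: str) -> str:
    """Split on ':' and join the second and third parts (0-based offsets 1 and 2)."""
    parts = txt.split(":")
    return ":".join(parts[1:3])


for row in ROWS:
    print(middle_by_regex(row), middle_by_split(row))
```

The regex version adapts to rows with more than two middle parts, while the split version always returns exactly the second and third parts.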

Removing part of the string

In my SQL table I have an attribute column with values like "Pat0700-1700". On my report I want to drop the "Pat" and display only 0700-1700. How would I accomplish this in SQL? I have searched and tried substring, without success.
On the following RDBMSs:
Oracle
MySQL
DB2
StandardSQL
you can try with the function SUBSTR():
SELECT SUBSTR(<column>, 4) AS substr_string
FROM <table>
OUTPUT:
substr_string
-------------
0700-1700
The standard SQL method would be replace():
select replace(col, 'Pat', '')
Given that the rest of the string has a fixed format -- 9 characters -- you might also find that one of these is appropriate (and more general):
select right(col, 9)
select substr(col, 4, 9) -- or perhaps substring()
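To see how these options differ, here is a short Python sketch that mirrors each SQL expression with string slicing. It is an illustration only; the second sample value is invented to show where the approaches diverge.

```python
value = "Pat0700-1700"

from_position_4 = value[3:]               # like SUBSTR(col, 4): skip the first 3 characters
without_prefix = value.replace("Pat", "")  # like REPLACE(col, 'Pat', ''): drop every 'Pat'
last_nine = value[-9:]                    # like RIGHT(col, 9): keep the last 9 characters
fixed_window = value[3:12]                # like SUBSTR(col, 4, 9): 9 characters from position 4

print(from_position_4, without_prefix, last_nine, fixed_window)

# A made-up value where 'Pat' appears twice: replace removes both occurrences,
# while the position-based slice only skips the first three characters.
tricky = "Pat0700-Pat1700"
print(tricky[3:])                 # 0700-Pat1700
print(tricky.replace("Pat", ""))  # 0700-1700
```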

How to get a file name without its extension using regular expressions

I have a field with the following values. I want to extract only the rows that have "xyz" in the field value, as shown below. Can you please help?
Mydata_xyz_aug21
Mydata2_zzz_aug22
Mydata3_xyz_aug33
One more requirement
I want to extract only "aIBM_MyProjectFile" from following string below, can you please help me with this?
finaldata/mydata/aIBM_MyProjectFile.exe.ld
I've tried this but it didn't work.
select
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','([^/]*)[\.]') exp
from dual;
To extract substrings between the first pair of underscores, you need to use
regexp_substr('Mydata_xyz_aug21','_([^_]+)_', 1, 1, NULL, 1)
To get the file name without the extension, you need
regexp_substr('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld','.*/([^.]+)', 1, 1, NULL, 1)
Note that each regex contains a capturing group (a pattern inside (...)) and this value is accessed with the last 1 argument to the regexp_substr function.
The _([^_]+)_ pattern finds the first _, then places 1 or more chars other than _ into Group 1 and then matches another _.
The .*/([^.]+) pattern matches the whole text up to the last /, then captures 1 or more chars other than . into Group 1 using ([^.]+).
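The two patterns can be tried in Python, whose `re` module handles these particular constructs the same way. This is a sketch for experimenting with the capturing groups, with hypothetical function names; `group(1)` plays the role of the last argument 1 in regexp_substr.

```python
import re
from typing import Optional


def between_first_underscores(text: str) -> Optional[str]:
    """Return Group 1 of _([^_]+)_ : the text between the first pair of underscores."""
    match = re.search(r"_([^_]+)_", text)
    return match.group(1) if match else None


def file_stem(path: str) -> Optional[str]:
    """Return Group 1 of .*/([^.]+) : the name after the last '/' up to the first dot."""
    match = re.search(r".*/([^.]+)", path)
    return match.group(1) if match else None


print(between_first_underscores("Mydata_xyz_aug21"))  # xyz
print(file_stem("FinalProject/MyProject/aIBM_MyProjectFile.exe.ld"))  # aIBM_MyProjectFile
```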
For the first requirement, it would suffice to use LIKE, as posted in answer above:
SELECT column
FROM table
WHERE column LIKE '%xyz%';
For your second requirement (extraction) you will have to use REGEXP_SUBSTR function:
SELECT REGEXP_SUBSTR ('FinalProject/MyProject/aIBM_MyProjectFile.exe.ld', '.*/([^.]+)', 1, 1, NULL, 1)
FROM DUAL
I hope it helped!
Another way to do this is to skip regexp completely:
WITH
aset AS
(SELECT 'with_extension.txt' txt FROM DUAL
UNION ALL
SELECT 'without_extension' FROM DUAL)
SELECT CASE
WHEN INSTR (txt, '.', -1) > 0
THEN
SUBSTR (txt, 1, INSTR (txt, '.', -1) - 1)
ELSE
txt
END
txt
FROM aset
The result of this is
with_extension
without_extension
A BIG Caveat where the regexp is better:
My method doesn't handle this case correctly:
\this\is.a\test
So after I have gone to all this effort, stay with the regexp solutions. I'll leave this here so that others may learn from it.
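The caveat can be reproduced with a small Python sketch of the same "cut at the last dot" logic. The guarded variant is one possible fix, assuming a backslash path separator as in the example; it is an illustration and not part of the original answer.

```python
def strip_extension_naive(txt: str) -> str:
    """Cut at the last dot, like the INSTR(txt, '.', -1) approach."""
    dot = txt.rfind(".")
    return txt if dot == -1 else txt[:dot]


def strip_extension_guarded(txt: str, sep: str = "\\") -> str:
    """Cut at the last dot only if it comes after the last path separator."""
    dot = txt.rfind(".")
    if dot == -1 or dot < txt.rfind(sep):
        return txt
    return txt[:dot]


path = r"\this\is.a\test"
print(strip_extension_naive(path))    # \this\is   (wrong: the dot belongs to a directory)
print(strip_extension_guarded(path))  # \this\is.a\test
```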

Oracle - REGEXP_SUBSTR leading zeroes ignored issue

While executing the query below, I'm getting "235" instead of the expected result "0":
select REGEXP_SUBSTR(000.235||'', '[^.]+', 1, 1) from dual;
Do this instead, and you'll see where the problem comes from:
select 000.235||'' from dual
Result:
.235
The regexp picks up the first longest run of non-period characters, which in this string is "235", so it's working correctly; it's the input value that is broken.
Now, if you'd written it like this, it would be fine:
select REGEXP_SUBSTR('000.235', '[^.]+', 1, 1) from dual
So why the odd presentation of the numeric? What does the data in your table look like? This is unlikely to be the actual query you're running. If you need help with the true query, post it.
Oracle drops the leading zeros when it converts a numeric value to a string. If the value is kept as a string (here passed through ltrim), the zeros survive:
select REGEXP_SUBSTR(ltrim(' 000.235')||'', '[^.]+', 1, 1) from dual;
result: 000 as expected