Trimming value from a column in Snowflake

I have a column called File with values 'Mens_Purchaser_Segment_Report' and 'Loyalist_Audience_Segment_Report'. I want to capture everything that comes before the word Segment.
I used this query:
select
TRIM(file,regexp_substr(file, '_Segment_Report.*')) as new_col
Output:
Mens_Purch
Loyalist_Audi
How do I capture everything before Segment?
I tried the following, with the same results:
TRIM(file,regexp_substr(file, 'S.*'))
TRIM(file,regexp_substr(file, '_S.*'))

You didn't specify whether the trailing text is always _Segment_Report or whether you want any text before _Segment. Different solutions apply depending on that; see below.
create or replace table foo(s string) as select * from values
('Mens_Purchaser_Segment_Report'),
('Loyalist_Audience_Segment_Report');
-- If you know the suffix you want to remove is always exactly '_Segment_Report'
select s, replace(s, '_Segment_Report', '') from foo;
-- If you know the suffix you want to remove starts with '_Segment' but can have something after
-- - approach 1, where we replace the _Segment and anything after it with nothing
select s, regexp_replace(s, '_Segment.*', '') from foo;
-- - approach 2, where we extract things before _Segment
-- Note: it will behave differently if there are many instances of '_Segment'
select s, regexp_substr(s, '(.*)_Segment.*', 1, 1, 'e') from foo;
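To check how the three approaches differ without a Snowflake session, here is a minimal Python sketch that mirrors them with `str.replace` and the `re` module. It is an analogue rather than Snowflake code, and the function names are made up for illustration.

```python
import re

files = ["Mens_Purchaser_Segment_Report", "Loyalist_Audience_Segment_Report"]

def drop_exact_suffix(s):
    # Mirrors replace(s, '_Segment_Report', ''): removes the literal text only.
    return s.replace("_Segment_Report", "")

def drop_from_segment(s):
    # Mirrors regexp_replace(s, '_Segment.*', ''): cuts at the FIRST '_Segment'.
    return re.sub(r"_Segment.*", "", s)

def keep_before_segment(s):
    # Mirrors regexp_substr(s, '(.*)_Segment.*', 1, 1, 'e'):
    # the greedy (.*) runs up to the LAST '_Segment'.
    m = re.search(r"(.*)_Segment.*", s)
    return m.group(1) if m else None

for f in files:
    print(f, drop_exact_suffix(f), drop_from_segment(f), keep_before_segment(f))
```

On the sample values all three return Mens_Purchaser and Loyalist_Audience. They only diverge when '_Segment' appears more than once: for a made-up value such as 'A_Segment_B_Segment_C', the second function keeps 'A' while the third keeps 'A_Segment_B'.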

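Try using regexp_replace. Note that the pattern below removes only the word Segment itself; extend it (for example to '_Segment.*') to drop everything after it as well: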
select regexp_replace(fld1, 'Segment', '') from (
select 'Mens_Purchaser_Segment_Report and Loyalist_Audience_Segment_Report' fld1 from dual );

Related

Search for a pattern from comma-separated parameters in PL/SQL

The parameter to my procedure is lv_ip := 'MNS-GC%|CS,MIB-TE%|DC'.
My cursor query should search for records that start with 'MNS-GC%' or 'MIB-TE%':
Select id, date,program,program_start_date
from table_1
where program like 'MNS-GC%' or program LIKE 'MIB-TE%'
Please suggest ways to read these patterns from the parameter, and an alternative to LIKE.

Since you mention you want to preserve what's on the right side of the pipe, and want to be able to process parameters dynamically, here's a way to parse multi-delimited data using a CTE.
The table called 'tbl' just sets up your original data. tbl_comma contains that data split on the comma. The final query splits that data into name/value pairs.
It's not the exact answer you are looking for, but it should give you some ideas.
COLUMN ID FORMAT a3
COLUMN PROGRAM FORMAT a10
COLUMN part2 FORMAT a6
-- Original data
WITH tbl(ID, DATA) AS (
SELECT 1, 'MNS-GC%|CS,MIB-TE%|DC' FROM dual UNION ALL
SELECT 2, 'MNS-GC%|CS,MIB-TE%|DC,MIB-TA%|AB,MIB-TB%|BC' FROM dual
),
tbl_comma(ID, CASE) AS (
SELECT ID,
REGEXP_SUBSTR(DATA, '(.*?)(,|$)', 1, LEVEL, NULL, 1) CASE
FROM tbl
CONNECT BY REGEXP_SUBSTR(DATA, '(.*?)(,|$)', 1, LEVEL) IS NOT NULL
AND PRIOR ID = ID
AND PRIOR SYS_GUID() IS NOT NULL
)
--SELECT * FROM tbl_comma;
-- Parse into name/value pairs
SELECT ID,
REGEXP_REPLACE(CASE, '^(.*)\|.*', '\1') PROGRAM,
REGEXP_REPLACE(CASE, '.*\|(.*)$', '\1') PART2
FROM tbl_comma;
 ID PROGRAM    PART2
--- ---------- ------
  1 MNS-GC%    CS
  1 MIB-TE%    DC
  2 MNS-GC%    CS
  2 MIB-TE%    DC
  2 MIB-TA%    AB
  2 MIB-TB%    BC
6 rows selected.

If you're stuck with that input and the structure is fixed, with each comma-separated element having a pipe-delimited value, you could possibly convert that string to a regular expression pattern, and then use regexp_like to pattern-match:
select id, date, program, program_start_date
from table_1
where regexp_like(
program,
'^(' || rtrim(regexp_replace(lv_ip, '%\|.*?(,|$)', '|'), '|') || ')')
With your example parameter, the
'^(' || rtrim(regexp_replace(lv_ip, '%\|.*?(,|$)', '|'), '|') || ')'
would generate the pattern
^(MNS-GC|MIB-TE)
i.e. looking for either of those strings at the start of the program value.
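The string-to-pattern conversion can be checked outside the database. The Python sketch below applies the same regular expression with `re.sub` and `str.rstrip`; the function name and the sample program values are invented for illustration, and it assumes the prefixes contain no regex metacharacters (otherwise they would need escaping).

```python
import re

def prefix_pattern(lv_ip):
    # Replace each '%|<value>' plus its trailing comma (or end of string) with '|',
    # then drop the final '|' and anchor the alternation at the start.
    alternatives = re.sub(r"%\|.*?(,|$)", "|", lv_ip).rstrip("|")
    return "^(" + alternatives + ")"

pattern = prefix_pattern("MNS-GC%|CS,MIB-TE%|DC")
print(pattern)  # ^(MNS-GC|MIB-TE)

# Hypothetical program values, only to show which ones the pattern accepts.
programs = ["MNS-GC-001", "MIB-TE-7", "XYZ-1"]
matches = [p for p in programs if re.search(pattern, p)]
print(matches)
```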
Alternatively, you could split the input up yourself with instr and substr and, since the number of elements may vary, build a dynamic query from the pieces. That might be faster than using regular expressions, but might be harder to maintain.
What would the regexp be to match CS|DC?
It depends how you plan to use those values, but if you're looking for some column exactly matching one of them, then you could do something similar with:
'^(' || ltrim(regexp_replace(lv_ip, '(^|,)[^|]*', null), '|') || ')$'
which with your input string would generate the pattern
^(CS|DC)$
But if you need to match the corresponding values as pairs - so the equivalent of something like:
where (program like 'MNS-GC%' and some_col = 'CS')
or (program like 'MIB-TE%' and some_col = 'DC')
... then you'd need to extract them as pairs, as @Gary_W has shown.
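As an illustration of pair-wise matching, here is a small Python sketch. It assumes every element has the form '<prefix>%|<value>' (a LIKE pattern with a single trailing wildcard), and the function names are hypothetical; `some_col` stands for whatever column holds the second value.

```python
def parse_pairs(lv_ip):
    # Split on commas, then split each element on the pipe.
    pairs = []
    for element in lv_ip.split(","):
        program_like, _, code = element.partition("|")
        # Assumption: the LIKE pattern is a plain prefix followed by '%'.
        pairs.append((program_like.rstrip("%"), code))
    return pairs

def row_matches(program, some_col, pairs):
    # Equivalent of: (program LIKE 'prefix%' AND some_col = 'code') OR ...
    return any(program.startswith(prefix) and some_col == code
               for prefix, code in pairs)

pairs = parse_pairs("MNS-GC%|CS,MIB-TE%|DC")
print(pairs)
print(row_matches("MNS-GC-001", "CS", pairs))
print(row_matches("MNS-GC-001", "DC", pairs))
```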

Regex: how to get the text between a few colons?

I have a lot of strings like the ones below in my database:
product1:1stparty:single_aduls:android:
product2:3rdparty:married_adults:ios:
product3:3rdparty:other_adults:android:
I need a regex to get only the text after the product name and before the device category. So, in the first line I'd get 1stparty:single_aduls, in the second 3rdparty:married_adults and in the third 3rdparty:other_adults. I'm stuck and can't find a way to solve that. Could anyone help me please?

As a regular expression, you can use:
select regexp_extract('product1:1stparty:single_aduls:android:', '^[^:]*:(.*):[^:]*:$')
This returns everything after the first colon and before the penultimate colon.
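The same pattern can be tried against all three sample rows with Python's `re` module. This is only a sketch for checking the regular expression, not the SQL engine's own function; the variable names are illustrative.

```python
import re

rows = [
    "product1:1stparty:single_aduls:android:",
    "product2:3rdparty:married_adults:ios:",
    "product3:3rdparty:other_adults:android:",
]

# [^:]*: consumes the product name, (.*) captures the middle,
# and :[^:]*:$ consumes the device category with its trailing colon.
pattern = re.compile(r"^[^:]*:(.*):[^:]*:$")

middle = [pattern.search(r).group(1) for r in rows]
print(middle)
```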

We can try using REGEXP_REPLACE here:
SELECT REGEXP_REPLACE(val, r"^.*?:|:[^:]+:$", "") AS output
FROM yourTable;
This approach removes either the leading ...: or trailing :...: from the column, leaving behind the content you want.

You can also use the standard split function and access the elements of the resulting array by index, which is quite clear to read and understand.
with a as (
select split('product1:1stparty:single_aduls:android:', ':') as splitted
)
select splitted[ordinal(2)] || ':' || splitted[ordinal (3)] as subs
from a

Consider the example below:
with your_table as (
select 'product1:1stparty:single_aduls:android:' txt union all
select 'product2:3rdparty:married_adults:ios:' union all
select 'product3:3rdparty:other_adults:android:'
)
select *,
(
select string_agg(part, ':' order by offset)
from unnest(split(txt, ':')) part with offset
where offset in (1, 2)
) result
from your_table

How to extract the specific part of string in SQL server?

I have a string ST0023_Lamb_Weston_2017_US in a particular column of a table. When selecting the name I need to get only Lamb_Weston_2017_US. I can use
SELECT SUBSTRING('ST0023_Lamb_Weston_2017_US', 8, 20)
But there will be different names in the column. For example:
ST0023_Lamb_Weston_2017_US
ST0053_PL_Sandbox_Dorgan_US
ST0071_EDA_Austria
ST0071_EDA_Austria
ST10338_Nestle_Soluble_Instant_Cacao_ES
Those are the different names available. I need to remove the "ST" part and the number up to the first underscore, and return the name alone. Please help me with this.

Inside the SUBSTRING function, use CHARINDEX to find the position of the first underscore. Adding one to that position skips the underscore itself, and passing LEN(value) as the length takes everything up to the end of the string.
create table data
(
value varchar(100)
)
insert into data
select 'ST0023_Lamb_Weston_2017_US' union
select 'ST0053_PL_Sandbox_Dorgan_US' union
select 'ST0071_EDA_Austria' union
select 'ST0071_EDA_Austria' union
select 'ST10338_Nestle_Soluble_Instant_Cacao_ES'
go
select value, SUBSTRING(value, CHARINDEX('_',value)+1 , LEN(value)) 'Newvalue' from data
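The position arithmetic is easy to verify in a few lines of Python. This sketch mirrors the CHARINDEX + 1 logic with `str.find`; the function name is made up. When there is no underscore, `find` returns -1, so the slice starts at 0 and the whole value comes back, which matches what CHARINDEX returning 0 does in the query above.

```python
values = [
    "ST0023_Lamb_Weston_2017_US",
    "ST0053_PL_Sandbox_Dorgan_US",
    "ST0071_EDA_Austria",
    "ST10338_Nestle_Soluble_Instant_Cacao_ES",
]

def name_after_first_underscore(value):
    # Start one character past the first underscore and keep the rest.
    return value[value.find("_") + 1:]

new_values = [name_after_first_underscore(v) for v in values]
print(new_values)
```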

How to use regexp_substr() with group of delimiter characters?

I have a string like 'SERO02~~~NA_#ERO5'. I need to split it on the delimiter ~~~, so that I get SERO02 and NA_#ERO5 as the result.
I created a regex expression like this:
select regexp_substr('SERO02~~~NA_#ERO5' ,'[^~~~]+',1,2) from dual;
It worked fine and returned NA_#ERO5.
But if I change the string to ERO02~NA_#ERO5, the result is still the same.
I expect the expression to return nothing, since the delimiter ~~~ is not found in that string. Can someone help me create the correct expression?

[^~~~] matches a single character that is not one of the characters following the caret in the square brackets. Since all those characters are identical, [^~~~] is the same as [^~].
You can match it using:
SELECT REGEXP_SUBSTR(
'SERO02~~~NA_#ERO5',
'~~~(.*?)(~~~|$)',
1,
1,
NULL,
1
)
FROM DUAL;
This will match ~~~ and then store zero or more characters in a capture group (the round brackets () indicate a capture group) until it finds either ~~~ or the end of the string. It will then return the first capture group.
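To see that this pattern really requires the full three-character delimiter, here is a Python sketch using the same regular expression. The function name is hypothetical, and `None` plays the role of the NULL that the database would return.

```python
import re

def after_delimiter(text):
    # '~~~' must appear literally; (.*?) then captures lazily up to the
    # next '~~~' or the end of the string.
    m = re.search(r"~~~(.*?)(~~~|$)", text)
    return m.group(1) if m else None

print(after_delimiter("SERO02~~~NA_#ERO5"))  # NA_#ERO5
print(after_delimiter("ERO02~NA_#ERO5"))     # None, a single '~' is not the delimiter
```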

You can do it without regular expressions, with a bit of logic:
with test(text) as ( select 'SERO02~~~NA_#ERO5' from dual)
select case
when instr(text, '~~~') != 0 then
substr(text, instr(text, '~~~') + 3)
else
null
end
from test
This will give the part of the string after '~~~', if it exists, null otherwise.
You can edit the ELSE part to get what you need when the input string does not contain '~~~'.
Even with regexp, to match the string '~~~' you need to write it exactly, without []; the [] is used to list a set of characters, so [aaaaa] is exactly the same as [a], while [abc] means 'a' OR 'b' OR 'c'.
With regexp, even though it is not necessary, one way could be the following:
substr(regexp_substr(text, '~~~.*'), 4)

In case you want all elements, this also handles NULL elements:
SQL> with tbl(str) as (
select 'SERO02~~~NA_#ERO5' from dual
)
select regexp_substr(str, '(.*?)(~~~|$)', 1, level, null, 1) element
from tbl
connect by level <= regexp_count(str, '~~~') + 1;
ELEMENT
-----------------
SERO02
NA_#ERO5
SQL>
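For comparison, the same "all elements" split is a one-liner in Python. This sketch is an analogue of the query above, not Oracle code, and the function name is made up. Empty elements come back as empty strings where the query would return NULL.

```python
def split_elements(text, delimiter="~~~"):
    # One element per delimiter occurrence plus one, the same count the
    # query gets from regexp_count(str, '~~~') + 1.
    return text.split(delimiter)

print(split_elements("SERO02~~~NA_#ERO5"))
print(split_elements("A~~~~~~B"))        # middle element is empty
print(split_elements("ERO02~NA_#ERO5"))  # no '~~~', so a single element
```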

Stripping a specific part of variable in SQL

I have a variable in which I dynamically store a URL.
For example: a="https://myaccessdev.searshc.com/aveksa/main?Oid=1&ReqType=GetPage&ObjectClass=com.aveksa.gui.objects.workflow.GuiWorkflowJob&WFObjectID=3478:WPDS"
I want to strip out the number at the end, just before ":WPDS", from this whole string.
Is there any way of doing it in SQL?

I want to strip the number [...] before ":WPDS" from this whole string.
This is a perfect match for REGEXP_REPLACE:
with test_data as
(select 'https://myaccessdev.searshc.com/aveksa/main?Oid=1&ReqType=GetPage&ObjectClass=com.aveksa.gui.objects.workflow.GuiWorkflowJob&WFObjectID=3478:WPDS' str
from dual)
select REGEXP_REPLACE(str, '.*=([0-9]*):WPDS.*', '\1') from test_data
-- '.*=([0-9]*):WPDS.*' : replace everything before and after the target string
-- '\1'                 : by the first capturing group (i.e. the sequence between
--                        parentheses in the regular expression)
Producing:
3478
See http://sqlfiddle.com/#!4/d41d8/37095 for a live demo.
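The same substitution can be reproduced with Python's `re.sub` as a quick check of the regular expression. This is a sketch, not the Oracle call itself, and the variable names are illustrative. As with REGEXP_REPLACE, a string that does not match the pattern comes back unchanged.

```python
import re

url = ("https://myaccessdev.searshc.com/aveksa/main?Oid=1&ReqType=GetPage"
       "&ObjectClass=com.aveksa.gui.objects.workflow.GuiWorkflowJob"
       "&WFObjectID=3478:WPDS")

# The greedy .* runs up to the last '=' that is followed by digits and ':WPDS';
# the whole match is replaced by the captured digits.
number = re.sub(r".*=([0-9]*):WPDS.*", r"\1", url)
print(number)  # 3478
```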

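If you just want the number, you can nest two regexp_substr() calls. I haven't been able to test this, but the inner call isolates '=3478:WPDS' and the outer call pulls the digits out of it (using [0-9]+ so that it does not match an empty string in front of the '='):
select regexp_substr(regexp_substr(s, '=[0-9]*:WPDS', 1, 1), '[0-9]+', 1, 1)
The inner call on its own comes quite close, returning the number together with its surrounding delimiters:
select regexp_substr(s, '=[0-9]*:WPDS', 1, 1)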

Try this:
DECLARE @string varchar(255) = 'https://myaccessdev.searshc.com/aveksa/main?Oid=1&ReqType=GetPage&ObjectClass=com.aveksa.gui.objects.workflow.GuiWorkflowJob&WFObjectID=3478:WPDS'
DECLARE @end varchar(10) = REVERSE(SUBSTRING(REVERSE(@string),0,CHARINDEX('=', REVERSE(@string))))
SELECT SUBSTRING(@end,0,CHARINDEX(':', @end))
