REGEX get all matched patterns by SQL DB2

Hi all.
I need to extract from a string every substring that matches the regex pattern "TTT\d{3}".
For the example string below, I would like to get:
TTT108,TTT109,TTT111,TTT110
The DB2 function I would like to use is REGEXP_REPLACE(str,'REGEX pattern', ',').
The number of matches can be 0,1,2,3... in each string.
Thank you.
The example:
TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)

If you want to extract the valid tokens instead of replacing the invalid characters, check whether this helps:
with data (s) as (values
('TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)')
)
select listagg(sst,', ') within group (order by n)
from (
select n,
regexp_substr(s,'(TTT[0-9][0-9][0-9])', 1, n)
from data
cross join (values (1),(2),(3),(4),(5)) x (n) -- any numbers table
where n <= regexp_count(s,'(TTT[0-9][0-9][0-9])')
) x (n,sst)
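The fixed list of values (1) to (5) above caps the result at five tokens per string. As a rough sketch, assuming a Db2 version that supports the REGEXP_ functions used above, the numbers table could instead be generated with a recursive common table expression. The name nums and the upper bound of 50 are made up for illustration.

```sql
with data (s) as (values
  ('TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional)')
),
-- hypothetical numbers table holding 1..50, generated recursively
nums (n) as (
  values (1)
  union all
  select n + 1 from nums where n < 50
)
-- one row per match (n-th occurrence), aggregated back into a single list
select listagg(regexp_substr(s, 'TTT[0-9][0-9][0-9]', 1, n), ',')
         within group (order by n) as tokens
from data
join nums on n <= regexp_count(s, 'TTT[0-9][0-9][0-9]')
```

As in the query above, a string with no match at all produces no joined rows. With several input rows you would also group by a row identifier.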

For any number of tokens & Db2 versions before 11.1:
select id, listagg(tok, ',') str
from
(
values
(1, 'TTT108(optional);TTT109(optional);TTT111(optional);TTT110optional);ENTITYLIST_2=(optional);ENTITYLIST_3=(optional);Containment_Status=(optional)')
) mytable (id, str)
, xmltable
(
'for $id in tokenize($s, ";") let $new := replace($id, "(TTT\d{3}).*", "$1") where matches($id, "(TTT\d{3}).*") return <i>{string($new)}</i>'
passing mytable.str as "s"
columns tok varchar(6) path '.'
) t
group by id;

Related

Remove Substring according to specific pattern

In a SQL Server database, I need to remove a substring that matches a pattern:
Before: Winter_QZ6P91712017_115BPM
After: Winter_115BPM
Or
Before: cpx_Note In My Calendar_QZ6P91707044
After: cpx_Note In My Calendar
Basically delete the substring that has pattern _ + 12 chars.
I've tried PatIndex('_\S{12}', myCol) to get the index of the substring but it doesn't match anything.
Assuming you mean underscore followed by 12 characters that are not underscores you can use this pattern:
SELECT *,
CASE WHEN PATINDEX('%[_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_]%', str) > 0
THEN STUFF(str, PATINDEX('%[_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_][^_]%', str), 13, '')
ELSE str
END
FROM (VALUES
('Winter_QZ6P91712017_115BPM'),
('Winter_115BPM_QZ6P91712017')
) AS tests(str)
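One caveat with the pattern above: [^_] also matches spaces and letters, so in the question's second example the first underscore in 'cpx_Note In My Calendar_QZ6P91707044' is already followed by 12 non-underscore characters ('Note In My C') and would be matched first. A possible stricter variant, assuming the 12-character code consists only of letters and digits (the question does not state this), builds the pattern with REPLICATE:

```sql
SELECT s.str,
       CASE WHEN p.pos > 0
            THEN STUFF(s.str, p.pos, 13, '')  -- drop '_' plus the 12-character code
            ELSE s.str
       END AS cleaned
FROM (VALUES
        ('Winter_QZ6P91712017_115BPM'),
        ('cpx_Note In My Calendar_QZ6P91707044')
     ) AS s(str)
-- assumption: the code is alphanumeric, so spaces inside a name cannot match
CROSS APPLY (SELECT PATINDEX('%[_]' + REPLICATE('[A-Z0-9]', 12) + '%', s.str) AS pos) AS p;
```

Under this assumption the two rows should come back as 'Winter_115BPM' and 'cpx_Note In My Calendar'.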
Late to the party, but you could also use the newer STRING_SPLIT function to split the string on underscores (and then on spaces) and check the length of each segment. Any segment of 12 or more characters is then removed from the original string by applying REPLACE recursively.
drop table if exists Tbl;
drop table if exists #temptable;
create table Tbl (input nvarchar(max));
insert into Tbl VALUES
('Winter_QZ6P91712017_115BPM'),
('cpx_Note In My Calendar_QZ6P91707044'),
('stuff_asdasd_QZ6P91712017'),
('stuff_asdasd_QZ6P91712017_stuff_asdasd_QZ6P91712017'),
('stuff_asdasd_QZ6P917120117_stuff_asdasd_QZ6P91712017');
select
input, value as replacethisstring,
rn = row_number() over (partition by input order by (select 1))
into #temptable
from
(
select
input,value as hyphensplit
from Tbl
cross apply string_split(input,'_')
)T cross apply string_split(hyphensplit,' ')
where len(value)>=12
; with cte as (
select input, inputtrans= replace(input,replacethisstring,''), level=1 from #temptable where rn=1
union all
select T.input,inputtrans=replace(cte.inputtrans,T.replacethisstring,''),level=level+1
from cte inner join #temptable T on T.input=cte.input and rn=level+1
--where level=rn
)
select input, inputtrans
from (
select *, rn=row_number() over (partition by input order by level desc) from cte
) T where rn=1
SQL Server doesn't support Regex. Considering, however, you just want to remove the first '_' and the 12 characters afterwards, you could use CHARINDEX to find the location of said underscore, and then STUFF to remove the 13 characters:
SELECT V.YourString,
STUFF(V.YourString, CHARINDEX('_',V.YourString),13,'') AS NewString
FROM (VALUES('Winter_QZ6P91712017_115BPM'))V(YourString);

Split string in bigquery

I have the following string that I would like to split into rows.
Example values in my column are:
['10000', '10001', '10002', '10003', '10004']
Using the SPLIT function, I get the following result:
I have two questions:
How do I split it so that I get '10000' instead of ['10000'?
How do I remove the apostrophe (')?
Response:
Consider the example below:
with t as (
select ['10000', '10001', '10002', '10003', '10004'] col
)
select cast(item as int64) num
from t, unnest(col) item
The above assumes that col is an array. If it is a string, use the following instead:
with t as (
select "['10000', '10001', '10002', '10003', '10004']" col
)
select cast(trim(item, " '[]") as int64) num
from t, unnest(split(col)) item
Both queries return one row per element, with num running from 10000 to 10004.
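For the string case, another option is to pull out the digit runs directly rather than splitting and trimming. This is only a sketch, assuming every item is a plain non-negative integer; it uses BigQuery's REGEXP_EXTRACT_ALL, which returns an array of all matches.

```sql
with t as (
  select "['10000', '10001', '10002', '10003', '10004']" col
)
-- each run of digits becomes one array element, then one row
select cast(item as int64) num
from t, unnest(regexp_extract_all(col, r'\d+')) item
```

This sidesteps both the brackets and the apostrophes, at the cost of ignoring anything in the string that is not a digit.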

Convert comma separated string into rows

I have a comma separated string.
Now I'd like to separate this string value into each row.
Input:
1,2,3,4,5
Required output:
value
----------
1
2
3
4
5
How can I achieve this in sql?
Thanks in advance.
Use the STRING_SPLIT function if you are on SQL Server
SELECT value
FROM STRING_SPLIT('1,2,3,4,5', ',')
Otherwise (for example in MySQL, which provides SUBSTRING_INDEX()), you can loop over the SUBSTRING_INDEX() function and insert each substring into a temporary table.
If you are using Postgres, you can use string_to_array and unnest:
select *
from unnest(string_to_array('1,2,3,4,5', ',')) as t(value);
In Postgres, you can also use the 'regexp_split_to_table()' function.
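A minimal sketch of that variant, using the same literal input as the question:

```sql
-- the second argument is a regular expression; a plain comma needs no escaping
select regexp_split_to_table('1,2,3,4,5', ',') as value;
```

Because the delimiter is a regular expression, a pattern such as '\s*,\s*' could also absorb whitespace around the commas.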
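If you're using MariaDB or MySQL, you can use a recursive CTE such as:
with recursive itemtable as (
  -- anchor: peel off the first item and keep the remainder in data
  select
    trim(substring_index(data, ',', 1)) as value,
    right(data, length(data) - locate(',', data, 1)) as data
  from (select '1,2,3,4,5' as data) as input
  union
  -- recursive step: repeat on the remainder; UNION (not UNION ALL) drops the
  -- repeated last row, which ends the recursion
  select
    trim(substring_index(data, ',', 1)) as value,
    right(data, length(data) - locate(',', data, 1)) as data
  from itemtable
)
select value from itemtable;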

How to parse data inside CDATA in oracle

There is a column in the table with type CLOB and it stores the payloads.
I need to get the value of tenantId (7 in this case). How do I write a query to get it? I can't use substr or instr, since the position of tenantId changes from record to record.
<PayLoad><![CDATA[{"order":{"entityErrors":[],"action":"CONFIRM","tenantId":"7","Id":"2","deliveryReservationDetails":[{"reservationId":"c05e0c77-1c8f-4dce-a388-fe97fd36f96e","fulfillmentLocationType":"Store"}]}}]]></PayLoad>
In Oracle 12 you should be able to do:
SELECT tenantId
FROM your_table t
LEFT OUTER JOIN
XMLTABLE(
'/PayLoad/.'
PASSING XMLTYPE( t.your_xml_column )
COLUMNS cdata CLOB PATH '.'
) x
ON ( 1 = 1 )
LEFT OUTER JOIN
JSON_TABLE(
x.cdata,
'$'
COLUMNS ( tenantId VARCHAR2(10) PATH '$.order.tenantId' )
) j
ON ( 1 = 1 );
(Untested as I'm on 11g for the next few hours)
On Oracle 11:
SELECT REGEXP_SUBSTR( x.cdata, '"tenantId":"((\\"|[^"])*)"', 1, 1, NULL, 1 ) AS tenantId
FROM your_table t
LEFT OUTER JOIN
XMLTABLE(
'/PayLoad/.'
PASSING XMLTYPE( t.your_xml_column )
COLUMNS cdata CLOB PATH '.'
) x
ON ( 1 = 1 )
Or (if the JSON string will not occur in another branch of the XML) you could just use:
SELECT REGEXP_SUBSTR(
your_xml_column,
'"tenantId":"((\\"|[^"])*)"',
1,
1,
NULL,
1
) AS tenantId
FROM your_table
Your CDATA payload appears to be JSON, so if you're on 12c and only have one order per CDATA you can do:
select json_value(x.cdata, '$.order.tenantId' returning number) as tenantid
from your_table t
cross join xmltable('/PayLoad'
passing xmltype(your_clob)
columns cdata clob path './text()'
) x;
As @MTO pointed out, treating the JSON value as a number may not be valid, as it's shown as a string in the raw JSON. Using returning varchar2(10), or some other suitable size, may be more appropriate (and safer, at least until you use it).
Quick demo with your XML as a CTE:
with your_table (your_clob) as (
select to_clob('<PayLoad><![CDATA[{"order":{"entityErrors":[],"action":"CONFIRM","tenantId":"7","Id":"2","deliveryReservationDetails":[{"reservationId":"c05e0c77-1c8f-4dce-a388-fe97fd36f96e","fulfillmentLocationType":"Store"}]}}]]></PayLoad>') from dual
)
select json_value(x.cdata, '$.order.tenantId' returning number) as tenantid
from your_table t
cross join xmltable('/PayLoad'
passing xmltype(your_clob)
columns cdata clob path './text()'
) x;
TENANTID
----------------------
7
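Following the note above about the value being a string in the raw JSON, the same query with a string return type might look like this. It is a sketch only; the size of 10 is an arbitrary choice, and your_table and your_clob are the placeholder names used in the demo.

```sql
-- same extraction, but the value stays a string as it is in the JSON
select json_value(x.cdata, '$.order.tenantId' returning varchar2(10)) as tenantid
from your_table t
cross join xmltable('/PayLoad'
  passing xmltype(t.your_clob)
  columns cdata clob path './text()'
) x;
```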

T-SQL function to split string with two delimiters as column separators into table

I'm looking for a T-SQL function that takes a string like:
a:b,c:d,e:f
and converts it to a table like
ID Value
a b
c d
e f
Everything I found on the Internet handles single-column parsing (e.g. XMLSplit function variations), but none of it lets me describe my string with two delimiters, one for column separation and the other for row separation.
Can you please guide me on this? I have very limited T-SQL knowledge and cannot adapt those ready-made functions into a two-column solution.
You can find a split() function on the web. Then, you can do string logic:
select left(val, charindex(':', val) - 1) as col1,
substring(val, charindex(':', val) + 1, len(val)) as col2
from dbo.split(@str, ',') s(val);
You can use a custom SQL split function to separate the ID and value columns.
Here is a SQL split function that you can use on a development system.
It returns an id value that helps keep each ID and its value together.
You need to split twice: first on ",", then on ":".
declare @str nvarchar(100) = 'a:b,c:d,e:f'
select
  id = max(id),
  value = max(value)
from (
  -- the first part of each pair becomes id, the second becomes value
  select
    rowid,
    id = case when id = 1 then val else null end,
    value = case when id = 2 then val else null end
  from (
    select
      s.id rowid, t.id, t.val
    from (
      -- first split: one row per "id:value" pair
      select * from dbo.Split(@str, ',')
    ) s
    -- second split: break each pair at the colon
    cross apply dbo.Split(s.val, ':') t
  ) k
) m group by rowid
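If a custom function is not an option, a similar result can be sketched with the built-in STRING_SPLIT, which is available from SQL Server 2016 (compatibility level 130). This assumes every pair contains exactly one ':'; STRING_SPLIT does not guarantee the order of the rows it returns.

```sql
DECLARE @str nvarchar(100) = N'a:b,c:d,e:f';

-- split into pairs on ',', then cut each pair at the ':'
SELECT LEFT(s.value, CHARINDEX(':', s.value) - 1) AS ID,
       SUBSTRING(s.value, CHARINDEX(':', s.value) + 1, LEN(s.value)) AS Value
FROM STRING_SPLIT(@str, ',') AS s;
```

A pair without a colon would make LEFT fail with a negative length, so real input may need a filter or a CASE guard.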