SQL Server analog of Oracle's REGEXP_SUBSTR

The Oracle query below takes a comma-separated list of values, like '3,4', and returns its individual tokens, 3 and 4, in separate rows.
Can somebody please show how to do the same in SQL Server?
SELECT REGEXP_SUBSTR('3,4','[^,]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR('3,4', '[^,]+', 1, LEVEL) IS NOT NULL

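The query would use a recursive CTE. I think this is the logic:
with c as (
-- anchor: the whole list is still unprocessed, no token extracted yet
select cast('3,4' as varchar(max)) as rest, cast(NULL as varchar(max)) as val
union all
-- each step removes the first token (and its comma) from rest and returns it as val
select cast(stuff(rest, 1, charindex(',', rest + ','), '') as varchar(max)),
cast(left(rest, charindex(',', rest + ',') - 1) as varchar(max))
from c
where rest <> ''
)
select val
from c
where val is not null;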
I should note that Oracle 12c supports recursive CTEs, which I (at least) find more intuitive than connect by.
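On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function may be all that is needed for a single-character delimiter like this one. A minimal sketch, assuming that version is available:

```sql
-- Returns one row per token in a column named "value".
-- The order of the returned rows is not guaranteed.
SELECT value
FROM STRING_SPLIT('3,4', ',');
```

If the position of each token matters, the recursive CTE keeps that information implicitly through the recursion order, whereas STRING_SPLIT in its basic two-argument form does not.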

Related

Split a String which do not have a delimiter in Oracle

I have been beating my head around a problem. Following is the input string
1034536455702130340053769240340002208520191202134036
What I need to do is split this string into the following
03453645570
03400537692
03400022085
Here, every string that needs to get picked starts with a '03'.
I can do it with PL/SQL code: pick each substring starting from a '03' in a loop, trim the extra characters from the left and right so that only 11 characters remain in each iteration, concatenate the values, and then use REGEXP_SUBSTR to get the desired result. However, this approach involves too much code. Is there a way to achieve this with a single SQL query?
SELECT UPPER (
REGEXP_SUBSTR ('03453645570,03400537692,03400022085',
'[^,]+',
1,
LEVEL))
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('03453645570,03400537692,03400022085',
'[^,]+',
1,
LEVEL)
IS NOT NULL

You can use your existing code with the original input string, and just change the regex to match 03 followed by 9 digits:
SELECT REGEXP_SUBSTR ('1034536455702130340053769240340002208520191202134036',
'03[0-9]{9}',
1,
LEVEL)
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('1034536455702130340053769240340002208520191202134036',
'03[0-9]{9}',
1,
LEVEL)
IS NOT NULL
Output
VAL
03453645570
03400537692
03400022085

Using @Nick's solution, the following is the code I used to apply it to a table of strings. I have tested it on only 1k records, and it takes less than 1 second. I hope it helps.
--Table to store all sorts of strings
INSERT INTO TBL_SCRM_MSISDN
SELECT '0342244357903452274515236320191201091147' NUMBERS FROM DUAL
UNION
SELECT '03457064700420191201124242' FROM DUAL
UNION
SELECT '03414221723620191201130431' FROM DUAL
UNION
SELECT '1034536455702130340053769240340002208520191202134036' FROM DUAL;
-- Table used to store unique values
create table TBL_MSISDN
(
msisdn VARCHAR2(500)
);
--Using a loop to evaluate a single value one at a time
BEGIN
EXECUTE IMMEDIATE 'TRUNCATE TABLE TBL_MSISDN';
FOR C IN
(
SELECT * FROM TBL_SCRM_MSISDN
)
LOOP
INSERT INTO TBL_MSISDN
SELECT REGEXP_SUBSTR (C.NUMBERS,
'03[0-9]{9}',
1,
LEVEL)
AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR (C.NUMBERS,
'03[0-9]{9}',
1,
LEVEL) is not null;
END LOOP;
commit;
END;
/
SELECT * FROM TBL_MSISDN WHERE MSISDN IS NOT NULL;
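On Oracle 12c or later, the loop can probably be replaced by one set-based statement. This is a minimal sketch, assuming the same two tables as above and that LATERAL inline views are available; the aliases t, n and lvl are names made up for the illustration:

```sql
-- One row per '03' + 9 digits match, for every source string, without PL/SQL.
INSERT INTO TBL_MSISDN (msisdn)
SELECT REGEXP_SUBSTR(t.NUMBERS, '03[0-9]{9}', 1, n.lvl)
FROM TBL_SCRM_MSISDN t
CROSS JOIN LATERAL (
    -- generate 1..(number of matches) for the current row only
    SELECT LEVEL AS lvl
    FROM DUAL
    CONNECT BY LEVEL <= REGEXP_COUNT(t.NUMBERS, '03[0-9]{9}')
) n
-- CONNECT BY always returns LEVEL = 1, so skip strings with no match at all
WHERE REGEXP_COUNT(t.NUMBERS, '03[0-9]{9}') > 0;
```

Because no NULL rows are inserted, the final SELECT would no longer need the IS NOT NULL filter.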

Try the query below, which assumes the tokens are separated by '$':
select trim(regexp_substr('1$03453645570213$034005376924$0340002208520191202134$036','[^$]+', 1, level) ) value, level
from dual
connect by regexp_substr('1$03453645570213$034005376924$0340002208520191202134$036', '[^$]+', 1, level) is not null
order by level;

Apply order by in comma separated string in oracle

One of the columns in my Oracle table holds the value below:
select csv_val from my_table where date='09-OCT-18';
output
==================
50,100,25,5000,1000
I want these values in ascending order using a select query, so the output would look like:
output
==================
25,50,100,1000,5000
I tried the approach from this link, but it looks like it has some restriction on the number of digits.

Here, I made you a modified version of the answer you linked to that can handle an arbitrary (hardcoded) number of commas. It's pretty heavy on CTEs. As with most LISTAGG answers, it'll have a 4000-char limit. I also changed your regexp to be able to handle null list entries, based on this answer.
WITH
T (N) AS --TEST DATA
(SELECT '50,100,25,5000,1000' FROM DUAL
UNION
SELECT '25464,89453,15686' FROM DUAL
UNION
SELECT '21561,68547,51612' FROM DUAL
),
nums (x) as -- arbitrary limit of 20, can be changed
(select level from dual connect by level <= 20),
splitstr (N, x, substring) as
(select N, x, regexp_substr(N, '(.*?)(,|$)', 1, x, NULL, 1)
from T
inner join nums on x <= 1 + regexp_count(N, ',')
order by N, x)
select N, listagg(substring, ',') within group (order by to_number(substring)) as sorted_N
from splitstr
group by N
;
Probably it can be improved, but eh...

Based on the sample data you posted, a relatively simple query would work (you need lines 3 - 7). If the data doesn't really look like that, the query might need adjustment.
SQL> with my_table (csv_val) as
2 (select '50,100,25,5000,1000' from dual)
3 select listagg(token, ',') within group (order by to_number(token)) result
4 from (select regexp_substr(csv_val, '[^,]+', 1, level) token
5 from my_table
6 connect by level <= regexp_count(csv_val, ',') + 1
7 );
RESULT
-------------------------
25,50,100,1000,5000
SQL>
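One adjustment that is likely to be needed: when my_table has more than one row, a plain CONNECT BY LEVEL over the table multiplies rows across the whole table. A minimal sketch for that case, assuming Oracle 12c or later (for LATERAL) and a hypothetical id column that identifies each row:

```sql
with my_table (id, csv_val) as (
    select 1, '50,100,25,5000,1000' from dual union all
    select 2, '7,3,5' from dual
)
select t.id,
       listagg(x.token, ',') within group (order by to_number(x.token)) as result
from my_table t
cross join lateral (
    -- split only the current row's list into tokens
    select regexp_substr(t.csv_val, '[^,]+', 1, level) as token
    from dual
    connect by level <= regexp_count(t.csv_val, ',') + 1
) x
group by t.id;
```

Each list is split and re-aggregated independently, so the numeric sort applies per row.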

How to pass multiple string values through bind variable by comma separated oracle sql

SELECT a.gl_account, g.gl
from et_bp_gl_account a,et_bp_gl g
where a.gl_id=g.gl_id
and g.gl in (select replace(:P117_GL,':',',') from et_bp_gl )
This is the code I use to pass multiple values through the bind variable, for example (Asset Mg:Finance). The subquery is supposed to return (Asset Mg,Finance) by replacing ':' with ',', but it doesn't work and returns
no data found
I am using Oracle SQL.

Apex, eh? That's either a shuttle item or a select item that allows multiple selection. Anyway, you should split that colon-separated list into rows, something like this:
SELECT a.gl_account, g.gl
from et_bp_gl_account a,et_bp_gl g
where a.gl_id=g.gl_id
and g.gl in (select regexp_substr(:P117_GL, '[^:]+', 1, level)
from dual
connect by level <= regexp_count(:P117_GL, ':') + 1
)

No need for regular expression
select * from table(apex_string.split('1:2:3',':'));
So your query might look like
SELECT a.gl_account, g.gl
from et_bp_gl_account a,et_bp_gl g
where a.gl_id=g.gl_id
and g.gl in (select column_value
from table(apex_string.split(:P117_GL,':'))
)
It wouldn't surprise me if this could be simplified further

SQL - Point differences between two lists

Given two comma separated (un-ordered) lists of numbers, I want to extract only the differences between them (using regexp probably).
e.g.:
select
'1010484,1025781,1051394,1069679' as list_1,
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
What I wish for is a result such as:
l1_additional_data: 1069679
l2_additional_data: 1005923,1034010,1044261,1048311
How can this be done?
I'm using Vertica, BTW - That means that no hierarchic ("connect by") queries could be used here.
Thanks in advance!

There's a relevant post that will be helpful: Splitting string into multiple rows in Oracle.
I don't know Vertica, but based on Oracle you could go with:
with list1 as
(
select
regexp_substr(list_1 ,'[^,]+', 1, level) as list_1_rows
from (
select
'1010484,1025781,1051394,1069679' as list_1
from dual)
connect by
regexp_substr(list_1 ,'[^,]+', 1, level) is not null),
list2 as (select
regexp_substr(list_2 ,'[^,]+', 1, level) as list_2_rows
from (
select
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
from dual)
connect by regexp_substr(list_2 ,'[^,]+', 1, level) is not null)
select * from list1
full outer join list2
on list1.list_1_rows = list2.list_2_rows
where list_1_rows is null or list_2_rows is null

OK, Here's my solution - But it's not very efficient, and it probably won't scale (in terms of performance):
WITH lists AS
(SELECT '1010484,1025781,1051394,1069679' AS list_1, '1005923,1010484,1025781,1034010,1044261,1048311,1051394' AS list_2 )
, numbers AS
(SELECT row_number() over() i
FROM system_columns limit 100)
SELECT group_concat(parsed_code_1) list_1_additions,
group_concat(parsed_code_2) list_2_additions
FROM(SELECT parsed_code_1
FROM(SELECT split_part(list_1, ',', i) parsed_code_1
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_1, ',')+1) l
WHERE parsed_code_1 IS NOT NULL) a
FULL OUTER JOIN
(SELECT parsed_code_2
FROM(SELECT split_part(list_2, ',', i) parsed_code_2
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_2, ',')+1) l
WHERE parsed_code_2 IS NOT NULL) b ON(parsed_code_1 = parsed_code_2)
WHERE parsed_code_1 IS NULL OR parsed_code_2 IS NULL

How to remove duplicates from space separated list by Oracle regexp_replace? [duplicate]

I have a list, 'A B A A C D'. My expected result is 'A B C D'. So far, from the web, I have found the expression
regexp_replace(l_user ,'([^,]+)(,[ ]*\1)+', '\1');
but this is for a comma-separated list. What modification is needed to make it work for a space-separated list? The order of the result does not matter.

If I understand correctly, you don't simply need to replace ',' with a space, but also to remove duplicates in a smarter way.
If I modify that expression to work with space instead of ',', I get
select regexp_replace('A B A A C D' ,'([^ ]+)( [ ]*\1)+', '\1') from dual
which gives 'A B A C D', not what you need.
A way to get your needed result could be the following, a bit more complicated:
with string(s) as ( select 'A B A A C D' from dual)
select listagg(case when rn = 1 then str end, ' ') within group (order by lev)
from (
select str, row_number() over (partition by str order by lev) rn, lev
from (
SELECT trim(regexp_substr(s, '[^ ]+', 1, level)) str,
level as lev
FROM string
CONNECT BY instr(s, ' ', 1, level - 1) > 0
)
)
My main problem here is that I'm not able to build a regexp that checks for non-adjacent duplicates, so I need to split the string, check for duplicates, and then aggregate the non-duplicated values again, keeping the order.
If you don't mind the order of the tokens in the result string, this can be simplified:
with string(s) as ( select 'A B A A C D' from dual)
select listagg(str, ' ') within group (order by 1)
from (
SELECT distinct trim(regexp_substr(s, '[^ ]+', 1, level)) as str
FROM string
CONNECT BY instr(s, ' ', 1, level - 1) > 0
)

Assuming you want to keep the component strings in the order of their first occurrence (and not, say, reorder them alphabetically - your example is poorly chosen in this regard, because both lead to the same result), the problem is more complicated, because you must keep track of order too. Then for each letter you must keep just the first occurrence - here is where row_number() helps.
with
inputs ( str ) as ( select 'A B A A C D' from dual)
-- end test data; solution begins below this line
select listagg(token, ' ') within group (order by id) as new_str
from (
select level as id, regexp_substr(str, '[^ ]+', 1, level) as token,
row_number() over (
partition by regexp_substr(str, '[^ ]+', 1, level)
order by level ) as rn
from inputs
connect by regexp_substr(str, '[^ ]+', 1, level) is not null
)
where rn = 1
;

Xquery?
select xmlquery('string-join(distinct-values(ora:tokenize(.," ")), " ")' passing 'A B A A C D' returning content) result from dual
