simpler way to parse using regex

I have an input of the form 'ABCD 3/1'.
I need to extract the digit before the '/'. If the input does not match this pattern, the query should return the original string unchanged.
I am using the query below, which works, but I believe this can be done with a single regex. Any hints appreciated.
select nvl(REGEXP_substr(REGEXP_substr('ABCD 3/1', '\d\/'), '\d'), 'ABCD 3/1') from dual;

What about this? I believe it meets your requirements: REGEXP_REPLACE returns the input unchanged when the pattern does not match, so no NVL is needed. Add more test cases to the WITH clause as you see fit.
SQL> with tbl(str) as (
select 'ABCD 3/1' from dual union
select 'ABCD 332/1' from dual union
select 'ABCD A/1' from dual union
select 'ABCD EFS' from dual
)
select regexp_replace(str, '.*\s(\d)/\d.*', '\1') digit_before_slash
from tbl;
DIGIT_BEFORE_SLASH
-----------------------------------------------------------------------------
3
ABCD 332/1
ABCD A/1
ABCD EFS
SQL>

You can try REGEXP_REPLACE, capturing each part of the input string in its own group and then picking only the part you want. For example, given this:
SQL> select regexp_replace('ABCD 3/1', '([A-Z]*)( )(\d)(\/)(\d)', '1:\1, 2:\2, 3:\3, 4:\4, 5:\5') from dual ;
REGEXP_REPLACE('ABCD3/1','
--------------------------
1:ABCD, 2: , 3:3, 4:/, 5:1
You can use '\3' to return only the third capture group:
SQL> select regexp_replace('ABCD 3/1', '([A-Z]*)( )(\d)(\/)(\d)', '\3') from dual ;
R
-
3
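
As a minimal sketch of how this behaves on non-matching input (assuming Oracle 11gR2 or later for the column list in the WITH clause): REGEXP_REPLACE returns the source string unchanged when the pattern does not match, so anchoring the pattern with ^ and $ gives the "digit or original string" behaviour the question asks for. The sample rows and the names tbl and digit_before_slash are made up for illustration.

```sql
-- Illustrative rows only; tbl and digit_before_slash are hypothetical names.
WITH tbl (str) AS (
  SELECT 'ABCD 3/1' FROM dual UNION ALL
  SELECT 'ABCD EFS' FROM dual
)
SELECT str,
       -- ^ and $ force the whole string to match, so a partial match
       -- cannot leave stray characters around the captured digit.
       REGEXP_REPLACE(str, '^[A-Z]* (\d)/\d$', '\1') AS digit_before_slash
FROM tbl;
```

For 'ABCD 3/1' this yields 3; 'ABCD EFS' does not match the pattern and comes back unchanged.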

Related

How to print the sequence to nth length? [duplicate]

I would like to know how to achieve the same functionality as REPEAT() in SQL*Plus. For example consider this problem: display the character '*' as many times as the value specified by an integer attribute specified for each entry in a given table.

Nitpicking: SQL*Plus doesn't have any feature for that. The database server (Oracle) provides the ability to execute SQL and has such a function:
You are looking for rpad()
select rpad('*', 10, '*')
from dual;
will output
**********
More details can be found in the manual: https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions159.htm#SQLRF06103
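
The question asks for a repeat count taken from an integer column, so here is a minimal sketch of that case. The inline table t and its columns label and n are hypothetical stand-ins for the real table.

```sql
-- Hypothetical sample data: label and n are made-up column names.
WITH t (label, n) AS (
  SELECT 'x', 3 FROM dual UNION ALL
  SELECT 'y', 5 FROM dual
)
SELECT label,
       RPAD('*', n, '*') AS stars   -- pad '*' with '*' up to a total length of n
FROM t;
```

Row x gets *** and row y gets *****.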

For single characters, the accepted answer works fine.
However, if the string to repeat has multiple characters, you need to combine RPAD with the LENGTH function, like this:
WITH t (str) AS
(
SELECT 'a'
FROM DUAL
UNION ALL SELECT 'abc'
FROM DUAL
UNION ALL SELECT '123'
FROM DUAL
UNION ALL SELECT '#+-'
FROM DUAL
)
SELECT RPAD(str, 5*LENGTH(str), str) repeated_5_times
FROM t;
Output:
REPEATED_5_TIMES
---------------
aaaaa
abcabcabcabcabc
123123123123123
#+-#+-#+-#+-#+-

How to get first string after character Oracle SQL

I'm trying to get the first string after a delimiter.
For example, given
ABCDEF||GHJ||WERT
I need only
GHJ
I tried to use the REGEXP functions but couldn't get it to work.
Can anyone help me with this, please?
Thank you

Somewhat simpler:
SQL> select regexp_substr('ABCDEF||GHJ||WERT', '\w+', 1, 2) result from dual;
RES
---
GHJ
SQL>
which reads as: give me the 2nd word out of that string. Won't work properly if GHJ consists of several words (but that's not what your example suggests).

Here is how I would do it, treating the vertical bar as the separator. The example is for Oracle Database.
-- pattern       --> [^|]+ is a negated character class: one or more characters that are not |
-- 3rd parameter --> starting position
-- 4th parameter --> nth occurrence
WITH tbl(str) AS
(SELECT 'ABCDEF||GHJ||WERT' str FROM dual)
SELECT regexp_substr(str
,'[^|]+'
,1
,2) output
FROM tbl;

I think the most general solution is:
WITH tbl(str) AS (
SELECT 'ABCDEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABC|DEF||GHJ||WERT' str FROM dual UNION ALL
SELECT 'ABClDEF||GHJ||WERT' str FROM dual
)
SELECT regexp_replace(str, '^.*\|\|(.*)\|\|.*', '\1')
FROM tbl;
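Note that this works even if the individual elements contain punctuation or a single vertical bar, which the other solutions do not handle. Because the leading .* is greedy, the captured group is the text between the last two || delimiters, so this returns the second element only when the string contains exactly two delimiters, as in the samples above.
Presumably, the double vertical bar is being used for maximum flexibility.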

You can use the regexp_substr function with a capture group:
select regexp_substr('ABCDEF||GHJ||WERT ', '\|{2}([^|]+)', 1, 1, 'i', 1) str
from dual;
STR
---
GHJ
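
The subexpression argument (the sixth parameter, available from Oracle 11g) can also be combined with a non-greedy match so that elements containing a single vertical bar are handled. The following is only a sketch under that assumption; the alias second_element and the sample rows are illustrative.

```sql
-- Illustrative rows only; second_element is a hypothetical alias.
WITH tbl (str) AS (
  SELECT 'ABCDEF||GHJ||WERT'  FROM dual UNION ALL
  SELECT 'ABC|DEF||GHJ||WERT' FROM dual
)
SELECT str,
       -- (.*?)     : shortest run of characters, captured as group 1
       -- (\|\||$)  : followed by the || delimiter or the end of the string
       -- 1, 2      : start at position 1, take the 2nd occurrence
       -- NULL, 1   : no match flags, return capture group 1
       REGEXP_SUBSTR(str, '(.*?)(\|\||$)', 1, 2, NULL, 1) AS second_element
FROM tbl;
```

Both rows should return GHJ, because the second occurrence is always the element that follows the first || delimiter, regardless of how many delimiters come after it.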

Insert character between string Oracle SQL

I need to insert a character between the characters of a string in Oracle SQL.
Example:
ABC will be A,B,C
DEFG will be D,E,F,G
The following question only covers inserting a single character into a string:
Oracle insert character into a string

Edit: As others have pointed out, Oracle does not support the lookahead used in my previous regex. So my approach is to match every character, append a comma to each one, and then remove the final comma.
WITH regex AS (SELECT REGEXP_REPLACE('ABC', '(.)', '\1,') as reg FROM dual) SELECT SUBSTR(reg, 1, length(reg)-1) FROM regex;
Note that the RTRIM-based solutions can give wrong results if the input string itself ends with a comma that you want to keep.
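
To make the difference between the two clean-up steps concrete, here is a small sketch using a made-up input, 'AB,', whose trailing comma is part of the data. The column aliases are illustrative.

```sql
-- 'AB,' is a made-up input whose trailing comma is part of the data.
WITH regex AS (
  SELECT REGEXP_REPLACE('AB,', '(.)', '\1,') AS reg FROM dual
)
SELECT reg,
       SUBSTR(reg, 1, LENGTH(reg) - 1) AS with_substr,  -- drops exactly one trailing comma
       RTRIM(reg, ',')                 AS with_rtrim    -- drops every trailing comma
FROM regex;
```

Here reg is 'A,B,,,'; with_substr is 'A,B,,' (the original comma is still there, separated from B), while with_rtrim is 'A,B' and the original comma is lost.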
Previous solution: (Not working on Oracle)
Check if this does the trick:
SELECT REGEXP_REPLACE('ABC', '(.)(?!$)', '\1,') FROM dual;
It replaces every character except the last one with the same character followed by a comma.
To see how regexp_replace works, I recommend: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions130.htm

SELECT rtrim(REGEXP_REPLACE('ABC', '(.)', '\1,'),',') "REGEXP_REPLACE" FROM dual;

You could do it using:
REGEXP_REPLACE
RTRIM
For example,
SQL> WITH sample_data AS(
SELECT 'ABC' str FROM dual UNION ALL
SELECT 'DEFG' str FROM dual UNION ALL
SELECT 'XYZ' str FROM dual
)
-- end of sample_data mimicking a real table
SELECT str,
rtrim(regexp_replace(str, '(\w?)', '\1,'),',') new_str
FROM sample_data;
STR  NEW_STR
---- ----------
ABC  A,B,C
DEFG D,E,F,G
XYZ  X,Y,Z

Since Oracle regexes do not support lookarounds, there is no way to say "not at the end of the string" in the pattern. Instead, you may use
SELECT REGEXP_REPLACE(
REGEXP_REPLACE('ABC', '([^,])([^,])','\1,\2'),
'([^,])([^,])',
'\1,\2')
AS Result from dual
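The point here is to apply REGEXP_REPLACE with the ([^,])([^,]) pattern twice. Each match consumes two characters, so the second character of one match cannot also start the next match; the second pass inserts the commas that the first pass skipped.
The ([^,])([^,]) pattern captures any non-comma character into Group 1 (\1) and the following non-comma character into Group 2 (\2), and the replacement inserts a comma between them.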

Find the accent data in table records

In a table, I have a column that contains a few records with accented characters. I want a query to find the records with accented characters.
If we have records like the ones below:
2ème édition
Natália
sravanth
the query should pick these records:
2ème édition
Natália

You can use the REGEXP_LIKE function along with a list of all the accented characters you're interested in:
with t1(data) as (
select '2ème édition' from dual union all
select 'Natália' from dual union all
select 'sravanth' from dual
)
select * from t1 where regexp_like(data,'[àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ]');
DATA
--------------
2ème édition
Natália

The ASCIISTR function is another way to find accented characters. From the documentation:
ASCIISTR takes as its argument a string, or an expression that
resolves to a string, in any character set and returns an ASCII
version of the string in the database character set. Non-ASCII
characters are converted to the form \xxxx, where xxxx represents a
UTF-16 code unit.
So you can do something like
SELECT my_field FROM my_table
WHERE NOT my_field = ASCIISTR(my_field)
Or to re-use the demo from the accepted answer:
with t1(data) as (
select '2ème édition' from dual union all
select 'Natália' from dual union all
select 'sravanth' from dual
)
select * from t1 where data != asciistr(data)
which would output the 2 rows with accents.
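
To see why the comparison works, it can help to look at what ASCIISTR actually returns. This is only a sketch reusing the same demo rows; the aliases ascii_form and has_non_ascii are made up, and it assumes the client character set delivers the accented literals to the database intact.

```sql
-- ascii_form and has_non_ascii are hypothetical aliases for illustration.
WITH t1 (data) AS (
  SELECT '2ème édition' FROM dual UNION ALL
  SELECT 'Natália'      FROM dual UNION ALL
  SELECT 'sravanth'     FROM dual
)
SELECT data,
       ASCIISTR(data) AS ascii_form,   -- non-ASCII characters become \xxxx escapes
       CASE WHEN data != ASCIISTR(data) THEN 'Y' ELSE 'N' END AS has_non_ascii
FROM t1;
```

Under that assumption, 'Natália' should come back as Nat\00E1lia, while 'sravanth' is returned unchanged. Note that this test flags any non-ASCII character, not only accented letters.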

with t1(data) as (
select '2ème édition' from dual union all
select 'Natália' from dual union all
select 'sravanth' from dual
)
select * from t1 where REGEXP_like(ASCIISTR(data), '\\[[:xdigit:]]{4}');
DATA
--------------
2ème édition
Natália

This is way harder than it seems on the surface, as there is more than one way to encode an accented character. What I do is keep a mirror column, which I call clean, and scrub out all the accents on load.
See this question I asked some time ago: normalized string
