How to construct a specific regular expression - sql

I want to create a regular expression that replaces every character in a string except the last 2 with a '*'. For example:
'abcdefgh' --> '******gh'
I am using Oracle's regexp_replace, and I have written something like:
regexp_replace('dfdfdfdfsdf','(.*)(..)','*\2',1,0)
but the result contains only one '*':
dfdfdfdfsdf --> *df
I would appreciate your assistance.

You can use LPAD.
select LPAD(SUBSTR('dfdfdfdfsdf',-2),LENGTH('dfdfdfdfsdf'),'*') from dual;
OUTPUT
*********df

As long as you are not concerned about 1- or 2-character strings, you can use the regular expression .(..$)?:
Query
WITH test_data ( value ) AS (
SELECT NULL FROM DUAL UNION ALL
SELECT 'A' FROM DUAL UNION ALL
SELECT 'AB' FROM DUAL UNION ALL
SELECT 'ABC' FROM DUAL UNION ALL
SELECT 'ABCD' FROM DUAL UNION ALL
SELECT 'ABCDE' FROM DUAL UNION ALL
SELECT 'ABCDEF' FROM DUAL
)
SELECT value,
REGEXP_REPLACE(
value,
'.(..$)?',
'*\1'
)
FROM test_data
Outputs:
VALUE | REGEXP_REPLACE(VALUE,'.(..$)?','*\1')
:----- | :------------------------------------
null | null
A | *
AB | **
ABC | *BC
ABCD | **CD
ABCDE | ***DE
ABCDEF | ****EF
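If 1- and 2-character values do matter, one way to handle them is to apply the regular expression only to longer values. The following is a minimal sketch under that assumption; leaving short values unchanged is a guess at the desired behaviour, not something stated in the question.

```sql
-- Sketch: mask everything except the last two characters,
-- leaving values of length 1 or 2 unchanged (an assumption).
WITH test_data ( value ) AS (
  SELECT 'A'      FROM DUAL UNION ALL
  SELECT 'AB'     FROM DUAL UNION ALL
  SELECT 'ABCDEF' FROM DUAL
)
SELECT value,
       CASE
         WHEN LENGTH( value ) > 2
         THEN REGEXP_REPLACE( value, '.(..$)?', '*\1' )
         ELSE value
       END AS masked
FROM   test_data;
```

With this guard, 'A' and 'AB' come back as they are, while 'ABCDEF' becomes '****EF'.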

In a regex engine that supports lookahead (Oracle's REGEXP_REPLACE does not), you can replace every match of this pattern with *:
.(?=.{2})
Live example: https://regex101.com/r/uueD6B/1
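For Oracle, which has no lookahead, a possible equivalent is to mask the leading part of the value and append the last two characters with SUBSTR and concatenation. This is only a sketch; the table expression samples and its values are made up for illustration.

```sql
-- Sketch: replace each character of the prefix with '*',
-- then append the (up to) last two characters untouched.
WITH samples ( value ) AS (
  SELECT 'abcdefgh' FROM DUAL UNION ALL
  SELECT 'AB'       FROM DUAL
)
SELECT value,
       REGEXP_REPLACE( SUBSTR( value, 1, LENGTH( value ) - 2 ), '.', '*' )
       || SUBSTR( value, GREATEST( LENGTH( value ) - 1, 1 ) ) AS masked
FROM   samples;
```

For 'abcdefgh' this yields '******gh'; a value of two characters or fewer has no prefix to mask and is returned unchanged.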

Related

Removing first 'G' character of entire column values in table if it is starting from 'G'

I want to remove the first 'G' character from every value in a table column, but only if it exists.
I have tried the substr function, but it removes the first character even when it is not 'G'. I only want to remove the first character when it is 'G'.
For example in myTable the column values are as follows:
G12345
332157
G54337
G54332
534535
Expected result is as follows:
12345
332157
54337
54332
534535
I want to write an update query that updates the entire column.
Based on your description, you can use regexp_replace():
select regexp_replace('G12345', 'G', '', 1, 1)
If you only want to remove a 'G' at the beginning of the string, you can use '^G' for the pattern.
Based on your data, you can just use replace():
select replace('G12345', 'G', '')
This removes all 'G's. But your data only seems to have one.
For an update, you would put the same logic in the SET clause:
update t
set col = replace(col, 'G', '')
where col like '%G%';
Or whichever of the above functions is what you really want to do.
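As a sketch of the '^G' variant mentioned above, the update below strips only a leading 'G' and restricts itself to rows that actually start with one. The table t and column col are the placeholder names used in this answer.

```sql
-- Sketch: remove a single leading 'G'; other occurrences are kept.
-- Note: a value that is exactly 'G' becomes NULL, because Oracle
-- treats the empty string as NULL.
UPDATE t
SET    col = REGEXP_REPLACE( col, '^G', '' )
WHERE  col LIKE 'G%';
```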
You want to update the rows where the value starts with a 'G', so use a WHERE clause:
update mytable
set value = substr(value, 2)
where value like 'G%';
As you want to replace only the first G, then
with test (col) as
  (select 'G12345' from dual union all
   select '332157' from dual union all
   select '11G222' from dual
  )
select col,
       substr(col, case when substr(col, 1, 1) = 'G' then 2 else 1 end) result
from test;
COL    RESULT
------ ------
G12345 12345
332157 332157
11G222 11G222
You can use:
SELECT CASE WHEN value LIKE 'G%' THEN SUBSTR( value, 2 ) ELSE value END
AS value_without_g
FROM myTable
Which, from the sample data:
CREATE TABLE myTable ( value ) AS
SELECT 'G12345' FROM DUAL UNION ALL
SELECT '332157' FROM DUAL UNION ALL
SELECT 'G54337' FROM DUAL UNION ALL
SELECT 'G54332' FROM DUAL UNION ALL
SELECT '534535' FROM DUAL;
Outputs:
| VALUE_WITHOUT_G |
| :-------------- |
| 12345 |
| 332157 |
| 54337 |
| 54332 |
| 534535 |

using SQL IN clause to select specific number

How can I select, for example, account numbers whose
7th digit is IN (2,3,4)? Let's say an account number has 10 digits in total.
You can use regular expressions:
where regexp_like(account_number, '^.{6}[234]')
Another possible regular expression is:
where regexp_like(account_number, '^\d{6}[234]\d{3}$')
Use SUBSTR to get the 7th digit:
SELECT *
FROM table_name
WHERE SUBSTR( account_number, 7, 1 ) IN ( '2', '3', '4' )
Which, for the sample data:
CREATE TABLE table_name ( account_number ) AS
SELECT 1234567890 FROM DUAL UNION ALL
SELECT 2222222222 FROM DUAL UNION ALL
SELECT 3333333333 FROM DUAL UNION ALL
SELECT 4444444444 FROM DUAL UNION ALL
SELECT 5555555555 FROM DUAL
Outputs:
| ACCOUNT_NUMBER |
| -------------: |
| 2222222222 |
| 3333333333 |
| 4444444444 |
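In the sample data the account number is stored as a NUMBER, so SUBSTR relies on an implicit conversion to a string. A variant that makes the conversion explicit and also checks the 10-digit length might look like the sketch below; it assumes account numbers are positive integers without leading zeros.

```sql
-- Sketch: explicit conversion before inspecting the 7th digit.
SELECT account_number
FROM   table_name
WHERE  LENGTH( TO_CHAR( account_number ) ) = 10
AND    SUBSTR( TO_CHAR( account_number ), 7, 1 ) IN ( '2', '3', '4' );
```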

Sort string and number combination in descending order in Oracle

I am trying to sort a combination of string and number in descending order.
Input :
P9S1
P7S1
P13S1
P12S2
P10S1
Expected output:
P13S1
P12S2
P10S1
P9S1
P7S1
Here is what I tried
Sample code:
with
inputs (firmware) as (
select 'P9S1' from dual union all
select 'P7S1' from dual union all
select 'P13S1' from dual union all
select 'P12S2' from dual union all
select 'P10S1' from dual
)
select firmware
from inputs
order by
regexp_replace(firmware, '\d+\.\d+') desc ;
But this doesn't work as expected. Any help would be appreciated.
Thanks
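You did not actually explain how the strings should be turned into numbers. Note that your pattern '\d+\.\d+' requires a literal dot, which never occurs in these values, so nothing is replaced and the rows are sorted as plain strings.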
This would work for your dataset:
order by to_number(regexp_replace(firmware, '\D', '')) desc
The idea is to remove all non-digit characters from the string, convert the resulting string to a number, and use it for sorting.
with inputs (firmware) as (
select 'P9S1' from dual union all
select 'P7S1' from dual union all
select 'P13S1' from dual union all
select 'P12S2' from dual union all
select 'P10S1' from dual
)
select firmware
from inputs
order by to_number(regexp_replace(firmware, '\D', '')) desc ;
| FIRMWARE |
| :------- |
| P13S1 |
| P12S2 |
| P10S1 |
| P9S1 |
| P7S1 |
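Concatenating all digits works for this dataset, but it could misorder values whose second number has more than one digit (for example, a hypothetical 'P9S10' would become 910 and sort above 'P13S1'). A sketch that compares the two numbers separately, assuming the number after P is the primary sort key and the number after S the secondary one:

```sql
WITH inputs (firmware) AS (
  SELECT 'P9S1'  FROM dual UNION ALL
  SELECT 'P13S1' FROM dual UNION ALL
  SELECT 'P9S10' FROM dual UNION ALL   -- hypothetical extra value
  SELECT 'P10S1' FROM dual
)
SELECT firmware
FROM   inputs
ORDER BY TO_NUMBER(REGEXP_SUBSTR(firmware, '\d+', 1, 1)) DESC,  -- number after P
         TO_NUMBER(REGEXP_SUBSTR(firmware, '\d+', 1, 2)) DESC;  -- number after S
```

Under that assumption the rows come back as P13S1, P10S1, P9S10, P9S1.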

Oracle SQL - How to Cut out characters from a string with SUBSTR?

I have values like "ABC1234", "ABC", "DEF456", "GHI" etc. in a specific column.
Now I need to split this string, but only if the characters (e.g. "ABC") are followed by digits.
So if the value is "ABC1234", then I need to extract ABC and 1234 separately. But if the value is only "ABC", I just need "ABC". I can't find a solution with SUBSTR. Do you have any idea?
Note: The number of characters can vary from 1 to 10, and so can the number of digits (sometimes there are none, as shown above).
So if the value is "ABC1234" then I need to cut out ABC and 1234
separated. But if there is only "ABC" as a value, I just need the
"ABC".
Among other possible solutions, here is one:
Logic:
1) Replace every digit with 1, then find the position of the first 1 in the result. If
there is no digit in the string, use the whole string.
2) Extract the letters from the first position up to the position where
the digits start.
3) Extract the digits from the position where they start to the end. If there are no digits, set the value to NULL.
--Dataset Preparation
with test (col) as
(select 'ABC1234' from dual union all
select 'ABC' from dual union all
select 'dEfH456' from dual union all
select '123GHI' from dual union all
select '456' from dual
)
--Query
select col Original_Column,
CASE
WHEN (instr(regexp_replace(col,'[0-9]','1'),'1',1)) = 0
then col
else
substr( col,1,instr(regexp_replace(col,'[0-9]','1'),'1',1)-1)
end Col_Alp,
CASE
WHEN (instr(regexp_replace(col,'[0-9]','1'),'1',1)) = 0
then NULL
Else
substr( col,instr(regexp_replace(col,'[0-9]','1'),'1',1))
END col_digit
from test
where regexp_like(col, '^[a-zA-Z0-9]+$');
Result:
SQL> /
Original_Column Col_Alp col_digit
---------- ----- -----
ABC1234 ABC 1234
ABC ABC NULL
dEfH456 dEfH 456
123GHI NULL 123GHI
456 NULL 456
Using SUBSTR (and INSTR and TRANSLATE):
Oracle 11g R2 Schema Setup:
CREATE TABLE data ( value ) AS
SELECT 'ABC1234' FROM DUAL UNION ALL
SELECT 'ABC123D' FROM DUAL UNION ALL
SELECT 'ABC ' FROM DUAL UNION ALL
SELECT 'ABC' FROM DUAL UNION ALL
SELECT 'DEFG456' FROM DUAL UNION ALL
SELECT 'GHI' FROM DUAL UNION ALL
SELECT 'JKLMNOPQRS9' FROM DUAL;
Query 1:
SELECT value,
SUBSTR( value, 1, first_digit - 1 ) AS prefix,
TO_NUMBER( SUBSTR( value, first_digit ) ) AS suffix
FROM (
SELECT value,
INSTR(
TRANSLATE( value, '-1234567890', ' ----------' ),
'-',
1
) AS first_digit
FROM data
)
WHERE SUBSTR( value, first_digit ) IS NOT NULL
AND TRANSLATE( SUBSTR( value, first_digit ), '-1234567890', ' ' ) IS NULL
Results:
| VALUE | PREFIX | SUFFIX |
|-------------|------------|--------|
| ABC1234 | ABC | 1234 |
| DEFG456 | DEFG | 456 |
| JKLMNOPQRS9 | JKLMNOPQRS | 9 |
Try the query below for the scenarios mentioned. It does not split a value in which digits come before letters (such as '123GHI'):
with test (col) as
(select 'ABC1234' from dual union all
select 'ABC' from dual union all
select 'dEfH456' from dual union all
select '123GHI' from dual union all
select '456' from dual
)
select col,reverse(trim(regexp_replace(reverse(col),'^[0-9]+',' '))) string ,trim(regexp_replace(col,'^[a-zA-Z]+',' ')) numbers from test
If you would rather return NULL for the string part when nothing could be split off, add a CASE expression:
with test (col) as
(select 'ABC1234' from dual union all
select 'ABC' from dual union all
select 'dEfH456' from dual union all
select '123GHI' from dual union all
select '456' from dual
)
select v.col,case when v.string=v.numbers THEN NULL ELSE string end string , v.numbers
from (select col,reverse(trim(regexp_replace(reverse(col),'^[0-9]+',' '))) string ,trim(regexp_replace(col,'^[a-zA-Z]+',' ')) numbers from test) v
Would something like this do?
with test (col) as
  (select '"ABC1234", "ABC", "dEf456", "123GHI", "456"' from dual),
inter as
  (select trim(regexp_substr(replace(col, '"', ''), '[^,]+', 1, level)) token
   from test
   connect by level <= regexp_count(col, ',') + 1
  )
select regexp_substr(token, '^[a-zA-Z]+') letters,
       regexp_substr(token, '[0-9]+$') digits
from inter
where regexp_like(token, '^[a-zA-Z]+[0-9]+$');
LETTERS    DIGITS
---------- ----------
ABC        1234
dEf        456
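To keep letter-only values such as 'ABC' in the result while still splitting values like 'ABC1234', one option is to relax the filter and capture the digits as a subexpression. This is a sketch only; the six-argument form of REGEXP_SUBSTR used here needs Oracle 11g or later.

```sql
WITH test (col) AS (
  SELECT 'ABC1234' FROM dual UNION ALL
  SELECT 'ABC'     FROM dual UNION ALL
  SELECT '123GHI'  FROM dual
)
SELECT col,
       REGEXP_SUBSTR(col, '^[a-zA-Z]+') AS letters,
       -- 6th argument 1 = return only the first parenthesised group
       REGEXP_SUBSTR(col, '^[a-zA-Z]+([0-9]+)$', 1, 1, NULL, 1) AS digits
FROM   test
WHERE  REGEXP_LIKE(col, '^[a-zA-Z]+[0-9]*$');  -- letters, optionally followed by digits
```

Here 'ABC1234' is split into ABC and 1234, 'ABC' is returned with a NULL digits column, and '123GHI' is filtered out.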

Oracle SQL REGEXP_LIKE

SELECT first_name, last_name
FROM employees
WHERE REGEXP_LIKE (first_name, '^Ste(v|ph)en$');
The query above returns the first and last names of employees whose first name is Steven or Stephen (first_name begins with Ste, ends with en, and has either v or ph in between).
Is there an opposite call, where the query returns everything that does not have v or ph between Ste and en?
So that it would return things like:
Stezen
Stellen
Is it as simple as putting NOT in front of REGEXP_LIKE?
How about MINUS?
SELECT *
FROM employees
WHERE REGEXP_LIKE( first_name , '^Ste([[:alpha:]])+en$')
MINUS
SELECT *
FROM employees
WHERE REGEXP_LIKE( first_name , '^Ste(v|ph)en$');
and this too:
WITH t AS
( SELECT 'Stezen' first_name FROM dual
UNION ALL
SELECT 'Steven' FROM dual
UNION ALL
SELECT 'Stephen' FROM dual
)
SELECT *
FROM t
WHERE REGEXP_LIKE( first_name , '^Ste([[:alpha:]])+en$')
AND NOT REGEXP_LIKE( first_name , '^Ste(v|ph)en$');
You need something like this:
SELECT 'Match'
FROM dual
WHERE REGEXP_LIKE ('Steden', '^Ste[^(v|ph)]en$');
EDIT
This will exclude any two (or more) letter combinations but still allow "v":
SELECT 'Match'
FROM dual
WHERE REGEXP_LIKE ('Stephen', '^Ste[[:alpha:]]en$');
Since Oracle does not support look-ahead functionality, I will have to agree with others that we will have to deal with "v" explicitly, either by excluding the entire name(word) or at least specifying its exact position.
SELECT name
FROM WhateverTable
WHERE REGEXP_LIKE (name, '^Ste[[:alpha:]]en$') AND SUBSTR(name, 4, 1) <> 'v';
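Note that [^(v|ph)] is a bracket expression: it matches one character that is none of (, v, |, p, h or ), rather than negating the alternation. The sketch below, using made-up sample names, compares that pattern with the combination of a generic match and NOT REGEXP_LIKE.

```sql
WITH t (first_name) AS (
  SELECT 'Steden'  FROM dual UNION ALL
  SELECT 'Stepen'  FROM dual UNION ALL
  SELECT 'Stellen' FROM dual UNION ALL
  SELECT 'Steven'  FROM dual UNION ALL
  SELECT 'Stephen' FROM dual
)
SELECT first_name,
       -- single character that is not one of ( v | p h )
       CASE WHEN REGEXP_LIKE(first_name, '^Ste[^(v|ph)]en$')
            THEN 'Y' ELSE 'N' END AS bracket_match,
       -- generic match, then explicit exclusion of Steven/Stephen
       CASE WHEN REGEXP_LIKE(first_name, '^Ste[[:alpha:]]+en$')
             AND NOT REGEXP_LIKE(first_name, '^Ste(v|ph)en$')
            THEN 'Y' ELSE 'N' END AS two_step_match
FROM   t;
```

Only 'Steden' satisfies the bracket pattern, whereas the two-step test also accepts 'Stepen' and 'Stellen'; 'Steven' and 'Stephen' are rejected by both.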
Three options:
The first query uses two REGEXP_LIKE tests: one regular expression for the generic match and one for excluding the invalid matches.
The second query uses REGEXP_SUBSTR to test for a generic match and extract the sub-group of the match, and then tests whether it should be excluded.
The third query shows how you can extend the approach with another table that contains the match criteria, which lets you build and test multiple name variants.
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl ( str ) AS
SELECT 'Stephen' FROM DUAL
UNION ALL SELECT 'Steven' FROM DUAL
UNION ALL SELECT 'Stepen' FROM DUAL
UNION ALL SELECT 'Steephen' FROM DUAL
UNION ALL SELECT 'Steeven' FROM DUAL
UNION ALL SELECT 'Steeven' FROM DUAL
UNION ALL SELECT 'Smith' FROM DUAL
UNION ALL SELECT 'Smithe' FROM DUAL
UNION ALL SELECT 'Smythe' FROM DUAL
UNION ALL SELECT 'Smythee' FROM DUAL;
CREATE TABLE exclusions ( prefix, exclusion, suffix ) AS
SELECT 'Ste', 'v|ph', 'en' FROM DUAL
UNION ALL SELECT 'Sm', 'ithe?|ythe', '' FROM DUAL;
Query 1:
SELECT str
FROM tbl
WHERE REGEXP_LIKE( str, '^Ste(\w+)en$' )
AND NOT REGEXP_LIKE( str, '^Ste(v|ph)en$' )
Results:
| STR |
|----------|
| Stepen |
| Steephen |
| Steeven |
| Steeven |
Query 2:
SELECT str
FROM (SELECT str,
REGEXP_SUBSTR( str, '^Ste(\w+)en$', 1, 1, NULL, 1 ) AS match
FROM tbl)
WHERE match IS NOT NULL
AND NOT REGEXP_LIKE( match, '^(v|ph)$' )
Results:
| STR |
|----------|
| Stepen |
| Steephen |
| Steeven |
| Steeven |
Query 3:
SELECT str
FROM tbl t
WHERE EXISTS ( SELECT 1
FROM exclusions e
WHERE REGEXP_LIKE( t.str, '^' || e.prefix || '(\w+)' || e.suffix || '$' )
AND NOT REGEXP_LIKE( t.str, '^' || e.prefix || '(' || e.exclusion || ')' || e.suffix || '$' )
)
Results:
| STR |
|----------|
| Stepen |
| Steephen |
| Steeven |
| Steeven |
| Smythee |