Regexp_replace adding extra characters

I am using the following query in Oracle 11.2.0.3.0 / Toad for Oracle 11.6.1.6:
select regexp_replace('000010PARA197427'
,'([0-9]*)([A-Z]*)([0-9]*)'
,'\3-\2-\1') from dual
Rather than the expected 197427-PARA-000010, I get 197427-PARA-000010-- as a result.
If I change the query to:
select regexp_replace('000010PARA197427'
,'([0-9]*)([A-Z]*)([0-9]*)'
,'\3-c\2-c\1') from dual
I then get 197427-cPARA-c000010-c-c for the result.
It's as if all the literals are being appended to the end of the result.
Any help would be much appreciated.

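I'm not exactly sure why this is happening, but since every group uses a * quantifier and the pattern is not anchored, the pattern can also match the empty string. It looks as though, after the first match has consumed the whole input, a second, empty match is found at the end of the string; all three groups are empty for that match, so only the literals from the replacement (-- or -c-c) get appended.
Anchoring the pattern (^...$) seems to work. Using + rather than * for any one of the quantifiers also works for this sample.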
SQL> select regexp_replace('000010PARA197427'
,'([0-9]+)([A-Z]*)([0-9]*)'
,'\3-\2-\1') foo from dual ;
FOO
------------------
197427-PARA-000010
SQL> select regexp_replace('000010PARA197427'
,'^([0-9]*)([A-Z]*)([0-9]*)$'
,'\3-\2-\1') foo from dual ;
FOO
------------------
197427-PARA-000010
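The empty-match explanation can be checked outside the database. The sketch below uses Python's re module (version 3.7 or later) instead of Oracle, on the assumption that both engines handle a trailing empty match the same way for this pattern; the identical output suggests they do, but that is an inference rather than documented Oracle behavior. The constant names are made up for the illustration.

```python
import re

SOURCE = "000010PARA197427"
UNANCHORED = r"([0-9]*)([A-Z]*)([0-9]*)"
ANCHORED = r"^([0-9]*)([A-Z]*)([0-9]*)$"
REPLACEMENT = r"\3-\2-\1"

# Every (start, end) span matched by the unanchored pattern.
# A span whose start equals its end is a zero-length match.
spans = [m.span() for m in re.finditer(UNANCHORED, SOURCE)]

# Each match, including an empty one, is replaced by the template,
# so an empty match contributes only the template's literal dashes.
unanchored_result = re.sub(UNANCHORED, REPLACEMENT, SOURCE)
anchored_result = re.sub(ANCHORED, REPLACEMENT, SOURCE)

print(spans)
print(unanchored_result)
print(anchored_result)
```

The second span reported is the zero-length match at the end of the input, and it is the source of the extra "--". With ^ and $ in place, that empty match can no longer occur at the end of a non-empty string.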

Related

ORACLE: How to use regexp_like to find a string with single quotes between two characters?

I need to query the DB for all records that have two single quotes between characters. Examples: We've, who's.
I have the regex https://regex101.com/r/6MtB9j/1 but it doesn't work with REGEXP_LIKE.
I tried this:
SELECT content
FROM MyTable
WHERE REGEXP_LIKE (content, '(?<=[a-zA-Z])''(?=[a-zA-Z])')
Appreciate the help!
Oracle regex does not support lookarounds.
You do not actually need lookaround in this case, you can use
SELECT content
FROM MyTable
WHERE REGEXP_LIKE (content, '[a-zA-Z]''[a-zA-Z]')
This works because REGEXP_LIKE only tests whether a match exists: it returns true if the pattern matches anywhere in the string and false otherwise, which decides whether the row is fetched.
Lookarounds are useful in case you need to replace or extract values, when matches may overlap.
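To make the overlap point concrete, here is a small sketch in Python's re module, which does support lookarounds (unlike Oracle). It is only an illustration of the regex semantics, not Oracle code; the helper name has_match and the sample strings other than We've and who's are made up.

```python
import re

LOOKAROUND = r"(?<=[a-zA-Z])'(?=[a-zA-Z])"  # matches only the quote itself
CONSUMING = r"[a-zA-Z]'[a-zA-Z]"            # also consumes both neighbouring letters


def has_match(pattern, text):
    """Yes/no test, comparable in spirit to REGEXP_LIKE."""
    return re.search(pattern, text) is not None


samples = ["We've", "who's", "'quoted'", "a'b'c"]

# As a pure existence test, both patterns agree on every sample.
agreement = all(has_match(LOOKAROUND, s) == has_match(CONSUMING, s) for s in samples)

# When counting, they differ: in "a'b'c" the letter b is needed by both quotes,
# and the consuming pattern can use it only once.
lookaround_count = len(re.findall(LOOKAROUND, "a'b'c"))
consuming_count = len(re.findall(CONSUMING, "a'b'c"))

print(agreement, lookaround_count, consuming_count)
```

So the lookaround-free pattern is a safe substitute for filtering rows, but counts or replacements based on it can miss matches that share a letter.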
If you just need a single quote in a string, you can use:
where content like '%''%'
If they specifically need to be letters, then you need a regular expression:
regexp_like(content, '[a-zA-Z][''][a-zA-Z]')
or, with Oracle's alternative quoting mechanism, which avoids doubling the quote:
regexp_like(content, q'{[a-zA-Z]'[a-zA-Z]}')
If I understand well, you may need something like
regexp_count(content, '[a-zA-Z]''[a-zA-Z]') = 2.
For example, this
with myTable(content) as
(
select q'[what's]' from dual union all
select q'[who's, what's]' from dual union all
select q'[who's, what's, I'm]' from dual
)
select *
from myTable
where regexp_count(content, '[a-zA-Z]''[a-zA-Z]') = 2
gives
CONTENT
------------------
who's, what's

fixed number format with different lengths in Oracle

I need help with an Oracle query.
I have a query:
scenario 1: select to_char('1737388250',what format???) from dual;
expected output: 173,7388250
scenario 2: select to_char('173738825034',what format??) from dual;
expected output: 173,738825034
scenario 3: select to_char('17373882',what format??) from dual;
expected output: 173,73882
I need a single query that satisfies all of the above scenarios.
Can someone help, please?
It is possible to get the desired result with a customized format model given to to_char; I show one example below. However, any solution along these lines is just a hack (a solution that should work correctly in all cases, but using features of the language in ways they weren't intended to be used).
Here is one example - this will work if your "inputs" are positive integers greater than 999 (that is: at least four digits).
with
sample_data (num) as (
select 1737388250 from dual union all
select 12338 from dual
)
select num, to_char(num, rpad('fm999G', length(num) + 3, '9')) as formatted
from sample_data
;
NUM FORMATTED
---------- ------------
1737388250 173,7388250
12338 123,38
This assumes comma is the "group separator" in nls_numeric_characters; if it isn't, that can be controlled with the third argument to to_char. Note that the format modifier fm is needed so that no space is prepended to the resulting string; and the +3 in the second argument to rpad accounts for the extra characters in the format model (f, m and G).
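The arithmetic behind the + 3 is easy to check by building the format model string by hand. The sketch below does that in Python, purely as an illustration; rpad here is a hypothetical stand-in that mimics Oracle's RPAD (pad on the right, or truncate, to an exact length), and format_model is a made-up helper name.

```python
def rpad(text, length, pad):
    """Mimic Oracle RPAD: pad on the right, or truncate, to exactly `length` characters."""
    return (text + pad * length)[:length]


def format_model(num):
    # 'fm999G' already holds three digit positions plus three other
    # characters (f, m, G), so the target length is digits + 3.
    return rpad("fm999G", len(str(num)) + 3, "9")


models = {n: format_model(n) for n in (1737388250, 12338, 999)}

for number, model in models.items():
    print(number, model)
```

For a ten-digit input the model has ten 9s in total, three before G and seven after. For a three-digit input nothing is left after G, which fits the answer's restriction to numbers of at least four digits.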
You can try
select TO_CHAR(1737388250, '999,99999999999') from dual;
Your requirement is different, so you can use substr and concatenation as follows:
select substr(1737388250,1,3)
|| case when 1737388250 >= 1000 then ',' end
|| substr(1737388250,4)
from dual;
Your "number" is enclosed in single-quotes. This makes it a character string, albeit a string of only numeric characters. But a character string, nonetheless. So it makes no sense to pass a character string to TO_CHAR.
The other suggestions gloss over this and use an actual number; notice the lack of single quotes in their code.
You say you always want a comma after the first three "numbers" (characters), which makes no sense numerically. So just use SUBSTR and insert the comma:
select substr('123456789',1,3)||','||substr('123456789',4) from dual;
If the source data is actually a number, then pass it to to_char, and wrap that in substr:
select substr(to_char(123456789),1,3)||','||substr(to_char(123456789),4) from dual;

Oracle SQL Commas format?

I am trying to format my numbers with commas,
i.e.
4333 ---> 4,333
I came up with this
TO_CHAR(COUNT(*),'$9,999.99') AS TOTAL_APPS
Basically, I'm counting everything in the db and want the commas present. This is already in a select statement, and according to the Oracle docs that is the structure for commas. What is the issue?
So, what's wrong with what you came up with? Doesn't it do what you wanted?
It looks as though you found it somewhere on the Internet and applied it to your situation unchanged, because:
This is what you have:
4333
This is what you want:
4,333
This is what you have (with '$9,999.99' format mask), i.e. there's a dollar sign as well as decimals which you - apparently - don't want:
SQL> select to_char(4333, '$9,999.99') result from dual;
RESULT
----------
$4,333.00
If you change the format mask to this:
SQL> select to_char(4333, '9G999', 'nls_numeric_characters = .,') result from dual;
RESULT
------
4,333
you might get what you wanted.
Why did I use it like this? G is the group ("thousands") separator. It can differ between databases and sessions: some use a comma, others a dot, and so on. NLS_NUMERIC_CHARACTERS says which characters are actually used: its first character is the decimal marker and its second is the group separator that G stands for. So it could have been, e.g.:
SQL> select to_char(4333, '9G999', 'nls_numeric_characters = ^=') result from dual;
RESULT
------
4=333
An alternative approach that will only work in SQL*Plus is to use SQL*Plus formatting.
In this example, a simple query returns a large number. With no formatting we just get the plain number back.
SQL> select 373746764 from dual;
373746764
----------
373746764
Let's give it an alias, still no formatting:
SQL> select 373746764 tn from dual;
TN
----------
373746764
Now let's use the COL formatting command in SQL*Plus:
SQL> col tn form '999,999,990.00'
SQL> select 373746764 tn from dual;
TN
---------------
373,746,764.00
So, in your case, you could create a script containing all of these COL column_alias commands, call it from your reports, and make sure your queries use matching column aliases.
For example, a script called formatting.sql:
COL total_apps form '999,999,990'
COL total_cost form '999,999,990.00'
Then your report.sql script can have:
@formatting.sql
SELECT COUNT(*) TOTAL_APPS FROM the_table;
I think this will work with SQL Developer as well, but as I said this approach is for SQL*Plus.
For results that do not depend on the client tool, TO_CHAR in each query is the way to go.

Check if a string has specific ending, erase this ending, then write those strings according to two other conditions

I've got a problem with preparing a readable report from my system.
I have to extract strings from "sil.catalog_no".
Firstly, I have to check if a string has ending like '-UW' and delete it.
After that, I have to extract the string (already without '-UW') minus its leading part: everything up to the first '-', or up to the second '-' if the part before the first '-' is 'US'.
I know it's messed up, but I don't know how to describe it other way.
I have already tried SUBSTRING, LEFT, RIGHT and something with CHARINDEX, but my program/database/sql version(?) seems not to operate on those things and I can't find any other solution than these mentioned. Maybe it's because I don't use them correctly, I don't know.
The examples of strings contained in sil.catalog_no are:
HU-98010587
US-HU-88136FYT-719-UW
So, in first example, I just have to check if there is '-UW' at the end. There isn't one, so I move to second step, and just remove 'HU-' and extract the rest which is '98010587'.
With the second one, I want to check for '-UW' at the end and remove it. Then I want to erase the whole 'US-HU-', because 'US' comes first, and I want to get '88136FYT-719'.
EDIT:
After re-thinking the problem, I think I would like to know how to erase specific parts of strings. Looking at the image I've provided, I would like to erase all 'HU-', 'EMC-', 'US-', and '-UW' that appear in the result.
I think the function regexp_replace can solve your problem, as shown below:
postgres=# select regexp_replace(regexp_replace('US-HU-88136FYT-719-UW','^([A-Z]+-)+',''),'(-[A-Z]+)+$','') as result;
result
--------------
88136FYT-719
(1 row)
postgres=# select regexp_replace(regexp_replace('HU-98010587','^([A-Z]+-)+',''),'(-[A-Z]+)+$','') as result;
result
----------
98010587
(1 row)
postgres=# select regexp_replace(regexp_replace('EMC-C13-PWR-7','^([A-Z]+-)+',''),'(-[A-Z]+)+$','') as result;
result
-----------
C13-PWR-7
Or we remove the 'HU-', 'EMC-', 'US-', '-UW' more precisely, as below:
postgres=# select regexp_replace(regexp_replace('HU-98010587','^(HU-|EMC-|US-)+',''),'(-UW)+$','') as result;
result
----------
98010587
(1 row)
postgres=# select regexp_replace(regexp_replace('US-HU-88136FYT-719-UW','^(HU-|EMC-|US-)+',''),'(-UW)+$','') as result;
result
--------------
88136FYT-719
(1 row)
postgres=# select regexp_replace(regexp_replace('EMC-C13-PWR-7','^(HU-|EMC-|US-)+',''),'(-UW)+$','') as result;
result
-----------
C13-PWR-7
(1 row)
postgres=# select regexp_replace(regexp_replace('US-HU-88134UGQ-UW','^(HU-|EMC-|US-)+',''),'(-UW)+$','') as result;
result
----------
88134UGQ
I think both pairs of regular expressions above give the right result for these samples, and the second pair matches your stated needs more precisely. Give it a try.
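The two pairs of patterns are not interchangeable on every input. The sketch below applies both to a few catalog numbers from this thread using Python's re module instead of Postgres; for these anchored patterns the two engines are assumed to behave alike. The function names strip_generic and strip_listed are made up for the illustration.

```python
import re


def strip_generic(code):
    """First variant: drop any leading LETTERS- groups and any trailing -LETTERS groups."""
    code = re.sub(r"^([A-Z]+-)+", "", code)
    return re.sub(r"(-[A-Z]+)+$", "", code)


def strip_listed(code):
    """Second variant: drop only the listed prefixes and the -UW suffix."""
    code = re.sub(r"^(HU-|EMC-|US-)+", "", code)
    return re.sub(r"(-UW)+$", "", code)


samples = [
    "HU-98010587",
    "US-HU-88136FYT-719-UW",
    "EMC-C13-PWR-7",
    "EMC-CTX-OM4-10M",
]

for code in samples:
    print(code, strip_generic(code), strip_listed(code))
```

The variants agree on the first three samples. On EMC-CTX-OM4-10M the generic variant also removes CTX-, because CTX is an all-letter segment, while the listed variant keeps it.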
Another approach (tried on Postgres but also works even where there is no regex, like with SQLite3):
drop table if exists t;
create table t(s varchar(30));
insert into t values
('HU-98010587'),
('US-HU-88136FYT-719-UW'),
('EMC-C13-PWR-7'),
('EMC-CTX-OM4-10M');
with xxx(original,s) as (
select s,substr(s,1,length(s)-3) from t where s like '%-UW'
union
select s,s from t where s not like '%-UW'
)
select original,substr(s,4) s from xxx where s like 'HU-%'
union
select original,substr(s,7) s from xxx where s like 'US-HU-%'
union
select original,s from xxx where s not like 'HU-%' and s not like 'US-HU-%';
To get what you ask for in your edit ("I would like to erase all 'HU-', 'EMC-', 'US-', and '-UW' that appear in result"):
select s original,
replace(
replace(
replace(
replace(s,'HU-','')
,'US-','')
,'-UW','')
,'EMC-','') s
from t;
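Since this answer mentions SQLite3, the nested replace query can be tried from Python's built-in sqlite3 module. This is a minimal sketch: the table and sample rows come from the answer, while the extra row 'HU-ABHU-1' is a made-up edge case added to show that replace is not anchored to the start or end of the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t(s varchar(30))")
conn.executemany(
    "insert into t values (?)",
    [
        ("HU-98010587",),
        ("US-HU-88136FYT-719-UW",),
        ("EMC-C13-PWR-7",),
        ("EMC-CTX-OM4-10M",),
        ("HU-ABHU-1",),  # hypothetical value, not from the question
    ],
)

# Same nested replace as in the answer; each row becomes (original, cleaned).
rows = dict(
    conn.execute(
        """
        select s,
               replace(replace(replace(replace(s, 'HU-', ''), 'US-', ''), '-UW', ''), 'EMC-', '')
        from t
        """
    )
)
conn.close()

for original, cleaned in rows.items():
    print(original, cleaned)
```

The made-up row loses both occurrences of 'HU-', including the one in the middle. If catalog numbers can contain these fragments internally, the LIKE-based query or an anchored regular expression is the safer choice.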

Oracle Look behind Positive

There is no example of a lookbehind expression in the Oracle docs, so I tried using Java syntax.
This is my query, which is supposed to get the digits after TOP:
select regexp_substr('TIPTOP4152','(?<=TOP)\d+') sub from dual
But nothing is displayed!
For the sake of argument, REGEXP_SUBSTR works too:
SQL> select regexp_substr('TIPTOP4152', 'TOP(\d+)', 1, 1, NULL, 1) nbr
from dual;
NBR
----
4152
SQL>
I'm not sure that Oracle supports lookbehind. Instead, you should be able to do this pretty easily with regexp_replace and a capture group:
select regexp_replace('TIPTOP4152', '.*TOP(\d+)', '\1') from dual;
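The two workarounds return the same digits for this input but behave differently at the edges. The sketch below mirrors them in Python's re module as an illustration only: the function names are made up, and the correspondence to Oracle (REGEXP_SUBSTR giving NULL on no match, REGEXP_REPLACE leaving unmatched text in place) is stated as an assumption rather than tested against a database.

```python
import re


def digits_via_capture(text):
    # Counterpart of the REGEXP_SUBSTR call with a subexpression argument:
    # return only group 1, or None when nothing matches.
    match = re.search(r"TOP(\d+)", text)
    return match.group(1) if match else None


def digits_via_replace(text):
    # Counterpart of the REGEXP_REPLACE call: swap the matched part for group 1.
    # Text outside the match, or the whole input if nothing matches, is kept.
    return re.sub(r".*TOP(\d+)", r"\1", text)


# What the lookbehind from the question would have returned where supported.
lookbehind_result = re.search(r"(?<=TOP)\d+", "TIPTOP4152").group(0)

for sample in ("TIPTOP4152", "TIPTOP4152X", "TIPTAP4152"):
    print(sample, digits_via_capture(sample), digits_via_replace(sample))
```

The capture-group form returns just the digits or nothing at all, which makes it the closer substitute for the lookbehind. The replace form keeps trailing text and passes non-matching input through unchanged.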