How to handle a string with only spaces in Oracle SQL?

I have a case where I am reading data from the database and converting a string to a number using TO_NUMBER, but the conversion fails when the string is empty or contains only an unknown or space character, like:
columnA
------
4444
333333
The strings '4444' and '333333' are converted to numbers, but there is an error "ORA-01722: invalid number" for the 2nd string.
Can this be handled with DECODE or CAST in any way? I need to use TO_NUMBER anyhow for further processing.

I hope this gives some insight into your issue.
select
TO_NUMBER(trim(colA)),
TO_NUMBER(REGEXP_REPLACE(colA,'(^[[:space:]]*|[[:space:]]*$)')),
regexp_instr(colA, '[0-9.]')
from
(
select ' 123' colA from dual
union all
select ' ' colA from dual
union all
select '.456' colA from dual
)
This is a similar issue: Trim Whitespaces (New Line and Tab space) in a String in Oracle

If all the data within that column consists of integers, integers with leading and/or trailing spaces, null values and strings of only spaces, then the TRIM() function alone will suffice, such as
SELECT TRIM(columnA)
FROM t
and that would be more performant than using regular expression functions.
But if the data contains decimal numbers, letters, punctuation and special characters along with whitespace and null values, then use
SELECT TRIM('.' FROM REGEXP_REPLACE(columnA,'[^[:digit:].]'))
FROM t
where at most one dot character is assumed to sit between the starting and ending digits. Any leading and trailing dots are trimmed at the end of the operation. The other characters are already removed by the regular expression.
If you're sure that there are no leading or trailing dots, then using
SELECT REGEXP_REPLACE(columnA,'[^[:digit:].]')
FROM t
would be enough.
You can wrap any of these expressions in TO_NUMBER() at the end, depending on your case.
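As a minimal sketch of that last step, the query below wraps TRIM() in TO_NUMBER() over a few illustrative rows (the sample values and column aliases are assumptions, not data from the question). Because Oracle treats an empty string as NULL, the spaces-only row should come back as NULL instead of raising ORA-01722. The second expression uses the DEFAULT ... ON CONVERSION ERROR clause, which is only available from Oracle 12.2 onward.

```sql
-- Illustrative rows: a plain number, a padded number, a spaces-only string
SELECT columnA,
       -- TRIM turns the spaces-only string into NULL, and TO_NUMBER(NULL) is NULL
       TO_NUMBER(TRIM(columnA)) AS trimmed_num,
       -- Oracle 12.2 and later only: return NULL instead of raising ORA-01722
       TO_NUMBER(columnA DEFAULT NULL ON CONVERSION ERROR) AS safe_num
FROM (
  SELECT '4444' AS columnA FROM dual UNION ALL
  SELECT ' 333333 ' FROM dual UNION ALL
  SELECT '   ' FROM dual
);
```

The TRIM() variant works on older versions as well, but it only strips spaces; the conversion-error clause also covers values that are not numeric at all.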

Related

Remove template text on regexp_replace in Oracle's SQL

I am trying to remove template text like &#x; or &#xx; or &#xxx; from a long string.
Note: x / xx / xxx is a number whose length is unknown. The cell type is CLOB.
for example:
SELECT 'H&#39;ello wor&#177;ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, but how do you use this function to remove this text?
You can use
SELECT REGEXP_REPLACE(col,'&&#\d+;')
FROM t
where
& is put twice to provide escaping for the substitution character,
\d represents a digit and the following + allows one or more occurrences of it,
and the pattern ends with ;
Alternatively, just use a single ampersand ('&#\d+;') in the pattern, as in the demo. Because the ampersand has a special meaning in some Oracle client tools (it introduces a substitution variable), its use is a bit problematic.
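As a minimal sketch of the single-ampersand variant, assuming the statement is run from SQL*Plus or SQL Developer (where & starts a substitution variable), substitution can be switched off with SET DEFINE OFF so that the pattern is taken literally. The string literal below stands in for the CLOB column.

```sql
-- SQL*Plus / SQL Developer client setting: treat & as an ordinary character
SET DEFINE OFF

-- Remove every numeric character reference of the form &#<digits>;
SELECT REGEXP_REPLACE('H&#39;ello wor&#177;ld', '&#\d+;') AS cleaned
FROM dual;
-- cleaned: Hello world
```

Omitting the third argument of REGEXP_REPLACE replaces each match with nothing, which is why no explicit empty string is passed.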
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H&#39;ello wor&#177;ld' should be passed to XMLQUERY as '"H&#39;ello wor&#177;ld"'.
And the result will be H'ello wor±ld

How to remove characters that are not in ASCII range 32 - 126 from a column in SQL Query

I have a table containing two columns which can hold any character values. However the target system only allows characters in the ASCII range 32 - 126.
My requirement is to remove the characters outside this ASCII range and let the others through.
Ex: Frédéric should become Frdric, which just drops all special characters outside that range.
It should only be within a query, not a procedure. Thanks in advance
You could possibly use the REGEXP_REPLACE function?
Two basic examples could be:
SELECT REGEXP_REPLACE('Frédéric', '[^ -~]', '') FROM dual;
SELECT REGEXP_REPLACE('QuÉbec', '[^ -~]', '') FROM dual;
Generates the output:
The second parameter, the regular expression inside the [] set, is made up of:
^ : not
(space) : a single space
- : hyphen, marking a range
~ : tilde
This means that any character not (^) in the range from space to tilde gets replaced by the empty string ''.
Here's a dbfiddle for the image examples.
Note that I also tried using the '[^[:print:]]' character class, which I expected to cover the same range, but it does not seem to work, presumably because [:print:] also matches printable non-ASCII characters such as é.
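To make the ASCII boundaries explicit, the same bracket expression can be built from CHR(32) and CHR(126) instead of typing the space and tilde literally. This is only a sketch: the table name t and column name col are hypothetical, and how the range is evaluated can depend on the session's sort settings (a binary NLS_SORT is assumed here).

```sql
-- CHR(32) is the space character and CHR(126) is the tilde,
-- so the concatenated pattern is the same as '[^ -~]'
SELECT col,
       REGEXP_REPLACE(col, '[^' || CHR(32) || '-' || CHR(126) || ']', '') AS ascii_only
FROM t;
```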

Regexp_Like to Validate Uppercase Characters [A-Z] and Numbers [0-9] Only

I would like a query using regexp_like within Oracle's SQL which only validates uppercase characters [A-Z] and numbers [0-9]
SELECT *
FROM dual
WHERE REGEXP_LIKE('AAAA1111', '[A-Z, 0-9]')
The select statement probably should look like
SELECT 'Yes' as MATCHING
FROM dual
WHERE REGEXP_LIKE ('AAAA1111', '^[A-Z0-9]+$')
This means that, from the start of the string (^) to its end ($), every character must be an uppercase letter or a digit. Important: no comma or space between Z and 0. The + stands for one or more characters.
Edit: Based on the answer of Barbaros another way of selecting would be possible
SELECT 'Yes' as MATCHING
FROM DUAL
WHERE regexp_like('AAAA1111','^[[:digit:][:upper:]]+$')
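A quick way to see the effect of the anchors is to run the pattern against a few sample strings. The rows below are illustrative assumptions; with default (binary) sort settings, only the first one is expected to report Yes, while the unanchored pattern from the question would accept all of them because each contains at least one matching character.

```sql
SELECT str,
       CASE
         WHEN REGEXP_LIKE(str, '^[A-Z0-9]+$') THEN 'Yes'
         ELSE 'No'
       END AS matching
FROM (
  SELECT 'AAAA1111' AS str FROM dual UNION ALL   -- uppercase letters and digits only
  SELECT 'AAaa1111' FROM dual UNION ALL          -- contains lowercase letters
  SELECT 'AAAA 1111' FROM dual UNION ALL         -- contains a space
  SELECT 'AAAA,1111' FROM dual                   -- contains a comma
);
```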
You can use:
select str as "Result String"
from tab
where not regexp_like(str,'[[:lower:] ]')
and not regexp_like(str,'[^[:alnum:]]')
where
not regexp_like with the POSIX pattern [[:lower:] ] eliminates the strings containing a lowercase letter or a space (because of the trailing space inside the bracket expression),
and not regexp_like with the POSIX pattern [^[:alnum:]] eliminates the strings containing symbols, so only strings made up of letters and digits are accepted.

Extract Specific Set of data from a String in Oracle

I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'. I am trying to remove the _1_2_3_4_5 and the 1_ portions from the string, which is essentially the numeric portion of the string. Any special characters after the last letter must also be removed. In this example, the _ after the character E must not be present either.
The query I am trying is as below:
SELECT
REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+',1,1)
from dual
The data I get from the above query is as below:
_A_B_C_D_E_
I am trying to figure a way to remove the underscore towards the end. Any other way to approach this?
Assuming the "letters" come first and then the "digits", you could do something like this:
select regexp_substr('A_B_C_D_E_1_2_3_4_5','.*[A-Z]') from dual;
This will pull all the characters from the beginning of the string, up to the last upper-case letter in the string (.* is greedy, it will extend as far as possible while still allowing for one more upper-case letter to complete the match).
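The query above starts from a string without the leading 1_. One possible variation, offered as a sketch and not as the answerer's method, begins the match at the first upper-case letter so that the original string from the question can be used directly. It assumes the string contains at least two upper-case letters, since the pattern needs one to start on and one to end on.

```sql
-- [A-Z] starts the match at the first upper-case letter;
-- the greedy .* then extends to the last upper-case letter
SELECT REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5', '[A-Z].*[A-Z]') AS str
FROM dual;
-- str: A_B_C_D_E
```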
I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'
Use REGEXP_REPLACE:
SELECT trim(BOTH '_' FROM
(REGEXP_REPLACE('1_A_B_C_D_E_1_2_3_4_5','[0-9]+', ''))) str
FROM dual;
STR
---------
A_B_C_D_E
How it works:
REGEXP_REPLACE removes all numeric occurrences matching '[0-9]+' from the string. Alternatively, you could use the POSIX character class '[[:digit:]]+'.
TRIM BOTH '_' removes any leading and trailing _ from the string.
Also using REGEXP_SUBSTR:
SELECT trim(BOTH '_' FROM
(REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+'))) str
FROM dual;
STR
---------
A_B_C_D_E

String split by comma using Oracle SQL

How can I split a string into comma-separated characters using Oracle SQL?
Here I have a column which has values like below
123Lcq
Lf32i
jkp32m
I want to split it character by character, separated by commas:
1,2,3,L,c,q
L,f,3,2,i
j,k,p,3,2,m
You can achieve the desired output using REGEXP_REPLACE:
SELECT
rtrim(regexp_replace(text, '(.)', '\1,'), ',') result
FROM (
SELECT '123Lcq' text FROM dual UNION ALL
SELECT 'Lf32i' text FROM dual UNION ALL
SELECT 'jkp32m' text FROM dual)
You could use regexp_replace:
SELECT substr(regexp_replace(mycol, '(.)', ',\1'), 2)
FROM mytable
The regular expression finds every character, and those matching characters are then all prefixed with commas. Finally a simple substr is used to eliminate the first comma.
Note that trimming commas could be an alternative to substr, but the behaviour is different when the original value already has commas at the end of the string: when trimming, you also trim away these original commas.
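The difference can be seen on a value that already ends with a comma. The sample value 'ab,' below is an assumption chosen for illustration.

```sql
SELECT
  -- Suffix each character with a comma, then trim trailing commas:
  -- 'a,b,,,' becomes 'a,b', so the original comma is lost
  rtrim(regexp_replace('ab,', '(.)', '\1,'), ',') AS trimmed,
  -- Prefix each character with a comma, then drop the first one:
  -- ',a,b,,' becomes 'a,b,,', so the original comma survives
  substr(regexp_replace('ab,', '(.)', ',\1'), 2) AS substringed
FROM dual;
```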