Remove Special Characters from an Oracle String - sql

From within an Oracle 11g database, using SQL, I need to remove the following sequence of special characters from a string, i.e.
~!##$%^&*()_+=\{}[]:”;’<,>./?
If any of these characters exist within a string, except for these two characters, which I DO NOT want removed, i.e.: "|" and "-" then I would like them completely removed.
For example:
From: 'ABC(D E+FGH?/IJK LMN~OP' To: 'ABCD EFGHIJK LMNOP' after removal of special characters.
I have tried this small test which works for this sample, i.e:
select regexp_replace('abc+de)fg','\+|\)') from dual
but is there a better means of using my sequence of special characters above without doing this string pattern of '\+|\)' for every special character using Oracle SQL?

You can replace anything other than letters and space with empty string
[^a-zA-Z ]
here is online demo
As per below comments
I still need to keep the following two special characters within my string, i.e. "|" and "-".
Just exclude more
[^a-zA-Z|-]
Note: hyphen - should be in the starting or ending or escaped like \- because it has special meaning in the Character class to define a range.
For more info read about Character Classes or Character Sets

Consider using this regex replacement instead:
REGEXP_REPLACE('abc+de)fg', '[~!##$%^&*()_+=\\{}[\]:”;’<,>.\/?]', '')
The replacement will match any character from your list.
Here is a regex demo!

The regex to match your sequence of special characters is:
[]~!##$%^&*()_+=\{}[:”;’<,>./?]+

I feel you still missed to escape all regex-special characters.
To achieve that, go iteratively:
build a test-tring and start to build up your regex-string character by character to see if it removes what you expect to be removed.
If the latest character does not work you have to escape it.
That should do the trick.

SELECT TRANSLATE('~!##$%sdv^&*()_+=\dsv{}[]:”;’<,>dsvsdd./?', '~!##$%^&*()_+=\{}[]:”;’<,>./?',' ')
FROM dual;
result:
TRANSLATE
-------------
sdvdsvdsvsdd

SQL> select translate('abc+de#fg-hq!m', 'a+-#!', etc.) from dual;
TRANSLATE(
----------
abcdefghqm

Related

Remove special characters and alphabets from a string except number in sql query in db2

Hi I tried using Regex_replace and it is still not working.
select CASE WHEN sbbb <> ' ' THEN regexp_replace(sbbb,'[a-zA-Z _-#]','']
ELSE sbbb
AS ABCDF
from Table where sccc=1;
This is the query which I am using to remove alphabets and specials characters from string and have only numbers. but it doesnot work. Query returns me the complete string with numbers,characters and special characters .What is wrong in the above query
I am working on a sql query. There is a column in database which contains characters,special characters and numbers. I want to only keep the numbers and remove all the special characters and alphabets. How can I do it in query of DB2. If a use PATINDEX it is not working. please help here.
The allowed regular expression patterns are listed on this page
Regular expression control characters
Outside of a set, the following must be preceded with a backslash to be treated as a literal
* ? + [ ( ) { } ^ $ | \ . /
Inside a set, the follow must be preceded with a backslash to be treated as a literal
Characters that must be quoted to be treated as literals are [ ] \
Characters that might need to be quoted, depending on the context are - &
So for you, this should work
regexp_replace(sbbb,'[a-zA-Z _\-#]','')

REGEXP_REPLACE URL BIGQUERY

I have two types of URL's which I would need to clean, they look like this:
["//xxx.com/se/something?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H"]
["//www.xxx.com/se/car?p_color_car=White?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H"]
The outcome I want is;
SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H"
I want to remove the brackets and everything up to SE, the URLS differ so I want to remove:
First URL
["//xxx.com/se/something?
Second URL:
["//www.xxx.com/se/car?p_color_car=White?
I can't get my head around it,I've tried this .*\/ . But it will still keep strings I don't want such as:
(1 url) =
something?
(2 url) car?p_color_car=White?
You can use
regexp_replace(FinalUrls, r'.*\?|"\]$', '')
See the regex demo
Details
.*\? - any zero or more chars other than line breakchars, as many as possible and then ? char
| - or
"\]$ - a "] substring at the end of the string.
Mind the regexp_replace syntax, you can't omit the replacement argument, see reference:
REGEXP_REPLACE(value, regexp, replacement)
Returns a STRING where all substrings of value that match regular
expression regexp are replaced with replacement.
You can use backslashed-escaped digits (\1 to \9) within the
replacement argument to insert text matching the corresponding
parenthesized group in the regexp pattern. Use \0 to refer to the
entire matching text.

sql regexp string end with ".0"

I want to judge if a positive number string is end with ".0", so I wrote the following sql:
select '12310' REGEXP '^[0-9]*\.0$'. The result is true however. I wonder why I got the result, since I use "\" before "." to escape.
So I write another one as select '1231.0' REGEXP '^[0-9]\d*\.0$', but this time the result is false.
Could anyone tell me the right pattern?
Dot (.) in regexp has special meaning (any character) and requires escaping if you want literally dot:
select '12310' REGEXP '^[0-9]*\\.0$';
Result:
false
Use double-slash to escape special characters in Hive. slash has special meaning and used for characters like \073 (semicolon), \n (newline), \t (tab), etc. This is why for escaping you need to use double-slash. Also for character class digit use \\d:
hive> select '12310.0' REGEXP '^\\d*?\\.0$';
OK
true
Also characters inside square brackets do not need double-slash escaping: [.] can be used instead of \\.
If you know it is a number string, why not just use:
select ( val like '%.0' )
You need regular expression if you want to validate that the string has digits everywhere else. But if you only need to check the last two characters, like is sufficient.
As for your question . is a wildcard in regular expressions. It matches any character.

Remove accents from string in Oracle

When trying to remove all accents from a string in Oracle using the techniques described in this stackoverflow answer: how replace accented letter in a varchar2 column in oracle I’m getting mixed results.
select CONVERT('JUAN ROMÄN', 'US7ASCII') from dual;
Returns the original string but replaces characters with for example ñ by a question mark (probably because of the chosen charset - tests with different charsets led to different results).
Using the following technique:
select utl_raw.cast_to_varchar2(nlssort(NAME_USER, 'nls_sort=binary_ai')) from YOUR_TABLE;
Returns the complete string but also places a NUL value at the end of the string.
Is there a characterset that I can use with Spanish accents to get a correct result (the original string with the different accents removed); is there a way to avoid the NUL value in the utl_raw.cast_to_varchar2 technique?
Based on the comments the the replace char(0) seems to remove the NUL value. For example
select
upper(utl_raw.cast_to_varchar2((nlssort('this is áà ñew test','nls_sort=binary_ai')))) as test,
replace(upper(utl_raw.cast_to_varchar2((nlssort('this is áà ñew test','nls_sort=binary_ai')))),chr(0),'') as test2
from dual;
If possible I would however to have a more 'straightforward/simpler' solution.
You can use TRANSLATE(your_string, from_chars, to_chars) https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions196.htm
Just put all chars with accents in from_chars string and their corresponding replacement chars in to_chars.

SQL Server LIKE containing bracket characters

I am using SQL Server 2008. I have a table with the following column:
sampleData (nvarchar(max))
The value for this column in some of these rows are lists formatted as follows:
["value1","value2","value3"]
I'm trying to write a simple query that will return all rows with lists formatted like this, by just detecting the opening bracket.
SELECT * from sampleTable where sampleData like '[%'
The above query doesn't work, because '[' is a special character. How can I escape the bracket so my query does what I want?
... like '[[]%'
You use [ ] to surround a special character (or range).
See the section "Using Wildcard Characters As Literals" in SQL Server LIKE
Note: You don't need to escape the closing bracket...
Aside from gbn's answer, the other method is to use the ESCAPE option:
SELECT * from sampleTable where sampleData like '\[%' ESCAPE '\'
See the documentation for details.
Just a further note here...
If you want to include the bracket (or other specials) within a set of characters, you only have the option of using ESCAPE (since you are already using the brackets to indicate the set).
Also you must specify the ESCAPE clause, since there is no default escape character (it isn't backslash by default as I first thought, coming from a C background).
E.g., if I want to pull out rows where a column contains anything outside of a set of 'acceptable' characters, for the sake of argument let's say alphanumerics... we might start with this:
SELECT * FROM MyTest WHERE MyCol LIKE '%[^a-zA-Z0-9]%'
So we are returning anything that has any character not in the list (due to the leading caret ^ character).
If we then want to add special characters in this set of acceptable characters, we cannot nest the brackets, so we must use an escape character, like this...
SELECT * FROM MyTest WHERE MyCol LIKE '%[^a-zA-Z0-9\[\]]%' ESCAPE '\'
Preceding the brackets (individually) with a backslash and indicating that we are using backslash for the escape character allows us to escape them within the functioning brackets indicating the set of characters.