Query to get records with special characters only - sql

I am trying to write a query which will tell me if certain record is having only the special characters. e.g- "%^&%&^%&" will error however "%HH678*(*))" is fine (as it's having alphanumeric values as well. I have written following query however, it's working fine only for English alphabets and numbers, if column is having some other characters like mandarin then also it's not giving expected value.Any help is highly appreciated.
SELECT * FROM test WHERE REGEXP_LIKE(sampletext, '[^]^A-Z^a-z^0-9^[^.^{^}^ ]' );

You may try this,
regexp_like(text, '^[^A-Za-z0-9]+$')
This would match the text only if the input text contains special chars ie, only chars which are not of letters or digits.

To detect strings containing only characters other than unaccented alphabetic, numeric, or white space characters try this:
regexp_like(text,'^[^a-zA-Z0-9[:space:]]+$')
If you don't think punctuation characters are special than add [:punct:] to the class of characters to ignore.
If you are looking for a specific set of characters you can use a character class of just those characters of interest, for example some common accented characters (note the lack of a leading ^ in the character class []:
regexp_like(text,'^[àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ]+$')

Related

Query to replace special characters in phone number field

Can anyone help with a query on how to replace special/non-numeric/hidden characters from a phone number column.
I've tried
LTRIM(RTRIM(REGEXP_REPLACE(
PHONE_NBR,
'[^[:digit:]][:cntrl:][:alpha:][:graph:][:blank:][:print:][:punct:][:space:]~',
'')))
but no luck, there are still a few records which contain non-numeric values.
Your regex is saying to ONLY replace a string consisting of: a non-numeric character followed by a control character, an alpha, a graph, a blank, a print, a punct, a space, and then a tilde.
You should be able to just use '[^[:digit:]]' as your regex, to remove all non-numeric characters.

Regular Expression for alphanumeric and some special characters not adjacent

I would like to have a regular expression to make an Oracle SQL REGEXP_LIKE query that checks
if a string starts with one alphanumeric character
if the string ends with one alphanumeric character
if the "body" of the string contains only alphanumeric character OR these authorized characters (written) : hyphen (dash), dot, apostrophe, space
if the authorised characters are NOT adjacent (to avoid something like "he--'''l..'-lo")
I started with this :
^[a-zA-Z0-9]+(a-zA-Z0-9\-\.'|([^\-\.'])\1)*[a-zA-Z0-9]$
I used backslash to escape assuming that dot and hyphen are metacharacters
I think this is what you want:
^[a-zA-Z0-9]+([-.' ][a-zA-Z0-9]|[a-zA-Z0-9])*\w?$
It looks for
at least 1 alphanumeric (alnum),
followed by
either an authorized character followed by an alphanumeric or just an alphanumeric, repeated any number of times (including 0).
optionally followed by
an alnum
This meets your specification. I'm not sure if starts with one alnum and ends with one alnum means that there must be at least 2 alnums, or if they can be the same. If there must be at least 2 of them, remove the last ? (which make the last alnum optional).
Regards
assuming you meant "authorised characters are NOT adjacent to each other"
try something along these lines
^[a-zA-Z0-9]+([a-zA-Z0-9]+[\-\.' ]?)*[a-zA-Z0-9]$
so that the repeating middle part always has one alphanumeric character followed by zero to one special characters.

Remove Special Characters from an Oracle String

From within an Oracle 11g database, using SQL, I need to remove the following sequence of special characters from a string, i.e.
~!##$%^&*()_+=\{}[]:”;’<,>./?
If any of these characters exist within a string, except for these two characters, which I DO NOT want removed, i.e.: "|" and "-" then I would like them completely removed.
For example:
From: 'ABC(D E+FGH?/IJK LMN~OP' To: 'ABCD EFGHIJK LMNOP' after removal of special characters.
I have tried this small test which works for this sample, i.e:
select regexp_replace('abc+de)fg','\+|\)') from dual
but is there a better means of using my sequence of special characters above without doing this string pattern of '\+|\)' for every special character using Oracle SQL?
You can replace anything other than letters and space with empty string
[^a-zA-Z ]
here is online demo
As per below comments
I still need to keep the following two special characters within my string, i.e. "|" and "-".
Just exclude more
[^a-zA-Z|-]
Note: hyphen - should be in the starting or ending or escaped like \- because it has special meaning in the Character class to define a range.
For more info read about Character Classes or Character Sets
Consider using this regex replacement instead:
REGEXP_REPLACE('abc+de)fg', '[~!##$%^&*()_+=\\{}[\]:”;’<,>.\/?]', '')
The replacement will match any character from your list.
Here is a regex demo!
The regex to match your sequence of special characters is:
[]~!##$%^&*()_+=\{}[:”;’<,>./?]+
I feel you still missed to escape all regex-special characters.
To achieve that, go iteratively:
build a test-tring and start to build up your regex-string character by character to see if it removes what you expect to be removed.
If the latest character does not work you have to escape it.
That should do the trick.
SELECT TRANSLATE('~!##$%sdv^&*()_+=\dsv{}[]:”;’<,>dsvsdd./?', '~!##$%^&*()_+=\{}[]:”;’<,>./?',' ')
FROM dual;
result:
TRANSLATE
-------------
sdvdsvdsvsdd
SQL> select translate('abc+de#fg-hq!m', 'a+-#!', etc.) from dual;
TRANSLATE(
----------
abcdefghqm

Which Unicode characters are "composing" characters (whose sole purpose is to add accent, tilda)?

This is related to
What are the characters that count as the same character under collation of UTF8 Unicode? And what VB.net function can be used to merge them?
This is how I plan to do this:
Use http://msdn.microsoft.com/en-us/library/dd374126%28v=vs.85%29.aspx to turn the string into
KD form.
Basically it'll turn most variation such as superscript into the normal number. Also it decompose tilda and accent into 2 characters.
Next step would be to remove all characters whose sole purpose is tildaing or accenting character.
How do I know which characters are like that? Which characters are just "composing characters"
How do I find such characters? After I find those, how do I get rid of it? Should I scan character by character and remove all such "combining characters?"
For example:
Character from 300 to 362 can be gotten rid off.
Then what?
Combining characters are listed in UnicodeData.txt as having a nonzero Canonical_Combining_Class, and a General_Category of Mn (Mark, nonspacing).
For each character in the string, call GetUnicodeCategory and check the UnicodeCategory for NonSpacingMark, SpacingCombiningMark or EnclosingMark.
You may be able to do it more efficiently using regex, eg Regex.Replace(str, "\p{M}", "").

How can I write special character in VB code

I have a Sql statament using special character (ex: ('), (/), (&)) and I don't know how to write them in my VB.NET code. Please help me. Thanks.
Find out the Unicode code point for the character (from http://www.unicode.org) and then use ChrW to convert from the code point to the character. (To put this in another string, use concatenation. I'm somewhat surprised that VB doesn't have an escape sequence, but there we go.)
For example, for the Euro sign (U+20AC) you'd write:
Dim euro as Char = ChrW(&H20AC)
The advantage of this over putting the character directly into source code is that your source code stays "just pure ASCII" - which means you won't have any strange issues with any other program trying to read it, diff it, etc. The disadvantage is that it's harder to see the symbol in the code, of course.
The most common way seems to be to append a character of the form Chr(34)... 34 represents a double quote character. The character codes can be found from the windows program "charmap"... just windows/Run... and type charmap
If you are passing strings to be processed as SQL statement try doubling the characters for example.
"SELECT * FROM MyRecords WHERE MyRecords.MyKeyField = ""With a "" Quote"" "
The '' double works with the other special characters as well.
The ' character can be doubled up to allow it into a string e.g
lSQLSTatement = "Select * from temp where name = 'fred''s'"
Will search for all records where name = fred's
Three points:
1) The example characters you've given are not special characters. They're directly available on your keyboard. Just press the corresponding key.
2) To type characters that don't have a corresponding key on the keyboard, use this:
Alt + (the ASCII code number of the special character)
For example, to type ¿, press Alt and key in 168, which is the ASCII code for that special character.
You can use this method to type a special character in practically any program not just a VB.Net text editor.
3) What you probably looking for is what is called 'escaping' characters in a string. In your SQL query string, just place a \ before each of those characters. That should do.
Chr() is probably the most popular.
ChrW() can be used if you want to generate unicode characters
The ControlChars class contains some special and 'invisible' characters, plus the quote - for example, ControlChars.Quote