We are using the following SQL to identify rows that contain only special characters, such as 'F#C(!!'. However, it does not seem to capture some values like 'F#C(!!'. The script should capture only special characters. Could you please share an optimized script for this scenario?
when regexp_like(column_name, '^[^a-zA-Z]*$') then 'number'
when regexp_like(column_name, '^[^g-zG-Z]*$') then 'hex'
else 'string'
end
Try removing the ^ and $ anchors from the regex pattern you are using with regexp_like:
select *
from your_table
where regexp_like(column_name, '[^a-zA-Z]');
The above logic checks for the presence of at least one character that is not a letter. If you instead want to check for non-alphanumeric characters, use [^A-Za-z0-9].
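To make the difference between the two checks concrete, here is a minimal sketch in Python rather than SQL. Python's re module is only a stand-in for the database regex engine (dialects differ), and the function names are made up for illustration. It contrasts "contains at least one non-letter" with "consists only of non-alphanumeric characters".

```python
import re

def has_non_letter(value):
    # Unanchored search: true if any single character is not a letter.
    return re.search(r"[^a-zA-Z]", value) is not None

def only_special(value):
    # Whole-string match: every character must be non-alphanumeric.
    return re.fullmatch(r"[^A-Za-z0-9]+", value) is not None

for sample in ["F#C(!!", "#(!!", "ABC"]:
    print(sample, has_non_letter(sample), only_special(sample))
```

Note that 'F#C(!!' contains the letters F and C, so it passes the unanchored check but not the whole-string one.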
Hi I tried using Regex_replace and it is still not working.
select CASE WHEN sbbb <> ' ' THEN regexp_replace(sbbb,'[a-zA-Z _-#]','')
ELSE sbbb
END AS ABCDF
from Table where sccc=1;
This is the query I am using to remove letters and special characters from a string and keep only the numbers, but it does not work. The query returns the complete string, with numbers, letters, and special characters. What is wrong with the above query?
I am working on a SQL query. A column in the database contains letters, special characters, and numbers. I want to keep only the numbers and remove all the special characters and letters. How can I do this in a DB2 query? PATINDEX does not work. Please help.
The allowed regular expression patterns are listed on this page
Regular expression control characters
Outside of a set, the following must be preceded with a backslash to be treated as a literal
* ? + [ ( ) { } ^ $ | \ . /
Inside a set, the following must be preceded with a backslash to be treated as a literal
Characters that must be quoted to be treated as literals are [ ] \
Characters that might need to be quoted, depending on the context are - &
So for you, this should work
regexp_replace(sbbb,'[a-zA-Z _\-#]','')
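As a rough illustration of why the hyphen matters, here is a sketch in Python (used only as a stand-in; DB2's regex engine is not Python's, and the helper names are hypothetical). Inside a set, an unescaped hyphen between two characters is read as a range, and a range such as _-# runs backwards, which Python rejects outright. Escaping the hyphen makes it a literal.

```python
import re

def is_valid_pattern(pattern):
    # Returns False when the regex engine rejects the pattern.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def keep_digits(text):
    # The escaped hyphen is a literal character, not a range operator.
    return re.sub(r"[a-zA-Z _\-#]", "", text)

print(is_valid_pattern(r"[a-zA-Z _-#]"))   # False: '_-#' is a reversed range
print(is_valid_pattern(r"[a-zA-Z _\-#]"))  # True
print(keep_digits("AB-12_cd #34"))         # 1234
```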
I want to check whether a positive number string ends with ".0", so I wrote the following SQL:
select '12310' REGEXP '^[0-9]*\.0$'. However, the result is true. I wonder why, since I put "\" before "." to escape it.
So I wrote another one, select '1231.0' REGEXP '^[0-9]\d*\.0$', but this time the result is false.
Could anyone tell me the right pattern?
A dot (.) in a regular expression has a special meaning (it matches any character) and must be escaped if you want a literal dot:
select '12310' REGEXP '^[0-9]*\\.0$';
Result:
false
Use a double backslash to escape special characters in Hive. A single backslash has a special meaning in string literals and is used for sequences like \073 (semicolon), \n (newline), and \t (tab). This is why you need a double backslash for escaping. Likewise, use \\d for the digit character class:
hive> select '12310.0' REGEXP '^\\d*?\\.0$';
OK
true
Also, characters inside square brackets do not need double-backslash escaping: [.] can be used instead of \\.
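The effect of the lost backslash can be reproduced outside Hive. The sketch below uses Python's re module as a stand-in and writes out by hand the patterns the regex engine would receive once the string-literal layer has consumed a single backslash. That reading of Hive's behaviour is taken from the explanation above, and the helper name is made up.

```python
import re

def matches(pattern, text):
    # True if the pattern (with its own ^ and $ anchors) matches the text.
    return re.search(pattern, text) is not None

# Backslash lost: '.' matches any character, so the '1' before the final '0' qualifies.
print(matches(r"^[0-9]*.0$", "12310"))     # True
# Backslash preserved: '.' must be a literal dot.
print(matches(r"^[0-9]*\.0$", "12310"))    # False
print(matches(r"^[0-9]*\.0$", "1231.0"))   # True
# A bracketed dot is also literal and needs no backslash.
print(matches(r"^[0-9]*[.]0$", "1231.0"))  # True
# Second attempt with both backslashes lost: '\d' degrades to a plain 'd'.
print(matches(r"^[0-9]d*.0$", "1231.0"))   # False
```

This would also account for the second query returning false: without its backslash, \d no longer means "digit".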
If you know it is a number string, why not just use:
select ( val like '%.0' )
You need a regular expression only if you want to validate that the string has digits everywhere else. If you only need to check the last two characters, LIKE is sufficient.
As for your question, . is a wildcard in regular expressions: it matches any character.
From within an Oracle 11g database, using SQL, I need to remove the following sequence of special characters from a string, i.e.
~!##$%^&*()_+=\{}[]:”;’<,>./?
If any of these characters exist within a string, I would like them completely removed, except for two characters that I do NOT want removed: "|" and "-".
For example:
From: 'ABC(D E+FGH?/IJK LMN~OP' To: 'ABCD EFGHIJK LMNOP' after removal of special characters.
I have tried this small test, which works for this sample:
select regexp_replace('abc+de)fg','\+|\)') from dual
but is there a better way to use my sequence of special characters above without repeating the '\+|\)' pattern for every special character in Oracle SQL?
You can replace anything other than letters and spaces with an empty string:
[^a-zA-Z ]
here is online demo
As per the comments below:
I still need to keep the following two special characters within my string, i.e. "|" and "-".
Just exclude more
[^a-zA-Z |-]
Note: the hyphen - should be placed at the start or end of the character class, or escaped as \-, because inside a character class it otherwise defines a range.
For more info read about Character Classes or Character Sets
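A quick way to sanity-check the negated character class is to run it against the sample from the question. This is a sketch in Python, not Oracle SQL. The function name is hypothetical, and the space is kept in the class on the assumption that spaces should survive, as they do in the expected output.

```python
import re

def strip_specials(text):
    # Remove everything except letters, space, '|' and '-'.
    # The hyphen sits last in the class, so it is a literal.
    return re.sub(r"[^a-zA-Z |-]", "", text)

print(strip_specials("ABC(D E+FGH?/IJK LMN~OP"))  # ABCD EFGHIJK LMNOP
print(strip_specials("a|b-c*d"))                  # a|b-cd
```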
Consider using this regex replacement instead:
REGEXP_REPLACE('abc+de)fg', '[~!##$%^&*()_+=\\{}[\]:”;’<,>.\/?]', '')
The replacement will match any character from your list.
Here is a regex demo!
The regex to match your sequence of special characters is:
[]~!##$%^&*()_+=\{}[:”;’<,>./?]+
I suspect you have not yet escaped all the regex special characters.
To get there, work iteratively:
build a test string and build up your regex character by character, checking that it removes what you expect to be removed.
If the latest character does not work, you have to escape it.
That should do the trick.
SELECT TRANSLATE('~!##$%sdv^&*()_+=\dsv{}[]:”;’<,>dsvsdd./?', '~!##$%^&*()_+=\{}[]:”;’<,>./?',' ')
FROM dual;
result:
TRANSLATE
-------------
sdvdsvdsvsdd
SQL> select translate('abc+de#fg-hq!m', 'a+-#!', etc.) from dual;
TRANSLATE(
----------
abcdefghqm
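For readers unfamiliar with how TRANSLATE pairs characters up, the following Python sketch imitates the behaviour as it is commonly described for Oracle: each character of the second argument is mapped to the character at the same position in the third, and characters with no counterpart are removed. The function name and the third argument 'a' in the usage lines are assumptions made for this illustration.

```python
def oracle_style_translate(text, from_chars, to_chars):
    # Build a positional mapping; unmatched source characters map to None (deleted).
    table = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in table:
            continue  # keep the first mapping when a character repeats
        table[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return text.translate(table)

# 'a' maps to itself; '+', '-', '#' and '!' have no counterpart and are dropped.
print(oracle_style_translate("abc+de#fg-hq!m", "a+-#!", "a"))  # abcdefghqm
# With a single space as the target, only the first source character becomes a space.
print(repr(oracle_style_translate("x~y!z", "~!", " ")))        # 'x yz'
```

Mapping one harmless character to itself is a common way to keep the third argument non-empty while deleting the rest.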
I was wondering if anyone can help me with a regular expression - not my strongest point - to parse the WHERE part of a SQL statement. I need to extract the column names, either in "column" or "table.column" format. I'm using MySQL and PHP.
For example, parsing:
((table.column_a = '1') OR (table.column_b = '0'))
AND (date_column < '2014-07-03')
AND column_c LIKE '%my search string%'
should yield
table.column_a
table.column_b
date_column
column_c
Edit: clarification - the strings will be parsed in PHP with preg_* functions!
Thank you!
Assuming you are not doing this in SQL, you can use a regex like this:
[A-Za-z._]+(?=[ ]*(?:[<>=]|LIKE))
See regex demo.
This would work in Notepad++ and many languages.
Explanation
[A-Za-z._]+ matches the characters in your word (if you want to add digits, add 0-9)
The lookahead (?=[ ]*(?:[<>=]|LIKE)) asserts that what follows is optional spaces (the brackets are optional, they make the space stand out), then either one of the characters in the class [<>=] (an operator) or (|) the keyword LIKE
You can add operators inside [<>=], or tag them at the end with another alternation, e.g. |REGEXP
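The pattern can be tried against a WHERE clause shaped like the one in the question. The sketch below uses Python's re.findall instead of PHP's preg_* functions purely for illustration. The sample string assumes the second condition refers to table.column_b, as the expected output suggests, and the variable names are made up.

```python
import re

# Column name followed (after optional spaces) by an operator or LIKE.
COLUMN_PATTERN = re.compile(r"[A-Za-z._]+(?=[ ]*(?:[<>=]|LIKE))")

where_clause = (
    "((table.column_a = '1') OR (table.column_b = '0'))\n"
    "AND (date_column < '2014-07-03')\n"
    "AND column_c LIKE '%my search string%'"
)

columns = COLUMN_PATTERN.findall(where_clause)
print(columns)
# ['table.column_a', 'table.column_b', 'date_column', 'column_c']
```

Keywords such as OR and AND are skipped because nothing matching the lookahead follows them. A string literal that itself contains an operator would still produce false positives, so this is a heuristic rather than a parser.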
Reference
Lookahead and Lookbehind Zero-Length Assertions
Mastering Lookahead and Lookbehind
I came across code similar to the following in an Oracle stored procedure:
SELECT * FROM hr.employees WHERE REGEXP_LIKE(FIRST_NAME, '\A'||:iValue||'\Z', 'c');
And I am not sure what the \A and \Z do.
From what I can glean from the Oracle documentation, I think that they simply suppress the meaning of special characters in the iValue parameter. If so, the above must be equivalent to
SELECT * FROM hr.employees WHERE FIRST_NAME=:iValue;
Can anyone confirm this? Empirically this seems to be the case.
I think that in the past they wanted case insensitive searching so the 'c' was an 'i' before. So in this case we do not need to use the REGEXP_LIKE function any more and can replace it with an equals.
\A matches the position at the beginning of the string.
\Z matches the position at the end of the string or before a newline at the end of the string.
\z matches the position at the end of the string.
These are independent of multiline mode, unlike ^ and $.
Example:
foo\Z would match on foo\n, but foo\z would not match on foo\n.
See Oracle reference.
If || is used for string concatenation, then this is not the same as a simple string comparison, because the bound value is interpreted as a regex and may contain regex syntax. (Also, I'm not sure how Oracle treats case sensitivity when using =; MySQL ignores case by default when comparing strings.)
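The point about the bound value being treated as regex syntax can be shown with a small Python sketch. Python is a stand-in here: its \A and \Z are absolute start and end anchors, so its \Z does not allow the trailing newline described above for Oracle. The function names are hypothetical.

```python
import re

def regex_equals(name, value):
    # Mirrors '\A' || value || '\Z': the value is spliced in as regex syntax.
    return re.search(r"\A" + value + r"\Z", name) is not None

def literal_equals(name, value):
    # Escaping the value neutralises metacharacters, so this acts like name == value.
    return re.search(r"\A" + re.escape(value) + r"\Z", name) is not None

print(regex_equals("John", "John"))    # True
print(regex_equals("John", "J.hn"))    # True: '.' matches any character
print(literal_equals("John", "J.hn"))  # False
```

So the regex form and plain equality agree only as long as the value contains no regex metacharacters.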
\A matches the very start of input.
\Z matches the very end of input.
Check out regular-expressions.info, which is a great regex resource