Oracle query to identify columns having special characters - sql

I'm trying to write a SQL query to return rows which has anything other than
alphabets, numbers, spaces and following chars '.', '{','[','}',']'
Column has alphabets like Ÿ, ¿
eg:- There's a table TEST with 2 columns - EmpNo and SampleText
EmpNo is simple sequence and SampleText has values like
('12345abcde','abcdefghij','1234567890','ab c d 1 3','abcd$%1234','%^*&^%$#$%','% % $ # %','abcd 12}34{','MINNEAŸPOLIS','THAN ¿VV ¿A')
I want to write a query which should eliminate all rows which have even a single special character except .{[}]. In above example, it should return EmpNo - 1,2,3,4 and 8
I tried REGEXP_LIKE but I'm not getting exactly what I need.
Query I used:
SELECT * FROM test
WHERE REGEXP_LIKE(sampleText, '[^A-Z^a-z^0-9^[^.^{^}]' ,'x');
This is not ignoring blanks and I also need to ignore closing bracket ']'

You can use regular expressions for this, so I think this is what you want:
select t.*
from test t
where not regexp_like(sampletext, '.*[^a-zA-Z0-9 .{}\[\]].*')

I figured out the answer to above problem.
Below query will return rows which have even a signle occurrence of characters besides
alphabets, numbers, square brackets, curly brackets,s pace and dot.
Please note that position of closing bracket ']' in matching pattern is important.
Right ']' has the special meaning of ending a character set definition. It wouldn't make any sense to end the set before you specified any members, so the way to indicate a literal right ']' inside square brackets is to put it immediately after the left '[' that starts the set definition
SELECT * FROM test WHERE REGEXP_LIKE(sampletext, '[^]^A-Z^a-z^0-9^[^.^{^}^ ]' );

They key is the backslash escape character will not work with the right square bracket inside of the character class square brackets (it is interpreted as a literal backslash inside the character class square brackets). Add the right square bracket with an OR at the end like this:
select EmpNo, SampleText
from test
where NOT regexp_like(SampleText, '[ A-Za-z0-9.{}[]|]');

Compare the length using lengthB and length function in oracle.
SELECT * FROM test WHERE length(sampletext) <> lengthb(sampletext)

Related

Semicolon in LIKE is not working for SQL server 2017

I have a query like this which is not retrieving the values from DB table even if the required value exist there.
Here's the query, which return zero rows:
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\;'',./%'
Following is the value in the table:
'!##$%&*()-_=+{}|:"<>?[]\;'',./'
When I run the query without ";" it is returning the value.
Can any one help me in figuring this out?
Thanks
Ritu
You are using multiple characters which are reserved when using LIKE statement.
i.e. %, _, []
Use the escape character clause (where I have used backtick to treat special characters as regular) such as
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' escape '`'
The value in your table is:
!##$%&*()-_=+{};; :"<>?[]\;'',./
And the one in the like is:
(!##$%&*()-_=+{};;
Starting with ( it will never match, also you should scape the percent (%) in the middle of the string like this:
Select *
FROM SitePanel_FieldValue
WHERE SiteFieldIdfk =111
AND SiteFieldvalue like '%!##$\%&*()-_=+{};;%' ESCAPE '\'
The problem is your brackets ([]), it has nothing to do with semicolons. If we remove the brackets, the above works:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?\;'',./%' THEN 1 END AS WithoutBrackets,
CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?[]\;'',./%' THEN 1 END AS WithBrackets
Notice that WithoutBrackets returns 1, where as WithBrackets returns NULL.
Brackets in a LIKE are to denote a pattern. For example SomeExpress LIKE '[ABC]' would match the characters, A, B, and C. If you are going to include special characters, you need to ESCAPE them. You have both brackets, a percent sign (%) and an underscore (_) you need to escape. You don't need to escape the hyphen (-), as it doesn't appear in a pattern (for example [A-Z]). I choose to use a backtick as the ESCAPE character, as it doesn't appear in your string, and demonstrate with a CASE expression again:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' ESCAPE '`' THEN 1 END;
If you wanted to use a backslash (\ ), which many do, you would need to also escape the backslash in your string:
SELECT CASE WHEN '!##$\%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-\_=+{}|:"<>?\[\]\\;'',./%' ESCAPE '\' THEN 1 END;
db<>fiddle
I think the issue is actually with the backslash. This is an escape character and so if you want it to be included, you have to put it in twice.
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\\;'',./%'

Remove template text on regexp_replace in Oracle's SQL

I am trying to remove template text like &#x; or &#xx; or &#xxx; from long string
Note: x / xx / xxx - is number, The length of the number is unknown, The cell type is CLOB
for example:
SELECT 'H'ello wor±ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, But how do you use this function to remove this text?
You can use
SELECT REGEXP_REPLACE(col,'&&#\d+;')
FROM t
where
& is put twice to provide escaping for the substitution character
\d represents digits and the following + provides the multiple occurrences of them
ending the pattern with ;
or just use a single ampersand ('&#\d+;') for the pattern as in the case of Demo , since an ampersand has a special meaning for Oracle, a usage is a bit problematic.
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H'ello wor±ld' should be passed to XMLQUERY as '"H'ello wor±ld"'.
And the result will be 'H'ello wo±ld'

How to escape '|' in SQL Server or T-SQL Query?

SELECT *
FROM performance_table
WHERE ad_group like '%|%'
I have no idea on how to escape the Pipe operator here.
You don't need to escape | in T-SQL as it has no special meaning inside like. However, if for example you would like to find texts containing % character, what you're looking for is:
SELECT *
FROM performance_table
WHERE ad_group like '%#%%' escape '#'
where escape defines escape character.
The pipe character does not need to be escaped.
Your query will find all records that contain a pipe character in the ad_group column.
When used inside a string literal ('|'), the character is not treated as an operator. Its function as an operator is bitwise OR, as for example in
select 8|3
will be 11.

how not to replace "]" when using regex_replace for removing special characters

I'm trying to remove few special characters from a comment column in my table. I used the below statement but it seems to remove the ']' even though it is in the ^[not] list.
UPDATE TEST
set comments=REGEXP_REPLACE(
comments,
'[^[a-z,A-Z,0-9,[:space:],''&'','':'',''/'',''.'',''?'',''!'','']'']]*',
' '
);
The table data contains the following:
[SYSTEM]:Do you have it in stock? 😊
My requirement is to have:
[SYSTEM]:Do you have it in stock?
You have two mistakes in you regex:
Do not put characters in quotes and don't split them with comma.
Remove inner square brackets.
And place closing square brackets first in the list, just after initial circumflex. Fixed regex:
UPDATE TEST set comments=REGEXP_REPLACE(comments,'[^]a-zA-Z0-9[:space:]&:/.?!]*',' ');
My try, I just removed the commas, put the "accepted" characters after the initial "not"(no brackets).
A special case are the brackets: https://dba.stackexchange.com/a/109294/6228
select REGEXP_REPLACE(
'[ION] are varză murată.',
'[^][a-zA-Z0-9[:space:]&:/,.?!]+',
' ')
from dual;
Result:
[ION] are varz murat .

What is this Oracle regexp matching in this production code?

Here's the code that is in production:
dynamic_sql := q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]';
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two characters that are not Y6
What's the ']' after the 6?
Then what?
I think that StackOverflow's formatting is causing some of the confusion in the answers. Oracle has a syntax for a string literal, q'[...]', which means that the ... portion is to be interpreted exactly as-is; so for instance it can include single quotes without having to escape each one individually.
But the code formatting here doesn't understand that syntax, so it is treating each single-quote as a string delimiter, which makes the result look different that how Oracle really sees it.
The expression is concatenating two such string literals together. (I'm not sure why - it looks like it would be possible to write this as a single string literal with no issues.) As pointed out in another answer/comment, the resulting SQL string is actually:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
And also as pointed out in another answer, the [^Y6] portion of the regex matches a single character, not two. So this expression should simply match any string whose first character is not 'Z' and whose second character is neither 'Y' nor '6'.
When not in couples ] means... Well... Itself:
^[^Z][^Y6]]/
^ assert position at start of the string
[^Z] match a single character not present in the list below
Z the literal character Z (case sensitive)
[^Y6] match a single character not present in the list below
Y6 a single character in the list Y6 literally (case sensitive)
] matches the character ] literally
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two one characters that is not Y or 6
What's the ']' after the 6? it's a ]
I'm afraid I have to post this here as the comment section is inappropriate for the formatting required. After your edit above that shows the entire statement, I ran this to see what the string ends up being:
select q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]' txt
from dual;
It ended up yielding this:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
It is apparent now that the closing bracket and quote at the end of the regex belong to the first alternate quote string and not to the regex. This is concatenating 2 alternate quoted strings which is a tad confusing as it sure looked like part of the regex. If anything you are learning the importance of comments for the poor person behind you! Please comment this accordingly when you are done figuring this out. Even include a link to this post.