Search an Oracle clob for special characters that are not escaped - sql

Is it possible to run a query that searches an Oracle CLOB for any record containing an ampersand that is not part of one of the following escape codes (or possibly any other escape code):
& - &amp;
< - &lt;
> - &gt;
" - &quot;
' - &apos;
I want to extract the 5 characters before the ampersand and the 5 characters after it so I can see the actual value.
Basically, I want to find any record that contains such unescaped characters and replace them with the escape code.
At the moment I am doing something like this:
Select * from articles
where dbms_lob.instr(article_summary , '&amp' ) = 0 and dbms_lob.instr(article_summary , '&' ) > 0
If I were to use a regular expression, how would I specify that I want to retrieve all fields where the value contains & followed by any character other than 'a'?

You can use DBMS_XMLGEN.CONVERT for this. The second parameter is optional; if it is left out, the function escapes the XML special characters.
select DBMS_XMLGEN.CONVERT(article_summary)
from articles;
But if article_summary contains a mixture of escaped and unescaped characters, this will give the wrong result, because entities that are already escaped get escaped again. The easiest way to solve it is to unescape the characters first and then escape them.
select DBMS_XMLGEN.CONVERT(DBMS_XMLGEN.CONVERT(article_summary, 1)) -- inner call: 1 as the second parameter unescapes; outer call escapes
from articles;
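The regular-expression part of the question can be sketched outside the database. The following is a minimal Python illustration, not Oracle code: the names UNESCAPED_AMP, find_unescaped_ampersands and escape_ampersands are hypothetical, and the list of recognised entities (the five XML entities plus numeric references) is an assumption about what should count as "already escaped".

```python
import re

# An ampersand that does NOT start one of the five XML entities
# or a numeric character reference (assumed list of "escaped" forms).
UNESCAPED_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)")


def find_unescaped_ampersands(text, context=5):
    """Return each unescaped '&' with up to `context` characters on either side."""
    hits = []
    for match in UNESCAPED_AMP.finditer(text):
        start = max(0, match.start() - context)
        hits.append(text[start:match.end() + context])
    return hits


def escape_ampersands(text):
    """Replace only the unescaped ampersands with '&amp;'."""
    return UNESCAPED_AMP.sub("&amp;", text)


sample = "Tom & Jerry &amp; friends, AT&T &lt;b&gt;"
print(find_unescaped_ampersands(sample))  # ['Tom & Jerr', 's, AT&T &lt']
print(escape_ampersands(sample))          # Tom &amp; Jerry &amp; friends, AT&amp;T &lt;b&gt;
```

The negative lookahead is what answers "an ampersand followed by anything other than a known escape code"; checking only for a following 'a' would wrongly skip text such as "&apple".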


Semicolon in LIKE is not working for SQL server 2017

I have a query like this, which does not retrieve the value from the DB table even though the required value exists there.
Here's the query, which returns zero rows:
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\;'',./%'
Following is the value in the table:
When I run the query without the ";" it returns the value.
Can anyone help me figure this out?
You are using several characters that are reserved in a LIKE pattern, i.e. %, _, and [].
Use the ESCAPE clause (here I have used a backtick as the escape character so that the special characters are treated as regular ones), such as:
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' escape '`'
The value in your table is:
!##$%&*()-_=+{};; :"<>?[]\;'',./
And the one in the like is:
Starting with the ( it will never match. You should also escape the percent sign (%) in the middle of the string, like this:
Select *
FROM SitePanel_FieldValue
WHERE SiteFieldIdfk =111
AND SiteFieldvalue like '%!##$\%&*()-_=+{};;%' ESCAPE '\'
The problem is your brackets ([]); it has nothing to do with semicolons. If we remove the brackets, the comparison works:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?\;'',./%' THEN 1 END AS WithoutBrackets,
CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?[]\;'',./%' THEN 1 END AS WithBrackets
Notice that WithoutBrackets returns 1, whereas WithBrackets returns NULL.
Brackets in a LIKE denote a pattern. For example, SomeExpression LIKE '[ABC]' would match any one of the characters A, B, or C. If you are going to include special characters as literals, you need to ESCAPE them. You have brackets, a percent sign (%), and an underscore (_) that you need to escape. You don't need to escape the hyphen (-), as it doesn't appear inside a bracket pattern (for example [A-Z]). I chose a backtick as the ESCAPE character, as it doesn't appear in your string, and demonstrate with a CASE expression again:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' ESCAPE '`' THEN 1 END;
If you wanted to use a backslash (\), which many do, you would also need to escape the backslash itself in your pattern:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$\%&*()-\_=+{}|:"<>?\[\]\\;'',./%' ESCAPE '\' THEN 1 END;
I think the issue is actually with the backslash. This is an escape character and so if you want it to be included, you have to put it in twice.
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\\;'',./%'

Remove template text on regexp_replace in Oracle's SQL

I am trying to remove template text like &#x; or &#xx; or &#xxx; from a long string.
Note: x / xx / xxx is a number, the length of the number is unknown, and the cell type is CLOB.
For example:
SELECT 'H&#39;ello wor&#177;ld' FROM dual
A desirable result:
Hello world
I know that regexp_replace should be used, but how do I use this function to remove this text?
You can use REGEXP_REPLACE with a pattern such as '&&#\d+;', where
& is put twice to provide escaping for the substitution character,
\d represents digits and the following + allows multiple occurrences of them,
and the pattern ends with ;
Alternatively, use a single ampersand ('&#\d+;') as the pattern, as in the demo. Because the ampersand introduces a substitution variable in Oracle client tools such as SQL*Plus, using it in a literal is a bit problematic.
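The single-ampersand pattern can be tried outside Oracle as well. This is a small Python sketch of the same regular expression, not the Oracle call itself; the function name strip_numeric_entities and the sample string are illustrative assumptions.

```python
import re

# Same idea as the pattern '&#\d+;': an ampersand, a hash,
# one or more digits, and a closing semicolon.
NUMERIC_ENTITY = re.compile(r"&#\d+;")


def strip_numeric_entities(text):
    """Remove every decimal numeric character reference from text."""
    return NUMERIC_ENTITY.sub("", text)


print(strip_numeric_entities("H&#39;ello wor&#177;ld"))  # Hello world
```

Python has no substitution variables, so the ampersand needs no doubling here; that concern applies only to the Oracle client tools.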
In case you wanted to remove the entities because you don't know how to replace them by their character values, here is a solution:
UTL_I18N.UNESCAPE_REFERENCE( xmlquery( 'the_double_quoted_original_string' RETURNING content).getStringVal() )
In other words, the original 'H&#39;ello wor&#177;ld' should be passed to XMLQUERY as '"H&#39;ello wor&#177;ld"'.
And the result will be 'H'ello wor±ld'
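For comparison, the "decode instead of delete" approach looks like this in Python. It is only an analogue of the Oracle expression above, using the standard library's html.unescape; the variable names are illustrative.

```python
import html

original = "H&#39;ello wor&#177;ld"

# html.unescape converts numeric and named character references
# into the characters they stand for.
decoded = html.unescape(original)
print(decoded)  # H'ello wor±ld
```

This keeps the apostrophe and the ± sign, which is a different outcome from the "Hello world" the question asks for, so it only helps when the characters themselves are wanted.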

Using wildcards

Can anyone help me out with this?
There is a column in the database named "TEXT".
This column holds some string value.
I want to find any row that has '%' in this column.
For example, how would I search for a row having the value 'Vivek%123' in the column TEXT?
In SQL there is something known as an escape character: the escape character itself is ignored, and the character right after it is treated as a literal instead of as a wildcard (here, %). In this example ! is declared as the escape character with an ESCAPE clause:
WHERE Text LIKE '%!%%' ESCAPE '!'
The above SQL statement finds any string containing a percent character '%'.
You must escape the % character:
WHERE COL1 LIKE 'Vivek#%123' ESCAPE '#' ;
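Both answers can be checked with a small in-memory database. The sketch below uses Python's sqlite3 module as a stand-in engine (SQLite also supports LIKE ... ESCAPE); the table name notes, the column name text_value and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (text_value TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?)",
    [("Vivek%123",), ("Vivek9123",), ("plain text",), ("100% sure",)],
)


def run(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# '!' is the escape character, so '!%' is a literal percent sign.
contains_percent = run(
    "SELECT text_value FROM notes "
    "WHERE text_value LIKE '%!%%' ESCAPE '!' ORDER BY text_value"
)

# '#' is the escape character; only the exact value matches.
exact = run(
    "SELECT text_value FROM notes "
    "WHERE text_value LIKE 'Vivek#%123' ESCAPE '#' ORDER BY text_value"
)

# Without escaping, the % is a wildcard and 'Vivek9123' matches too.
unescaped = run(
    "SELECT text_value FROM notes "
    "WHERE text_value LIKE 'Vivek%123' ORDER BY text_value"
)

print(contains_percent)  # ['100% sure', 'Vivek%123']
print(exact)             # ['Vivek%123']
print(unescaped)         # ['Vivek%123', 'Vivek9123']
```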

in ms azure database how do you search for a string containing _

I have this situation on my Azure database where I need to search for any rows that contain the _ character. This is a special character in the database, so I tried to escape it, but I get every row as a result.
select * from table where fieldColumn like '%_%'
will return everything on the table
select * from table where fieldColumn like '%\_%'
returns nothing
select * from table where fieldColumn = '_'
So how can I get the row that consists of only a single _ as well as all the other rows that have _ somewhere in the string?
You can set whatever escape character you want, like this:
select * from table where fieldColumn like '%!_%' ESCAPE '!'
Here I am using the ! as an escape character to tell SQL Server to treat the following character, the _ , as a string literal.
See the documentation for more info:
select * from table where fieldColumn like '%\_%' escape '\';
escape_character is a character that is put in front of a wildcard character to indicate that the wildcard should be interpreted as a regular character and not as a wildcard. escape_character is a character expression that has no default and must evaluate to only one character.
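The behaviour described in the question and the two fixes can be reproduced with an in-memory database. This sketch uses Python's sqlite3 module as a stand-in for Azure SQL; SQLite, like SQL Server, has no default escape character, but that similarity is an assumption about this one feature only. The table name items, the column field_column and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (field_column TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("_",), ("a_b",), ("abc",)])


def run(sql):
    """Run a query and return the first column of every row, sorted by the query."""
    return [row[0] for row in conn.execute(sql)]


base = "SELECT field_column FROM items WHERE field_column LIKE "
order = " ORDER BY field_column"

# Unescaped: _ is a wildcard for any single character, so every row matches.
everything = run(base + "'%_%'" + order)

# Backslash without an ESCAPE clause is just a literal backslash: no row has one.
nothing = run(base + r"'%\_%'" + order)

# Declaring an escape character makes the underscore a literal.
with_bang = run(base + "'%!_%' ESCAPE '!'" + order)
with_backslash = run(base + r"'%\_%' ESCAPE '\'" + order)

print(everything)      # ['_', 'a_b', 'abc']
print(nothing)         # []
print(with_bang)       # ['_', 'a_b']
print(with_backslash)  # ['_', 'a_b']
```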

List of special characters for SQL LIKE clause

What is the complete list of all special characters for a SQL (I'm interested in SQL Server but other's would be good too) LIKE clause?
SELECT Name FROM Person WHERE Name LIKE '%Jon%'
SQL Server:
[specifier] E.g. [a-z]
ESCAPE clause E.g. '%30!%%' ESCAPE '!' will evaluate 30% as true
' characters need to be escaped with ' E.g. they're becomes they''re
% - Any string of zero or more characters.
_ - Any single character
ESCAPE clause E.g. '%30!%%' ESCAPE '!' will evaluate 30% as true
% - Any string of zero or more characters.
_ - Any single character
ESCAPE clause E.g. '%30!%%' ESCAPE '!' will evaluate 30% as true
[specifier] E.g. [a-z]
% - Any string of zero or more characters.
_ - Any single character
Reference Guide here [PDF]
% - Any string of zero or more characters.
_ - Any single character
ESCAPE clause E.g. '%30!%%' ESCAPE '!' will evaluate 30% as true
An ESCAPE character only if specified.
PostgreSQL also has the SIMILAR TO operator which adds the following:
| - either of two alternatives
* - repetition of the previous item zero or more times.
+ - repetition of the previous item one or more times.
() - group items together
The idea is to make this a community Wiki that can become a "One stop shop" for this.
For SQL Server, from the documentation:
% Any string of zero or more characters.
WHERE title LIKE '%computer%' finds all book titles with the word 'computer' anywhere in the book title.
_ Any single character.
WHERE au_fname LIKE '_ean' finds all four-letter first names that end with ean (Dean, Sean, and so on).
[ ] Any single character within the specified range ([a-f]) or set ([abcdef]).
WHERE au_lname LIKE '[C-P]arsen' finds author last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, and so on. In range searches, the characters included in the range may vary depending on the sorting rules of the collation.
[^] Any single character not within the specified range ([^a-f]) or set ([^abcdef]).
WHERE au_lname LIKE 'de[^l]%' finds all author last names starting with de and where the following letter is not l.
an ESCAPE character only if specified.
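One way to get a feel for the four wildcard forms listed above is to translate them into regular expressions. The following Python sketch is an illustration only, not how SQL Server evaluates LIKE: the function name like_to_regex is hypothetical, matching is case-sensitive (real behaviour depends on the collation), and the ESCAPE clause is not handled.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern using %, _, [set] and [^set] into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            parts.append(".*")           # any string of zero or more characters
        elif ch == "_":
            parts.append(".")            # any single character
        elif ch == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            if negate:
                body = body[1:]
            if not body:
                raise ValueError("empty bracket sets are not handled in this sketch")
            # Keep '-' so ranges such as C-P still work; escape regex-special characters.
            safe = "".join("\\" + c if c in "\\[^" else c for c in body)
            parts.append("[" + ("^" if negate else "") + safe + "]")
            i = end                      # jump to the closing bracket
        else:
            parts.append(re.escape(ch))  # ordinary literal character
        i += 1
    return re.compile("".join(parts), re.DOTALL)


print(bool(like_to_regex("%computer%").fullmatch("The computer book")))  # True
print(bool(like_to_regex("_ean").fullmatch("Dean")))                     # True
print(bool(like_to_regex("[C-P]arsen").fullmatch("Zarsen")))             # False
print(bool(like_to_regex("de[^l]%").fullmatch("delta")))                 # False
```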
It is disappointing that many databases do not stick to the standard rules and add extra characters, or incorrectly enable ESCAPE with a default value of ‘\’ when it is missing. Like we don't already have enough trouble with ‘\’!
It's impossible to write DBMS-independent code here, because you don't know what characters you're going to have to escape, and the standard says you can't escape things that don't need to be escaped. (See section 8.5/General Rules/3.a.ii.)
Thank you SQL! gnnn
You should add that you have to add an extra ' to escape an existing ' in SQL Server:
smith's -> smith''s
Sybase :
% : Matches any string of zero or more characters.
_ : Matches a single character.
[specifier] : Brackets enclose ranges or sets, such as [a-f] or [abcdef]. Specifier can take two forms:
rangespec1-rangespec2:
rangespec1 indicates the start of a range of characters.
- is a special character, indicating a range.
rangespec2 indicates the end of a range of characters.
set:
can be composed of any discrete set of values, in any order, such as [a2bR]. The range [a-f], and the sets [abcdef] and [fcbdae] return the same set of values.
Specifiers are case-sensitive.
[^specifier] : A caret (^) preceding a specifier indicates non-inclusion. [^a-f] means "not in the range a-f"; [^a2bR] means "not a, 2, b, or R."
Potential answer for SQL Server
Interesting: I just ran a test using LinqPad against SQL Server, which should just be running LINQ to SQL underneath, and it generates the following SQL statement.
.Where(r => r.Name.Contains("lkjwer--_~[]"))
-- Region Parameters
DECLARE @p0 VarChar(1000) = '%lkjwer--~_~~~[]%'
-- EndRegion
SELECT [t0].[ID], [t0].[Name]
WHERE [t0].[Name] LIKE @p0 ESCAPE '~'
I haven't tested it yet, but it looks like the ESCAPE '~' clause may allow a string to be escaped automatically for use within a LIKE expression.
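The escaping seen in the generated parameter can be mimicked with a few lines of code. This Python sketch is inferred from that single example, not taken from the LINQ to SQL source: the function names escape_like and contains_pattern are hypothetical, and the set of characters to escape (%, _, [ and the escape character itself) is an assumption based on the output shown.

```python
def escape_like(value, escape_char="~"):
    """Prefix every LIKE-special character in value with escape_char."""
    if len(escape_char) != 1:
        raise ValueError("escape_char must be exactly one character")
    special = {"%", "_", "[", escape_char}
    return "".join(escape_char + ch if ch in special else ch for ch in value)


def contains_pattern(value, escape_char="~"):
    """Build a 'contains' pattern for use with LIKE ... ESCAPE escape_char."""
    return "%" + escape_like(value, escape_char) + "%"


print(contains_pattern("lkjwer--_~[]"))  # %lkjwer--~_~~~[]%
```

The resulting string would be passed as a bound parameter together with ESCAPE '~', as in the generated statement, rather than concatenated into the SQL text.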