Solr regular expression query error for terms starting with u - lucene

I am using Solr 3. I can search for terms starting with attributeValue:\hin* but it fails for attributeValue:\uo*
The error is
"error": {
"msg": "org.apache.solr.search.SyntaxError: Non-hex character in Unicode escape sequence: o",
"code": 400
}
The issue is \u. I cannot exclude u, since users can search for anything from type+search.

A search term starting with \u is parsed as a Unicode escape sequence, which must be followed by hex digits. o is not a hex digit, hence the error. If you want to search for a literal \ you need to escape it. More info on this:
Lucene/Solr supports escaping special characters that are part of the query
syntax. The current list special characters are
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
To escape these character use the \ before the character
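As a minimal sketch of that rule, the helper below prefixes each listed character in user input with a backslash before the query string is built. The function name escape_query_term is hypothetical, and doing the escaping client-side in Python is an assumption; the character list is the one quoted above.

```python
import re

# Characters from the list quoted above. '&' and '|' are escaped singly,
# which also covers the two-character operators && and ||.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')

def escape_query_term(text: str) -> str:
    """Prefix every query-syntax character in `text` with a backslash."""
    return _SPECIAL.sub(r'\\\1', text)

# Raw user input containing a backslash followed by 'u'.
user_input = r"\uo"
# The trailing * is appended after escaping so it stays a wildcard.
query = "attributeValue:" + escape_query_term(user_input) + "*"
print(query)
```

With the backslash doubled, the parser should see an escaped backslash followed by plain text rather than the start of a \u escape sequence.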

Related

Need to replace special character in Impala

I want to replace only a specific list of special characters in Impala. All characters outside that list should remain unchanged.
Only the characters below should be replaced:
<
>
:
"
/
\
|
?
*
Use regexp_replace(data_col,'[|?*]',''). This replaces every character listed inside [] with the empty string ''.
Add the remaining characters from your list inside the brackets.
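To illustrate the full character class, here is a small sketch using Python's re module rather than Impala itself. The function name strip_forbidden is hypothetical. How the backslash must be written inside an Impala string literal is not covered by the answer above and should be checked separately.

```python
import re

# The nine characters listed in the question. Inside [...] only the
# backslash needs escaping in this regex dialect.
FORBIDDEN = re.compile(r'[<>:"/\\|?*]')

def strip_forbidden(text: str) -> str:
    """Remove every listed character and leave all others untouched."""
    return FORBIDDEN.sub("", text)

print(strip_forbidden('re<po>rt: "a/b\\c|d?e*"'))
```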

Remove special characters and alphabets from a string except number in sql query in db2

Hi, I tried using REGEXP_REPLACE and it is still not working.
select CASE WHEN sbbb <> ' ' THEN regexp_replace(sbbb,'[a-zA-Z _-#]','')
ELSE sbbb
END AS ABCDF
from Table where sccc=1;
This is the query I am using to remove letters and special characters from the string so that only numbers remain, but it does not work. The query returns the complete string with numbers, letters and special characters. What is wrong with the above query?
I am working on a SQL query. There is a column in the database that contains letters, special characters and numbers. I want to keep only the numbers and remove all the special characters and letters. How can I do this in a DB2 query? PATINDEX does not work either. Please help.
The allowed regular expression patterns are listed on this page
Regular expression control characters
Outside of a set, the following must be preceded with a backslash to be treated as a literal
* ? + [ ( ) { } ^ $ | \ . /
Inside a set, the following must be preceded with a backslash to be treated as a literal
Characters that must be quoted to be treated as literals are [ ] \
Characters that might need to be quoted, depending on the context are - &
So for you, this should work
regexp_replace(sbbb,'[a-zA-Z _\-#]','')
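The effect of the escaped hyphen can be sketched with Python's re module, which is an assumption here since DB2 uses its own regex engine. The function names are hypothetical. The second function uses a negated class, [^0-9], which is not part of the answer above but may fit the stated goal of keeping only digits more directly.

```python
import re

def remove_listed(text: str) -> str:
    # The hyphen is escaped so it is a literal, not a range operator.
    return re.sub(r'[a-zA-Z _\-#]', '', text)

def keep_digits(text: str) -> str:
    # Negated class: remove everything that is not a digit.
    return re.sub(r'[^0-9]', '', text)

print(remove_listed("AB-12_3#4 x"))
print(keep_digits("a1!b2@c3"))
```

Note that remove_listed leaves any special character that is not in the list (for example !), while keep_digits removes those as well.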

How to escape '[' and ']' in GLOB?

I'm trying to use the GLOB operator to determine whether certain characters are present in a string:
SELECT *
FROM Test
WHERE num GLOB '*[~!?.;:+=()<>_#%&/\\]*'
It works fine with '][' at the beginning of the character list:
WHERE num GLOB '*[][~!?.;:+=()<>_#%&/\\]*'
But it returns nothing when '[]' is placed anywhere else in the list:
WHERE num GLOB '*[[]~!?.;:+=()<>_#%&/\\]*'
What's the reason of such behaviour?
The ] character ends a character list. So [[] is a character list that contains only [.
An empty character list would not make sense, so as an exception, putting the ] directly behind the opening [ ([]...]) can be used to include the ] character in the list. But that is the only exception.
(For similar reasons, - as a literal character must be the last one in the list.)
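The rule can be checked with a short sketch that runs GLOB through Python's sqlite3 module. This assumes the database in question is SQLite, which the question does not state, and glob_match is a hypothetical helper.

```python
import sqlite3

def glob_match(pattern: str, value: str) -> bool:
    """Return True if `value GLOB pattern` holds in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT ? GLOB ?", (value, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()

# ']' placed right after the opening '[' is a member of the list.
print(glob_match("*[][]*", "a]b"))
print(glob_match("*[][]*", "a[b"))

# '[[]' is a complete list containing only '[', so the ']' that follows
# is a literal character outside the list.
print(glob_match("*[[]]*", "a[b"))
print(glob_match("*[[]]*", "a[]b"))
```

The first two calls should match, the third should not, and the fourth should match only because the value contains the literal sequence [].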

Special Characters that can't be indexed using lucene

I know the list of special characters that can be indexed using Apache Lucene. Can someone tell me if there are any special characters that cannot be indexed using the Apache Lucene library?
From: https://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping%20Special%20Characters
Lucene supports escaping special characters that are part of the query syntax. The current list special characters are
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
So basically it looks like you can index anything; you just have to escape these characters when they appear in a query.

Regular expression for double square brackets in Objective-C

I need to find a proper regular expression for strings like [[ "objective C" ]] , [[ "Java" ]] , and [[ "perl programming"]] in Objective-C.
I tried many combinations, such as
NSString *pattern1 = @"[\[][\[][ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz \",]+]]";
NSString *pattern2 = @"\[\[[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz \",]+]]";
Apple's documentation for the NSRegularExpression class says I need to use \ to treat the next character as a literal. Can somebody help me find the error in the above regular expressions?
\ is an escape character for NSString, so you need to escape it.
\[ in a regex, becomes \\[ in NSString.
By the way a simpler regex for matching a single element is
\[\[ \"(\w|\s)+\" \]\]
which, escaped for an NSString literal (every \ doubled and every " written as \"), is
@"\\[\\[ \\\"(\\w|\\s)+\\\" \\]\\]"
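The regex itself can be tried against the three sample strings from the question. This is a sketch using Python's re module rather than NSRegularExpression, so only the pattern logic carries over, not the NSString escaping. The LENIENT variant with \s* is an assumption added here because the third sample has no space before the closing brackets.

```python
import re

# The pattern from the answer: exactly one space inside the brackets.
STRICT = re.compile(r'\[\[ "(\w|\s)+" \]\]')
# Hypothetical variant that tolerates any amount of inner whitespace.
LENIENT = re.compile(r'\[\[\s*"[\w\s]+"\s*\]\]')

samples = ['[[ "objective C" ]]', '[[ "Java" ]]', '[[ "perl programming"]]']
for s in samples:
    print(s, bool(STRICT.fullmatch(s)), bool(LENIENT.fullmatch(s)))
```

The strict pattern should reject the third sample, since it expects a space between the closing quote and ]].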