org.apache.lucene.queryParser.ParseException - lucene

I got the below error in my project:
org.apache.lucene.queryParser.ParseException: Cannot parse 'AMERICAN EXP PROPTY CASLTY INS AND': Encountered "<EOF>" at line 1, column 34.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:211)
at org.elasticsearch.index.query.xcontent.QueryStringQueryParser.parse(QueryStringQueryParser.java:196)
... 15 more
Please help me resolve this. Whenever I add an AND at the end of any string,
I get the above error.
Thanks

When you use a QueryString query or pass your query as the q parameter, elasticsearch uses Lucene to parse it. It therefore expects the query to follow Lucene query syntax and returns an error when the query contains a syntax error (a dangling AND at the end, in your case). If you want your query string to be interpreted as text and not parsed as a query, consider using Text Query instead.

That's funny.
Lucene is waiting for a new term after the AND, because in Lucene you build queries like "termA AND termB" or "+termA +termB".
Can you try lowercasing your query and see if it works?
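If the query text comes from user input, one option is to tidy it before it reaches the parser. The sketch below is plain Python and not part of Lucene or elasticsearch; the helper name strip_dangling_operators is made up for illustration. It assumes that only uppercase AND, OR and NOT count as operators, and it drops them from the end of the string, where the parser would otherwise wait for a term that never comes.

```python
# Uppercase words treated as boolean operators (assumption for this sketch).
OPERATORS = {"AND", "OR", "NOT"}


def strip_dangling_operators(query: str) -> str:
    """Remove uppercase boolean operators left dangling at the end of a query."""
    tokens = query.split()
    # Pop trailing operators until the last token is an ordinary term.
    while tokens and tokens[-1] in OPERATORS:
        tokens.pop()
    # Note: joining with single spaces collapses any repeated whitespace.
    return " ".join(tokens)


print(strip_dangling_operators("AMERICAN EXP PROPTY CASLTY INS AND"))
```

Whether dropping the word is acceptable depends on the data: if "AND" can be a real part of a name, treating the string as text rather than as a query is the safer route.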

Use the correct package name and classpath. In Lucene 4.x the package name is queryparser, with a lowercase p:
org.apache.lucene.queryparser.classic.ParseException
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-queryparser</artifactId>
  <version>4.3.0</version>
</dependency>


Regexp_Extract BigQuery anything up to "|"

I'm fairly new to coding and I was wondering if you could give me a hand writing a regular expression for BigQuery SQL.
Basically I would like to extract everything before the bar sign "|" in one of my columns.
Example:
Source string:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed
Desired output:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah
I thought about using the REGEXP_EXTRACT(string, delimiter) function but I'm totally unable to write some regex (LOL). Therefore I had a look over Stack, and have found stuff like:
SELECT REGEXP_EXTRACT( String_Name , "\S*\s*\|" ) ,
# or
SELECT REGEXP_EXTRACT( String_Name , '.+?(?=|)')
But every time I get error messages like " invalid perl operator: (?= " or "Illegal escape space"
Would you have any suggestions on why I get these messages and/or how could I proceed to extract these strings?
Many many thanks in advance <3
You can use SPLIT instead:
SELECT SPLIT("bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed", "|")[OFFSET(0)]
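Prefix the pattern string with r:
SELECT REGEXP_EXTRACT(String_Name, r'\S*\s*\|')
This is the syntax for a raw string literal: backslashes in it are passed to the regex engine unchanged instead of being read as SQL escape sequences, which is what caused the "Illegal escape" error. The "invalid perl operator: (?=" error has a different cause: BigQuery uses the RE2 regex library, which does not support lookahead assertions. Note that without a capture group this pattern also returns the trailing "|"; a capture group such as r'^([^|]*)\|' returns only the text before it. You can review the details in the documentation.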
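The difference between the two approaches can be checked outside BigQuery. The sketch below uses Python's re module, which is not the RE2 engine BigQuery uses, so it only illustrates patterns simple enough that they should behave the same in both (an assumption worth confirming in BigQuery itself). It compares a split on the bar, the pattern without a capture group, and a pattern with one.

```python
import re

source = "bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed"

# Rough equivalent of SPLIT(value, "|")[OFFSET(0)]: take the first piece.
first_piece = source.split("|")[0]

# Without a capture group the whole match is returned, bar included.
whole_match = re.search(r"\S*\s*\|", source).group(0)

# A capture group limits the result to the text before the first bar.
captured = re.search(r"^([^|]*)\|", source).group(1)

print(first_piece)
print(whole_match)
print(captured)
```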

How to insert an ODI step error message into an Oracle table when the message contains single quotes and colons

I'm trying to insert an ODI step error message into an Oracle table.
I captured the error message using <%=odiRef.getPrevStepLog("MESSAGE")%>.
ODI-1226: Step PRC_POA_XML_synchronize fails after 1 attempt(s).
ODI-1232: Procedure PRC_POA_XML_synchronize execution fails.
ODI-1227: Task PRC_POA_XML_synchronize (Procedure) fails on the source XML connection XML_PFIZER_LOAD_POA_DB_DEV.
Caused By: java.sql.SQLException: class java.sql.SQLException
oracle.xml.parser.v2.XMLParseException: End tag does not match start tag 'tns3:ContctID'.
at com.sunopsis.jdbc.driver.xml.SnpsXmlFile.readDocument(SnpsXmlFile.java:459)
at com.sunopsis.jdbc.driver.xml.SnpsXmlFile.readDocument(SnpsXmlFile.java:469)
When I try to insert this into a table, I'm getting the following error:
Missing IN or OUT parameter at index:: 1
I tried substr and replace, but nothing works, because the error message contains single quotes in the middle ('tns3:ContctID').
Is there any way to insert this into a table?
That's a tough one if you want to use pure Java BeanShell, and you've given too few details to get a short, straight answer: for example, how are you trying to insert this (command on source/target, BeanShell only, Oracle SQL + jBS, Jython, Groovy, etc.)?
The problem here is not only the quotes but also the newlines.
Replacing them is even more difficult, because every parsing step (<%, <?, <#) requires a different trick to define those literals.
What will work for sure is writing a Jython task to insert the log data (with Jython as the technology).
There you can use Python's multiline string literals,
simply:
⋮
err_log = """
<?=odiRef.getPrevStepLog("MESSAGE")?>
"""
⋮
I faced this error a few days ago and applied the solution below in ODI.
Use q'#<%=odiRef.getPrevStepLog("MESSAGE")%>#'
This escapes the single quote (') for the INSERT statement.
I have used this in my code and it works fine :)
For example:
select 'testing'abcd' from dual;
This query gives the error below:
"ORA-01756: quoted string not properly terminated"
select q'#testing'abcd#' from dual;
This query gives no error, and SQL Developer returns:
testing'abcd
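One caveat with q'#...#' is that the literal ends at the first # followed by a single quote, so a message that happens to contain that sequence would break the statement again. A minimal Python sketch of picking a safe delimiter follows; the function name oracle_q_literal and the candidate delimiter list are made up for illustration and are not part of ODI or Oracle. It could be adapted to a Jython task like the one described above.

```python
def oracle_q_literal(text: str, delimiters: str = "#!~^|") -> str:
    """Wrap text in Oracle's q'X...X' quoting, choosing a delimiter X that is safe."""
    for d in delimiters:
        # The literal ends at the first occurrence of the delimiter followed
        # by a single quote, so that pair must not appear inside the text.
        if d + "'" not in text:
            return "q'" + d + text + d + "'"
    raise ValueError("no safe delimiter found for this text")


message = "End tag does not match start tag 'tns3:ContctID'."
print("select " + oracle_q_literal(message) + " from dual;")
```

Where the calling technology supports bind variables, passing the message as a bound parameter avoids quoting altogether.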

Lucene queryparser with "/" in query criteria

When I try to search for something such as "workaround/fix" within Lucene, it throws this error:
org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'workaround/fix': Lexical error at line 1, column 15. Encountered: <EOF> after : "/fix"
at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:131)
at pi.lucengine.LucIndex.main(LucIndex.java:112)
Caused by: org.apache.lucene.queryparser.classic.TokenMgrError: Lexical error at line 1, column 15. Encountered: <EOF> after : "/fix"
at org.apache.lucene.queryparser.classic.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1133)
at org.apache.lucene.queryparser.classic.QueryParser.jj_scan_token(QueryParser.java:599)
at org.apache.lucene.queryparser.classic.QueryParser.jj_3R_2(QueryParser.java:482)
at org.apache.lucene.queryparser.classic.QueryParser.jj_3_1(QueryParser.java:489)
at org.apache.lucene.queryparser.classic.QueryParser.jj_2_1(QueryParser.java:475)
at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:226)
at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:181)
at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:170)
at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:121)
These are my lines 111 and 112:
QueryParser parser = new QueryParser(Version.LUCENE_43, field, analyzer);
Query query = parser.parse(newLine);
What do I need to do to allow it to parse the "/"?
The query parser interprets slashes as the beginning and end of a regex query (as of 4.0, see documentation here).
So, to include slashes in the query, you need to escape them by putting a backslash (\) before them.
You can handle escaping with QueryParser.escape(String).
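To show what that escaping amounts to, here is a rough Python imitation. It is a sketch, not the Lucene implementation: the character list is what I understand the 4.x classic query parser to escape, so treat it as an assumption and check QueryParserBase.escape in your Lucene version. The name escape_query is made up for illustration.

```python
# Characters assumed to be special in the Lucene 4.x classic query syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_query(text: str) -> str:
    """Prefix each query-syntax character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query("workaround/fix"))  # prints workaround\/fix
```

Escaping only stops the parser from treating the slash as syntax; how the escaped term is then split into tokens still depends on the analyzer.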
I encountered a similar problem when using '/' in Lucene queries issued from the Elasticsearch Kibana dashboard. I escaped the '/' characters as the documentation indicates and still had no success. I think this is related to the template bug reported here: https://github.com/elastic/kibana/issues/789. I'm not sure yet; I will update this answer when we update the Logstash components.
I had a case where using a forward slash with a wildcard just wouldn't return any results, even when I escaped it:
+(*16/17*)
+(*16\/17*)
The solution was to add double quotes:
+("*16/17*")
+("*16\/17*")

Why am I getting "ORA-00923: FROM keyword not found where expected"?

Why is the "From" seen as being in the wrong spot?
I had to change double quotes to single quotes in a query to get the query string to compile in the IDE while porting from Delphi to C#.
IOW, with SQL like this, that works in Delphi and Toad:
#"SELECT R.INTERLOPERABCID "ABCID",R.INTERLOPERID "CRID", . . .
...I had to change it to this for it to compile in Visual Studio:
#"SELECT R.INTERLOPERABCID 'ABCID',R.INTERLOPERID 'CRID', . . .
However, the SQL won't run that way - I get the "ORA-00923: FROM keyword not found where expected" err msg (both in Toad and when trying to execute the SQL in the C# app).
How can I get it both to compile and to run?
The quotes around your column aliases are giving it heartburn.
It appears that in the original case, the entire query is surrounded by double quotes, and the double quotes around the column aliases mess with those.
In the second case, Oracle is interpreting the single-quoted column aliases as strings, which throw it off.
In Oracle, you shouldn't need either: You should be able to just code:
#"SELECT R.INTERLOPERABCID ABCID,R.INTERLOPERID CRID, . . .
or, using the optional AS keyword:
#"SELECT R.INTERLOPERABCID AS ABCID,R.INTERLOPERID AS CRID, . . .
Hope this helps.
It also generally doesn't work if the query contains inline comments.

Lucene QueryParser interprets 'AND OR' as a command?

I am calling Lucene using the following code (PyLucene, to be precise):
analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))
But consider if this is the content of querytext:
querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"
In that case, the "AND OR" trips up the queryparser, even though I am using queryparser.escape. How do I avoid the following error message?
Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
"*" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
<TERM> ...
"*" ...
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
....
at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)
It's not just OR, it's AND OR.
I use the following workaround:
query = queryparser.parse(queryparser.escape(querytext.replace("AND OR", "AND or")))
queryparser.escape only escapes special characters (as shown in this page) and leaves "AND OR" unchanged, so it would not work in your case. Since presumably you also used StandardAnalyzer to analyze your text, the terms in your index are already in lowercase. So you can change the whole query string to lowercase before giving it to the queryparser. Lowercase "and" and "or" are not considered operators, so "and or" would not trip the queryparser.
I realise I'm rather late to the party here, but putting quotes round the search string is a better option:
querytext = "\"THE FOOD WAS ... \""
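The two suggestions above (lowercasing the operators, or quoting the whole text) can be sketched as small helpers applied before queryparser.parse. This is plain Python, not PyLucene, and the names lowercase_operators and as_phrase are made up for illustration. Keep in mind that quoting turns the text into a phrase query, which is presumably stricter than matching the individual terms.

```python
def lowercase_operators(text: str) -> str:
    """Lowercase standalone AND/OR/NOT so they are read as plain terms."""
    # Splitting and re-joining collapses repeated whitespace.
    return " ".join(t.lower() if t in {"AND", "OR", "NOT"} else t for t in text.split())


def as_phrase(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and embedded quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


querytext = "WOULD NOT GO BACK AND OR RECOMMEND IT"
print(lowercase_operators(querytext))
print(as_phrase(querytext))
```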