Cloudant Query $gt (greater than) string phrase with spaces - lucene

Using Cloudant Query, I'm trying to get all documents with text greater than a specified phrase:
{
  "selector": {
    "name": {
      "$gt": "Test for pagination"
    }
  },
  "sort": [
    {
      "name:string": "asc"
    }
  ],
  "limit": 5,
  "use_index": [
    "NameQueryIndex",
    "nameQueryIndex_v1"
  ]
}
However, I'm getting the following error:
{
"error": "text_search_error",
"reason": "Cannot parse '(name_3astring:{Test\\ for\\ pagination TO u0x10FFFF])': Encountered \" <RANGE_GOOP> \"pagination \"\" at line 1, column 27.\nWas expecting one of:\n \"]\" ...\n \"}\" ...\n "
}
When I remove the whitespace (i.e. "Testforpagination"), it works fine.

You can try creating a view on your database, applying the same logic in its map function, and then accessing those documents over REST. You don't need Cloudant Query for this, and the documents would be returned much faster.
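For illustration, here is a minimal sketch of such a view (the design document and view names are made up). The map function emits doc.name as the key, so rows come back sorted by name:

{
  "_id": "_design/names",
  "views": {
    "by_name": {
      "map": "function (doc) { if (doc.name) { emit(doc.name, null); } }"
    }
  }
}

You could then page through documents with a plain REST call such as GET /{db}/_design/names/_view/by_name?startkey="Test for pagination"&limit=5&include_docs=true (URL-encoded). Note that startkey is inclusive, so it behaves like >= rather than a strict $gt.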

Related

ArcGIS MapServer query not working as expected

With the following mapserver, https://ndgishub.nd.gov/arcgis/rest/services/All_GovtBoundaries/MapServer/4/query, I have been using the query "TOWNSHIP='158' AND TDIR='N' AND RANGE='88' AND RDIR='W' AND SECTION ='10'" (minus the double-quotes) for months now. I just realized today that it is returning JSON as
{
  "error": {
    "code": 400,
    "extendedCode": -2147220985,
    "message": "Unable to complete operation.",
    "details": []
  }
}
Anyone know why I'm getting this error now?
I still don't know what the JSON error code is, but for whatever reason it works just fine if I remove the single quotes from around the numeric values, like this:
"TOWNSHIP=158 AND TDIR='N' AND RANGE=88 AND RDIR='W' AND SECTION =10" (minus the double-quotes).

Proper way to convert Data type of a field in MongoDB

Possible duplicate of: How to change the type of a field?
I am new to MongoDB and I am facing a problem while converting a field's value from one data type to another.
Below is an example of my documents:
[
  {
    "Name of Restaurant": "Briyani Center",
    "Address": " 336 & 338, Main Road",
    "Location": "XYZQWE",
    "PriceFor2": "500.0",
    "Dining Rating": "4.3",
    "Dining Rating Count": "1500"
  },
  {
    "Name of Restaurant": "Veggie Conner",
    "Address": " New 14, Old 11/3Q, Railway Station Road",
    "Location": "ABCDEF",
    "PriceFor2": "1000.0",
    "Dining Rating": "4.4"
  }
]
I have 12k documents like the above. Notice that the data type of PriceFor2 is a string; I would like to convert it to an integer.
I have referred to many of the amazing answers in the linked question above, but when I try to run the query, I get a .save() is not a function error. Please advise what the problem is.
Below is the code I used:
db.chennaiData.find().forEach(function(x) {
  x.priceFor2 = new NumberInt(x.priceFor2);
  db.chennaiData.save(x);
});
This is the error I am getting:
TypeError: db.chennaiData.save is not a function
From MongoDB's save documentation:
Starting in MongoDB 4.2, the db.collection.save() method is deprecated. Use db.collection.insertOne() or db.collection.replaceOne() instead.
You likely have MongoDB 4.2+, so the save function is no longer available. Consider migrating to insertOne and replaceOne as suggested.
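For instance, a minimal sketch of the same loop using replaceOne() instead of save() (collection and field names taken from the question; untested):

db.chennaiData.find().forEach(function(x) {
  // Convert the string value (e.g. "500.0") to a 32-bit integer
  x.PriceFor2 = NumberInt(parseFloat(x.PriceFor2));
  // Replace the whole document by _id instead of calling save()
  db.chennaiData.replaceOne({ _id: x._id }, x);
});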
For your specific scenario, it is actually preferable to do this with a single update, as mentioned in another SO answer. That way only one database call is made, whereas your approach fetches every document in the collection to the application level and performs n database calls to save them back.
db.collection.update({},
  [
    {
      $set: {
        PriceFor2: {
          $toDouble: "$PriceFor2"
        }
      }
    }
  ],
  { multi: true })
Mongo Playground
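If you specifically need an integer rather than a double (as the question asks), a variant of the same pipeline update could chain $toDouble and $toInt, for example (a sketch, untested against your data):

db.chennaiData.updateMany({},
  [
    {
      $set: {
        // "500.0" -> 500.0 -> 500
        PriceFor2: { $toInt: { $toDouble: "$PriceFor2" } }
      }
    }
  ])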

How to search a field containing [ and/or ] in it in SOLR?

I have a Solr setup.
I want to search all documents having [ or ] in them.
I have tried
nk_title:"\["
But it returns all documents in my DB.
Tried
nk_title:[*
But it gave
"error": {
  "msg": "org.apache.solr.search.SyntaxError: Cannot parse 'nk_source:156 AND nk_title:[*': Encountered \"<EOF>\" at line 1, column 29.\nWas expecting one of:\n \"TO\" ...\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n ",
  "code": 400
}
I also tried
nk_title:\[*
nk_title:*[*
But both return empty results.
To search for [, just make sure to escape it with \ when creating the query. Given a collection with a title field defined as string with three documents:
{
  "id": "doc1",
  "title": "This is a title",
  "_version_": 1602438086510247936
},
{
  "id": "doc2",
  "title": "[This is a title",
  "_version_": 1602438093178142720
},
{
  "id": "doc3",
  "title": "This is [a title",
  "_version_": 1602438101227012096
}
Querying for title:\[* gives doc2 as a hit:
{"numFound": 1, "start": 0, "docs": [
  {
    "id": "doc2",
    "title": "[This is a title",
    "_version_": 1602438093178142720
  }
]}
And wildcarding on both sides works as you expect (title:*\[*):
"response": {"numFound": 2, "start": 0, "docs": [
  {
    "id": "doc2",
    "title": "[This is a title",
    "_version_": 1602438093178142720
  },
  {
    "id": "doc3",
    "title": "This is [a title",
    "_version_": 1602438101227012096
  }
]}
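For reference, "defined as string" here means the field uses Solr's non-analyzed string (StrField) type. With the Schema API, such a field definition looks roughly like this (a sketch; substitute your own field name, e.g. nk_title):

{
  "add-field": {
    "name": "title",
    "type": "string",
    "indexed": true,
    "stored": true
  }
}

If nk_title is an analyzed text field instead, the standard tokenizer typically strips the brackets at index time, which would explain why the escaped wildcard queries in the question return no results.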

Cloudant Search field was indexed without position data; cannot run PhraseQuery

I have the following Cloudant search index
"indexes": {
  "search-cloud": {
    "analyzer": "standard",
    "index": "function(doc) {
      if (doc.name) {
        index("keywords", doc.name);
        index("name", doc.name, {
          "store": true,
          "index": false
        });
      }
      if (doc.type === "file" && doc.keywords) {
        index("keywords", doc.keywords);
      }
    }"
  }
}
For some reason when I search for specific phrases, I get an error:
Search failed: field "keywords" was indexed without position data; cannot run PhraseQuery (term=FIRSTWORD)
So if I search for FIRSTWORD SECONDWORD, it looks like I am getting an error on the first word.
NOTE: This does not happen to every search phrase I do.
Does anyone know why this would be happening?
doc.name and doc.keywords are just strings.
doc.name is usually something like "2004/04/14 John Doe 1234 Document Folder".
doc.keywords is usually something random like "testing this again".
The reason I am indexing both name and keywords under the keywords field is that I want anyone to be able to search keywords or name by typing a single string value. Let me know if this is not the best practice.
Likely the problem is that some of your documents contain a keywords field with string values, while other documents contain a keywords field with a different type, probably an array. I believe that this scenario would result in the error that you received. Can you double check that all of the values for your keywords fields are, in fact, strings?
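If that turns out to be the case, one way to guard against it is to normalize the value inside the index function so that every document indexes a plain string. A minimal sketch (assuming keywords can sometimes be an array):

function(doc) {
  if (doc.name) {
    index("keywords", doc.name);
    index("name", doc.name, { "store": true, "index": false });
  }
  if (doc.type === "file" && doc.keywords) {
    // Join arrays into a single space-separated string before indexing
    var keywords = Array.isArray(doc.keywords) ? doc.keywords.join(" ") : String(doc.keywords);
    index("keywords", keywords);
  }
}

The index would need to rebuild after the design document changes before phrase queries behave consistently.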

How to prevent Facet Terms from tokenizing

I am using Facet Terms to get all the unique values and their count for a field, and I am getting wrong results.
term: web
Count: 1191979
term: misc
Count: 1191979
term: passwd
Count: 1191979
term: etc
Count: 1191979
While the actual result should be:
term: WEB-MISC /etc/passwd
Count: 1191979
Here is my sample query:
{
  "facets": {
    "terms1": {
      "terms": {
        "field": "message"
      }
    }
  }
}
If reindexing is an option, the best approach would be to change the mapping and mark this field as not_analyzed:
"your_field" : { "type": "string", "index" : "not_analyzed" }
You can use the multi_field type if keeping an analyzed version of the field is desired:
"your_field" : {
  "type" : "multi_field",
  "fields" : {
    "your_field" : { "type" : "string", "index" : "analyzed" },
    "untouched" : { "type" : "string", "index" : "not_analyzed" }
  }
}
This way, you can continue using your_field in the queries, while running facet searches using your_field.untouched.
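With that mapping in place, the facet request from the question would simply target the untouched sub-field, roughly like this (using the question's message field as the example name):

{
  "facets": {
    "terms1": {
      "terms": {
        "field": "message.untouched"
      }
    }
  }
}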
Alternatively, if this field is stored, you can use a script field facet instead:
"facets" : {
  "term" : {
    "terms" : {
      "script_field" : "_fields.your_field.value"
    }
  }
}
As a last resort, if this field is not stored but the record source is stored in the index, you can try this:
"facets" : {
  "term" : {
    "terms" : {
      "script_field" : "_source.your_field"
    }
  }
}
The first solution is the most efficient. The last solution is the least efficient and may take a lot of time on a large index.
I also hit the same issue today while running a terms aggregation on a recent Elasticsearch version. After some googling and partial understanding, I found out how this indexing works (which is actually quite simple).
Queries can find only terms that actually exist in the inverted index
When you index the following string:
"WEB-MISC /etc/passwd"
it is passed to an analyzer. The analyzer might tokenize it into
"WEB", "MISC", "etc" and "passwd"
along with position details, and these tokens might then be filtered to lowercase, such as
"web", "misc", "etc" and "passwd"
So after indexing, the search query can only see those four tokens, not the complete value "WEB-MISC /etc/passwd". For your requirement, these are the options you can use:
1. Change the default analyzer used by Elasticsearch.
2. If analysis is not needed, just turn off the analyzer by setting not_analyzed for the fields you need.
3. To make the already-indexed data searchable this way, re-indexing is the only option.
I have briefly explained this problem and proposed two solutions here.
I have talked about multiple approaches here.
One is the use of not_analyzed to preserve the string as it is. But since that has the drawback of being case sensitive, a better approach would be to use the keyword tokenizer plus a lowercase filter.
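For example, such an analyzer could be defined like this for the older string-type mappings used above (a sketch; the analyzer name keyword_lowercase and type name your_type are made up):

{
  "settings": {
    "analysis": {
      "analyzer": {
        "keyword_lowercase": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "message": { "type": "string", "analyzer": "keyword_lowercase" }
      }
    }
  }
}

The keyword tokenizer keeps the whole value as a single term and the lowercase filter makes matching case insensitive, so term facets and aggregations return the complete string, e.g. "web-misc /etc/passwd".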