How to nest $type inside $in operator in mongoDB - mongodb-query

Mongo Document sample i am working on :
{
"_id" : ObjectId("5d07c59b0e914801ae05eb40"),
"tomato" : {
"rating" : 9,
"reviews" : 54,
"fresh" : 53,
}
}
I need a query to find documents with tomato.rating with $type either "string" or "int"
Basically i want to nest $type inside $in operator
How do we do that? Any idea would help me.
Thanks.

Example:
if your collection name vegetables:
db.vegetables.find(
{ "tomato.rating" : { $type : [ "string" , "int" ] } }
)

Related

What is the default doc sequence of the result from an Elasticsearch filter request?

I recently run an Elasticsearch filter request that is
{
"from" : 0,
"size" : 10,
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"must" : {
"terms" : {
"a_id" : [ 257793, 257798, 257844 ]
}
}
}
}
}
},
"explain" : false,
"fields" : "a_id"
}
So that I can find all docs with a_id in 257793, 257798, 257844 and the results are 257844, 257798, 257793. So far so good.
Then I find that whatever the sequence of the term numbers are, the return docs are always in the same a_id order. That is, even I run
"terms" : {
"a_id" : [257798, 257844, 257793 ]
}
The result docs are in the order of 257844, 257798, 257793 as well.
So I am so curious about the mechanism behind the Elasticsearch filtering. Can anyone help me and give me a hint?
By default, ES returns in descending order of _score. You can provide the sort option, to say in which order and based on what you want the results to be returned. For e.g., for based on date field
{
"sort": { "date": { "order": "desc" }}
"query" : {
"term" : { "user" : "kimchy" }
}
}
You can get more information:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_sorting.html

Errors doing mongodb nested query

I am having a little trouble with nested queries in mongodb.
I have a collection with the following structure --
{
"_id" : Objectid(..),
"result" : {
"name" : nameValue,
"reference" : base64Value,
"city" : cityValue
}
}
Now I am to do two queries in the mongo shell -
search for a specific reference value (so query for equality)
I am using the following query -
db.TestCollection.find("result.reference" : a3d245e343 }
but I get nothing when I know the record is there in the collection
search and print for all city values.
I am looking to print something like this--
{ "city": "new york city" }
{ "city" : "brooklyn" }
... etc
For this I use this query --
db.TestCollection.find( {}, {"results.city", 1} )
For this I do not get the output I was hoping for but only get a list of all "_id" values like this --
{ "_id" : ObjectId("52e466bd562bdb7b1b320d1d") }
{ "_id" : ObjectId("52e466be562bdb7b1b320d1e") }
{ "_id" : ObjectId("52e466be562bdb7b1b320d1f") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d20") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d21") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d22") }
{ "_id" : ObjectId("52e466c0562bdb7b1b320d23") }
{ "_id" : ObjectId("52e466c0562bdb7b1b320d24") }
{ "_id" : ObjectId("52e466c1562bdb7b1b320d25") }
{ "_id" : ObjectId("52e466c1562bdb7b1b320d26") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d27") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d28") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d29") }
{ "_id" : ObjectId("52e466c3562bdb7b1b320d2a") }
{ "_id" : ObjectId("52e466c3562bdb7b1b320d2b") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2c") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2d") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2e") }
{ "_id" : ObjectId("52e466c5562bdb7b1b320d2f") }
{ "_id" : ObjectId("52e466c5562bdb7b1b320d30") }
has more
What am I doing wrong?
I know there are lot of questions regarding queries but I am still wrapping my head around the whole idea. Thanks for helping a newbie out.
I found the issue with both the commands -
For some reason the condition should be given in single quotes like this --
db.TestCollection.find('result.reference' : a3d245e343 }
I thought I can use double quotes in there but looks like I am wrong.
I was using a comma instead of a colon in the projector. The correct answer is -
db.TestCollection.find( {}, {"results.city" : 1} )

How to query mongodb with “like” for number data type? [duplicate]

I want to regex search an integer value in MongoDB. Is this possible?
I'm building a CRUD type interface that allows * for wildcards on the various fields. I'm trying to keep the UI consistent for a few fields that are integers.
Consider:
> db.seDemo.insert({ "example" : 1234 });
> db.seDemo.find({ "example" : 1234 });
{ "_id" : ObjectId("4bfc2bfea2004adae015220a"), "example" : 1234 }
> db.seDemo.find({ "example" : /^123.*/ });
>
As you can see, I insert an object and I'm able to find it by the value. If I try a simple regex, I can't actually find the object.
Thanks!
If you are wanting to do a pattern match on numbers, the way to do it in mongo is use the $where expression and pass in a pattern match.
> db.test.find({ $where: "/^123.*/.test(this.example)" })
{ "_id" : ObjectId("4bfc3187fec861325f34b132"), "example" : 1234 }
I am not a big fan of using the $where query operator because of the way it evaluates the query expression, it doesn't use indexes and the security risk if the query uses user input data.
Starting from MongoDB 4.2 you can use the $regexMatch|$regexFind|$regexFindAll available in MongoDB 4.1.9+ and the $expr to do this.
let regex = /123/;
$regexMatch and $regexFind
db.col.find({
"$expr": {
"$regexMatch": {
"input": {"$toString": "$name"},
"regex": /123/
}
}
})
$regexFinAll
db.col.find({
"$expr": {
"$gt": [
{
"$size": {
"$regexFindAll": {
"input": {"$toString": "$name"},
"regex": "123"
}
}
},
0
]
}
})
From MongoDB 4.0 you can use the $toString operator which is a wrapper around the $convert operator to stringify integers.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toString": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
If what you want is retrieve all the document which contain a particular substring, starting from release 3.4, you can use the $redact operator which allows a $conditional logic processing.$indexOfCP.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toLower": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
which produces:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334
}
Prior to MongoDB 3.4, you need to $project your document and add another computed field which is the string value of your number.
The $toLower and his sibling $toUpper operators respectively convert a string to lowercase and uppercase but they have a little unknown feature which is that they can be used to convert an integer to string.
The $match operator returns all those documents that match your pattern using the $regex operator.
db.seDemo.aggregate(
[
{ "$project": {
"stringifyExample": { "$toLower": "$example" },
"example": 1
}},
{ "$match": { "stringifyExample": /^123.*/ } }
]
)
which yields:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234,
"stringifyExample" : "1234"
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334,
"stringifyExample" : "12334"
}
Now, if what you want is retrieve all the document which contain a particular substring, the easier and better way to do this is in the upcoming release of MongoDB (as of this writing) using the $redact operator which allows a $conditional logic processing.$indexOfCP.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toLower": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])

How to find out result of elasticsearch parsing a query_string?

Is there a way to find out via the elasticsearch API how a query string query is actually parsed? You can do that manually by looking at the lucene query syntax, but it would be really nice if you could look at some representation of the actual results the parser has.
As javanna mentioned in comments there's _validate api. Here's what works on my local elastic (version 1.6):
curl -XGET 'http://localhost:9201/pl/_validate/query?explain&pretty' -d'
{
"query": {
"query_string": {
"query": "a OR (b AND c) OR (d AND NOT(e or f))",
"default_field": "t"
}
}
}
'
pl is name of index on my cluster. Different index could have different analyzers, that's why query validation is executed in a scope of an index.
The result of the above curl is following:
{
"valid" : true,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"explanations" : [ {
"index" : "pl",
"valid" : true,
"explanation" : "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter#ce2d82f1)"
} ]
}
I made one OR lowercase on purpose and as you can see in explanation, it is interpreted as a token and not as a operator.
As for interpretation of the explanation. Format is similar to +- operators of query string query:
( and ) characters start and end bool query
+ prefix means clause that will be in must
- prefix means clause that will be in must_not
no prefix means that it will be in should (with default_operator equal to OR)
So above will be equivalent to following:
{
"bool" : {
"should" : [
{
"term" : { "t" : "a" }
},
{
"bool": {
"must": [
{
"term" : { "t" : "b" }
},
{
"term" : { "t" : "c" }
}
]
}
},
{
"bool": {
"must": {
"term" : { "t" : "d" }
},
"must_not": {
"bool": {
"should": [
{
"term" : { "t" : "e" }
},
{
"term" : { "t" : "or" }
},
{
"term" : { "t" : "f" }
}
]
}
}
}
}
]
}
}
I used _validate api quite heavily to debug complex filtered queries with many conditions. It is especially useful if you want to check how analyzer tokenized input like an url or if some filter is cached.
There's also an awesome parameter rewrite that I was not aware of until now, which causes the explanation to be even more detailed showing the actual Lucene query that will be executed.

ElasticSearch for Attribute(Key) value data set

I am using Elasticsearch with Haystacksearch and Django and want to search the follow structure:
{
{
"title": "book1",
"category" : ["Cat_1", "Cat_2"],
"key_values" :
[
{
"key_name" : "key_1",
"value" : "sample_value_1"
},
{
"key_name" : "key_2",
"value" : "sample_value_12"
}
]
},
{
"title": "book2",
"category" : ["Cat_3", "Cat_2"],
"key_values" :
[
{
"key_name" : "key_1",
"value" : "sample_value_1"
},
{
"key_name" : "key_3",
"value" : "sample_value_6"
},
{
"key_name" : "key_4",
"value" : "sample_value_5"
}
]
}
}
Right now I have set up an index model using Haystack with a "text" that put all the data together and runs a full text search! In my opinion this is not the a well established search 'cause I am not using my data set structure and hence this is some kind odd.
As an example if for an object I have a key-value
{
"key_name": "key_1",
"value": "sample_value_1"
}
and for another object I have
{
"key_name": "key_2",
"value": "sample_value_1"
}
and we it gets a query like "Key_1 sample_value_1" comes I get a thoroughly mixed result of objects who have these words in their fields rather than using their structures.
P.S. I am totally new to ElasticSearch and better to say new to the search technologies and challenges. I have searched the web and SO button didn't find anything satisfying. Please let me know if there is something wrong with my thoughts and expectations from these search engines and if there is SO duplicate question! And also if there is a better approach to design a database for this kind of search
Read the es docs on nested mappings and do something like this:
"book_type" : {
"properties" : {
// title, cat mappings
"key_values" : {
"type" : "nested"
"properties": {
"key_name": {
"type": "string", "index": "not_analyzed"
},
"value": {
"type": "string"
}
}
}
}
}
Then query using a nested query
"nested" : {
"path" : "key_values",
"query" : {
"bool" : {
"must" : [
{
"term" : {"key_values.key_name" : "key_1"}
},
{
"match" : {"key_values.value" : "sample_value_1"}
}
]
}
}
}