Errors doing mongodb nested query - mongodb-query

I am having a little trouble with nested queries in mongodb.
I have a collection with the following structure --
{
"_id" : Objectid(..),
"result" : {
"name" : nameValue,
"reference" : base64Value,
"city" : cityValue
}
}
Now I am to do two queries in the mongo shell -
search for a specific reference value (so query for equality)
I am using the following query -
db.TestCollection.find("result.reference" : a3d245e343 }
but I get nothing when I know the record is there in the collection
search and print for all city values.
I am looking to print something like this--
{ "city": "new york city" }
{ "city" : "brooklyn" }
... etc
For this I use this query --
db.TestCollection.find( {}, {"results.city", 1} )
For this I do not get the output I was hoping for but only get a list of all "_id" values like this --
{ "_id" : ObjectId("52e466bd562bdb7b1b320d1d") }
{ "_id" : ObjectId("52e466be562bdb7b1b320d1e") }
{ "_id" : ObjectId("52e466be562bdb7b1b320d1f") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d20") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d21") }
{ "_id" : ObjectId("52e466bf562bdb7b1b320d22") }
{ "_id" : ObjectId("52e466c0562bdb7b1b320d23") }
{ "_id" : ObjectId("52e466c0562bdb7b1b320d24") }
{ "_id" : ObjectId("52e466c1562bdb7b1b320d25") }
{ "_id" : ObjectId("52e466c1562bdb7b1b320d26") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d27") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d28") }
{ "_id" : ObjectId("52e466c2562bdb7b1b320d29") }
{ "_id" : ObjectId("52e466c3562bdb7b1b320d2a") }
{ "_id" : ObjectId("52e466c3562bdb7b1b320d2b") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2c") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2d") }
{ "_id" : ObjectId("52e466c4562bdb7b1b320d2e") }
{ "_id" : ObjectId("52e466c5562bdb7b1b320d2f") }
{ "_id" : ObjectId("52e466c5562bdb7b1b320d30") }
has more
What am I doing wrong?
I know there are lot of questions regarding queries but I am still wrapping my head around the whole idea. Thanks for helping a newbie out.

I found the issue with both the commands -
For some reason the condition should be given in single quotes like this --
db.TestCollection.find('result.reference' : a3d245e343 }
I thought I can use double quotes in there but looks like I am wrong.
I was using a comma instead of a colon in the projector. The correct answer is -
db.TestCollection.find( {}, {"results.city" : 1} )

Related

Querying data from Elasticsearch

Using Elasticsearch 7.*, trying to execute SQL query on an index 'com-prod':
GET /com-prod/_search
{
"script_fields": {
"test1": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime"
}
}
}
}
It gives the output and below as one of the hit successfully:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test1" : [
"29958"
]
}
}
Now, I am trying to increment the ElapsedTime by 2, as below:
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime + 2"
}
}
}
}
But its actually adding number 2 to the output, as below:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test2" : [
"299582"
]
}
}
Please guide what could be wrong here, and how to get the output as 29960.
You are getting 299582, instead of 29960, because the ElapsedTime field is of string type ("29958"), so when you are adding 2 in this using script, 2 gets appended at the end (similar to concat two strings).
So, in order to solve this issue, you can :
Create a new index, with updated mapping of the ElaspsedTIme field of int type, then reindex the data. Then you can use the same search query as given in the question above.
Convert the string to an int type value, using Integer.parseInt()
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "Integer.parseInt(params._source.ElapsedTime) + 2"
}
}
}
}

Mapping ElasticSearch apache module field

I am new to ES and I am facing a little problem I am struggling with.
I integrated metricbeat apache module with ES and the it works fine.
The problem is that metricbeat apache module reports the KB of web traffic of apache (field apache.status.total_kbytes), instead I would like to create my own field, the name of which would be "apache.status.total_mbytes).
I am trying to create a new mapping via Dev Console using the followind api commands:
PUT /metricbeat-7.2.0/_mapping
{
"settings":{
},
"mappings" : {
"apache.status.total_mbytes" : {
"full_name" : "apache.status.total_mbytes",
"mapping" : {
"total_mbytes" : {
"type" : "long"
}
}
}
}
}
Still ES returns the following error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
},
"status" : 400
}
FYI
The following may shed some light
GET /metricbeat-*/_mapping/field/apache.status.total_kbytes
Returns
{
"metricbeat-7.9.2-2020.10.06-000001" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
},
"metricbeat-7.2.0-2020.10.05-000001" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
}
}
What am I missing? Is the _mapping command wrong?
Thanks in advance,
A working example:
Create new index
PUT /metricbeat-7.2.0
{
"settings": {},
"mappings": {
"properties": {
"apache.status.total_kbytes": {
"type": "long"
}
}
}
}
Then GET metricbeat-7.2.0/_mapping/field/apache.status.total_kbytes will result in (same as your example):
{
"metricbeat-7.2.0" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
}
}
Now if you want to add a new field to an existing mapping use the API this way:
Update an existing index
PUT /metricbeat-7.2.0/_mapping
{
"properties": {
"total_mbytes": {
"type": "long"
}
}
}
Then GET metricbeat-7.2.0/_mapping will show you the updated mapping:
{
"metricbeat-7.2.0" : {
"mappings" : {
"properties" : {
"apache" : {
"properties" : {
"status" : {
"properties" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
},
"total_mbytes" : {
"type" : "long"
}
}
}
}
}
Also, take a look at Put Mapping Api

db.find vs db.aggregation to select nested array Object

I'v tried to perform the following query :
db.getCollection('fxh').find({"username": "user1", "pf.acc.accnbr" : 915177},{userid: true, "pf.pfid": true, "pf.acc.accid":true})
and my collection is the following :
{
"_id" : ObjectId("5932fd8f381d4c0a7de21942"),
"userid" : 1496513894,
"username" : "user1",
"email" : "user1#gmail.com",
"fullname" : "User 1",
"pf" : {
"acc" : [
{
"cyc" : [
{
"det" : {
"status" : "New",
"dcycid" : 1496513941
},
"status" : "New",
"name" : "QPT202017_M1",
"cycid" : 1496513940
}
],
"status" : "New",
"accnbr" : 915177,
"accid" : 1496513939
},
{
"cyc" : [
{
"det" : {
"status" : "New",
"dcycid" : 1496552643
},
"status" : "New",
"name" : "QPT202017_S8",
"cycid" : 1496552642
}
],
"status" : "New",
"accnbr" : 73497,
"accid" : 1496552641
}
],
"pfid" : 1496513935,
},
"lastupdate" : ISODate("2017-06-03T18:18:55.080Z"),
"__v" : 0
}
When I execute the query the result is the following :
{
"_id" : ObjectId("5932fd8f381d4c0a7de21942"),
"userid" : 1496513894,
"portfolio" : {
"acc" : [
{
"accid" : 1496513939
},
{
"accid" : 1496552641
}
],
"pfid" : 1496513935
}
}
And my problem is that I need to see only the concerned accid and the result returns the all accid !.
Any idea how just to return the selected accid of accnbr ?
NB : I have also tried to add $ sign at the end of my query , it
selects the right acc but it returns the all objects or I need just
only ONE returned object.
On 6/5/17
I also used the aggregate command instead of find and it get result by using this :
db.getCollection('fxh').aggregate([ { $unwind : "$pf.acc"} , { $match : {"username":"adh1", "pf.acc.accbr": 915177 } }, {$project : {_id:0, accid: "$pf.acc.accid"}}])
But could NOT get a lower level result, when I ran this :
db.getCollection('fxh').aggregate([ { $unwind : "$pf.acc.cyc"} , { $match : {"username":"adh1", "pf.acc.accbr": 915177, "pf.acc.cyc.name": "QPT202017_M1" } }, {$project : {_id:0, cycid: "$pf.acc.cyc.cycid"}}])
Any idea ?
You can try the below aggregation pipeline.
The idea is to $unwind one nested level at a time, starting from the outermost to the innermost.
For each nested level unwinding, you can apply the$match to limit the documents and continue till you have the desired shape.
You can $group it together at the end to get back to the original shape.
db.getCollection('fxh').aggregate([
{ $match : {"username":"adh1"} },
{ $unwind : "$pf.acc"} ,
{ $match : {"pf.acc.accbr": 915177 } },
{ $unwind : "$pf.acc.cyc"},
{ $match : {"pf.acc.cyc.name": "QPT202017_M1" } },
{$project : {_id:0, accid: "$pf.acc.accid", cycid: "$pf.acc.cyc.cycid"}}])

Scoring documents in Lucene 6.2.0

My query in lucene 6.2.0 goes like:
query query = new PhraseQuery.Builder()
.add(new Term("country","russia"))
.setSlop(1)
.build();
Basically among all my documents which are:
{
"_id" : ObjectId("586b723b4b9a835db416fa26"),
"name" : "test",
"countries" : {
"country" : [
{
"name" : "russia"
},
{
"name" : "USA china"
}
]
}
}
{
"_id" : ObjectId("586b73f24b9a835fefb10ca5"),
"name" : "nitika jain",
"countries" : {
"country" : [
{
"name" : "russia and denmrk"
},
{
"name" : "USA china"
}
]
}
}
{
"_id" : ObjectId("586b744f4b9a835fefb10ca7"),
"name" : "arjun",
"countries" : {
"country" : [
{
"name" : "russia pakistan"
},
{
"name" : "india iraq"
}
]
}
}
I want a document which has only russia. Ideally it should be the one highest scored, but instead I get something like "Found 3 hits."
Document<stored,indexed,tokenized<id:586b723b4b9a835db416fa26> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:test> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia"} , { "name" : "USA china"}]}> stored,indexed,tokenized<country:russia> stored,indexed,tokenized<country:USA china>>**0.12874341**
Document<stored,indexed,tokenized<id:586b73f24b9a835fefb10ca5> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:nitika jain> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia and denmrk"} , { "name" : "USA china"}]}> stored,indexed,tokenized<country:russia and denmrk> stored,indexed,tokenized<country:USA china>>**0.12874341**
Document<stored,indexed,tokenized<id:586b744f4b9a835fefb10ca7> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:arjun> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia pakistan"} , { "name" : "india iraq"}]}> stored,indexed,tokenized<country:russia pakistan> stored,indexed,tokenized<country:india iraq>>**0.12874341**
All 3 results are equally scored. How can I get the document with only russia to be highest scored?
In Phrase queries, the slop is zero by default, requiring exact matches. that means that if you modify your query in this way:
query query = new PhraseQuery.Builder()
.add(new Term("country","russia"))
.build();
you'll get what you're looking for.

What is the default doc sequence of the result from an Elasticsearch filter request?

I recently run an Elasticsearch filter request that is
{
"from" : 0,
"size" : 10,
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"must" : {
"terms" : {
"a_id" : [ 257793, 257798, 257844 ]
}
}
}
}
}
},
"explain" : false,
"fields" : "a_id"
}
So that I can find all docs with a_id in 257793, 257798, 257844 and the results are 257844, 257798, 257793. So far so good.
Then I find that whatever the sequence of the term numbers are, the return docs are always in the same a_id order. That is, even I run
"terms" : {
"a_id" : [257798, 257844, 257793 ]
}
The result docs are in the order of 257844, 257798, 257793 as well.
So I am so curious about the mechanism behind the Elasticsearch filtering. Can anyone help me and give me a hint?
By default, ES returns in descending order of _score. You can provide the sort option, to say in which order and based on what you want the results to be returned. For e.g., for based on date field
{
"sort": { "date": { "order": "desc" }}
"query" : {
"term" : { "user" : "kimchy" }
}
}
You can get more information:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_sorting.html