Mongodb: Pull corresponding value from another collection - mongodb-query

I have two diff collections:
CollectionA
{
"_id" : 1.0,
"1234" : "GROUP"
}
{
"_id" : 2.0,
"2345" : "SUBGROUP"
}
CollectionB
{
"_id" : 1.0,
"config" : "1234",
"description" : "DCS"
}
{
"_id" : 2.0,
"config" : "2345",
"description" : "BCS"
}
I was expecting the below output when i write a find query by joining the two collections. Can we able to get the requested output by using $lookup function?
{
"_id" : 1.0,
"config" : "GROUP",
"description" : "DCS",
}
{
"_id" : 2.0,
"config" : "SUBGROUP",
"description" : "BCS",
}

You can implement your query like this:
db.getCollection('a').aggregate([{
$lookup: {
from: 'b',
localField: '_id',
foreignField: '_id',
as: 'data'
}
},
{
$unwind: '$data'
},
{
$project: {
config: '$1234', //In your schema you may have the same key instead of 1234 and 2345
description: '$data.description'
}
}
]);

Related

How do i change MongoDB JSON data to array

I need to update the MongoDB field with the array of objects where JSON object to be updated with as an array
if I have something like this in MongoDB
"designSectionContents" : [
{
"_id" : "5bae17ecbd7595540145ec98",
"type" : "subSection",
"columns" : [
{
"0" : {
"itemId" : "5b7465980783d9a37058f160",
"type" : "field"
}
},
{
"0" : {
"itemId" : "5b7465630783d9a37058f15c",
"type" : "field"
}
},
{
"0" : {
"itemId" : "5b7465810783d9a37058f15e",
"type" : "field"
}
}
],
"subSectionContentLayout" : {
"labelPlacement" : "Top",
"columns" : 3
}
}
]
I want to change the above snippet to below in MongoDB
"designSectionContents" : [
{
"_id" : ObjectId("5bae17ecbd7595540145ec98"),
"type" : "subSection",
"columns" : [
[
{
"itemId" : "5b7465980783d9a37058f160",
"type" : "field"
}
],
[
{
"itemId" : "5b7465630783d9a37058f15c",
"type" : "field"
}
],
[
{
"itemId" : "5b7465810783d9a37058f15e",
"type" : "field"
}
]
]
}
]
curly braces opening and closing tag has to be changed to array
This should work:
db.collection.aggregate([
{
"$project": {
"designSectionContents": {
"$map": {
"input": "$designSectionContents",
"as": "designSectionContent",
"in": {
"_id": "$$designSectionContent._id",
"type": "$$designSectionContent.type",
"columns": {
"$map": {
"input": "$$designSectionContent.columns",
"as": "inp",
"in": [
"$$inp.0"
]
}
}
}
}
}
}
}
]);
Here's the working link.

Querying data from Elasticsearch

Using Elasticsearch 7.*, trying to execute SQL query on an index 'com-prod':
GET /com-prod/_search
{
"script_fields": {
"test1": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime"
}
}
}
}
It gives the output and below as one of the hit successfully:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test1" : [
"29958"
]
}
}
Now, I am trying to increment the ElapsedTime by 2, as below:
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime + 2"
}
}
}
}
But its actually adding number 2 to the output, as below:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test2" : [
"299582"
]
}
}
Please guide what could be wrong here, and how to get the output as 29960.
You are getting 299582, instead of 29960, because the ElapsedTime field is of string type ("29958"), so when you are adding 2 in this using script, 2 gets appended at the end (similar to concat two strings).
So, in order to solve this issue, you can :
Create a new index, with updated mapping of the ElaspsedTIme field of int type, then reindex the data. Then you can use the same search query as given in the question above.
Convert the string to an int type value, using Integer.parseInt()
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "Integer.parseInt(params._source.ElapsedTime) + 2"
}
}
}
}

db.find vs db.aggregation to select nested array Object

I'v tried to perform the following query :
db.getCollection('fxh').find({"username": "user1", "pf.acc.accnbr" : 915177},{userid: true, "pf.pfid": true, "pf.acc.accid":true})
and my collection is the following :
{
"_id" : ObjectId("5932fd8f381d4c0a7de21942"),
"userid" : 1496513894,
"username" : "user1",
"email" : "user1#gmail.com",
"fullname" : "User 1",
"pf" : {
"acc" : [
{
"cyc" : [
{
"det" : {
"status" : "New",
"dcycid" : 1496513941
},
"status" : "New",
"name" : "QPT202017_M1",
"cycid" : 1496513940
}
],
"status" : "New",
"accnbr" : 915177,
"accid" : 1496513939
},
{
"cyc" : [
{
"det" : {
"status" : "New",
"dcycid" : 1496552643
},
"status" : "New",
"name" : "QPT202017_S8",
"cycid" : 1496552642
}
],
"status" : "New",
"accnbr" : 73497,
"accid" : 1496552641
}
],
"pfid" : 1496513935,
},
"lastupdate" : ISODate("2017-06-03T18:18:55.080Z"),
"__v" : 0
}
When I execute the query the result is the following :
{
"_id" : ObjectId("5932fd8f381d4c0a7de21942"),
"userid" : 1496513894,
"portfolio" : {
"acc" : [
{
"accid" : 1496513939
},
{
"accid" : 1496552641
}
],
"pfid" : 1496513935
}
}
And my problem is that I need to see only the concerned accid and the result returns the all accid !.
Any idea how just to return the selected accid of accnbr ?
NB : I have also tried to add $ sign at the end of my query , it
selects the right acc but it returns the all objects or I need just
only ONE returned object.
On 6/5/17
I also used the aggregate command instead of find and it get result by using this :
db.getCollection('fxh').aggregate([ { $unwind : "$pf.acc"} , { $match : {"username":"adh1", "pf.acc.accbr": 915177 } }, {$project : {_id:0, accid: "$pf.acc.accid"}}])
But could NOT get a lower level result, when I ran this :
db.getCollection('fxh').aggregate([ { $unwind : "$pf.acc.cyc"} , { $match : {"username":"adh1", "pf.acc.accbr": 915177, "pf.acc.cyc.name": "QPT202017_M1" } }, {$project : {_id:0, cycid: "$pf.acc.cyc.cycid"}}])
Any idea ?
You can try the below aggregation pipeline.
The idea is to $unwind one nested level at a time, starting from the outermost to the innermost.
For each nested level unwinding, you can apply the$match to limit the documents and continue till you have the desired shape.
You can $group it together at the end to get back to the original shape.
db.getCollection('fxh').aggregate([
{ $match : {"username":"adh1"} },
{ $unwind : "$pf.acc"} ,
{ $match : {"pf.acc.accbr": 915177 } },
{ $unwind : "$pf.acc.cyc"},
{ $match : {"pf.acc.cyc.name": "QPT202017_M1" } },
{$project : {_id:0, accid: "$pf.acc.accid", cycid: "$pf.acc.cyc.cycid"}}])

Scoring documents in Lucene 6.2.0

My query in lucene 6.2.0 goes like:
query query = new PhraseQuery.Builder()
.add(new Term("country","russia"))
.setSlop(1)
.build();
Basically among all my documents which are:
{
"_id" : ObjectId("586b723b4b9a835db416fa26"),
"name" : "test",
"countries" : {
"country" : [
{
"name" : "russia"
},
{
"name" : "USA china"
}
]
}
}
{
"_id" : ObjectId("586b73f24b9a835fefb10ca5"),
"name" : "nitika jain",
"countries" : {
"country" : [
{
"name" : "russia and denmrk"
},
{
"name" : "USA china"
}
]
}
}
{
"_id" : ObjectId("586b744f4b9a835fefb10ca7"),
"name" : "arjun",
"countries" : {
"country" : [
{
"name" : "russia pakistan"
},
{
"name" : "india iraq"
}
]
}
}
I want a document which has only russia. Ideally it should be the one highest scored, but instead I get something like "Found 3 hits."
Document<stored,indexed,tokenized<id:586b723b4b9a835db416fa26> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:test> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia"} , { "name" : "USA china"}]}> stored,indexed,tokenized<country:russia> stored,indexed,tokenized<country:USA china>>**0.12874341**
Document<stored,indexed,tokenized<id:586b73f24b9a835fefb10ca5> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:nitika jain> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia and denmrk"} , { "name" : "USA china"}]}> stored,indexed,tokenized<country:russia and denmrk> stored,indexed,tokenized<country:USA china>>**0.12874341**
Document<stored,indexed,tokenized<id:586b744f4b9a835fefb10ca7> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<name:arjun> stored,indexed,tokenized,omitNorms,indexOptions=DOCS<countries:{ "country" : [ { "name" : "russia pakistan"} , { "name" : "india iraq"}]}> stored,indexed,tokenized<country:russia pakistan> stored,indexed,tokenized<country:india iraq>>**0.12874341**
All 3 results are equally scored. How can I get the document with only russia to be highest scored?
In Phrase queries, the slop is zero by default, requiring exact matches. that means that if you modify your query in this way:
query query = new PhraseQuery.Builder()
.add(new Term("country","russia"))
.build();
you'll get what you're looking for.

Elasticsearch Not Returning Document By Field Name

Elasticsearch newb here. I seem to be having an issue selecting documents by a certain field. It feels like a corrupt index to me, but I'm not sure.
Here is a document that I can retrieve, and get the fields event.type and event.accountId:
$ curl -XGET 'http://127.0.0.1:9200/events-2015.04.08/event/AUyYpkl-r99VdGrSLpIX?pretty=1&fields=event.type,event.accountId'
{
"_index" : "events-2015.04.08",
"_type" : "event",
"_id" : "AUyYpkl-r99VdGrSLpIX",
"_version" : 1,
"found" : true,
"fields" : {
"event.type" : [ "USER_LOGIN" ],
"event.accountId" : [ 10399 ]
}
}
Notice the event.type: USER_LOGIN. Now I want to find all documents that have this field/value combination:
curl -XGET 'http://127.0.0.1:9200/events-2015.04.08/_search?q=event.type:USER_LOGIN&pretty=1'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
No results. I can find the document by event.accountId though:
$ curl -XGET 'http://127.0.0.1:9200/events-2015.04.08/_search?q=event.accountId:10399&pretty=1'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [ {
"_index" : "events-2015.04.08",
"_type" : "event",
"_id" : "AUyYpkjCr99VdGrSLpIW",
"_score" : 1.0,
"_source": {...}
}, {
"_index" : "events-2015.04.08",
"_type" : "event",
"_id" : "AUyYpkl-r99VdGrSLpIX", # <-- This is the doc I want
"_score" : 1.0,
"_source": {...}
} ]
}
}
So is this field corrupt or something? How do I check? I expect to be able to find this document by event.type.
UPDATE
The document is being indexed with the SQS plugin to Logstash. Here is the relevant part of logstash.conf:
input {
sqs {
queue => "the_queue"
region => "us-west-2"
type => "event"
}
}
filter {
json {
source => "Message"
target => "event"
remove_field => [ "Message" ]
}
mutate {
rename => { "Type" => "EventType" }
}
date {
match => [ "Timestamp", "ISO8601" ]
}
}