Joining multiple collections in different databases with $lookup - mongodb-query

I am a beginner to Mongo. I want to simulate an inner join using the aggregate $lookup and I have 3 collections ( 1 in a separate database ) I want to see all the projects that a user is part of can someone give me an example?
Here are the 3 collections
"projects.details"
{
"_id" : ObjectId("5684f3c454b1fd6926c324fd"),
"projName" : "I am a test project",
"active" : "true"
"projId" : "project1"
}
"userDetails.projMembership"
{
"_id" : ObjectId("56d82612b63f1c31cf906003"),
"projId" : "project1",
"userId" : "user1",
"status" : "Invite"
}
"userDetails.details"
{
"_id" : ObjectId("56d82612b63f1c31cf906003"),
"userId" : "user1",
"email" : "user1#somemail.com"
}

After searching far and wide for an answer, The answer is simple This is NOT possible in Mongo. The only way to achieve the result I was looking for is to perform a $lookup for the 2 collections in the userDetails DB, store the results in an array, then perform another lookup against the collection in the projects DB. Hope this helps someone out there.

Related

Karate - Conditional JSON schema validation

I am just wondering how can I do conditional schema validation. The API response is dynamic based on customerType key. If customerType is person then, person details will be included and if the customerType is org organization details will be included in the JSON response. So the response can be in either of the following forms
{
"customerType" : "person",
"person" : {
"fistName" : "A",
"lastName" : "B"
},
"id" : 1,
"requestDate" : "2021-11-11"
}
{
"customerType" : "org",
"organization" : {
"orgName" : "A",
"orgAddress" : "B"
},
"id" : 2,
"requestDate" : "2021-11-11"
}
The schema I created to validate above 2 scenario is as follows
{
"customerType" : "#string",
"organization" : "#? response.customerType=='org' ? karate.match(_,personSchema) : karate.match(_,null)",
"person" : "#? response.customerType=='person' ? karate.match(_,orgSchema) : karate.match(_,null)",
"id" : "#number",
"requestDate" : "#string"
}
but the schema fails to match with the actual response. What changes should I make in the schema to make it work?
Note : I am planning to reuse the schema in multiple tests so I will be keeping the schema in separate files, independent of the feature file
Can you refer to this answer which I think is the better approach: https://stackoverflow.com/a/47336682/143475
That said, I think you missed that the JS karate.match() API doesn't return a boolean, but a JSON that contains a pass boolean property.
So you have to do things like this:
* def someVar = karate.match(actual, expected).pass ? {} : {}

How do I query an array in MongoDB?

I have been trying multiple queries but still can't figure it out. I have multiple documents that look like this:
{
"_id" : ObjectId("5b51f519a33e7f54161a0efb"),
"assigneesEmail" : [
"felipe#gmail.com"
],
"organizationId" : "5b4e0de37accb41f3ac33c00",
"organizationName" : "PaidUp Volleyball Club",
"type" : "athlete",
"firstName" : "Mylo",
"lastName" : "Fernandes",
"description" : "",
"status" : "active",
"createOn" : ISODate("2018-07-20T14:43:37.610Z"),
"updateOn" : ISODate("2018-07-20T14:43:37.610Z"),
"__v" : 0
}
I need help writing a query where I can find this document by looking up the email in any part of the array element assigneeEmail. Any suggestions? I have tried $elemMatch but still could not get it to work.
Looks like my query was just incorrect. I figured it out.

How to prevent Facet Terms from tokenizing

I am using Facet Terms to get all the unique values and their count for a field. And I am getting wrong results.
term: web
Count: 1191979
term: misc
Count: 1191979
term: passwd
Count: 1191979
term: etc
Count: 1191979
While the actual result should be:
term: WEB-MISC /etc/passwd
Count: 1191979
Here is my sample query:
{
"facets": {
"terms1": {
"terms": {
"field": "message"
}
}
}
}
If reindexing is an option, it would be the best to change mapping and mark this fields as not_analyzed
"your_field" : { "type": "string", "index" : "not_analyzed" }
You can use multi field type if keeping an analyzed version of the field is desired:
"your_field" : {
"type" : "multi_field",
"fields" : {
"your_field" : {"type" : "string", "index" : "analyzed"},
"untouched" : {"type" : "string", "index" : "not_analyzed"}
}
}
This way, you can continue using your_field in the queries, while running facet searches using your_field.untouched.
Alternatively, if this field is stored, you can use a script field facet instead:
"facets" : {
"term" : {
"terms" : {
"script_field" : "_fields.your_field.value"
}
}
}
As the last resort, if this field is not stored, but record source is stored in the index, you can try this:
"facets" : {
"term" : {
"terms" : {
"script_field" : "_source.your_field"
}
}
}
The first solution is the most efficient. The last solution is the least efficient and may take a lot of time on a large index.
Wow, I also got this same issue today while term aggregating in the recent elastic-search. After googling and some partial understanding, found how this geeky indexing works(which is very simple).
Queries can find only terms that actually exist in the inverted index
When you index the following string
"WEB-MISC /etc/passwd"
it will be passed to an analyzer. The analyzer might tokenize it into
"WEB", "MISC", "etc" and "passwd"
with its position details. And this tokens might filtered to lowercase such as
"web", "misc", "etc" and "passwd"
So, after indexing,the search query can see the above 4 only. not the complete word "WEB-MISC /etc/passwd". For your requirement the following are my options you can use
1.Change the Default Analyzer used by elasticsearch([link][1])
2.If it is not need, just TurnOff the analyzer by setting 'not_analyzed' for the fields you need
3.To convert the already indexed data searchable, re-indexing is the only option
I have briefly explained this problem and proposed two solutions here.
I have talked about multiple approaches here.
One is use of not_analyzed to preserve the string as it is. But then as it has the drawback of being case insensitive , a better approach would be use keyword tokenizer + lowercase filter

Two "id" fields in one MongoDB collection with Rails 3?

I've got a Rails 3.0.9 project using the latest version of MongoDB and Mongoid 2.2.
I imported a CSV with an "id" field into a MongoDB collection named College, resulting in a collection like so:
{ "_id" : ObjectID("abc123"), "id" : ######, ... }
Observations:
The show action results in a URL utilizing the ObjectID
Displaying 'college.id' in index.html.erb displays the ObjectID
Questions:
How do I use the original "id" field as the parameter
Is "id" reserved by MongoDB, meaning I need to rename the "id" field in the
College collection (perhaps to "code") - if so, how?
Thanks!
Update
Answer:
db.colleges.update( { "name" : { $exists : true } } , { $rename : { "id" : "code" } }, false, true )
I used "name" since that was a field I could check for the existence.
_id is a reserved and required property in MongoDB - I think mongoid is mapping id to _id since that makes sense. There might be a way to access the id property through mongoid but I think you are much better off renaming the id column to something else to avoid confusion in the future.
{ $rename : { old_field_name : new_field_name } }
will rename the field name in a document (mongo 1.7.2+).
so
db.college.update({ "_id" : { $exists : true }}, { $rename : { 'id' : 'code' } }, false, true);
should update every record in that collection and rename the id field to code.
(obviously test this before running in any important data)

In MongoDB, is there anyway to tell what index is on a collection besides using coll.find({...}).explain()?

I think explain() will tell any possible index it can use. How about just showing all the indexes defined on the collection? (or even for the whole db?)
>db.system.indexes.find();
>db.system.indexes.find( { ns: "tablename" } );
will give you something like
{
"ns" : "test.fs.chunks",
"key" : { "files_id" : 1, "n" : 1 },
"name" : "files_id_1_n_1"
}
for every index (ns is the collection name).
Or use the collection name. I.e., if you have a users collection do:
db.users.getIndexes()