how to query embedded document using mongodb - sql

Need help constructing this mongo query.
So far I can query at the first level, but I'm unable to do so at the next embedded level ("labels" > "2").
For example, the document structure looks like this:
> db.versions_20170420.findOne();
{
    "_id" : ObjectId("54bf146b77ac503bbf0f0130"),
    "account" : "foo",
    "labels" : {
        "1" : {
            "name" : "one",
            "color" : "color1"
        },
        "2" : {
            "name" : "two",
            "color" : "color2"
        },
        "3" : {
            "name" : "three",
            "color" : "color3"
        }
    },
    "profile" : "bar",
    "version" : NumberLong("201412192106")
}
With this query I can filter at the first level (account, profile):
db.profile_versions_20170420.find({"account":"foo", "profile": "bar"}).pretty()
However, given this structure, I'm looking for documents where "labels" > "2". It doesn't look like "2" is a number but a string. Is there a way to construct the mongo query to do that? Do I need to do some conversion?

If I understand you and your data structure correctly, "labels" > "2" means that the labels object must contain the property labels.3, which is easy to check with the following query:
db.profile_versions_20170420.find(
{"account": "foo", "profile": "bar", "labels.3": {$exists: true}}
).pretty();
But that doesn't mean your object contains at least 3 properties: this is not $size, which counts the elements of an array, and we cannot use $size here because labels is an object, not an array. So in this case we only know that labels has the property 3, even if that is the only property it contains.
You can improve the find criteria:
db.profile_versions_20170420.find({
"account": "foo",
"profile": "bar",
"labels.1": {$exists: true},
"labels.2": {$exists: true},
"labels.3": {$exists: true}
}).pretty();
and ensure that labels contains the elements 1, 2 and 3, but in this case you have to take care of the object structure at the application level whenever you insert, update or delete data in the document.
As another option, you can update your db and add an extra field labelsCount; after that you will be able to run a query like this:
db.profile_versions_20170420.find(
{"account": "foo", "profile": "bar", "labelsConut": {$gt: 2}}
).pretty();
By the way, it will also work faster...
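If you go with the labelsCount approach, here is a minimal backfill sketch for the mongo shell (it assumes every document stores labels as an object, and that the collection name above is the real one):
db.profile_versions_20170420.find({"labels": {$exists: true}}).forEach(function(doc) {
    // Object.keys returns the property names of the labels object ("1", "2", "3", ...)
    db.profile_versions_20170420.update(
        {_id: doc._id},
        {$set: {labelsCount: Object.keys(doc.labels).length}}
    );
});
// optional: an index keeps the {labelsCount: {$gt: 2}} filter fast
db.profile_versions_20170420.createIndex({account: 1, profile: 1, labelsCount: 1});
After that you only need to keep labelsCount in sync in your application code whenever labels changes.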


Mongodb query problem, how to get the matching items of the $or operator

First of all, thank you.
MongoDB version: 4.2.11
I have a piece of data like this:
{
    "name":...,
    ...
    "administration" : [
        {"name":...,"job":...},
        {"name":...,"job":...}
    ],
    "shareholder" : [
        {"name":...,"proportion":...},
        {"name":...,"proportion":...}
    ]
}
I want to match some specified data through regular expressions. For example:
db.collection.aggregate([
    {"$match":
        {
            "$or":
            [
                {"name": {"$regex": "Keyword"}},
                {"administration.name": {"$regex": "Keyword"}},
                {"shareholder.name": {"$regex": "Keyword"}}
            ]
        }
    }
])
I want to set a flag when the $or operator successfully matches any condition, represented by a custom field. For example, when {"name": {"$regex": "Keyword"}} matches, execute:
{"$project" :
{
"_id":false,
"name" : true,
"__regex_type__" : "name"
}
},
{"administration.name" : {"$regex": "Keyword"}}Execute on success:"__regex_type__" : "administration.name"
I tried this:
{"$project" :
{
"_id":false,
"name" : true,
"__regex_type__" :
{
"$switch":
{
"branches":
[
{"case": {"$regexMatch":{"input":"$name","regex": "Keyword"}},"then" : "name"},
{"case": {"$regexMatch":{"input":"$administration.name","regex": "Keyword"}},"then" : "administration.name"},
{"case": {"$regexMatch":{"input":"$shareholder.name","regex": "Keyword"}},"then" : "shareholder.name"},
],
"default" : "Other matches"
}
}
}
},
But $regexMatch cannot match against an array. I also tried $unwind, but that returned as many documents as there are array members, which defeats my original purpose.
I want to implement in MongoDB the equivalent of this MySQL statement:
SELECT name,administration.name,shareholder.name,(
CASE
WHEN name REGEXP("Keyword") THEN "name"
WHEN administration.name REGEXP("Keyword") THEN "administration.name"
WHEN shareholder.name REGEXP("Keyword") THEN "shareholder.name"
END
)AS __regex_type__ FROM db.mytable WHERE
name REGEXP("Keyword") OR
shareholder.name REGEXP("Keyword") OR
administration.name REGEXP("Keyword");
Maybe this method is stupid, but I don’t have a better solution.
If you have a better solution, I would appreciate it!!!
Thank you!!!
Since $regexMatch does not handle arrays, use $filter to run $regexMatch against the individual array elements, then use $size to see how many elements matched. Here is the pipeline (shown in Ruby driver syntax) followed by the documents it returns:
[{"$match"=>{"$or"=>[{"a"=>"test"}, {"arr.a"=>"test"}]}},
{"$project"=>
{"a"=>1,
"arr"=>1,
"src"=>
{"$switch"=>
{"branches"=>
[{"case"=>{"$regexMatch"=>{"input"=>"$a", "regex"=>"test"}},
"then"=>"a"},
{"case"=>
{"$gte"=>
[{"$size"=>
{"$filter"=>
{"input"=>"$arr.a",
"cond"=>
{"$regexMatch"=>{"input"=>"$$this", "regex"=>"test"}}}}},
1]},
"then"=>"arr.a"}],
"default"=>"def"}}}}]
[{"_id"=>BSON::ObjectId('5ffb2df748966813f82f15ad'), "a"=>"test", "src"=>"a"},
{"_id"=>BSON::ObjectId('5ffb2df748966813f82f15ae'),
"arr"=>[{"a"=>"test"}],
"src"=>"arr.a"}]

Cloudant search document by attributes of nested objects

My documents in cloudant have the following structure
{
    "_id" : "1234",
    "name" : "test",
    "objects" : [
        {
            "type" : "TYPE1",
            "time" : "1215"
        },
        {
            "type" : "TYPE2",
            "time" : "1115"
        }
    ]
}
Now I need to query my documents by a list of types.
Examples
1) If I query with TYPE1, then all documents that have an object with this type would be returned. (The example doc would return.)
2) If I query with TYPE1 and TYPE3, it would return all documents which contain either of them. (The example doc would return.)
3) If I query with TYPE3, TYPE4 and TYPE5, it would return all documents which contain either of them. (The example doc would not return.)
What would the code in the _design document look like, and what would my API request look like?
One option is to use Cloudant Search.
Here is a sample design document named types, which indexes each type property in your objects array:
{
    "_id": "_design/types",
    "views": {},
    "language": "javascript",
    "indexes": {
        "one-of": {
            "analyzer": "standard",
            "index": "function (doc) {\n for(var i in doc.objects) {\n index(\"type\", doc.objects[i].type); \n }\n}"
        }
    }
}
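For readability, the escaped index string above is this JavaScript function; it emits the type of every element of doc.objects under the search field "type":
// Index the "type" of every element of doc.objects so it can be queried as type:<value>
function (doc) {
    for (var i in doc.objects) {
        index("type", doc.objects[i].type);
    }
}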
Query examples:
Search for one key (type=val)
GET https://$HOST/$DATABASE/_design/$DDOC/_search/one-of?q=type%3ATYPE1
Search for multiple keys (type=val1 OR type=val2)
GET https://$HOST/$DATABASE/_design/$DDOC/_search/one-of?q=type%3ATYPE1%20OR%20type%3ATYPE2
Search for multiple keys (type=val1 AND type=val2)
GET https://$HOST/$DATABASE/_design/$DDOC/_search/one-of?q=type%3ATYPE1%20AND%20type%3ATYPE2
To include the documents in the response append &include_docs=true.
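For example, the second scenario from the question (TYPE1 or TYPE3) with the matching documents included would look like this, substituting your own host and database:
GET https://$HOST/$DATABASE/_design/types/_search/one-of?q=type%3ATYPE1%20OR%20type%3ATYPE3&include_docs=true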

Mongo adding an object to original object

I am not sure if I am asking the correct question but I assume this is just a basic mongodb question.
I currently have this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2"
}
and I would like to add an exercise object like this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises": {
"benchpress",
"rows",
"curls",
}
I am just unsure how to create exercises on the object without using $push, which creates an array. I don't want an array, I want an object.
Any help would be greatly appreciated.
An object is a set of key-value pairs. In your representation of the second document, you have a nested document exercises as a key whose value contains only strings. Don't you see something strange there? An object without keys?
It should probably be an array of strings. Note that an array is itself an object where the key is the numeric index starting from 0 and the value is the string at that position.
(You also have an extra comma and a missing curly brace. Let's fix that.)
This is the document we wish to see after updating the document.
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises": [
"benchpress",
"rows",
"curls"
]
}
Now, back to your question: how can we update the existing document with the exercises field? It's pretty simple. MongoDB has an update() method which does exactly that. Since we don't want to replace the entire document but just add additional fields, we should use $set to update specific fields. Fire up the mongo shell and switch to your database using use db-name, then execute the following command. I assume you have an existing document with the id ObjectId("57af98d4d71c4efff5304335"). Note that ObjectId is a BSON datatype.
db.scratch.update({ "_id" : ObjectId("57af98d4d71c4efff5304335") }, { $set: {"exercises": ["benchpress", "rows", "curls"] } })
This will update the document to:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises" : [
"benchpress",
"rows",
"curls"
]
}
Here scratch refers to the collection name. The update method takes 3 parameters (see the sketch after this list):
A query to find the document to update.
The update parameter (the document to update with). You can either replace the whole document or update just specific parts of it (using $set).
An optional options object which can tell MongoDB to insert the record if no document matches (upsert) or to update every document that matches the criteria (multi).
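A minimal sketch of that call shape against the same scratch collection (the query and exercise values are just placeholders):
db.scratch.update(
    { "username": "test" },                                       // 1. query: which document(s) to update
    { $set: { "exercises": ["benchpress", "rows", "curls"] } },   // 2. update: only the listed field changes
    { upsert: true, multi: true }                                  // 3. options: insert if missing, update all matches
)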
EXTRA
Warning: If you execute the following in the mongo shell,
db.scratch.update({ "_id" : ObjectId("57af98d4d71c4efff5304335") }, {"exercises": ["benchpress", "rows", "curls"] })
the entire document would be replaced except the _id field. So, the record would be something like this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"exercises" : [
"benchpress",
"rows",
"curls"
]
}
You should only do this when you are aware of the consequence.
Hope this helps.
For more, see https://docs.mongodb.com/manual/reference/method/db.collection.update/

How to setup a field mapping for ElasticSearch that allows both exact and full text searching?

Here is my problem:
I have a field called product_id that is in a format similar to:
A+B-12321412
If I use the standard text analyzer it splits the value into tokens like so:
/_analyze/?analyzer=standard&pretty=true -d '
A+B-1232412
'
{
"tokens" : [ {
"token" : "a",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "b",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<ALPHANUM>",
"position" : 2
}, {
"token" : "1232412",
"start_offset" : 5,
"end_offset" : 12,
"type" : "<NUM>",
"position" : 3
} ]
}
Ideally, I would like to sometimes search for an exact product id and other times use a substring and/or query for just part of the product id.
My understanding of mappings and analyzers is that I can only specify one analyzer per field.
Is there a way to store a field as both analyzed and exact match?
Yes, you can use the fields parameter. In your case:
"product_id": {
"type": "string",
"fields": {
"raw": { "type": "string", "index": "not_analyzed" }
}
}
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_multi_fields.html
This allows you to index the same data twice, using two different definitions. In this case it will be indexed via both the default analyzer and not_analyzed, which will only pick up exact matches. This is also useful for sorting returned results:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-fields.html
However, you will need to spend some time thinking about how you want to search. In particular, given part numbers with a mix of alpha, numeric and punctuation or special characters you may need to get creative to tune your queries and matches.
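As a rough sketch (not from the answer itself) of how the two sub-fields could then be queried together, a bool query can combine a term query on the not_analyzed product_id.raw for exact matches with a match query on the analyzed product_id for partial matches:
{
    "query": {
        "bool": {
            "should": [
                { "term":  { "product_id.raw": "A+B-12321412" } },
                { "match": { "product_id": "12321412" } }
            ]
        }
    }
}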

Search multiple fields with "and" operator (but use fields' own analyzers)

ElasticSearch Version: 0.90.2
Here's the problem: I want to find documents in the index so that they:
match all query tokens across multiple fields
the fields' own analyzers are used
So if there are 4 documents:
{ "_id" : 1, "name" : "Joe Doe", "mark" : "1", "message" : "Message First" }
{ "_id" : 2, "name" : "Ann", "mark" : "3", "message" : "Yesterday Joe Doe got 1 for the message First"}
{ "_id" : 3, "name" : "Joe Doe", "mark" : "2", "message" : "Message Second" }
{ "_id" : 4, "name" : "Dan Spencer", "mark" : "2", "message" : "Message Third" }
And for the query "Joe First 1" it should find ids 1 and 2. I.e., it should find documents which contain all the tokens from the search query, no matter which fields they are in (maybe all tokens are in one field, or maybe each token is in its own field).
One solution would be to use elasticsearch "_all" field functionality: that way it will merge all the fields I need (name, mark, message) into one and I'll be able to query it with something like
"match": {
"_all": {
"query": "Joe First 1",
"operator": "and"
}
}
But this way I can specify an analyzer for the "_all" field only. And I need the "name" and "message" fields to have different sets of tokenizers/token filters (let's say name will have a phonetic analyzer and message will have some stemming token filter).
Is there a way to do this?
Thanks to the guys at the elasticsearch group, here's the solution... pretty simple, needless to say :)
All I needed to do was use the query_string query (http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query/) with default_operator = AND, and it does the trick:
{
"query": {
"query_string": {
"fields": [
"name",
"mark",
"message"
],
"query": "Joe First 1",
"default_operator": "AND"
}
}
}
I think using a multi_match query makes sense here. Something like:
"multi_match": {
    "query": "Joe First 1",
    "operator": "and",
    "fields": ["name", "message", "mark"]
}
As you say, you can set the analyzer (or search_analyzer/index_analyzer) to be used on the _all field. It seems to me that should indeed be your first step to achieve the query results you're looking for.
From http://jontai.me/blog/2012/10/lucene-scoring-and-elasticsearch-_all-field/, we have this tasty quote:
... the _all field copies the text from the other fields and analyzes
them again; it doesn’t copy the pre-analyzed tokens. You can set a
separate analyzer for the _all field.
Which I interpret to mean that you should set your _all analyzer(s) as well as the individual field analyzer(s). The _all field won't reuse the tokens produced by the individual field analyzers; it will grab the original field contents and analyze them with its own analyzer.
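To make that concrete, here is a mapping sketch for the documents in the question; the type name mytype and the analyzer names my_phonetic and my_stemming are placeholders for a type and analyzers you would define yourself, and treating mark as not_analyzed is likewise just an assumption:
{
    "mytype": {
        "_all": { "enabled": true, "analyzer": "standard" },
        "properties": {
            "name":    { "type": "string", "analyzer": "my_phonetic" },
            "message": { "type": "string", "analyzer": "my_stemming" },
            "mark":    { "type": "string", "index": "not_analyzed" }
        }
    }
}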