Mongodb query problem, how to get the matching items of the $or operator - sql

Thank you for first.
MongoDB Version:4.2.11
I have a piece of data like this:
{
"name":...,
...
"administration" : [
{"name":...,"job":...},
{"name":...,"job":...}
],
"shareholder" : [
{"name":...,"proportion":...},
{"name":...,"proportion":...},
]
}
I want to match some specified data through regular expressions:
For a example:
db.collection.aggregate([
{"$match" :
{
"$or" :
[
{"name" : {"$regex": "Keyword"}}
{"administration.name": {"$regex": "Keyword"}},
{"shareholder.name": {"$regex": "Keyword"}},
]
}
},
])
I want to set a flag when the $or operator successfully matches any condition, which is represented by a custom field, for example:{"name" : {"$regex": "Keyword"}}Execute on success:
{"$project" :
{
"_id":false,
"name" : true,
"__regex_type__" : "name"
}
},
{"administration.name" : {"$regex": "Keyword"}}Execute on success:"__regex_type__" : "administration.name"
I try do this:
{"$project" :
{
"_id":false,
"name" : true,
"__regex_type__" :
{
"$switch":
{
"branches":
[
{"case": {"$regexMatch":{"input":"$name","regex": "Keyword"}},"then" : "name"},
{"case": {"$regexMatch":{"input":"$administration.name","regex": "Keyword"}},"then" : "administration.name"},
{"case": {"$regexMatch":{"input":"$shareholder.name","regex": "Keyword"}},"then" : "shareholder.name"},
],
"default" : "Other matches"
}
}
}
},
But $regexMatch cannot match the array,I tried to use $unwind again, but returned the number of many array members, which did not meet my starting point.
I want to implement the same function as mysql this SQL statement in mongodb, like this:
SELECT name,administration.name,shareholder.name,(
CASE
WHEN name REGEXP("Keyword") THEN "name"
WHEN administration.name REGEXP("Keyword") THEN "administration.name"
WHEN shareholder.name REGEXP("Keyword") THEN "shareholder.name"
END
)AS __regex_type__ FROM db.mytable WHERE
name REGEXP("Keyword") OR
shareholder.name REGEXP("Keyword") OR
administration.name REGEXP("Keyword");
Maybe this method is stupid, but I don’t have a better solution.
If you have a better solution, I would appreciate it!!!
Thank you!!!

Since $regexMatch does not handle arrays, use $filter to filter individual array elements with $regexMatch, then use $size to see how many elements matched.
[{"$match"=>{"$or"=>[{"a"=>"test"}, {"arr.a"=>"test"}]}},
{"$project"=>
{"a"=>1,
"arr"=>1,
"src"=>
{"$switch"=>
{"branches"=>
[{"case"=>{"$regexMatch"=>{"input"=>"$a", "regex"=>"test"}},
"then"=>"a"},
{"case"=>
{"$gte"=>
[{"$size"=>
{"$filter"=>
{"input"=>"$arr.a",
"cond"=>
{"$regexMatch"=>{"input"=>"$$this", "regex"=>"test"}}}}},
1]},
"then"=>"arr.a"}],
"default"=>"def"}}}}]
[{"_id"=>BSON::ObjectId('5ffb2df748966813f82f15ad'), "a"=>"test", "src"=>"a"},
{"_id"=>BSON::ObjectId('5ffb2df748966813f82f15ae'),
"arr"=>[{"a"=>"test"}],
"src"=>"arr.a"}]

Related

Karate - Conditional JSON schema validation

I am just wondering how can I do conditional schema validation. The API response is dynamic based on customerType key. If customerType is person then, person details will be included and if the customerType is org organization details will be included in the JSON response. So the response can be in either of the following forms
{
"customerType" : "person",
"person" : {
"fistName" : "A",
"lastName" : "B"
},
"id" : 1,
"requestDate" : "2021-11-11"
}
{
"customerType" : "org",
"organization" : {
"orgName" : "A",
"orgAddress" : "B"
},
"id" : 2,
"requestDate" : "2021-11-11"
}
The schema I created to validate above 2 scenario is as follows
{
"customerType" : "#string",
"organization" : "#? response.customerType=='org' ? karate.match(_,personSchema) : karate.match(_,null)",
"person" : "#? response.customerType=='person' ? karate.match(_,orgSchema) : karate.match(_,null)",
"id" : "#number",
"requestDate" : "#string"
}
but the schema fails to match with the actual response. What changes should I make in the schema to make it work?
Note : I am planning to reuse the schema in multiple tests so I will be keeping the schema in separate files, independent of the feature file
Can you refer to this answer which I think is the better approach: https://stackoverflow.com/a/47336682/143475
That said, I think you missed that the JS karate.match() API doesn't return a boolean, but a JSON that contains a pass boolean property.
So you have to do things like this:
* def someVar = karate.match(actual, expected).pass ? {} : {}

How do I query an array in MongoDB?

I have been trying multiple queries but still can't figure it out. I have multiple documents that look like this:
{
"_id" : ObjectId("5b51f519a33e7f54161a0efb"),
"assigneesEmail" : [
"felipe#gmail.com"
],
"organizationId" : "5b4e0de37accb41f3ac33c00",
"organizationName" : "PaidUp Volleyball Club",
"type" : "athlete",
"firstName" : "Mylo",
"lastName" : "Fernandes",
"description" : "",
"status" : "active",
"createOn" : ISODate("2018-07-20T14:43:37.610Z"),
"updateOn" : ISODate("2018-07-20T14:43:37.610Z"),
"__v" : 0
}
I need help writing a query where I can find this document by looking up the email in any part of the array element assigneeEmail. Any suggestions? I have tried $elemMatch but still could not get it to work.
Looks like my query was just incorrect. I figured it out.

$in query with $elemMatch in mongoDB

I need to find a document where "components" not exist OR if exists, its id is 1111 and isActive is true and "issues" not exist OR its id is 1111 and isActive is true.
Example:
Document 1: ////this contains "components" but not "issues"
{..
"components" : [
{
"id" : "1111",
"name" : "component1",
"isActive" : true
}
],
}
Document 2://this contains "issues" but not "components"
{..
"issues" : [
{
"id" : "1111",
"name" : "issue 1",
"isActive" : true
}
],
..}
Document 3://This is not having both
Query:
db.sample.find({
"issues":{$in:[null, {"$elemMatch" : {"id":"1111","isActive": {"$in":[true,null]}}}]},
"components":{$in:[null, {"$elemMatch" : {"id":"1111","isActive": {"$in":[true,null]}}}]}
})
But I am getting an error, "errmsg" : "cannot nest $ under $in",
Please help to form the query that returns all the above 3 documents.
You don't require $in. You need $or operator.
Try
db.sample.find({
$or:[
{"issues":{$exists:true,"$elemMatch" : {"id":"1111","isActive": true}}},
{"components":{$exists:true,"$elemMatch" : {"id":"1111","isActive": true}}}
]
})
Based on OP's comments. here is the working version.
db.Sample.find({
$or:[
{"issues":{$exists:true,"$elemMatch":{"id":"1111","isActive":true}},
"components":{$exists:false} },
{"components":{$exists:true,"$elemMatch":{"id":"1111","isActive":true}},
"issues":{$exists:false}},
{"issues":{$exists:true,"$elemMatch":{"id":"1111","isActive": true}},
"components":{$exists:true,"$elemMatch":{"id":"1111","isActive":true}}},
{"issues":{$exists:false}, "components":{$exists:false}}
]
})

Mongo adding an object to original object

I am not sure if I am asking the correct question but I assume this is just a basic mongodb question.
I currently have this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2"
}
and I would like to add an exercise object like this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises": {
"benchpress",
"rows",
"curls",
}
I am just unsure how to create exercises with the object without using $push which just opens up an array. I don't want an array, I want an object.
Any help would be greatly appreciated it.
An object is a key-value pair. In your representation of the second document, you have a nested document exercises as a key and its value as an object containing only strings. Don't you see something strange there? An object without keys?
It should probably be an array of strings. Note that an array is indeed an object where the key is the numeric index starting from 0 and the value is the string in that position.
(You have an additional comma and a missing curly-brace. Lets fix that.)
This is the document we wish to see after updating the document.
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises": [
"benchpress",
"rows",
"curls"
]
}
Now, back to your question. How can we update the existing document with the exercises document? Its pretty simple. Mongodb has a 'update' method which exactly does that. Since we don't want to replace the entire document and just add additional fields, we should use $set to update specific fields. Fire up the mongo shell and switch to your database using use db-name. Then execute the following command. I assume you have an existing document with id ObjectId("57af98d4d71c4efff5304335"). Note that ObjectId is a BSON datatype.
db.scratch.update({ "_id" : ObjectId("57af98d4d71c4efff5304335") }, { $set: {"exercises": ["benchpress", "rows", "curls"] } })
This will update the document as
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"fullname" : "test",
"username" : "test",
"email" : "test#gmail.com",
"password" : "$2a$10$Wl29i6FemBrnOKq/ZErSguxlfvqoayZQkaEDirkmDl5O3GDEQjOV2",
"exercises" : [
"benchpress",
"rows",
"curls"
]
}
Here scratch refers to the collection name. The update method takes 3 parameters.
Query to find the document to update
The Update parameter(document to update). You can either replace the whole document or just specific parts of the document(using $set).
An optional object which can tell Mongodb to insert the record if the document doesn't exist(upsert) or update multiple documents that match the criteria(multiple).
EXTRA
Warning: If you execute the following in the mongo shell,
db.scratch.update({ "_id" : ObjectId("57af98d4d71c4efff5304335") }, {"exercises": ["benchpress", "rows", "curls"] })
the entire document would be replaced except the _id field. So, the record would be something like this:
{
"_id" : ObjectId("57af98d4d71c4efff5304335"),
"exercises" : [
"benchpress",
"rows",
"curls"
]
}
You should only do this when you are aware of the consequence.
Hope this helps.
For more, see https://docs.mongodb.com/manual/reference/method/db.collection.update/

elasticsearch: how to index terms which are stopwords only?

I had much success building my own little search with elasticsearch in the background. But there is one thing I couldn't find in the documentation.
I'm indexing the names of musicians and bands. There is one band called "The The" and due to the stop words list this band is never indexed.
I know I can ignore the stop words list completely but this is not what I want since the results searching for other bands like "the who" would explode.
So, is it possible to save "The The" in the index but not disabling the stop words at all?
You can use the synonym filter to convert The The into a single token eg thethe which won't be removed by the stopwords filter.
First, configure the analyzer:
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
"settings" : {
"analysis" : {
"filter" : {
"syn" : {
"synonyms" : [
"the the => thethe"
],
"type" : "synonym"
}
},
"analyzer" : {
"syn" : {
"filter" : [
"lowercase",
"syn",
"stop"
],
"type" : "custom",
"tokenizer" : "standard"
}
}
}
}
}
'
Then test it with the string "The The The Who".
curl -XGET 'http://127.0.0.1:9200/test/_analyze?pretty=1&text=The+The+The+Who&analyzer=syn'
{
"tokens" : [
{
"end_offset" : 7,
"position" : 1,
"start_offset" : 0,
"type" : "SYNONYM",
"token" : "thethe"
},
{
"end_offset" : 15,
"position" : 3,
"start_offset" : 12,
"type" : "<ALPHANUM>",
"token" : "who"
}
]
}
"The The" has been tokenized as "the the", and "The Who" as "who" because the preceding "the" was removed by the stopwords filter.
To stop or not to stop
Which brings us back to whether we should include stopwords or not? You said:
I know I can ignore the stop words list completely
but this is not what I want since the results searching
for other bands like "the who" would explode.
What do you mean by that? Explode how? Index size? Performance?
Stopwords were originally introduced to improve search engine performance by removing common words which are likely to have little effect on the relevance of a query. However, we've come a long way since then. Our servers are capable of much more than they were back in the 80s.
Indexing stopwords won't have a huge impact on index size. For instance, to index the word the means adding a single term to the index. You already have thousands of terms - indexing the stopwords as well won't make much difference to size or to performance.
Actually, the bigger problem is that the is very common and thus will have a low impact on relevance, so a search for "The The concert Madrid" will prefer Madrid over the other terms.
This can be mitigated by using a shingle filter, which would result in these tokens:
['the the','the concert','concert madrid']
While the may be common, the the isn't and so will rank higher.
You wouldn't query the shingled field by itself, but you could combine a query against a field tokenized by the standard analyzer (without stopwords) with a query against the shingled field.
We can use a multi-field to analyze the text field in two different ways:
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
"mappings" : {
"test" : {
"properties" : {
"text" : {
"fields" : {
"shingle" : {
"type" : "string",
"analyzer" : "shingle"
},
"text" : {
"type" : "string",
"analyzer" : "no_stop"
}
},
"type" : "multi_field"
}
}
}
},
"settings" : {
"analysis" : {
"analyzer" : {
"no_stop" : {
"stopwords" : "",
"type" : "standard"
},
"shingle" : {
"filter" : [
"standard",
"lowercase",
"shingle"
],
"type" : "custom",
"tokenizer" : "standard"
}
}
}
}
}
'
Then use a multi_match query to query both versions of the field, giving the shingled version more "boost"/relevance. In this example the text.shingle^2 means that we want to boost that field by 2:
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
"query" : {
"multi_match" : {
"fields" : [
"text",
"text.shingle^2"
],
"query" : "the the concert madrid"
}
}
}
'