How to query an array of JSON objects in a MongoDB document using Spring Data MongoDB

I am using the Spring Data MongoDB SDK to query MongoDB.
The document in mongoDb looks like this:
{
  "data": {
    "suggestions": [
      {
        "key": "take",
        "value": 1
      },
      {
        "key": "donttake",
        "value": 0
      }
    ]
  }
}
In my API request I have a structure similar to the "suggestions" element above.
I want to create a query criteria where the "is" clause should be the value of the "suggestions" element in the API request.
I tried the following code using Spring Data MongoDB:
JsonParser jsonParser = new JsonParser();
ObjectMapper objMapper = new ObjectMapper();
String jsonArrayString = objMapper.writeValueAsString(apirequest.getSuggestions());
JsonArray arrayFromString = jsonParser.parse(jsonArrayString).getAsJsonArray();
criteria = Criteria.where("data.suggestions").is(arrayFromString);
The problem with this code is that when I debug and look at the query built from the criteria above, it goes in as $java: [{"key": "take", "value": 1}].
Therefore, it can't be matched against the Mongo document and doesn't fetch any result.
Is there another way to query an array of documents in MongoDB from Spring Data MongoDB?

I followed a completely different approach after reading the documentation on querying arrays of embedded documents in MongoDB, available at
https://docs.mongodb.com/manual/tutorial/query-array-of-documents/
I used elemMatch to solve this problem as follows.
Let's say my API request gets mapped to an object suggestions, and KeyVal is an object that stores a key/value pair.
List<Criteria> elemMatchCriteria = new ArrayList<>();
for (KeyVal keyVal : suggestions) {
    // Each requested suggestion must match an element of data.suggestions
    Criteria c = Criteria.where("key").is(keyVal.key()).and("value").is(keyVal.value());
    elemMatchCriteria.add(Criteria.where("data.suggestions").elemMatch(c));
}
criteria = new Criteria().andOperator(elemMatchCriteria.toArray(new Criteria[0]));
Then criteria can be used in a Mongo Query, as sketched below.
Also, keep in mind that elemMatch doesn't care about the order of fields inside the embedded documents in the array (unlike an exact equality match), so elemMatch serves the purpose well.
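For completeness, a minimal sketch of how the resulting criteria might be executed (assuming an injected MongoTemplate and a hypothetical mapped class MyDocument; both names are illustrative, not from the original post):

import org.springframework.data.mongodb.core.query.Query;

// Wrap the combined criteria in a Query and run it against the mapped collection.
Query query = new Query(criteria);
List<MyDocument> matches = mongoTemplate.find(query, MyDocument.class);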

Related

Kafka Lenses SQL - How to WHERE filter based on objects nested in an array

I am working in Kafka Lenses v2.2.2. I need to filter based on the value of an object inside an array.
Sample message (redacted for simplicity):
{
  "payload": {
    "Data": {
      "something": "stuff"
    },
    "foo": {
      "bar": [
        {
          "id": "8177BE12-F69B-4A51-B12E-976D2AE37487",
          "info": "more_data"
        },
        {
          "id": "06A846C5-2138-4107-A5B0-A2FC21B9F32D",
          "info": "more_data"
        }
      ]
    }
  }
}
In Lenses this actually appears as a nested object with integer properties... 0, 1, etc.
So I've tried this, but it throws an error: .0 appears out of place
SELECT *
FROM topic_name
WHERE payload.foo.bar.0.id = "8177BE12-F69B-4A51-B12E-976D2AE37487"
LIMIT 10
I tried wrapping the 0 in double/single quotes as well, and that throws a 500 error.
I copied and pasted the UUID from the first message in the topic, so it's definitely there. I also copied and pasted the labels to rule out typos. I am thinking there is some special way to access arrays with nested objects like this, but I'm struggling to find any documentation or videos discussing it.
I can be confident the value is stored in the first array element, but methods that can search all objects would be awesome as well.
The syntax (if you know the array index - as in my initial question) is:
SELECT *
FROM topic_name
WHERE payload.foo.bar[0].id = "8177BE12-F69B-4A51-B12E-976D2AE37487"
LIMIT 10
Though I am still struggling to do this if the array index is unknown and you need to check them all. I'm assuming at this point that it's not possible without a series of OR conditions in the WHERE clause that check every index, as sketched below.
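For example, if you know an upper bound on the array size, a hedged sketch of that OR-based workaround (assuming standard AND/OR operators are accepted in the Lenses SQL WHERE clause) would be:

SELECT *
FROM topic_name
WHERE payload.foo.bar[0].id = "8177BE12-F69B-4A51-B12E-976D2AE37487"
   OR payload.foo.bar[1].id = "8177BE12-F69B-4A51-B12E-976D2AE37487"
LIMIT 10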

Count of documents 0 after inserting data with Nest

I am using Nest with the following connection settings:
var connectionPool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionSettings(connectionPool, new InMemoryConnection());
settings.DisableDirectStreaming(true); // needed to see good looking debug log on insert
settings.DefaultIndex(Index);
Client = new ElasticClient(settings);
With new InMemoryConnection() I hope to query with Nest - changing data inside an Azure Cloud function.
Strangely, the debug logs look promising when indexing:
/*
var res = await Client.IndexManyAsync(response.Elements, Index); //
Console.WriteLine(res.DebugInformation);
*/
/*
var res = await Client.IndexAsync(response, i => i.Index(Index)); // Index = "data"
Console.WriteLine(res.DebugInformation); // <--
*/
And when logging directly after the insertions, the count is 0:
// var anyDocs = await Client.CountAsync<OverpassElement>(c => c.Index(Index));
var anyDocs = await Client.CountAsync<OverpassElement>(c => c);
Console.WriteLine("count: " + anyDocs.Count);
...but the entire JSON data is being logged with the insertion.
How come I can't count the documents (so that I can search them in a next step) after insertion?
Actually I get:
Invalid NEST response built from a successful (200) low level call on POST: /data/_doc
And there are 0 Items on the IndexResponse when inserting.
The data is of type Element, looking like the following entry from an array containing 4221 such items:
{
  "type": "relation",
  "id": 8353694,
  "timestamp": "2018-06-04T22:54:27Z",
  "version": 1,
  "changeset": 59551528,
  "user": "asdf2",
  "uid": 1416503,
  "members": [
    {
      "type": "way",
      "ref": 89956942,
      "role": "from"
    },
    {
      "type": "node",
      "ref": 1042756547,
      "role": "via"
    },
    {
      "type": "way",
      "ref": 89956938,
      "role": "to"
    }
  ],
  "tags": {
    "restriction": "no_left_turn",
    "type": "restriction"
  }
},
ElasticSearch has many similarities to a NoSql data store. In this case, "read after write" is not guaranteed by default. When the index API call returns success, it doesn't mean "this document is now available for searching"; it means "ElasticSearch has accepted your document and it will be available for searching shortly". ElasticSearch uses eventual consistency by default.
However, this can be annoying during testing. So ElasticSearch has a Refresh API that essentially just blocks until all documents already indexed are available for searching. I strongly recommend that you do not call this in production; only in test code.
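In test code, a hedged sketch against the NEST 7.x API surface (older versions expose this as Client.RefreshAsync) would be to refresh the index before counting:

// Test code only: force a refresh so the documents indexed above are visible to counts/searches.
await Client.Indices.RefreshAsync(Index);
var anyDocs = await Client.CountAsync<OverpassElement>(c => c.Index(Index));
Console.WriteLine("count: " + anyDocs.Count);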
At the risk of reviving an old question, this answer from Russ Cam explains that InMemoryConnection does not actually run the operation against Elasticsearch.
InMemoryConnection doesn't actually send any requests or receive any responses from Elasticsearch; used in conjunction with .SetConnectionStatusHandler() on Connection settings (or .OnRequestCompleted() in NEST 2.x+), it's a convenient way to see the serialized form of requests.
So you can inspect the query that NEST generates from your code but you won't be able to observe the results.
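If the goal is for the documents to actually reach the local node so they can be counted, a minimal sketch (reusing the connection settings from the question) is simply to leave out the InMemoryConnection:

// Requests now go over HTTP to http://localhost:9200 instead of being captured in memory.
var settings = new ConnectionSettings(connectionPool);
settings.DisableDirectStreaming(true);
settings.DefaultIndex(Index);
Client = new ElasticClient(settings);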
I don't know what Nest is, but I'd bet $100 that if it uses transactional concepts, you may need to commit in order to see the count correctly.

VB.NET Processing Json from GitLab

I'm using the API of GitLab in VB.Net.
To request groups, I'm using GET /groups.
GitLab returns a JSON string like this:
[
  {
    "id": 5,
    "web_url": "https://XXXXX/groups/AAAA",
    "name": "AAAA",
    "path": "AAAA",
    "description": "blabla",
    "visibility": "private",
    "share_with_group_lock": false,
    "require_two_factor_authentication": false,
    "two_factor_grace_period": 48,
    "project_creation_level": "developer",
  },
  {
    "id": 8,
    "web_url": "https://XXXXX/groups/BBBBBB",
    "name": "BBBBBB",
    "path": "BBBBBB",
    "description": "",
    "visibility": "private",
    "share_with_group_lock": false,
    "require_two_factor_authentication": false,
    "two_factor_grace_period": 48,
    "parent_id": null,
    "ldap_cn": null,
    "ldap_access": null
  },
  etc ...
]
It's quite complicated to parse it with Newtonsoft.Json, so I would first like to convert it to an array of Dictionary.
Then I will be able to loop through the array and get myrow("id"), for instance.
I couldn't find how to do this, could you help me please?
String (list of Dictionary) -> List (Dictionary)
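As a hedged sketch (assuming the response body is held in a string variable named json, a name chosen for illustration), Newtonsoft.Json can deserialize the array above directly into a list of dictionaries, which gives exactly the myrow("id") access described:

Imports System.Collections.Generic
Imports Newtonsoft.Json

' Deserialize the JSON array into one dictionary per group.
Dim groups = JsonConvert.DeserializeObject(Of List(Of Dictionary(Of String, Object)))(json)

' Loop through the groups and read individual properties by name.
For Each myrow In groups
    Console.WriteLine(myrow("id"))
Next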

Combining results of multiple API calls

I am invoking an API operation to fetch a list of objects and then, for each object in that list, I am invoking another API operation to fetch additional details of that object and adding those details into the object. The goal is to return a list of objects with all the properties (or details) that I need. In the example below, the /allObjects call will get a list of all objects with 2 properties - id and k1. But my application needs 3 properties, id, k1 and k4. So I invoke another API method to get detailed information for each object, which includes the k4 property. I copy that detailed property into the original result-set's object and return the "enriched" result-set. But the latency cost of this can be high. What is the best way of achieving this without slowing down the application too much? In the real world, I am dealing with 500-2000 objects.
GET /allObjects yields JSON results =>
{
  "data" : [
    {
      "id" : "123",
      "k1" : "v1"
    },
    {
      "id" : "456",
      "k1" : "v1"
    }
  ]
}
for (obj in results.data) {
  GET /object/{obj.id} yields JSON result =>
  {
    "data" : {
      "id" : "123",
      "k1" : "v1",
      "k2" : "v2",
      "k3" : "v3",
      "k4" : "v4"
    }
  }
  // Add k4 property to original result-set object, and assign
  // it the value of the k4 property in the individual result object.
  obj.k4 = result.data.k4;
}
return results.data;
Your requirement is such that you have no other option but to go for a mash-up (unless you can convince the API developers to combine the two APIs).
You could, however, opt to build a mash-up service with a low latency cost that stands between your application and the APIs to abstract out the mash-up logic. Alternatively, you could opt to program your application in a language that is custom made for this kind of work. Both of these options can be accommodated with Ballerina; I've written a post here showing how easy it is to do that using it.
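Wherever the mash-up logic ends up living, the main latency lever is issuing the per-object detail calls concurrently instead of one per loop iteration. A minimal Java sketch of that idea (the endpoint api.example.com and the hard-coded ids are placeholders, and JSON parsing/merging is left as a comment):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class EnrichSketch {
    public static void main(String[] args) {
        HttpClient http = HttpClient.newHttpClient();
        List<String> ids = List.of("123", "456"); // ids taken from GET /allObjects

        // Fire all detail requests at once instead of sequentially.
        Map<String, String> detailBodies = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> calls = ids.stream()
                .map(id -> http.sendAsync(
                                HttpRequest.newBuilder(URI.create("https://api.example.com/object/" + id)).build(),
                                HttpResponse.BodyHandlers.ofString())
                        .thenAccept(resp -> detailBodies.put(id, resp.body())))
                .collect(Collectors.toList());
        CompletableFuture.allOf(calls.toArray(new CompletableFuture[0])).join();

        // detailBodies now holds one JSON body per id; parse out k4 and copy it
        // back onto the matching object in the original result-set.
    }
}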

Why does storing a Nancy.DynamicDictionary in RavenDB only save the property-names and not the property-values?

I am trying to save (RavenDB build 960) the names and values of form data items passed into a Nancy Module via its built-in Request.Form.
If I save a straightforward instance of a dynamic object (with test properties and values) then everything works and both the property names and values are saved. However, if I use Nancy's Request.Form then only the dynamic property names are saved.
I understand that I will have to deal with further issues to do with restoring the correct types when retrieving the dynamic data (RavenJObjects etc) but for now, I want to solve the problem of saving the dynamic names / values in the first place.
Here is the entire test request and code:
Fiddler Request (PUT)
Nancy Module
Put["/report/{name}/add"] = parameters =>
{
reportService.AddTestDynamic(Db, parameters.name, Request.Form);
return HttpStatusCode.Created;
};
Service
public void AddTestDynamic(IDocumentSession db, string name, dynamic data)
{
var testDynamic = new TestDynamic
{
Name = name,
Data = data
};
db.Store(testDynamic);
db.SaveChanges();
}
TestDynamic Class
public class TestDynamic
{
public string Name;
public dynamic Data;
}
Dynamic contents of Request.Form at runtime
Resulting RavenDB Document
{
  "Name": "test",
  "Data": [
    "username",
    "age"
  ]
}
Note: The type of Request.Form is Nancy.DynamicDictionary. I think this may be the problem, since it implements IEnumerable<string> rather than the expected IEnumerable<KeyValuePair<string, object>>. I think that RavenDB is enumerating the DynamicDictionary and only getting back the dynamic member names rather than the member name/value pairs.
Can anybody tell me how or whether I can treat the Request.Form as a dynamic object with respect to saving it to RavenDB? If possible I want to avoid any hand-crafted enumeration of DynamicDictionary to build a dynamic instance so that RavenDB can serialise correctly.
Thank You
Edit 1 @Ayende
The DynamicDictionary appears to implement the GetDynamicMemberNames() method:
Taking a look at the code on GitHub reveals the following implementation:
public override IEnumerable<string> GetDynamicMemberNames()
{
return dictionary.Keys;
}
Is this what you would expect to see here?
Edit 2 @TheCodeJunkie
Thanks for the code update. To test this I have:
Created a local clone of the NancyFx/Nancy master branch from GitHub
Added the Nancy.csproj to my solution and referenced the project
Ran the same test as above
RavenDB Document from new DynamicDictionary
{
  "Name": "test",
  "Data": {
    "$type": "Nancy.DynamicDictionary, Nancy",
    "username": {},
    "age": {}
  }
}
You can see that the resulting document is an improvement. The DynamicDictionary type information is now being correctly picked up by RavenDB and whilst the dynamic property-names are correctly serialized, unfortunately the dynamic property-values are not.
The image below shows the new look DynamicDictionary in action. It all looks fine to me, the new Dictionary interface is clearly visible. The only thing I noticed was that the dynamic 'Results view' (as opposed to the 'Dynamic view') in the debugger, shows just the property-names and not their values. The 'Dynamic view' shows both as before (see image above).
Contents of DynamicDictionary at run time
biofractal,
The problem is the DynamicDictionary. In JSON, types can be either objects or lists; they can't be both.
For dynamic object serialization, we rely on the implementation of GetDynamicMemberNames() to get the properties, and I assume that isn't there.
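Given that answer, one hedged workaround sketch is to copy the form into a plain dictionary inside the Put route shown above before storing it; note that this does involve enumerating the form, which the question hoped to avoid, and it assumes DynamicDictionaryValue exposes the underlying object via a Value property:

// Copy the Nancy form into a plain dictionary so RavenDB serializes name/value pairs.
var data = new Dictionary<string, object>();
foreach (var key in ((DynamicDictionary)Request.Form).GetDynamicMemberNames())
{
    // Assumes the indexer returns a DynamicDictionaryValue wrapping the raw value.
    data[key] = ((DynamicDictionaryValue)Request.Form[key]).Value;
}
reportService.AddTestDynamic(Db, parameters.name, data);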