MongoTemplate invalid projection with conflict path - mongodb-query

I am using Spring Data MongoTemplate to query docs.
The document stored in the collection is structured as
{
id: "string",
metadata: { -- embeded structure},
version: {
metadata: {
version: 1,
-- other fields
},
versionContent: { -- embeded structure --}
}
}
In my query, I only need a subset of fields, so the ProjectionOperation I use is
Aggregation.project("id", "metadata", "version.metadata");
Got the exception:
specification contains two conflicting paths. Cannot specify both 'metadata.version' and 'metadata'
What shall I do to deal with this?

Eventually, instead of specifying fields to include in the projection, I have to specify excluded fields, as below,
Aggregation.project().andExclude("not-needed-field1", "not-needed-field2")
to avoid the "conflicting paths" exception. Not sure if it is a bug of the MongoTemplate, or there is certain reason to not allow user to include these kind of paths in projection.

Related

Laravel Scout toSearchableArray attribute is not filterable

I've been doing some testing with laravel scout and according to the documentation (https://laravel.com/docs/8.x/scout#configuring-searchable-data), I've mapped my User model as such:
/**
* Get the indexable data array for the model.
*
* #return array
*/
public function toSearchableArray()
{
$data = $this->toArray();
return array_merge($data, [
'entity' => 'An entity'
]);
}
Just for the sake of testing, this is literally what I came down to on the debugging.
After importing the User model with this mapping, I can see on the meilisearch dashboard it is indeed showing the user data + the entity = 'An entity'.
However, when applying this:
User::search('something')->where('entity', 'An entity')->get()
It produces the following error:
"message": " --> 1:1\n |\n1 | entity=\"An entity\"\n | ^----^\n |\n = attribute `entity` is not filterable, available filterable attributes are: ",
"exception": "MeiliSearch\\Exceptions\\ApiException",
"file": "/var/www/api/vendor/meilisearch/meilisearch-php/src/Http/Client.php",
Tracing back to view the 'filterable attributes', I've ended at the conclusion that:
$client = app(\MeiliSearch\Client::class);
dump($client->index('users')->getFilterableAttributes()); // Returns []
$client->index('users')->updateFilterableAttributes(['entity']);
dump($client->index('users')->getFilterableAttributes()); // Returns ['entity']
Forcing the updateFilterableAttributes now allows me to complete the search as intended, but I don't feel this should be the regular behaviour? If its mapped on the searchableArray, it should be searchable? What am I not understanding and what other approaches are there to achieve this goal?
This is actually not an issue but a requirement of meilisearch in particular. Scout under the hood uses different drivers for indexing - "algolia", "meilisearch", "database", "collection" and even "null", all of them have different indexing methods unifing of which would be troublesome and inefficient for scout I believe.
So filtering or a faceted search, as meilisearch refers to it, requires us to establish filtering criteria first, which is empty by default for document (models in laravel) fields.
Quoting from the docs:
This step is mandatory and cannot be done at search time. Filters need
to be properly processed and prepared by Meilisearch before they can
be used.
Updating filterableAttributes requires recreating the entire
index. This may take a significant amount of time depending on your
dataset size.
For more info please refer to meilisearch official docs https://docs.meilisearch.com/learn/advanced/filtering_and_faceted_search.html

Can I compose two JSON Schemas in a third one?

I want to describe the JSON my API will return using JSON Schema, referencing the schemas in my OpenAPI configuration file.
I will need to have a different schema for each API method. Let’s say I support GET /people and GET /people/{id}. I know how to define the schema of a "person" once and reference it in both /people and /people/{id} using $ref.
[EDIT: See a (hopefully) clearer example at the end of the post]
What I don’t get is how to define and reuse the structure of my response, that is:
{
"success": true,
"result" : [results]
}
or
{
"success": false,
"message": [string]
}
Using anyOf (both for the success/error format check, and for the results, referencing various schemas (people-multi.json, people-single.json), I can define a "root schema" api-response.json, and I can check the general validity of the JSON response, but it doesn’t allow me to check that the /people call returns an array of people and not a single person, for instance.
How can I define an api-method-people.json that would include the general structure of the response (from an external schema of course, to keep it DRY) and inject another schema in result?
EDIT: A more concrete example (hopefully presented in a clearer way)
I have two JSON schemas describing the response format of my two API methods: method-1.json and method-2.json.
I could define them like this (not a schema here, I’m too lazy):
method-1.json :
{
success: (boolean),
result: { id: (integer), name: (string) }
}
method-2.json :
{
success: (boolean),
result: [ (integer), (integer), ... ]
}
But I don’t want to repeat the structure (first level of the JSON), so I want to extract it in a response-base.json that would be somehow (?) referenced in both method-1.json and method-2.json, instead of defining the success and result properties for every method.
In short, I guess I want some kind of composition or inheritance, as opposed to inclusion (permitted by $ref).
So JSON Schema doesn’t allow this kind of composition, at least in a simple way or before draft 2019-09 (thanks #Relequestual!).
However, I managed to make it work in my case. I first separated the two main cases ("result" vs. "error") in two base schemas api-result.json and api-error.json. (If I want to return an error, I just point to the api-error.json schema.)
In the case of a proper API result, I define a schema for a given operation using allOf and $ref to extend the base result schema, and then redefine the result property:
{
"$schema: "…",
"$id": "…/api-result-get-people.json",
"allOf": [{ "$ref": "api-result.json" }],
"properties": {
"result": {
…
}
}
}
(Edit: I was previously using just $ref at the top level, but it doesn’t seem to work)
This way I can point to this api-result-get-people.json, and check the general structure (success key with a value of true, and a required result key) as well as the specific form of the result for this "get people" API method.

Unable to use Ember data with JSONAPI and fragments to support nested JSON data

Overview
I'm using Ember data and have a JSONAPI. Everything works fine until I have a more complex object (let's say an invoice for a generic concept) with an array of items called lineEntries. The line entries are not mapped directly to a table so need to be stored as raw JSON object data. The line entry model also contains default and computed values. I wish to store the list data as a JSON object and then when loaded back from the store that I can manipulate it as normal in Ember as an array of my model.
What I've tried
I've looked at and tried several approaches, the best appear to be (open to suggestions here!):
Fragments
Replace problem models with fragments
I've tried making the line entry model a fragment and then referencing the fragment on the invoice model as a fragmentArray. Line entries add to the array as normal but default values don't work (should they?). It creates the object and I can store it in the backend but when I return it, it fails with either a normalisation issue or a serialiser issue. Can anyone state the format the data be returned in? It's confusing as normalising the data seems to require JSONAPI but the fragment requires JSON serialiser. I've tried several combinations but no luck so far. My line entries don't have actual ids as the data is saved and loaded as a block. Is this an issue?
DS.EmbeddedRecordsMixin
Although not supported in JSONAPI, it sounds possible to use JSONAPI and then switch to JSONSerializer or RESTSerializer for the problem models. If this is possible could someone give me a working example and the JSON format that should be returned by the API? I have header authorisation and other such data so would I still be able to set this at the application level for all request not using my JSONAPI?
Ember-data-save-relationships
I found an add on here that provides an add on to do this. It seems more involved than the other approaches but when I've tried this I can send the data up by setting a the data as embedded. Great! But although it saves it doesn't unwrap it correct and I'm back with the same issues.
Custom serialiser
Replace the models serialiser with something that takes the data and sends it as plain JSON data and then deserialises back into something Ember can use. This sounds similar to the above but I do the heavy lifting. The only reason to do this is because all examples for the above solutions are quite light and don't really show how to set this up with an actual JSONAPI set up that would need it.
Where I am and what I need
Basically all approaches lead to saving the JSON fine but the return JSON from the server not being the correct format or the deserialisation failing but it's unclear what it should be or what needs to change without breaking the existing JSONAPI models that work fine.
If anyone know the format for return API data it may resolve this. I've tried JSONAPI with lineEntries returning the same format as it saved. I've tried placing relationship sections like the add on suggested and I've also tried placing relationship only data against the entries and an include section with all the references. Any help on this would be great as I've learned a lot through this but deadlines a looming and I can't see a viable solution that doesn't break as much as it fixes.
If you are looking for return format for relational data from the API server you need to make sure of the following:
Make sure the relationship is defined in the ember model
Return all successes with a status code of 200
From there you need to make sure you return relational data correctly. If you've set the ember model for the relationship to {async: true} you need only return the id of the relational model - which should also be defined in ember. If you do not set {async: true}, ember expects all relational data to be included.
return data with relationships in JSON API specification
Example:
models\unicorn.js in ember:
import DS from 'ember-data';
export default DS.Model.extend({
user: DS.belongsTo('user', {async: true}),
staticrace: DS.belongsTo('staticrace',{async: true}),
unicornName: DS.attr('string'),
unicornLevel: DS.attr('number'),
experience: DS.attr('number'),
hatchesAt: DS.attr('number'),
isHatched: DS.attr('boolean'),
raceEndsAt: DS.attr('number'),
isRacing: DS.attr('boolean'),
});
in routes\unicorns.js on the api server on GET/:id:
var jsonObject = {
"data": {
"type": "unicorn",
"id": unicorn.dataValues.id,
"attributes": {
"unicorn-name" : unicorn.dataValues.unicornName,
"unicorn-level" : unicorn.dataValues.unicornLevel,
"experience" : unicorn.dataValues.experience,
"hatches-at" : unicorn.dataValues.hatchesAt,
"is-hatched" : unicorn.dataValues.isHatched,
"raceEndsAt" : unicorn.dataValues.raceEndsAt,
"isRacing" : unicorn.dataValues.isRacing
},
"relationships": {
"staticrace": {
"data": {"type": "staticrace", "id" : unicorn.dataValues.staticRaceId}
},
"user":{
"data": {"type": "user", "id" : unicorn.dataValues.userId}
}
}
}
}
res.status(200).json(jsonObject);
In ember, you can call this by chaining model functions. For example when this unicorn goes to race in controllers\unicornracer.js:
raceUnicorn() {
if (this.get('unicornId') === '') {return false}
else {
return this.store.findRecord('unicorn', this.get('unicornId', { backgroundReload: false})).then(unicorn => {
return this.store.findRecord('staticrace', this.get('raceId')).then(staticrace => {
if (unicorn.getProperties('unicornLevel').unicornLevel >= staticrace.getProperties('raceMinimumLevel').raceMinimumLevel) {
unicorn.set('isRacing', true);
unicorn.set('staticrace', staticrace);
unicorn.set('raceEndsAt', Math.floor(Date.now()/1000) + staticrace.get('duration'))
this.set('unicornId', '');
return unicorn.save();
}
else {return false;}
});
});
}
}
The above code sends a PATCH to the api server route unicorns/:id
Final note about GET,POST,DELETE,PATCH:
GET assumes you are getting ALL of the information associated with a model (the example above shows a GET response). This is associated with model.findRecord (GET/:id)(expects one record), model.findAll(GET/)(expects an array of records), model.query(GET/?query=&string=)(expects an array of records), model.queryRecord(GET/?query=&string=)(expects one record)
POST assumes you at least return at least what you POST to the api server from ember , but can also return additional information you created on the apiServer side such as createdAt dates. If the data returned is different from what you used to create the model, it'll update the created model with the returned information. This is associated with model.createRecord(POST/)(expects one record).
DELETE assumes you return the type, and the id of the deleted object, not data or relationships. This is associated with model.deleteRecord(DELETE/:id)(expects one record).
PATCH assumes you return at least what information was changed. If you only change one field, for instance in my unicorn model, the unicornName, it would only PATCH the following:
{
data: {
"type":"unicorn",
"id": req.params.id,
"attributes": {
"unicorn-name" : "This is a new name!"
}
}
}
So it only expects a returned response of at least that, but like POST, you can return other changed items!
I hope this answers your questions about the JSON API adapter. Most of this information was originally gleamed by reading over the specification at http://jsonapi.org/format/ and the ember implementation documentation at https://emberjs.com/api/data/classes/DS.JSONAPIAdapter.html

ElasticSearch mapping for nested enumerable objects (i18n)

I'm at a loss as to how to map a document for search with the following structure:
{
"_id": "007ff234cb2248",
"ids": {
"source1": "123",
"source2": "456",
"source3": "789"
}
"names": [
{"en":"Example"},
{"fr":"exemple"},
{"es":"ejemplo"},
{"de":"Beispiel"}
],
"children" : [
{
"ids": {
"source1": "CXXIII",
"source2": "CDLVI",
"source3": "DCCLXXXIX",
}
names: [
{"en":"Example Child"},
{"fr":"exemple enfant"},
{"es":"Ejemplo niño"},
{"de":"Beispiel Kindes"}
]
}
],
"relatives": {
// Typically no "ids" at this level.
"relation": 'uncle',
"children": [
{
"ids": {
"source1": "0x7B",
"source2": "0x1C8",
"source3": "0x315"
},
"names": [
{"en":"Example Cousin"},
{"fr":"exemple cousine"},
{"es":"Ejemplo primo"},
{"de":"Beispiel Cousin"}
]
}
]
}
}
The child object may appear in the children section directly, or further nested in my document as uncle.children (cousins, in this case). The IDs field is common to levels one (the root), level two (the children and the uncle), and to level three (the cousins), the naming structure is also common to levels one and three.
My use-case is to be able to search for IDs (nested objects) by prefix, and by the whole ID. And also to be able to search for child names, following an (as yet undefined) set of analyzer rules.
I haven't been able to find a way to map these in any useful way. I don't believe I'll have much success using the same technique for ids and names, as there's an extra level of mapping between names and the document root.
I'm not even certain that it is even mappable. I believe at least in principle that the ids should be mappable as terms, and perhaps that if I index the names as terms in some way, too.
I'm simply at a loss, and the documentation doesn't seem to cover anything like this level of complex mapping.
I have limited (read: no) control of the document as it's coming from the CouchDB river, and the upstream application already relies on this format, so I can't really change it.
I'm looking for being able to search by the following pseudo conditions, all of which should match:
ID: "123"
ID by source (I don't know how best to mark this up in pseudo language)
ID prefix: "CDL"
Name: "Example", "Example Child"
Localized name (I don't even know how best to pseudo-mark this up!
The specifics of tokenising and analysis I can figure out for myself, when I at least know how to map
Objects when both the key and the value of the object properties are important
Enumerable objects when the key and value are important.
If the mapping from an ID to its children is 1-to-many, then you could store the children's names in a child field, as a field can have multiple values. Each document would then have an ID field, possibly a relation field, and zero or more child fields.

Document serialization with Doctrine MongoDB ODM

I'm trying to code a class handling serialization of documents by reading their metadata. I got inspired by this implementation for entities with Doctrine ORM and modified it to match how Doctrine ODM handles documents. Unfortunatly something is not working correctly as one document is never serialized more than once even if it is refered a 2nd time thus resulting on incomplete serialization.
For example, it outputs this (in json) for a user1 (see User document) that belongs to some place1 (see Place document). Then it outputs the place and the users belonging to it where we should see the user1 again but we don't :
{
id: "505cac0d6803fa1e15000004",
login: "user1",
places: [
{
id: "505cac0d6803fa1e15000005",
code: "place1",
users: [
{
id: "505c862c6803fa6812000000",
login: "user2"
}
]
}
]
}
I guess it could be related to something preventing circular references but is there a way around it ?
Also, i'm using this in a ZF2 application, would there be a better way to implement this using the ZF2 Serializer ?
Thanks for your help.
I have a serializer already written for DoctrineODM. You can find it in http://github.com/superdweebie/DoctrineExtensions - look in lib/Sds/DoctrineExtensions/Serializer.
If you are are using zf2, then you might also like http://github.com/superdweebie/DoctrineExtensionsModule, which configures DoctrineExtensions for use in zf2.
To use the Module, install it with composer, as you would any other module. Then add the following to your zf2 config:
'sds' => [
'doctrineExtensions' => [
'extensionConfigs' => [
'Sds\DoctrineExtensions\Serializer' => null,
),
),
),
To get the serializer use:
$serializer = $serivceLocator->get('Sds\DoctrineExtensions\Serializer');
To use the serializer:
$array = $serializer->toArray($document)
$json = $serializer->toJson($document)
$document = $serializer->fromArray($array)
$document = $serializer->fromJson($json)
There are also some extra annotations available to control serialization, if you want to use them:
#Sds\Setter - specify a non standard setter for a property
#Sds\Getter - specify a non standard getter fora property
#Sds\Serializer(#Sds\Ignore) - ignore a property when serializing
It's all still a work in progress, so any comments/improvements would be much appreciated. As you come across issues with these libs, just log them on github and they will get addressed promptly.
Finally a note on serializing embedded documents and referenced documents - embedded documents should be serialized with their parent, while referenced documents should not. This reflects the way data is saved in the db. It also means circular references are not a problem.
Update
I've pushed updates to Sds/DoctrineExtensions/Serializer so that it can now handle references properly. The following three (five) methods have been updated:
toArray/toJson
fromArray/fromJson
applySerializeMetadataToArray
The first two are self explainitory - the last is to allow serialization rules to be applied without having to hydrate db results into documents.
By default references will be serialized to an array like this:
[$ref: 'CollectionName/DocumentId']
The $ref style of referencing is what Mongo uses internally, so it seemed appropriate. The format of the reference is given with the expectation it could be used as a URL to a REST API.
The default behaviour can be overridden by defineing an alternative ReferenceSerializer like this:
/**
* #ODM\ReferenceMany(targetDocument="MyTargetDocument")
* #Sds\Serializer(#Sds\ReferenceSerializer('MyAlternativeSerializer'))
*/
protected $myDocumentProperty;
One alternate ReferenceSerializer is already included with the lib. It is the eager serializer - it will serialize references as if they were embedded documents. It can be used like this:
/**
* #ODM\ReferenceMany(targetDocument="MyTargetDocument")
* #Sds\Serializer(#Sds\ReferenceSerializer('Sds\DoctrineExtensions\Serializer\Reference\Eager'))
*/
protected $myDocumentProperty;
Or an alternate shorthand annotation is provided:
/**
* #ODM\ReferenceMany(targetDocument="MyTargetDocument")
* #Sds\Serializer(#Sds\Eager))
*/
protected $myDocumentProperty;
Alternate ReferenceSerializers must implement Sds\DoctrineExtensions\Serializer\Reference\ReferenceSerializerInterface
Also, I cleaned up the ignore annotation, so the following annotations can be added to properties to give more fine grained control of serialization:
#Sds\Serializer(#Sds\Ignore('ignore_when_serializing'))
#Sds\Serializer(#Sds\Ignore('ignore_when_unserializing'))
#Sds\Serializer(#Sds\Ignore('ignore_always'))
#Sds\Serializer(#Sds\Ignore('ignore_never'))
For example, put #Sds\Serializer(#Sds\Ignore('ignore_when_serializing')) on an email property - it means that the email can be sent upto the server for update, but can never be serialized down to the client for security.
And lastly, if you hadn't noticed, sds annotations support inheritance and overriding, so they play nice with complex document structures.
Another very simple, framework independent way to transforming Doctrine ODM Document to Array or JSON - http://ajaxray.com/blog/converting-doctrine-mongodb-document-tojson-or-toarray
This solution gives you a Trait that provides toArray() and toJSON() functions for your ODM Documents. After useing the trait in your Document, you can do -
<?php
// Assuming in a Symfony2 Controller
// If you're not, then make your DocmentManager as you want
$dm = $this->get('doctrine_mongodb')->getManager();
$report = $dm->getRepository('YourCoreBundle:Report')->find($id);
// Will return simple PHP array
$docArray = $report->toArray();
// Will return JSON string
$docJSON = $report->toJSON();
BTW, it will work only on PHP 5.4 and above.