I'm struggling with a REST API design concept. I have these classes:
user:
- first_name
- last_name
metadata_fields:
- field_name
user_metadata:
- user_id
- field_id
- value
- unique index on [user_id, field_id]
Ok, so users have many metadata and the type of metadata is defined in metadata_fields. Typical HABTM with extra data in the join table.
If I were to update user_metadata through a Rails form, the data would look like this:
user_metadata: {
id: 1,
user_id: 2,
field_id: 3,
value: 'foo'
}
If I posted to the user#update controller, the data would look like this:
user: {
user_metadata: {
id: 1,
field_id: 3,
value: 'foo'
}
}
The trouble with this approach is that we're ignoring the uniqueness of the user_id/field_id relationship. If I change the field_id in either update, I'm not just changing data, I'm changing the meaning of that data. This tends to work fine in Rails because it's somewhat of a walled garden, but it breaks down when you open up an API endpoint.
If I allow this:
PATCH /api/user_metadata
Then I'm opening myself up to someone modifying the user_id or field_id or both. Similarly with this:
PATCH /api/user/:user_id/metadata
Now user_id is set but field_id can still change. So really the only way to solve this is to limit the update to a single field:
PATCH /api/user/:user_id/metadata/:field_id
Or a bulk update:
PATCH /api/user/:user_id/metadata
But with that call, we have to modify the data structure so that the uniqueness of the user_id/field_id relationship is intact:
user_metadata: {
field_id1: 'value1',
field_id2: 'value2',
...
}
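To make that concrete, here is a rough sketch of how the server side could apply such a payload. It is written in Python rather than Rails just to keep it short, and the in-memory store and the `bulk_update` name are made up for illustration. The point is that the user comes from the URL and the field comes from the key, so neither can be re-pointed by the request body:

```python
from typing import Dict, Tuple

# In-memory stand-in for the user_metadata table. The (user_id, field_id)
# tuple plays the role of the unique index. All names here are hypothetical.
MetadataStore = Dict[Tuple[int, int], str]


def bulk_update(store: MetadataStore, user_id: int, payload: Dict[str, str]) -> None:
    """Apply a {field_id: value} payload for the user taken from the URL."""
    for raw_field_id, value in payload.items():
        field_id = int(raw_field_id)  # JSON object keys arrive as strings
        # Upsert by key: a pair can be created or overwritten, never duplicated.
        store[(user_id, field_id)] = value


store: MetadataStore = {(2, 3): "foo"}
bulk_update(store, user_id=2, payload={"3": "bar", "4": "baz"})
print(store)
```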
I'd love to hear thoughts here. I've scoured Google and found absolutely nothing. Any recommendations?
As metadata belongs to a certain user, /api/user/{userId}/metadata/{metadataId} is probably the cleanest URI for a single metadata resource of a user. The URI of your resource is already the unique key you are looking for. There can't be two resources with the same URI! Furthermore, the URI already contains the user and field IDs.
A request like GET /api/user/1 HTTP/1.1 could return a HAL-like representation like the one below:
{
"user" : {
"id": "1",
"firstName": "Max",
"lastName": "Sample",
...
"_links": {
"self" : {
"href": "/api/user/1"
}
},
"_embedded": {
"metadata" : {
"fields" : [{
"id": "1",
"type": "string",
"value": "foo",
"_links": {
"self": {
"href": "/api/user/1/metadata/1"
}
}
}, {
"id": "2",
"type": "string",
"value": "bar",
"_links": {
"self": {
"href": "/api/user/1/metadata/2"
}
}
}],
"_links": {
"self": {
"href": "/api/user/1/metadata"
}
}
}
}
}
}
Of course you could send a PUT or a PATCH request to modify an existing metadata field. Though, the URI of the resource will still be the same (unless you move or delete a resource within a PATCH request).
You can also ignore certain fields on incoming PUT requests, which prevents modification of fields like id or _links. I assume the same is valid for PATCH requests, though I would have to re-read the spec to be sure.
Therefore, I'd suggest ignoring any id or _links fields contained in requests and updating the remaining fields. But you also have the option to return a 403 Forbidden or 409 Conflict response if someone tries to update an ID field.
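Both behaviours can be sketched in a few lines. The following is only an illustration in Python, not tied to any framework: `apply_update`, `PROTECTED_FIELDS` and `ProtectedFieldError` are hypothetical names, and mapping the error to a 403 or 409 response is left to whatever request handler you use.

```python
from typing import Any, Dict

# Fields a client may never change. The names follow the HAL-like example;
# the helper itself is a hypothetical sketch.
PROTECTED_FIELDS = {"id", "_links", "_embedded"}


class ProtectedFieldError(Exception):
    """Raised in strict mode; a handler could translate it to 403 or 409."""


def apply_update(resource: Dict[str, Any], payload: Dict[str, Any],
                 strict: bool = False) -> Dict[str, Any]:
    """Return a copy of resource with the allowed fields of payload applied."""
    updated = dict(resource)
    for key, value in payload.items():
        if key in PROTECTED_FIELDS:
            # Strict mode rejects an attempted change, lenient mode ignores it.
            if strict and resource.get(key) != value:
                raise ProtectedFieldError(key)
            continue
        updated[key] = value
    return updated


field = {"id": "1", "type": "string", "value": "foo"}
print(apply_update(field, {"id": "99", "value": "bar"}))
```

In lenient mode the `id` sent by the client is dropped and only `value` changes; with `strict=True` the same request raises `ProtectedFieldError` instead.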
UPDATE
If you want to update multiple fields within a single request, you have two options:
Using PUT and replace the current set of fields with the new version
Using PATCH and send the server the necessary steps to transform the current field-set to the new field-set
Example PUT:
PUT /api/user/1/metadata HTTP/1.1
{
"metadata": {
"fields": [{
"type": "string",
"value": "newFoo"
}, {
"type": "string",
"value": "newBar"
}]
}
}
This request would first delete every stored metadata field of the user and then create a new resource for each field contained in the request. While this still guarantees unique URIs, the approach has a couple of drawbacks:
all the data which should be available after the update, even fields that do not change, needs to be transmitted
clients holding a URI to a certain resource may end up pointing at the wrong representation. For example, suppose client 1 has retrieved /user/1/metadata/2 right before another client replaces all the metadata. If the IDs are assigned via auto-increment and the update introduces a new second item, the former item 2 moves to position 3: client 1 still holds a reference to /user/1/metadata/2 while the data now lives at /user/1/metadata/3. To prevent this, UUIDs could be used instead of auto-increment IDs. If client 1 later tries to retrieve or update the former resource 2, it can be notified that the resource is no longer available, or even be redirected to the new location.
Example PATCH:
A PATCH request contains the necessary steps to transform the state of a resource to the new state. The request itself can affect multiple resources at the same time and even create or delete other resources as needed.
The following example is in json-patch+json format:
PATCH /api/user/1/metadata HTTP/1.1
[
{
"op": "add",
"path": "/0/value",
"value": "newFoo"
},
{
"op": "add",
"path": "/2",
"value": { "type": "string", "value": "totally new entry" }
},
{
"op": "remove",
"path": "/1"
}
]
The path is defined as a JSON Pointer for the invoked resource.
The add operation of the JSON-Patch type is defined as:
If the target location specifies an array index, a new value is inserted into the array at the specified index.
If the target location specifies an object member that does not already exist, a new member is added to the object.
If the target location specifies an object member that does exist, that member's value is replaced.
For the removal case however, the spec states:
If removing an element from an array, any elements above the specified index are shifted one position to the left.
Therefore the newly added entry would end up as the second element of the array (index 1). As long as the ID is not an auto-increment value, this should not be a big problem though.
Besides add and remove, the spec also contains definitions for replace, move, copy and test.
The PATCH should be transactional - either all operations succeed or none. The spec states:
If a normative requirement is violated by a JSON Patch document, or if an operation is not successful, evaluation of the JSON Patch document SHOULD terminate and application of the entire patch document SHALL NOT be deemed successful.
I interpret these lines as follows: if the patch tries to update a field which it is not supposed to update, you should return an error for the whole PATCH request and therefore not alter any resources.
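The all-or-nothing behaviour can be sketched by applying the operations to a copy and only handing the copy back if every step succeeded. The Python below is a deliberately small illustration that handles only add and remove; `apply_patch` and `PatchError` are hypothetical names and this is not a complete JSON Patch implementation.

```python
import copy
from typing import Any, Dict, List, Tuple


class PatchError(Exception):
    """Raised when any operation fails; the caller's document stays untouched."""


def _index(token: str, length: int, allow_end: bool) -> int:
    """Turn an array token into a checked index ('-' means the end, add only)."""
    if token == "-" and allow_end:
        return length
    if not token.isdecimal():
        raise PatchError(f"bad array index: {token!r}")
    index = int(token)
    if index > length or (index == length and not allow_end):
        raise PatchError(f"index out of range: {index}")
    return index


def _parent_and_key(doc: Any, path: str) -> Tuple[Any, str]:
    """Walk a JSON Pointer such as '/0/value' down to the parent container."""
    if not path.startswith("/"):
        raise PatchError(f"unsupported path: {path!r}")
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]
    target = doc
    for token in tokens[:-1]:
        if isinstance(target, list):
            target = target[_index(token, len(target), allow_end=False)]
        elif isinstance(target, dict) and token in target:
            target = target[token]
        else:
            raise PatchError(f"path not found: {path!r}")
    return target, tokens[-1]


def apply_patch(doc: Any, operations: List[Dict[str, Any]]) -> Any:
    """Apply add/remove operations to a deep copy; raise if any of them fails."""
    working = copy.deepcopy(doc)
    for operation in operations:
        parent, key = _parent_and_key(working, operation["path"])
        name = operation["op"]
        if name == "add" and isinstance(parent, list):
            parent.insert(_index(key, len(parent), allow_end=True), operation["value"])
        elif name == "add" and isinstance(parent, dict):
            parent[key] = operation["value"]
        elif name == "remove" and isinstance(parent, list):
            del parent[_index(key, len(parent), allow_end=False)]
        elif name == "remove" and isinstance(parent, dict) and key in parent:
            del parent[key]
        else:
            raise PatchError(f"cannot apply {name!r} at {operation['path']!r}")
    return working


fields = [{"type": "string", "value": "foo"}, {"type": "string", "value": "bar"}]
patch = [
    {"op": "add", "path": "/0/value", "value": "newFoo"},
    {"op": "add", "path": "/2", "value": {"type": "string", "value": "totally new entry"}},
    {"op": "remove", "path": "/1"},
]
patched = apply_patch(fields, patch)
print(patched)
```

Because the work happens on a copy, a failing operation anywhere in the list leaves `fields` exactly as it was, which is the behaviour a handler would need before answering with an error status.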
The drawback of the PATCH approach is clearly the transactional requirement as well as the JSON Pointer notation, which might not be that popular (at least I haven't used it often and had to look it up again). As with PUT, PATCH allows new resources to be added in between existing ones, shifting later ones to the right, which may lead to issues if you rely on auto-increment values.
Therefore, I strongly recommend using randomly generated UUIDs as identifiers rather than auto-increment values.
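As a small illustration of that recommendation, a replace-all update could hand out a fresh random UUID per field, so an old URI can never silently resolve to a different field. This is a Python sketch with a hypothetical `replace_fields` helper; the URI layout is the one used in the examples above.

```python
import uuid
from typing import Any, Dict, List


def replace_fields(new_fields: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """PUT semantics: build the replacement set, one fresh UUID per field."""
    return {str(uuid.uuid4()): dict(field) for field in new_fields}


fields = replace_fields([
    {"type": "string", "value": "newFoo"},
    {"type": "string", "value": "newBar"},
])
for field_id, field in fields.items():
    # The random ID, not the position in the list, becomes part of the URI.
    print(f"/api/user/1/metadata/{field_id}", field["value"])
```

A client that still holds a URI from before the replacement then gets a clean "not found" instead of someone else's data.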
I have an event entity.
What is the correct way to implement updates of this entity? Our frontend developer wants everything to be done with a single PUT request to the backend: changing the values of the title and description fields, as well as adding, deleting, and editing prices and event_dates.
I made separate endpoints: PUT /event/{id}, PUT /price/{id}, PUT /event_date/{id}
What can you recommend?
{
"id": 504,
"title": "First Event",
"description": "Description of First Event",
"created_at": "2022-08-16T08:42:11+00:00",
"prices": [
{
"id": 4,
"value": "12.99",
"type": "regular",
"is_entrance_free": false,
"info": "some extra infos",
"sorting": 7
}
],
"event_dates": [
{
"id": 2,
"start_date": "2022-12-10",
"end_date": "2022-12-31",
"start_time": "13:00",
"end_time": "16:00",
"entrance_time": "12:30",
"is_open_end": false,
"info": "7"
}
]
}
One of the standard ways is to POST or PUT the JSON for either the complete new record (with everything changed, effectively overwriting the old one but keeping the same ID) or a subset of it.
The request would go to an endpoint for PUT /event/{id}, where the action loads the current record and receives the JSON with the information to update.
<?php
// various use statements as required, for example:
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Annotation\Route;
use Symfony\Component\Serializer\Normalizer\AbstractNormalizer;
use Symfony\Component\Serializer\SerializerInterface;

class ApiEventController extends AbstractController
{
    #[Route('/api/event/{id}', methods: ['PUT'])]
    public function eventPut(
        Request $request,
        \App\Entity\Event $event,
        SerializerInterface $serializer,
        EntityManagerInterface $entityManager
    ): Response {
        // Security here - ensure the current user has permission to access & edit the event
        // a custom Deserializer can restrict what is used from the content
        // for example, ensuring the ID, or other fields are not changed.
        $serializer->deserialize(
            $request->getContent(),
            \App\Entity\Event::class,
            'json',
            [
                // takes the new values, from the request content,
                // and updates the old entity, fetched by ID from the URL
                AbstractNormalizer::OBJECT_TO_POPULATE => $event,
            ]
        );
        // $event is now the mix of the old, and new
        $entityManager->persist($event);
        $entityManager->flush();
        // return the updated event details
        return $this->json($event);
    }
}
Updating more complex contents (such as replacing an array of prices or event_dates within the main entity) will need other deserializers and configuration in the Event entity and others, so that the Symfony Serializer component understands what is required. https://symfony.com/doc/current/components/serializer.html and https://symfonycasts.com/tracks/symfony have more information and tutorials that will assist in learning more.
API Platform can make much of this simpler for the simpler cases, but an understanding of the basics is a useful foundation.
I'm going through the docs to try and figure out how loops work so I can validate that every object in an array of objects matches the schema.
It seems like recursion is what I want, but the example given doesn't work: https://json-schema.org/understanding-json-schema/structuring.html
I'm trying to validate that example but it's always "valid". I tried changing all the field names in the JSON and it doesn't matter:
Not sure what's happening. For this example, how would I validate that every child matches the person schema (without statically writing out each one in the schema)?
For example, I want to validate this JSON. There could be any number of objects under toplevel and any number of objects under "objectsList". I want to make sure every object under "objectsList" has the right field names and types (again without hard-coding the entire thing in the schema):
{
"toplevel": {
"objectOne": {
"objectsList": [
{
"field1": 1231,
"field2": "sekfjlskjflsdf",
"field3": ["ssss","eeee"]
},
{
"field1": 11,
"field2": "sef",
"field3": ["eeee","qqqq"]
},
{
"field1": 1231,
"field2": "wwwww",
"field3": ["sisjflkssss","esdfsdeee"]
}
]
},
"objectTwo": {
"objectsList": [
{
"field1": 99999,
"field2": "yuyuyuyuyu",
"field3": ["ssssuuu","eeeeeee"]
},
{
"field1": 221,
"field2": "vesdlkfjssef",
"field3": ["ewerweeee","ddddq"]
}
]
}
}
}
What's wrong?
The problem here is not the recursion – your schema looks good.
The underlying issue is the same as here: https://stackoverflow.com/a/61038256/5127499
JSON Schema is designed for extensibility. That means it allows any kind of additional properties to be added as long as they are not conflicting with the known/expected keywords.
Solution
The solution here is to add "additionalProperties": false in your "person" (from the screenshot) and top-level schema to prevent those incorrect objects from being accepted. The same goes for your second example: in any definition with "type": "object" you'd have to add "additionalProperties": false if you don't want to allow these extraneous properties.
Alternatively, you can declare your expected properties as required to ensure that at least those are present.
Why?
As per json-schema.org/understanding-json-schema (emphasis mine):
The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are not listed in the properties keyword. By default any additional properties are allowed.
The additionalProperties keyword may be either a boolean or an object. If additionalProperties is a boolean and set to false, no additional properties will be allowed.
To address the screenshot you posted and why the instance passes:
The schema is looking to find a person property, but that property doesn't exist.
The schema does not declare that person is required.
The schema does not declare requirements on undefined properties, so it will always accept the personsdfsd property with whatever value is in it, without checking it further.
So in short, your JSON data is bad and your schema doesn't have any protections against that.
Other than that, your schema looks good. It should validate that items in the children property match the person definition's subschema.
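For your second example, a schema along these lines should do it. The schema is written as a Python dict so it can be exercised right away. Note that `errors` is only a toy checker I wrote for the handful of keywords used here (it is not a real JSON Schema validator, and a proper library should be used in practice), and the field types are my guess from your sample data. The one trick worth pointing out is that "additionalProperties" may itself be a subschema, which is how objectOne, objectTwo and any further names are covered without listing them.

```python
from typing import Any, Dict, List

TYPES = {"object": dict, "array": list, "string": str, "integer": int}

ITEM_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "field1": {"type": "integer"},
        "field2": {"type": "string"},
        "field3": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["field1", "field2", "field3"],
    "additionalProperties": False,
}

SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "toplevel": {
            "type": "object",
            # A subschema here applies to every member, whatever its name.
            "additionalProperties": {
                "type": "object",
                "properties": {"objectsList": {"type": "array", "items": ITEM_SCHEMA}},
                "required": ["objectsList"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["toplevel"],
    "additionalProperties": False,
}


def errors(instance: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Toy checker for type, properties, required, additionalProperties, items."""
    expected = TYPES.get(schema.get("type"))
    if expected and (not isinstance(instance, expected) or isinstance(instance, bool)):
        return [f"{path}: expected {schema['type']}"]
    found: List[str] = []
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in instance:
                found.append(f"{path}: missing {name}")
        extra = schema.get("additionalProperties", True)  # allowed by default
        for name, value in instance.items():
            if name in props:
                found += errors(value, props[name], f"{path}.{name}")
            elif extra is False:
                found.append(f"{path}: unexpected {name}")
            elif isinstance(extra, dict):
                found += errors(value, extra, f"{path}.{name}")
    elif isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            found += errors(item, schema["items"], f"{path}[{i}]")
    return found


good = {"toplevel": {"objectOne": {"objectsList": [
    {"field1": 1, "field2": "a", "field3": ["x"]}]}}}
bad = {"toplevel": {"objectOne": {"objectsList": [
    {"fieldX": 1, "field2": "a", "field3": ["x"]}]}}}
loose = {"type": "object", "properties": {"field1": {"type": "integer"}}}

print(errors(good, SCHEMA))        # no complaints for the well-formed document
print(errors(bad, SCHEMA))         # the renamed field is reported
print(errors({"fieldX": 1}, loose))  # without required/additionalProperties it passes
```

The last call shows the situation from your screenshot: a schema that neither requires its properties nor forbids unknown ones accepts a misspelled name without complaint.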
I have a JSON array, something like:
[{
"name": "John Smith",
"occupationId": 3
},
{
"name": "Steven Davis",
"occupationId": 2
}
]
The occupation response looks something like:
[{
"id": 2,
"name": "Teacher"
},
{
"id": 3,
"name": "Teaching Assistant"
}
]
Is there a way to have RestKit request the correct data for the occupations, given only their id? I know this can be done if the data is persisted using Core Data, via the addConnectionForRelationship:connectedBy: method, but I would rather keep the data transient, given that the server is local and there really is no need to persist it. I'm also aware that RKObjectMapping does not support the identificationAttributes property, meaning I cannot (to my knowledge) designate a unique, identifying property for the class.
Any help would be appreciated. I am using a mix of Objective-C and Swift, and as such, I do not mind answers in either language.
I store different kinds of documents in a single index with a strict predefined mapping. All of them have some field (say, "body"), but I'd want them to be analyzed slightly differently when indexed (for example, to use different token filters for specific documents) and treated the same way when searched. As far as I know, analyzers can't be specified per document.
What I have also considered:
Object fields with differently analyzed subfields for document kinds, so each document has only one filled subfield (like "body.mail", "body.html"). The problem is that I couldn't search on the whole "body" field and have it look through all its subfields (to not break the existing application).
The new incarnation of multi-fields (to have a "body" field with a generic analyzer and custom-analyzed "mail", "html", etc. inside it). However, I'm not sure if it's possible to use them directly while indexing and indirectly while searching (e.g., to save an object with {"mail":"smth"} to use a specific index analyzer, then search by "query":{"body":"smth"} to use the generic search analyzer).
Separating "body" into several fields with different mappings, removing them from _all, and setting copy_to to a single body field. I'm not sure, but I suspect it will add substantial index overhead due to copying.
As I mentioned in the comments, what you want is not possible. Your requirement, in one sentence, is: have the same data analyzed in multiple ways, but searched as a single field, because anything else would break the existing application.
             -- body.html
             -- body.email
body field ---- body.content --- all searched as "body"
             ...
             -- body.destination
             -- body.whatever
Your first option is multi-fields, which has this exact purpose in mind: have the same data analyzed in multiple ways. The problem is that you cannot search for "body" and expect ES to search body.html, body.email... Even if this were possible, you would want them to be searched with different analyzers. Again, not possible. This option requires you to change the application and search each field in a multi_match or in a query_string.
Your second option - the reincarnation of multi-fields - will again not work, because you cannot refer to body and have ES, in the background, match mail, content etc.
The third option - using copy_to - will not work because copying to another field "X" means the copied data will be analyzed with X's analyzer, and this breaks your requirement of having the same data analyzed differently.
There could be a fourth option - "path": "just_name" from multi_fields - which at first look should work. Meaning, you can have three multi-fields (email, content, html) which all have a body sub-field. Having "path": "just_name" allows you to search just for body even if body is a sub-field of multiple other fields. But this is not possible because this type of multi-field will not accept different analyzers for the same body.
Either way, you need to change something in your requirements, because they will not work the way you want.
That being said, I'm curious to see what queries you are using in your application. It would be a simple change (yes, you will need to change your app) from querying the body field to querying body.* in a multi_match.
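To show how small that change is, here is the request body built as a plain Python dict, so no client library is assumed. `body_query` is just a hypothetical helper name and "smth" is the sample text from the question; the JSON it prints is what you would send to the _search endpoint.

```python
import json
from typing import Any, Dict


def body_query(text: str) -> Dict[str, Any]:
    """Build a multi_match query that searches every body.* sub-field."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # The wildcard expands to body.html, body.mail, and so on.
                "fields": ["body.*"],
            }
        }
    }


print(json.dumps(body_query("smth"), indent=2))
```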
And I have another solution for you: create multiple indices, one index for each analyzer of your body. For example, for mail, content and html you define three indices:
PUT /multi_fields1
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "whitespace",
"search_analyzer": "standard"
}
}
}
}
}
PUT /multi_fields2
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "standard",
"search_analyzer": "standard"
}
}
}
}
}
PUT /multi_fields3
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "keyword",
"search_analyzer": "standard"
}
}
}
}
}
You see that all of them have the same type and the same field name - body - but different index_analyzers. Then you define an alias:
POST _aliases
{
"actions": [
{"add": {
"index": "multi_fields1",
"alias": "multi"}},
{"add": {
"index": "multi_fields2",
"alias": "multi"}},
{"add": {
"index": "multi_fields3",
"alias": "multi"}}
]
}
Name your alias the same as your current index. The application doesn't need to change: it will use the same name for searching, but this name will no longer point to an index, but to an alias which in turn refers to your multiple indices. What needs to change is how you index the documents, because an html document needs to go into the multi_fields1 index for example, an email document needs to be indexed in the multi_fields2 index, etc.
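The indexing-side change boils down to a lookup from document kind to concrete index, while searches keep going through the alias. A minimal Python sketch follows, where the kind names and the assignment of "content" to multi_fields3 are my assumptions and `index_for` is a hypothetical helper:

```python
# Which concrete index each kind of document is written to. The html and
# email entries follow the text above; the content entry is an assumption.
INDEX_BY_KIND = {
    "html": "multi_fields1",
    "email": "multi_fields2",
    "content": "multi_fields3",
}

# Searches never name a concrete index, only the alias.
SEARCH_ALIAS = "multi"


def index_for(kind: str) -> str:
    """Return the index a document of the given kind must be indexed into."""
    try:
        return INDEX_BY_KIND[kind]
    except KeyError:
        raise ValueError(f"no index configured for document kind {kind!r}") from None


print(index_for("html"), "for writes,", SEARCH_ALIAS, "for searches")
```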
Whatever solution you find/choose, your requirements need to change because the way you want it is not possible.
I think you can use multi-fields. With multi-fields you can define analyzers (both for indexing and searching) for each sub-field, and search on the corresponding fields based on the application's requirements.
In general, the index analyzer can differ from field to field, and the same goes for the search analyzer.
{
"your_type" : {
"properties":{
"body" : {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_body_analyzer",
"search_analyzer" : "search_body_analyzer",
"fields" : {
"mail" : {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_bodymail_analyzer",
"search_analyzer" : "search_bodymail_analyzer"
},
"html": {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_bodyhtml_analyzer",
"search_analyzer" : "search_bodyhtml_analyzer"
}
}
}
}
}
}
I have the following JSON structure which I get from a RestService:
{
"customer": {
"id": "123456",
[more attributes ....]
"items": [
{
"id": "1234",
},
{
"id": "2345",
}
[more items...]
]
}
}
which I successfully map into Core Data using RestKit. From another RestService (which I cannot change) I then get more details for one single item in the items array. The JSON answer looks like
{
"customer": {
"id": "123456",
"item": {
"id": "1234",
"name": "foo",
[other attributes...]
}
}
}
Now the question: How can I map the second answer so that the single item is added to the items array (or updated if it is already in there)?
Thanks for any ideas!
If you already know how to map JSON to Core Data, all that's left is to fetch the object you want to add your item attributes to (using id or something else) and then set them, overwriting the old values or adding new fields. That's just the general approach.
If you set the appropriate primaryKeyAttribute of the RKManagedObjectMapping object you should be able to perform the mapping as you want it to.
It would actually be easier to help you, if you would post some of your mapping code, but this is how I meant it to be
Create the mapping for your customer object, defining all possible attributes, and declare mappingObject.primaryKeyAttribute = @"id"
Execute the mapping with the first request (or first answer as you put it)
After the first mapping step is finished execute the second request
This should initially create the customer objects you want and then update them.
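What the primary key buys you is an update-or-insert lookup. Stripped of RestKit and Core Data, the idea looks roughly like this in Python; `upsert_item` is a hypothetical helper and the dicts stand in for the managed objects, so treat it as an illustration of the matching logic rather than of the RestKit API.

```python
from typing import Any, Dict


def upsert_item(customer: Dict[str, Any], item: Dict[str, Any],
                key: str = "id") -> Dict[str, Any]:
    """Merge item into customer['items'], matching on the primary key."""
    items = customer.setdefault("items", [])
    for existing in items:
        if existing.get(key) == item.get(key):
            existing.update(item)  # same key: enrich the existing object
            return existing
    items.append(dict(item))       # unknown key: add a new object
    return items[-1]


# First response: items with ids only. Second response: details for one item.
customer = {"id": "123456", "items": [{"id": "1234"}, {"id": "2345"}]}
upsert_item(customer, {"id": "1234", "name": "foo"})
print(customer["items"])
```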