json schemas: Is type necessary when const or enum are present? - jsonschema

When a property has const or enum, what are the benefits or downsides of proving type too?
{
"type": "object",
"properties": {
"has_car": {
"title": "Do you have a car?",
"enum": ["yes", "no"],
"type": "string",
"$comment": "Do I need type here?"
},
"car_brand": {
"title": "What's your car brand?",
"type": "string"
},
"terms": {
"title": "I accept my car terms",
"const": "acknowledged",
"$comment": "Do I need type here?"
}
}
}

There really is no benefit. const and enum verify exact values, so type adds nothing, as you suspect.

No it is not required because keywords are applied separately. Depending upon what is using the schema it can help to define both. For code generation, adding type helps because one can limit the allowed input types with that info. So that is a potential benefit.
One downside is that your schema definition is larger, and that if new enum values are added with different types are added, those new types need to be added also.

Related

What is the correct way to implement update of the entity? (Symfony 6)

I have an event entity.
What is the correct way to implement update of this entity? Our frontend-developer wants everything to be done with a single PUT request to the backend: changing the values of the title, description fields, as well as adding, deleting, and editing prices, event_dates, and event_dates.
I made separate endpoints put /event/{id}, put /price/{id}, put event_date/{id}
What can you recommend?
{
"id": 504,
"title": "First Event",
"description": "Description of First Event",
"created_at": "2022-08-16T08:42:11+00:00",
"prices": [
{
"id": 4,
"value": "12.99",
"type": "regular",
"is_entrance_free": false,
"info": "some extra infos",
"sorting": 7
}
],
"event_dates": [
{
"id": 2,
"start_date": "2022-12-10",
"end_date": "2022-12-31",
"start_time": "13:00",
"end_time": "16:00",
"entrance_time": "12:30",
"is_open_end": false,
"info": "7"
}
]
}
One of the standard ways is to POST or PUT the JSON for either the complete new record, with everything changed, effectively overwriting the old one, but keeping the same ID, or a subset.
The request would go to an endpoint for PUT /event/{id} where the action reads the current record, and gets the JSON with the information to update.
<?php
// various use statements as required
class ApiEventController extends AbstractController
{
#[Route('/api/event/{id}', methods: ['PUT'])]
public function eventPut(Request $request, \App\Entity\Event $event)
{
// Security here - ensure the current user has permission to access & edit the event
// a custom Deserializer can restrict what is used from the content
// for example, ensuring the ID, or other fields are not changed.
$serializer->deserialize(
$request->getContent(),
\App\Entity\Event::class,
'json',
[
// takes the new values, from the request content,
// and update the old value, fetched by ID from the URL
AbstractNormalizer::OBJECT_TO_POPULATE => $entity,
]
);
// $event is now the mix of the old, and new
$entityManager->persist($event);
$entityManager->flush();
// return the updated event details
}
Updating more complex contents (such as replacing an array of prices, or event_dates within the main entity) will need other deserializers and the configuration in the Event entity and others, so that the Symfony Serializer component understands what is required. https://symfony.com/doc/current/components/serializer.html and https://symfonycasts.com/tracks/symfony has more information and tutorials that well assist in learning more.
API-platform can make much of this simpler, for the simpler cases, but an understanding of the basics would be useful as a basis of understanding.

How to indicate dynamic additions support in json schema?

Just started with jsonschema. I want to describe a collection of objects for a property where the key may not be known a priori.
Here is my starting point:
"tx_properties": {
"type": "object",
"anyOf": [
{
"required": [
"original_msg"
]
}
],
"properties": {
"original_msg": {
"type": "string"
}
}
}
}
I want to be able to validate the additions of more properties for tx_properties that may have different types but are not known at schema definition time.
For example I might have, in json:
"tx_properties": {
"original_msg": "foo",
"something_else": "bar",
"or_something_numeric": 172,
"or_even_deeper_things": {
"fungible": false,
}
}
As a n00b I'm a bit stuck on how to accomplish this. The use of anyOne is what I thought I needed at least in the final solution.
As #Evert said, "additionalProperties": false can be used to indicate that no other properties other than those listed in properties keywords (and patternProperties) are permitted. If additionalProperties is omitted, the behaviour is as if the schema said "additionalProperties": true (that is, additional properties are permitted).
Also note that the value at additionalProperties doesn't have to be a boolean: it can be a subschema itself, to allow you to conditionally allow for additional properties depending on their value(s).
Reference: https://json-schema.org/understanding-json-schema/reference/object.html#additional-properties

JSON Schema - field named "type"

I have an existing JSON data feed between multiple systems which I do not control and cannot change. I have been tasked with writing a schema for this feed. The existing JSON looks in part like this:
"ids": [
{ "type": "payroll", "value": "011808237" },
{ "type": "geid", "value": "31826" }
]
When I try to define a JSON schema for this, I wind up with a scrap of schema that looks like this:
"properties": {
"type": { <====================== PROBLEM!!!!
"type": "string",
"enum": [ "payroll", "geid" ]
},
"value": {
"type": [ "string", "null" ],
"pattern": "^[0-9]*$"
}
}
As you might guess, when the JSON validator hits that "type" on the line marked "PROBLEM!!!" it gets upset and throws an error about how type needs to be a string or array.
That's a bug in the particular implementation you're using, and should be reported as such. It should be able to handle properties-that-look-like-keywords just fine. In fact, the metaschema (the schema to valid schemas) uses "type" in exactly this way, along with all the other keywords too: e.g. http://json-schema.org/draft-07/schema
I wonder if it is not making use of the official test suite (https://github.com/json-schema-org/JSON-Schema-Test-Suite)?
You didn't indicate what implementation you're using, or what language, but perhaps you can find an alternative one here: https://json-schema.org/implementations.html#validators
I found at least a work-around if not a proper solution. Instead of "type" inside "properties", use "^type$" inside "patternProperties". I.e.
"patternProperties": {
"^type$": {
"type": "string"
}
}
Unfortunately there doesn't seem to be a great way to make "^type$" a required property. I've settled for listing all the other properties as "required" and setting the min and max property counts to the number that should be there.

RestKit resolve objects given an id or partial representation in JSON

I have a JSON array, something like:
[{
"name": "John Smith",
"occupationId": 3
},
{
"name": "Steven Davis",
"occupationId": 2
}
]
The occupation response looks something like:
[{
"id": 2,
"name": "Teacher"
},
{
"id": 3,
"name": "Teaching Assistant"
}
]
Is there a way to allow RestKit to request the correct data for the occupations, given only their id? I know this can be done if the data is persisted using CoreData, via the addConnectionForRelationship:connectedBy: method, but I would rather that the data is transient, given that the server is local and there really is no need to persist the data. I'm also aware that RKObjectMapping does not support the identifiactionAttributes property, meaning I cannot (to my knowledge) designate a way to allow the class to declare a unique, identifying property.
Any help would be appreciated. I am using a mix of Objective-C and Swift, and as such, I do not mind answers in either language.

Elasticsearch multiple analyzers for a single field

I store different kinds of documents in a single index with strict predefined mapping. All of them have some field (say, "body"), but I'd want them to be analyzed slightly differently when indexed (for example, to use different token filters for specific documents) and treaten the same way while searched. As far as I know, analyzers can't be specified per document.
What I also considered to use:
Object fields with differently analyzed subfields for document kinds, so each document has only one filled subfield (like, "body.mail", "body.html"). The problem is that I couldn't search on the whole "body" field which would look through all its subfields (to not break the existing application).
New reincarnation of multi-fields (to have "body" field with a generic analyzer and custonly analyzed "mail", "html", etc. inside it). Hovewer, I'm not sure if it's possible to use them directly while indexing and indirectly while searching (e.g., to save object with {"mail":"smth"} to use a specific index analyzer, then search by "query":{"body":"smth"} to use generic search analyzer).
To separate "body" into several fields with different mappings, remove them from _all, and set copy_to to a single body field. I'm not sure, but it will add a substantial index overhead due to copying.
As I mentioned in the comments, what you want is not possible. Your requirement, in one sentence, is: have the same data analyzed in multiple ways, but searched as a single field because this would break the existing application.
-- body.html
-- body.email
body field ---- body.content --- all searched as "body"
...
-- body.destination
-- body.whatever
Your first option is multi-fields which has this exact purpose in mind: have the same data analyzed multiple ways. The problem is that you cannot search for "body" and expect ES to search body.html, body.email... Even if this would be possible, you want to be searched with different analyzers. Again, not possible. This option requires you to change the application and search for each field in a multi_match or in a query_string.
Your second option - reincarnation of multi-fields - will again not work because you cannot refer to body and ES, in the background, to match mail, content etc.
Third option - using copy_to - will not work because copying to another field "X" means indexing the data being copied will be analyzed with X's analyzer, and this breaks your requirement of having the same data analyzed differently.
There could be a fourth option - "path": "just_name" from multi_fields - which at a first look it should work. Meaning, you can have 3 multi-fields (email, content, html) which all three have a body sub-field. Having "path": "just_name" allows you to search just for body even if body is a sub-field of multiple other fields. But this is not possible because this type of multi-fields will not accept different analyzers for the same body.
Either way, you need to change something in your requirements, because they will not work they way you want it.
These being said, I'm curious to see what queries are you using in your application. It would be a simple change (yes, you will need to change your app) from querying body field to querying body.* in a multi_match.
And I have another solution for you: create multiple indices, one index for each analyzer of your body. For example, for mail, content and html you define three indices:
PUT /multi_fields1
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "whitespace",
"search_analyzer": "standard"
}
}
}
}
}
PUT /multi_fields2
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "standard",
"search_analyzer": "standard"
}
}
}
}
}
PUT /multi_fields3
{
"mappings": {
"test": {
"properties": {
"body": {
"type": "string",
"index_analyzer": "keyword",
"search_analyzer": "standard"
}
}
}
}
}
You see that all of them have the same type and the same field name - body - but different index_analyzers. Then you define an alias:
POST _aliases
{
"actions": [
{"add": {
"index": "multi_fields1",
"alias": "multi"}},
{"add": {
"index": "multi_fields2",
"alias": "multi"}},
{"add": {
"index": "multi_fields3",
"alias": "multi"}}
]
}
Name your alias the same as your current index. The application doesn't need to change, it will use the same name for index search, but this name will not point to an index, but to an alias which in turn refers to your multiple indices. What needs to change is how you index the documents, because a html documents needs to go in multi_fields1 index for example, an email document needs to be index in multi_fields2 index etc.
Whatever solution you find/choose, your requirements need to change because the way you want it is not possible.
I think you can use multi-field. With multi-field you can define analyzers (both indexing & searching) for each sub fields, and do the search on corresponding fields base on applications requirements.
In general, index analyzer can be difference from field to field, the same for search analyzer.
{
"your_type" : {
"properties":{
"body" : {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_body_analyzer",
"search_analyzer" : "search_body_analyzer",
"fields" : {
"mail" : {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_bodymail_analyzer",
"search_analyzer" : "search_bodymail_analyzer"
},
"html": {
"type" : "string",
"index" : "analyzed",
"index_analyzer" : "index_bodyhtml_analyzer",
"search_analyzer" : "search_bodyhtml_analyzer"
}
}
}
}
}