How do I subclass a JSON schema - jsonschema

How do I subclass in JSON-Schema?
First I restrict myself to draft-07, because that's all I can find implementations of.
The naive way to do sub-classing is described in
https://json-schema.org/understanding-json-schema/structuring.html#extending
But this works poorly with 'additionalProperties': false.
Why bother with 'additionalProperties': false?
Without it, nearly any garbage input JSON is considered valid, because every mistake (a misspelled property name, for instance) is simply treated as an additional property.
Recapping https://json-schema.org/understanding-json-schema/structuring.html#extending
use allOf(baseClass)
then add your own properties
The problem with this is that it doesn't work with 'additionalProperties': false. By an unclear but apparently unfortunate definition, additionalProperties ONLY takes into account the properties defined locally (in that same sub-schema), so one or the other schema will fail validation.
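To make the failure concrete, here is a minimal Python sketch. It hand-rolls only the four keywords involved (properties, required, additionalProperties: false, allOf) and ignores the property sub-schemas themselves, so is_valid, base and extended are illustrative names of mine, not a real validator's API.

```python
def is_valid(schema, instance):
    """Check a dict instance against a tiny subset of draft-07 keywords."""
    props = schema.get("properties", {})
    # Every listed required property must be present.
    if any(name not in instance for name in schema.get("required", [])):
        return False
    # additionalProperties only sees the "properties" next to it in the
    # SAME schema object; it knows nothing about allOf branches.
    if schema.get("additionalProperties") is False:
        if any(name not in props for name in instance):
            return False
    # Each allOf branch is checked independently against the whole instance.
    return all(is_valid(sub, instance) for sub in schema.get("allOf", []))


base = {
    "properties": {"street": {}, "city": {}},
    "required": ["street", "city"],
    "additionalProperties": False,
}
extended = {
    "allOf": [base],
    "properties": {"type": {}},
    "required": ["type"],
    "additionalProperties": False,
}

plain = {"street": "s", "city": "c"}
typed = {"street": "s", "city": "c", "type": "business"}

print(is_valid(base, plain))      # True
print(is_valid(base, typed))      # False: base does not list "type"
print(is_valid(extended, typed))  # False: extended does not list "street"/"city"
```

Dropping additionalProperties from extended does not help either: the allOf branch still rejects "type" because base is closed.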
Alternative Approaches:
meta languages/interpreters layered on top of JSONSchema
(such as https://github.com/mokkabonna/json-schema-merge-allof)
This is not a good choice, as the schema can then only be used from JavaScript (or whatever language the meta processor is written in), and it is not easily interoperable with other tools.
https://github.com/java-json-tools/json-schema-validator/wiki/v5%3A-merge
An alternative I will propose as a 'solution' / answer

How do I subclass in JSON-Schema?
You don't, because JSON Schema is not object oriented and schemas are not classes. JSON Schema is designed for validation. A schema is a collection of constraints.
But, let's look at it from an OO perspective anyway.
Composition over inheritance
The first thing to note is that JSON Schema doesn't support an analog to inheritance. You might be familiar with the old OO wisdom, "composition over inheritance". The Go language chooses not to support inheritance at all, so JSON Schema is in good company with that approach. If you build your system using only composition, you will have no issues with "additionalProperties": false.
Polymorphism
Let's say that thinking in terms of composition is too foreign (it takes time to learn to think differently) or you don't have control over how your types are designed. If, for whatever reason, you need to model your data using inheritance, you can use the allOf pattern you're familiar with. The allOf pattern isn't quite the same as inheritance, but it's the closest you're going to get.
As you've noted, "additionalProperties": false wreaks havoc in conjunction with the allOf pattern. So, why should you leave this out? The OO answer is polymorphism. Let's say you have a "Person" type and a "Student" type that extends "Person". If you have a Student, you should be able to pass it to a method that accepts a Person. It doesn't matter that Student has a few properties that Person doesn't, when it's being used as a Person, the extra properties are simply ignored. If you use "additionalProperties": false, your types can't be polymorphic.
None of this is the kind of solution you are asking for, but hopefully it gives you a different perspective from which to consider alternatives that solve your problem in a way that is more idiomatic for JSON Schema.

I struggled with that, especially since I had to use legacy versions of JSON Schema. And I found that the solution is a tiny bit verbose but quite easy to read and understand.
Let's say that you want to describe this kind of type:
interface Book {
pageCount: number
}
interface Comic extends Book {
imageCount: number
}
interface Encyclopedia extends Book {
volumeCount: number
}
// This is the schema I want to represent:
type ComicOrEncyclopedia = Comic | Encyclopedia
Here is how I can both handle polymorphism and forbid any extra-prop (while obviously enforcing inherited types in the "child" definitions):
{
"$schema": "http://json-schema.org/draft-07/schema#",
"definitions": {
"bookDefinition": {
"type": "object",
"properties": {
"imageCount": {
"type": "number"
},
"pageCount": {
"type": "number"
},
"volumeCount": {
"type": "number"
}
}
},
"comicDefinition": {
"type": "object",
"allOf": [{ "$ref": "#/definitions/bookDefinition" }],
"properties": {
"imageCount": {},
"pageCount": {},
"volumeCount": {
"not": {}
}
},
"required": ["imageCount", "pageCount"],
"additionalProperties": false
},
"encyclopediaDefinition": {
"type": "object",
"allOf": [{ "$ref": "#/definitions/bookDefinition" }],
"properties": {
"imageCount": {
"not": {}
},
"pageCount": {},
"volumeCount": {}
},
"required": ["pageCount", "volumeCount"],
"additionalProperties": false
}
},
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/comicDefinition" },
{ "$ref": "#/definitions/encyclopediaDefinition" }]
}

This isn't a GREAT answer. But until the definition of JSONSchema is improved (or someone provides a better answer) - this is what I've come up with as workable.
Basically, you define two copies of each type. The first has all the details but no 'additionalProperties: false' flag. The second REFERENCES the first and sets 'additionalProperties: false'.
The first you can think of as an 'abstract class' and the second as a 'concrete class'.
Then, to 'subclass', you use the https://json-schema.org/understanding-json-schema/structuring.html#extending approach, but referencing the ABSTRACT class, and then add the 'additionalProperties: false'. SADLY, to make this work, you must also REPEAT all the inherited properties (but no need to include their type info - just their names) - due to the sad choice for how JSONSchema draft 7 appears to interpret additionalProperties.
An EXAMPLE - based on https://json-schema.org/understanding-json-schema/structuring.html#extending should help:
https://www.jsonschemavalidator.net/s/3fhU3O1X
(reproduced here in case the other site/link is not permanent/reliable)
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://TEST",
"definitions": {
"interface-address": {
"type": "object",
"properties": {
"street_address": {
"type": "string"
},
"city": {
"type": "string"
},
"state": {
"type": "string"
}
},
"required": ["street_address", "city", "state"]
},
"concrete-address": {
"allOf": [
{
"$ref": "#/definitions/interface-address"
}
],
"properties": {
"street_address": {},
"city": {},
"state": {}
},
"additionalProperties": false
},
"in-another-file-subclass-address": {
"allOf": [
{
"$ref": "#/definitions/interface-address"
}
],
"additionalProperties": false,
"properties": {
"street_address": {},
"city": {},
"state": {},
"type": {
"enum": ["residential", "business"]
}
},
"required": ["type"]
},
"test-of-address-schemas": {
"type": "object",
"properties": {
"interface-address-allows-bad-fields": {
"$ref": "#/definitions/interface-address"
},
"use-concrete-address-to-only-admit-legit-addresses-without-extra-crap": {
"$ref": "#/definitions/concrete-address"
},
"still-can-subclass-using-interface-not-concrete": {
"$ref": "#/definitions/in-another-file-subclass-address"
}
}
}
},
"anyOf": [
{
"$ref": "#/definitions/test-of-address-schemas"
}
]
}
and example document:
{
"interface-address-allows-bad-fields":{
"street_address":"s",
"city":"s",
"state":"s",
"allow-bad-fields-this-is-why-we-need-additionalProperties":"s"
},
"use-concrete-address-to-only-admit-legit-addresses-without-extra-crap":{
"street_address":"s",
"city":"s",
"state":"s"
},
"still-can-subclass-using-interface-not-concrete":{
"street_address":"s",
"city":"s",
"state":"s",
"type":"business"
}
}
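Repeating the inherited property names by hand is easy to get wrong, so one option is to generate the concrete schemas at build time. A rough Python sketch follows. make_concrete is a hypothetical helper, not part of any JSON Schema library, and it only looks at the abstract definition's own properties (it does not follow further allOf parents). The output is plain draft-07 JSON, so validators need no preprocessing step.

```python
import json


def make_concrete(abstract_name, definitions, extra_properties=None, extra_required=()):
    """Build a closed ('concrete') schema that extends an abstract definition.

    Inherited property names are repeated with empty schemas so that
    additionalProperties: false accepts them; their constraints still come
    from the abstract definition through allOf.
    """
    inherited = definitions[abstract_name].get("properties", {})
    properties = {name: {} for name in inherited}
    properties.update(extra_properties or {})
    concrete = {
        "allOf": [{"$ref": "#/definitions/" + abstract_name}],
        "properties": properties,
        "additionalProperties": False,
    }
    if extra_required:
        concrete["required"] = list(extra_required)
    return concrete


definitions = {
    "interface-address": {
        "type": "object",
        "properties": {
            "street_address": {"type": "string"},
            "city": {"type": "string"},
            "state": {"type": "string"},
        },
        "required": ["street_address", "city", "state"],
    }
}

definitions["concrete-address"] = make_concrete("interface-address", definitions)
definitions["subclass-address"] = make_concrete(
    "interface-address",
    definitions,
    extra_properties={"type": {"enum": ["residential", "business"]}},
    extra_required=["type"],
)

print(json.dumps(definitions["subclass-address"], indent=2))
```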

Related

Using $vars within json schema $ref is undefined

While following the documentation for using variables in json schema I noticed the following example fails. It looks like the number-type doesn't get stored as a variable and cannot be read.
{
"$id": "http://example.com/number#",
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["natural", "integer"]
},
"value": {
"$ref": "#/definitions/{+number-type}",
"$vars": {
"number-type": {"$ref": "1/type"}
}
}
},
"required": ["type", "value"],
"definitions": {
"natural": {
"type": "integer",
"minimum": 0
},
"integer": {
"type": "integer"
}
}
}
results in
Could not find a definition for #/definitions/{+number-type}
tl;dr $vars is not a JSON Schema keyword. It is an implementation specific extension.
The documentation you link to is not JSON Schema. It is documentation for a specific library which adds a preprocessing step to its JSON Schema processing model.
As such, this would only ever work when using that library, and would not create an interoperable or reuseable JSON Schema, if that's a consideration.
If you are using that library specifically, it sounds like a bug, and you should file an Issue in the appropriate repo. As you haven't provided any code, I can't tell what implementation you are using, so I can't be sure on that.
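If interoperability matters, one way to express the same dependency in standard draft-07 is a list of if/then blocks, one per allowed value of "type". The Python sketch below generates such a schema from the question's definitions; build_conditional_schema is a hypothetical helper of mine, and this is a suggestion rather than what the library in question does.

```python
import json


def build_conditional_schema(definitions, selector="type", target="value"):
    """Return a draft-07 schema where `target` must match the definition
    named by the value of `selector`."""
    branches = []
    for name in definitions:
        branches.append({
            # Applies only when the selector property equals this name.
            "if": {
                "properties": {selector: {"const": name}},
                "required": [selector],
            },
            "then": {
                "properties": {target: {"$ref": "#/definitions/" + name}},
            },
        })
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            selector: {"type": "string", "enum": list(definitions)},
        },
        "required": [selector, target],
        "allOf": branches,
        "definitions": definitions,
    }


definitions = {
    "natural": {"type": "integer", "minimum": 0},
    "integer": {"type": "integer"},
}
schema = build_conditional_schema(definitions)
print(json.dumps(schema, indent=2))
```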

jsonschema dependentSchema not validating

I am trying to learn json schema, but something isn't working out for me.
I'm trying to run the example from http://json-schema.org/understanding-json-schema/reference/conditionals.html#id4 for dependentSchemas, but it just doesn't validate.
I'm using this schema:
check_schema = {"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" }
},
"required": ["name"],
"dependentSchemas": {
"credit_card": {
"properties": {
"billing_address": { "type": "string" }
},
"required": ["billing_address"]
}
}
}
and this json, that should raise an error since it is missing the key billing_address:
check_dict={
"name": "John Doe",
"credit_card": 5555555555555555
}
but when I use jsonschema.validate(check_dict, check_schema) (with python, jsonschema package version 4.2.1), the validation passes with no issues.
What am I doing wrong here?
If you are using an implementation that doesn't support at least draft2019-09 of the specification, dependentSchemas won't be recognized as a keyword. In earlier versions (draft7 and before), that keyword was known as dependencies, with the same syntax (actually, dependencies was split into two, dependentSchemas and dependentRequired).
The details are described on the page you linked, https://json-schema.org/understanding-json-schema/reference/conditionals.html#dependentschemas.
If you still believe that what you have should work, I suggest you open a bug report on the implementation's issue queue.
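If you turn out to be stuck on a draft-07 validator, the renaming can be undone mechanically. The sketch below is a hypothetical helper (to_draft7_dependencies is my own name) that folds dependentSchemas and dependentRequired back into the older dependencies keyword. It only handles the top level of the schema and does not translate any other newer keywords.

```python
import copy


def to_draft7_dependencies(schema):
    """Return a copy of `schema` with dependentSchemas / dependentRequired
    folded back into the draft-07 `dependencies` keyword (top level only)."""
    result = copy.deepcopy(schema)
    dependent_schemas = result.pop("dependentSchemas", {})
    dependent_required = result.pop("dependentRequired", {})
    if "$schema" in result:
        result["$schema"] = "http://json-schema.org/draft-07/schema#"
    if not dependent_schemas and not dependent_required:
        return result
    dependencies = result.setdefault("dependencies", {})
    for name, required in dependent_required.items():
        # Array form: the listed properties must be present.
        dependencies[name] = list(required)
    for name, subschema in dependent_schemas.items():
        if name in dependent_required:
            # Both forms used for one property: express both as a schema.
            subschema = {
                "allOf": [subschema, {"required": list(dependent_required[name])}]
            }
        dependencies[name] = subschema
    return result


check_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "credit_card": {"type": "number"},
    },
    "required": ["name"],
    "dependentSchemas": {
        "credit_card": {
            "properties": {"billing_address": {"type": "string"}},
            "required": ["billing_address"],
        }
    },
}

draft7_schema = to_draft7_dependencies(check_schema)
print(draft7_schema["dependencies"])
```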

Are there more examples of readOnly being used at several levels of a schema to clarify the semantics?

I've been using the readOnly keyword in my schemas and I just realized that I was just making up my own semantics. Now I'm cleaning up a bunch of my designs and trying to validate that I was using this annotation as it was intended. The validation spec is what I'm basing this question on but I'd like to be aware of more example usage scenarios.
Let me give three examples. In this first example I mean to say the entire resource is read only. Nothing can be mutated at any level.
{
"type": "object",
"readOnly": true,
"properties": {
"name": {
"type": "string"
},
"members": {
"type": "object",
"properties": {
"member1": { "type" : "string" },
"member2": { "type" : "string" }
}
}
}
}
I don't think that's too controversial. But originally, my own mental model was that readOnly at the top level meant you couldn't replace this resource with a new resource. The server would prevent that. But the internal members were still mutable. So I sprinkled readOnly at the name sub-schema and each member sub-schema. I think removing all of those was correct. (My mental model was maybe loosely based on how I interpret const variables in JavaScript. If my const var is an object, I can't change the value of the variable, but I can mutate its members or even add members to it.)
In the second example I leave readOnly out of the schema completely. So it's not too controversial to take that to mean anything is mutable in the resource.
{
"type": "object",
"properties": {
"name": {
"type": "string"
},
"members": {
"type": "object",
"properties": {
"member1": { "type" : "string" },
"member2": { "type" : "string" }
}
}
}
}
In the third example, I want to mix and match
{
"type": "object",
"properties": {
"name": {
"type": "string",
"readOnly": true
},
"members": {
"type": "object",
"properties": {
"member1": { "type" : "string", "readOnly": true },
"member2": { "type" : "string", "readOnly": true },
"member3": { "type" : "string"},
"member4": { "type" : "string"}
}
}
}
}
In this example the name, member1 and member2 are immutable. member3 and member4 can be modified.
So the question is, is there anything wrong about my interpretation of readOnly?
The spec you linked defines the following for readOnly:
If "readOnly" has a value of boolean true, it indicates that the value
of the instance is managed exclusively by the owning authority, and
attempts by an application to modify the value of this property are
expected to be ignored or rejected by that owning authority.
https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-validation-02#section-9.4
If you take the JSON definition of a value, it's the part to the right of the key and colon. I would therefore read readOnly as applying to any part of that value, nested members included.
The OpenAPI specification only really defines readOnly as being applicable to individual properties.
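Under that reading, the locations an application should treat as read-only can be listed with a small schema walk. This is only a sketch of the interpretation above, not something the spec prescribes: read_only_pointers is a hypothetical helper, it follows only "properties" (no $ref, items, allOf and so on), and it does not escape "/" or "~" in property names.

```python
def read_only_pointers(schema, pointer=""):
    """List JSON Pointers of instance locations annotated readOnly: true.

    Assumes readOnly on a subschema covers the whole value at that
    location, so the walk stops descending once it finds the annotation.
    The empty string "" is the JSON Pointer for the document root.
    """
    if schema.get("readOnly") is True:
        return [pointer]
    found = []
    for name, subschema in schema.get("properties", {}).items():
        found.extend(read_only_pointers(subschema, pointer + "/" + name))
    return found


mixed = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "readOnly": True},
        "members": {
            "type": "object",
            "properties": {
                "member1": {"type": "string", "readOnly": True},
                "member2": {"type": "string", "readOnly": True},
                "member3": {"type": "string"},
                "member4": {"type": "string"},
            },
        },
    },
}

print(read_only_pointers(mixed))
# ['/name', '/members/member1', '/members/member2']
print(read_only_pointers({"type": "object", "readOnly": True}))
# ['']
```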

Is it possible to inline JSON schemas into a JSON document? [duplicate]

For example, take a schema for a file system in which a directory contains a list of files. The schema consists of the specification of file, then a sub type "image" and another one, "text".
At the bottom there is the main directory schema. Directory has a property content which is an array of items that should be sub types of file.
Basically what I am looking for is a way to tell the validator to look up the value of a "$ref" from a property in the json object being validated.
Example json:
{
"name":"A directory",
"content":[
{
"fileType":"http://x.y.z/fs-schema.json#definitions/image",
"name":"an-image.png",
"width":1024,
"height":800
},
{
"fileType":"http://x.y.z/fs-schema.json#definitions/text",
"name":"readme.txt",
"lineCount":101
},
{
"fileType":"http://x.y.z/extended-fs-schema-video.json",
"name":"demo.mp4",
"hd":true
}
]
}
The "pseudo" schema follows. Note that the "image" and "text" definitions are included in the same schema here, but they might be defined elsewhere.
{
"id": "http://x.y.z/fs-schema.json",
"definitions": {
"file": {
"type": "object",
"properties": {
"name": { "type": "string" },
"fileType": {
"type": "string",
"format": "uri"
}
}
},
"image": {
"allOf": [
{ "$ref": "#definitions/file" },
{
"properties": {
"width": { "type": "integer" },
"height": { "type": "integer"}
}
}
]
},
"text": {
"allOf": [
{ "$ref": "#definitions/file" },
{ "properties": { "lineCount": { "type": "integer"}}}
]
}
},
"type": "object",
"properties": {
"name": { "type": "string"},
"content": {
"type": "array",
"items": {
"allOf": [
{ "$ref": "#definitions/file" },
{ "$refFromProperty": "fileType" } // the magic thing
]
}
}
}
}
The validation parts of JSON Schema alone cannot do this - it represents a fixed structure. What you want requires resolving/referencing schemas at validation-time.
However, you can express this using JSON Hyper-Schema, and a rel="describedby" link:
{
"title": "Directory entry",
"type": "object",
"properties": {
"fileType": {"type": "string", "format": "uri"}
},
"links": [{
"rel": "describedby",
"href": "{+fileType}"
}]
}
So here, it takes the value from "fileType" and uses it to calculate a link with relation "describedby" - which means "the schema at this location also describes the current data".
The problem is that most validators do not take any notice of any links (including "describedby" ones). You need to find a "hyper-validator" that does.
UPDATE: the tv4 library has added this as a feature
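If no link-aware validator is available, the same idea can be approximated in application code: read the schema URI from each entry, look it up, and validate the entry against what you find. The Python below is a sketch of that idea only, not how tv4 or any other library implements it. registry, validate_entry and check_types are hypothetical names, and check_types is a tiny stand-in for a real validator.

```python
def check_types(schema, instance):
    """Tiny stand-in for a validator: checks only `properties` + `type`."""
    python_types = {"string": str, "integer": int, "boolean": bool}
    for name, sub in schema.get("properties", {}).items():
        if name in instance and "type" in sub:
            value = instance[name]
            # bool is a subclass of int in Python, so exclude it for "integer".
            if sub["type"] == "integer" and isinstance(value, bool):
                return False
            if not isinstance(value, python_types[sub["type"]]):
                return False
    return True


def validate_entry(entry, registry):
    """Look up the schema named by the entry's own fileType, then apply it."""
    schema = registry.get(entry.get("fileType"))
    if schema is None:
        return False  # unknown or missing schema URI: treated as invalid here
    return check_types(schema, entry)


registry = {
    "http://x.y.z/fs-schema.json#definitions/image": {
        "properties": {
            "name": {"type": "string"},
            "width": {"type": "integer"},
            "height": {"type": "integer"},
        }
    },
    "http://x.y.z/fs-schema.json#definitions/text": {
        "properties": {"name": {"type": "string"}, "lineCount": {"type": "integer"}}
    },
}

image = {
    "fileType": "http://x.y.z/fs-schema.json#definitions/image",
    "name": "an-image.png",
    "width": 1024,
    "height": 800,
}
broken = dict(image, width="wide")

print(validate_entry(image, registry))   # True
print(validate_entry(broken, registry))  # False
```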
I think cloudfeet's answer is a valid solution. You could also use the same approach described here.
You would have a file object type which could be "anyOf" all the subtypes you want to define. You would use an enum in order to be able to reference and validate against each of the subtypes.
If the sub-types schemas are in the same Json-Schema file you don't need to reference the uri explicitly with the "$ref". A correct draft4 validator will find the enum value and will try to validate against that "subschema" in the Json-Schema tree.
In draft5 (in progress) a "switch" statement has been proposed, which would allow alternatives to be expressed in a more explicit way.

reusing an object for multiple JSON schemas

I have two separate JSON schemas (used to validate HTTP request endpoints for a REST API) where they both accept the same exact object, but have different required fields (this is a create vs update request). Is there a way I can reuse a single definition of this object and only change the required fields? I know how to use $ref for reusing an object as a property of another object, but I cannot figure out how to reuse an entire object as the top-level object in a schema. My failed attempt so far:
event.json
{
"id": "event",
"type": "object",
"properties": {
"name": {
"type": "string"
},
"start_date": {
"type": "integer"
},
"end_date": {
"type": "integer"
},
"description": {
"type": "string"
}
},
"additionalProperties": false
}
event-create.json
{
"id": "event-create",
"type": "object",
"$ref": "event",
"additionalProperties": false,
"required": [ "name", "description" ]
}
Obviously that doesn't work. It seems like it tries to insert the entirety of 'event' into the definition of 'event-create', including the ID and such. I tried referencing event#/properties to no avail. I can't seem to do a $ref as the sole value inside a properties property either. Any ideas?
Any members other than "$ref" in a JSON Reference object SHALL be ignored.
- https://datatracker.ietf.org/doc/html/draft-pbryan-zyp-json-ref-03#section-3
This is why your example doesn't work. Anything other than the $ref field is supposed to be ignored.
Support for $ref is limited to fields whose type is a JSON Schema. That is why trying to use it for properties doesn't work. properties is a plain object whose values are JSON Schemas.
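The effect of that rule can be sketched in a few lines of Python. resolve and registry are illustrative names, not any validator's real API; the point is only that, under the rule quoted above, the referenced schema replaces the whole object, so the sibling "required" never takes part in validation.

```python
def resolve(node, registry):
    """Resolve a schema node the way the JSON Reference draft prescribes:
    if "$ref" is present, the referenced schema replaces the whole object
    and every sibling keyword is dropped."""
    if "$ref" in node:
        return resolve(registry[node["$ref"]], registry)
    return node


registry = {
    "event": {
        "id": "event",
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "description": {"type": "string"},
        },
        "additionalProperties": False,
    }
}

event_create = {
    "id": "event-create",
    "type": "object",
    "$ref": "event",
    "additionalProperties": False,
    "required": ["name", "description"],
}

effective = resolve(event_create, registry)
print(effective["id"])          # event
print("required" in effective)  # False: the sibling "required" was ignored
```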
The best way to do this is with allOf. In this case allOf can sort-of be thought of as a list of mixin schemas.
{
"id": "event-create",
"type": "object",
"allOf": [{ "$ref": "event" }],
"required": ["name", "description"]
}
I found some syntax that seems to work, but I'm not terribly happy with it:
{
"id": "event-create",
"allOf": [
{ "$ref": "event" },
{ "required": [ "name", "description" ] }
]
}
Seems like an abuse of the allOf operator, particularly for another case where there are no required fields (thus only one element inside the allOf). But it works, so I'm going with it unless someone has a better idea.