Validate that a JSON array contains several unordered objects using JSON Schema

Problem
I want to use JSON Schema draft 7 to validate that an array contains several objects, in any order. For example, the array should contain students A and B, regardless of their order.
[{"name": "A"}, {"name": "B"}] //valid
[{"name": "B"}, {"name": "A"}] //valid
[{"name": "A"}, {"name": "C"}, {"name": "B"}] //extra students also valid
[] or [{"name": "A"}] or [{"name": "B"}] //invalid
Current Attempt
The contains keyword takes a single schema, not a list of schemas.
Tuple validation (items given as an array) is order-dependent.

You want the allOf applicator keyword, with one contains clause per required object.
allOf lets you define multiple schemas which must all pass.
{
"$schema": "http://json-schema.org/draft-07/schema#",
"allOf": [
{
"contains": {
"required": ["name"],
"properties": {
"name": {
"const": "A"
}
}
}
},
{
"contains": {
"required": ["name"],
"properties": {
"name": {
"const": "B"
}
}
}
}
]
}
Live demo here.
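To make the logic of that schema concrete, here is a minimal Python sketch of what allOf with one contains clause per name checks. It is a hand-written illustration, not a JSON Schema validator: the helper names has_student and contains_all are hypothetical, and it assumes every array item is an object (the schema itself does not say "type": "object", so non-object items are not modeled here).

```python
def has_student(items, name):
    """Mirror one 'contains' clause: at least one item has 'name' equal to `name`."""
    # Assumption: every item is a dict (a JSON object).
    return any(item.get("name") == name for item in items)


def contains_all(items, names):
    """Mirror 'allOf': every 'contains' clause must succeed, in any order."""
    return all(has_student(items, name) for name in names)


print(contains_all([{"name": "B"}, {"name": "A"}], ["A", "B"]))  # True
print(contains_all([{"name": "A"}], ["A", "B"]))                 # False
```

Order never matters because each clause scans the whole array independently, and extra students are ignored because nothing in the clauses restricts the other items.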

Related

Is it possible to be agnostic on the properties' names?

Let's say I want to have a schema for characters from a superhero comic. I want the schema to validate JSON objects like this one:
{
"Name": "Roberta",
"Age": 15,
"Abilities": {
"Super_Strength": {
"Cost": 10,
"Effect": "+5 to Strength"
}
}
}
My idea is to do it like that:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "characters_schema.json",
"title": "Characters",
"description": "One of the characters for my game",
"type": "object",
"properties": {
"Name": {
"type": "string"
},
"Age": {
"type": "integer"
},
"Abilities": {
"description": "what the character can do",
"type": "object"
}
},
"required": ["Name", "Age"]
}
And use a second schema for abilities:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "abilities_schema.json",
"title": "Abilities",
"type": "object",
"properties": {
"Cost": {
"description": "how much mana the ability costs",
"type": "integer"
},
"Effect": {
"type": "string"
}
}
}
But I can't figure out how to merge Abilities into Characters. I could easily tweak the schema so that it validates characters formatted like:
{
"Name": "Roberta",
"Age": 15,
"Abilities": [
{
"Name": "Super_Strength",
"Cost": 10,
"Effect": "+5 to Strength"
}
]
}
But as I need the name of the ability to be used as a key I don't know what to do.
You need to use the additionalProperties keyword.
The behavior of this keyword depends on the presence and annotation
results of "properties" and "patternProperties" within the same schema
object. Validation with "additionalProperties" applies only to the
child values of instance names that do not appear in the annotation
results of either "properties" or "patternProperties".
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.3
In layman's terms, if you don't define properties or patternProperties, the schema value of additionalProperties is applied to all values in the object at that instance location.
Often additionalProperties is only given a true or false value, but remember that it accepts any schema; booleans are simply valid schema values too.
If you have constraints on the keys for the object, you may wish to use patternProperties followed by additionalProperties: false.
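As a rough illustration of that behaviour, the Python sketch below mimics an "Abilities" object that declares no properties and whose additionalProperties is the abilities schema, so every value is checked regardless of its key. This is a hand-written approximation rather than a real validator; the helper names (is_integer, is_valid_ability, is_valid_abilities) are hypothetical, and the integer check is simplified.

```python
def is_integer(value):
    """Simplified stand-in for 'type': 'integer' (bool is excluded on purpose)."""
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_ability(value):
    """Mirror abilities_schema.json: an object with optional integer Cost and string Effect."""
    if not isinstance(value, dict):
        return False
    if "Cost" in value and not is_integer(value["Cost"]):
        return False
    if "Effect" in value and not isinstance(value["Effect"], str):
        return False
    return True


def is_valid_abilities(abilities):
    """Mirror additionalProperties with a schema value: every value must be a valid ability."""
    return isinstance(abilities, dict) and all(
        is_valid_ability(value) for value in abilities.values()
    )


roberta = {"Super_Strength": {"Cost": 10, "Effect": "+5 to Strength"}}
print(is_valid_abilities(roberta))                      # True
print(is_valid_abilities({"Flight": {"Cost": "ten"}}))  # False
```

The ability name never appears in the checks, which is the point: the key can be anything, and only the value has to satisfy the ability schema.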

JSON schema AND/OR for array enum

I'd like to specify JSON schema to restrict combinations of array values.
For example, I have an array whose values can be "apple", "orange" or "banana", but "apple" and "orange" must never appear together.
i.e. these are all valid:
[]
["apple"]
["orange"]
["banana"]
["apple", "banana"]
["orange", "banana"]
but these are NOT valid:
["apple","orange"]
["apple","orange","banana"]
I've got as far as an enum array, but I'm not sure whether I can specify an OR operator somehow:
"options":
{
"type": "array",
"items": {
"type": "string",
"enum": [
"apple",
"orange",
"banana"
]
}
}
p.s. ["apple","apple"] would also be invalid, but perhaps that's another story.
You need to use a combination of not, allOf, and contains.
not inverts the validation result.
allOf requires that all of the subschemas are valid.
contains requires that the array contains an item that is valid according to the subschema value.
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "array",
"items": {
"type": "string",
"enum": ["apple","orange","banana"]
},
"not": {
"allOf": [
{
"contains": {
"const": "apple"
}
},
{
"contains": {
"const": "orange"
}
}
]
}
}
Live demo: https://jsonschema.dev/s/C835R
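The combination of not, allOf, and contains can be read as "reject the array only when both contains clauses succeed". The Python sketch below spells that out by hand; it is an illustration of the logic, not a JSON Schema validator, and the names ALLOWED and is_valid_options are hypothetical.

```python
ALLOWED = {"apple", "orange", "banana"}


def is_valid_options(options):
    """Mirror the schema: enum-restricted items, and not (contains apple AND contains orange)."""
    # "items" with "enum": every element must be one of the allowed strings.
    if not all(isinstance(option, str) and option in ALLOWED for option in options):
        return False
    contains_apple = "apple" in options
    contains_orange = "orange" in options
    # "not" wrapped around "allOf": fail only when both clauses hold.
    return not (contains_apple and contains_orange)


print(is_valid_options(["apple", "banana"]))            # True
print(is_valid_options(["apple", "orange", "banana"]))  # False
```

Like the schema, this sketch accepts ["apple", "apple"]; rejecting duplicates is a separate concern that the uniqueItems keyword is usually used for.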

JSON Schema to represent a name and value with value constrained by name

I have the following JSON snippets which are all valid
"units": { "name": "EU", "value": "Grams" }
"units": { "name": "EU", "value": "Kilograms" }
"units": { "name": "US", "value": "Ounces" }
"units": { "name": "US", "value": "Pounds" }
The name can be EU or US, and the set of valid value entries should depend on the name.
It's easy to use JSON Schema enums for both these properties, but can I enforce the additional constraint using JSON Schema?
I would consider changing the overall schema so that there is a parent child relationship between a name object and value object, but ideally this would be avoided.
I managed to crack it using https://www.jsonschemavalidator.net/ to work through an example. The following schema provides the solution:
"units": {
  "type": "object",
  "oneOf": [
    {
      "properties": {
        "name": { "enum": ["EU"] },
        "value": { "enum": ["Grams", "Kilograms"] }
      }
    },
    {
      "properties": {
        "name": { "enum": ["US"] },
        "value": { "enum": ["Ounces", "Pounds"] }
      }
    }
  ]
}
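To see how oneOf ties the value to the name, the following Python sketch counts how many branches an object satisfies and accepts it only when exactly one does. It is a hand-written approximation of the schema's logic, not a validator, and the names BRANCHES, matches_branch and is_valid_units are hypothetical.

```python
BRANCHES = [
    {"name": ["EU"], "value": ["Grams", "Kilograms"]},
    {"name": ["US"], "value": ["Ounces", "Pounds"]},
]


def matches_branch(units, branch):
    """Mirror one 'properties' branch: only keys that are present get checked."""
    return all(
        units[key] in allowed for key, allowed in branch.items() if key in units
    )


def is_valid_units(units):
    """Mirror 'type': 'object' plus 'oneOf': exactly one branch may match."""
    if not isinstance(units, dict):
        return False
    return sum(matches_branch(units, branch) for branch in BRANCHES) == 1


print(is_valid_units({"name": "EU", "value": "Grams"}))   # True
print(is_valid_units({"name": "EU", "value": "Ounces"}))  # False
print(is_valid_units({"name": "EU"}))                     # True
```

The last call shows a side effect worth knowing about: because neither branch lists required properties, an object with only a name still passes. Adding "required": ["name", "value"] to each branch would close that gap, if that is the behaviour you want.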

jsonschema ref as direct parent schema fields

I have two json schemas:
//person schema
{
"id": "/person",
"type": "object",
"properties": {
"name": {"type": "string"},
"baseFields": {"$ref": "/baseFields"}
},
"additionalProperties": false
}
//baseFields schema
{
"id": "/baseFields",
"type": "object",
"properties": {
"age": {"type": "string"},
"hobby": {"type": "string"}
},
"additionalProperties": false
}
The object below passes 'person schema' validation:
{
"name":"person1",
"baseFields":{
"age":"33",
"hobby":"diving"
}
}
What I need is for the object below to pass 'person schema' validation:
{
"name":"person1",
"age":"33",
"hobby":"diving"
}
I need this because I have a few fields that are shared by several different schemas.
Thank you
What you are trying to do is inheritance, but JSON Schema has no inheritance.
You could use the "allOf" keyword to combine the schemas, but it has some gotchas. See here for an example and more info (check the 3rd example, which has the "street_address", "city" and "state" fields).
Also check this answer on Stack Overflow.
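One well-known gotcha can be sketched in a few lines of Python. The sketch assumes the person schema is reduced to just "name" and combined with the baseFields schema through allOf, while both keep "additionalProperties": false. It is a hand-written illustration, not a validator, and the names PERSON_KEYS, BASE_KEYS, passes_closed_schema and passes_all_of are hypothetical.

```python
PERSON_KEYS = {"name"}
BASE_KEYS = {"age", "hobby"}


def passes_closed_schema(obj, allowed_keys):
    """Mirror a schema with string properties and 'additionalProperties': false."""
    return all(
        key in allowed_keys and isinstance(value, str) for key, value in obj.items()
    )


def passes_all_of(obj):
    """Mirror 'allOf': the object must satisfy both closed schemas at once."""
    return passes_closed_schema(obj, PERSON_KEYS) and passes_closed_schema(obj, BASE_KEYS)


flat = {"name": "person1", "age": "33", "hobby": "diving"}
print(passes_all_of(flat))                                  # False
print(passes_closed_schema(flat, PERSON_KEYS | BASE_KEYS))  # True
```

Each closed schema treats the other schema's fields as forbidden extras, so the flat object fails both. Under this assumption, the flat object only passes if the subschemas drop "additionalProperties": false, or if the full set of allowed keys is declared in one place.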

Schema to load JSON to Google BigQuery

Suppose I have the following JSON, which is the result of parsing URL parameters from a log file.
{
"title": "History of Alphabet",
"author": [
{
"name": "Larry"
}
]
}
{
"title": "History of ABC"
}
{
"number_pages": "321",
"year": "1999"
}
{
"title": "History of XYZ",
"author": [
{
"name": "Steve",
"age": "63"
},
{
"nickname": "Bill",
"dob": "1955-03-29"
}
]
}
All the top-level fields ("title", "author", "number_pages", "year") are optional, and so are the second-level fields, for example those inside "author".
How should I make a schema for this JSON when loading it to BQ?
A related question:
For example, suppose there is another similar table, but the data is from a different date, so it may have a different schema. Is it possible to query across these 2 tables?
How should I make a schema for this JSON when loading it to BQ?
The following schema should work. You may want to change some of the types (e.g. maybe you want the dob field to be a TIMESTAMP instead of a STRING), but the general structure should be similar. Since fields are NULLABLE by default, all of these fields should handle not being present for a given row.
[
{
"name": "title",
"type": "STRING"
},
{
"name": "author",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "name",
"type": "STRING"
},
{
"name": "age",
"type": "STRING"
},
{
"name": "nickname",
"type": "STRING"
},
{
"name": "dob",
"type": "STRING"
}
]
},
{
"name": "number_pages",
"type": "INTEGER"
},
{
"name": "year",
"type": "INTEGER"
}
]
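Before loading, it can help to confirm that no row uses a field the schema does not declare, since missing fields are fine but undeclared ones are not covered by the schema. The Python sketch below is a hypothetical pre-load sanity check, not part of any BigQuery API; SCHEMA_FIELDS is a hand-copied summary of the field names above, and unknown_fields is an illustrative helper.

```python
# Field names copied from the schema above; a nested dict marks a RECORD.
SCHEMA_FIELDS = {
    "title": None,
    "author": {"name": None, "age": None, "nickname": None, "dob": None},
    "number_pages": None,
    "year": None,
}


def unknown_fields(row, fields=SCHEMA_FIELDS, prefix=""):
    """Return dotted paths of keys in `row` that the schema does not declare."""
    unknown = []
    for key, value in row.items():
        if key not in fields:
            unknown.append(prefix + key)
        elif fields[key] is not None:
            # A RECORD value may be a single object or a list of objects.
            records = value if isinstance(value, list) else [value]
            for record in records:
                if isinstance(record, dict):
                    unknown.extend(unknown_fields(record, fields[key], prefix + key + "."))
    return unknown


print(unknown_fields({"title": "History of ABC"}))         # []
print(unknown_fields({"author": [{"email": "x@y.z"}]}))    # ['author.email']
```

Rows that omit fields produce an empty list, which matches the point made above: absent fields are acceptable because they simply load as NULL.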
A related question: For example, suppose there is another similar table, but the data is from a different date, so it may have a different schema. Is it possible to query across these 2 tables?
It should be possible to union two tables with differing schemas without too much difficulty.
Here's a quick example of how it works over public data, written in BigQuery's legacy SQL dialect, where a comma between tables means UNION ALL (kind of a silly example, since the tables contain zero fields in common, but it shows the concept):
SELECT * FROM
(SELECT * FROM publicdata:samples.natality),
(SELECT * FROM publicdata:samples.shakespeare)
LIMIT 100;
Note that you need the SELECT * around each table or the query will complain about the differing schemas.