Different minimum and maximum for integer and number in json schema - jsonschema

Let's say I have a schema where a property can be either an integer or a number. Is there a way to specify a different maximum depending on the type of the value? For example:
{
"type": ["integer", "number"],
"max-if-integer": 255,
"max-if-number": 1.0
}
I couldn't find anything about it in the docs.

If you are on draft 7 or later, if/then/else is probably the best way to express this. On an earlier draft, you can get there with oneOf, as discussed in gregsdennis's answer.
{
"if": {"type": "integer"},
"then": {
"minimum": integer-minimum,
"maximum": integer-maximum
},
"else": {
"type": "number",
"minimum": float-minimum,
"maximum": float-maximum
}
}
Do note, however, that 1.0 and 1 are both considered integers according to the JSON Schema spec, so both go through the integer then branch, not the number else branch that non-whole-number floats take.
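To make the branching concrete, here is a minimal Python sketch of what the if/then/else above does. It is a hand-written stand-in, not a JSON Schema validator: the function names are made up for illustration, the two limits are the 255 and 1.0 from the question, and minimums are left out.

```python
# Hand-written stand-in for the if/then/else schema; not a real validator.
INTEGER_MAXIMUM = 255  # the question's "max-if-integer"
NUMBER_MAXIMUM = 1.0   # the question's "max-if-number"


def is_json_integer(value):
    """True for any number with a zero fractional part, so 1 and 1.0 both count."""
    if isinstance(value, bool):  # bool subclasses int in Python, but JSON booleans are not numbers
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


def is_valid(value):
    if is_json_integer(value):           # "if": {"type": "integer"}
        return value <= INTEGER_MAXIMUM  # "then"
    # "else": must be a (non-whole) number within the float limit
    return isinstance(value, float) and value <= NUMBER_MAXIMUM
```

With this model, 0.5 is accepted and 1.5 is rejected, but 200.0 is accepted because it is treated as an integer and checked against 255.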

oneOf is going to be your friend!
{
"oneOf": [
{ "type": "integer", "maximum": 255 },
{ "type": "number", "maximum": 1 }
]
}
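One thing worth checking with this schema: every integer is also a number, so an integer of 1 or less satisfies both subschemas, while oneOf requires exactly one match. The sketch below is a hand-written Python stand-in (hypothetical function names, no validator library) that counts how many branches a value matches. Swapping oneOf for anyOf is one possible way around it; that is my assumption, not something stated in the answer.

```python
def is_json_number(value):
    # bool subclasses int in Python, but JSON booleans are not numbers
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_json_integer(value):
    # whole-valued floats such as 1.0 count as integers
    return is_json_number(value) and float(value).is_integer()


def matching_branches(value):
    branches = [
        is_json_integer(value) and value <= 255,  # {"type": "integer", "maximum": 255}
        is_json_number(value) and value <= 1,     # {"type": "number", "maximum": 1}
    ]
    return sum(branches)


def one_of(value):
    return matching_branches(value) == 1  # exactly one branch must match


def any_of(value):
    return matching_branches(value) >= 1  # at least one branch must match
```

Here 200 and 0.5 each match one branch, but 0 and 1 match both, so one_of rejects them while any_of accepts them.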

Related

Put validation of two array fields in JSON Schema using oneOf

Can I put a check on two fields in a JSON schema? Both fields are arrays of objects. Conditions:
Only one of them can contain values at a time (i.e. the other must be empty).
Both can be empty.
Any leads?
// The schema
var schema = {
"id": "https://kitoutapi.lrsdedicated.com/v1/json_schemas/login-request#",
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Login request schema",
"type": "object",
"oneOf": [
{ "categories": {
"maxItems": 0
},
"positionedOffers": {
"minItems": 1
}},
{ "categories": {
"minItems": 1
},
"positionedOffers": {
"maxItems": 0
}}
],
"properties": {
"categories": {
"type": "array"
},
"positionedOffers": {
"type": "array"
}
},
"additionalProperties": false
};
// Test data 1
// This test should return a good result
var data1 = {
"positionedOffers":['hello'],
"categories":[],
}
For your requirement, I think I'd come at this from the other direction. Rather than saying
If one contains a value, the other must be empty, but both may be empty.
I'd say
At least one must be empty.
That leads you to use an anyOf with subschemas checking that each property is an empty array.
{
"id": "https://kitoutapi.lrsdedicated.com/v1/json_schemas/login-request#",
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Login request schema",
"type": "object",
"anyOf": [
{
"properties": {
"categories": {
"maxItems": 0
}
}
},
{
"properties": {
"positionedOffers": {
"maxItems": 0
}
}
}
],
"properties": {
"categories": {
"type": "array"
},
"positionedOffers": {
"type": "array"
}
},
"additionalProperties": false
}
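As a rough illustration of the "at least one must be empty" rule that the anyOf expresses, here is a small Python sketch. It only models the anyOf part and assumes both values are lists when present; the type and additionalProperties checks are out of scope, and the function name is made up.

```python
def at_least_one_empty(doc):
    """Mimics the anyOf: one of the two arrays must have no items.

    A missing key counts as empty, because `properties` only constrains
    keys that are actually present in the document.
    """
    categories_empty = len(doc.get("categories", [])) == 0
    offers_empty = len(doc.get("positionedOffers", [])) == 0
    return categories_empty or offers_empty
```

The test data from the question, with one offer and no categories, passes this check, and so does a document where both arrays are empty.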
Bonus Material
In your original post, you omitted the properties keywords under the oneOf. This may have been the cause of the schema's failure to validate. I've added it in the above.
Secondly, draft 4 is quite old at this point. You may be limited by the implementation you're using, but if you can, you should consider using a more recent version of JSON Schema. You can view available implementations and what versions they support on the JSON Schema implementations page.

Validate phone only when provided using Json Schema

I'm using the following JSON schema to validate a phone number when one is provided.
Accepted validation:
Min length 10
Max length 20
Pattern
If phone is null or empty, no validation is required.
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"Item": {
"type": "object",
"properties": {
"Phone": {
"anyOf": [
{
"type": "integer",
"minLength": 10,
"maxLength": 20,
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
},
{
"type": [ "integer", "null" ]
}
]
}
}
}
}
}
Can you please suggest what is missing in the above schema?
Thank you!
Remove integer from the null case. It's allowing integers through, which overrides the phone number case.
Secondarily, if possible, you may want to use a later draft for your schema. Draft 4 is quite old. Check with your validator to see if it supports a newer draft.
There are errors in your schema, but you're missing the understanding about how JSON Schema works in terms of applicability.
JSON Schema has many keywords that are only applicable to a specific type. When the value is not of the type a keyword applies to, that keyword has no effect.
The subschema for "phone" can be simplified as the following:
{
"type": ["string", "null"],
"minLength": 10,
"maxLength": 20,
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}
The keywords minLength, maxLength, and pattern are only applicable to strings. If the value is not a string (and is null), those keywords are not applicable, and so are ignored.
(I've not checked your regex here, just copied what you had already.)
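The applicability rule can be sketched in plain Python. This is a hand-written stand-in for the simplified subschema, not a validator; the function name is made up, and Python's re module is assumed to be close enough to the schema's regex dialect for this pattern.

```python
import re

# Same pattern as in the schema, copied unchanged.
PHONE_PATTERN = re.compile(r"^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$")


def phone_is_valid(value):
    if value is None:
        return True   # null is an allowed type; the string keywords do not apply to it
    if not isinstance(value, str):
        return False  # "type": ["string", "null"] rejects integers and everything else
    # minLength, maxLength and pattern only come into play for strings
    return 10 <= len(value) <= 20 and PHONE_PATTERN.search(value) is not None
```

Trying a few values shows two things about the schema as written: "555-1234" matches the pattern but is only 8 characters long, so minLength rejects it, and an empty string is rejected as well.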

How to use anyOf on different properties type?

In the schema below, I need items_list, price and variance as required keys. The condition is that price and variance may each be null, but not both at once.
Though I'm able to achieve it, I'd like to know if there's a shorter way to do this. Also, I'm not sure where exactly to put the required and additionalProperties keys.
Any help is greatly appreciated.
{
"type": "object",
"properties": {
"items_list": {
"type": "array",
"items": {
"type": "string"
}
},
},
"anyOf": [
{
"properties": {
"price": {
"type": "number",
"minimum": 0,
},
"variance": {
"type": [
"number",
"null"
],
"minimum": 0,
},
},
},
{
"properties": {
"price": {
"type": [
"number",
"null"
],
"minimum": 0,
},
"variance": {
"type": "number",
"minimum": 0,
},
},
},
],
# "required": [
# "items_list",
# "price",
# "variance",
# ],
# "additionalProperties": False,
}
To answer the question, "can it be shorter?", the answer is, yes. The general rule of thumb is to never define anything in the boolean logic keywords. Use the boolean logic keywords only to add compound constraints. I use the term "compound constraint" to mean a constraint that is based on more than one value in a schema. In this case, the compound constraint is that price and variance can't both be null.
{
"type": "object",
"properties": {
"items_list": {
"type": "array",
"items": { "type": "string" }
},
"price": { "type": ["number", "null"], "minimum": 0 },
"variance": { "type": ["number", "null" ], "minimum": 0 }
},
"required": ["items_list", "price", "variance"],
"additionalProperties": false,
"allOf": [{ "$ref": "#/definitions/both-price-and-variance-cannot-be-null" }],
"definitions": {
"both-price-and-variance-cannot-be-null": {
"not": {
"properties": {
"price": { "type": "null" },
"variance": { "type": "null" }
},
"required": ["price", "variance"]
}
}
}
}
Not only do you not have to jump through hoops to get additionalProperties working properly, it's also easier to read. It even matches your description of the problem, "price and variance may or may not be null" (properties) but "both cannot be null" (not (compound constraint)). You could make this even shorter by inlining the definition, but I included it to show how expressive this technique can be while still being shorter than the original schema.
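Here is a tiny Python sketch of the logic behind that not. It models only the compound constraint, not the rest of the schema, and the function names are made up for illustration.

```python
def both_null(doc):
    """The schema inside "not": both keys are present and both are null."""
    return (
        "price" in doc
        and "variance" in doc
        and doc["price"] is None
        and doc["variance"] is None
    )


def passes_compound_constraint(doc):
    # "not" inverts the result of the inner schema
    return not both_null(doc)
```

A document with a numeric price and a null variance passes, and only the case where both are null is rejected.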
Looks like you have this mostly right. That's the right place to put required.
When using additionalProperties: false, you also need to define the properties at the top level, because additionalProperties cannot "see through" the *Of keywords (applicators).
You can declare each one as properties: { [prop]: true }, but you must list all of the properties.
You need to do this because additionalProperties only knows about properties within the same schema object at the same level.
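To illustrate the "same schema object, same level" point, here is a rough Python model of which keys additionalProperties: false would reject. It is a simplification I wrote for this example: it ignores patternProperties, and the function name is made up.

```python
def unexpected_keys(schema, doc):
    """Keys in doc that are not declared in the schema's own sibling "properties".

    Anything declared only inside anyOf/oneOf/allOf is invisible here.
    """
    declared = set(schema.get("properties", {}))
    return sorted(set(doc) - declared)


# Trimmed-down shape of the schema from the question
question_schema = {
    "properties": {"items_list": {}},
    "anyOf": [{"properties": {"price": {}, "variance": {}}}],
    "additionalProperties": False,
}

# Same schema with price and variance also declared at the top level
fixed_schema = {
    "properties": {"items_list": {}, "price": {}, "variance": {}},
    "additionalProperties": False,
}

doc = {"items_list": [], "price": 1, "variance": None}
```

With the question's layout, price and variance count as additional properties even though the anyOf mentions them. Once they are declared at the top level, nothing is left over.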

JSON schema: Use field value as required field name

I need to define a JSON schema for a JSON document in which one key is named after the value of a previous field. Examples:
{
"key1": "SOME_VALUE",
"SOME_VALUE": "..."
}
{
"key1": "ANOTHER_VALUE",
"ANOTHER_VALUE": "..."
}
Moreover, the second field should be among the required ones.
I have been looking around but I am not sure JSON schema offers such feature. Maybe some advanced semantics check?
Thanks for the help
The only way you could do this is if you knew the values in advance, but it looks like this is not possible for you. This would need to be in your business logic validation as opposed to your format validation.
So, thanks to Relequestual's suggestions, I managed to get to a solution.
Constraint: possible values of "key1" need to be finite and known in advance
Suppose we need a JSON schema for validating a JSON that:
Requires the string properties "required_simple_property1" and "required_simple_property2".
Requires the property "key1" as an enum with 3 possible values ["value1", "value2", "value3"].
Requires a third property, whose key must be the value taken by key1.
This can be accomplished with a schema like:
"oneOf": [
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value1"
],
"properties": {
"key1": {
"type": "string",
"const": "value1"
}
}
},
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value2"
],
"properties": {
"key1": {
"type": "string",
"const": "value2"
}
}
},
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value3"
],
"properties": {
"key1": {
"type": "string",
"const": "value3"
}
}
}
],
"properties": {
"required_simple_property1": {
"type": "string"
},
"required_simple_property2": {
"type": "string"
},
"value1": {
... (anything)
},
"value2": {
... (anything)
},
"value3": {
... (anything)
},
}
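Since the three branches differ only in the value name, the schema can also be generated instead of written by hand. Below is a minimal Python sketch; the build_schema helper is hypothetical, and the empty schema {} stands in for the "... (anything)" placeholders.

```python
def build_schema(simple_props, key_values):
    """Build the oneOf schema above from the list of allowed key1 values."""
    branches = [
        {
            # each branch requires the simple properties plus the key named by key1
            "required": [*simple_props, value],
            "properties": {"key1": {"type": "string", "const": value}},
        }
        for value in key_values
    ]
    properties = {name: {"type": "string"} for name in simple_props}
    properties.update({value: {} for value in key_values})  # {} accepts anything
    return {"oneOf": branches, "properties": properties}


schema = build_schema(
    ["required_simple_property1", "required_simple_property2"],
    ["value1", "value2", "value3"],
)
```

As in the hand-written version, key1 itself is not listed under required; it would have to be added to each branch if it must always be present.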

Specific analyzers for sub-documents in lucene / elasticsearch

After reading the documentation, testing and reading a lot of other questions here on stackoverflow:
We have documents that have titles and description in multiple languages. There are also tags that are translated to the same languages. There might be up to 30-40 different languages in the system, but probably only 3 or 4 translations for a single document.
This is the planned document structure:
{
"luck": {
"id": 10018,
"pub": 0,
"pr": 100002,
"loc": {
"lat": 42.7,
"lon": 84.2
},
"t": [
{
"lang": "en-analyzer",
"title": "Forest",
"desc": "A lot of trees.",
"tags": [
"Wood",
"Nature",
"Green Mouvement"
]
},
{
"lang": "fr-analyzer",
"title": "Forêt",
"desc": "A grand nombre d'arbre.",
"tags": [
"Bois",
"Nature",
"Mouvement Vert"
]
}
],
"dates": [
"2014-01-01T20:00",
"2014-06-06T20:00",
"2014-08-08T20:00"
]
}
}
Possible queries are "arbre" or "wood" or "forest" or "nature" combined with a date and a geo_distance filter, furthermore there will be some facets over the tags array (that obviously include counting).
We can produce any document structure that fits best for elasticsearch (or for lucene). It's crucial that each language is analyzed specifically, so we use "_analyzer" in order to distinguish the languages.
{
"luck": {
"properties": {
"id": {
"type": "long"
},
"pub": {
"type": "long"
},
"pr": {
"type": "long"
},
"loc": {
"type": "geo_point"
},
"t": {
"_analyzer": {
"path": "t.lang"
},
"properties": {
"lang": {
"type": "string"
},
"properties": {
"title": {
"type": "string"
},
"desc": {
"type": "string"
},
"tags": {
"type": "string"
}
}
}
}
}
}
A) Apparently, this idea does not work: after PUTting the mapping, we retrieve the same mapping ("GET") and it seems to ignore the specific analyzers (a test with a top-level "_analyzer" worked fine). Does "_analyzer" work for sub-documents, and if yes, how should we refer to it? We also tested declaring the sub-document as "object" or "nested". How is multi-language document indexing supposed to work?
B) One possibility would be to put each language in its own document: In that case how do we manage the id? Finally both documents should refer to the same id. For example if the user searches for "nature" (and we don't know if the user intends to find "nature" in English or French), this document would appear twice in the result set, and the counting and paging would be very wrong (also facet counting).
Any ideas?