How to handle schema errors in rapidjson? - jsonschema

How can I detect the following error situation:
A rapidjson::SchemaDocument is constructed from a rapidjson::Document, but the JSON contained in that Document is no proper schema; for example,
{ "type": "object", "properties": [1] }.
Currently, all I get is an access violation when I validate a document against this faulty schema.
Thanks
Hans

Related

JSON Schema validator compatibility

I'm trying to understand how a single JSON Schema behaves when used in different validators. Some validators define custom keywords. For example, the ajv-keywords package for the ajv validator defines a prohibited keyword that is not part of the JSON Schema standard. JSON Schema, on the other hand, defines the required keyword, which seems to be the polar opposite of prohibited. JSON Schema also defines a oneOf combinator, which validates that the input matches exactly one of several schema definitions.
Consider the following schema example. From reading the JSON Schema specification, I get the impression that the example schema should validate any JSON object when used in ajv. However, according to the unknown keyword rules, validators are supposed to ignore any keywords they do not support. So, I imagine that another validator would ignore the custom prohibited keyword, causing the schema to reject an input with property foo. Is this correct, or am I misreading the JSON Schema specification?
{
"oneOf": [
{
"type": "object",
"required": ["foo"]
},
{
"type": "object",
"prohibited": ["foo"]
}
]
}
You are correct. A standard JSON Schema validator will fail validation for an object that has a property "foo". You should be very careful using non-standard keywords if you expect your schemas to be used by standard validators.
It should be okay to use custom keywords as long as you follow the principle of progressive enhancement. Effectively, that means the behavior should degrade as gracefully as possible if the custom keyword is ignored. Your example violates this principle because you end up with a false negative result if prohibited is ignored.
A simple example that does follow progressive enhancement might look like this...
{
"type": "object",
"properties": {
"foo": {}
},
"required": ["foo"],
"prohibited": ["bar"]
}
If I run this through a standard validator, all assertions work as expected except prohibited, which is ignored. Assuming a client-server architecture, this allows clients to mostly validate their requests before sending them to the server. The server then does its own validation with a validator that understands the custom keywords and can respond with an error if "bar" is present.
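The difference between the two schemas can be sketched with a toy validator. This is a minimal illustration, not ajv or any real library: `is_valid` only understands `type: object`, `required` and `oneOf`, skips everything else, and the `prohibited` function is a hypothetical stand-in for the ajv-keywords behavior.

```python
def is_valid(schema, instance, custom_keywords=None):
    """Toy validator: knows `type: object`, `required`, `oneOf`; skips the rest."""
    custom_keywords = custom_keywords or {}
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
    if "oneOf" in schema:
        # Exactly one subschema must accept the instance.
        matches = sum(is_valid(sub, instance, custom_keywords) for sub in schema["oneOf"])
        if matches != 1:
            return False
    # Custom keywords are only enforced by a validator that registers them.
    for keyword, check in custom_keywords.items():
        if keyword in schema and not check(schema[keyword], instance):
            return False
    return True


def prohibited(names, instance):
    """Hypothetical stand-in for the non-standard `prohibited` keyword."""
    return not (isinstance(instance, dict) and any(n in instance for n in names))


fragile = {
    "oneOf": [
        {"type": "object", "required": ["foo"]},
        {"type": "object", "prohibited": ["foo"]},
    ]
}
graceful = {
    "type": "object",
    "properties": {"foo": {}},
    "required": ["foo"],
    "prohibited": ["bar"],
}
extended = {"prohibited": prohibited}

print(is_valid(fragile, {"foo": 1}))                       # False: both branches match
print(is_valid(fragile, {"foo": 1}, extended))             # True
print(is_valid(graceful, {"foo": 1, "bar": 2}))            # True: only looser
print(is_valid(graceful, {"foo": 1, "bar": 2}, extended))  # False
```

With the first schema, ignoring the custom keyword rejects input that the extended validator accepts. With the second, ignoring it only makes validation more permissive, which the server can still catch.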

JSON API relationship with meta information

I have an entity contract with relationship contract_contacts that should be presented in JSON API format.
To be more clear here's the structure of my entities:
Contract
id
name
ContractContact
contract_id
contact_id
type
comment
Contact
id
name
Possible JSON API output will look like:
{
"data": {
"type": "contracts",
"id": "1",
"attributes": {
"name": "Contract 1"
},
"relationships": {
"contacts": {
"data": [
{
"type": "contract_contacts",
"id": "1"
},
{
"type": "contract_contacts",
"id": "2"
}
]
}
}
}
}
This approach is not good enough: you have to create an additional resource for the relation, where you store the contact together with the comment and type. You have to include 2 levels deep to get your contact fields. Also, in this case, to create a contract the frontend has to work with both resources:
Create contract contact and get id
Then create contract with relationship with id from above
The second approach seems hacky to me because it uses meta, and it's up to you how to use it. Example:
{
"data": {
"type": "contracts",
"id": "1",
"attributes": {
"name": "Contract 1"
},
"relationships": {
"contacts": {
"data": [
{
"meta": {
"comment": "comment 1",
"type": 1
},
"type": "contacts",
"id": "10"
},
{
"meta": {
"comment": "comment 2",
"type": 2
},
"type": "contacts",
"id": "11"
}
]
}
}
}
}
This approach simplifies the mess of API requests from the previous example.
But is it correct to POST/PUT/PATCH with meta fields, given that they are not supposed to be changed by the client (or are they)? I'm confused about this part.
The relationship that you are describing is often referred to as a has-many-through relationship: a contract has many contacts through contract_contacts. It is defined as a relationship that links two resources through an intermediate resource.
The JSON:API specification does not provide first-class support for this kind of relationship. You should instead model it through separate resources, as you describe in your first option. This allows you to create, modify and delete your intermediate resource in the same way as any other resource. Doing so reduces complexity, because the intermediate resource is just another resource type like any other.
You mentioned two problems with doing so:
You have to include 2 levels deep to get your contact fields.
This is true but shouldn't be an issue. The include query parameter allows your client to sideload resources as many levels deep as it needs. The response document might be a little bigger than it would be if the information of the intermediate resource were stored on the relationship itself, but that shouldn't be relevant in production after gzip.
Also in this case to create contract frontend should work with both resources:
Create contract contact and get id
Then Create contract with relationship with id from above
This is true and a serious limitation of the current stable version of the JSON:API specification (v1.0). It's not directly related to has-many-through relationships, though. It's a general limitation of the specification, which does not support creating, modifying and/or deleting more than one resource with one request.
An official Atomic Operations extension is proposed for v1.1 of the specification to address that limitation. It's very likely that this one or a similar proposal will be included in the upcoming version.
It might be tempting to store the information of the intermediate model as meta data on the relationship. But doing so introduces serious limitations, which is why I would strongly recommend not taking that path:
The JSON:API specification does not cover changing meta data. You would need to introduce your own specification to create or update this meta data.
Client-side libraries for the JSON:API specification do not expect such information to be available as meta data of the relationship. It's very likely that the consumers will have a hard time processing the information.
Storing information of the intermediate resource on the relationship would lock you into using resource linkage to express relationship information in a resource document. You would not be able to use related resource links. This may introduce serious performance issues, because resource linkage always requires looking up the IDs of related resources in the database, which is not required when using related resource links.
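For the recommended first option, a client can walk a compound document from the contract through the intermediate resource to the contact. The sketch below assumes a request such as GET /contracts/1?include=contacts.contact, where the `contact` relationship on `contract_contacts` and the sample values in `included` are hypothetical and not taken from the question.

```python
def index_included(document):
    """Map (type, id) to the resource object for everything in `included`."""
    return {(res["type"], res["id"]): res for res in document.get("included", [])}


def contacts_of(document):
    """Yield (contact name, comment, type) for each linked contract_contacts."""
    included = index_included(document)
    for linkage in document["data"]["relationships"]["contacts"]["data"]:
        link = included[(linkage["type"], linkage["id"])]
        # `contact` is an assumed relationship name on the intermediate resource.
        ref = link["relationships"]["contact"]["data"]
        contact = included[(ref["type"], ref["id"])]
        attrs = link["attributes"]
        yield contact["attributes"]["name"], attrs["comment"], attrs["type"]


# Illustrative response document; the values are made up for this sketch.
document = {
    "data": {
        "type": "contracts",
        "id": "1",
        "attributes": {"name": "Contract 1"},
        "relationships": {
            "contacts": {"data": [{"type": "contract_contacts", "id": "1"}]}
        },
    },
    "included": [
        {
            "type": "contract_contacts",
            "id": "1",
            "attributes": {"type": 1, "comment": "comment 1"},
            "relationships": {"contact": {"data": {"type": "contacts", "id": "10"}}},
        },
        {"type": "contacts", "id": "10", "attributes": {"name": "Contact 10"}},
    ],
}

print(list(contacts_of(document)))  # [('Contact 10', 'comment 1', 1)]
```

The comment and type live as ordinary attributes of the intermediate resource, so they can be created and updated with the standard resource operations instead of a custom meta convention.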

value of key A equals to value of Key B in JSONschema

For {keyA: valueA}, {keyB: valueB}, is it possible to define in the schema that valueB must equal valueA? In other words, copying valueA down to valueB?
I understand it causes duplication. But two different keys must be used to meet different standards.
For example, I want to use name as sample name in the schema below.
Schema
{
"$id": "sampleSchema",
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"name": {
"type": "string"
},
"sample name":{
"type":"string"
}
}
}
The data will be like:
{
"name":"example1",
"sample name":"example1"
}
JSON Schema does not support operations like this.
We call this "data consistency validation" because it tests that data in one place is consistent with how it's defined in another location.
Supporting these types of operations would be very difficult. It would probably require a general purpose programming language to support most of the cases that people would like to see.
For more information, see Scope of JSON Schema Validation.
As an alternative, some validators allow you to implement custom keywords, or implement events or hooks when an instance is being validated against a schema with a particular ID. You can use this to implement the functionality you're looking for.
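If the validator in use offers no such hook, the consistency check can also run as a separate step in application code after schema validation. A minimal sketch, assuming the instance has already been validated against the schema; `mismatched_pairs` is a hypothetical helper, not part of any validator.

```python
def mismatched_pairs(instance, pairs):
    """Return the (key_a, key_b) pairs whose values differ in `instance`.

    Two absent keys count as consistent, because both lookups yield None.
    """
    return [(a, b) for a, b in pairs if instance.get(a) != instance.get(b)]


pairs = [("name", "sample name")]

good = {"name": "example1", "sample name": "example1"}
bad = {"name": "example1", "sample name": "example2"}

print(mismatched_pairs(good, pairs))  # []
print(mismatched_pairs(bad, pairs))   # [('name', 'sample name')]
```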

choosing between different objects in JSON-schema

I'm creating a schema for receipts and want to have a master schema for the core concepts with a variety of different detail objects for specialized receipt types (e.g. itemized hotel receipts, etc.) My current implementation is leveraging the oneOf mechanism in JSON-schema
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Receipt",
"type": "object",
"properties": {
...
"amount": { "type": "number" },
"detail": {
"type": "object",
"oneOf": [
{ "$ref": "general-detail.schema.json" },
{ "$ref": "hotel-detail.schema.json" },
...
]
}
}
}
The problem with this approach is that when I validate (using tv4), it appears that all of the schemas specified in oneOf are being checked, and are in fact, returning errors. I can minimize this effect by getting rid of the detail property, moving oneOf to the schema-level (e.g. outside of properties) and then creating root property names in each of the sub-schemas. However, even in that case, I get a "Missing required property: generalDetail" in the event that there's an error when I'm validating a hotel receipt type.
So 2 questions:
is it even possible to use a generic detail property like I'm currently doing and not have the validator completely validate each sub-schema in the oneOf structure (e.g. am I using oneOf wrongly)?
if it is not possible, I would be more than fine simply having a set of 'typed' detail properties (like 'generalDetail', 'hotelDetail', etc.) - but is there a way to specify that they are a group and that only one of them should exist in the document being validated?
TIA
It is usually better to use anyOf; you very rarely need oneOf. The latter will always validate all schemas, while the former will most likely exit at the first one that passes.
You may look at some other validators. tv4 has many deviations from the standard and also is very slow. https://github.com/ebdrup/json-schema-benchmark
All of the schemas in oneOf need to be validated in order for the validator to ensure that only one of the schemas pass. If none pass or more than one pass, the validator needs to tell you the validation results of each schema in order for you to determine how to fix the error.
So, just because the validator is telling you why each of the schemas are failing doesn't mean that it expects all of those schemas to pass.
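The difference in how many subschemas get evaluated can be sketched as follows. Each subschema is modeled as a plain predicate, and the two detail checks are hypothetical stand-ins for general-detail.schema.json and hotel-detail.schema.json; whether a real validator short-circuits anyOf is up to its implementation.

```python
def any_of(checks, instance):
    """Return (valid, evaluated); may stop at the first passing subschema."""
    evaluated = 0
    for check in checks:
        evaluated += 1
        if check(instance):
            return True, evaluated
    return False, evaluated


def one_of(checks, instance):
    """Return (valid, evaluated); every subschema must be evaluated to
    prove that exactly one of them passes."""
    results = [check(instance) for check in checks]
    return results.count(True) == 1, len(results)


# Hypothetical detail shapes, used only for this illustration.
def is_general_detail(detail):
    return "description" in detail


def is_hotel_detail(detail):
    return "nights" in detail


checks = [is_general_detail, is_hotel_detail]
detail = {"description": "taxi"}

print(any_of(checks, detail))  # (True, 1)
print(one_of(checks, detail))  # (True, 2)
```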

Null nested fields in Google BigQuery

I'm trying to upload a JSON file to BigQuery containing a nested field which is null, but it is not being accepted.
I tried a lot of different syntax but I always got the error:
File: 0 / Offset:0 / Line:1 / Column:410, missing required field(s)
I tried sending the value in many different forms, listed below, and even omitting it...
"quotas": []
"quotas": null
"quotas": "null"
etc...
The schema definition...
[..]
"name": "quotas",
"type": "record",
"mode": "repeated",
"fields":[
{
"name": "service",
"type": "string",
"mode": "nullable"
},
[..]
]
[..]
According to the logs of the import worker for that job, the line in question is missing a required field (the field name starts with "msi"). The line is otherwise well-formatted as far as I can tell.
I've filed a bug that BigQuery should give the name of the required field or fields that are missing to make this easier to debug in the future.
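Until the error message names the field, a local pre-check of each line against the schema file can narrow it down. The sketch below is not a BigQuery API; it assumes the schema is the usual JSON list of field definitions, treats a missing or null repeated field as empty, and uses a hypothetical required field named `required_example`.

```python
import json


def missing_required(schema_fields, row, prefix=""):
    """Return dotted paths of required fields that are absent or null in `row`."""
    missing = []
    for field in schema_fields:
        path = prefix + field["name"]
        mode = field.get("mode", "nullable").lower()
        value = row.get(field["name"])
        if value is None:
            if mode == "required":
                missing.append(path)
            continue
        if field.get("type", "").lower() == "record":
            # A repeated record is a list of rows; a single record is one row.
            children = value if mode == "repeated" else [value]
            for child in children:
                missing.extend(missing_required(field["fields"], child, path + "."))
    return missing


schema = [
    {"name": "required_example", "type": "string", "mode": "required"},
    {
        "name": "quotas",
        "type": "record",
        "mode": "repeated",
        "fields": [{"name": "service", "type": "string", "mode": "nullable"}],
    },
]

line = '{"quotas": []}'
print(missing_required(schema, json.loads(line)))  # ['required_example']
```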