What is the difference between properties and patternProperties in JSON Schema?

For the following JSON document:
{
    "abc": 123,
    "def": 345
}
The following schema considers it valid:
{
    "$schema": "http://json-schema.org/draft-03/schema#",
    "title": "My Schema",
    "description": "Blah",
    "type": "object",
    "patternProperties": {
        ".+": {
            "type": "number"
        }
    }
}
However, changing patternProperties to properties still makes it valid. What, then, is the difference between these two keywords?

For the schema above, every property value must be a number. This data is invalid:
{ "a": "a" }
If you replace patternProperties with properties, only a property literally named ".+" must be a number; all other properties can be anything. This would be invalid:
{ ".+": "a" }
This would be valid:
{ "a": "a" }

The properties (key-value pairs) on an object are defined using the properties keyword. The value of properties is an object, where each key is the name of a property and each value is a JSON schema used to validate that property.
additionalProperties can restrict the object so that it either has no additional properties that weren’t explicitly listed, or it can specify a schema for any additional properties on the object. Sometimes that isn’t enough, and you may want to restrict the names of the extra properties, or you may want to say that, given a particular kind of name, the value should match a particular schema. That’s where patternProperties comes in: it is a new keyword that maps from regular expressions to schemas. If an additional property matches a given regular expression, it must also validate against the corresponding schema.
Note: When defining the regular expressions, it’s important to note that the expression may match anywhere within the property name. For example, the regular expression "p" will match any property name with a p in it, such as "apple", not just a property whose name is simply "p". It’s therefore usually less confusing to surround the regular expression in ^...$, for example, "^p$".
For further reference: http://spacetelescope.github.io/understanding-json-schema/reference/object.html
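The match-anywhere behavior is easy to verify with Python's re module, whose search-style semantics mirror how patternProperties patterns are applied to key names (a minimal illustration, not a JSON Schema validator):

```python
import re

# patternProperties patterns match anywhere in the key name (like re.search),
# not against the whole key.
print(bool(re.search("p", "apple")))    # True  -> key "apple" would be constrained
print(bool(re.search("^p$", "apple")))  # False -> only a key named exactly "p" matches
print(bool(re.search("^p$", "p")))      # True
```

Anchoring with ^...$ is therefore the safest default when you mean "the whole key".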

Semantics of properties:
If you declare a property whose key is listed in properties, its value must satisfy the schema declared for that key in properties.
Semantics of patternProperties:
If you declare a property whose key matches a regex defined in patternProperties, its value must satisfy the corresponding schema in patternProperties.
The two are not mutually exclusive: if a key is both listed in properties and matches a pattern in patternProperties, the value must satisfy both schemas. additionalProperties then applies only to keys matched by neither.

A JSON object is composed of key/value pairs. In a schema, the key corresponds to a property name, and for the value part we define its data type and other constraints.
Therefore the following schema
{
    "type": "object",
    "properties": {
        "a": {
            "type": "number"
        }
    }
}
requires that if the key "a" is present, its value is a number: {"a": 1} validates, while {"a": "x"} does not. (Note that {"b": 1} still validates, because properties alone neither requires "a" nor forbids other keys; add "required" and/or "additionalProperties": false for that.)
Meanwhile, patternProperties lets you define properties using a regex, so you don't need to list every property name one after another. A typical use case is when you don't know the key names in advance but you do know that they all match a certain pattern.
Hence your schema can validate {"a": 1} as well as {"b": 1}.
patternProperties plays a role similar to additionalProperties, but gives you finer control over which keys a subschema applies to.
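To make the contrast concrete, here is a minimal hand-rolled sketch in Python. It is not a real JSON Schema validator; it only applies the "number" type check for properties (exact key match) and patternProperties (regex match), which is enough to reproduce the examples above:

```python
import re

def validate_number_props(instance, schema):
    """Toy check: apply "type": "number" subschemas from properties
    (exact key match) and patternProperties (regex search on the key)."""
    for key, value in instance.items():
        for name, sub in schema.get("properties", {}).items():
            if key == name and sub.get("type") == "number":
                if not isinstance(value, (int, float)):
                    return False
        for pattern, sub in schema.get("patternProperties", {}).items():
            if re.search(pattern, key) and sub.get("type") == "number":
                if not isinstance(value, (int, float)):
                    return False
    return True

pattern_schema = {"patternProperties": {".+": {"type": "number"}}}
props_schema = {"properties": {".+": {"type": "number"}}}

print(validate_number_props({"abc": 123, "def": 345}, pattern_schema))  # True
print(validate_number_props({"a": "a"}, pattern_schema))                # False
print(validate_number_props({"a": "a"}, props_schema))                  # True: only the literal key ".+" is constrained
```

With patternProperties the regex ".+" constrains every key; with properties the string ".+" is just an ordinary property name.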

Related

How can I read value in square brackets of appsettings.json

I have appsettings.json with code:
"Serilog": {
"WriteTo": [
{
"Name": "RollingFile",
"Args": {
"pathFormat": "/home/www-data/aissubject/storage/logs/log-{Date}.txt"
}
}
]
}
How can I read value of "pathFormat" key?
What you're referring to is a JSON array. How you access that varies depending on what you're doing, but I'm assuming that since you're asking this, you're trying to get it directly out of IConfiguration, rather than using the options pattern (as you likely should be).
IConfiguration is basically a dictionary. In order to create the keys of that dictionary from something like JSON, the JSON is "flattened" using certain conventions. Each level will be separated by a colon. Arrays will be flattened by adding a colon-delimited component containing the index. In other words, to get at pathFormat in this particular example, you'd need:
Configuration["Serilog:WriteTo:0:Args:pathFormat"]
Where the 0 portion denotes that you're getting the first item in the array. Again, it's much better and more appropriate to use the options pattern to map the configuration values onto an actual object, which would let you actually access this as an array rather than a magic string like this.
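The flattening convention itself is easy to sketch in Python. This is an illustration of the convention (nesting and array indices become colon-separated key segments), not the actual .NET implementation:

```python
import json

def flatten(node, prefix=""):
    """Flatten nested JSON the way ASP.NET Core's configuration system does:
    object keys and array indices are joined with ':' into one flat key."""
    if isinstance(node, dict):
        pairs = node.items()
    elif isinstance(node, list):
        pairs = ((str(i), v) for i, v in enumerate(node))
    else:
        return {prefix: node}
    items = {}
    for key, value in pairs:
        full = f"{prefix}:{key}" if prefix else key
        items.update(flatten(value, full))
    return items

settings = json.loads("""
{ "Serilog": { "WriteTo": [ { "Name": "RollingFile",
    "Args": { "pathFormat": "/home/www-data/aissubject/storage/logs/log-{Date}.txt" } } ] } }
""")
config = flatten(settings)
print(config["Serilog:WriteTo:0:Args:pathFormat"])
# /home/www-data/aissubject/storage/logs/log-{Date}.txt
```

The 0 segment is exactly the array index from the flattening step, which is why it appears in the IConfiguration key.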

JsonSchema: Using type/format with binary data

I need a system to describe input and output data types.
A type can be a primitive type like "integer" or "string", or a custom type like "TensorFlow model" or "CSV table".
The validation properties I'm adding bear a strong resemblance to the JSON Schema validation properties.
It might be nice to describe the input and output data types using the JsonSchema language.
What's the best way to do that?
I had something like this in mind:
{"inputs": {
"model": {"type": "binary", "format": "TensorFlow model", "required": "true"},
"rounds": {"type": "integer", "minimum": 1, "default": 100}
}}
P.S. I find the way type and format are used really confusing. Types are basic and general while formats are specific. My associations are the opposite. Usually you have many specialized types that can be expressed in one of the few formats.
The primary aim of JSON Schema is to provide the format of JSON data.
The validation specification (draft-7) documents format in part as follows:
Implementations MAY add custom format attributes. Save for agreement
between parties, schema authors SHALL NOT expect a peer
implementation to support this keyword and/or custom format
attributes.
https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-validation-01#section-7.1
This means, you can add any format you want, but you can't expect it to work elsewhere. You should form agreements (or document what you mean) with anyone else that you expect to be able to use your schemas to validate the data you're providing.
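One way to honor such an "agreement between parties" is for every party to register the same custom format checkers. Here is a minimal Python sketch of that idea; the format name "tensorflow-model" and the file-extension check are made-up examples, not anything defined by JSON Schema:

```python
# Registry of custom format checkers shared (by agreement) between parties.
FORMAT_CHECKERS = {}

def checks(name):
    """Decorator that registers a checker function for a format name."""
    def register(fn):
        FORMAT_CHECKERS[name] = fn
        return fn
    return register

@checks("tensorflow-model")
def is_tf_model(value):
    # Hypothetical rule: accept strings that look like a serialized model path.
    return isinstance(value, str) and value.endswith(".pb")

def check_format(value, fmt):
    """Unknown formats pass, mirroring the spec: implementations MAY
    ignore format attributes they do not recognize."""
    checker = FORMAT_CHECKERS.get(fmt)
    return True if checker is None else checker(value)

print(check_format("model.pb", "tensorflow-model"))  # True
print(check_format("model.h5", "tensorflow-model"))  # False
print(check_format("anything", "unknown-format"))    # True (ignored)
```

A peer that has not registered "tensorflow-model" simply treats it as an unknown format, which is exactly the interoperability caveat the spec describes.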

1:n relationships and complex attribute types in ALFA

I'm trying to enter our database model into ALFA in order to check the capabilities of ALFA and XACML.
Are attributes like the following possible? What would the rules look like then?
1:n by list of strings
namespace com.mycompany {
    namespace resources {
        namespace patient {
            attribute trustedDoctorIds {
                category = resourceCat
                id = "trustedDoctorIds"
                type = list<string> // maybe it should be bag[string]
            }
        }
    }
}
1:n by list of complex type
namespace com.mycompany {
    namespace resources {
        namespace patient {
            attribute trustedDoctors {
                category = resourceCat
                id = "trustedDoctors"
                type = list<doctor> // maybe it should be bag[doctor]
            }
        }
    }
    namespace subjects {
        namespace doctor {
            attribute id {
                category = subjectCat
                id = "id"
                type = string
            }
            attribute lastname {
                category = subjectCat
                id = "lastname"
                type = string
            }
        }
    }
}
You have a great question there.
By default all attributes in ALFA and XACML are multi-valued. Attributes are bags of values rather than single values. This means that when you define the following,
attribute trustedDoctorIds {
    category = resourceCat
    id = "trustedDoctorIds"
    type = string
}
This means the attribute has a type of string and it can be multi-valued. You could choose to express cardinality information in the comments above the attribute definition e.g.
/**
 * This attribute, trustedDoctorIds, contains the list of doctors a patient
 * trusts. The list can have 0 or more values.
 */
The policy is what conveys how many values there can be, depending on the functions being used.
For instance, you could write a condition that states
stringOneAndOnly(trustedDoctorIds)==stringOneAndOnly(userId)
In that case, you are forcing each attribute to have one value and one value only. If you have 0 or more than 1 value, then the evaluation of the XACML policy will yield Indeterminate.
In a XACML (or ALFA) target, when you write:
trustedDoctorIds == "Joe"
You are saying: if there is at least one value in trustedDoctorIds equal to 'Joe'...
In an ALFA condition, when you write
trustedDoctorIds==userId
You are saying: if there is at least one value in trustedDoctorIds equal to at least one value in userId.
Note: I always use singular names for my attributes when I can. It's a convention, not a hard limit. Remembering the cardinality of your attributes will help later in your policy testing.
Answers to the comments
What would be a plural name you try to avoid by your convention?
Well, trustedDoctorIds looks rather plural to me. I would use trustedDoctorId unless you know that the attribute is necessarily always multi-valued.
So, this should be possible: In my request I provide resource.patient.trustedDoctorIds=="2,13,67" and subject.doctor.id=="6". How would the rule then look like in ALFA? Smth. like "resource.patient.trustedDoctorIds.contains(subject.doctor.id) permit"
The rule would look like the following:
stringIsIn(stringOneAndOnly(subject.doctor.id),resource.patient.trustedDoctorIds)
Make sure that you provide multiple values in your request, not one value that contains comma-separated values. Send in [1,2,3] rather than "1,2,3".
Further edits
So, by [2,13,67] the result is deny as expected and not permit like with "2,13,67" and doctorId==6. I chose that example on purpose, since the stringIsIn function would result unwantedly with true since 6 is included in 67
Do not confuse stringIsIn() and stringContains().
stringIsIn(a, b) takes in 2 parameters a and b where a is an atomic value and b is a bag of values. stringIsIn(a, b) returns true if the value of a is in the bag of values of b.
stringContains(a, b) takes in 2 parameters a and b that are both atomic values of type string. It returns true if the string value a is found inside b.
Example:
stringIsIn(stringOneAndOnly(userCitizenship), stringBag("Swedish", "German")) returns true if the user has a single citizenship equal to either of Swedish or German.
stringContains("a", "alfa") returns true if the second string contains the first one. So it returns true in this example.

performing virtual alignment between ontologically annotated JSON objects

I have an application that requests JSON objects from various other applications via their REST APIs. The response from any application comes in the following format:
{
    "data": {
        "key1": { "val": value, "defBy": "ontology class" },
        "key2": ...
    }
}
The following code depicts an object from App1:
{
    "data": {
        "key1": { "val": "98404506-385576361", "defBy": "abc:SHA-224" }
    }
}
The following code depicts an object from App2:
{
    "data": {
        "key2": { "val": "495967838-485694812", "defBy": "xyz:SHA3-224" }
    }
}
Here, defBy refers to the algorithm used to encrypt the string in val. When my application receives such objects, it parses the JSON and converts each key-value pair in the object into RDF such that:
// For objects from App1:
key1 rdf:type osba:key
key1 osba:generatedBy abc:SHA-224
...
// For objects from App2
key2 rdf:type osba:key
key2 osba:generatedBy xyz:SHA3-224
I need to query the generated RDF data in a way that lets me specify: if the osba:generatedBy value of any key belongs to the SHA family, return the subject as a query result, i.e. something to fill in where { ?k osba:generatedBy ??? }.
Please note the following points:
I also receive objects with other encryption algorithms such as MD5, etc.
I don't know in advance what encryption algorithm will be used by a new application joining the network nor what NS it uses. For example, in the above objects, one uses abc:, and the other uses xyz:.
I can't use SPARQL filtering because the value could be SecureHashAlgorithm instead of SHA
My problem is that I can't define an upper (referenced) ontology in advance and map the value stored in defBy: of the incoming objects, because I don't know in advance what ontology is used nor what encryption algorithm the value represents.
I read about Automatic Ontology Integration, Alignment, Mapping, etc,. but I can't find the rationale of this concept to my problem.
Any solutions?
3) I can't use SPARQL filtering because the value could be SecureHashAlgorithm instead of SHA
SPARQL filtering supports matching against regular expressions as defined by XPath. Thus, something along the lines of
SELECT ?key
WHERE { ?key osba:generatedBy ?generator
FILTER regex(?generator, "^s(ecure)?h(ash)?a(lgorithm)?.*", "i") }
(note: untested) should do the job. To build a good regex I can recommend http://regexr.com/
In case it's necessary: You can convert an IRI to a string (for matching) with the str() function.
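To sanity-check the regex outside a SPARQL engine, here is the same filter applied in plain Python over hypothetical (subject, predicate, object) triples; the prefix-stripping stands in for SPARQL's str() conversion of the IRI:

```python
import re

# The FILTER regex from the query above, case-insensitive.
pattern = re.compile(r"^s(ecure)?h(ash)?a(lgorithm)?", re.I)

# Hypothetical triples mirroring the RDF generated from App1 and App2.
triples = [
    ("key1", "osba:generatedBy", "abc:SHA-224"),
    ("key2", "osba:generatedBy", "xyz:SHA3-224"),
    ("key3", "osba:generatedBy", "md5:MD5"),
]

# Strip the namespace prefix, then test the local name against the regex.
hits = [s for s, p, o in triples
        if p == "osba:generatedBy" and pattern.search(o.split(":", 1)[1])]
print(hits)  # ['key1', 'key2']
```

SHA-224, SHA3-224 and a spelled-out SecureHashAlgorithm all match, while MD5 does not, which is the behavior the question asks for.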

Passing enum value in RESTful API

I have this scenario at hand:
I have an API server holding an enum value and sending it using a RESTful API.
The client will receive it and needs to act according to the value.
My main question: Which is better - sending the int value or the string? I can see benefits for both approaches.
Is there any way to avoid holding the enum in both sides? I am not familiar with one that actually can be useful.
Thanks!
If the API server maintains the enum, the client could fetch it by:
GET /enums
... which will return a list of enum values:
[
    { "id": "1001", "value": "enum-item-1" },
    { "id": "1002", "value": "enum-item-2" },
    ...
    { "id": "100N", "value": "enum-item-N" }
]
Thus allowing the client to fetch one of the enum items:
GET /enums/1017
Or perhaps perform an operation on it:
POST /enums/1017/disable
Normally, one would only refer to the enum items by their unique ID - and if the client always starts by querying the server for the enum list, you will not need to maintain the enum on the client and server.
But - if the values in your business case are permanently unique and there is a compelling reason to have 'nicer' human-readable URLs, one could use:
GET /enums/enum-item-26
Generally, this is not best practice, as the enum item value probably has business meaning and thus might change. Even though that currently seems unlikely.
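On the client side, avoiding a duplicated enum then amounts to fetching the list once and indexing it by id. A Python sketch, with a hard-coded response body standing in for the actual GET /enums call:

```python
import json

# Stand-in for the body returned by GET /enums (a real client would fetch it).
response_body = """
[ { "id": "1001", "value": "enum-item-1" },
  { "id": "1002", "value": "enum-item-2" } ]
"""

# Build an id -> value map once; look values up instead of hard-coding the enum.
enum_by_id = {item["id"]: item["value"] for item in json.loads(response_body)}
print(enum_by_id["1002"])  # enum-item-2
```

Because the client derives the mapping from the server's response at startup, adding or renaming enum items on the server requires no client change.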