I want to construct a complex POJO at run time based on the scenario.
In the sample request structure below, consider addresses.line1 a mandatory field.
I don't have to pass the other fields every time, but need to include them on a per-test-case basis.
{
"site": [{
"code": "string",
"mrn": "string"
}
],
"email": ["string"],
"addresses": [{
"line1": "string",
"line2": "string",
"city": "string",
"state": "string",
"postalCode": "string"
}
],
"names": [{
"first": "string",
"middle": "string",
"last": "string",
"suffix": "string"
}
]
}
Example:
For TestCase#1 I need only the JSON below:
{
"addresses": [{
"line1": "string"
}
]
}
Whereas for TestCase#2 I need the JSON below:
{
"email": ["string"],
"addresses": [{
"line1": "string",
"line2": "string"
}
],
"names": [{
"first": "string",
"last": "string"
}
]
}
I referred to https://github.com/intuit/karate/blob/master/karate-demo/src/test/java/demo/outline/examples.feature, but the example was pretty straightforward, with replaceable values.
I was looking for something like @JsonInclude(JsonInclude.Include.NON_DEFAULT).
Karate is designed to completely avoid POJOs and give you complete control over creating and modifying complex JSON. So I suggest you temporarily forget about POJOs and Java, else you won't get the best out of Karate.
There are a few ways to do this, but here is one. First, store the complex JSON in a file called main.json.
Then creating the different variants is simple:
Background:
* def main = read('main.json')
Scenario: one
* def payload = karate.filterKeys(main, 'addresses')
Scenario: two
* def payload = main
* remove payload.site
I suggest you read the docs on reading files for more ideas, and look out for embedded expressions.
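For example, here is a minimal sketch using embedded expressions (the file and variable names are illustrative, not from the demo): annotate a copy of the JSON, say main-template.json, with the optional marker ##(), which drops a key entirely when its variable is null:
{
"email": "##(email)",
"addresses": [{
"line1": "#(line1)",
"line2": "##(line2)"
}]
}
Scenario: template
# variables set to null have their ##() keys removed on read
* def email = null
* def line2 = null
* def line1 = 'string'
* def payload = read('main-template.json')
Here payload becomes exactly the TestCase#1 JSON, {"addresses":[{"line1":"string"}]}, which is about the closest Karate gets to @JsonInclude(JsonInclude.Include.NON_DEFAULT).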
Also see: https://stackoverflow.com/a/51896522/143475
Related
Is it possible to get a list of all additionalProperties found by json-schema?
For example, if my schema looks like this:
{
"type": "object",
"properties": {
"firstName": {
"type": "string",
},
"lastName": {
"type": "string",
},
"age": {
"type": "integer"
}
}
}
And the data looks like this:
{
"firstName": "John",
"lastName": "Doe",
"age": 21,
"extraField": "some new data I was not expecting",
"anotherExtraField": "another unexpected data point"
}
In this case, instead of an exception from json-schema because of additionalProperties: false, I want a list in return, like: [extraField, anotherExtraField]
If you're using an implementation that supports 2019-09 or 2020-12 with annotations, you're in luck! additionalProperties should produce an annotation result of the properties it validates (spec).
If you add additionalProperties: true, then all extra properties pass and are validated by the keyword, which means those extra properties should be listed in the annotation result.
{
"type": "object",
"properties": {
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"age": {
"type": "integer"
}
},
"additionalProperties": true
}
This yields (in the Detailed output format)
{
"valid": true,
"keywordLocation": "#",
"instanceLocation": "#",
"annotations": [
{
"valid": true,
"keywordLocation": "#/properties",
"instanceLocation": "#",
"annotation": [
"firstName",
"lastName",
"age"
]
},
{
"valid": true,
"keywordLocation": "#/additionalProperties",
"instanceLocation": "#",
"annotation": [
"extraField",
"anotherExtraField"
]
}
]
}
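To get the list the question asks for, filter that output for the annotation whose keywordLocation is #/additionalProperties; its annotation array is exactly ["extraField", "anotherExtraField"].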
You can try it on https://json-everything.net, which is powered by my validator, JsonSchema.Net.
If you're not using .Net, you can browse the implementations page for other libraries. Some of them may also support annotations, but I'm not sure which do.
I have been working on my own validator for JSON Schema and FINALLY have most of how unevaluatedProperties is supposed to work... I think. That's one tricky piece there! However, I really just want to confirm one thing. Given the following schema and JSON, what is the expected outcome? I have tried it with https://www.jsonschemavalidator.net and gotten an answer, but I was hoping for a more definitive one.
The focus is that the faz property is in fact being evaluated, but the directive to disallow unevaluatedProperties comes from a deeply nested schema.
Thoughts?
Here is the schema...
{
"type": "object",
"properties": {
"foo": {
"type": "object",
"properties": {
"bar": {
"type": "string"
}
},
"unevaluatedProperties": false
}
},
"anyOf": [
{
"properties": {
"foo": {
"properties": {
"faz": {
"type": "string"
}
}
}
}
}
]
}
Here is the JSON...
{
"foo": {
"bar": "test",
"faz": "test"
}
}
That schema will successfully evaluate against the provided data. The unevaluatedProperties keyword will be aware of properties evaluated in subschemas of adjacent keywords, and is evaluated after all other applicator keywords, so it will see the annotation produced from within the anyOf subschema, also.
Evaluating this keyword is easy if you follow the specification literally -- it uses annotations to decide what to do. You just need to make sure that all keywords either produce annotations correctly or propagate annotations correctly that were produced by other keywords, and then all the information is available to generate the correct result.
The result produced by my implementation is:
{
"annotations" : [
{
"annotation" : [
"faz"
],
"instanceLocation" : "/foo",
"keywordLocation" : "/anyOf/0/properties/foo/properties"
},
{
"annotation" : [
"foo"
],
"instanceLocation" : "",
"keywordLocation" : "/anyOf/0/properties"
},
{
"annotation" : [
"bar"
],
"instanceLocation" : "/foo",
"keywordLocation" : "/properties/foo/properties"
},
{
"annotation" : [],
"instanceLocation" : "/foo",
"keywordLocation" : "/properties/foo/unevaluatedProperties"
},
{
"annotation" : [
"foo"
],
"instanceLocation" : "",
"keywordLocation" : "/properties"
}
],
"valid" : true
}
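For contrast, a stripped-down variant of the schema with the anyOf branch removed shows why those annotations matter:
{
"type": "object",
"properties": {
"foo": {
"type": "object",
"properties": {
"bar": {
"type": "string"
}
},
"unevaluatedProperties": false
}
}
}
Against the same data this variant fails, because nothing evaluates faz any more, so /properties/foo/unevaluatedProperties sees it as unevaluated.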
This is not an answer but a follow-up example which I feel is in the same vein. I feel this guides us to the answer.
Here we have a single object being validated, but the unevaluated keyword resides in two different schemas, each part of a different set of "adjacent keyword subschemas" (from the core spec: http://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.11).
How should this be resolved? If all annotations must be evaluated, then in what order do I evaluate: the oneOf first, or the anyOf? According to the spec, an unevaluated keyword (unevaluatedProperties or unevaluatedItems) generates annotation results, which means that result would affect any other unevaluated keyword.
http://json-schema.org/draft/2020-12/json-schema-core.html#unevaluatedProperties
"The annotation result of this keyword is the set of instance property names validated by this keyword's subschema."
This is as far as my understanding of the spec goes.
According to the two validators I am using, this fails.
Schema
{
"$schema": "https://json-schema.org/draft/2019-09/schema",
"type": "object",
"properties": {
"foo": {
"type": "string"
}
},
"oneOf": [
{
"properties": {
"faz": {
"type": "string"
}
},
"unevaluatedProperties": true
}
],
"anyOf": [
{
"properties": {
"bar": {
"type": "string"
}
},
"unevaluatedProperties": false
}
]
}
Data
{
"bar": "test",
"faz": "test",
}
I need to define a JSON schema for a JSON document in which a field/key is named after the value of a previous field. Examples:
{
"key1": "SOME_VALUE",
"SOME_VALUE": "..."
}
{
"key1": "ANOTHER_VALUE",
"ANOTHER_VALUE": "..."
}
Moreover, the second field should be among the required ones.
I have been looking around, but I am not sure JSON Schema offers such a feature. Maybe some advanced semantic check?
Thanks for the help
The only way you could do this is if you knew the values in advance, but it looks like this is not possible for you. This would need to be in your business logic validation as opposed to your format validation.
So, thanks to Relequestual's suggestions, I managed to get to a solution.
Constraint: the possible values of "key1" need to be finite and known in advance.
Suppose we need a JSON schema for validating a JSON that:
Requires the string properties "required_simple_property1" and "required_simple_property2".
Requires the property "key1" as an enum with 3 possible values ["value1", "value2", "value3"].
Requires a third property, whose key must be the value taken by key1.
This can be accomplished with a schema like:
"oneOf": [
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value1"
],
"properties": {
"key1": {
"type": "string",
"const": "value1"
}
}
},
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value2"
],
"properties": {
"key1": {
"type": "string",
"const": "value2"
}
}
},
{
"required": [
"required_simple_property1",
"required_simple_property2",
"value3"
],
"properties": {
"key1": {
"type": "string",
"const": "value3"
}
}
}
],
"properties": {
"required_simple_property1": {
"type": "string"
},
"required_simple_property2": {
"type": "string"
},
"value1": {
... (anything)
},
"value2": {
... (anything)
},
"value3": {
... (anything)
}
}
}
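For illustration, an instance like the following (property values are hypothetical) satisfies exactly the second oneOf branch, since key1 is "value2" and a property named value2 is present:
{
"required_simple_property1": "foo",
"required_simple_property2": "bar",
"key1": "value2",
"value2": 42
}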
Suppose I have the following JSON, which is the result of parsing URL parameters from a log file.
{
"title": "History of Alphabet",
"author": [
{
"name": "Larry"
}
]
}
{
"title": "History of ABC",
}
{
"number_pages": "321",
"year": "1999",
}
{
"title": "History of XYZ",
"author": [
{
"name": "Steve",
"age": "63"
},
{
"nickname": "Bill",
"dob": "1955-03-29"
}
]
}
All the top-level fields ("title", "author", "number_pages", "year") are optional, and so are the second-level fields, inside "author", for example.
How should I make a schema for this JSON when loading it to BQ?
A related question:
For example, suppose there is another similar table, but the data is from a different date, so it's possible to have a different schema. Is it possible to query across these 2 tables?
How should I make a schema for this JSON when loading it to BQ?
The following schema should work. You may want to change some of the types (e.g. maybe you want the dob field to be a TIMESTAMP instead of a STRING), but the general structure should be similar. Since types are NULLABLE by default, all of these fields should handle not being present for a given row.
[
{
"name": "title",
"type": "STRING"
},
{
"name": "author",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "name",
"type": "STRING"
},
{
"name": "age",
"type": "STRING"
},
{
"name": "nickname",
"type": "STRING"
},
{
"name": "dob",
"type": "STRING"
}
]
},
{
"name": "number_pages",
"type": "INTEGER"
},
{
"name": "year",
"type": "INTEGER"
}
]
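Assuming the schema above is saved as schema.json and each record sits on its own line (BigQuery's JSON loader expects newline-delimited JSON), a load with the bq CLI would look something like this (dataset, table, and file names are placeholders):
bq load --source_format=NEWLINE_DELIMITED_JSON mydataset.books ./books.json ./schema.json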
A related question: For example, suppose there is another similar table, but the data is from a different date, so it's possible to have a different schema. Is it possible to query across these 2 tables?
It should be possible to union two tables with differing schemas without too much difficulty.
Here's a quick example of how it works over public data (kind of a silly example, since the tables contain zero fields in common, but shows the concept):
SELECT * FROM
(SELECT * FROM publicdata:samples.natality),
(SELECT * FROM publicdata:samples.shakespeare)
LIMIT 100;
Note that you need the SELECT * around each table or the query will complain about the differing schemas.
After reading the documentation, testing, and reading a lot of other questions here on Stack Overflow:
We have documents that have titles and description in multiple languages. There are also tags that are translated to the same languages. There might be up to 30-40 different languages in the system, but probably only 3 or 4 translations for a single document.
This is the planned document structure:
{
"luck": {
"id": 10018,
"pub": 0,
"pr": 100002,
"loc": {
"lat": 42.7,
"lon": 84.2
},
"t": [
{
"lang": "en-analyzer",
"title": "Forest",
"desc": "A lot of trees.",
"tags": [
"Wood",
"Nature",
"Green Mouvement"
]
},
{
"lang": "fr-analyzer",
"title": "ForĂȘt",
"desc": "A grand nombre d'arbre.",
"tags": [
"Bois",
"Nature",
"Mouvement Vert"
]
}
],
"dates": [
"2014-01-01T20:00",
"2014-06-06T20:00",
"2014-08-08T20:00"
]
}
}
Possible queries are "arbre" or "wood" or "forest" or "nature", combined with a date and a geo_distance filter; furthermore, there will be some facets over the tags array (which obviously include counting).
We can produce any document structure that fits best for Elasticsearch (or for Lucene). It's crucial that each language is analyzed specifically, so we use "_analyzer" in order to distinguish the languages.
{
"luck": {
"properties": {
"id": {
"type": "long"
},
"pub": {
"type": "long"
},
"pr": {
"type": "long"
},
"loc": {
"type": "geo_point"
},
"t": {
"_analyzer": {
"path": "t.lang"
},
"properties": {
"lang": {
"type": "string"
},
"properties": {
"title": {
"type": "string"
},
"desc": {
"type": "string"
},
"tags": {
"type": "string"
}
}
}
}
}
}
A) Apparently, this idea does not work: after PUTting the mapping, we retrieve the same mapping (GET) and it seems to ignore the specific analyzers (a test with a top-level "_analyzer" worked fine). Does "_analyzer" work for sub-documents, and if yes, how should we refer to it? We also tested declaring the sub-document as "object" or "nested". How is multi-language document indexing supposed to work?
B) One possibility would be to put each language in its own document: in that case, how do we manage the id? Ultimately, both documents should refer to the same id. For example, if the user searches for "nature" (and we don't know whether the user intends to find "nature" in English or French), this document would appear twice in the result set, and the counting and paging would be very wrong (facet counting, too).
Any ideas?