JSON LD Transformation : to remove certain objects from json with context or jsonld framing - jsonschema

I have a JSON with the following structure and I am looking to remove certain parts of it either by using JSON-LD Context or JSON-LD Framing so in this situation I need a specific type2 only in the output along with the parent
{
"#context":{
"#vocab":"http://xyz.abc.com/01#",
"sdf":"http://xyz.abc.com/01#",
"#base":"http://xyz.abc.com/01#",
"rowid":"#id",
"values":"#nest",
"type":"#type"
},
"#id":"sdf:parent",
"type":"type1",
"values":[
{
"value":984657
}
],
"rows":[
{
"rowid":"5637220",
"type":"type2",
"values":[
{
"value":"i am type 2"
}
]
},
{
"rowid":"9847589",
"type":"type3",
"values":[
{
"value":"I am type 3"
}
]
}
]
}
So the output should look somewhat like this with the child either embedded in the parent or separate like below with a predicate hasChild
{
"#graph": [
{
"#id": "http://xyz.abc.com/01#parent",
"#type": "http://xyz.abc.com/01#type1",
"http://xyz.abc.com/01#value": 984657 ,
"hasChild" : "http://xyz.abc.com/5637220"
},
{
"#id": "http://xyz.abc.com/5637220",
"#type": "http://xyz.abc.com/01#type2",
"http://xyz.abc.com/01#value": "i am type 2"
}
]
}

Related

Can JSON Schema if statements handle nested $refs?

I have a JSON Schema using draft 2020-12 and I am trying to use an if-else subschema to check that a particular property exists based on the value of another property. Below is the if statement I am currently using. There are more properties but I have have omitted them for the sake of brevity. They are identical except the type of the property in the then statement is different. They are all wrapped in an allOf array:
{
"AValue": {
"allOf": [
{
"if": {
"$ref": "#/$defs/ValueItem/properties/dt",
"const": "type1"
},
"then": {
"properties": {
"string": { "type": "string" }
},
"required": ["string"]
}
}
]
}
}
The #/$defs/ValueItem/properties/dt being referred to is here:
{
"ValueItem": {
"properties": {
"value": {
"$ref": "#/$defs/AValue"
},
"dt": {
"$ref": "#/$defs/DT"
}
},
"additionalProperties": false
}
}
#/$defs/DT is here:
{
"DT": {
"type": "string",
"enum": [
"type1",
"type2",
"type3",
"type4"
]
}
}
I expected that when dt is encountered in a JSON instance document, the validator will check if the value of dt is type1 and then validate that an additional property called string is also present and is of type string. However, what actually happens is the validator complains that "Property 'string' has not been defined and the schema does not allow additional properties".
I assume that this is because the condition in the if statement evaluates to false so the subschema is never applied. However, I am unsure why that would be as I followed the example here when creating the if-then-else block. The only thing I can think of that is different is the use of $ref which I have in my schema but it is not in the example.
I found this answer which makes me think that it is possible to use $ref in an if statement but is it possible to use a ref that points to another ref or am I thinking about it incorrectly?
I have also tried removing the $ref from the if statement like below but it still doesn't work. Is it because of the $ref in the properties?
{
"AValue": {
"properties": {
"dt": {
"$ref": "#/$defs/DT"
}
},
"required": [
"dt"
],
"allOf": [
{
"if": {
"properties": {
"dt": {
"const": "type1"
}
}
},
"then": {
"properties": {
"string": {
"type": "string"
}
}
}
}
]
}
}
The problem is not cascading the $ref keywords. The const keyword at the if statement is not applied to the target of the $ref, but to the current location in the JSON input data. In this case to "AValue". To check if the property "dt" is of value "type1" at this point, you would need an if like this (simple solution with no $ref):
"if": {
"properties": {
"dt": {
"const": "type1"
}
},
"required": [ "dt" ]
}
Edit: Showing complete JSON Schema and error in JSONBuddy with $ref:

Can't make PUT /raylight/v1/documents/id/parameter/id work properly

I need to update document parameters via REST API.
I've tried using the following:
PUT .../raylight/v1/documents/33903/parameters/3
with the following json payload
{
"parameters":{
"parameter": {
"id": 3,
"answer": {
"values": {
"value": [
"2019/9"
]
}
}
}
}
}
But the returned answer shows unmodified parameters:
{
"parameter": {
"#optional": "false",
"#type": "prompt",
...
"id": 3,
...
"answer": {
...
"info": {
...
"previous": {
"value": [
"2015\/12"
]
}
},
"values": {
"value": [
"2015\/12"
]
}
}
}
}
How can I properly set new prompt parameters?
Do:
PUT .../raylight/v1/documents/33903/parameters
instead of:
PUT .../raylight/v1/documents/33903/parameters/3
Adding a parameter ID at the end performs a different function: it returns the list of parameters that are dependent upon the one provided. You have only one in this case, and it's returning itself. Leave it off, to refresh the document.

getting lvalue and rvalue of a declaration

Im parsing C++ with ANTLR4 grammar, I have a visitor function for visitDeclarationStatement. In the C++ code that Im trying to parse Person p; or a declaration of any custom type, in the tree I get two similar nodes and I cannot differentiate between the Lvalue and Rvalue!
"declarationStatement": [
{
"blockDeclaration": [
{
"simpleDeclaration": [
{
"declSpecifierSeq": [
{
"declSpecifier": [
{
"typeSpecifier": [
{
"trailingTypeSpecifier": [
{
"simpleTypeSpecifier": [
{
"theTypeName": [
{
"className": [
{
"type": 128,
"text": "Person"
}
]
}
]
}
]
}
]
}
]
}
]
},
{
"declSpecifier": [
{
"typeSpecifier": [
{
"trailingTypeSpecifier": [
{
"simpleTypeSpecifier": [
{
"theTypeName": [
{
"className": [
{
"type": 128,
"text": "p"
}
]
}
]
}
]
}
]
}
]
}
]
}
]
},
{
"type": 124,
"text": ";"
}
]
}
I want to be able to get Variable Type and Variable Name separately. What is the right way of doing that? How can I change the g4 file to get those results in a way that I can differentiate between the type and the name?
Thanks
You don't need to change the grammar. In case of Person p;, which is matched by:
declSpecifierSeq: declSpecifier+ attributeSpecifierSeq?;
the first child of declSpecifierSeq (which is declSpecifier+) will be a List of declSpecifier-contexts, of which the first is the type and the second the name.

hierarchical faceting with Elasticsearch

I'm using elasticsearch and need to implement facet search for hierarchical object as follow:
category 1 (10)
subcategory 1 (4)
subcategory 2 (6)
category 2 (X)
...
So I need to get facets for two related objects. Documentation says that it's possible to get such kind of facets for numeric value, but I need it for strings http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html
Here is another interesting topic, unfortunately it's old: http://elasticsearch-users.115913.n3.nabble.com/Pivot-facets-td2981519.html
Does it possible with elastic search?
If so, how can I do that?
The previous solution works really well until you have no more than a multi-level tag on a single-document. In this case a simple aggregation doesn't work, because the flat structure of the lucene fields mix the results on the internal aggregation.
See the example below:
DELETE /test_category
POST /test_category
# Insert a doc with 2 hierarchical tags
POST /test_category/test/1
{
"categories": [
{
"cat_1": "1",
"cat_2": "1.1"
},
{
"cat_1": "2",
"cat_2": "2.2"
}
]
}
# Simple two-levels aggregations query
GET /test_category/test/_search?search_type=count
{
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
That's the WRONG response that I have got on ES 1.4, where the fields on the internal aggregation are mixed at a document level:
{
...
"aggregations": {
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
},
{
"key": "2.2", <= WRONG
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1", <= WRONG
"doc_count": 1
},
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
A Solution can be to use nested objects. These are the steps to do:
1) Define a new type in the schema with nested objects
POST /test_category/test2/_mapping
{
"test2": {
"properties": {
"categories": {
"type": "nested",
"properties": {
"cat_1": {
"type": "string"
},
"cat_2": {
"type": "string"
}
}
}
}
}
}
# Insert a single document
POST /test_category/test2/1
{"categories":[{"cat_1":"1","cat_2":"1.1"},{"cat_1":"2","cat_2":"2.2"}]}
2) Run a nested aggregation query:
GET /test_category/test2/_search?search_type=count
{
"aggs": {
"categories": {
"nested": {
"path": "categories"
},
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
}
}
That's the response, now correct, that I have got:
{
...
"aggregations": {
"categories": {
"doc_count": 2,
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
}
The same solution can be extended to a more than two-levels hierarchy facet.
Currently, elasticsearch does not support hierarchical facetting out-of-the-box. But the upcoming 1.0 release features a new aggregations module, that can be used to get these kind of facets (which are more like pivot-facets rather than hierarchical facets). Version 1.0 is currently in beta, you can download the second beta and test out aggregatins by yourself. Your example might look like
curl -XPOST 'localhost:9200/_search?pretty' -d '
{
"aggregations": {
"main category": {
"terms": {
"field": "cat_1",
"order": {"_term": "asc"}
},
"aggregations": {
"sub category": {
"terms": {
"field": "cat_2",
"order": {"_term": "asc"}
}
}
}
}
}
}'
The idea is, to have a different field for each level of facetting and bucket your facets based on the terms of the first level (cat_1). These aggregations then would have sub-buckets, based on the terms of the second level (cat_2). The result may look like
{
"aggregations" : {
"main category" : {
"buckets" : [ {
"key" : "category 1",
"doc_count" : 10,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 4
}, {
"key" : "subcategory 2",
"doc_count" : 6
} ]
}
}, {
"key" : "category 2",
"doc_count" : 7,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 3
}, {
"key" : "subcategory 2",
"doc_count" : 4
} ]
}
} ]
}
}
}

ElasticSearch for Attribute(Key) value data set

I am using Elasticsearch with Haystacksearch and Django and want to search the follow structure:
{
{
"title": "book1",
"category" : ["Cat_1", "Cat_2"],
"key_values" :
[
{
"key_name" : "key_1",
"value" : "sample_value_1"
},
{
"key_name" : "key_2",
"value" : "sample_value_12"
}
]
},
{
"title": "book2",
"category" : ["Cat_3", "Cat_2"],
"key_values" :
[
{
"key_name" : "key_1",
"value" : "sample_value_1"
},
{
"key_name" : "key_3",
"value" : "sample_value_6"
},
{
"key_name" : "key_4",
"value" : "sample_value_5"
}
]
}
}
Right now I have set up an index model using Haystack with a "text" that put all the data together and runs a full text search! In my opinion this is not the a well established search 'cause I am not using my data set structure and hence this is some kind odd.
As an example if for an object I have a key-value
{
"key_name": "key_1",
"value": "sample_value_1"
}
and for another object I have
{
"key_name": "key_2",
"value": "sample_value_1"
}
and we it gets a query like "Key_1 sample_value_1" comes I get a thoroughly mixed result of objects who have these words in their fields rather than using their structures.
P.S. I am totally new to ElasticSearch and better to say new to the search technologies and challenges. I have searched the web and SO button didn't find anything satisfying. Please let me know if there is something wrong with my thoughts and expectations from these search engines and if there is SO duplicate question! And also if there is a better approach to design a database for this kind of search
Read the es docs on nested mappings and do something like this:
"book_type" : {
"properties" : {
// title, cat mappings
"key_values" : {
"type" : "nested"
"properties": {
"key_name": {
"type": "string", "index": "not_analyzed"
},
"value": {
"type": "string"
}
}
}
}
}
Then query using a nested query
"nested" : {
"path" : "key_values",
"query" : {
"bool" : {
"must" : [
{
"term" : {"key_values.key_name" : "key_1"}
},
{
"match" : {"key_values.value" : "sample_value_1"}
}
]
}
}
}