Assuming IRIs when converting legacy JSON documents to RDF knowledge with JSON-LD - semantic-web

I am currently trying to use JSON-LD to make legacy JSON documents machine readable. Now I have a legacy document that contains a list of names. Each of these names references an entity for which I already have further knowledge in my graph and therefore I would like to link the existence of its name to the further attributes already known about it. But there is no IRI contained in the legacy JSON document, so when I convert it via JSON-LD, I just get unlinked string literals into my graph.
My legacy JSON document looks like this:
{
"url": "https://example.com/123",
"name": "Test",
"children": [
"Abc",
"Def",
"Xyz"
]
}
My JSON-LD context looks something like this currently:
{
"#context": {
"ex": "https://example.com/",
"name": "ex:name",
"children": "ex:has"
}
}
And in my knowledge store I have something like this already before interpreting the legacy document:
#PREFIX ex: <https://example.com/>
ex:abc ex:name "ABC".
ex:hasValue 123.
ex:def ex:name "DEF"
ex:hasValue 456.
ex:xyz ex:name "XYZ".
ex:hasValue 789.
Is there some supposed way to link the data attached i.e. to ex:xyz by the name "Xyz" in the legacy document? Or is there no other way than to write a custom program to add futher triples for these links?
I would like to see the following in my extracted knowledge from the legacy JSON:
ex:123 ex:name "Test".
ex:has ex:abc.
ex:has ex:def.
ex:has ex:xyz.

Related

How to specify 'includes' for nested properties in a document in RavenDB

I am trying to use RavenDB's REST API to make some calls to my database and I'm wondering if there's a way to use the 'includes' feature to return documents that are nested in a document.
For example, I have an object, Order, that looks similar to this:
"Order": {
"Lines": [
{
"Product": "products/11-A"
},
{
"Product": "products/42-A"
},
{
"Product": "products/72-A"
}
],
"OrderedAt": "1996-07-04T00:00:00.0000000",
"Company": "companies/85-A"
}
Company maps to another document and is simple enough to include in the query.
{ "Query": "from Orders include Company" }
My problem is Product that is nested in Lines, which is an array of order lines. Since I didn't find anything in the documentation about it I tried things like include Product or include Lines.Product, but those didn't work.
Is this kind of thing possible with the REST API? If so, how would I go about doing it?
The syntax to Query for Related Documents from the client can be found in this demo:
https://demo.ravendb.net/demos/csharp/related-documents/query-related-documents
The matching RQL to be used in the Query when using REST API is:
from 'Orders' include 'Lines[].Product'

Is there a way to add a default to a json schema array

I just want to understand if there is a way to add a default set of values to an array. (I don't think there is.)
So ideally I would like something like how you might imagine the following working. i.e. the fileTypes element defaults to an array of ["jpg", "png"]
"fileTypes": {
"description": "The accepted file types.",
"type": "array",
"minItems": 1,
"items": {
"type": "string",
"enum": ["jpg", "png", "pdf"]
},
"default": ["jpg", "png"]
},
Of course, all that being said... the above actually does seem to be validate as json schema however for example in VS code this default value does not populate like other defaults (like for strings) populate when creating documents.
It appears to be valid based on the spec.
9.2. "default"
There are no restrictions placed on the value of this keyword. When multiple occurrences of this keyword are applicable to a single sub-instance, implementations SHOULD remove duplicates.
This keyword can be used to supply a default JSON value associated with a particular schema. It is RECOMMENDED that a default value be valid against the associated schema.
See https://json-schema.org/draft/2020-12/json-schema-validation.html#rfc.section.9.2
It's up to the tooling to take advantage of that keyword in the JSON Schema and sounds like VS code is not.

Use of type : object and properties in JSON schema

I'm new to JSON.
I see in various examples of JSON, like the following, where complex values are prefixed with "type":"object", properties { }
{
"$schema": "http://json-schema.org/draft-06/schema#",
"motor" : {
"type" : "object",
"properties" : {
"class" : "string",
"voltage" : "number",
"amperage" : "number"
}
}
}
I have written JSON without type, object, and properties, like the following.
{
"$schema": "http://json-schema.org/draft-06/schema#",
"motor" : {
"class" : "string",
"voltage" : "number",
"amperage" : "number"
}
}
and submitted to an on-line JSON schema validator with no errors.
What is the purpose of type:object, properties { }? Is it optional?
Yes it is optional, try removing it and use your validator.
{
"$schema": "http://json-schema.org/draft-06/schema#",
"foo": "bar"
}
You actually don't even need to use the $schema keyword i.e. {} is valid json
I would start by understanding what json is, https://www.json.org/ is the best place to start but you may prefer something easier to read like https://www.w3schools.com/js/js_json_intro.asp.
A schema is just a template (or definition) to make sure you're producing valid json for the consumer
As an example let's say you have an application that parses some json and looks for a key named test_score and saves the value (the score) in a database in some table/column. For this example we'll call the table tests and the column score. Since a database column requires a type we'll choose a numeric type, i.e. integer for our score column.
A valid json example for this may look like
{
"test_score": 100
}
Following this example the application would parse the key test_score and save the value 100 to the tests.score database table/column.
But let's say a score is absent so you put in a string i.e "NA"
{
"test_score": "NA"
}
When the application attempts to save NA to the database it will error because NA is a string not an integer which the database expects.
If you put each of those examples into any online json validator they are valid json example. However, while it's valid json to use "NA" or 100 it is not valid for the actual application that needs to consume the json.
So now you may understand that the author of the json may wonder
What are the different valid types I can use as values for my test
score?
The responsibility then falls on the writers of the application to provide some sort of definition (i.e a schema) that the clients (authors) can reference so the author knows exactly how to structure the json so the application can process it accordingly. Having a schema also allows you to validate/test your json so you know it can be processed by the application without actually having to send your json through the application.
So putting it altogether let's say in the schema you see
"$test_score": {
"type": "integer",
"format": "tinyint"
},
The writer of the json now knows that they must pass an integer and the range is 0 to 255 because it's a tinyint. They no longer have to trial by error different values and see which ones the application process. This is a big benefit to having a schema.

Lucene - Custom Analyzer/Parser for JSON objects?

I have a requirement for a very specific Lucene implementation which stores multiple "Properties" fields with deserialized JSON strings.
Example:
Document:
ID: "99"
Text: "Lorepsum Ipsum"
Properties: "{
"lastModified": "1/2/2015",
"user": "johndoe",
"modifiedChars": 2,
"before": "text a",
"after": "text b",
}"
Properties:"{
"lastModified": "1/2/2013",
"user": "johncotton",
"modifiedChars": 6,
"before": "text aa",
"after": "text bbb",
}"
Properties: "{
"lastModified": "1/3/2015",
"user": "johnmajor",
"modifiedChars": 3,
"before": "text aa",
"after": "text b",
}"
I'm aware that ElasticSearch and Solr have implementations to lookup within JSON objects but I'm using Lucene's core API (3.0.5).
My goal is to use lucene's API and with some added implementation to search within the JSON strings, for example:
Building a type of BooleanQuery where at least one "Properties" Field MUST match all the values in the query. (e.g query "+user:tom +modifiedChars:3 +before:"text A", etc)
I have some ideas but I have no clue where to begin. What I'm asking is some high level ideas to achieve such implementation. A custom analyzer maybe to use with a query parser?
Consider it an open ended question. All suggestions are welcome.
If you will always search for the complete set of values...
Create a "property" field for each set. The value would just be the concatenated set of values ie "1/2/2015:johndoe:2:text a:text b".
Alternatively... create a separate doc for each set. This would allow you to search for different combinations of values without conflating the different sets.
Yes that might mean duplicating the Text field. If it's not big then I wouldn't care too much (especially if you're not using a "stored" field).
Do you need to need to combine text and property in your queries? ("text:ipsum AND property:xxx")
If not then put the text in yet another doc.
If the idea is to search in order to get the "ID" field then some combination of the above ought to work

How to add content and moreDetailsUrl for Google Search suggest?

I'm using GSA (version 6.14) and we would like to get an auto suggest function on our website. Works fine for basic requests, but it seems the GSA offers more functionality when you would be using user-added results. However, I can find nowhere a reference on how to add user-added results.
This is what the information tells me today :
/suggest?q=<query>&max=<num>&site=<collection>&client=<frontend>&access=p&format=rich
should return a response as below :
{
"query": "<query>",
"results": [
{ "name": "<term 1>", "type": "suggest"},
{ "name": "<term 2>", "type": "suggest"},
{ "name": "<term 3>", "type": "uar", "content": "Title of UAR",
"moreDetailsUrl": "URL of UAR"}
]
}
I am able to get results as the first 2 lines, but would like to get results as the last line also, so with content and a moreDetailsUrl. So maybe a very stupid question but I am not able to find the answer anywhere : How and where do I add this UAR ?
I actually want to understand if it's feasible to get metadata into the content part of the JSON, so if for instance an icon meta is available I'd like to have it included in the JSON so I can enrich my search results.
User Added Results are a OneBox that can be added to multiple frontends. See this: https://developers.google.com/search-appliance/documentation/614/admin_searchexp/ce_improving_search#uar
When done with Suggest, the data is fed from user entering 'keymatches' directly. What's different about them is that they are a direct link versus a suggested query. If you use the out of the box experience, you'll click a link to the url instead of running another query.