Json-LD: Linking documents using Json-LD - node-webkit

I am creating a desktop application using node-webkit.
The purpose of creating the application is to add documents and that anyone can comment on the document. The document will be split into paragraphs and people can comment on the paragraphs. Each of the paragraphs will be considered as different sections. I would like to link each section (or paragraph) with the comments using JSON-LD.
I am new to JSON-LD and I would like to know how it can be used.

In a document (an HTML document, anyway), sections (or any element) can be identified using the id attribute, which typically becomes a fragment identifier for the document. For instance, http://www.w3.org/TR/json-ld/#abstract is a URL with the "abstract" fragment identifier; if you look at the HTML source, you'll see the following:
<section id="abstract" class="introductory" property="dcterms:abstract" datatype="" typeof="bibo:Chapter" resource="#abstract" rel="bibo:chapter"><h2 aria-level="1" role="heading" id="h2_abstract">Abstract</h2>
<p>JSON is a useful data serialization and messaging format.
This specification defines JSON-LD, a JSON-based format to serialize
Linked Data. The syntax is designed to easily integrate into deployed
systems that already use JSON, and provides a smooth upgrade path from
JSON to JSON-LD.
It is primarily intended to be a way to use Linked Data in Web-based
programming environments, to build interoperable Web services, and to
store Linked Data in JSON-based storage engines.</p>
</section>
(Note that some of this is automatically generated, so there is other non-relevant boilerplate as well.)
This gives you one mechanism for describing the structure of a document using JSON-LD:
{
  "@id": "http://www.w3.org/TR/json-ld",
  "@type": "bibo:Document",
  "bibo:chapter": [
    { "@id": "#abstract" },
    { "@id": "#sotd" },
    { "@id": "#references" }
  ]
}
Note that in this case the JSON-LD is defined to have the same URI (URL) as the HTML document, so "#abstract" really expands to http://www.w3.org/TR/json-ld#abstract, thus providing you both a way to reference that section and an identifier for it. Much more is possible.
In fact, many W3C specifications are marked up in RDFa. Since both RDFa and JSON-LD are RDF formats, you can actually turn this document into JSON-LD with an appropriate tool, such as the RDF distiller I maintain. For example, try the following in your browser: http://rdf.greggkellogg.net/distiller?fmt=jsonld&in_fmt=rdfa&uri=http://www.w3.org/TR/json-ld/#abstract.
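Applied to the original question - linking paragraph sections to comments - a minimal sketch might look like the following. Everything here (the base URL, the choice of Schema.org terms, the data itself) is a hypothetical illustration, not a prescribed vocabulary:

```javascript
// Sketch: link paragraph sections to their comments with JSON-LD.
// The base URL, context terms, and data are hypothetical illustrations.
const doc = {
  "@context": {
    "schema": "http://schema.org/",
    "hasPart": "schema:hasPart",
    "comment": "schema:comment",
    "text": "schema:text"
  },
  "@id": "http://example.org/docs/42",
  "@type": "schema:Article",
  "hasPart": [
    {
      // the fragment identifies the paragraph, just like #abstract above
      "@id": "http://example.org/docs/42#para-1",
      "comment": [
        { "@id": "http://example.org/comments/7", "text": "Nice point." }
      ]
    }
  ]
};

// Each section and comment now has its own IRI, so comments can be
// stored, fetched, and merged independently of the document body.
console.log(doc.hasPart[0]["@id"]); // → "http://example.org/docs/42#para-1"
```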

Related

RESTful API HATEOAS

I've come to the conclusion that building a truly RESTful API, one that uses HATEOAS, is next to impossible.
Every piece of content I've come across either fails to illustrate the true power of HATEOAS
or simply does not explicitly mention the inherent pain points of HATEOAS's dynamic nature.
What I believe HATEOAS is all about:
From my understanding, a truly HATEOAS API should have ALL the information needed to interact with the API, and while that is possible, it is a nightmare to use especially with different stacks.
For example, consider a collection of resources located at "/books":
{
  "items": [
    {
      "self": "/book/sdgr345",
      "id": "sdgr345",
      "name": "Building a RESTful API - The unspoken truth",
      "author": "Elad Chen ;)",
      "published": 1607606637049000
    }
  ],
  // This describes every field needed to create a new book,
  // just like HyperText Markup Language (i.e. HTML) rendered on the server does with forms
  "create-form": {
    "href": "/books",
    "method": "POST",
    "rel": ["create-form"],
    "accept": ["application/x-www-form-urlencoded"],
    "fields": [
      { "name": "name", "label": "Name", "type": "text", "max-length": "255", "required": true },
      { "name": "author", "label": "Author", "type": "text", "max-length": "255", "required": true },
      { "name": "published", "label": "Publish Date", "type": "date", "format": "dd/mm/YY", "required": true }
    ]
  }
}
Given the above response, a client (such as a web app) can use the "create-form" property to render an actual HTML form.
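To make that concrete, here is a rough sketch of a client consuming the "create-form" control generically - nothing is hard-coded; the field names, target URL, and method all come from the response itself. (`buildRequest` is a hypothetical helper for this ad-hoc response shape, not part of any standard.)

```javascript
// Sketch: a generic client that drives the "create-form" control.
// It never hard-codes field names or the target URL.
function buildRequest(form, userInput) {
  // Validate required fields using the metadata in the control itself.
  for (const field of form.fields) {
    if (field.required && !(field.name in userInput)) {
      throw new Error(`Missing required field: ${field.name}`);
    }
  }
  // Encode as application/x-www-form-urlencoded, as the control advertises.
  const body = new URLSearchParams(userInput).toString();
  return { url: form.href, method: form.method, body };
}

// A trimmed-down version of the "create-form" control from the response.
const createForm = {
  href: "/books",
  method: "POST",
  fields: [
    { name: "name", required: true },
    { name: "author", required: true }
  ]
};

const req = buildRequest(createForm, { name: "RESTful API", author: "Elad" });
console.log(req.method, req.url); // → POST /books
```

If the server renames a field or moves the endpoint, only the response changes; this client keeps working.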
What value do we get from all this work?
The same value we've been getting from HTML for years.
Think about it, this is exactly what hypertext is all about, and what HTML has been designed for.
When a browser hits "www.pizza.com", it has no knowledge of the other paths a user can visit. It does not concatenate strings to produce a link to the order page ("www.pizza.com/order"); it simply renders anchors and navigates when a user clicks them. This is what allows web developers to change the path from "/order" to "/shut-up-and-take-my-money" without changing any client (browsers).
The above idea is also true for forms: browsers do not guess the parameters needed to order a pizza, they simply render a form and its inputs, and handle its submission.
I have seen too many lines of code in front-ends and back-ends alike that build strings like "https://api.com" + "/order". You don't see browsers do that, right?
The problems with HATEOAS
Given the above example (the "/books" response), in order to create a new book, clients are expected to parse the response to leverage the true power of this RESTful API; otherwise, they risk assuming what the names of the fields are, which of them are required, what their expected types are, and so on.
Now consider having two clients within your company that use this API: one for the web (browsers) written in JS, and another for mobile (say, an Android app) written in Java. They can be published as SDKs, hopefully giving 3rd-party consumers an easier integration.
But as soon as the API is used by clients outside your control - say, a 3rd-party developer with an affinity for Python who wants to create a new book - that developer is REQUIRED to parse such a response to figure out what the parameters are, their names, the URL to send inputs to, and so on.
In all my years of development, I have yet to come across an API like the one I have in mind.
I have a feeling this type of API is nothing more than a pipe dream, and I was hoping to understand whether my assumptions are correct, and what downfalls it brings, before starting the implementation phase.
P.S.
In case it's not clear, this is exactly what a HATEOAS-compliant API is all about: when the fields to create a book change, clients adapt without breaking.
On the Hypermedia Maturity Model (HMM), the example you give is at Level 0. At this level, you are absolutely correct about the problems with this approach. It's a lot of work and developers are probably going to ignore it and hard-code things anyway. However, with a generic hypermedia enabled media type, not only does all that extra work go away, it actually reduces the work for developers.
Let's take a step back for a moment and consider how the web works. There are three main components: the web server, the web browser, and the driver (usually a human user). The web server provides HTML, which the web browser executes to present a graphical user interface, which the driver can use to follow links and fill out forms. The browser uses the HTML to completely abstract from the driver all the details about how to present the form and how to send it over HTTP.
In the API world, this concept of the generic browser that abstracts away the media type and HTTP details hasn't taken hold yet. The only one I know about that is both active and high quality is Ketting. Using a browser like Ketting removes all of the extra work the developer would have to put into making use of all that hypermedia. A hypermedia browser is like the SDKs that API vendors often provide, except that it works for any API. In theory, you can link from one API to another completely unrelated API. APIs would no longer be islands; they would become a web.
The thing that makes hypermedia browsers possible are general purpose hypermedia enabled media types. HTML is of course the most successful and famous example, but there are JSON based media types as well. Some of the more widely used examples are HAL and Siren.
The higher a media type is on the Hypermedia Maturity Model, the more a generic browser can do to abstract away the media type, URIs, and HTTP details. Here's a brief explanation; check out the blog post linked above for more details and examples.
Level 0: At this level, hypermedia is encoded in an ad-hoc way. A browser can't do much with this because every API might encode things a little differently. At best a browser can use heuristics or AI to guess that something is a link or a form and treat it as such, but generally HMM Level 0 media types are intended for developers to read and interpret. This leads to many of the challenges you identified in your question. Examples: JSON and XML.
Level 1: At this level, links are a first class feature. The media type has a well defined way to represent a link. A browser knows unambiguously what is to be interpreted as a link and can provide an interface to follow that link without the user needing to be concerned about URI or HTTP. This is sufficient for read-only APIs, but if we need the user to provide input, we don't have a way to represent a form-like hypermedia control. It's up to a human to read the documentation or an ad-hoc form representation to know how to submit data. Examples: HAL, RESTful JSON.
Level 2: At this level, forms (or form-like controls) are a first class feature. A media type has a well-defined way of representing a form-like hypermedia control. A browser can use this media type to do things like build an HTML form, validate user input, encode user input into an acceptable media type, and make an HTTP request using the appropriate HTTP method. Let's say you want to change your API to support PATCH and you prefer that applications start using it over PUT. If you are using an HMM Level 2 media type, you can change the method in the representation, and any application that uses a hypermedia browser that knows how to construct a PATCH request will start sending PATCHes instead of PUTs without any developer intervention. Without an HMM Level 2 media type, you're stuck with those PUTs until you can get all the applications that use your API to update their code. Examples: HTML, HAL-Forms, Siren, JSON Hyper-Schema, Uber, Mason, Collection+JSON.
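The PUT-to-PATCH point can be sketched in a few lines. The control shape below is illustrative (loosely HAL-Forms-like), not a real library's API; the point is only that the client reads the method from the representation instead of hard-coding it:

```javascript
// Sketch: a Level 2 client takes the HTTP method from the hypermedia
// control, so a server-side switch from PUT to PATCH needs no client change.
function submit(control, payload) {
  return {
    method: control.method,   // whatever the server currently advertises
    url: control.target,
    body: JSON.stringify(payload)
  };
}

// Yesterday the server advertised PUT; today it advertises PATCH.
const yesterday = submit({ method: "PUT", target: "/books/1" }, { name: "X" });
const today = submit({ method: "PATCH", target: "/books/1" }, { name: "X" });
console.log(yesterday.method, today.method); // → PUT PATCH
```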
Level 3: At this level, in addition to hypermedia controls, data are also self-describing. Remember those three main components I mentioned? The last one, "driver", is the major difference between using hypermedia on the web and using hypermedia in an API. On the web, the driver is a human (excluding crawlers for simplicity), but with an API, the driver is an application. Humans can interpret the meaning of what they are presented with and deal with changes. Applications may act on heuristics or even AI, but usually they are following a fixed routine. If something changes about the data that the application didn't expect, the application breaks. At this level, we apply semantics to the data using something like JSON-LD. This allows us to construct drivers that are better at dealing with change and can even make decisions without human intervention. Examples: Hydra.
I think the only downside to choosing to use hypermedia in your API is that there aren't production-ready HMM Level 2 hypermedia browsers available in most languages. But the good news is that one implementation will cover any API that uses a media type it supports. Ketting will work for any API and any application written in JavaScript. It would only take a few more similar implementations to cover all the major languages, and choosing hypermedia would become easy.
The other reason to choose hypermedia is that it helps with the API design process. I personally use JSON Hyper-Schema to rapid-prototype APIs and use a generic web client to click through links and forms to get a feel for API workflows. Even if no one else uses the schemas, for me, it's worth having even if just for the design phase.
Implementing a HATEOAS API needs to be done both on the server and on the clients, so this point you make in the comments is very valid indeed:
changing a resource URI is risky given I don't believe clients actually "navigate" the API
Besides the World Wide Web, which is the best implementation of HATEOAS, I only know of the SunCloud API on the now-decommissioned Project Kenai. Most APIs out there don't really make use of HATEOAS; they are just a bunch of documented URLs where you can get or submit specific resources (basically, instead of being "hypermedia driven", they are "documentation driven"). With this type of API, clients don't actually navigate the API; they concatenate strings to go to specific endpoints where they know they can find specific resources.
If you expose a HATEOAS API, developers of client applications can still look at the links you return and may decide to build those URLs on their own, because they've figured out what the API is doing. They then think they can bypass any navigation that might be needed and go straight for the URL, because it is always /products/categories/123 - until, of course, it isn't anymore.
A HATEOAS API is more difficult to build and adds complexity to both the server and the clients, so when deciding to build one, the questions are:
do you need the flexibility HATEOAS is giving you to justify the extra complexity of the implementation?
do you want to make it easier or harder for clients to consume your API?
does the knowledge and discipline to make it all work exist on both sides (server developers and clients developers)?
Most of the time, the answer is no. In addition, many times the questions don't even get asked; instead, people end up with the approach that is more familiar, because they've seen the sort of APIs people build, have used some, or have built some before. Also, many times a REST API just sits in front of a database and does little more than expose data from the database as JSON or XML, so there's not much need for navigation there.
There is no one forcing you to implement HATEOAS in your API, and no one preventing you from doing so either. At the end of the day, it's a matter of deciding whether you want to expose your API this way or not (another example would be: do you expose your resources as JSON, XML, or let the client choose the content type?).
In the end, there is always the chance of breaking your clients when you make changes to your API (HATEOAS or no HATEOAS), because you don't control your clients and you can't control how knowledgeable the client developers are, how disciplined they are, or how good a job they do implementing someone else's API.

What's a `<script type='application/ld+json'>{jsonObj}</script>` in a `head` section do?

I got this link but didn't understand it well. I saw:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "http://website.com",
  "name": "wbs",
  "description": "Web Studio"
}
</script>
in a source code.
How does a code snippet like the one above in my website's header help me or my site?
In your example, the script element is used as a data block, which contains JSON-LD (type="application/ld+json").
JSON-LD is an RDF serialization. It allows you to publish Linked Data (or structured data) using JSON. In your example, the Schema.org vocabulary is used ("@context": "http://schema.org").
This structured data can be used by any interested consumer. Prominent consumers are the search engines Bing, Google, Yahoo, and Yandex, which support structured data that uses the vocabulary Schema.org. One use case they have is displaying more details in their result snippets.
Your example probably doesn’t lead to such an enhanced snippet. You have to check the search engine’s documentation if you want to know what features they offer and which structured data you have to provide for these. For example, Google uses the WebSite type (that’s used in your example) for their Sitelinks Search Box, but you would have to add a potentialAction in addition (for the search function).
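For reference, the pattern Google documents for the Sitelinks Search Box combines the WebSite type with a potentialAction of type SearchAction; the URLs below are placeholders, not working endpoints:

```
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://www.example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://query.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>
```

The target URL template tells the crawler how to construct a search URL for your site; the query-input property names the placeholder the user's query is substituted into.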
It gives Google and other crawlers structured data about a website. This is used for rich snippets and knowledge graph panels among others. Have a look at this site for more information: https://developers.google.com/search/docs/guides/intro-structured-data
That's one way to include structured data in your site, which helps users/crawlers use the information on the site in an efficient way. The most popular example is Google's news cards: that kind of card data actually comes from structured data.
Another way to include structured data is through Microdata.
At the time of asking this question, I had no idea about any of this; since then, I have worked on structured data for some publishers.
The snippet you got is a script containing data in the JSON-LD format, a method of encoding Linked Data using JSON. The Schema.org vocabulary is used to mark up web content so that it can be understood by the major search engines (Google, Microsoft, Yandex and Yahoo!). Search engines use this information to display relevant content to users. For instance, say you have a website with a well-known term as its brand name, e.g. "Coder". Search engines would interpret it as someone who writes code for software. To help search engines interpret this better, you need to provide the data using the Schema.org vocabulary.
e.g.
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "https://coder.com",
  "name": "Coder",
  "description": "Platform to learn code"
}
</script>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "http://website.com",
  "name": "wbs",
  "description": "Web Studio"
}
</script>
The snippet above is a JSON-LD based Structured Data Island (or Block) embedded in HTML that provides data to User Agents (Software Apps) for additional processing. This data can take the form of Metadata that informs said User Agents about the nature of the host document.
For instance, you can inform a User Agent such as Google's Crawler about the identity of a person associated with a document by embedding the following structured data island:
## JSON-LD Start ##
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://stackexchange.com/users/74272/kingsley-uyi-idehen#me",
  "mainEntityOfPage": "https://stackexchange.com/users/74272/kingsley-uyi-idehen",
  "sameAs": "https://stackexchange.com/users/74272/kingsley-uyi-idehen",
  "name": "Kingsley Uyi Idehen",
  "description": "#kidehen Identified at Web, relatively"
}
## JSON-LD End ##
This is possible because the semantics that underlie the schema:sameAs property deem it to be uniquely identifying.
You can also add a browser extension (e.g., our Structured Data Sniffer) to your existing browser(s) that understands structured data islands deployed this way, producing what's depicted in the attached screenshot.
I wrote this snippet so that when users search for your brand name on Google, a search box for your site can be displayed in the results.
Users only need to type your brand name for this search box to appear, and it is most commonly used on the homepage.
To use this code, copy the snippet and paste it at the bottom of the last line of the main content. Don't worry: the snippet will not be displayed to users and will only affect how your site appears in Google results.
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://coolernew.com",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://query.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>

Hypermedia Api - Presenting picklist data

I am creating a hypermedia api that conforms to the HAL spec
When a user submits a payment they need to specify what type of card they are using (Visa, Master Card etc)
So for a particular field that is submitted there is a specific list of values that can be used
How do I present that pick list to the user?
As embedded data?
Is there generally a way to associate a field with a given set of data?
I realise the HAL spec is very small and doesn't cover this issue specifically. But in general, for hypermedia APIs, how do people usually present this data?
Or should I simply explain the field in the CURIE link?
Thanks
You are right, HAL does not specifically cover this issue. You can solve it by essentially copying HTML: there are different widgets defined in HTML to present such things, for example a combobox with listed options.
You can define a media type that has similar controls in it, and you can define the processing model for that media type as well. It can of course be a JSON representation; it does not need to be XML.
For example
{
  ...
  "cardType": {
    "inputType": "select",
    "possibleValues": ["Visa", "MasterCard", ... ]
  }
  ...
}
There is no ready-made format that I know of unfortunately.
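For comparison, the HAL-Forms media type (mentioned in another answer on this page as an HMM Level 2 format) defines an options object on a form property for exactly this situation. A rough sketch, with illustrative values (the property names follow the HAL-Forms draft; the target URL and card list are hypothetical):

```
{
  "_templates": {
    "default": {
      "method": "POST",
      "target": "/payments",
      "properties": [
        {
          "name": "cardType",
          "prompt": "Card type",
          "required": true,
          "options": {
            "inline": ["Visa", "MasterCard", "American Express"],
            "maxItems": 1
          }
        }
      ]
    }
  }
}
```

A client that understands HAL-Forms can render this as a select control and validate that the submitted value is one of the inline options.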

RDFa reference documents

I have to implement a Google Custom Search for a website. The website has different content types. One of them is a "publication". A publication consists of different fields:
Title
Author
Published date
Document
Document type
Document is the URL to a PDF, Text, MS Word, etc. document, and Document type is, as you can expect, the document type (i.e. PDF, DOC, TXT, etc.).
I will need this information to be in the Rich Snippet because I have to format the search results differently for each document type (ie. include a different icon, etc).
What schema should I be using for that? I could not find information about how to structure data for that kind of content. Can I use anything from Schema.org? Or should I create my own? Any idea?
Thanks in advance for any input on that.
For customizing results snippets in Google Custom Search, it doesn’t matter which vocabulary you use in your RDFa. You could use an existing one (like Schema.org), or create your own, or use any combination of multiple vocabularies.
You can see the extracted structured data that can be used for this purpose in the Google Structured Data Testing Tool by clicking at "Custom Search Result Filters" (or by changing the results filter from "All data" to "Custom Search Engine").
You can fetch this structured data and create your own presentation layer.

JSON-LD Schema.org event info not being pulled into Google

I have a handful of WordPress websites that use The Events Calendar for displaying events that are open to the public.
I notice that if I type a city's name and then the word "events", our website is not being pulled into the special section that appears (Google uses its Knowledge Graph for this). Looking through the source code, I noticed that our sites use JSON-LD generated from the information entered for the event, one of the methods Google talks about using, but I don't understand why our site's information isn't being shown.
These sites have been up a year and get 3k visits a month so they're being indexed fairly regularly.
I was looking through the event's JSON-LD properties, and I noticed the entire event address (street, city, state, zip) gets put inside the name property of the Place or PostalAddress object (here's a screenshot of my site's schema). When I look up other events that are pulled into Google, they list those attributes in the address properties (screenshot of the other site's schema).
I think that because the address is put into the name property instead of the address property, Google might not be showing the events. Has anybody else seen this happen with their sites? Or is something else wrong with the sites we set up?
Right now your events are marked up following the Google example, but I believe that example is wrong:
https://developers.google.com/structured-data/rich-snippets/events
"location": {
  "@type": "Place",
  "sameAs": "http://www.hi-dive.com",
  "name": "The Hi-Dive",
  "address": "7 S. Broadway, Denver, CO 80209"
}
2019 edit: The markup and URL above have since changed and match what is expected from the testing tool.
In order for your site's structured data to match the other event you have a screenshot of, you will need to adjust your JSON-LD to the way it's presented on schema.org, which uses PostalAddress and narrows things down a little bit more:
https://schema.org/location (and https://schema.org/PostalAddress) - Click the JSON-LD example tabs
"location": {
  "@type": "Place",
  "name": "Withworth Institute",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Seattle",
    "addressRegion": "WA",
    "postalCode": "98052",
    "streetAddress": "20341 Whitworth Institute 405 N. Whitworth"
  },
  "url": "wells-fargo-center.html"
}
I can't say for certain whether this is the primary reason for your issue, but I do think you should follow the schema.org approach either way. Even the Structured Data Tool in your screenshots seems to indicate that it's looking for PostalAddress, even though Google doesn't use that in the example; perhaps that article is outdated.
I can confirm that a migration to JSON-LD from inline RDFa-style schema markup, which validates 100% using their new rich snippet validator tool, no longer shows review stars in search results. They've also taken away the ability to see stars validate using old-style RDFa schema validation.
This could be an issue with the search team not talking to the developers responsible for the structured data and schema tools, rolling out disjointed feature upgrades. Their recommended use of JSON-LD will likely have a negative impact on display in search in the near term if you'd like to see additional metadata populate in search results pages.
If metadata in search results is a firm requirement, you could roll back your JSON-LD module and use a module with the older RDFa or microdata implementation inline in your HTML. Hopefully this will be remedied soon.