Ontologies on linked open data cloud - sparql

I am currently working on the Linked Open Data cloud and would like to know whether it is possible to obtain the ontologies of the datasets in the LOD cloud.

Publishing in the Linked Open Data cloud is as easy as making your data publicly available for example as RDF, as HTML, via a SPARQL endpoint, ...
To spread awareness about your data, so that people can use it and link to it, describe your data according to Guidelines for Collecting Metadata on Linked Datasets in CKAN and add it to datahub.io (a canonical CKAN installation).
For an intro to publishing open data, have a look at Richard Cyganiak's presentation. A comprehensive guide is the book Linked Data: Evolving the Web into a Global Data Space.
Linked Open Data normally contains both ontologies and instance data. I don't think the ontologies of LOD datasets are available separately. There are various ontology repositories (see "Where can I find useful ontologies?" and the ontology repositories list at W3C), but as far as I know none of them is really comprehensive or well maintained.
For most datasets, you should be able to get the ontology by dereferencing its namespace URL. For example if I take the arbitrary LOD dataset http://datahub.io/dataset/oferta-empleo-zaragoza/resource/6704184d-42e3-4778-a576-826f1d3672e2 I can see that it starts with:
<rdf:RDF
xmlns:j.0="http://purl.org/dc/terms/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:j.1="http://purl.org/ctic/empleo/oferta#" >
So if you point your browser to http://purl.org/ctic/empleo/oferta# it will redirect to http://data.fundacionctic.org/vocab/empleo/oferta.html which is the HTML description of the ontology which also links to an RDF representation: http://data.fundacionctic.org/vocab/empleo/oferta_20130903.rdf
The namespace URL has content negotiation configured, so you can get the ontology in RDF directly by setting the header "Accept: application/rdf+xml" on the GET request:
curl -LH "Accept: application/rdf+xml" http://purl.org/ctic/empleo/oferta#
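The same lookup can be scripted. Below is a minimal sketch in Python, standard library only: it lists the namespace declarations of an RDF/XML document and builds a GET request carrying the Accept header from the curl command above. The function names are made up for illustration, and whether a given namespace actually serves RDF depends on how its publisher configured content negotiation.

```python
import io
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag
from urllib.request import Request, urlopen


def namespaces_in_rdfxml(rdfxml: str) -> dict:
    """Return {prefix: namespace URL} for every xmlns declaration found."""
    source = io.BytesIO(rdfxml.encode("utf-8"))
    events = ET.iterparse(source, events=("start-ns",))
    return {prefix: uri for _event, (prefix, uri) in events}


def build_ontology_request(namespace_url: str, accept: str = "application/rdf+xml") -> Request:
    """Build a GET request asking the namespace URL for an RDF representation."""
    # The fragment ("#...") is never sent to the server, so drop it up front.
    url, _fragment = urldefrag(namespace_url)
    return Request(url, headers={"Accept": accept})


def fetch_ontology(namespace_url: str, timeout: float = 30.0) -> str:
    """Dereference the namespace URL; urlopen follows redirects like curl -L."""
    with urlopen(build_ontology_request(namespace_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_ontology("http://purl.org/ctic/empleo/oferta#") needs network access and only returns RDF as long as the publisher keeps the redirect and content negotiation in place.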

Related

Python / rdflib HTTP server for sparql endpoint

Is there a capability for, or an example of, creating a SPARQL HTTP endpoint with rdflib? We would want it to follow the spec and be able to return JSON and/or CSV formats. This would mostly be for POC usage. It would also be possible to use JavaScript/Node.
Thanks!
You might try https://github.com/rdflib/pyLDAPI. It's been touched much more recently than https://github.com/RDFLib/rdflib-web and there are some public examples of it to follow, e.g. https://geofabricld.net. Also, the SKOS-specific tool VocPrez uses it under the hood.
As of earlier this year, pyLDAPI implements the W3C's Content Negotiation by Profile specification which is, I suppose, the latest and greatest Linked Data API-relevant specification, although it's not just for Linked Data APIs.
Feel free to contact me directly if you need more of a hand with this.
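Whichever server library you pick, the part of "following the spec" that concerns JSON and CSV is content negotiation: the SPARQL 1.1 Protocol lets the client name the result format in the Accept header. Here is a rough sketch of that step, standard library only. The function name and the mapping to short format names are my own illustration, not part of pyLDAPI or rdflib, and the Accept parsing is simplified.

```python
# Media types defined for SPARQL SELECT/ASK results, mapped to short format names.
RESULT_FORMATS = {
    "application/sparql-results+json": "json",
    "text/csv": "csv",
    "application/sparql-results+xml": "xml",
}


def choose_result_format(accept_header, default="application/sparql-results+json"):
    """Pick the best supported media type from an HTTP Accept header.

    Returns (media_type, format_name), or None if nothing acceptable is supported.
    """
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort key: highest quality first, then order of appearance.
        candidates.append((-quality, position, media_type))

    if not candidates:  # empty Accept header: fall back to the default
        return default, RESULT_FORMATS[default]

    for neg_quality, _position, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 means "not acceptable"
        if media_type in RESULT_FORMATS:
            return media_type, RESULT_FORMATS[media_type]
        if media_type in ("*/*", "application/*"):
            return default, RESULT_FORMATS[default]
    return None
```

A None result would map to an HTTP 406 response. With rdflib, I would expect the short name to be what you pass as the format argument when serializing a query result, but check that against the rdflib version you use.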

How to restrict publishing of RDF graphs on the Semantic Web?

I am trying to create a sample ontology with some dummy data using protege 5.5. But in the owl file it generated, it is showing something like this:
<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.semanticweb.org/hs/ontologies/2019/3/untitled-ontology-3#"
xml:base="http://www.semanticweb.org/hs/ontologies/2019/3/untitled-ontology-3"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<owl:Ontology rdf:about="http://www.semanticweb.org/hs/ontologies/2019/3/untitled-ontology-3"/>
It seems like this data can be accessed publicly (http://www.semanticweb.org/hs/ontologies/2019/3/untitled-ontology-3). I don't wish to publish my data on the Semantic Web. Is there any way to keep this data private? I could not find the answer on the web.
No, just because a URI shows up in an OWL or RDF file doesn't mean the data is publicly accessible. A local file on your computer is just a local file, until you upload it to a server somewhere.
OWL and RDF use URIs mainly just as identifiers, that is, as names that allow different programs and people to work out whether they are talking about the same thing. So, if your ontology and my ontology use the same URI for some entity, we know that we are talking about the same entity. This doesn't mean your ontology or my ontology is publicly accessible, and it works even if we keep both ontologies private.
By convention, the owner of a URI gets to decide what entity to use a URI for. This ensures that there are no accidental clashes. Ownership of URIs is based on domain names. For example, the owner of the domain dbpedia.org (the DBpedia project) has decided that http://dbpedia.org/resource/London is a URI that names the city of London. They also happen to publish some data about London at that URI, which is a good way of letting the world know what the URI identifies.
Protégé is actually a bad citizen of the web by encouraging people to use URIs on a domain they don't own (www.semanticweb.org).
If you don't own a domain, you can use http://example.org/ for local experiments and private use, because that domain is explicitly allowed to be used by anyone. But if you actually decide to publish your ontology/data at some point, then you should change to a real domain.
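If you want to move an ontology that Protégé already generated onto example.org, one blunt but workable approach is a textual replace of the base IRI. A small sketch follows. The function name and the target IRI http://example.org/my-ontology are made up for illustration, and a plain string replace assumes the old base does not appear in the file for any other reason.

```python
def rebase_ontology(owl_text: str, old_base: str, new_base: str) -> str:
    """Swap one base IRI for another everywhere it occurs in an OWL/RDF file."""
    return owl_text.replace(old_base, new_base)


old = "http://www.semanticweb.org/hs/ontologies/2019/3/untitled-ontology-3"
new = "http://example.org/my-ontology"  # hypothetical private-use IRI

# A cut-down version of the header from the question, closed so it is valid XML.
header = (
    '<rdf:RDF xmlns="' + old + '#"\n'
    '     xml:base="' + old + '"\n'
    '     xmlns:owl="http://www.w3.org/2002/07/owl#">\n'
    '    <owl:Ontology rdf:about="' + old + '"/>\n'
    '</rdf:RDF>\n'
)
rebased = rebase_ontology(header, old, new)
```

Changing the ontology IRI inside Protégé itself achieves the same thing without touching the file by hand.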

How do I find data sets using a particular Schema.org entity?

I am trying to decide whether to use schema.org entities in my own open source app, for potential compatibility with existing open data sets. So I'm looking for usage of relevant schema.org entities "in the wild".
Right now I'm looking for dietary supplement data, i.e. http://schema.org/DietarySupplement or http://health-lifesci.schema.org/DietarySupplement
I've been searching for semantic web search engines, and have only found Swoogle, but I get no results for that URI, or "service temporarily unavailable".
The DietarySupplement page on schema.org says that "between 10 and 100" domains are using this entity. Is that talking about DNS, abstract domains that are defined on Schema.org, abstractions defined elsewhere, or something else?
There are only a couple of other resources I can find on this subject.
Web Data Commons - RDFa, Microdata, and Microformat Data Sets
BuiltWith trends - Microdata Usage Statistics

DBpedia SPARQL endpoints for past versions

Are there any DBpedia SPARQL endpoints that allow querying past versions of DBpedia?
I don't think that endpoints for different versions of the data exist. At the very least, the DBpedia wiki article, Accessing the DBpedia Data Set over the Web, doesn't seem to mention any, and I think that would be the authoritative resource. However, the Downloads page does list a bunch of old datasets that you can download, so if you have a fair amount of space to set up a SPARQL endpoint (e.g., with Jena Fuseki or Virtuoso), you can set one up for yourself.
Another option is to use the Triple Pattern Fragments endpoints for the previous DBpedia versions. Those are provided by the Linked Data Fragments Archive.
Here are the endpoints:
http://fragments.mementodepot.org/dbpedia_2_0
http://fragments.mementodepot.org/dbpedia_3_0
http://fragments.mementodepot.org/dbpedia_3_1
...
http://fragments.mementodepot.org/dbpedia_3_9
http://fragments.mementodepot.org/dbpedia_2014
http://fragments.mementodepot.org/dbpedia_2015
You can use the Linked Data Fragments client with one of the above endpoints as the data source to query a previous version of DBpedia.
Please note that there are some limitations on which SPARQL features the LDF client supports.
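If you only need single triple patterns, you can also request fragments over plain HTTP. Here is a sketch that builds such a request URL. It assumes the archive uses the subject, predicate, and object query parameters common to Triple Pattern Fragments servers (a well-behaved client reads the parameter names from the hypermedia controls in each response instead), and the function name is my own.

```python
from urllib.parse import urlencode


def fragment_url(endpoint, subject=None, predicate=None, obj=None):
    """Build the URL of the fragment matching one triple pattern.

    Unset positions act as wildcards and are left out of the query string.
    """
    # Assumed parameter names; a real client should discover them from the server.
    pattern = {"subject": subject, "predicate": predicate, "object": obj}
    query = urlencode({name: value for name, value in pattern.items() if value is not None})
    return endpoint + ("?" + query if query else "")


url = fragment_url(
    "http://fragments.mementodepot.org/dbpedia_3_9",
    subject="http://dbpedia.org/resource/London",
)
```

Fetching that URL with an Accept header such as text/turtle should return the matching triples plus paging metadata, provided the server is reachable.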

Semantic store and entity hub

I am working on a content platform that should provide semantic features such as querying with SPARQL and providing rdf documents for the contained content.
I would be very thankful for some clarification on the following questions:
Did I get that right, that an entity hub can connect several semantic stores to a single point of access? And if not, what is the difference between a semantic store and an entity hub?
What frameworks would you use to store content documents as well as their semantic annotation?
It is important for the solution to be able to later on retrieve the document (html page / docs such as pdf, doc,...) and their annotated version.
Thanks in advance,
Chris
The only Entityhub I know of belongs to the Apache Stanbol project. Here is a paragraph from the original documentation explaining what the Entityhub does:
The Entityhub provides two main services. The Entityhub provides the
connection to external linked open data sites as well as using indexes
of them locally. Its services allow to manage a network of sites to
consume entity information and to manage entities locally.
Entityhub documentation:
http://incubator.apache.org/stanbol/docs/trunk/entityhub.html
The Enhancer component of Apache Stanbol extracts external entities related to the submitted content, using the linked open data sites managed by the Entityhub. These enhancements are expressed as RDF data. It is also possible to store the content items in Apache Stanbol and run SPARQL queries over the RDF enhancements. The Contenthub component of Apache Stanbol additionally provides faceted search over the submitted content items.
Documentation of Apache Stanbol:
http://incubator.apache.org/stanbol/docs/trunk/
Access to running demos:
http://dev.iks-project.eu/
You can also send further questions to stanbol-dev AT incubator.apache.org.
Alternative suggestion...
Drupal 7 has built-in RDFa support for annotation and is more of a general-purpose CMS than Semantic MediaWiki.
In more detail...
I'm not really sure what you mean by entity hub. Where did you get that definition, or what do you mean by it?
Yes, one can easily write a system that connects to multiple semantic stores. Given the context of your question, I assume you are referring to RDF triple stores?
Any decent CMS should assign each document some form of unique, persistent ID, so even if the system you go with does not support semantic annotation natively, you could build your own extension for it. The extension would simply store annotations against the document's ID in whatever storage layer you choose (I'd assume a triple store would be appropriate), and you can then build query and presentation layers for querying and viewing this data as required.
http://semantic-mediawiki.org/wiki/Semantic_MediaWiki
Apache Stanbol
Do you want to implement a traditional CMS extended with some semantic capabilities, or do you want to build a semantic CMS? They could look the same, but they are actually two completely opposite approaches.
It is important for the solution to be able to later on retrieve the document (html page / docs such as pdf, doc,...) and their annotated version.
You can integrate Apache Stanbol with a JCR/CMIS compliant CMS like Alfresco. To get custom annotations, I suggest creating your own custom enhancement engine (maven archetype) based on your domain and adding it to the enhancement engine chain.
https://stanbol.apache.org/docs/trunk/components/enhancer/
Once this is done, you can use the REST API endpoints provided by Stanbol to retrieve the results in RDF/Turtle format.
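As a rough illustration of that last step, the sketch below builds a POST request that sends plain text to the Stanbol Enhancer and asks for Turtle back. The base URL http://localhost:8080 assumes a locally running default Stanbol launcher, and the function names are made up; adjust both to your deployment.

```python
from urllib.request import Request, urlopen


def build_enhancer_request(text, base_url="http://localhost:8080"):
    """Build a POST request to the Enhancer endpoint, asking for Turtle."""
    return Request(
        base_url.rstrip("/") + "/enhancer",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=UTF-8", "Accept": "text/turtle"},
        method="POST",
    )


def enhance(text, base_url="http://localhost:8080", timeout=60.0):
    """Send the text to Stanbol and return the RDF enhancements as Turtle."""
    with urlopen(build_enhancer_request(text, base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling enhance("Paris is the capital of France.") only works against a running Stanbol instance. The Turtle it returns can then be loaded into whatever triple store holds your annotations.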