How do I query the RDFS file in Virtuoso? - sparql

The URI "https://www.w3.org/2000/01/rdf-schema#" points to an RDF file containing triples, prefixes, and other things.
So if I run this query, it should return the URIs of all the "things" that are of type Class:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select *
from <http://www.w3.org/2000/01/rdf-schema#>
where
{
?o rdf:type rdfs:Class .
} LIMIT 100
But running it in Virtuoso gives me an error.
I'm new to SPARQL and RDF, so it could be that I'm saying something completely wrong.

In most RDF database systems ("triplestores"), such as Virtuoso, the URL you put in the FROM clause is not used to retrieve a file from the Web and read its contents. Instead, you usually have a database in which you previously loaded some RDF(S) data yourself, and the FROM clause in your query is used to identify a subset in that data (a so-called "named graph").
In order for your query to work, you should load the file at http://www.w3.org/2000/01/rdf-schema# into your Virtuoso store, making sure it uses the file location as the named graph identifier as well. You can use a SPARQL update command for this:
LOAD <http://www.w3.org/2000/01/rdf-schema#> INTO GRAPH <http://www.w3.org/2000/01/rdf-schema#>
After you've done that, the query should return results.
There are some SPARQL engines that do offer automatic retrieval of remote data like you expected (I know Eclipse RDF4J and Apache Jena Fuseki have options for this - not sure about Virtuoso) but it's not the 'typical' way to do SPARQL querying.
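The named-graph idea above can be illustrated with a toy in-memory store. This is only a sketch in plain Python, not how Virtuoso is implemented: the `store` dictionary, the `load` and `select_subjects` helpers, and the three hand-copied RDFS triples are all assumptions made for illustration.

```python
# Toy quad store: graph IRI -> set of (subject, predicate, object) triples.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

store = {}  # starts empty: nothing is fetched from the Web automatically


def load(store, graph_iri, triples):
    """Mimic LOAD <source> INTO GRAPH <graph_iri>, given already-parsed triples."""
    store.setdefault(graph_iri, set()).update(triples)


def select_subjects(store, graph_iri, predicate, obj):
    """Mimic SELECT ?s FROM <graph_iri> WHERE { ?s <predicate> <obj> }."""
    triples = store.get(graph_iri, set())  # unknown graph name -> no data
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


# Before loading, the graph name matches nothing in the store.
before = select_subjects(store, RDFS, RDF + "type", RDFS + "Class")

# A small excerpt of the RDFS vocabulary, standing in for the real file.
load(store, RDFS, {
    (RDFS + "Resource", RDF + "type", RDFS + "Class"),
    (RDFS + "Literal", RDF + "type", RDFS + "Class"),
    (RDFS + "label", RDF + "type", RDF + "Property"),
})

after = select_subjects(store, RDFS, RDF + "type", RDFS + "Class")
print(before)
print(after)
```

The same query goes from an empty result to two classes once the data has been loaded under the graph name used in the FROM clause.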

MarkLogic seems not to follow the RDF specification

Write the following RDF data into MarkLogic:
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://John> <http://have> 10 .
<http://Peter> <http://have> "10"^^xsd:double .
Then execute the following query:
SELECT *
WHERE { ?s ?p 10. }
The query result is:
?s              ?p
<http://John>   <http://have>
<http://Peter>  <http://have>
According to the RDF specification, two literals are the same RDF term only if both the lexical form and the datatype are the same:
Literal term equality: Two literals are term-equal (the same RDF literal) if and only if the two lexical forms, the two datatype IRIs, and the two language tags (if any) compare equal, character by character. Thus, two literals can have the same value without being the same RDF term.
https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal
It seems that MarkLogic does not follow the RDF specification?
The expected query result is:
?s             ?p
<http://John>  <http://have>
Alternatives
Executing the SPARQL query in Query Console and through the Java Client API both give the unexpected result.
The same query returns the expected result in Apache Jena and RDF4J.
Can someone give me an answer or a hint about it?
A triple store that does that is not automatically wrong.
Try:
{ ?s ?p ?x . FILTER (sameTerm(?x, 10)) }
and also
{ ?s ?p ?x . FILTER (?x = 10) }
Also try with 010, +10, and "010"^^xsd:integer, and explore the lexical forms:
FILTER(str(?x) = "10")
The use cases for both value-based matching and term-based matching exist.
The SPARQL query specification does not say whether the data loaded is exactly the data queried (by design). Several stores apply some form of value processing by default. Jena does so for TDB2 but keeps datatypes apart (TDB1's conflation of datatypes was less popular).
It depends on the use case they are targeting. If you want a particular effect, use a FILTER form of the query, assuming the data was not pre-processed on loading.
See the answer on Unexpected data writing result in MarkLogic
See https://www.w3.org/TR/sparql11-entailment/#CanonicalLit for the discussion that D-entailment gives many, many answers even with a restriction to canonical lexical forms.
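The difference between term-based and value-based matching can be sketched in a few lines of Python. This is a simplified model, not MarkLogic's or Jena's actual behaviour: literals are represented as hypothetical `(lexical form, datatype IRI)` pairs, language tags are ignored, and `same_value` only handles three numeric datatypes.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC = {XSD + "integer", XSD + "decimal", XSD + "double"}


def same_term(a, b):
    """Term equality: lexical form and datatype IRI must both match exactly."""
    return a == b


def same_value(a, b):
    """Value equality (simplified): compare the numbers the literals denote."""
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    if dt_a in NUMERIC and dt_b in NUMERIC:
        return Decimal(lex_a) == Decimal(lex_b)
    return a == b  # fall back to term equality for everything else


ten_int = ("10", XSD + "integer")      # what an unquoted 10 denotes
ten_double = ("10", XSD + "double")    # "10"^^xsd:double
padded_int = ("010", XSD + "integer")  # "010"^^xsd:integer

print(same_term(ten_int, ten_double), same_value(ten_int, ten_double))
print(same_term(ten_int, padded_int), same_value(ten_int, padded_int))
```

A store that matches the pattern `?s ?p 10` by value behaves like `same_value` and returns both John and Peter; one that matches by term behaves like `same_term` and returns only John.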

Why different endpoints do not query same datasets?

I would like to query datasets such as FOAF and DBpedia. The aim is to run quite simple requests such as "Which paintings did Magritte paint?" or "Which American actors played in American movies?"
So I wrote my queries and used DBpedia SnorQL to run them. Then, for other reasons, I tried Live DBpedia and OpenLinks demo.openlinksw.com, and discovered that the results differed depending on the endpoint.
Here are two examples:
1) Returns results with DBpedia SnorQL, but not with Live DBpedia or OpenLinks demo.openlinksw.com
#works of Magritte
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT * WHERE {
?person a dbo:Artist .
?person foaf:surname "Magritte"@en .
?work dbo:author ?person .
OPTIONAL {?work dbp:year ?year ; dbo:museum ?museum .}
}
ORDER BY ?year
2) Returns results with Live DBpedia, but not with DBpedia SnorQL or OpenLinks demo.openlinksw.com
#american actors from Willem Robert van Hage R tutorial
SELECT ?actor ?movie ?director ?movie_date
WHERE {
?m dc:subject <http://dbpedia.org/resource/Category:American_films> .
?m rdfs:label ?movie .
FILTER(LANG(?movie) = "en")
?m dbp:released ?movie_date .
FILTER(DATATYPE(?movie_date) = xsd:date)
?m dbp:starring ?a .
?a rdfs:label ?actor .
FILTER(LANG(?actor) = "en")
?m dbp:director ?d .
?d rdfs:label ?director .
FILTER(LANG(?director) = "en")
}
LIMIT 1000
I thought an endpoint was simply a tool for querying a dataset, whatever it is. So I assumed you could query DBpedia and FOAF from DBpedia, Live DBpedia, or openlinks demo.openlinksw.com.
I have read that different endpoints actually use different datasets, but I can't see why, since you give a specific URI to reach.
Why does the same query return different results depending on the SPARQL endpoint?
Much like different instances of SQL DBMS (such as [in no particular order and without implication of endorsement] OpenLink Virtuoso, Oracle, MySQL, Informix, SQL Server, Sybase, DB2, PostgreSQL, Ingres, Progress OpenEdge, and many others) hold different data, different instances (read: SPARQL endpoints) of RDF RDBMS, also known as Quadstores or Triplestores (such as [in no particular order and without implication of endorsement] OpenLink Virtuoso, AllegroGraph, Stardog, Neo4J, MarkLogic, and many others) hold different data.
You cannot query Joe's database in DBMS A through Fred's database in DBMS B -- unless someone has already told Fred's database and/or DBMS about Joe's database and/or DBMS (e.g., VDBMS functionality), or you include some information about Joe's database and/or DBMS in your query (e.g., SPARQL Federation), etc.
(A "DBMS" is a Database Management System, such as listed above. A "database" is a collection of data, typically stored in a [large] document, which is managed by a DBMS.)
Of particular note relative to your question --
FOAF is an ontology, a vocabulary, which is used to describe entities.
DBpedia is a dataset (which has had various versions over time), and a project, and an organization, and various other things (the ambiguity of literal identifiers!).
OpenLink Software (not "openlinks") is a company which produces OpenLink Virtuoso, among other data-related software and services, and which provides a number of live endpoints on the web -- including the main DBpedia endpoint. (ObDisclaimer: OpenLink Software is also my employer.)

SPARQL Query of FOAF doesn't return any result

I want to get some information from the FOAF ontology. I tried the following SPARQL query, but it returns no results. I tried this query to get familiar with FOAF, but what I really want to do is to find all the people that a particular person ?x knows (using the property foaf:knows). How do I do this?
PREFIX foaf:<http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
?x foaf:mbox ?mbox .
}
The Semantic Web is made of different components.
Knowledge is represented as RDF triples. These triples describe resources using a subject-predicate-object structure. For example, "John is a Male" may be represented as an RDF triple.
On top of RDF, we may use RDFS and OWL to specify restrictions and other information about the data. With RDFS, I can specify that "Male is a subclass of Person", which makes it possible to infer that "John is a Person". RDFS and OWL help to define ontologies. An ontology is a vocabulary (general or domain-specific) for representing data. For example, I may want to create an ontology CAT to represent data on cats.
In that case, I would create my CAT vocabulary defining that "Cat is a subclass of Animal" and "hasOwner is a property that links a cat to a Person" and some other properties. Then, I am able to instantiate some individuals to create data on cats. For example by saying that "Baccara is a Cat" and "Baccara hasOwner John".
FOAF is basically a vocabulary for representing data about people, and especially the links between them. The FOAF vocabulary provides properties and classes that make it easy to handle information about people. But it doesn't provide any such information itself, only the "structure"/"model"/"schema" to organize it.
There are no individuals in the FOAF dataset. Since it contains no people, it is normal that your query returns nothing.
You may want to build your own RDF dataset based on the FOAF vocabulary. To do so, you can try a tool like Protégé, or simply use a text editor if you're familiar with RDF/XML or Turtle.
Otherwise, if you only need to get familiar with FOAF, you can query the model. For example, you may want to get all the subclasses of Agent:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT distinct ?c
WHERE { ?c rdfs:subClassOf foaf:Agent }
I recommend reading a bit about the Semantic Web components (especially RDF and RDFS, and the differences between them) before going any further with FOAF. Also, a nice exercise for learning SPARQL is querying DBpedia: http://dbpedia.org/sparql.
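To make the vocabulary/data distinction concrete, here is a rough Python sketch. The two vocabulary triples are an excerpt of FOAF, while the `people` dataset, the example.org IRIs, and the `objects` and `names_known_by` helpers are invented for illustration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Vocabulary triples (excerpt): they describe classes, not actual people.
vocabulary = {
    (FOAF + "Person", SUBCLASS_OF, FOAF + "Agent"),
    (FOAF + "Organization", SUBCLASS_OF, FOAF + "Agent"),
}

# Hypothetical instance data that *uses* the vocabulary.
ALICE = "http://example.org/alice"
BOB = "http://example.org/bob"
people = {
    (ALICE, FOAF + "name", "Alice"),
    (BOB, FOAF + "name", "Bob"),
    (ALICE, FOAF + "knows", BOB),
}


def objects(triples, subject, predicate):
    """All ?o such that (subject, predicate, ?o) is in the triple set."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def names_known_by(triples, person):
    """Names of everyone `person` foaf:knows, like ?x foaf:knows ?y . ?y foaf:name ?n."""
    names = []
    for friend in objects(triples, person, FOAF + "knows"):
        names.extend(objects(triples, friend, FOAF + "name"))
    return sorted(names)


# Asking the vocabulary for names finds nothing; asking the data works.
names_in_vocabulary = [o for s, p, o in vocabulary if p == FOAF + "name"]
print(names_in_vocabulary)
print(names_known_by(people, ALICE))
```

Only the second triple set contains individuals, so only there does a foaf:name or foaf:knows pattern have anything to match.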

Sesame not inferencing owl:sameAs

I have some data on vaccines in my Sesame triplestore. To the same store, I added additional data about vaccines from DBpedia.
<http://dbpedia.org/resource/Rotavirus_vaccine>
dbpedia2:routesOfAdministration "oral"@en
To specify that a particular vaccine in my native data is the same entity as the subject of the imported data from DBpedia, I inserted an owl:sameAs statement linking the two entities.
my_ns:Rota owl:sameAs <http://dbpedia.org/resource/Rotavirus_vaccine> .
Though that single triple has been added, I find no additional inferencing. For instance, I want this query to give me the route of administration of the vaccine in my native data by inferencing the property of the vaccine entity in DBpedia:
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX my_ns: <http://purl.org/net/ontology/my_ns/>
select ?roa where
{my_ns:Rota dbpedia2:routesOfAdministration ?roa}
At present, executing the query doesn't yield any results. I'd like the system to infer the following as the output of the query above:
my_ns:Rota dbpedia2:routesOfAdministration "oral"@en .
I installed GraphDB-Lite(OWLIM) by replacing the war files and verified that owl:sameAs works by executing a query on DBpedia.
The Sesame in-memory and native stores do not support OWL reasoning out of the box. They do offer (optional) support for RDFS reasoning (so understanding rdfs:subClassOf etc), which can be enabled at repository creation time (in the workbench, this is the dropdown option 'Memory/Native Store RDF Schema'). However, owl:sameAs is of course not part of RDFS reasoning.
Sesame also supports a custom graph query reasoner on top of the memory or native stores. This custom reasoner can be configured with your own inference rule, formulated as a combination of two SPARQL CONSTRUCT queries: a 'rule' query that expresses the actual inference rule, and a 'match' query that is used to do maintenance on the inferred statements when the store is updated. More explanation on how to set this up can be found in the section on Repository creation in Programming with Sesame. The option in the Workbench is "Memory/Native store Custom Graph Query Inference".
In the case of owl:sameAs, a custom rule to support it would look roughly like this:
CONSTRUCT { ?s1 ?p1 ?o1 . ?o1 ?p2 ?o3 }
WHERE {
?o1 owl:sameAs ?o2 .
OPTIONAL { ?s1 ?p1 ?o2 . }
OPTIONAL { ?o2 ?p2 ?o3 . }
}
If your goal is purely to have owl:sameAs reasoning, this might be a simple way to enable it. However, for more comprehensive OWL reasoning support, the custom reasoner is not sufficiently powerful or scalable. Instead, you should use a Sesame backend store that has built-in support for it, such as Ontotext GraphDB (formerly known as OWLIM).
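What the CONSTRUCT rule above computes can be sketched outside Sesame in plain Python. This is a single-pass illustration on a set of tuples, not Sesame's reasoner: the `apply_same_as_rule` function and the string representation of the literal are assumptions made for the example.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
ROTA = "http://purl.org/net/ontology/my_ns/Rota"
DBP_ROTA = "http://dbpedia.org/resource/Rotavirus_vaccine"
ROUTES = "http://dbpedia.org/property/routesOfAdministration"


def apply_same_as_rule(triples):
    """One pass of the rule: for ?o1 owl:sameAs ?o2, copy statements about ?o2 to ?o1."""
    inferred = set()
    for o1, p, o2 in triples:
        if p != OWL_SAME_AS:
            continue
        for s, p2, o in triples:
            if o == o2:
                inferred.add((s, p2, o1))  # ?s1 ?p1 ?o2  =>  ?s1 ?p1 ?o1
            if s == o2:
                inferred.add((o1, p2, o))  # ?o2 ?p2 ?o3  =>  ?o1 ?p2 ?o3
    return inferred - triples  # keep only statements that are new


triples = {
    (DBP_ROTA, ROUTES, '"oral"@en'),
    (ROTA, OWL_SAME_AS, DBP_ROTA),
}
inferred = apply_same_as_rule(triples)
for triple in sorted(inferred):
    print(triple)
```

Like the rule it mirrors, this only copies statements in one direction (from the object of owl:sameAs to its subject) and would need repeated passes to reach a fixpoint on larger data.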
Solved the problem. The issue was the absence of GraphDB-Lite (formerly OWLIM-Lite). I was under the impression I had it installed by replacing the .war files. However, the absence of OWLIM-Lite option in the drop down while creating a new repository indicated that it had not been installed.
When I initially checked whether owl:sameAs queries were working, I used the SERVICE clause in SPARQL to query DBpedia. Because DBpedia supports owl:sameAs, those queries succeeded, but I was essentially querying outside Sesame.
I solved the problem by deleting the old .war files and their corresponding folders in Tomcat, and copying the .war files from GraphDB distribution. When running the server for the first time after copying the files, the corresponding folders (openrdf-sesame and openrdf-workbench) are auto-generated. While creating a repository, the OWLIM-Lite option is then available.
I created an OWLIM-Lite repository and added the triples there. The owl:sameAs inferencing then started working and the query in the question was successfully executed.

SPARQL query returns no results when a property path is present

The following query returns some results which have skos:broader set as category:History
select ?subject
where
{
?subject skos:broader category:History .
}
However, replacing skos:broader with skos:broader+ or skos:broader* returns no results. Why is this? I would expect either to fetch at least the results returned in the first query.
I'm using the SPARQL front end here: http://dbpedia.org/sparql
Virtuoso (the endpoint that DBpedia uses) has some idiosyncrasies, supports some non-standard syntax (which often leads people to wonder why a query that worked on DBpedia doesn't work with other libraries), and (I think) doesn't support all of SPARQL 1.1. This may be a case where you've run into some internal limitations. You can approximate the results that you want with a query like the following, though:
select ?category { ?category skos:broader{,7} category:History }
This only follows paths of length seven or less. The {m,n} notation for property paths isn't part of SPARQL 1.1, but was in early drafts, and Virtuoso supports it. It is convenient for limiting the resources used in answering a query, and this is a good use case for it.
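The effect of a bounded path such as skos:broader{,7} can be pictured as a depth-limited breadth-first search. The Python sketch below is an illustration only: the `within_hops` function and the tiny category graph are made up, and it assumes that a zero-length path matches the target itself, which is worth checking against the engine you use.

```python
from collections import deque


def within_hops(broader_edges, target, max_hops):
    """Categories that reach `target` by following 0..max_hops skos:broader links."""
    # Index the edges in reverse: broader category -> its narrower categories.
    narrower = {}
    for child, parent in broader_edges:
        narrower.setdefault(parent, set()).add(child)

    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop budget spent: do not expand this node further
        for child in narrower.get(node, ()):
            if child not in seen:  # also protects against cycles
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


# Hypothetical (narrower, broader) pairs, including a cycle back to History.
edges = [
    ("A", "History"),
    ("B", "A"),
    ("C", "B"),
    ("History", "C"),
]
print(sorted(within_hops(edges, "History", 2)))
print(sorted(within_hops(edges, "History", 7)))
```

The hop limit caps how much of the graph is explored, which is the resource-limiting benefit mentioned above, and the `seen` set keeps cyclic category hierarchies from looping forever.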