Infer data with SPARQL in Protege - sparql

I'm trying to get my head around inferring RDF data. Say that I have these triples (RDF Turtle), which I created using Protege:
#prefix owl: <http://www.w3.org/2002/07/owl#> .
#prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
:hasSpouse rdf:type owl:ObjectProperty ,
owl:SymmetricProperty ;
rdfs:domain :People ;
rdfs:range :People .
:People rdf:type owl:Class .
:Jane_Doe rdf:type owl:NamedIndividual ,
:People .
:John_Doe rdf:type owl:NamedIndividual ,
:People ;
:hasSpouse :Jane_Doe .
The reasoner in Protege will kindly highlight the expected inference, that is :Jane_Doe :hasSpouse :John_Doe.
How can I see that inference with SPARQL? If I run this query in Protege (SPARQL tab):
SELECT ?subject
WHERE {?subject hasSpouse ?object .}
It shows the asserted triple, not the inferred one. I understand how to do it manually, e.g. :
CONSTRUCT {?object ?prop ?subject }
WHERE { ?prop rdf:type owl:SymmetricProperty .
?subject ?prop ?object .}
I'd see now the inferred data I'm expecting but 1) that would be losing the point imho (i.e; reinventing the wheel) 2) I cannot have 2 queries in this tab (construct, then select). There's got to be a way to do this automatically, just like the reasoner did.
I read in Stack Overflow a post saying to use 'Snap SPARQL' plugin in Protege. I tried but simple queries don't work (like the first one above). It's like it's a different language. How does it work?
So, how can I get the benefit of these owl properties with SPARQL? How can I have an OWL-aware SPARQL in Protege? Am I taking this the wrong way? What's the right way?
thanks for your help.
Nicolas

You need to make your inferences a part of your knowledge.
To do so, go to the SWRL Tab and click successively on the button
at the bottom of that Tab, start from the left to the right.

Related

Graphdb seem to be missing rdfs:subClassOf inference

I am using the RDFS-Plus (Optimized) ruleset. According to this: https://docs.cambridgesemantics.com/anzograph/v2.2/userdoc/inferences.htm
If something is of type owl:Class it should be inferred it is an rdfs:subClassof owl:Thing.
If I run the query
PREFIX owl: <http://www.w3.org/2002/07/owl#>
select ?s ?p where {
?s ?p owl:Thing .
}
I only get results for when I have explicitly stated that a class is an rdfs:subClassOf owl:Thing
Can someone explain what I am missing and why this does not seem to work as expected?
I looked at the .pie rule files located in $GDB_HOME/configs/rules and found that the graphdb rulesets do not match the link in the question. The least complex ruleset (according to https://graphdb.ontotext.com/documentation/free/rules-optimisations.html#hints-on-optimizing-graphdb-s-rulesets) that provides the functionality needed is owl2-ql-optimized.

Removing unwanted superclass answers in SPARQL

I have an OWL file that includes a taxonomic hierarchy that I want to write a query where the answer includes each individual and its immediate taxonomic parent. Here's an example (the full query is rather messier).
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
#prefix rdf: <http:://www.w3.org/1999/02/22-rdf-syntax-ns#> .
#prefix : <urn:ex:> .
:fido rdf:type :Dog .
:Dog rdfs:subClassOf :Mammal .
:Mammal rdfs:subClassOf :Vertebrate .
:Vertebrate rdfs:subClassOf :Animal .
:fido :hasToy :bone
:kitty rdf:type :Cat .
:Cat rdfs:subClassOf :Mammal .
:kitty :hasToy :catnipMouse .
And this query does what I want.
prefix rdf: <http:://www.w3.org/1999/02/22-rdf-syntax-ns#> .
prefix : <urn:ex:> .
SELECT ?individual ?type
WHERE {
?individual :hasToy :bone .
?individual rdf:type ?type .
}
The problem is that I'd rather use a reasoned-over version of the OWL file, which unsurprisingly includes additional statements:
:fido rdf:type :Mammal .
:fido rdf:type :Vertebrate .
:fido rdf:type :Animal .
:kitty rdf:type :Mammal .
:kitty rdf:type :Vertebrate .
:kitty rdf:type :Animal .
And now the query results in additional answers about Fido being a Mammal, etc. I could just give up on using the reasoned version of the file, or, since the SPARQL queries are called from java, I could do a bunch of additional queries to find the least inclusive type that appears. My question is whether there is a reasonable pure SPARQL solution to only returning the Dog solution.
A generic solution is that you make sure you ask for the direct type only. A class C is the direct type of an instance X if:
X is of type C
there is no C' such that:
X is of type C'
C' is a subclass of C
C' is not equal to C
(that last condition is necessary, by the way, because in RDF/OWL, the subclass-relation is reflexive: every class is a subclass of itself)
In SPARQL, this becomes something like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <urn:ex:> .
SELECT ?individual ?type
WHERE {
?individual :hasToy :bone .
?individual a ?type .
FILTER NOT EXISTS { ?individual a ?other .
?other rdfs:subClassOf ?type .
FILTER(?other != ?type)
}
}
Depending on which API/triplestore/library you use to execute these queries, there may also be other, tool-specific solutions. For example, the Sesame API (disclosure: I am on the Sesame dev team) has the option to disable reasoning for the purpose of a single query:
TupleQuery query = conn.prepareTupleQuery(SPARQL, "SELECT ...");
query.setIncludeInferred(false);
TupleQueryResult result = query.evaluate();
Sesame also offers an optional additional inferencer (called the 'direct type inferencer') which introduces additional 'virtual' properties you can query, such as sesame:directType, sesame:directSubClassOf, etc. Other tools will undoubtedly have similar options.

OWL reasoning: Necessary and sufficient conditions for inferring a property

We are trying to get a reasoner (e.g. HermiT in Protege) to infer that a more specific sub-property can be used instead of the asserted general property.
Classes:
- Patient
- Finding
- Dyspnea
- ObservationStatus
- Inclusion
- Exclusion
Properties:
- has_finding (domain: Patient, range: Finding)
- has_positive_finding (domain: Patient, range: Finding, Inclusion)
- has_negative_finding (domain: Patient, range: Finding, Exclusion)
If we assert the following triples:
:Patient1 a :Patient .
:Patient1 :has_negative_finding :Dyspnea1 .
A reasoner can infer (among other things) that:
:Dyspnea1 a :Finding .
:Dyspnea1 a :Exclusion.
But when we look at it the other way around and assert:
:Patient1 a :Patient .
:Dyspnea1 a :Dyspnea .
:Dyspnea1 a :Exclusion .
:Patient1 :has_finding :Dyspnea1.
We would like the reasoner to infer that:
:Patient1 :has_negative_finding :Dyspnea1 .
We cannot seem to get Protege and HermiT to draw that conclusion and infer the triples.
What are we missing? Are the conditions not necessary and sufficient for it to infer that knowledge?
:Patient1 a :Patient .
:Dyspnea1 a :Dyspnea .
:Dyspnea1 a :Exclusion .
:Patient1 :has_finding :Dyspnea1.
We would like the reasoner to infer that:
:Patient1 :has_negative_finding :Dyspnea1 .
… What are we missing? Are the conditions not necessary and sufficient
for it to infer that knowledge?
There are a few issues here.
First, you haven't said that every has_finding actually corresponds to one of the subproperties. That is, just because something has as finding, you don't know that it also has a negative or a positive finding. The finding could just be a general finding, without being one of the more specific ones.
Second, the more specific type of the object doesn't mean that you have to use the more specific property.
Third, even if you state that the finding is an exclusion, if you don't know that exclusions are disjoint from inclusions, you could still have the finding be both a positive and negative finding.
Now, what it'd be really nice to do would be to state that has_finding is the union of has_negative_finding and has_positive_finding, and then declare inclusion and exclusion disjoint. Then every instance of a finding would have to be one or the other, and you could make your inference.
Since you can't do that, you'll need some sort of alternative. If you're using individuals as per-person diagnoses, then you could say that every finding is either a negative finding or a positive finding with an axiom like
(inverse(hasFinding) some Patient) subClass ((inverse(hasNegativeFinding) some Patient) or (inverse(hasPositiveFinding) some Patient))
along with making hasFinding inverse functional, so that each finding is associated with at most one patient. Then you'd have an ontology like this:
#prefix : <http://stackoverflow.com/a/30903552/1281433/> .
#prefix a: <http://stackoverflow.com/a/30903552/1281433/> .
#prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
#prefix owl: <http://www.w3.org/2002/07/owl#> .
#prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
a:Exclusion a owl:Class ;
rdfs:subClassOf a:Finding .
a:hasNegativeFinding a owl:ObjectProperty ;
rdfs:range a:Exclusion ;
rdfs:subPropertyOf a:hasFinding .
_:b0 a owl:Restriction ;
owl:onProperty a:hasPositiveFinding ;
owl:someValuesFrom a:Inclusion .
a: a owl:Ontology .
[ a owl:Restriction ;
rdfs:subClassOf [ a owl:Class ;
owl:unionOf ( _:b1 _:b0 )
] ;
owl:onProperty a:hasFinding ;
owl:someValuesFrom a:Finding
] .
a:Finding a owl:Class ;
owl:disjointUnionOf ( a:Finding a:Inclusion ) .
a:patient1 a owl:Thing , owl:NamedIndividual ;
a:hasFinding a:dyspnea1 .
_:b1 a owl:Restriction ;
owl:onProperty a:hasNegativeFinding ;
owl:someValuesFrom a:Exclusion .
a:hasFinding a owl:ObjectProperty .
a:Inclusion a owl:Class ;
rdfs:subClassOf a:Finding .
a:hasPositiveFinding a owl:ObjectProperty ;
rdfs:range a:Inclusion ;
rdfs:subPropertyOf a:hasFinding .
a:dyspnea1 a owl:NamedIndividual , a:Exclusion .
and you can get results like this:

add RDFS inference rules support in endpoint SPARQL

I have create an endpoint SPAQL on OpenLink Virtuoso.
All work well, but i have to access on the data in a Container, in particular a rdf:Seq.
I have a Seq like this:
<myrdf:has_stoptimes>
<rdf:Seq rdf:about="http://test.com/343">
<rdf:li>
<myrdf:StopTime rdf:about="http://test.com/StopTime/434">
...
</ns0:StopTime>
</rdf:li>
<rdf:li>
<myrdf:StopTime rdf:about="http://test.com/StopTime/435">
...
</ns0:StopTime>
</rdf:li>
</rdf:Seq>
Now i see that to access data in a container i can use rdfs:member or FILTER (strstarts(str(?prop), str(rdf:_)) how is explained here
But for my project i have to adopt the first solution because i'm working with Silk and i will use the code syntax like ?a/myrdf:has_stoptimes/rdfs:member without use of "complex" filter.
I have tried to follow this guide but querying the endpoint nothing work how i hoped.
So my question is: how can i query ?a/myrdf:has_stoptimes/rdfs:member on a Virtuoso endpoint SPARQL?Which inference rule i have to add in endpoint SPARQL?
Thank you in advance
UPDATE
I have created the following inference rules in Virtuoso:
ttlp (' #prefix rdfs: .
#prefix rdf: .
rdfs:Container rdf:type rdfs:Class ; rdfs:subClassOf rdfs:Resource .
rdfs:ContainerMembershipProperty a rdfs:Class ; rdfs:subClassOf rdf:Property .
rdf:Seq rdf:type rdfs:Class ; rdfs:subClassOf rdfs:Container .
rdfs:member rdf:type rdf:Property ; rdfs:domain rdfs:Resource ; rdfs:range rdfs:Resource .
', '', 'http://localhost:8890/schema/test') ;
Nothing work querying the SPARQL endpoint like:
define input:inference "http://localhost:8890/schema/property_rules1"
SELECT *
FROM
WHERE {?sep a rdf:Seq.
?seq rdfs:member ?p}
After i tried adding the follow line to the ttl file: rdf:_1 rdfs:subPropertyOf rdfs:member . In this way it work but obviously the results are only for the first element of the container. So is very unconvenient add a line for all of rdf:_n, and i think this is only a temporary solution, it is not correct.
I have tried to add an RDF dump on SILK 2.6.1, and on the section SPARQL of the data source if i run the query:
SELECT *
FROM
WHERE {?sep a rdf:Seq.
?seq rdfs:member ?p}
I obtain the correct result, without specify any inference rules. So i think that in this functionality of SILK there is something that i’m missing in my endpoint SPARQL or am i saying nonsense things?
You can't use variables in property paths, so you can't actually do
?x ?a/has_stoptimes/rdfs:member ?y
Instead, you have to use another variable or blank node in between:
?x ?a ?z . ?z has_stoptimes/rdfs:member ?y
?x ?a [ has_stoptimes/rdfs:member ?y ] .

Extracting hierarchy for dbpedia entity using SPARQL

I am trying to extract the hierarchy of Wikipedia category or Yago classification for DBpedia resources using the SPARQL endpoint. For instance, I would like to find out all the possible categories and classes in hierarchical form of entity, say, http://dbpedia.org/resource/Nokia, like Thing → Organization → Company → … → Nokia.
A simple SPARQL select can retrieve the information that you're interested in, though it won't be arranged hierarchically. You're interested in getting all the types of a resource, as well as the rdfs:subClassOf relations between them. Here's a very simple query for Nokia that can be run on the DBpedia SPARQL endpoint
SELECT * WHERE {
dbpedia:Nokia a ?c1 ; a ?c2 .
?c1 rdfs:subClassOf ?c2 .
}
SPARQL results
If you treat each pair of classes in that result set as a directed edge and perform a topological sort , then you'll see the hierarchy of the classes to which the Nokia resource belongs. In fact, since it is probably convenient to treat this as a graph, you can get it in the form of an RDF graph by using a SPARQL construct query.
CONSTRUCT WHERE {
dbpedia:Nokia a ?c1 ; a ?c2 .
?c1 rdfs:subClassOf ?c2 .
}
SPARQL results
The construct query produces this graph (in N3 format):
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
#prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
#prefix owl: <http://www.w3.org/2002/07/owl#> .
#prefix yago: <http://dbpedia.org/class/yago/> .
#prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
#prefix dbpedia: <http://dbpedia.org/resource/> .
dbpedia-owl:Agent rdfs:subClassOf owl:Thing .
dbpedia-owl:Company rdfs:subClassOf dbpedia-owl:Organisation .
dbpedia-owl:Organisation rdfs:subClassOf dbpedia-owl:Agent .
yago:CompaniesBasedInEspoo rdfs:subClassOf yago:Company108058098 .
dbpedia:Nokia rdf:type yago:CompaniesListedOnTheHelsinkiStockExchange ,
owl:Thing ,
yago:CompaniesBasedInEspoo ,
dbpedia-owl:Agent ,
yago:DisplayTechnologyCompanies ,
yago:ElectronicsCompaniesOfFinland ,
dbpedia-owl:Company ,
dbpedia-owl:Organisation ,
yago:Company108058098 ,
yago:CompaniesEstablishedIn1865 .
yago:CompaniesEstablishedIn1865 rdfs:subClassOf yago:Company108058098 .
yago:CompaniesListedOnTheHelsinkiStockExchange rdfs:subClassOf yago:Company108058098 .
yago:DisplayTechnologyCompanies rdfs:subClassOf yago:Company108058098 .
yago:ElectronicsCompaniesOfFinland rdfs:subClassOf yago:Company108058098 .
Remarks
The queries above retrieve the rdf:type hierarchy for Nokia. In the question, you also mention Wikipedia categories. DBpedia resources are associated with the Wikipedia categories to which their corresponding articles belong by the dcterms:subject property. Those Wikipedia categories are then structured hierarchically by skos:broader. These really are not types for the individuals though. For instance, the data contain:
dbpedia:Nokia dcterms:subject category:Finnish_brands
category:Finnish_brands skos:broader category:Brands_by_country
While it probably makes sense to say that Nokia is a Finnish_brand, it makes much less sense to say that Nokia is a Brand_by_country.