Validating that every subject has a type of class - sparql

I have the following Data & Shape Graph.
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:prefixes hr: ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:property [
sh:path rdf:type ;
sh:nodeKind sh:IRI ;
sh:hasValue rdfs:Class
] ;
.
Using pySHACL:
import rdflib
from pyshacl import validate
full_graph = open( "/Users/jamesh/jigsaw/shacl_work/data_graph.ttl", "r" ).read()
g = rdflib.Graph().parse( data = full_graph, format = 'turtle' )
report = validate( g, inference='rdfs', abort_on_error = False, meta_shacl = False, debug = False )
print( report[2] )
I expect the SPARQL-based target to select every subject in the data graph, and the property shape to then verify that each one has an rdf:type path whose value is rdfs:Class.
I get the following result:
Validation Report
Conforms: True
The expected validation errors should include only the following subjects:
| <http://learningsparql.com/ns/humanResources#BadOne> |
| <http://learningsparql.com/ns/humanResources#BadTwo> |
| <http://learningsparql.com/ns/humanResources#BadThree> |
| <http://learningsparql.com/ns/humanResources#AnotherName> |
| <http://learningsparql.com/ns/humanResources#name> |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> |
Is this possible with SHACL? If so, what should the shape file be?
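The expected list above can be checked by hand. The sketch below is plain Python, not SHACL, and it assumes one reading of the requirement: a subject passes if following rdf:type one or more times eventually reaches rdfs:Class. The names `TRIPLES`, `reaches_class` and `failing_subjects` are made up for illustration.

```python
# Plain-Python model of the data graph as (subject, predicate, object) triples.
RDF_TYPE = "rdf:type"
TRIPLES = [
    ("hr:Employee", RDF_TYPE, "rdfs:Class"),
    ("hr:BadThree", "rdfs:comment", "some comment about missing"),
    ("hr:BadTwo", RDF_TYPE, "hr:BadOne"),
    ("hr:YetAnother", RDF_TYPE, "hr:Another"),
    ("hr:YetAnotherName", RDF_TYPE, "hr:AnotherName"),
    ("hr:Another", RDF_TYPE, "hr:Employee"),
    ("hr:AnotherName", RDF_TYPE, "hr:name"),
    ("hr:BadOne", RDF_TYPE, "hr:Dangling"),
    ("hr:name", RDF_TYPE, "rdf:Property"),
]


def reaches_class(node, triples, seen=None):
    """True if one or more rdf:type hops from node reach rdfs:Class."""
    seen = set() if seen is None else seen
    for s, p, o in triples:
        if s == node and p == RDF_TYPE:
            if o == "rdfs:Class":
                return True
            if o not in seen:  # guard against cycles in the type chain
                seen.add(o)
                if reaches_class(o, triples, seen):
                    return True
    return False


def failing_subjects(triples):
    """Every subject in the graph whose type chain never reaches rdfs:Class."""
    subjects = {s for s, _, _ in triples}
    return sorted(s for s in subjects if not reaches_class(s, triples))


print(failing_subjects(TRIPLES))
```

Under that reading, the six subjects listed above are exactly the ones that fail, while hr:Employee, hr:Another and hr:YetAnother pass.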

The following produces the expected validation errors; however, there are still several things I do not understand.
The sh:prefixes hr: ; is not needed. It only supplies prefixes for the SPARQL target SELECT statement itself and nothing more.
Inference needed to be disabled. It was inserting triples and then trying to validate them. In this use case that is not desired: only what is actually in the schema should be validated.
I had also assumed it would not be a problem to put everything into a single graph, apparently based on a misunderstanding of https://github.com/RDFLib/pySHACL/issues/46.
import rdflib
from pyshacl import validate
graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""
shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:prefixes hr: ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:property [
sh:path ( rdf:type [ sh:zeroOrMorePath rdf:type ] ) ;
sh:nodeKind sh:IRI ;
sh:hasValue rdfs:Class
] ;
.
'''
data = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
An alternative that uses a SPARQL-based constraint would look like this:
graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""
shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:sparql [
a sh:SPARQLConstraint ;
sh:message "Node does not have type rdfs:Class." ;
sh:prefixes hr: ;
sh:select """
SELECT $this
WHERE {
$this rdf:type ?o .
FILTER NOT EXISTS {
?o rdf:type* rdfs:Class
}
FILTER ( strstarts( str( $this ), str( hr: ) ) )
}
""" ;
]
.
'''
data = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
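The point about inference can be illustrated with a small plain-Python sketch. RDFS includes the axiom that rdf:type has range rdfs:Class, so an RDFS reasoner may add `?o rdf:type rdfs:Class` for every `?s rdf:type ?o`. The function below applies only that one rule; it is a simplified illustration, not how pySHACL's inference option is implemented, and the name `rdfs_type_range` is made up.

```python
RDF_TYPE = "rdf:type"


def rdfs_type_range(triples):
    """Add (o, rdf:type, rdfs:Class) for every (s, rdf:type, o) until stable."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            new = (o, RDF_TYPE, "rdfs:Class")
            if p == RDF_TYPE and new not in inferred:
                inferred.add(new)
                changed = True
    return inferred


# Two of the "bad" statements from the data graph.
ASSERTED = {
    ("hr:BadOne", RDF_TYPE, "hr:Dangling"),
    ("hr:BadTwo", RDF_TYPE, "hr:BadOne"),
}

closure = rdfs_type_range(ASSERTED)
for triple in sorted(closure - ASSERTED):
    print(triple)  # triples that exist only because of inference
```

After this step hr:Dangling and hr:BadOne are both typed as rdfs:Class, so a shape that looks for an rdf:type path ending in rdfs:Class no longer flags them. That is consistent with the observation that validation only reports the expected errors once inference is turned off.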

Related

SHACL: Require sh:property to be a URI

I was wondering if there's a way to specify that a given sh:property is expected to have a value of a URI without any particular class.
In the example SHACL below, the property meta:value would only allow URIs, although I don't know of a way to represent it in SHACL.
Example SHACL
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix meta: <metadata#> .
@prefix meta_sh: <metadata/shacl#> .
meta_sh:Entry
a sh:NodeShape;
sh:targetClass meta:Entry;
sh:property [
sh:path meta:value;
# Value expected to be a URI
sh:minCount 1; # Required; 1 or more
];
sh:property [
sh:path meta:time;
sh:datatype xsd:dateTime;
sh:minCount 1; sh:maxCount 1; # Required; 1
];
.
Example Data
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix meta: <metadata#> .
@prefix data: <data#> .
data:Entry_001
a meta:Entry;
meta:value data:ProductListing_459; # URI
meta:value data:RentalListing_934; # URI
meta:time "2022-06-15T06:20:31Z"^^xsd:dateTime;
.
This should work:
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix meta: <metadata#> .
@prefix meta_sh: <metadata/shacl#> .
meta_sh:Entry
a sh:NodeShape;
sh:targetClass meta:Entry;
sh:nodeKind sh:IRI ;
sh:property [
sh:path meta:value;
# Value expected to be a URI
sh:minCount 1; # Required; 1 or more
];
sh:property [
sh:path meta:time;
sh:datatype xsd:dateTime;
sh:minCount 1; sh:maxCount 1; # Required; 1
];
.
Yes, you are right, the sh:nodeKind has to be part of the property shape.
The reason it still did not work is the way you defined the prefixes. They were not full URIs, so the SHACL processor resolved them against something else (maybe the disk location of the file, it depends). The result: the meta in the shapes never matched the meta in the data.
This works:
SHACL
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix meta: <https://example.org/#> .
@prefix meta_sh: <metadata/shacl#> .
meta_sh:Entry
a sh:NodeShape;
sh:targetClass meta:Entry;
sh:property [
sh:path meta:value;
sh:nodeKind sh:IRI ;
sh:minCount 1; # Required; 1 or more
];
.
Data
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix meta: <https://example.org/#> .
@prefix data: <data#> .
data:Entry_001
a meta:Entry;
meta:value data:RentalListing_934 ; # URI
meta:time "2022-06-15T06:20:31Z"^^xsd:dateTime;
.
data:Entry_002
a meta:Entry;
meta:value "this is not an uri" ; # No Uri
meta:time "2022-06-15T06:20:31Z"^^xsd:dateTime;
.
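The prefix problem described above can be reproduced without any RDF tooling. A relative namespace such as `<metadata#>` has to be resolved against a base IRI, and if the shapes file and the data file have different bases, the "same" prefixed name expands to two different IRIs. The sketch below uses Python's `urllib.parse.urljoin` as an approximation of that resolution; the file locations are hypothetical and `expand` is a made-up helper.

```python
from urllib.parse import urljoin


def expand(base_iri, namespace, local_name):
    """Resolve namespace + local name against a base IRI (approximation)."""
    return urljoin(base_iri, namespace + local_name)


# Hypothetical locations of the two files, for illustration only.
shapes_base = "file:///home/user/shapes/shapes.ttl"
data_base = "file:///home/user/data/data.ttl"

# Relative namespace: the result depends on where each file lives.
shape_class = expand(shapes_base, "metadata#", "Entry")
data_class = expand(data_base, "metadata#", "Entry")
print(shape_class)
print(data_class)
print(shape_class == data_class)

# Absolute namespace: the base no longer matters.
fixed_shape = expand(shapes_base, "https://example.org/#", "Entry")
fixed_data = expand(data_base, "https://example.org/#", "Entry")
print(fixed_shape == fixed_data)
```

With the relative namespace the sh:targetClass in the shapes and the class in the data end up as different IRIs, so the shape has no focus nodes; with an absolute namespace they agree. Whether a given SHACL processor picks the file location as the base is an assumption here, as the answer itself notes.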

How to use sh:rule and sh:count correctly

Here is my data.ttl:
@prefix : <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:Bob a :Student ;
:name "Bob" ;
:tookTest :Test1, :Test2, :Test3, :Test4, :Test5 .
:Test1 a :Test ;
:grade "A" .
:Test2 a :Test ;
:grade "A" .
:Test3 a :Test ;
:grade "A" .
:Test4 a :Test ;
:grade "C" .
:Test5 a :Test ;
:grade "D" .
The shacl.ttl is:
@prefix : <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
:Test a rdfs:Class, sh:NodeShape .
:Student a rdfs:Class, sh:NodeShape ;
sh:rule [
a sh:TripleRule ;
sh:subject sh:this ;
sh:predicate :rating ;
sh:object [sh:count [sh:path :Test] ];
] .
I use the SHACL inference engine (https://github.com/TopQuadrant/shacl):
shaclinfer.bat -datafile D:\data.ttl -shapesfile D:\shacl.ttl>D:\report.txt
I get a result report as:
<http://example.com/ns#Bob>
<http://example.com/ns#rating> 0 .
I actually want to implement the following rule:
If Bob gets more than 2 "A" grades, then :Bob :rating "Outstanding" .
How do I write this sh:condition statement correctly?
In this case sh:count should count the number of Tests whose grade is "A". Do I have to use SPARQL CONSTRUCT statements?
Thanks for any help!
@prefix : <http://example.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
:Test a rdfs:Class, sh:NodeShape .
:Student a rdfs:Class, sh:NodeShape ;
sh:rule [
a sh:TripleRule ;
sh:subject sh:this ;
sh:predicate :rating ;
sh:object "Outstanding";
sh:condition :Student ;
sh:condition [
sh:qualifiedValueShape [
sh:path (:tookTest :grade ) ;
sh:hasValue "A" ; ] ;
sh:qualifiedMinCount 2 ; ];
] .
The shacl.ttl looks close to correct, but the engine from https://github.com/TopQuadrant/shacl reports an error. The error message is as follows:
Exception in thread "main" org.apache.jena.query.QueryParseException: Line 2, column 1: Unresolved prefixed name: :tookTest
Would anyone please test this with any of the other SHACL inference engines?
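Independent of any SHACL engine, the intended rule can be written down in a few lines of plain Python, which may help when checking what an engine infers. This is only a sketch of the logic, not SHACL; `GRADES`, `TOOK_TEST` and `infer_ratings` are made-up names that mirror data.ttl.

```python
# Mirrors data.ttl: each test's grade and the tests each student took.
GRADES = {"Test1": "A", "Test2": "A", "Test3": "A", "Test4": "C", "Test5": "D"}
TOOK_TEST = {"Bob": ["Test1", "Test2", "Test3", "Test4", "Test5"]}


def infer_ratings(took_test, grades, min_a_grades=3):
    """Rate a student "Outstanding" if they have at least min_a_grades "A"s.

    min_a_grades=3 encodes "more than 2".
    """
    ratings = {}
    for student, tests in took_test.items():
        a_count = sum(1 for test in tests if grades.get(test) == "A")
        if a_count >= min_a_grades:
            ratings[student] = "Outstanding"
    return ratings


print(infer_ratings(TOOK_TEST, GRADES))
```

Two details are worth comparing against the shapes above. The count has to follow :tookTest to each test and then look at :grade; a path of :Test alone looks for a property named :Test on Bob, which has no values, and that would explain the reported 0. Also, "more than 2" corresponds to a minimum of 3, whereas sh:qualifiedMinCount 2 means "at least 2".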

Inferencing in Protege does not infer types based on `subClassOf`

We are developing an ontology in which we need to infer the type of each node, based on the mappings defined in the # Test mapping file section. We tried rdfs:subClassOf, as in the snippet below, but it doesn't infer types. Ideally we want Person 1 and Person 2 to be inferred as NodeTypePerson, and sydney and canberra as NodeTypeCity.
I tried owl:equivalentClass instead, but no luck. rdfs:range and rdfs:domain infer types as expected, but I am having trouble inferring with rdfs:subClassOf. Any advice is highly appreciated.
UPDATE:
owl:equivalentClass works in Protege (it infers the type). But can't we use rdfs:subClassOf similarly? I actually want a way to do this with RDFS inferencing, where owl:equivalentClass obviously doesn't work. Is there any other RDFS property we can use here?
# baseURI: http://www.Test-app.com/ns
@prefix : <http://www.Test-app.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix dcam: <http://purl.org/dc/dcam/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix posixstat: <http://www.w3.org/ns/posix/stat#> .
@prefix schema: <https://schema.org/> .
@prefix shacl: <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix solid: <http://www.w3.org/ns/solid/terms#> .
@prefix swapdoc: <http://www.w3.org/2000/10/swap/pim/doc#> .
@prefix ui: <http://www.w3.org/ns/ui#> .
@prefix vann: <http://purl.org/vocab/vann/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix vs: <http://www.w3.org/2003/06/sw-vocab-status/ns#> .
@prefix ws: <http://www.w3.org/ns/pim/space#> .
@base <http://www.Test-app.com/ns> .
<http://www.Test-app.com/ns>
rdf:type owl:Ontology ;
dc:title "Test Ontology"@en ;
owl:versionIRI <http://www.Test-app.com/ns/0.1> .
# Test mapping file
@prefix test: <http://www.Test-app.com/ns#> .
:Person
a rdfs:Class ;
owl:equivalentClass :NodeTypePerson .
:locatedNear
a rdf:Property ;
rdfs:subClassOf :NodeCustomAttribute ;
rdfs:label "Located Near" .
:Location
a rdfs:Class ;
rdfs:label "Location" ;
rdfs:subClassOf :NodeTypeCity ;
rdfs:comment "This represents a geolocation." .
:City
a rdfs:Class ;
rdfs:label "City" ;
rdfs:comment "This represents a city." ;
rdfs:subClassOf :Location .
# Some graph instances data
:sydney
a :City ;
rdfs:label "Sydney" ;
rdfs:comment "Australia's largest city." .
:canberra
a :City ;
rdfs:label "Canberra" ;
rdfs:comment "Australia's national capital." .
:person1
a :Person ;
rdfs:label "Person 1" ;
rdfs:comment "First vertex." ;
:locatedNear :sydney .
:person2
a :Person ;
rdfs:label "Person 2" ;
rdfs:comment "Second vertex." ;
:locatedNear :canberra .
# Types
:NodeTypePerson
rdf:type rdfs:Class ;
rdfs:label "NodeTypePerson" ;
rdfs:comment "" ;
vs:term_status "stable" ;
rdfs:isDefinedBy <http://www.Test-app.com/ns> .
:NodeTypeCity
rdf:type rdfs:Class ;
rdfs:label "NodeTypeCity" ;
rdfs:comment "" ;
vs:term_status "stable" ;
rdfs:isDefinedBy <http://www.Test-app.com/ns> .
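For reference, what RDFS entailment is supposed to do with rdfs:subClassOf can be sketched in plain Python: an instance of a class is also an instance of every superclass, transitively. The sketch below only models that rule for the statements in the snippet above; it says nothing about which reasoner Protege runs or how it is configured, and `infer_types` is a made-up name.

```python
def infer_types(types, subclass_of):
    """Propagate rdf:type up the rdfs:subClassOf hierarchy (transitively)."""
    inferred = {node: set(classes) for node, classes in types.items()}
    for classes in inferred.values():
        frontier = list(classes)
        while frontier:
            cls = frontier.pop()
            for parent in subclass_of.get(cls, ()):
                if parent not in classes:
                    classes.add(parent)
                    frontier.append(parent)
    return inferred


# rdfs:subClassOf statements between classes, taken from the snippet.
SUBCLASS_OF = {":City": [":Location"], ":Location": [":NodeTypeCity"]}
# Asserted rdf:type statements, taken from the snippet.
TYPES = {":sydney": [":City"], ":canberra": [":City"], ":person1": [":Person"]}

result = infer_types(TYPES, SUBCLASS_OF)
for node in sorted(result):
    print(node, sorted(result[node]))
```

Under this rule :sydney and :canberra pick up :Location and :NodeTypeCity, while :person1 stays a plain :Person because the snippet links :Person to :NodeTypePerson only through owl:equivalentClass, which RDFS does not interpret. Stating :Person rdfs:subClassOf :NodeTypePerson would give an RDFS reasoner something to work with.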

Get MAX result from COUNT in SPARQL

I have this query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX curricula: <http://www.semanticweb.org/lsarni/ontologies/curricula#>
SELECT ?topic (COUNT(?x) as ?number_of_courses)
WHERE {
?topic curricula:taughtIn ?x .
}
GROUP BY ?topic
ORDER BY desc(?number_of_courses)
That returns:
Now I want to get the ?topic with the max ?number_of_courses, in this case Engineering.
I managed to get the max ?number_of_courses with this query (following this answer):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX curricula: <http://www.semanticweb.org/lsarni/ontologies/curricula#>
SELECT (MAX(?number_of_courses) AS ?maxc)
WHERE{
{
SELECT ?topic (COUNT(?x) as ?number_of_courses)
WHERE {
?topic curricula:taughtIn ?x .
}
GROUP BY ?topic
}
}
This returns:
But I can't manage to get the ?topic.
How could I fix this?
Update
I tried what I understood from AKSW's comment:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX curricula: <http://www.semanticweb.org/lsarni/ontologies/curricula#>
SELECT ?topic
WHERE {
?topic curricula:taughtIn ?x .
{
SELECT (MAX(?number_of_courses) AS ?max)
WHERE {
{
SELECT ?topic (COUNT(?x) as ?number_of_courses)
WHERE {
?topic curricula:taughtIn ?x .
}
GROUP BY ?topic
}
}
}
}
GROUP BY ?topic
HAVING (COUNT(?x) = ?max)
But it returns nothing.
Example
For this minimal example the result should be Topic_2 since it's taughtIn two courses.
@prefix : <http://www.semanticweb.org/lsarni/ontologies/curricula#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.semanticweb.org/lsarni/ontologies/curricula> .
<http://www.semanticweb.org/lsarni/ontologies/curricula> rdf:type owl:Ontology .
#################################################################
# Object Properties
#################################################################
### http://www.semanticweb.org/lsarni/ontologies/curricula#taughtIn
:taughtIn rdf:type owl:ObjectProperty ;
rdfs:subPropertyOf owl:topObjectProperty .
### http://www.w3.org/2002/07/owl#topObjectProperty
owl:topObjectProperty rdfs:domain :Topic ;
rdfs:range :Course .
#################################################################
# Classes
#################################################################
### http://www.semanticweb.org/lsarni/ontologies/curricula#Course
:Course rdf:type owl:Class ;
owl:disjointWith :Topic .
### http://www.semanticweb.org/lsarni/ontologies/curricula#Topic
:Topic rdf:type owl:Class .
#################################################################
# Individuals
#################################################################
### http://www.semanticweb.org/lsarni/ontologies/curricula#Course_1
:Course_1 rdf:type owl:NamedIndividual ,
:Course .
### http://www.semanticweb.org/lsarni/ontologies/curricula#Course_2
:Course_2 rdf:type owl:NamedIndividual ,
:Course .
### http://www.semanticweb.org/lsarni/ontologies/curricula#Topic_1
:Topic_1 rdf:type owl:NamedIndividual ,
:Topic ;
:taughtIn :Course_1 .
### http://www.semanticweb.org/lsarni/ontologies/curricula#Topic_2
:Topic_2 rdf:type owl:NamedIndividual ,
:Topic ;
:taughtIn :Course_1 ,
:Course_2 .
### Generated by the OWL API (version 4.2.8.20170104-2310) https://github.com/owlcs/owlapi
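The result being asked for has three steps: count courses per topic, find the largest count, then keep every topic whose count equals it (so ties are all returned). A plain-Python sketch of that logic on the minimal example is shown here as a reference for what a SPARQL solution should return; it is not a SPARQL query, and `topics_with_max_courses` is a made-up name.

```python
from collections import Counter


def topics_with_max_courses(taught_in):
    """Return (topics with the highest course count, that count)."""
    counts = Counter(topic for topic, _course in taught_in)
    if not counts:
        return [], 0
    max_count = max(counts.values())
    winners = sorted(t for t, n in counts.items() if n == max_count)
    return winners, max_count


# The :taughtIn statements from the minimal example.
TAUGHT_IN = [
    ("Topic_1", "Course_1"),
    ("Topic_2", "Course_1"),
    ("Topic_2", "Course_2"),
]

print(topics_with_max_courses(TAUGHT_IN))
```

The key point is that the per-topic count and the maximum have to be available side by side when they are compared. In the attempted query the comparison happens after GROUP BY ?topic, where ?max is not one of the grouping keys, which is one likely reason it returns nothing.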

owl: property domain restriction on property value

I am studying inference in OWL, currently restrictions inside a domain definition:
@prefix : <http://www.test.org/2015/4/ontology#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.test.org/2015/4/ontology> .
<http://www.test.org/2015/4/ontology> rdf:type owl:Ontology .
:Class1 rdf:type owl:Class .
:Prop1 rdf:type owl:DatatypeProperty ;
rdfs:domain [ rdf:type owl:Class ;
owl:intersectionOf ( :Class1
[ rdf:type owl:Restriction ;
owl:onProperty :Prop1 ;
owl:hasValue "class1"
]
)
] .
:Ind1 rdf:type owl:NamedIndividual ;
:Prop1 "p" .
I expected the reasoner (Pellet) to infer
:Ind1 rdf:type :Class1
only if there is
:Ind1 :Prop1 "class1"
but it seems to ignore the restriction in the domain definition.
Is it correct to define restrictions in domain definitions? The reasoner (Pellet) does not forbid me from doing that.
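One way to read the behaviour is that rdfs:domain is an inference rule, not a filter: any individual that has some value for the property is inferred to belong to the whole domain class expression. The plain-Python sketch below models only that reading for the ontology above; it is a simplified illustration of the semantics, not Pellet's algorithm, and `apply_domain` is a made-up name.

```python
def apply_domain(assertions, prop, domain_class, required_value):
    """Treat rdfs:domain as an inference rule rather than a guard.

    Any subject with at least one `prop` value is inferred to be in
    `domain_class` and to also have `required_value` for `prop`, because
    the domain here is the intersection of both conditions.
    """
    types = {}
    values = {}
    for subject, predicate, value in assertions:
        if predicate == prop:
            values.setdefault(subject, set()).add(value)
    for subject in values:
        types[subject] = {domain_class}
        values[subject].add(required_value)
    return types, values


# The single property assertion from the ontology.
ASSERTIONS = [(":Ind1", ":Prop1", "p")]

types, values = apply_domain(ASSERTIONS, ":Prop1", ":Class1", "class1")
print(types)
print(values)
```

Under this reading :Ind1 is classified as :Class1 as soon as it has any :Prop1 value, and the hasValue restriction becomes a consequence (:Ind1 :Prop1 "class1") rather than a precondition. A conditional classification of the kind expected in the question would normally be stated as a class definition instead of a domain.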