I have the following sample data:
@prefix hr: <http://ex.com/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sch: <http://schema.org/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
hr:AAAA a rdfs:Class .
hr:BBBB a rdfs:Class .
hr:ClassA a rdfs:Class ;
rdfs:subClassOf hr:AAAA ;
rdfs:subClassOf hr:BBBB .
hr:ClassB a rdfs:Class ;
rdfs:subClassOf hr:ClassA .
hr:ClassC a rdfs:Class ;
rdfs:subClassOf hr:ClassB .
hr:ClassD a rdfs:Class ;
rdfs:subClassOf hr:AAAA .
hr:ClassE a rdfs:Class ;
rdfs:subClassOf hr:ClassD .
What I would like to get back is a set of strings describing the full rdfs:subClassOf property paths leading from one node to another. For this data, the set of strings would look like, or be similar to, the following:
hr:ClassC -> hr:ClassB -> hr:ClassA -> hr:AAAA
hr:ClassC -> hr:ClassB -> hr:ClassA -> hr:BBBB
hr:ClassB -> hr:ClassA -> hr:AAAA
hr:ClassB -> hr:ClassA -> hr:BBBB
hr:ClassA -> hr:AAAA
hr:ClassA -> hr:BBBB
hr:ClassD -> hr:AAAA
hr:ClassE -> hr:ClassD -> hr:AAAA
Is this possible with SPARQL? If so, what would the query look like?
The following query seems like it is close:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ( GROUP_CONCAT( ?subclass; SEPARATOR = " -> " ) AS ?hierarchy )
WHERE {
?class rdfs:subClassOf+ ?subclass .
}
GROUP BY ?class
but the results are:
http://ex.com/AAAA
http://ex.com/AAAA -> http://ex.com/BBBB
http://ex.com/ClassA -> http://ex.com/AAAA -> http://ex.com/BBBB
http://ex.com/ClassB -> http://ex.com/ClassA -> http://ex.com/AAAA -> http://ex.com/BBBB
http://ex.com/ClassD -> http://ex.com/AAAA
I cannot explain where:
http://ex.com/AAAA -> http://ex.com/BBBB
is coming from or why hr:ClassE is missing from the results.
This is not possible with SPARQL.
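For what it is worth, the puzzling rows can be explained by the grouping: the results are grouped by ?class, but ?class itself is not projected. The row http://ex.com/AAAA -> http://ex.com/BBBB is the group for hr:ClassA (whose transitive superclasses are hr:AAAA and hr:BBBB), hr:ClassE is not missing but is the group behind http://ex.com/ClassD -> http://ex.com/AAAA, and the lone http://ex.com/AAAA row is the group for hr:ClassD. Projecting the group key makes this visible; note, though, that the order of values inside GROUP_CONCAT is not guaranteed by the SPARQL specification, which is why the concatenation cannot be relied on to spell out a path in order. A small variation of the query above:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ( GROUP_CONCAT( ?subclass; SEPARATOR = " -> " ) AS ?hierarchy )
WHERE {
  ?class rdfs:subClassOf+ ?subclass .
}
GROUP BY ?class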
We are developing an ontology where we need to infer the type of each node, based on some mappings defined in the "# Test mapping file" section below. We tried rdfs:subClassOf, as in the snippet below, but it doesn't infer types. Ideally, we want Person 1 and Person 2 to be inferred as NodeTypePerson, and sydney and canberra as NodeTypeCity.
I tried owl:equivalentClass instead, but no luck. rdfs:range and rdfs:domain infer types as expected, but I am having trouble inferring with rdfs:subClassOf. Any advice is highly appreciated.
UPDATE:
owl:equivalentClass works in Protégé (it infers the type). But can't we use rdfs:subClassOf similarly? What I actually want is a way to get this done with RDFS inferencing, where owl:equivalentClass obviously doesn't work. Is there any other RDFS property we can use here?
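As a side note, owl:equivalentClass is an OWL term, so a pure RDFS reasoner is not required to act on it. Type inference over rdfs:subClassOf is governed by RDFS entailment rule rdfs9 (if ?x is an instance of ?c and ?c is a subclass of ?d, then ?x is an instance of ?d), together with rule rdfs11 (transitivity of rdfs:subClassOf). A rough SPARQL CONSTRUCT sketch of the triples an RDFS reasoner would be expected to add, given purely to illustrate the rule and not as part of the ontology below:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# An instance of a class is also an instance of every
# (transitive) superclass of that class.
CONSTRUCT { ?x a ?super }
WHERE {
  ?x a ?class .
  ?class rdfs:subClassOf+ ?super .
}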
# baseURI: http://www.Test-app.com/ns
@prefix : <http://www.Test-app.com/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix dcam: <http://purl.org/dc/dcam/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix posixstat: <http://www.w3.org/ns/posix/stat#> .
@prefix schema: <https://schema.org/> .
@prefix shacl: <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix solid: <http://www.w3.org/ns/solid/terms#> .
@prefix swapdoc: <http://www.w3.org/2000/10/swap/pim/doc#> .
@prefix ui: <http://www.w3.org/ns/ui#> .
@prefix vann: <http://purl.org/vocab/vann/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix vs: <http://www.w3.org/2003/06/sw-vocab-status/ns#> .
@prefix ws: <http://www.w3.org/ns/pim/space#> .
@base <http://www.Test-app.com/ns> .
<http://www.Test-app.com/ns>
rdf:type owl:Ontology ;
dc:title "Test Ontology"#en ;
owl:versionIRI <http://www.Test-app.com/ns/0.1> .
# Test mapping file
@prefix test: <http://www.Test-app.com/ns#> .
:Person
a rdfs:Class ;
owl:equivalentClass :NodeTypePerson .
:locatedNear
a rdf:Property ;
rdfs:subClassOf :NodeCustomAttribute ;
rdfs:label "Located Near" .
:Location
a rdfs:Class ;
rdfs:label "Location" ;
rdfs:subClassOf :NodeTypeCity ;
rdfs:comment "This represents a geolocation." .
:City
a rdfs:Class ;
rdfs:label "City" ;
rdfs:comment "This represents a city." ;
rdfs:subClassOf :Location .
# Some graph instances data
:sydney
a :City ;
rdfs:label "Sydney" ;
rdfs:comment "Australia's largest city." .
:canberra
a :City ;
rdfs:label "Canberra" ;
rdfs:comment "Australia's national capital." .
:person1
a :Person ;
rdfs:label "Person 1" ;
rdfs:comment "First vertex." ;
:locatedNear :sydney .
:person2
a :Person ;
rdfs:label "Person 2" ;
rdfs:comment "Second vertex." ;
:locatedNear :canberra .
# Types
:NodeTypePerson
rdf:type rdfs:Class ;
rdfs:label "NodeTypePerson" ;
rdfs:comment "" ;
vs:term_status "stable" ;
rdfs:isDefinedBy <http://www.Test-app.com/ns> .
:NodeTypeCity
rdf:type rdfs:Class ;
rdfs:label "NodeTypeCity" ;
rdfs:comment "" ;
vs:term_status "stable" ;
rdfs:isDefinedBy <http://www.Test-app.com/ns> .
I have the following Data & Shape Graph.
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:prefixes hr: ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:property [
sh:path rdf:type ;
sh:nodeKind sh:IRI ;
sh:hasValue rdfs:Class
] ;
.
Using pySHACL:
import rdflib
from pyshacl import validate
full_graph = open( "/Users/jamesh/jigsaw/shacl_work/data_graph.ttl", "r" ).read()
g = rdflib.Graph().parse( data = full_graph, format = 'turtle' )
report = validate( g, inference='rdfs', abort_on_error = False, meta_shacl = False, debug = False )
print( report[2] )
What I think should happen is that the SPARQL-based target selects every subject in the data graph, and the shape then verifies that there is an rdf:type path whose value is rdfs:Class.
I get the following result:
Validation Report
Conforms: True
The expected validation errors should include only the following subjects:
| <http://learningsparql.com/ns/humanResources#BadOne> |
| <http://learningsparql.com/ns/humanResources#BadTwo> |
| <http://learningsparql.com/ns/humanResources#BadThree> |
| <http://learningsparql.com/ns/humanResources#AnotherName> |
| <http://learningsparql.com/ns/humanResources#name> |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> |
Is this possible with SHACL? If so, what should the shape file be?
What follows produces the expected validation errors; however, there are still several things I do not understand.
The sh:prefixes hr: ; declaration is not needed. It is designed to supply prefix declarations for the SPARQL SELECT text of the target itself and nothing more (and it only has an effect when the referenced node carries sh:declare prefix declarations, which hr: does not).
Inference needed to be disabled: it was inserting triples and then trying to validate them, which is not what is desired in this use case. What should be validated is what is in the schema and nothing else. (With pySHACL, simply omit the inference argument; by default no inferencing is performed.)
I had also assumed it would not be an issue to put everything into a single graph, based on what was apparently a misunderstanding of https://github.com/RDFLib/pySHACL/issues/46.
graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""
shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:prefixes hr: ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:property [
sh:path ( rdf:type [ sh:zeroOrMorePath rdf:type ] ) ;
sh:nodeKind sh:IRI ;
sh:hasValue rdfs:Class
] ;
.
'''
data = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
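For readers less familiar with SHACL property paths: the sequence path ( rdf:type [ sh:zeroOrMorePath rdf:type ] ) corresponds to the SPARQL property path rdf:type/rdf:type*, i.e. rdf:type+, so the constraint checks that following one or more rdf:type steps from each focus node eventually reaches rdfs:Class. The same check expressed directly in SPARQL against the data graph, shown for a single node purely as an illustration and not as part of the pySHACL solution, would be:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hr: <http://learningsparql.com/ns/humanResources#>
# True for hr:Another: hr:Another -> hr:Employee -> rdfs:Class in two rdf:type steps.
ASK { hr:Another rdf:type+ rdfs:Class }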
An alternative using a SPARQL-based constraint would look like:
graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""
shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:SchemaShape
a sh:NodeShape ;
sh:target [
a sh:SPARQLTarget ;
sh:select """
SELECT ?this
WHERE {
?this ?p ?o .
}
""" ;
] ;
sh:sparql [
a sh:SPARQLConstraint ;
sh:message "Node does not have type rdfs:Class." ;
sh:prefixes hr: ;
sh:select """
SELECT $this
WHERE {
$this rdf:type ?o .
FILTER NOT EXISTS {
?o rdf:type* rdfs:Class
}
FILTER ( strstarts( str( $this ), str( hr: ) ) )
}
""" ;
]
.
'''
data = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
I have the following sample graph of services that includes dependency information:
@base <https://meta.acme.com/> .
@prefix : <http://schema.meta.acme.com/> .
@prefix dc: <http://purl.org/dc/terms/> .
</service/c84acffd-944a-43c1-8f06-956a4a6033da>
a :Service ;
dc:title "Service 1" ;
:serviceDependency
</service/70987802-9157-4881-ab0c-049b04b7798d>,
</service/2b47109e-26e3-4d06-b245-98730bdb7d43>.
</service/70987802-9157-4881-ab0c-049b04b7798d>
a :Service ;
dc:title "Service 2" ;
:serviceDependency
</service/c84acffd-944a-43c1-8f06-956a4a6033da>,
</service/f45b998c-b496-4318-be40-46c1aafaf6cd> .
</service/2b47109e-26e3-4d06-b245-98730bdb7d43>
a :Service ;
dc:title "Service 3" ;
:serviceDependency </service/> .
</service/f45b998c-b496-4318-be40-46c1aafaf6cd>
a :Service ;
dc:title "Service 4" ;
:serviceDependency </service/> .
I'm writing a SPARQL query to find cycles in the dependency relationships. I have a query that is producing correct results, but it yields pseudo-duplicates.
For example:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>
select ?a ?aname ?b ?bname
where
{
{
?a a c:Service ;
dc:title ?aname ;
c:serviceDependency ?b .
?b dc:title ?bname .
} filter ( EXISTS { ?b c:serviceDependency ?a } )
}
yields the following output:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| a | aname | b | bname |
===========================================================================================================================================================================
| <https://meta.acme.com/service/70987802-9157-4881-ab0c-049b04b7798d> | "Service 2" | <https://meta.acme.com/service/c84acffd-944a-43c1-8f06-956a4a6033da> | "Service 1" |
| <https://meta.acme.com/service/c84acffd-944a-43c1-8f06-956a4a6033da> | "Service 1" | <https://meta.acme.com/service/70987802-9157-4881-ab0c-049b04b7798d> | "Service 2" |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Again, this is as expected; however, the information that I want the query to reflect is the cycle itself and not both sides of the cycle.
My thinking for how to solve this is to calculate an identifier by sorting and concatenating the two IDs and then grouping by that identifier, but I wanted to ask whether there is a more natural way to do this in SPARQL?
thanks!
Are these always one-step loops?
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>
select ?a ?aname ?b ?bname
where
{
?a a c:Service ;
dc:title ?aname ;
c:serviceDependency ?b .
?b dc:title ?bname ;
c:serviceDependency ?a
}
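If only one row per two-node cycle is wanted, a common SPARQL idiom (not part of the answer above, just a standard trick) is to impose an order on the pair so that each mutual dependency is reported exactly once:
prefix c: <http://schema.meta.acme.com/>
prefix dc: <http://purl.org/dc/terms/>
select ?a ?aname ?b ?bname
where
{
  ?a a c:Service ;
     dc:title ?aname ;
     c:serviceDependency ?b .
  ?b dc:title ?bname ;
     c:serviceDependency ?a .
  # Keep only one orientation of each mutual dependency.
  filter ( str(?a) < str(?b) )
}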
I could not find the right way to phrase this question, so I would like to explain my problem using sample data.
Let's say the triples are as below.
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .
@prefix p0: <http://www.mlacustom.com#> .
@prefix p2: <http://www.mla.com/term/> .
_:bnode7021016689601753065 p0:lastModifiedDateTime "2018-09-14T12:55:38"^^xs:dateTime ;
p0:lastModifiedUser "admin"^^xs:string .
<http://www.mla.com/name/4204078359> p0:hasClassingFacet "http://www.mla.com/facet/RA"^^xs:string ;
p0:type "Normal"^^xs:string ;
p0:classification _:bnode3452184423513029143 ,
_:bnode6827572371999795686 ;
p0:recordType "Name"^^xs:string ;
p0:recordNumber "4204078359"^^xs:string ;
p0:stdDescriptor "classification111111"^^xs:string ;
p0:establishedBy "admin"^^xs:string ;
skos:prefLabel "classification111111"^^xs:string ;
p0:createdBy "admin"^^xs:string ;
a skos:Concept ;
p0:createdDate "2018-09-14T12:55:38"^^<xsd:dateTime> ;
p0:establishedDate "2018-09-14T12:55:38"^^<xsd:dateTime> ;
p0:hasRAsubFacet "http://www.mla.com/subfacet/classing-subject-authors"^^xs:string ;
p0:lastModifiedDate "2018-09-14T12:55:38"^^<xsd:dateTime> ;
p0:lastModifiedDetails _:bnode7021016689601753065 ;
p0:isProblematic "N,N"^^xs:string ;
p0:lastModifiedBy "admin"^^xs:string ;
p0:status "established"^^xs:string .
_:bnode3452184423513029143 p0:literature p2:1513 ;
p0:timePeriod p2:1005 ;
p0:language p2:3199 .
_:bnode6827572371999795686 p0:literature p2:11307 ;
p0:timePeriod p2:1009 ;
p0:language p2:31 .
Please have a look at p0:classification: it has two blank nodes, and both blank nodes have triples with p0:literature, p0:timePeriod, and p0:language.
Now I want to write a SPARQL query where
(
(p0:literature is p2:1513 AND p0:timePeriod is p2:1005) AND
(p0:literature is p2:11307 AND p0:timePeriod is p2:1009)
)
As per the above scenario, it should return the subject http://www.mla.com/name/4204078359.
The classification can have any number of blank nodes.
Here is one solution; many different queries are possible. This one is explicit:
PREFIX p0: <http://www.mlacustom.com#>
PREFIX p2: <http://www.mla.com/term/>
SELECT ?s WHERE {
  ?s p0:classification ?o1 .
  ?s p0:classification ?o2 .
  ?o1 p0:literature p2:1513 ;
      p0:timePeriod p2:1005 .
  ?o2 p0:literature p2:11307 ;
      p0:timePeriod p2:1009 .
}
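Since the classification can have any number of blank nodes, a variant that scales to more (literature, timePeriod) combinations without adding a new triple block per pair could use VALUES plus a HAVING check. This is only a sketch of the idea, not part of the answer above; it relies on the listed combinations having distinct p0:literature values:
PREFIX p0: <http://www.mlacustom.com#>
PREFIX p2: <http://www.mla.com/term/>
SELECT ?s WHERE {
  # Each row is one required (literature, timePeriod) combination.
  VALUES ( ?lit ?period ) {
    ( p2:1513  p2:1005 )
    ( p2:11307 p2:1009 )
  }
  ?s p0:classification ?o .
  ?o p0:literature ?lit ;
     p0:timePeriod ?period .
}
GROUP BY ?s
# Keep only subjects for which every listed combination was matched.
HAVING ( COUNT( DISTINCT ?lit ) = 2 )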
I am trying to run a text search query with Jena via the command line:
tdbquery --desc textsearch.ttl --query search.rq
The query returns empty results with the following messages:
17:23:46 WARN TextQueryPF :: Failed to find the text index : tried context and as a text-enabled dataset
17:23:46 WARN TextQueryPF :: No text index - no text search performed
My assembler file is:
@prefix : <http://localhost/jena_example/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
## Initialize text query
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset rdfs:subClassOf ja:RDFDataset .
# Lucene index
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
# Solr index
text:TextIndexSolr rdfs:subClassOf text:TextIndex .
## ---------------------------------------------------------------
## This URI must be fixed - it's used to assemble the text dataset.
:text_dataset rdf:type text:TextDataset ;
text:dataset <#dataset> ;
text:index <#indexLucene> ;
.
# A TDB dataset used for RDF storage
<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB2" ;
tdb:unionDefaultGraph true ; # Optional
.
# Text index description
<#indexLucene> a text:TextIndexLucene ;
text:directory <file:Lucene2> ;
##text:directory "mem" ;
text:entityMap <#entMap> ;
.
# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
text:entityField "uri" ;
text:defaultField "text" ;
text:map (
[ text:field "text" ; text:predicate rdfs:label ]
) .
My query is:
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s
{ ?s text:query 'London' ;
rdfs:label ?label
}
I would like to know if I am missing any configuration, or whether this query can only be done inside Fuseki.
First, you can do text search outside Fuseki. The example you took the code from shows how to do this using a plain Jena Dataset in Java.
Second, Andy Seaborne on the Jena mailing list suggests the following:
SELECT (count(*) AS ?C) { ?x text:query .... }
in order to "touch" the index before running the real queries.