I have a number of triples organized as follows.
:A :hasB :B
:B :hasC :C
:C :hasD :D
:D :hasE :E
............
:X :hasY :Y
:Y :hasZ :Z
All the predicates are unique.
I need to write two SPARQL queries.
Query 1 should find all the predicates on the path from :A to :Z through a transitive query (something like :A :has* :Z). Output 1 should look like the following.
Output 1:
--------
hasB
hasC
hasD
....
hasZ
Query 2 should find the triples between :A and :Z through a transitive query. Output 2 should look like the following.
Output 2:
--------
:B :hasC :C
:C :hasD :D
:D :hasE :E
............
:X :hasY :Y
Please let me know how to write these transitive SPARQL queries.
SPARQL has some obvious limitations here, since property paths can only test that two nodes are connected and cannot return the path between them. Possible workarounds follow:
If there are no other predicates besides has[A-Z]:
Sample Data
@prefix : <http://ex.org/> .
:A :hasB :B .
:B :hasC :C .
:C :hasD :D .
:D :hasE :E .
Query
prefix : <http://ex.org/>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 .
?v2 (<p>|!<p>)* ?end .
}
Output
---------
| p |
=========
| :hasB |
| :hasC |
| :hasD |
| :hasE |
---------
If there are other predicates besides has[A-Z]:
Sample Data
@prefix : <http://ex.org/> .
:A :hasB :B .
:B :hasC :C .
:C :hasD :D .
:C :notHasD :D .
:D :hasE :E .
Introduce a super property :has:
Additional Data:
@prefix : <http://ex.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:hasB rdfs:subPropertyOf :has .
:hasC rdfs:subPropertyOf :has .
:hasD rdfs:subPropertyOf :has .
:hasE rdfs:subPropertyOf :has .
Query:
prefix : <http://ex.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 . ?p rdfs:subPropertyOf :has .
?v2 (<p>|!<p>)* ?end .
}
Output
---------
| p |
=========
| :hasB |
| :hasC |
| :hasD |
| :hasE |
---------
Use REGEX on property URI:
prefix : <http://ex.org/>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 .
FILTER(REGEX(STRAFTER(STR(?p), STR(:)), '^has[A-Z]$'))
?v2 (<p>|!<p>)* ?end .
}
Note that none of the proposed solutions works on every kind of data, especially once there are multiple paths and/or cycles. In that case, you should use a proper graph database.
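For Query 2, the same pattern can return whole triples instead of just the predicate. The following is only a sketch, under the same assumption as the queries above (a single, acyclic chain between the two nodes). It uses + rather than * on both sides so that the first and last triples of the chain are left out, as in the requested Output 2, and it writes the wildcard path as (:|!:), which matches any property just like (<p>|!<p>) but does not rely on a base IRI.

```sparql
prefix : <http://ex.org/>
select ?v1 ?p ?v2
where {
  values (?start ?end) { (:A :E) }
  # at least one step away from ?start, so the first triple of the chain is excluded
  ?start (:|!:)+ ?v1 .
  ?v1 ?p ?v2 .
  # at least one step before ?end, so the last triple of the chain is excluded
  ?v2 (:|!:)+ ?end .
}
```

On the first sample data this should return :B :hasC :C and :C :hasD :D. Replacing + with * on both sides includes the two end triples as well.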
Related
I need some help starting with SPARQL:
I have some individuals (A, B, C). I want to state, that they are all the same. I’ve tried the following:
PREFIX : <http://test.com#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
CONSTRUCT {
?p owl:sameAs ?p
}
WHERE {
} VALUES (?p) {
(:A)
(:B)
(:C)
}
The result is:
:B owl:sameAs :B .
:A owl:sameAs :A .
:C owl:sameAs :C .
What I would like to have is something like this:
:A owl:sameAs :B
:A owl:sameAs :C
:B owl:sameAs :A
:B owl:sameAs :C
:C owl:sameAs :A
:C owl:sameAs :B
Do you have any hints for me, how to do that?
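One possible approach, sketched here rather than taken from an answer in the thread: bind the same list of individuals to two variables, let the join produce every pair, and filter out the pairs where both sides are the same individual. The second variable ?q is introduced only for this example.

```sparql
PREFIX : <http://test.com#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
CONSTRUCT {
  ?p owl:sameAs ?q
}
WHERE {
  # the two VALUES blocks share no variable, so joining them gives all 9 pairs
  VALUES ?p { :A :B :C }
  VALUES ?q { :A :B :C }
  # drop the 3 reflexive pairs, leaving the 6 wanted triples
  FILTER (?p != ?q)
}
```

Since owl:sameAs is symmetric and transitive, an OWL reasoner would infer the same pairs from fewer statements; the query simply makes them explicit.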
I am trying to learn about semantic web technology to see if it can replace the current relational model in an application. I am using Protégé to design the ontology and Jena for the application, which will be hosted on Apache TomEE.
I have created a simplified ontology to explain my question. In people.owl (Turtle below) there is only one class, Person. The object property hasChild links a Person individual to his or her children, who are also Person individuals. Now I want to know the order of the children for a person.
For example, the individual john has two children, with anne being first and jack being second. So I have added a sequence annotation to john's hasChild assertions. The sequence is a number that gives the order in which the child was born for that person. So john's hasChild anne has sequence 1 and hasChild jack has sequence 2. Similarly, jane's hasChild jack has sequence 1.
The SPARQL I have to write is to get all the children of a given Person's name in the order they were born. So if I query children of "John Doe", I should get
|sequence | childname|
| 1 | Anne Doe |
| 2 | Jack Doe |
Below is the Ontology in turtle format.
@prefix : <http://www.semanticweb.org/abhishek/ontologies/people#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix people: <http://www.semanticweb.org/abhishek/ontologies/people#> .
@base <http://www.semanticweb.org/abhishek/ontologies/people> .
<http://www.semanticweb.org/abhishek/ontologies/people> rdf:type owl:Ontology .
### http://www.semanticweb.org/abhishek/ontologies/people#sequence
people:sequence rdf:type owl:AnnotationProperty .
### http://www.semanticweb.org/abhishek/ontologies/people#hasChild
people:hasChild rdf:type owl:ObjectProperty .
### http://www.semanticweb.org/abhishek/ontologies/people#hasParent
people:hasParent rdf:type owl:ObjectProperty .
### http://www.semanticweb.org/abhishek/ontologies/people#hasSpouse
people:hasSpouse rdf:type owl:ObjectProperty .
### http://www.semanticweb.org/abhishek/ontologies/people#name
people:name rdf:type owl:DatatypeProperty .
### http://www.semanticweb.org/abhishek/ontologies/people#Person
people:Person rdf:type owl:Class .
### http://www.semanticweb.org/abhishek/ontologies/people#anne
people:anne rdf:type owl:NamedIndividual ,
people:Person ;
people:name "Anne Doe"^^xsd:string .
### http://www.semanticweb.org/abhishek/ontologies/people#jack
people:jack rdf:type owl:NamedIndividual ,
people:Person ;
people:name "Jack Doe"^^xsd:string .
### http://www.semanticweb.org/abhishek/ontologies/people#jane
people:jane rdf:type owl:NamedIndividual ,
people:Person ;
people:hasChild people:jack ;
people:hasSpouse people:john ;
people:name "Jane Doe"^^xsd:string .
[ rdf:type owl:Axiom ;
owl:annotatedSource people:jane ;
owl:annotatedProperty people:hasChild ;
owl:annotatedTarget people:jack ;
people:sequence "1"^^xsd:int
] .
### http://www.semanticweb.org/abhishek/ontologies/people#john
people:john rdf:type owl:NamedIndividual ,
people:Person ;
people:hasChild people:anne ,
people:jack ;
people:hasSpouse people:jane ;
people:name "John Doe"^^xsd:string .
[ rdf:type owl:Axiom ;
owl:annotatedSource people:john ;
owl:annotatedProperty people:hasChild ;
owl:annotatedTarget people:anne ;
people:sequence "1"^^xsd:int
] .
[ rdf:type owl:Axiom ;
owl:annotatedSource people:john ;
owl:annotatedProperty people:hasChild ;
owl:annotatedTarget people:jack ;
people:sequence "2"^^xsd:int
] .
### Generated by the OWL API (version 4.2.6.20160910-2108) https://github.com/owlcs/owlapi
This is the query I have so far:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ppl: <http://www.semanticweb.org/abhishek/ontologies/people#>
SELECT ?sequence ?childname
WHERE { ?person rdf:type ppl:Person .
?person ppl:name ?name .
?person ppl:hasChild ?child .
#Get the ?sequence for the current child. sequence is annotated to hasChild.
?child ppl:name ?childname .
FILTER regex(?name, "John Doe") }
So I have two questions:
Should I change the ontology design to assign the sequence number to a child?
If the above approach is correct, how do I get the sequence for a child?
I posted the same question on the Jena users list and was advised to use RDF collections. But from a Google search I found that collections don't work with OWL (and hence Protégé).
Thanks,
Abhishek
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ppl: <http://www.semanticweb.org/abhishek/ontologies/people#>
SELECT ?sequence ?childname
WHERE { ?person rdf:type ppl:Person .
?person ppl:name ?name .
?person ppl:hasChild ?child .
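# The sequence is stored on the reified owl:Axiom that annotates the hasChild assertion.
?child ppl:name ?childname .
?annotation owl:annotatedSource ?person ;
owl:annotatedProperty ppl:hasChild ;
owl:annotatedTarget ?child ;
ppl:sequence ?sequence .
FILTER regex(?name, "John Doe") }
# Return the children in birth order, as the question asks.
ORDER BY ?sequence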
This is my query
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
{
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
}
union
{
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
}
filter (?x != ?item)
}
group by ?item
In my first bind clause, I would like to bind not just the variable ?p but also the variable ?instance, plus some text such as "that is why".
So the first bind should produce the following:
?p that is why ?instance
Is that possible in SPARQL?
Please don't worry about whether the data makes sense; it is just a query to illustrate my question.
If I understand you correctly, you're just looking for the concat function. As I've mentioned before, you should really browse through the SPARQL 1.1 standard, at least through the table of contents. You don't need to memorize it, but it will give you an idea of what things are possible, and an idea of where to look. Additionally, it's very helpful if you provide sample data that we can work with, because it makes it much clearer to figure out what you're trying to do. The phrasing of your title was not particularly clear, and the question doesn't really provide an example of what you're trying to accomplish. Only because I've seen some of your past questions did I have an idea of what you were aiming for. At any rate, here's some data:
@prefix : <urn:ex:> .
:p :hasSimilarity 0.3 .
:A :hasSimilarity 0.6 .
:a :p :b ; #-- :a is related to :b
a :A . #-- and is an :A .
:c :p :b . #-- :c is also related to :b
:d a :A . #-- :d is also an :A .
:e :p :b ; #-- :e is related to :b
a :A . #-- and is also an :A .
And here's the query and its results. You just use concat to join the str form of your variables with the appropriate strings and then bind the result to the variable.
prefix : <urn:ex:>
select ?item
(sum(?factor_) as ?factor)
(group_concat(distinct ?reason_; separator=", ") as ?reason)
{
values ?x { :a }
{ ?x ?p ?instance .
?item ?p ?instance .
?p :hasSimilarity ?factor_ .
bind(concat("has common ",str(?p)," value ",str(?instance)) as ?reason_) }
union
{ ?x a ?class.
?item a ?class.
?class :hasSimilarity ?factor_ .
bind(concat("has common class ",str(?class)) as ?reason_)
}
filter (?x != ?item)
}
group by ?item
-----------------------------------------------------------------------------------
| item | factor | reason |
===================================================================================
| :c | 0.3 | "has common urn:ex:p value urn:ex:b" |
| :d | 0.6 | "has common class urn:ex:A" |
| :e | 0.9 | "has common urn:ex:p value urn:ex:b, has common class urn:ex:A" |
-----------------------------------------------------------------------------------
Take this graph:
:thing1 a :Foo ;
:has :A ;
:has :B .
:thing2 a :Foo ;
:has :B ;
:has :A .
:thing3 a :Foo ;
:has :A ;
:has :B ;
:has :C .
I want to select :thing1 and :thing2, but NOT :thing3.
Here is the SPARQL query I wrote that works. Is there a better way to do this?
SELECT ?foo WHERE {
?foo a :Foo ;
:has :A ;
:has :B .
MINUS {
?foo a :Foo ;
:has :A ;
:has :B ;
:has ?anythingElse .
FILTER(?anythingElse != :A && ?anythingElse != :B)
}
}
An alternative to MINUS is FILTER NOT EXISTS:
SELECT ?foo WHERE {
?foo a :Foo ;
:has :A, :B .
FILTER NOT EXISTS {
?foo :has ?other .
FILTER (?other NOT IN (:A, :B))
}
}
which says, loosely, find all ?foo with :A and :B, then check that they have no other :has value.
In terms of execution efficiency, there are optimizations that turn some MINUS patterns into FILTER NOT EXISTS and vice versa, and there is also the possibility of sharing common sub-patterns.
Without an optimizer being that smart, the FILTER NOT EXISTS is likely to be faster because the "?foo a :Foo ; :has :A, :B ." is not repeated and the FILTER only considers items that have already passed the "?foo a :Foo ; :has :A, :B .".
The only way to know is to try it for real on real data, where all effects, including caching, come together.
You can do this using the NOT IN operator instead of a boolean expression, and indeed there is no need to repeat the three triple patterns if you replace the MINUS clause with a FILTER NOT EXISTS clause:
SELECT ?foo WHERE {
?foo a :Foo ;
:has :A, :B .
FILTER NOT EXISTS {
?foo :has ?other .
FILTER (?other NOT IN (:A, :B))
}
}
I doubt there'll be a significant difference in performance, but the query is shorter and easier to read.
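A third formulation, offered only as a sketch, counts the :has values instead of testing for the absence of others: require :A and :B, then keep only subjects with exactly two distinct :has values. The prefix IRI below is a placeholder, since the question does not give one.

```sparql
PREFIX : <http://example.org/>   # placeholder namespace, not from the question
SELECT ?foo WHERE {
  ?foo a :Foo ;
       :has :A, :B ;   # both required values must be present
       :has ?v .       # one solution row per :has value of ?foo
}
GROUP BY ?foo
HAVING (COUNT(DISTINCT ?v) = 2)   # no :has values besides those two
```

This relies on the required set having a known size. For the sample graph it should select :thing1 and :thing2 but not :thing3, which has three distinct values.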
How can I find the distance between two nodes in a graph using Virtuoso? I've read the Transitivity documentation, but the examples limit you to one predicate, e.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s foaf:knows ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
This only traverses foaf:knows, not arbitrary predicates. How can I extend it to any predicate? I don't need the actual path, just a true/false answer (an ASK query). Changing foaf:knows to ?p seems like overkill.
I'm currently performing a set of recursive ASKs to find out if two nodes are connected within a specific distance but that doesn't seem efficient.
You should be able to use ?p instead of foaf:knows in your query to determine if there's a path between the nodes. E.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s ?p ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
Here's an approach that works if there's at most one path between the nodes that you're interested in.
If you have data like this (note that there are different properties connecting the resources):
@prefix : <https://stackoverflow.com/q/3914522/1281433/> .
:a :p :b .
:b :q :c .
:c :r :d .
Then a query like the following finds the distance between each pair of nodes. The property path (:|!:) consists of a property that is either : or something other than : (i.e., anything). Thus (:|!:)* is zero or more occurrences of any property; it's a wildcard path. Each ?mid is a node on the path from ?begin to ?end, both ends included, so counting them and subtracting one gives the number of edges. (The technique used here is described more fully in "Is it possible to get the position of an element in an RDF Collection in SPARQL?".)
prefix : <https://stackoverflow.com/q/3914522/1281433/>
select ?begin ?end (count(?mid)-1 as ?distance) where {
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
order by ?begin ?end ?distance
--------------------------
| begin | end | distance |
==========================
| :a | :a | 0 |
| :a | :b | 1 |
| :a | :c | 2 |
| :a | :d | 3 |
| :b | :b | 0 |
| :b | :c | 1 |
| :b | :d | 2 |
| :c | :c | 0 |
| :c | :d | 1 |
| :d | :d | 0 |
--------------------------
To just find out whether there's a path between two nodes that's less than some particular length, you use an ask query instead of a select, fix the values of ?begin and ?end, and restrict the value of count(?mid)-1 rather than binding it to ?distance. E.g., is there a path from :a to :d of length less than three?
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :d) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 3 ) )
Ask => No
On the other hand, there is a path from :a to :c with length less than 5:
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :c) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 5 ) )
Ask => Yes
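If only a yes/no answer for a small fixed bound is needed, another option is to spell the bound out in the property path itself. This sketch assumes a SPARQL 1.1 engine with property path support; how well a given Virtuoso version handles it is not something the answers above establish. Each (:|!:)? step matches zero or one edge of any predicate, so two of them in sequence match paths of length 0 to 2, i.e. less than three.

```sparql
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
  # zero, one, or two edges of any predicate from :a to :d
  :a (:|!:)?/(:|!:)? :d
}
```

On the three-triple sample data this should answer no, since :d is three steps from :a; with :c in place of :d it should answer yes. Unlike the counting approach, it does not depend on there being only one path between the nodes.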