How can I find the distance between two nodes in a graph using Virtuoso? I've read the Transitivity documentation, but it limits you to one predicate, e.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s foaf:knows ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
This only traverses foaf:knows, not any predicate type. How can I extend this to 'whatever predicate'? I don't need the actual path, just a true/false (ASK query). Changing the foaf:knows to ?p seems like overkill.
I'm currently performing a set of recursive ASKs to find out whether two nodes are connected within a specific distance, but that doesn't seem efficient.
You should be able to use ?p instead of foaf:knows in your query to determine if there's a path between the nodes. E.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s ?p ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
Here's an approach that works if there's at most one path between the nodes that you're interested in.
If you have data like this (note that there are different properties connecting the resources):
#prefix : <https://stackoverflow.com/q/3914522/1281433/>
:a :p :b .
:b :q :c .
:c :r :d .
Then a query like the following finds the distance between each pair of nodes. The property path (:|!:) consists of a property that is either : or something other than : (i.e., anything). Thus (:|!:)* is zero or more occurrences of any property; it's a wildcard path. (The technique used here is described more fully in Is it possible to get the position of an element in an RDF Collection in SPARQL?.)
prefix : <https://stackoverflow.com/q/3914522/1281433/>
select ?begin ?end (count(?mid)-1 as ?distance) where {
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
order by ?begin ?end ?distance
--------------------------
| begin | end | distance |
==========================
| :a | :a | 0 |
| :a | :b | 1 |
| :a | :c | 2 |
| :a | :d | 3 |
| :b | :b | 0 |
| :b | :c | 1 |
| :b | :d | 2 |
| :c | :c | 0 |
| :c | :d | 1 |
| :d | :d | 0 |
--------------------------
To just find out whether there's a path between two nodes that's less than some particular length, you use an ask query instead of a select, fix the values of ?begin and ?end, and restrict the value of count(?mid)-1 rather than binding it to ?distance. E.g., is there a path from :a to :d of length less than three?
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :d) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 3 ) )
Ask => No
On the other hand, there is a path from :a to :c with length less than 5:
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :c) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 5 ) )
Ask => Yes
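To see why count(?mid)-1 gives the distance when there is only one path, here is a minimal Python sketch of the same counting idea. It is only an illustration under stated assumptions: the names `edges`, `reachable_from`, `distance`, and `within` are hypothetical, predicates are ignored (as with the wildcard path), and the data is the three-edge chain from the example.

```python
# Edges of the sample data; predicate names are ignored, as with the (:|!:) wildcard.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
nodes = {n for edge in edges for n in edge}


def reachable_from(start):
    """Return every node reachable from start in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for s, o in edges:
            if s == node and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


reach = {n: reachable_from(n) for n in nodes}


def distance(begin, end):
    """Count the ?mid nodes lying between begin and end, minus one."""
    if end not in reach[begin]:
        return None  # no path at all, so the query would return no row
    mids = [m for m in reach[begin] if end in reach[m]]
    return len(mids) - 1


def within(begin, end, limit):
    """Mimic the ASK query: is there a path shorter than limit?"""
    d = distance(begin, end)
    return d is not None and d < limit
```

Here `within("a", "d", 3)` is false and `within("a", "c", 5)` is true, matching the two ASK results above. As with the query, the count only equals the distance when there is a single path between the two nodes.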
I have a number of triples organized like following.
:A :hasB :B
:B :hasC :C
:C :hasD :D
:D :hasE :E
............
:X :hasY :Y
:Y :hasZ :Z
All the predicates are unique.
I need to write two SPARQL queries.
Query 1 will find all the predicates between :A and :Z through a transitive query (something like this :A :has* :Z). Output 1 should look like the following.
Output 1:
--------
hasB
hasC
hasD
....
hasZ
Query 2 will find the triples between :A and :Z through a transitive query. Output 2 should look like the following.
Output 2:
--------
:B :hasC :C
:C :hasD :D
:D :hasE :E
............
:X :hasY :Y
Please let me know how to write these transitive SPARQL queries.
SPARQL has some obvious limitations as it's not a graph query language. Possible solutions below:
If there are no other predicates besides has[A-Z]:
Sample Data
#prefix : <http://ex.org/> .
:A :hasB :B .
:B :hasC :C .
:C :hasD :D .
:D :hasE :E .
Query
prefix : <http://ex.org/>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 .
?v2 (<p>|!<p>)* ?end .
}
Output
---------
| p |
=========
| :hasB |
| :hasC |
| :hasD |
| :hasE |
---------
If there are other predicates besides has[A-Z]:
Sample Data
#prefix : <http://ex.org/> .
:A :hasB :B .
:B :hasC :C .
:C :hasD :D .
:C :notHasD :D .
:D :hasE :E .
Introduce a super property :has:
Additional Data:
#prefix : <http://ex.org/> .
#prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:hasB rdfs:subPropertyOf :has .
:hasC rdfs:subPropertyOf :has .
:hasD rdfs:subPropertyOf :has .
:hasE rdfs:subPropertyOf :has .
Query:
prefix : <http://ex.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 . ?p rdfs:subPropertyOf :has .
?v2 (<p>|!<p>)* ?end .
}
Output
---------
| p |
=========
| :hasB |
| :hasC |
| :hasD |
| :hasE |
---------
Use REGEX on property URI:
prefix : <http://ex.org/>
select ?p
where {
values (?start ?end) { (:A :E) }
?start (<p>|!<p>)* ?v1 .
?v1 ?p ?v2 .
FILTER(REGEX(STRAFTER(STR(?p), STR(:)), 'has[A-Z]'))
?v2 (<p>|!<p>)* ?end .
}
Note that these solutions will not work on every kind of data, especially once you have multiple paths and/or cycles. In that case, you should use a proper graph database.
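For comparison, this is roughly what a traversal that copes with multiple paths and cycles looks like outside SPARQL. It is a minimal breadth-first-search sketch in plain Python, not a feature of any particular graph database; the function name `shortest_predicate_path` and the two extra triples (`backToA`, `shortcut`) are assumptions added only to create a cycle and a second route.

```python
from collections import deque

# Sample triples plus a cycle and a parallel route (both added for illustration;
# they are not part of the sample data above).
triples = [
    ("A", "hasB", "B"),
    ("B", "hasC", "C"),
    ("C", "hasD", "D"),
    ("D", "hasE", "E"),
    ("E", "backToA", "A"),   # closes a cycle
    ("A", "shortcut", "C"),  # second, shorter route to C
]


def shortest_predicate_path(start, end):
    """Breadth-first search; returns the predicates on one shortest path, or None."""
    queue = deque([(start, [])])
    visited = {start}  # each node is expanded once, so cycles terminate
    while queue:
        node, path = queue.popleft()
        if node == end:
            return path
        for s, p, o in triples:
            if s == node and o not in visited:
                visited.add(o)
                queue.append((o, path + [p]))
    return None
```

With this data, `shortest_predicate_path("A", "E")` follows the shortcut and returns `["shortcut", "hasD", "hasE"]`, while an unreachable target yields `None`.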
Let's say I make the following insertions into my GraphDB 8.3 triplestore:
PREFIX : <http://example.com/>
insert data { :hello a :word }
and
PREFIX : <http://example.com/>
insert data { graph :farewells { :goodbye a :word }}
Now, if I ask
select * where {
graph ?g {
?s ?p ?o .
}
}
I only get
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
| ?g | ?s | ?p | ?o |
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
| <http://example.com/farewells> | <http://example.com/goodbye> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://example.com/word> |
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
I can obviously get both "triples about words" with the following, but then the named-graph membership is not shown:
select * { ?s ?p ?o }
How can I write a query that retrieves both triples about words and indicates that { :goodbye a :word } comes from graph :farewells ?
You can do something along these lines:
SELECT *
WHERE {
{ GRAPH ?g { ?s ?p ?o } }
UNION
{ ?s ?p ?o .
FILTER NOT EXISTS { GRAPH ?g { ?s ?p ?o } }
}
}
The first part of the union selects all triples in named graphs. The second part grabs all triples in the default graph, explicitly excluding patterns that occur in a named graph.
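The logic of the two branches can be sketched in a few lines of Python. This is a toy in-memory model, not how GraphDB stores data: the names `default_graph`, `named_graphs`, and `union_query` are hypothetical, and it assumes a store whose default graph is the union of all graphs.

```python
# Hypothetical in-memory model of the two inserts from the question.
default_graph = {("hello", "a", "word")}
named_graphs = {"farewells": {("goodbye", "a", "word")}}


def match_default():
    """What `?s ?p ?o` sees when the default graph is the union of all graphs."""
    result = set(default_graph)
    for triples in named_graphs.values():
        result |= triples
    return result


def match_named():
    """What `GRAPH ?g { ?s ?p ?o }` sees: (graph, triple) pairs."""
    return {(g, t) for g, triples in named_graphs.items() for t in triples}


def union_query():
    """Branch 1: named-graph triples. Branch 2: default triples not in any named graph."""
    rows = set(match_named())
    in_named = {t for _, t in match_named()}
    rows |= {(None, t) for t in match_default() if t not in in_named}
    return rows
```

`union_query()` yields the :goodbye triple tagged with `"farewells"` and the :hello triple with no graph (`None`). One consequence of the exclusion: a triple stored both in the default graph and in a named graph is reported only with its named graph.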
In GraphDB, you could use pseudographs for this purpose, i.e. <http://www.ontotext.com/explicit> (it seems you are not using inferencing).
Try this query:
SELECT * FROM NAMED <http://www.ontotext.com/explicit>
{ GRAPH ?g { ?s ?p ?o } }
The result should be:
+------------------------------------+----------+-----------+-------+
| ?g | ?s | ?p | ?o |
+------------------------------------+----------+-----------+-------+
| <http://www.ontotext.com/explicit> | :hello | rdf:type | :word |
| :farewells | :goodbye | rdf:type | :word |
+------------------------------------+----------+-----------+-------+
For comparison, note that
SELECT * FROM NAMED <http://www.openrdf.org/schema/sesame#nil>
{ GRAPH ?g { ?s ?p ?o } }
will return only
+------------------------------------+----------+-----------+-------+
| ?g | ?s | ?p | ?o |
+------------------------------------+----------+-----------+-------+
| <http://www.ontotext.com/explicit> | :hello | rdf:type | :word |
+------------------------------------+----------+-----------+-------+
Short "answer": avoid putting data into the default graph in GraphDB (and other triple stores with a "virtual" default graph which is just a UNION of all named graphs)
Background: GraphDB decided to define the default graph as the union of all named graphs and the default graph. This behaviour is not backed by the SPARQL semantics specification, but is implementation-specific behaviour.
So you really have three options:
Use FILTER NOT EXISTS or MINUS as explained by Jeen Broekstra. This can have a serious negative impact on query performance.
Use the GraphDB pseudographs as exemplified by Stanislav Kralin. This option makes your queries (and your system) dependent on GraphDB -- you cannot change the SPARQL engine later without adapting your queries.
Avoid putting data into the default graph. You might define "your own" default graph, e.g. call it http://default/, and put it in the SPARQL FROM clause.
Other triple stores allow you to enable or disable this feature. I couldn't find such a switch in the GraphDB documentation; otherwise this would be the fourth, and my preferred, option.
This is my query
PREFIX : <http://example.org/rs#>
select ?item (SUM(?similarity) as ?summedSimilarity)
(group_concat(distinct ?becauseOf ; separator = " , ") as ?reason) where
{
values ?x {:instance1}
{
?x ?p ?instance.
?item ?p ?instance.
?p :hasSimilarityValue ?similarity
bind (?p as ?becauseOf)
}
union
{
?x a ?class.
?item a ?class.
?class :hasSimilarityValue ?similarity
bind (?class as ?becauseOf)
}
filter (?x != ?item)
}
group by ?item
In my first bind clause, I would like to bind not just the variable ?p but also the variable ?instance, plus some text such as "that is why".
So the first bind should produce the following:
?p that is why ?instance
Is that possible in SPARQL?
Please don't worry about whether the data makes sense; it is just a query to illustrate my question.
If I understand you correctly, you're just looking for the concat function. As I've mentioned before, you should really browse through the SPARQL 1.1 standard, at least through the table of contents. You don't need to memorize it, but it will give you an idea of what things are possible, and an idea of where to look. Additionally, it's very helpful if you provide sample data that we can work with, because it makes it much clearer to figure out what you're trying to do. The phrasing of your title was not particularly clear, and the question doesn't really provide an example of what you're trying to accomplish. Only because I've seen some of your past questions did I have an idea of what you were aiming for. At any rate, here's some data:
#prefix : <urn:ex:>
:p :hasSimilarity 0.3 .
:A :hasSimilarity 0.6 .
:a :p :b ; #-- :a is related to :b
a :A . #-- and is an :A .
:c :p :b . #-- :c is also related to :b
:d a :A . #-- :d is also an :A .
:e :p :b ; #-- :e is related to :b
a :A . #-- and is also an :A .
And here's the query and its results. You just use concat to join the str form of your variables with the appropriate strings and then bind the result to the variable.
prefix : <urn:ex:>
select ?item
(sum(?factor_) as ?factor)
(group_concat(distinct ?reason_; separator=", ") as ?reason)
{
values ?x { :a }
{ ?x ?p ?instance .
?item ?p ?instance .
?p :hasSimilarity ?factor_ .
bind(concat("has common ",str(?p)," value ",str(?instance)) as ?reason_) }
union
{ ?x a ?class.
?item a ?class.
?class :hasSimilarity ?factor_ .
bind(concat("has common class ",str(?class)) as ?reason_)
}
filter (?x != ?item)
}
group by ?item
-----------------------------------------------------------------------------------
| item | factor | reason |
===================================================================================
| :c | 0.3 | "has common urn:ex:p value urn:ex:b" |
| :d | 0.6 | "has common class urn:ex:A" |
| :e | 0.9 | "has common urn:ex:p value urn:ex:b, has common class urn:ex:A" |
-----------------------------------------------------------------------------------
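If it helps to see what the grouping step does with the rows produced by the two union branches, here is a rough Python equivalent of sum plus group_concat(distinct ...). The `rows` list is written out by hand from the sample data, and the variable names are assumptions; note also that SPARQL does not guarantee the order in which group_concat joins its values, whereas this sketch keeps first-seen order.

```python
from collections import defaultdict

# Rows as they would look before grouping: (item, factor, reason).
rows = [
    ("c", 0.3, "has common urn:ex:p value urn:ex:b"),
    ("d", 0.6, "has common class urn:ex:A"),
    ("e", 0.3, "has common urn:ex:p value urn:ex:b"),
    ("e", 0.6, "has common class urn:ex:A"),
]

factors = defaultdict(float)
reasons = defaultdict(list)
for item, factor, reason in rows:
    factors[item] += factor
    if reason not in reasons[item]:  # mirrors group_concat(distinct ...)
        reasons[item].append(reason)

# Round the sums so floating-point noise does not show up in the output.
summary = {
    item: (round(factors[item], 2), ", ".join(reasons[item]))
    for item in factors
}
```

`summary["e"]` comes out as `(0.9, "has common urn:ex:p value urn:ex:b, has common class urn:ex:A")`, matching the last row of the table.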
I am using DBpedia for the first time. I would like to download all the people in the dataset along with their commonName, nationality, birthDate, and knownFor properties (I will eventually put the results into an Excel spreadsheet, probably using some sort of scripting language).
This was my first attempt at a query to do this job, however it does not work. I tried piecing it together from other bits of code I've seen on the internet. Does anyone know how to fix this? Thanks
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
EDIT: NEW VERSION OF CODE: RETURNS COMMONNAME (WITH NON-ENGLISH DUPLICATES). STILL MISSING OTHER PROPERTIES.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName . FILTER(lang(?commonName) = 'en')
}
LIMIT 30
First, your query has a bunch of syntax issues:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
^ you probably want to use the dbpedia-owl properties which are
# in <http://dbpedia.org/ontology/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
^ space between ? and varname
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
^ This needs to be "?commonName . FILTER(..."
# and the same thing applies to your other
# filters
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
It's easier to build some of these queries incrementally, because then you can find out what properties some of the resources actually have, and then you can extend your query some more. The public endpoint has a number of predefined namespaces, and using those will make it easier for others to read your query. So, you can start by asking for people:
SELECT * WHERE {
?person a dbpedia-owl:Person .
}
LIMIT 10
Seeing that that's working, you can look at some of the returned instances and see that they have dbpedia-owl:commonName properties, and then extend the query:
SELECT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName .
}
LIMIT 10
It's easy enough to extend this with the dbpedia-owl:birthDate property. I don't see a nationality predicate on the instances that I've looked at, so I'm not sure what you were basing the nationality query on. While I saw some use of the knownFor property, I didn't see it on many instances, so if you make it a required property, you're going to exclude lots of people. This sort of incremental approach will probably help you out in the long run, though.
Finding Properties
While the browseable ontology provides a nice way to find classes, I'm not sure whether there's such a nice way of finding properties. However, you can do something in a brute force manner. E.g., to find all the properties that have actually been used for Persons, you can run a query like the following. (Note: this query takes a while to execute, so if you use it, you should probably download the results.)
select distinct ?p where {
[] a dbpedia-owl:Person ;
?p [] .
}
I'll note that dbpedia-owl:nationality does appear in that list.
To get all the properties for everything, you can download the ontology, and run a query like:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
select * where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
}
I ran this locally using Jena's ARQ:
$ arq --query properties.sparql --data dbpedia_3.8.owl
----------------------------------------------------------------------------
| p |
============================================================================
| <http://dbpedia.org/ontology/regionServed> |
| <http://dbpedia.org/ontology/coachedTeam> |
| <http://dbpedia.org/ontology/legalForm> |
| <http://dbpedia.org/ontology/goldenCalfAward> |
| <http://dbpedia.org/ontology/composer> |
| <http://dbpedia.org/ontology/owningOrganisation> |
| <http://dbpedia.org/ontology/branchFrom> |
| <http://dbpedia.org/ontology/iso6393Code> |
...
| <http://dbpedia.org/ontology/classification> |
| <http://dbpedia.org/ontology/bgafdId> |
| <http://dbpedia.org/ontology/currencyCode> |
| <http://dbpedia.org/ontology/onChromosome> |
| <http://dbpedia.org/ontology/course> |
| <http://dbpedia.org/ontology/frequentlyUpdated> |
| <http://dbpedia.org/ontology/distance> |
| <http://dbpedia.org/ontology/volume> |
| <http://dbpedia.org/ontology/description> |
----------------------------------------------------------------------------
That won't provide the rdfs:domain and rdfs:range, but you could also ask for these, or for just those properties with rdfs:domain dbpedia-owl:Person (but note that this won't get all the properties that could be used with Person, since the declared domain could be more or less specific):
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?p ?range where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
?p rdfs:domain dbpedia-owl:Person ; rdfs:range ?range .
}
$ arq --query properties.sparql --data dbpedia_3.8.owl | head
--------------------------------------------------------------------------------------------------------
| p | range |
========================================================================================================
| dbpedia-owl:restingPlacePosition | <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> |
| dbpedia-owl:opponent | dbpedia-owl:Person |
| dbpedia-owl:employer | dbpedia-owl:Organisation |
| dbpedia-owl:hometown | dbpedia-owl:Settlement |
| dbpedia-owl:militaryBranch | dbpedia-owl:MilitaryUnit |
| dbpedia-owl:school | dbpedia-owl:EducationalInstitution |
| dbpedia-owl:ethnicity | dbpedia-owl:EthnicGroup |
...
| dbpedia-owl:sex | xsd:string |
| dbpedia-owl:hipSize | xsd:double |
| dbpedia-owl:individualisedPnd | xsd:nonNegativeInteger |
| dbpedia-owl:weddingParentsDate | xsd:date |
| dbpedia-owl:birthName | xsd:string |
| dbpedia-owl:networth | xsd:double |
| dbpedia-owl:birthYear | xsd:gYear |
| dbpedia-owl:bustSize | xsd:double |
| dbpedia-owl:description | xsd:string |
--------------------------------------------------------------------------------------------------------
I'm not very good at SPARQL yet, but I do see some syntax issues here. I rewrote the query to look like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person ?commonName ?nationality ?knownFor ?birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonName .
FILTER (lang(?commonName) = 'en') .
?person prop:nationality ?nationality .
FILTER (lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor .
FILTER (lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
Now it at least runs the query without an error, but I'm not seeing any results, and I'm not sure why.
How can I find lists that contain each of a set of items in SPARQL? Let's say I have this data:
<http://foo.org/test> <http://foo.org/name> ( "new" "fangled" "thing" ) .
<http://foo.org/test2> <http://foo.org/name> ( "new" "york" "city" ) .
How can I find the items whose lists contain both "new" and "york"?
The following SPARQL doesn't work, since filter works on each binding of ?t, not the set of all of them.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?p WHERE {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t
FILTER( ?t = "new" && ?t = "york")
}
Finding lists with multiple required values
If you're looking for lists that contain all of a number of values, you'll need to use a more complicated query. This query finds all the ?s values that have a ?list value, and then keeps only those for which there is no required word that is missing from the list. The list of words is specified using a values block.
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s {
?s <http://foo.org/name> ?list .
filter not exists { #-- It's not the case
values ?word { "new" "york" } #-- that any of the words
filter not exists { #-- are not
?list rdf:rest*/rdf:first ?word #-- in the list.
}
}
}
--------------------------
| s |
==========================
| <http://foo.org/test2> |
--------------------------
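The nested filter not exists is a double negation: "contains all of the words" is expressed as "there is no word that is not contained". A minimal Python sketch of the same idea, using plain lists in place of RDF collections (the names `lists`, `required`, and `contains_all` are hypothetical):

```python
# The two lists from the question, keyed by subject IRI.
lists = {
    "http://foo.org/test": ["new", "fangled", "thing"],
    "http://foo.org/test2": ["new", "york", "city"],
}
required = ["new", "york"]


def contains_all(items, words):
    """True when no required word is missing from items (double negation)."""
    return not any(word not in items for word in words)


matches = [s for s, items in lists.items() if contains_all(items, required)]
```

Only `http://foo.org/test2` survives, as in the query result. An empty set of required words matches every list, which is also how the SPARQL version behaves.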
Find alternative strings in a list
On the other hand, if you're just trying to search for one of a number of options, you're using the right property path, but you've got the wrong filter expression. You want strings equal to "new" or "york". No string is equal to both "new" and "york". You just need to do filter(?t = "new" || ?t = "york") or better yet use in: filter(?t in ("new", "york")). Here's a full example with results:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s ?t {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t .
filter(?t in ("new","york"))
}
-----------------------------------
| s | t |
===================================
| <http://foo.org/test2> | "new" |
| <http://foo.org/test2> | "york" |
| <http://foo.org/test> | "new" |
-----------------------------------