Find lists containing ALL values in a set? - sparql

How can I find lists that contain each of a set of items in SPARQL? Let's say I have this data:
<http://foo.org/test> <http://foo.org/name> ( "new" "fangled" "thing" ) .
<http://foo.org/test2> <http://foo.org/name> ( "new" "york" "city" ) .
How can I find the items whose lists contain both "new" and "york"?
The following SPARQL doesn't work, since filter works on each binding of ?t, not the set of all of them.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?p WHERE {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t
FILTER( ?t = "new" && ?t = "york")
}

Finding lists with multiple required values
If you're looking for lists that contain all of a number of values, you'll need a more complicated query. This query finds all the ?s values that have a ?list value, and then keeps only those for which there is no required word that is missing from the list. This double negation is the usual way to express "for all" in SPARQL. The required words are specified using a values block.
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s {
?s <http://foo.org/name> ?list .
filter not exists { #-- It's not the case
values ?word { "new" "york" } #-- that any of the words
filter not exists { #-- are not
?list rdf:rest*/rdf:first ?word #-- in the list.
}
}
}
--------------------------
| s |
==========================
| <http://foo.org/test2> |
--------------------------
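The double negation in the query above can be easier to follow outside SPARQL. The following is a minimal Python sketch of the same logic, not a replacement for the query: the dictionary `data` and the function name `subjects_with_all_words` are made up for this illustration, and the RDF collections are represented as plain Python lists.

```python
# Hypothetical in-memory stand-in for the two RDF lists from the question.
data = {
    "http://foo.org/test": ["new", "fangled", "thing"],
    "http://foo.org/test2": ["new", "york", "city"],
}

def subjects_with_all_words(data, required):
    """Return the subjects whose list contains every word in `required`."""
    result = []
    for subject, words in data.items():
        # Inner "not exists": collect the required words missing from this list.
        missing = [w for w in required if w not in words]
        # Outer "not exists": keep the subject only if nothing is missing.
        if not missing:
            result.append(subject)
    return result

print(subjects_with_all_words(data, ["new", "york"]))
```

With the two example lists, only http://foo.org/test2 survives, which matches the query result shown above.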
Find alternative strings in a list
On the other hand, if you're just trying to search for one of a number of options, you're using the right property path, but you've got the wrong filter expression. You want strings equal to "new" or "york"; no string is equal to both "new" and "york". You just need filter(?t = "new" || ?t = "york") or, better yet, in: filter(?t in ("new", "york")). Here's a full example with results:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s ?t {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t .
filter(?t in ("new","york"))
}
-----------------------------------
| s | t |
===================================
| <http://foo.org/test2> | "new" |
| <http://foo.org/test2> | "york" |
| <http://foo.org/test> | "new" |
-----------------------------------

Related

How to get a path between IRIs or between two nodes of certain rdf:type using a SPARQL query?

Trying to execute a query using rdf4j console against a sparql endpoint to find the path between 2 nodes using property wildcards but no luck. The first query gives an error as
Malformed query: Not a valid (absolute) IRI:
The second query crashes the console. Should I try a different way of querying the endpoint, as this may be an rdf4j issue, or is the query itself wrong?
PREFIX xy: <http://mainuri/>
select
*
where
{
<http://uriOfInstanceOfData> ((<>|!<>)|^(<>|!<>))* ?x .
?x ?p ?o .
?o ((<>|!<>)|^(<>|!<>))* <http://uriOfInstanceOfData>.
}
AND
PREFIX xy: <http://mainuri/>
select
*
where
{
<http://uriOfInstanceOfData> (xy:|!xy:)* ?x .
?x ?p ?o .
?o (xy:|!xy:)* <http://uriOfInstanceOfData>.
}
The first query is syntactically incorrect: <> is not a valid IRI reference. The SPARQL grammar allows the empty string, but the specification also notes that any IRI reference must be a string that (after escape processing) results in a valid RFC 3987 IRI. Since an IRI requires, at a minimum, a scheme identifier, an empty string can by definition not be a valid IRI.
The second query works when I try it on a small test dataset. However it is likely very expensive to process.
EDIT the query I actually tried:
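PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xy: <http://mainuri/>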
select
*
where
{
rdfs:domain (xy:|!xy:)* ?x .
?x ?p ?o .
?o (xy:|!xy:)* rdf:Property.
}
On a local in-memory database with basic RDFS inferencing enabled, that gives the following result:
Evaluating SPARQL query...
+------------------------+------------------------+------------------------+
| x | p | o |
+------------------------+------------------------+------------------------+
| rdfs:domain | rdf:type | rdf:Property |
| rdfs:domain | rdfs:domain | rdf:Property |
+------------------------+------------------------+------------------------+
2 result(s) (28 ms)

select all triples, indicating named graph if relevant

Let's say I make the following insertions into my GraphDB 8.3 triplestore:
PREFIX : <http://example.com/>
insert data { :hello a :word }
and
PREFIX : <http://example.com/>
insert data { graph :farewells { :goodbye a :word }}
Now, if I ask
select * where {
graph ?g {
?s ?p ?o .
}
}
I only get
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
| ?g | ?s | ?p | ?o |
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
| <http://example.com/farewells> | <http://example.com/goodbye> | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://example.com/word> |
+--------------------------------+------------------------------+---------------------------------------------------+---------------------------+
I can obviously get both "triples about words" with the following, but then the named-graph membership is not shown
select * { ?s ?p ?o }
How can I write a query that retrieves both triples about words and indicates that { :goodbye a :word } comes from graph :farewells ?
You can do something along these lines:
SELECT *
WHERE {
{ GRAPH ?g { ?s ?p ?o } }
UNION
{ ?s ?p ?o .
FILTER NOT EXISTS { GRAPH ?g { ?s ?p ?o } }
}
}
The first part of the union selects all triples in named graphs. The second part grabs all triples in the default graph, explicitly excluding patterns that occur in a named graph.
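To see why the second branch needs the FILTER NOT EXISTS, it can help to model the store as a list of quads. The sketch below is a simplified Python model under one stated assumption: a plain `?s ?p ?o` pattern matches triples from the default graph and from every named graph. The `quads` list and the `select_with_graph` function are hypothetical names used only for this illustration.

```python
# Each quad is (graph, subject, predicate, object); graph None = default graph.
quads = [
    (None, ":hello", "rdf:type", ":word"),
    (":farewells", ":goodbye", "rdf:type", ":word"),
]

def select_with_graph(quads):
    """Mimic the UNION query: named-graph triples with ?g, the rest with ?g unbound."""
    # First branch: GRAPH ?g { ?s ?p ?o } matches only triples in named graphs.
    named = [q for q in quads if q[0] is not None]
    named_triples = {q[1:] for q in named}
    # Second branch: a plain ?s ?p ?o sees every triple (assumed union view),
    # so triples that also occur in a named graph are filtered out.
    all_triples = {q[1:] for q in quads}
    unnamed = [(None,) + t for t in sorted(all_triples - named_triples)]
    return named + unnamed

for row in select_with_graph(quads):
    print(row)
```

In this model, a triple that exists both in the default graph and in a named graph is reported only once, with its named graph, which is also how the UNION query behaves.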
In GraphDB, you could use pseudographs for this purpose, i.e. <http://www.ontotext.com/explicit> (it seems you are not using inferencing).
Try this query:
SELECT * FROM NAMED <http://www.ontotext.com/explicit>
{ GRAPH ?g { ?s ?p ?o } }
The result should be:
+------------------------------------+----------+-----------+-------+
| ?g | ?s | ?p | ?o |
+------------------------------------+----------+-----------+-------+
| <http://www.ontotext.com/explicit> | :hello | rdf:type | :word |
| :farewells | :goodbye | rdf:type | :word |
+------------------------------------+----------+-----------+-------+
For comparison, note that
SELECT * FROM NAMED <http://www.openrdf.org/schema/sesame#nil>
{ GRAPH ?g { ?s ?p ?o } }
will return only
+------------------------------------+----------+-----------+-------+
| ?g | ?s | ?p | ?o |
+------------------------------------+----------+-----------+-------+
| <http://www.ontotext.com/explicit> | :hello | rdf:type | :word |
+------------------------------------+----------+-----------+-------+
Short "answer": avoid putting data into the default graph in GraphDB (and other triple stores with a "virtual" default graph which is just a UNION of all named graphs)
Background: GraphDB decided to define the default graph as the union of all named graphs and the default graph. This behaviour is not backed by the SPARQL semantics specification, but is implementation-specific behaviour.
So you really have three options:
1. Use FILTER NOT EXISTS or MINUS as explained by Jeen Broekstra. This can have a serious negative impact on query performance.
2. Use the GraphDB pseudographs as exemplified by Stanislav Kralin. This option makes your queries (and your system) dependent on GraphDB: you cannot change the SPARQL engine later without adapting your queries.
3. Avoid putting data into the default graph. You might define "your own" default graph, e.g. call it http://default/, and put it in the SPARQL FROM clause.
Other triple stores allow you to enable or disable this feature. I couldn't find a switch in the documentation of GraphDB. Otherwise this would be the 4th, and my preferred, option.

SPARQL query to get all class label with namespace prefix defined

I want to get all the classes that are stored in a Sesame repository.
This is my query
SELECT ?class ?classLabel
WHERE {
?class rdf:type rdfs:Class.
?class rdfs:label ?classLabel.
}
It returns the URI of each class together with its label. For example,
"http://example.com/A_Class" "A_Class"
"http://example.com/B_Class" "B_Class"
But if a namespace prefix is already defined, I want the label to include that prefix. For example, if I have already defined the namespace prefix "ex" for "http://example.com/", the result should become
"http://example.com/A_Class" "ex:A_Class"
"http://example.com/B_Class" "ex:B_Class"
You want to add a URI prefix to the front of a label string?
I think you might be confused about what URI prefixes do. They're just shorthand for full URIs, and aren't part of the URI, and they don't have any bearing on strings.
You can do something like what you want with
SELECT ?class (CONCAT("ex:", ?classLabel) AS ?label)
WHERE {
?class rdf:type rdfs:Class.
?class rdfs:label ?classLabel.
}
But the prefix won't depend on the prefix of the URI.
You could have a chain of IF()s that tests the starting characters of STR(?class), but it will get ugly quickly:
BIND(IF(STRSTARTS(STR(?class), "http://example.com/"), "ex:", "other:") as ?prefix)
then
SELECT ... (CONCAT(?prefix, ?classLabel) AS ?label)
There's almost certainly an easier way to get what you want than doing it in SPARQL though :)
I agree with the other answers that say you're trying to do a strange thing, at least as it's stated in the question. Associating the prefix with the rdfs:label isn't all that useful, since the resulting string may or may not represent the actual URI of the class. Assuming that what you're trying to do is actually figure out a prefixed name for a URI given some set of prefix definitions, you can do this in SPARQL without needing lots of nested IFs; you can do it by using a VALUES and a FILTER.
Given data like this, in which a number of classes are defined (I don't define any labels here, since they're not necessary for this example),
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex1: <http://example1.com/> .
@prefix ex2: <http://example2.com/> .
@prefix ex3: <http://example3.com/> .
@prefix ex4: <http://example4.com/> .
ex1:A a rdfs:Class .
ex1:B1 a rdfs:Class .
ex2:A a rdfs:Class .
ex2:B2 a rdfs:Class .
ex3:A a rdfs:Class .
ex3:B3 a rdfs:Class .
ex3:A a rdfs:Class .
ex3:B4 a rdfs:Class .
ex4:A a rdfs:Class .
ex4:B4 a rdfs:Class .
you can write a query like this that selects all classes, checks whether their URI begins with one of a number of defined namespaces, and if so, returns the class and the prefixed form:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?class (group_concat(?prefixedName ; separator = "") as ?prefName) where {
values (?prefix ?ns) {
( "ex1:" <http://example1.com/> )
( "ex2:" <http://example2.com/> )
( "ex3:" <http://example3.com/> )
}
?class a rdfs:Class .
bind( if( strStarts( str(?class), str(?ns) ),
concat( ?prefix, strafter( str(?class), str(?ns) )),
"" )
as ?prefixedName )
}
group by ?class
order by ?class
which produces as results:
$ arq --data data.n3 --query query.sparql
---------------------------------------
| class | prefName |
=======================================
| <http://example1.com/A> | "ex1:A" |
| <http://example1.com/B1> | "ex1:B1" |
| <http://example2.com/A> | "ex2:A" |
| <http://example2.com/B2> | "ex2:B2" |
| <http://example3.com/A> | "ex3:A" |
| <http://example3.com/B3> | "ex3:B3" |
| <http://example3.com/B4> | "ex3:B4" |
| <http://example4.com/A> | "" |
| <http://example4.com/B4> | "" |
---------------------------------------
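If the prefixed names are only needed on the client side, the same lookup is straightforward to do after the query returns. Here is a minimal Python sketch of what the VALUES, strStarts and strafter combination computes; the `NAMESPACES` table and the `prefixed_name` function are hypothetical names for this illustration, not part of any library.

```python
# Hypothetical prefix table, mirroring the VALUES block in the query.
NAMESPACES = [
    ("ex1:", "http://example1.com/"),
    ("ex2:", "http://example2.com/"),
    ("ex3:", "http://example3.com/"),
]

def prefixed_name(uri, namespaces=NAMESPACES):
    """Return the prefixed form of `uri`, or "" if no namespace matches."""
    parts = []
    for prefix, ns in namespaces:
        # Same test as strStarts(str(?class), str(?ns)) in the query.
        if uri.startswith(ns):
            # Same as concat(?prefix, strafter(str(?class), str(?ns))).
            parts.append(prefix + uri[len(ns):])
    # Same as group_concat with an empty separator.
    return "".join(parts)

print(prefixed_name("http://example3.com/B4"))
print(prefixed_name("http://example4.com/A"))
```

As in the query, this assumes that the namespaces do not overlap; if one namespace is a prefix of another, both prefixed forms end up concatenated.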

Use DBPedia to load all People along with some data

I am using DBpedia for the first time. I would like to download all the people in the dataset along with their commonName, nationality, birthDate, and knownFor properties (I will eventually put the results into an Excel spreadsheet, probably using some sort of scripting language).
This was my first attempt at a query to do this job, however it does not work. I tried piecing it together from other bits of code I've seen on the internet. Does anyone know how to fix this? Thanks
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
EDIT: NEW VERSION OF CODE: RETURNS COMMONNAME (WITH NON-ENGLISH DUPLICATES). STILL MISSING OTHER PROPERTIES.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName . FILTER(lang(?commonName) = 'en')
}
LIMIT 30
First, your query has a bunch of syntax issues:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
# ^ you probably want to use the dbpedia-owl properties, which are
#   in <http://dbpedia.org/ontology/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
# ^ "? birthDate": there is a space between ? and the variable name
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
# ^ This needs to be "?commonName . FILTER(..."
#   and the same thing applies to your other
#   filters
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
It's easier to build some of these queries incrementally, because then you can find out what properties some of the resources actually have, and then you can extend your query some more. The public endpoint has a number of predefined namespaces, and using those will make it easier for others to read your query. So, you can start by asking for people:
SELECT * WHERE {
?person a dbpedia-owl:Person .
}
LIMIT 10
Seeing that that's working, you can look at some of the returned instances and see that they have dbpedia-owl:commonName properties, and then extend the query:
SELECT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName .
}
LIMIT 10
It's easy enough to extend this with the dbpedia-owl:birthDate property. I don't see a nationality predicate on the instances that I've looked at, so I'm not sure what you were basing the nationality query on. While I saw some use of the knownFor property, I didn't see it on many instances, so if you make it a required property, you're going to exclude lots of people. This sort of incremental approach will probably help you out in the long run, though.
Finding Properties
While the browseable ontology provides a nice way to find classes, I'm not sure whether there's such a nice way of finding properties. However, you can do something in a brute force manner. E.g., to find all the properties that have actually been used for Persons, you can run a query like the following. (Note: this query takes a while to execute, so if you use it, you should probably download the results.)
select distinct ?p where {
[] a dbpedia-owl:Person ;
?p [] .
}
I'll note that dbpedia-owl:nationality does appear in that list.
To get all the properties for everything, you can download the ontology, and run a query like:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
select * where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
}
I ran this locally using Jena's ARQ:
$ arq --query properties.sparql --data dbpedia_3.8.owl
----------------------------------------------------------------------------
| p |
============================================================================
| <http://dbpedia.org/ontology/regionServed> |
| <http://dbpedia.org/ontology/coachedTeam> |
| <http://dbpedia.org/ontology/legalForm> |
| <http://dbpedia.org/ontology/goldenCalfAward> |
| <http://dbpedia.org/ontology/composer> |
| <http://dbpedia.org/ontology/owningOrganisation> |
| <http://dbpedia.org/ontology/branchFrom> |
| <http://dbpedia.org/ontology/iso6393Code> |
...
| <http://dbpedia.org/ontology/classification> |
| <http://dbpedia.org/ontology/bgafdId> |
| <http://dbpedia.org/ontology/currencyCode> |
| <http://dbpedia.org/ontology/onChromosome> |
| <http://dbpedia.org/ontology/course> |
| <http://dbpedia.org/ontology/frequentlyUpdated> |
| <http://dbpedia.org/ontology/distance> |
| <http://dbpedia.org/ontology/volume> |
| <http://dbpedia.org/ontology/description> |
----------------------------------------------------------------------------
That won't provide the rdfs:domain and rdfs:range, but you could also ask for these, or for just those properties with rdfs:domain dbpedia-owl:Person (but note that this won't get all the properties that could be used with a Person, since the declared domain could be more or less specific):
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?p ?range where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
?p rdfs:domain dbpedia-owl:Person ; rdfs:range ?range .
}
$ arq --query properties.sparql --data dbpedia_3.8.owl | head
--------------------------------------------------------------------------------------------------------
| p | range |
========================================================================================================
| dbpedia-owl:restingPlacePosition | <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> |
| dbpedia-owl:opponent | dbpedia-owl:Person |
| dbpedia-owl:employer | dbpedia-owl:Organisation |
| dbpedia-owl:hometown | dbpedia-owl:Settlement |
| dbpedia-owl:militaryBranch | dbpedia-owl:MilitaryUnit |
| dbpedia-owl:school | dbpedia-owl:EducationalInstitution |
| dbpedia-owl:ethnicity | dbpedia-owl:EthnicGroup |
...
| dbpedia-owl:sex | xsd:string |
| dbpedia-owl:hipSize | xsd:double |
| dbpedia-owl:individualisedPnd | xsd:nonNegativeInteger |
| dbpedia-owl:weddingParentsDate | xsd:date |
| dbpedia-owl:birthName | xsd:string |
| dbpedia-owl:networth | xsd:double |
| dbpedia-owl:birthYear | xsd:gYear |
| dbpedia-owl:bustSize | xsd:double |
| dbpedia-owl:description | xsd:string |
--------------------------------------------------------------------------------------------------------
I'm not very good at SPARQL yet, but I do see some syntax issues here. I rewrote the query to look like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person ?commonName ?nationality ?knownFor ?birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonName .
FILTER (lang(?commonName) = 'en') .
?person prop:nationality ?nationality .
FILTER (lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor .
FILTER (lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
Now it at least runs without an error, but I'm not seeing any results, and I'm not sure why.

OpenLink Virtuoso: Finding if two nodes are connected within a certain distance

How can I find the distance between 2 nodes in a graph using Virtuoso? I've read the Transitivity documentation, but its examples limit you to one predicate, e.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s foaf:knows ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
This only traverses foaf:knows and not any predicate type. How can I extend this to 'whatever predicate'? I don't need the actual path, just a true/false (ASK query). Changing the foaf:knows to ?p seems like overkill.
I'm currently performing a set of recursive ASKs to find out if two nodes are connected within a specific distance but that doesn't seem efficient.
You should be able to use ?p instead of foaf:knows in your query to determine if there's a path between the nodes. E.g.:
SELECT ?link ?g ?step ?path
WHERE
{
{
SELECT ?s ?o ?g
WHERE
{
graph ?g {?s ?p ?o }
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_no_cycles, T_shortest_only,
t_step (?s) as ?link, t_step ('path_id') as ?path, t_step ('step_no') as ?step, t_direction 3) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
}
LIMIT 20
Here's an approach that works if there's at most one path between the nodes that you're interested in.
If you have data like this (note that there are different properties connecting the resources):
@prefix : <https://stackoverflow.com/q/3914522/1281433/> .
:a :p :b .
:b :q :c .
:c :r :d .
Then a query like the following finds the distance between each pair of nodes. The property path (:|!:) consists of a property that is either : or something other than : (i.e., anything). Thus (:|!:)* is zero or more occurrences of any property; it's a wildcard path. (The technique used here is described more fully in "Is it possible to get the position of an element in an RDF Collection in SPARQL?".)
prefix : <https://stackoverflow.com/q/3914522/1281433/>
select ?begin ?end (count(?mid)-1 as ?distance) where {
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
order by ?begin ?end ?distance
--------------------------
| begin | end | distance |
==========================
| :a | :a | 0 |
| :a | :b | 1 |
| :a | :c | 2 |
| :a | :d | 3 |
| :b | :b | 0 |
| :b | :c | 1 |
| :b | :d | 2 |
| :c | :c | 0 |
| :c | :d | 1 |
| :d | :d | 0 |
--------------------------
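The counting trick can be checked outside SPARQL. The following is a rough Python sketch, assuming the three example triples above; the names `triples`, `reachable_from` and `distances` are hypothetical. For every pair of connected nodes it counts the nodes ?mid that lie between them (inclusive) and subtracts one, just as the query does.

```python
# Hypothetical copy of the three example triples.
triples = [(":a", ":p", ":b"), (":b", ":q", ":c"), (":c", ":r", ":d")]

def reachable_from(start, triples):
    """Nodes reachable from `start` over zero or more edges, ignoring predicates."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for s, _p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def distances(triples):
    """Map (begin, end) to count(mid) - 1, as in the SELECT query."""
    nodes = sorted({s for s, _p, _o in triples} | {o for _s, _p, o in triples})
    reach = {n: reachable_from(n, triples) for n in nodes}
    result = {}
    for begin in nodes:
        for end in reach[begin]:
            # ?mid must satisfy: begin reaches mid, and mid reaches end.
            mids = [m for m in reach[begin] if end in reach[m]]
            result[(begin, end)] = len(mids) - 1
    return result

d = distances(triples)
print(d[(":a", ":d")])      # distance from :a to :d
print(d[(":a", ":d")] < 3)  # ASK-style check: is the distance below 3?
```

As with the query, the count equals the distance only when there is at most one path between the two nodes; with several paths, nodes from all of them are counted.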
To just find out whether there's a path between two nodes that's less than some particular length, you use an ask query instead of a select, fix the values of ?begin and ?end, and restrict the value of count(?mid)-1 rather than binding it to ?distance. E.g., is there a path from :a to :d of length less than three?
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :d) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 3 ) )
Ask => No
On the other hand, there is a path from :a to :c with length less than 5:
prefix : <https://stackoverflow.com/q/3914522/1281433/>
ask {
values (?begin ?end) { (:a :c) }
?begin (:|!:)* ?mid .
?mid (:|!:)* ?end .
}
group by ?begin ?end
having ( (count(?mid)-1 < 5 ) )
Ask => Yes