SPARQL query to read from all named graphs without knowing the names

I am looking to run a SPARQL query over any dataset. We don't know the names of the named graphs in the datasets.
There is lots of documentation, with examples, on selecting from named graphs when you know the name of the named graph/s. There are also examples showing how to list named graphs.
We are running Jena from Java, so it would be possible to run 2 queries: the first gets the named graphs and we inject these into the 2nd.
But surely you can write a single query that reads from all named graphs when you don't know their names?
Note: we are looking to stay away from using default graph/s as their behaviour seems implementation dependent.

Example:
{
?s foaf:name ?name ;
vCard:nickname ?nickName .
}
If you want the pattern to match within one graph and wish to try each graph, use the GRAPH ?g form.
GRAPH ?g
{ ?s foaf:name ?name ;
vCard:nickname ?nickName .
}
If you want to make a query where the pattern matches across named graphs (e.g. foaf:name in one graph and vCard:nickname in another, with the same subject), then set the union default graph option, tdb2:unionDefaultGraph true. The default graph as seen by the query is then the union (actually, the RDF merge, so no duplicates) of all the named graphs. Use the pattern as originally given.
Fuseki configuration file extract:
:dataset_tdb2 rdf:type tdb2:DatasetTDB2 ;
tdb2:location "DB2" ;
## Optional - with union default for query and update WHERE matching.
tdb2:unionDefaultGraph true ;
.
In code, not Fuseki, the application can use Dataset.getUnionModel().
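If relying on a union default graph is not an option, matching across named graphs can also be sketched in standard SPARQL with one GRAPH clause per triple pattern. This is only a sketch: the vCard prefix IRI below is an assumption (the question does not show its prefix declarations), and ?g1 and ?g2 may bind to the same graph, so the query also covers the single-graph case.

```sparql
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
# Assumed namespace for vCard; adjust to whatever the data actually uses.
PREFIX vCard: <http://www.w3.org/2006/vcard/ns#>

# DISTINCT removes repeats when the same triples occur in several graphs.
SELECT DISTINCT ?s ?name ?nickName
WHERE {
  GRAPH ?g1 { ?s foaf:name ?name . }
  GRAPH ?g2 { ?s vCard:nickname ?nickName . }
}
```

Neither graph variable needs to be known in advance, and the default graph is never consulted.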

Related

Get movie(s) based on book(s) from DBpedia

I am new to SPARQL and trying to fetch a movie adapted from specific book from dbpedia. This is what I have so far:
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT *
WHERE
{
<http://dbpedia.org/page/2001:_A_Space_Odyssey> a ?type.
?type onto:basedOn ?book .
?book a onto:Book
}
I can't get any results. How can I do that?
When using any web resource, and in your case the property :basedOn, you need to make sure that you have declared the right prefix. If you are querying from the DBpedia SPARQL endpoint, then you can directly use dbo:basedOn even without declaring it, as it is among the predefined prefixes. Alternatively, if you want to use your own, or if you are using another SPARQL client, make sure that whatever short name you choose for this property, you declare the prefix for http://dbpedia.org/ontology/.
Then, first, to get more results you should not restrict the type of the subject of this triple pattern, as there could be movies that are not actually typed as such. So, a query like this
select distinct *
{
?movie dbo:basedOn ?book .
?book a dbo:Book .
}
will give you lots of good results but not all. For example, the resource from your example will be missing. You can easily check the available properties between these two resources with a query like this:
select ?p
{
{<http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey> }
UNION
{ <http://dbpedia.org/resource/2001:_A_Space_Odyssey> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)>}
}
You'll get only one result:
http://www.w3.org/2000/01/rdf-schema#seeAlso
(note that the URI is with 'resource', not with 'page')
Then you may search for any path between the two resources, using the method described here, or find a combination of other patterns that would increase the number of results.

Setting to query only Default Graph and exclude Named Graphs

In the GraphDB documentation, I see that "the dataset’s default graph contains the merge of the database’s default graph AND all the database named graphs." This means that "if a statement ex:x ex:y ex:z exists in the database in the graph ex:g" then a query such as SELECT * { ?s ?p ?o } will return the triple ex:x ex:y ex:z
I am wondering if there is a setting which can be triggered either via the web interface or via the RDF4J/OpenRDF API which will disable this behavior in a specified GraphDB repository. That is, for the purposes of my project I would prefer to have triples which are stored in named graphs to only appear in results which specifically query that named graph.
I have not seen anything like this searching through the documentation or on the settings available on the web interface, but maybe somebody here knows something I don't.
EDIT: I am not looking for a SPARQL solution to this problem. I know that I can query just the default graph using SPARQL, but I want to be able to use the query SELECT * { ?s ?p ?o } and only see results which are in the default graph by default.
GraphDB/RDF4J have a different interpretation than Jena of how to query the default graph. The only easy way to query only explicit statements in the default graph is to use the special graph sesame:nil. The SPARQL-based solution is to write:
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
SELECT ?s ?p ?o
FROM sesame:nil
WHERE {
?s ?p ?o .
} LIMIT 100
I don't think there is any easy non-SPARQL-based solution, like changing a configuration option or even using this special graph over the SPARQL Graph Store protocol.

Query a named graph but get label from an ontology

My data is partitioned using named graphs. One graph contains my ontology and human-readable labels for classes. The data in turn is divided up into a number of graphs, ex:data1, ex:data2, ... ex:data5000.
Now I wish to query each of the data graphs independently, while using the labels from the ontologies. First I thought this would be trivial but then I found that scoping in SPARQL can be a bit confusing. As a simple example:
Select all instances with their respective instance label. Also retrieve the optional class associated with each instance and the class label from the ontology.
If the scope of class stretched across both groups I would expect the following query to do the trick:
SELECT ?instance ?instanceLabel ?class ?classLabel
WHERE {
GRAPH ?data {
?instance rdfs:label ?instanceLabel .
OPTIONAL { ?instance a ?class }
}
GRAPH <http://myontology> {
OPTIONAL {
?class rdfs:label ?classLabel
}
}
}
If the class is bound, the result will be what I wanted, since the two groups are joined; however, due to scoping, ?class can bind to anything in the ontology if it's not bound in the first group.
The query I'm writing has about 15 similar optional fields (terribly expensive join...). When I tried re-writing with UNION it quickly became quite unwieldy due to the size of the original query. I've also tried nesting queries but that gives me the same issue with scoping. Is there anyone who has some SPARQL-trick up their sleeve that I've overlooked? Any suggestions on how to approach this would be appreciated.
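One possible way around the scoping problem described above is to nest the ontology lookup inside the OPTIONAL that binds ?class, so the label pattern is only tried once ?class already has a value. The following is a sketch, not a tested solution for the full 15-field query; it reuses the graph IRI <http://myontology> from the question and relies on GRAPH clauses being nestable in SPARQL 1.1.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?instance ?instanceLabel ?class ?classLabel
WHERE {
  GRAPH ?data {
    ?instance rdfs:label ?instanceLabel .
    OPTIONAL {
      # ?class is bound here first ...
      ?instance a ?class .
      # ... so the inner OPTIONAL can only attach a label to that class.
      OPTIONAL {
        GRAPH <http://myontology> { ?class rdfs:label ?classLabel }
      }
    }
  }
}
```

Because the inner OPTIONAL is evaluated within the group that binds ?class, an instance without a class keeps both ?class and ?classLabel unbound instead of picking up every labelled class in the ontology.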

How to execute SPARQL Query (Call a service) Over extracted subgraph?

I have an RDF graph with several types of relations (relations with the same prefix and also with different prefixes). I need to call a service over the graph but filtering out some relations.
Example:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix myPref: <http://www.myPref.com/> .
@prefix otherPref: <http://www.otherPref.com/> .
myPref:1
myPref:label "1" ;
myPref:solid myPref:2 ;
myPref:dotted myPref:4 ;
otherPref:dashed myPref:3 ;
otherPref:dashed2 myPref:3 .
myPref:2
myPref:label "2" ;
myPref:solid myPref:3 .
myPref:3
myPref:label "3" .
myPref:4
myPref:label "4" ;
myPref:dotted myPref:3 .
I would like to run the service call over an extracted sub-graph containing only the solid and dotted relations. (In this particular case, running a service that calculates the shortest path between 1 and 3, I want to exclude the direct dashed links.)
I run the service (Over the entire graph) like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX myPref: <http://www.myPref.com/>
PREFIX otherPref: <http://www.otherPref.com/>
PREFIX gas: <http://www.bigdata.com/rdf/gas#>
SELECT ?sp ?out {
SERVICE gas:service {
gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.SSSP" .
gas:program gas:in myPref:1 .
gas:program gas:target myPref:3 .
gas:program gas:out ?out .
gas:program gas:out1 ?sp .
}
}
How can I extract a subgraph containing only the links I want (dotted and solid) and then run the service call over the extracted sub-graph?
SPARQL doesn't provide any functionality for querying a constructed graph, unfortunately. I've come across places where it would make some queries very easy. Some endpoints do have extensions to support it, though; I think that dotNetRDF might support it. There are probably a few considerations: in many cases, it's not actually necessary; if the endpoint supports updates, you can create a new named graph and construct into it, and then launch a second query against it (which is pretty much what you're asking for, but in two steps); and this could be a very expensive operation, so endpoints might disable it anyway, even if it were directly supported.
The first point, though (that it's often not necessary), appears to apply here.
I need to call a service over the graph but filtering out some relations.
In this case, you can query over the subgraph that you want, I think, by using property paths. You can ask for paths built from just solid and dotted edges like:
?s myPref:solid|myPref:dotted ?t
If you want an arbitrary path of them, you can repeat it:
?s (myPref:solid|myPref:dotted)+ ?t
If you have unique paths between sources and destinations, then you can figure out the lengths of paths using the standard "count the ways of splitting the path" technique:
select ?s ?u (count(?t) as ?length) {
?s (myPref:solid|myPref:dotted)* ?t .
?t (myPref:solid|myPref:dotted)+ ?u .
}
group by ?s ?u
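The two-step approach mentioned earlier (copy the wanted edges into a new named graph, then query that graph) could look roughly like the following SPARQL Update, assuming the endpoint allows updates. The target graph IRI is hypothetical, and whether the gas:service call can be restricted to that graph depends on the endpoint, which is not covered here.

```sparql
PREFIX myPref: <http://www.myPref.com/>

# Step 1: copy only the solid and dotted edges into a separate named graph.
# <http://www.myPref.com/graph/solid-dotted> is a made-up name for illustration.
INSERT {
  GRAPH <http://www.myPref.com/graph/solid-dotted> { ?s ?p ?o }
}
WHERE {
  ?s ?p ?o .
  FILTER (?p IN (myPref:solid, myPref:dotted))
}
```

A second query would then target that graph, for example with GRAPH <http://www.myPref.com/graph/solid-dotted> { ... } around its patterns.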

Querying WikiData, difference between p and wdt default prefix

I am new to wikidata and I can't figure out when I should use -->
wdt prefix (http://www.wikidata.org/prop/direct/)
and when I should use -->
p prefix (http://www.wikidata.org/prop/).
in my sparql queries. Can someone explain what each of these mean and what is the difference?
Things in the p: namespace are used to select statements. Things in the wdt: namespace are used to select entities. Entity selection, with wdt:, allows you to simplify or summarize more complex queries involving statement selection.
When you see a p: you are usually going to see a ps: or pq: shortly following. This is because you rarely want a list of statements; you usually want to know something about those statements.
This example is a two-step process showing you all the graffiti in Wikidata:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti p:P31 ?statement . # entities that are statements
?statement ps:P31 wd:Q17514 . # which state something is graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Two different versions of the P31 property are used here, housed in different namespaces. Each version comes with different expectations about how it will connect to other items. Things in the p: namespace connect entities to statements, and things in the ps: namespace connect statements to values. In the example, p:P31 is used to select statements about an entity. The entity will be graffiti, but we do not specify that until the next line, where ps:P31 is used to select the values (objects) of the statements, specifying that those values should be graffiti.
So, that's kind of complicated! The wdt: namespace is supposed to make this kind of query simpler. The example could be rewritten as:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti wdt:P31 wd:Q17514 . # entities that are graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This is now one line shorter because we are no longer looking for statements about graffiti, but for graffiti itself. The dual p: and ps: linkages are summarized with a wdt: version of the same P31 property. However, be aware:
This technique only returns "truthy" statements (the "t" in wdt: stands for "truthy"): for each property, the statements with the best non-deprecated rank, without their qualifiers or references.
Information available to wdt: is therefore sometimes missing some facts. Often in my experience a p: and ps: query will return a few more results than a wdt: query, since it also matches statements of other ranks.
If you go to the Wikidata item page for Barack Obama at https://www.wikidata.org/wiki/Q76 and scroll down, you see the entry for the "spouse" property P26:
Think of the p: prefix as a way to get to the entire white box on the right side of the image.
In order to get to the information inside the white box, you need to dig deeper.
In order to get to the main part of the information ("Michelle Obama"), you combine the p: prefix with the ps: prefix like this:
SELECT ?spouse WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
}
The variable ?s is an abstract statement node (aka the white box).
You can get the same information with only one triple in the body of the query by using the wdt: prefix:
SELECT ?spouse WHERE {
wd:Q76 wdt:P26 ?spouse .
}
So why would you ever use p:?
You might have noticed that the white box also contains meta information ("start time" and "place of marriage").
In order to get to the meta information, you combine the p: prefix with the pq: prefix.
The following example query returns all the information together with the statement node:
SELECT ?s ?spouse ?time ?place WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
?s pq:P580 ?time .
?s pq:P2842 ?place .
}
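Note that the query above only returns statements that carry both qualifiers. A variant that keeps statements lacking one of them can be sketched by wrapping each qualifier in OPTIONAL. The prefixes are spelled out here for use outside the Wikidata Query Service, where they are predefined.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p:  <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?s ?spouse ?time ?place WHERE {
  wd:Q76 p:P26 ?s .                   # entity -> statement node
  ?s ps:P26 ?spouse .                 # statement node -> main value
  OPTIONAL { ?s pq:P580 ?time . }     # qualifier: start time, if present
  OPTIONAL { ?s pq:P2842 ?place . }   # qualifier: place of marriage, if present
}
```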
They're simply XML namespace prefixes, basically a shortcut for full URIs. So given wdt:Apples, the full URI is http://www.wikidata.org/prop/direct/Apples and given p:fruitType the URI is http://www.wikidata.org/prop/fruitType.
Prefixes/namespaces have no other meaning, they are simply ways to define the name of something with URL format. However conventions, such as defining properties in http://www.wikidata.org/prop/, are useful to separate the meanings of terms, so 'direct' is likely a sub-type of property as well (in this case having to do with wikipedia dumps).
For the specifics, you'd need to hope the authors have exposed some naming convention, or be caught in a loop of "was it p:P51 or p:P15 or maybe wdt:P51?". And may luck be with you because the "semantics" of semantic technology have been lost.