SPARQL query turtle file with blank nodes in the graph - sparql

I'm trying to understand blank nodes in turtle correctly, and I need to know if I have understood them correctly.
Suppose we had a turtle file:
#prefix sn: <....some uri...> .
_:a sn:name "person1";
sn:email "email1#test.com" .
_:b sn:name "person2";
sn:email "email2#test.com" .
_:c sn:name "person3" .
and we have SPARQL query:
PREFIX sn: <...some uri...>
SELECT ?email WHERE {
?x sn:name "person1" .
?x sn:email ?email .
}
This would only return the email of person 1, because ?x binds to the blank node label _:a... So my question is if an undefined variable ?x can still bind to a blank node that has a label, just like it would if it was not blank... So in this example, return will be:
email
--------------------------
<mailto:email1#test.com> |
Would that be correct? Thanks.

You guess correctly: in both cases (and as said I don't really see a difference between your two examples) the answer wil be only the e-mail of person 1.
A blank node is, from SPARQL's perspective, a resource like any other, and it will bind variables to a blank node in exactly the same way. So in the above example, since we know that the two statements use the same blank node as their subject (since they use the same label), they will be seen by SPARQL as being about the same subject.
One caveat: the SPARQL engine/triplestore that you use will most likely not preserve the specific blank node id a: when adding your data to a triplestore it will be replaced by some generated internal id. However, the triplestore will of course make sure that both occurrences of a in your file get replaced by the exact same generated id.

Related

How to retrieve anonymous subclass from OWL in SPARQL

I am unsure of the SPARQL query needed to replicate the results of DL Query "has part some benzamide".
That query should return all entities that have some part that is a benzamide or a subclass of benzamide.
My SPARQL attempt:
PREFIX opioid: <https://mac389.github.io/ontology#>
SELECT ?substance ?substance_label
{ ?substance rdfs:subClassOf* opioid:chemical_entity.
?substance rdfs:subClassOf* / opioid:has_part / owl:someValuesFrom opioid:benzamide.
?substance rdfs:label ?substance_label }
Link to OWL as RDF/XML
In this code the 1st and 3rd lines work as intended, retrieving a list of all chemical entities and their labels. When I add the second line, the query returns no answers (there should be 21 items which is what the DL Query in Protege returns).
How does one query for anonymous subclasses like this?
I have looked at this question but I am looking for a subclass that only fulfills one of the property restrictions. I don't fully understand the answer to this question, but mine seems very related.

Sparql query to read from all named graphs without knowing the names

I am looking to run a SPARQL query over any dataset. We dont know the names of the named graphs in the datasets.
These are lots of documentation and examples of selection from named graphs when you know the name of the named graph/s. There are examples showing listing named graphs.
We are running the Jena from Java so it would be possible to run 2 queries, the first gets the named graphs and we inject these into the 2nd.
But surely you can write a single query that reads from all named graphs when you dont know their names?
Note: we are looking to stay away from using default graph/s as their behaviour seems implementation dependent.
Example:
{
?s foaf:name ?name ;
vCard:nickname ?nickName .
}
If you want the pattern to match within one graph and wish to try each graph, use the GRAPH ?g form.
GRAPH ?g
{ ?s foaf:name ?name ;
vc:nickname ?nickName .
}
If you want to make a query where the pattern matches across named graphs, -- e.g. foaf:name in one graph and vCard:nickname in another, same subject --
then set union default graph tdb2:unionDefaultGraph true then the default graph as seen by the query is the union (actually, RDF merge - no duplicates) of all the named graphs. Use the pattern as originally given.
Fuseki configuration file extract:
:dataset_tdb2 rdf:type tdb2:DatasetTDB2 ;
tdb2:location "DB2" ;
## Optional - with union default for query and update WHERE matching.
tdb2:unionDefaultGraph true ;
.
In code, not Fuseki, the application can use Dataset.getUnionModel().

Adding rdf:type to a blank node in Ontorefine

I'm using Ontorefine within GraphDB to create RDF triples from a csv source. It seems impossible to add a rdf:type when the subject is a blank node.
When you hit the arrow in the bottom right corner of an object that is a blank node, you can type a owl:Restiction (or rdf:type owl:Restriction for that matter), but after applying, it disappears.
Even if you manually add the statement (in the JSON source) that this blank node has rdf:type owl:Restriction as property en object, it still does not create the actual triple. See picture below. The configuration is there, but the example states: empty empty. And indeed, no triple is created.
Is this in someway a feature of Ontotext, or is this a bug? In several occasions this is needed, for instance in creating a restriction in OWL.
Hello this is still present in the tool but it seems to be a purely a bug in the mapping UI . If you produce a mapping without the rdf:type and generate a SPAQRL query from that mapping, you can just add the rdf:type in the construct/insert part of the mapping query and use it to produce RDF or insert it GraphDB.

How to get composer of a symphony in dbpedia?

This is my query
select *
{
?symphonies_by_composer <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:Symphonies_by_composer> .
?symphony <http://purl.org/dc/terms/subject> ?symphonies_by_composer .
}
I run it over Dbpedia end point http://dbpedia.org/sparql/
it gives me many symphonies. i want to construct my triples, adding my own property, which is mo:composedBy like this:
PREFIX mo: <http:blablabla.com/mo#>
construct
{
?symphony mo:composedBy ?composer .
?symphony a mo:Symphony
}
{
?symphonies_by_composer <http://www.w3.org/2004/02/skos/core#broader> <http://dbpedia.org/resource/Category:Symphonies_by_composer> .
?symphony <http://purl.org/dc/terms/subject> ?symphonies_by_composer .
}
but i don't know how to get the binding for the ?composer variable.
Do you know how ?
(I'm aware that there might be no way to get it, if you think there is no way, kindly just let me know and i will pass, unfortunately, those data)
There seems to be no explicit relation in DBPedia connecting these symphonies to an actual resource that represents the composer.
A possible workaround is to extract the name of the composer from the prefLabel of the category, by snipping off the first bit ("Symphonies by"):
PREFIX mo: <http://example.com/mo#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
CONSTRUCT
{
?symphony mo:composedBy ?composer_name .
?symphony a mo:Symphony
}
WHERE
{
?symphonies_by_composer skos:broader <http://dbpedia.org/resource/Category:Symphonies_by_composer> ;
skos:prefLabel ?label .
?symphony dct:subject ?symphonies_by_composer .
BIND(SUBSTR(STR(?label), (STRLEN("Symphonies by ") + 1)) AS ?composer_name)
}
This will give you back the name of each composer as a literal value.
A second possible step is to try and reconstruct the actual IRI of the resource identifying the composer, from the name. For example, in the case of "Hans Werner Henze", the actual resource identifying the person is http://dbpedia.org/resource/Hans_Werner_Henze, so a simple further string operations or two, replacing spaces and concatenating with the dbpedia base IRI, will resolve this. However, this is brittle, as there is no guarantee that the resource exists, and even if it does, whether it actually identifies the composer (there might be more than one Hans Werner Henze, for instance).
Of course, you can expand this further by doing followup queries to verify that the resource exists and is the correct one, but it will require some additional trial and error. If the goal is simply the name of the composer, the first example query should work fine for most instances.

How to get a concise bounded description of a resource with Sesame?

I've been testing Sesame 2.7.2 and I got a big surprise when faced to the fact that DESCRIBE queries do not include blank nodes closure [EDIT: the right term for this is CBD for concise bounded description]
If I correctly understand, the SPARQL spec is quite loose on that and says that what is returned is actually up to the provider, but I'm still surprised at the choice, since bnodes (in the results of the describe query) cannot be used in subsequent SPARQL queries.
So the question is: how can I get a closed description of a resource <uri1> without doing:
query DESCRIBE <uri1>
iterate over the result to determine which objects are blank nodes
then DESCRIBE ?b WHERE { <uri1> pred_relating_to_bnode_ ?b }
do it recursively and chaining over as long as bnodes are found
If I'm not mistaken, depth-2 bnodes would have to be described with
DESCRIBE ?b2 WHERE {<uri1> <p1&> ?b . ?b <p2> ?b2 }
unless there is a simpler way to do this?
Finally, would it not be better and simpler to let DESCRIBE return a closed description of a resource where you can still obtain the currently returned result with something like the following?
CONSTRUCT {<uri1> ?p ?o} WHERE {<uri1> ?p ?o}
EDIT: here is an example of a closed result I want to get back from Sesame
<urn:sites#1> a my:WebSite .
<urn:sites#1> my:domainName _:autos1 .
<urn:sites#1> my:online "true"^^xsd:boolean .
_:autos1 a rdf:Alt .
_:autos1 rdf:_1 _:autos2
_:autos2 my:url "192.168.2.111:15001"#fr
_:autos2 my:url "192.168.2.111:15002"#en
Currently: DESCRIBE <urn:sites#1> returns me the same result as the query CONSTRUCT WHERE {<urn:sites#1> ?p ?o}, so I get only that
<urn:sites#1> a my:WebSite .
<urn:sites#1> my:domainName _:autos1 .
<urn:sites#1> my:online "true"^^xsd:boolean .
Partial solutions using SPARQL
Based on your comments, this isn't an exact solution yet, but note that you can describe multiple things in a given describe query. For instance, given the data:
#prefix : <http://example.org/> .
:Alice :named "Alice" ;
:likes :Bill, [ :named "Carl" ;
:likes [ :named "Daphne" ]].
:Bill :likes :Elaine ;
:named "Bill" .
you can run the query:
PREFIX : <http://example.org/>
describe :Alice ?object where {
:Alice :likes* ?object .
FILTER( isBlank( ?object ) )
}
and get the results:
#prefix : <http://example.org/> .
:Alice
:likes :Bill ;
:likes [ :likes [ :named "Daphne"
] ;
:named "Carl"
] ;
:named "Alice" .
That's not a complete description of course, because it's only following :likes out from :Alice, not arbitrary predicates. But it does get the blank nodes named "Carl" and "Daphne", which is a start.
The larger issue in Sesame
It looks like you're going to have to do something like what's described above, and possibly with multiple searches, or you're going to have to modify Sesame. The alternative to writing some creative SPARQL is to change the way that Sesame implements describe queries. Some endpoints make this relatively easy, but Sesame doesn't seem to be one of them. There's a mailing list thread from 2011, Custom SPARQL DESCRIBE Implementation, that seems addressed at this same problem.
Roberto GarcĂ­a asks:
I'm trying to customise the behaviour of SPARQL DESCRIBE queries.
I'm willing to get something similar to CBD (i.e. all properties and
values for the described resource plus all properties and values for
the blank nodes connected to it).
I have tried to reproduce a similar behaviour using a CONSTRUCT query
but the performance is not good and the query gets quite complex if I
try to consider long chains of properties pointing to blank nodes
starting from the described resource.
Jeen Broekstra replies:
The implementation of DESCRIBE in Sesame is hardcoded in the query
parser. It can only be changed by adapting the parser itself, and even
then it will be tricky, as the query model has no easy way to express it
either: it needs an extension of the algebra.
> If this is not possible, any advice about how to implement it using CONSTRUCT
queries?
I'm not sure it's technically possible to do this in a single query.
CBDs are recursive in nature, and while SPARQL does have some support
for recursivity (property chains), the problem is that you have to do an
intermediate check in every step of the property chain to see if the
bound value is a blank node or not. This is not something that SPARQL
supports out of the box: property chains are defined to have only length
of the path as the stop condition.
Perhaps something is possible using a convoluted combination of
subqueries, unions and optionals, but I doubt it.
I think the best workaround is instead to use the standard DESCRIBE
format that Sesame supports, and for each blank node value in that
result do a separate consecutive query. In other words: you solve it by
hand.
The only other option is to log a feature request for support of CBDs in
Sesame. I can't give any guarantees about if/when that will be followed
up on though.