SPARQL: Insert query results as RDF LIST

I want to get, from graph A, results bound to a specific variable, let's say ?s.
Next, I want to insert those results as an RDF list in graph B.
This is my SPARQL update:
prefix foo: <http://foo.com/>
insert {
  graph <http://B.com> {
    ?var foo:propA foo:A ;
         foo:propB [ foo:propA foo:A ;
                     foo:propX ( ?s ) ] ;
         foo:propC ?o .
  }
}
where {
  graph <http://A.com> {
    ?s ?p ?o .
    BIND(URI(CONCAT("http://example.org/", STRAFTER(STR(?o), "http://someuri.org/"))) as ?var)
  }
}
My problem is that it inserts this data:
<http://example.org/varX> foo:propA foo:A ;
    foo:propB [ foo:propA foo:A ;
                foo:propX ( <http://s1.com> ) ],
              [ foo:propA foo:A ;
                foo:propX ( <http://s2.com> ) ] ;
    foo:propC <http://oX.com> .
Instead I want it to insert this:
<http://example.org/varX> foo:propA foo:A ;
    foo:propB [ foo:propA foo:A ;
                foo:propX ( <http://s1.com> <http://s2.com> ) ] ;
    foo:propC <http://oX.com> .
Is it possible to achieve this result?
Basically, I want the object of the foo:propX predicate to be an RDF list containing the values bound to the variable ?s.
Note: the exact same query executes fine in RDF4J, but strangely causes Blazegraph to throw a
MalformedQueryException: Undefined vocabulary: http://www.w3.org/1999/02/22-rdf-syntax-ns#first

I don't think this is possible using SPARQL only. You'll need to use some of the API functionality to create the RDF collection.
One way to do this is to first construct your graphB as a Model object in memory, and then at the end insert that model in one go. Something along these lines (untested, but this should illustrate the general idea - have a look at the RDF4J documentation and javadoc for more details):
ValueFactory vf = conn.getValueFactory();
TupleQuery query = conn.prepareTupleQuery("SELECT ?s ?o ?var WHERE ...");
List<BindingSet> queryResult = QueryResults.asList(query.evaluate());

// map values of ?var to values of ?s
Map<Resource, List<Value>> varToS = new HashMap<>();
for (BindingSet bs : queryResult) {
    // one possible way to fill the map from the query result
    Resource var = (Resource) bs.getValue("var");
    varToS.computeIfAbsent(var, v -> new ArrayList<>()).add(bs.getValue("s"));
}

// start building our result graph
Model graphB = new TreeModel();
ModelBuilder mb = new ModelBuilder(graphB);
mb.setNamespace("foo", "http://example.org/");
mb.namedGraph("foo:graphB");
for (Resource var : varToS.keySet()) {
    BNode propBValue = vf.createBNode();
    BNode propXValue = vf.createBNode();
    mb.subject(var)
      .add("foo:propA", "foo:A")
      .add("foo:propB", propBValue)
      .subject(propBValue)
      .add("foo:propA", "foo:A")
      .add("foo:propX", propXValue);
    // add the values of ?s for the given ?var as an RDF collection
    RDFCollections.asRDF(varToS.get(var), propXValue, graphB);
}

// insert our created graphB model into the database
conn.add(graphB);

Related

Best approach to create URIs with Sparql (like auto increment)

I am currently writing a service which creates new items (data) from user input. To save these items in an RDF Graph Store (currently using Sesame via SPARQL 1.1), I need to add a subject URI to the data. My approach is to use a number that is incremented for each new item. So e.g.:
<http://example.org/item/15> dct:title "Example Title" .
<http://example.org/item/16> dct:title "Other Item" .
What's the best approach to get an incremented number for new items (like auto increment in MySQL/MongoDB) via SPARQL? Or to issue some data and have the endpoint automatically create a URI from a template (as is done for blank nodes)?
But I don't want to use blank nodes as subjects for these items.
Is there a better solution than using an incremented number? My users don't care about the URI... and I don't want to handle collisions like those created by hashing the data and using the hash as part of the subject.
If you maintain a designated counter during updates, then something along these lines will do it.
First, insert a counter into your dataset:
insert data {
  graph <urn:counters> { <urn:Example> <urn:count> 1 }
}
Then a typical update looks like this:
PREFIX dct: <http://purl.org/dc/terms/>
delete {
  # remove the old value of the counter
  graph <urn:counters> { <urn:Example> <urn:count> ?old }
}
insert {
  # update the new value of the counter
  graph <urn:counters> { <urn:Example> <urn:count> ?new }
  # add your new data using the constructed IRI
  GRAPH <http://example.com> {
    ?id dct:title "Example Title" ;
        a <http://example.org/ontology/Example> .
  }
}
where {
  # retrieve the counter
  graph <urn:counters> { <urn:Example> <urn:count> ?old }
  # compute the new value
  bind(?old+1 as ?new)
  # construct the IRI
  bind(IRI(concat("http://example.org/item/", str(?old))) as ?id)
}
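To make the effect concrete: starting from the counter value of 1 inserted above, a single run of this update would leave the dataset looking roughly like this (an illustrative sketch, using the same graphs and the dct: prefix as in the query):
graph <urn:counters> { <urn:Example> <urn:count> 2 }
GRAPH <http://example.com> {
  <http://example.org/item/1> dct:title "Example Title" ;
      a <http://example.org/ontology/Example> .
}
The counter has advanced to 2, while the new item is numbered with the old counter value, since the item IRI is built from str(?old).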
Assuming the class of your items is http://example.org/ontology/Example, the query becomes the following. Note: items must be inserted one by one, as only one new URI is computed at each transaction.
PREFIX dct: <http://purl.org/dc/terms/>
INSERT {
  GRAPH <http://example.com> {
    ?id dct:title "Example Title" ;
        a <http://example.org/ontology/Example> .
  }
} WHERE {
  SELECT ?id WHERE {
    {
      SELECT (count(*) AS ?c) WHERE {
        GRAPH <http://example.com> { ?s a <http://example.org/ontology/Example> }
      }
    }
    BIND(IRI(CONCAT("http://example.org/item/", STR(?c))) AS ?id)
  }
}
(Tested with GraphDB 8.4.0 using RDF4J 2.2.2)
You said you are open to other options than an auto-incremented number. One good alternative is to use UUIDs.
If you don't care at all what the URI looks like, you can use the UUID function:
INSERT {
?uri dct:title "Example Title"
}
WHERE {
BIND (UUID() AS ?uri)
}
This will generate URIs like <urn:uuid:b9302fb5-642e-4d3b-af19-29a8f6d894c9>.
If you'd rather have HTTP URIs in your own namespace, you can use strUUID:
INSERT {
?uri dct:title "Example Title"
}
WHERE {
BIND (IRI(CONCAT("http://example.org/item/", strUUID())) AS ?uri)
}
This will generate URIs like http://example.org/item/73cd4307-8a99-4691-a608-b5bda64fb6c1.
UUIDs are pretty good. Collision risk is negligible. The functions are part of the SPARQL standard. The only downside really is that they are long and ugly.
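For instance, combining strUUID() with the named graph and item class used in the earlier answers (both carried over here as assumptions, not part of the original question), a complete insert might look like this sketch:
PREFIX dct: <http://purl.org/dc/terms/>
INSERT {
  GRAPH <http://example.com> {
    ?uri dct:title "Example Title" ;
        a <http://example.org/ontology/Example> .
  }
}
WHERE {
  BIND (IRI(CONCAT("http://example.org/item/", strUUID())) AS ?uri)
}
Each execution mints a fresh, practically collision-free IRI, so items can be inserted concurrently without maintaining a counter.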

Exact match of variable string in SPARQL Wikidata Query Service

Exact match of variable string in SPARQL Wikidata Query Service at https://query.wikidata.org does not give the results I expected.
I was expecting I could do:
SELECT * {
hint:Query hint:optimizer "None" .
{ SELECT DISTINCT (xsd:string(?author_name_) AS ?author_name) { wd:Q5565155 skos:altLabel ?author_name_ . } }
?work wdt:P2093 ?author_name .
}
But I get no results back from the Wikidata Query Service.
However, if I use the "=" comparison, I can match the strings:
SELECT * {
hint:Query hint:optimizer "None" .
{ SELECT DISTINCT (xsd:string(?author_name_) AS ?author_name) { wd:Q5565155 skos:altLabel ?author_name_ . } }
?work wdt:P50 wd:Q5565155 .
?work wdt:P2093 ?author_name__ .
FILTER (?author_name = ?author_name__)
}
With the current data in Wikidata, I get five rows returned in this query.
Another way to get this data is by using a BIND:
SELECT * {
BIND("Knudsen GM" AS ?author_name)
?work wdt:P2093 ?author_name .
}
I suppose there might be something wrong with the casting as this does not return anything:
SELECT * {
BIND(xsd:string("Knudsen GM") AS ?author_name)
?work wdt:P2093 ?author_name .
}
Variations of the original query with xsd:string changed to STR, or with no conversion at all, do not yield any result rows either.

Conditional subquery in SPIN function (SPARQL)

How do I change the query formula based on whether or not a variable is bound?
I am invoking the magic property like this:
WHERE {
VALUES (?subj) {
([my bound positional parameter value goes here...])
}
?subj :myMagicProperty ?result .
}
Inside the magic property, I do a union:
?result a :Rule .
{
  ?result :someProp ?subj .
}
UNION
{
  FILTER NOT EXISTS {
    ?result :someProp ?anyValue .
  }
}
In other words, get me all results where :someProp is this value or :someProp is not defined.
Here is the tricky part. If ?subj is unbound (i.e., I set it as UNDEF in the VALUES block), the above query goes wild and returns everything.
Instead, I want to check whether ?subj is unbound. If ?subj is unbound, :myMagicProperty should only return the following results:
FILTER NOT EXISTS {
?result ?someProp ?anyValue .
}
I have experimented with using FILTER and the BOUND function, but I can't figure out how to get the correct behavior. How can I drop one of the UNION clauses from my query when ?subj is not bound?
Updates
Revised the first query to add the VALUES block.
Added missing ?result a :Rule . statement.
Corrected ?someProp to :someProp.
First I'd like to confirm what your intent is. I'd like to do that by asking you to respond to the following query that you can run in TopBraid Composer.
SELECT *
WHERE {
  GRAPH <http://topbraid.org/examples/kennedys> {
    VALUES (?property) { (kennedys:firstName) (kennedys:lastName) (UNDEF) }
    {
      FILTER(BOUND(?property))
      ?s ?property ?result .
    }
    UNION
    {
      FILTER(!BOUND(?property))
      BIND("not sure what you want to do in this case" AS ?result)
    }
  }
}
The difference between the code above and your code is that I am setting values of your ?someProp in the VALUES statement, whereas you are setting ?subj.
The UNIONed subgraphs are using BOUND and !BOUND as guards.
Before going further, I'd like to hear a clearer explanation of the query you are wanting to build. Then I can show you the magic property that will be needed.
It's this piece of your initial post I need to understand more:
Here is the tricky part. If ?subj is unbound (i.e., I set it as UNDEF in the VALUES block), the above query goes wild and returns everything.
Instead, I want to check if ?subj is unbound. If ?subj is unbound, myMagicProperty should only return the following results:
FILTER NOT EXISTS {
?result ?someProp ?anyValue .
}
With ?someProp undefined, as well as ?result and ?anyValue, what were you expecting to come back? Also this subgraph of yours has no assertions that will populate the graph and therefore will return nothing.
Ralph
The trick is, I need to do the UNION using a variable different than the one passed in as an argument. This way, the UNION operation does not cause the unbound parameter to be bound. After the UNION, I can use a FILTER to control the results based on the input parameter.
SELECT ?result
WHERE {
  ?result a :Rule .
  {
    SELECT ?rule ?value ?anyValueMatch
    WHERE {
      {
        ?rule :someProp ?value .
        BIND (false AS ?anyValueMatch)
      }
      UNION
      {
        FILTER NOT EXISTS {
          ?rule :someProp ?any .
        }
        BIND (true AS ?anyValueMatch)
      }
    }
  }
  FILTER ((bound(?subj) && (?value = ?subj)) || (?anyValueMatch = true))
}
Another way to do this is with COALESCE:
SELECT ?result
WHERE {
  ?result a :Rule .
  OPTIONAL {
    ?result :someProp ?value .
  }
  FILTER (COALESCE(?value = ?subj, !bound(?value)))
}
...this avoids the sub-select and simply filters to include only the ?result matches where ?value = ?subj; if that clause fails, the !bound() clause ensures that matches without a :someProp property are also included.

SPARQL update with optional parts

Consider the following SPARQL update:
INSERT {
  ?performance
      mo:performer ?performer ;                      # optional
      mo:singer ?singer ;                            # optional
      mo:performance_of [
          dc:title ?title ;                          # mandatory
          mo:composed_in [ a mo:Composition ;
                           mo:composer ?composer ]   # optional
      ]
}
WHERE {}
If I do not provide values (e.g. via Jena's ParameterizedSparqlString.setIri()) for ?performer, ?singer, or ?composer, this update won't insert statements with the corresponding objects, which is as intended.
But how can I suppress [] a mo:Composition as well if ?composer is missing? Creating it in a second INSERT whose WHERE filters on ISIRI(?composer) doesn't seem to be an option, because that INSERT won't know the blank node that has already been created by the first one.
So how can I support this kind of optional parameters in a single SPARQL update? E.g., is there any means for "storing" the blank node between two INSERTs?
The following seems to work, provided the caller sets ?composition to a blank node if and only if it sets ?composer to an IRI.
if (composer != null) {
    parameterizedSparqlString.setIri("composer", composer);
    parameterizedSparqlString.setParam("composition", NodeFactory.createAnon());
}
INSERT {
  ?performance
      mo:performer ?performer ;              # optional
      mo:singer ?singer ;                    # optional
      mo:performance_of [
          dc:title ?title ;                  # mandatory
          mo:composed_in ?composition ] .    # optional
  ?composition a mo:Composition ;
      mo:composer ?composer .
}
WHERE {}
Hats off to @Joshua Taylor for the lead.
I'd still prefer a self-contained version that does not require the additional parameter ?composition (i.e. works without making additional demands on the caller), if that's possible at all.

SPARQL DATASET definition

Can anyone explain to me what the statement "Dataset(QuadPattern,μ,GS,GS)" means? Specifically, I am trying to figure out the model of the DELETE DATA operation (DELETE DATA QuadData), but I can't understand what Dataset(QuadPattern,{},GS,GS) means.
You seem to be referring to the SPARQL 1.1 Update spec:
Dataset(QuadPattern,μ,DS,GS) ...[an] auxiliary function constructs an RDF Dataset from a QuadPattern, given a solution mapping and an RDF Dataset.
Put simply, it's a function which takes a bunch of RDF in graphs which may include variables, e.g.:
GRAPH ?g { ?person a Person ; ex:tel ?tel }
{ ?g ex:source ?source }
and a set of solutions μ:
{ ?g => <http://example.com/graph1> , ?person => <http://example.com/alice> , ?tel => "0898 505050" , ?source => <http://192.com/> }
{ ?g => <http://example.com/graph2> , ?person => <http://example.com/bob> , ?tel => "117 117" , ?source => <http://192.com/> }
and binds those values, resulting in a dataset:
{
<http://example.com/graph1> ex:source <http://192.com/> .
<http://example.com/graph2> ex:source <http://192.com/> .
}
GRAPH <http://example.com/graph1> { <http://example.com/alice> a Person ; ex:tel "0898 505050" }
GRAPH <http://example.com/graph2> { <http://example.com/bob> a Person ; ex:tel "117 117" }
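Applying the same definition to DELETE DATA: QuadData may not contain variables, so the solution mapping is the empty mapping {}, and Dataset(QuadPattern,{},GS,GS) simply returns the quads exactly as written; those quads are then removed from the Graph Store. For example (a sketch, with an assumed namespace for ex:):
PREFIX ex: <http://example.com/ns#>  # assumed namespace, for illustration only
DELETE DATA {
  GRAPH <http://example.com/graph1> { <http://example.com/alice> ex:tel "0898 505050" . }
}
Here the constructed dataset contains just that one quad, and executing the operation removes it from <http://example.com/graph1>.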