SPARQL: generate values for missing fields

I'm trying to write a SELECT query that returns all the values in a table. Some of the values are optional, and I want those to be filled with a default value if they don't exist.
This is my code:
SELECT * WHERE {
?a nmo:hasObject nm:coin
OPTIONAL
{ ?a nmo:hasAuthority ?b }
OPTIONAL
{ ?a nmo:hasMaterial ?c }}
What I get is the following:
?a  ?b  ?c
1   yx
2       ab
3   xz  bc
What I want is for missing values to be filled with the string "missing":
?a  ?b         ?c
1   yx         "missing"
2   "missing"  ab
3   xz         bc
Any ideas on how to structure the SELECT to get this output?

I'd probably use coalesce here:
SELECT
?a
(coalesce (?b, ?missing) as ?bb)
(coalesce (?c, ?missing) as ?cc)
WHERE {
VALUES ?missing { "missing" }
?a nmo:hasObject nm:coin
OPTIONAL
{ ?a nmo:hasAuthority ?b }
OPTIONAL
{ ?a nmo:hasMaterial ?c }
}
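The VALUES clause is not essential here, since COALESCE also accepts a literal directly. A minimal variant of the query above, assuming the nmo: and nm: prefixes are declared as in the question:

```sparql
# Sketch only: nmo: and nm: are assumed to be declared elsewhere, as in the question.
SELECT ?a
       (COALESCE(?b, "missing") AS ?bb)
       (COALESCE(?c, "missing") AS ?cc)
WHERE {
  ?a nmo:hasObject nm:coin .
  OPTIONAL { ?a nmo:hasAuthority ?b }
  OPTIONAL { ?a nmo:hasMaterial ?c }
}
```

COALESCE returns its first argument that evaluates without error, so an unbound ?b or ?c falls through to the literal.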

SPARQL 1.1 BIND in combination with IF can be used to achieve this:
SELECT * WHERE
{ ?a nmo:hasObject nm:coin
OPTIONAL
{ ?a nmo:hasAuthority ?b_tmp }
OPTIONAL
{ ?a nmo:hasMaterial ?c_tmp }
BIND(if(bound(?b_tmp), ?b_tmp, "missing") AS ?b)
BIND(if(bound(?c_tmp), ?c_tmp, "missing") AS ?c)
}

Related

SPARQL returning empty result when passing MIN(?date1) from subquery into outer query, with BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)

<This question is now resolved, see comment by Valerio Cocchi>
I am trying to pass a variable out of a subquery. The subquery takes the minimum of a set of dates ?date1 belonging to ?p and passes it to the outer query as ?minDate. The outer query then takes another date ?date2 belonging to ?p (there can be at most one ?date2 for every ?p) and computes the difference in years between ?minDate and ?date2 as an integer. I am getting a blank value for this, i.e. ?diffDate returns no value.
I am using Fuseki version 4.3.2. Here is an example of the query:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}
and an example of the kind of result I am getting:
| ?p    | ?minDate                             | ?date2                              | ?diffDate |
| <123> | "20012-11-22T00:00:00"^^xsd:dateTime | "2008-08-18T00:00:00"^^xsd:dateTime |           |
I would expect that ?diffDate would give me an integer value. Am I missing something fundamental about how subqueries work in SPARQL?
It seems you have encountered quite an obscure part of the SPARQL spec, namely how BIND works.
Normally SPARQL is evaluated without regard for the position of atoms, i.e.
SELECT *
WHERE {
?a :p1 ?b .
?b :p2 ?c .}
is the same query as:
SELECT *
WHERE {
?b :p2 ?c .
?a :p1 ?b .}
However, BIND is position dependent, so e.g.:
SELECT *
WHERE {
?a :p1 ?b .
BIND(:john AS ?a)}
is not a valid query, whereas:
SELECT *
WHERE {
BIND(:john AS ?a)
?a :p1 ?b .
}
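is entirely valid. Variables used inside the BIND expression are also affected: the query still parses, but the BIND only sees bindings from the patterns that precede it in the same group, so a variable that is bound later is still unbound when the expression is evaluated.
See the description of BIND in the SPARQL 1.1 Query Language specification for more.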
To go back to your problem, your BIND is using the ?minDate variable before it has been bound, which is why it fails to produce a value for ?diffDate.
This query should do the trick:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate) #Put the BIND after all the variables it uses are bound.
}
Alternatively, you could evaluate the difference in the SELECT, like so:
SELECT ?p ?minDate ?date2 (YEAR(?minDate) - YEAR(?date2) AS ?diffDate)
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}

Querying two RDF graphs in SPARQL

I have two RDF knowledge bases
KB1 (path/to/file1.rdf), which includes the following two triples:
a b c
a f e
and KB2 (path/to/file2.rdf) with the following triple:
c t p
I want to get all paths, including ones like ?a ?b ?c followed by ?c ?t ?p, since c is common to both.
How can I do that in SPARQL?
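With two separate KBs, this is what we call a "federated query". Note that SERVICE expects each IRI to identify a SPARQL endpoint, so each file has to be served by one. Here is an example: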
SELECT * WHERE {
SERVICE <URI_for_path/to/file1.rdf> {
?a ?b ?c .
}
OPTIONAL {
SERVICE <URI_for_path/to/file2.rdf> {
?c ?t ?p . }
}
}
Note that ?t and ?p are only bound when a matching triple exists.
The simplest way is to load both files into a single KB, which allows a simple query:
SELECT * WHERE {
?a ?b ?c .
?c ?t ?p .
}
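If loading into a store is not an option, some engines can build the default graph from several documents named in FROM clauses. Whether the IRIs are actually fetched is implementation dependent. A sketch, with placeholder file IRIs:

```sparql
# Assumption: the engine dereferences FROM IRIs and loads the documents.
# The file: IRIs below are placeholders for the two files.
SELECT *
FROM <file:///path/to/file1.rdf>
FROM <file:///path/to/file2.rdf>
WHERE {
  ?a ?b ?c .
  ?c ?t ?p .
}
```

With more than one FROM clause, the default graph is the merge of the named documents, so the join on ?c works as in the single-KB query.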

SPARQL Subquery Graph Name

I have the following SPARQL query, which contains a sub-select. The data contains multiple graphs, and I want to know which graph the values for ?b and ?m come from:
select ?b, ?m, ?g1
where {
{
select ?o1, ?o2, ?e
where{
graph ?g{
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> ?o1.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> ?o2.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_iteration> '0'^^xsd:decimal.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_exposureday> ?e.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_pid1> ?o1.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_pid2> ?o2.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_acttype1> '5'^^xsd:decimal.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_acttype2> '5'^^xsd:decimal
}
}ORDER BY ASC(?e) LIMIT 1
}
{
graph ?g1 {
?b <http://ndssl.bi.vt.edu/chicago/vocab/getInfectedBy> ?o1.
?m <http://ndssl.bi.vt.edu/chicago/vocab/getInfectedBy>* ?b.
}
}
}
The second graph pattern contains a transitive property path, and the query returns the following correct result:
b m g1
----------------------------------------------------- ----------------------------------------------------- -------------------------------------------------------
<http://ndssl.bi.vt.edu/chicago/person/pid#446734805> <http://ndssl.bi.vt.edu/chicago/person/pid#446753456> <http://ndssl.bi.vt.edu/chicago/dendrogram/replicate1/>
However, I want to see the intermediate nodes and count the path length of the transitive relationship. If I remove graph ?g1 from the query, it shows the intermediate nodes, as follows:
b m
--------------------------------------------------- ---------------------------------------------------
http://ndssl.bi.vt.edu/chicago/person/pid#446718746 http://ndssl.bi.vt.edu/chicago/person/pid#446718746
http://ndssl.bi.vt.edu/chicago/person/pid#446734805 http://ndssl.bi.vt.edu/chicago/person/pid#446734805
http://ndssl.bi.vt.edu/chicago/person/pid#446734805 http://ndssl.bi.vt.edu/chicago/person/pid#446753456
The purpose of the query is to find the graph name for the matching ?b and ?m, which is why I want to use graph ?g1. Is it possible to show the intermediate nodes while keeping the graph keyword? I am using Virtuoso.
Since you're not using ?g, the first GRAPH statement is not necessary. Note also that the second GRAPH statement only uses ?o1, so the following query should do what you want. You may also want to check the SPARQL syntax in your select clause (standard SPARQL separates projected variables with spaces, not commas).
PREFIX ndssl: <http://ndssl.bi.vt.edu/chicago/vocab/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?b ?m ?g1
WHERE {
{
SELECT ?o1
WHERE {
?s ndssl:dendrogram_infector_pid ?o1 .
?s ndssl:dendrogram_infectee_pid ?o2 .
?s ndssl:dendrogram_iteration '0'^^xsd:decimal .
?s ndssl:dendrogram_exposureday ?e .
?s1 ndssl:contactnetwork_pid1 ?o1 .
?s1 ndssl:contactnetwork_pid2 ?o2 .
?s1 ndssl:contactnetwork_acttype1 '5'^^xsd:decimal .
?s1 ndssl:contactnetwork_acttype2 '5'^^xsd:decimal
} ORDER BY ASC(?e) LIMIT 1
}
GRAPH ?g1 {
?b ndssl:getInfectedBy ?o1 .
?m ndssl:getInfectedBy* ?b .
}
}
In the endpoint provided there isn't a match for ?b or ?m, regardless of whether a GRAPH statement is used or not.

Simplify SPARQL query

I’m trying to make a rather complex call to DBPedia using a SPARQL query. I’d like to get some information about a city (district, federal state/»Bundesland«, postal codes, coordinates and geographically related cities).
SELECT * WHERE {
#input
?x rdfs:label "Bentzin"@de.
#district
OPTIONAL {
?x dbpedia-owl:district ?district
# ?x dbpprop:landkreis ?district
{ SELECT * WHERE {
?district rdfs:label ?districtName
FILTER(lang(?districtName) = "de")
?district dbpprop:capital ?districtCapital
{ SELECT * WHERE {
?districtCapital rdfs:label ?districtCapitalName
FILTER(lang(?districtCapitalName) = "de")
}}
}}
}
#federal state
OPTIONAL {
# ?x dbpprop:bundesland ?land
?x dbpedia-owl:federalState ?land
{ SELECT * WHERE {
?land rdfs:label ?landName
FILTER(lang(?landName) = "de")
}}
}
#postal codes
?x dbpedia-owl:postalCode ?zip.
#coordinates
?x geo:lat ?lat.
?x geo:long ?long
#cities in the south
OPTIONAL {
?x dbpprop:south ?south
{SELECT * WHERE {
?south rdfs:label ?southName
FILTER(lang(?southName) = "de")
}}
}
#cities in the north
OPTIONAL {
?x dbpprop:north ?north
{ SELECT * WHERE {
?north rdfs:label ?northName
FILTER(lang(?northName) = "de")
}}
}
#cities in the west
...
}
This works in some cases, however, there are a few major problems.
There are several different properties that may contain the value for the federal state or district. Sometimes it’s dbpprop:landkreis (the German word for district), in other cases it’s dbpedia-owl:district. Is it possible to combine those two in cases where only one of them is set?
Further, I’d like to read out the names of the cities in the north, northwest, …. Sometimes, these cities are referenced in dbpprop:north etc. The basic query for each direction is the same:
OPTIONAL {
?x dbpprop:north ?north
{ SELECT * WHERE {
?north rdfs:label ?northName
FILTER(lang(?northName) = "de")
}}
}
I really don’t want to repeat that eight times, once for every direction. Is there any way to simplify this?
Sometimes, there are multiple other cities referenced (example). In those cases, there are multiple datasets returned. Is there any possibility to get a list of the names of those cities in a single dataset instead?
+---+---+---------------------------------------------------------------+
| x | … | southName |
+---+---+---------------------------------------------------------------+
| … | … | "Darmstadt"@de, "Stuttgart"@de, "Karlsruhe"@de, "Mannheim"@de |
+---+---+---------------------------------------------------------------+
Your feedback and your ideas are greatly appreciated!
Till
There are several different properties that may contain the value for the federal state or district. Sometimes it’s dbpprop:landkreis (the
German word for district), in other cases it’s dbpedia-owl:district. Is
it possible to combine those two in cases where only one of them is
set?
SPARQL property paths are great for this. You can just say
?subject dbpprop:landkreis|dbpedia-owl:district ?district
If there are more properties, you'll probably prefer a version with values:
values ?districtProperty { dbpprop:landkreis dbpedia-owl:district }
?subject ?districtProperty ?district
Further, I’d like to read out the names of the cities in the north,
northwest, …. Sometimes, these cities are referenced in dbpprop:north
etc. The basic query for each direction is the same:
OPTIONAL {
?x dbpprop:north ?north
{ SELECT * WHERE {
?north rdfs:label ?northName
FILTER(lang(?northName) = "de")
}}
}
Again, it's values to the rescue. Also, don't use lang(…) = … to filter languages, use langMatches:
optional {
values ?directionProp { dbpprop:north
#-- ...
dbpprop:south }
?subject ?directionProp ?direction
optional {
?direction rdfs:label ?directionLabel
filter langMatches(lang(?directionLabel),"de")
}
}
Sometimes, there are multiple other cities referenced (example). In
those cases, there are multiple datasets returned. Is there any
possibility to get a list of the names of those cities in a single
dataset instead?
+---+---+---------------------------------------------------------------+
| x | … | southName |
+---+---+---------------------------------------------------------------+
| … | … | "Darmstadt"@de, "Stuttgart"@de, "Karlsruhe"@de, "Mannheim"@de |
+---+---+---------------------------------------------------------------+
That's what group by and group_concat are for. See Aggregating results from SPARQL query. I don't actually see these results in the query you gave though, so I don't have good data to test a result with.
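A minimal sketch of that approach for the southern neighbours, assuming the dbpprop:south property from the question. The variable ?southNames is introduced here for illustration only:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpprop: <http://dbpedia.org/property/>

# One row per ?x, with all German labels of the southern neighbours in one string.
SELECT ?x (GROUP_CONCAT(DISTINCT STR(?southName); SEPARATOR=", ") AS ?southNames)
WHERE {
  ?x rdfs:label "Bentzin"@de .
  OPTIONAL {
    ?x dbpprop:south ?south .
    ?south rdfs:label ?southName
    FILTER langMatches(lang(?southName), "de")
  }
}
GROUP BY ?x
```

STR() drops the language tag before concatenation. Every non-aggregated variable in the SELECT has to appear in GROUP BY, so adding ?zip or ?lat to the projection means adding them to the grouping as well.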
You also seem to be doing a lot of unnecessary subselects. You can just put additional triples in the graph pattern; you don't need a nested query to get additional information.
With those considerations, your query becomes:
select * where {
?x rdfs:label "Bentzin"@de ;
dbpedia-owl:postalCode ?zip ;
geo:lat ?lat ;
geo:long ?long
#-- district
optional {
?x dbpedia-owl:district|dbpprop:landkreis ?district .
?district rdfs:label ?districtName
filter langMatches(lang(?districtName),"de")
optional {
?district dbpprop:capital ?districtCapital .
?districtCapital rdfs:label ?districtCapitalName
filter langMatches(lang(?districtCapitalName),"de")
}
}
#federal state
optional {
?x dbpprop:bundesland|dbpedia-owl:federalState ?land .
?land rdfs:label ?landName
filter langMatches(lang(?landName),"de")
}
values ?directionProp { dbpprop:south dbpprop:north }
optional {
?x ?directionProp ?directionPlace .
?directionPlace rdfs:label ?directionName
filter langMatches(lang(?directionName),"de")
}
}
Now, if you're just looking for the names of these things, without the associated URIs, you can actually use property paths to shorten a lot of the patterns that retrieve labels. E.g.:
select * where {
?x rdfs:label "Bentzin"@de ;
dbpedia-owl:postalCode ?zip ;
geo:lat ?lat ;
geo:long ?long
#-- district
optional {
?x (dbpedia-owl:district|dbpprop:landkreis)/rdfs:label ?districtName
filter langMatches(lang(?districtName),"de")
optional {
?x (dbpedia-owl:district|dbpprop:landkreis)/dbpprop:capital/rdfs:label ?districtCapitalName
filter langMatches(lang(?districtCapitalName),"de")
}
}
#-- federal state
optional {
?x (dbpprop:bundesland|dbpedia-owl:federalState)/rdfs:label ?landName
filter langMatches(lang(?landName),"de")
}
optional {
values ?directionProp { dbpprop:south dbpprop:north }
?x ?directionProp ?directionPlace .
?directionPlace rdfs:label ?directionName
filter langMatches(lang(?directionName),"de")
}
}

Sparql: Arithmetic operators between variables?

Hi I have a query like this:
SELECT ?a ?b
WHERE
{
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
}
How can I divide the first number by the second? Ideally something like this:
SELECT ?a/?b
WHERE
{
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
}
Thank you
In SPARQL 1.1 you can do it using Project expressions like so:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (xsd:float(?a)/xsd:float(?b) AS ?result)
WHERE
{
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
}
You may alternately use xsd:double(?var) to cast to a double, xsd:integer(?var) to cast to an integer and xsd:decimal(?var) to cast to a decimal.
Note that SPARQL specifies type promotion rules so for example:
integer / integer = decimal
float / double = double
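These promotion rules can be checked with constant expressions. A small sketch (the variable names are illustrative, and the results assume an engine that follows the specification's operator mapping, which some do not for integer division):

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (7 / 2 AS ?asDecimal)                        # integer / integer -> xsd:decimal (3.5)
       (xsd:float(7) / xsd:double(2) AS ?asDouble)  # float / double -> xsd:double
WHERE { }
```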
If you really need the result in a guaranteed datatype you can cast the whole divide expression e.g.
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT (xsd:double(xsd:float(?a)/xsd:float(?b)) AS ?result)
WHERE
{
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
}
There are two ways to achieve this:
SELECT ((?a/?b) AS ?result) WHERE {
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
}
or
SELECT ?result WHERE {
?c property:name "myThing"@en .
?c property:firstValue ?b .
?c property:secondValue ?a .
BIND((?a/?b) AS ?result) # BIND must come after the patterns that bind ?a and ?b
}