SPARQL - Concatenation fails if any one variable is not bound

Hi, I am using AllegroGraph and a SPARQL query to retrieve the results. This is sample data that reproduces my issue.
Consider the data below, where a person has first, middle and last names.
<http://mydomain.com/person1> <http://mydomain.com/firstName> "John"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person1> <http://mydomain.com/middleName> "Paul"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person1> <http://mydomain.com/lastName> "Jai"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>
<http://mydomain.com/person6> <http://mydomain.com/middleName> "Mannan"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person6> <http://mydomain.com/lastName> "Sathish"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person6> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>
Now I need to compute each person's name by combining all three names. The names are optional, and a person may lack any of the first, middle and last names.
The query I tried:
select ?person ?name ?firstName ?middleName ?lastName where
{
?person rdf:type <http://mydomain.com/person>.
optional {?person <http://mydomain.com/firstName> ?firstName}.
optional {?person <http://mydomain.com/middleName> ?middleName}.
optional {?person <http://mydomain.com/lastName> ?lastName}.
bind (concat(str(?firstName),str(?middleName),str(?lastName)) as ?name).
}
But the result set does not contain the name for person6 (Mannan Sathish), since the first name is not present. Please let me know how I can ignore ?firstName if it's not bound.

If a variable is not bound, then str(...) raises an error on evaluation, the whole BIND fails, and ?name is left unbound for that row.
COALESCE can be used to give default values to expressions.
bind ( COALESCE(?firstName, "") As ?firstName1)
bind ( COALESCE(?middleName, "") As ?middleName1)
bind ( COALESCE(?lastName, "") As ?lastName1)
bind (concat(str(?firstName1),str(?middleName1),str(?lastName1)) as ?name)
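To make the failure mode concrete, here is a minimal Python sketch that models the behaviour described above. It is only an illustration, not how AllegroGraph evaluates queries: an unbound variable is modeled as None, and the helper names (sparql_str, coalesce, bind_name_plain, bind_name_coalesce) are made up for this example.

```python
from typing import Optional


class UnboundError(Exception):
    """Stands in for the SPARQL evaluation error on an unbound variable."""


def sparql_str(value: Optional[str]) -> str:
    # Models str(?var): an unbound variable (None here) raises an error.
    if value is None:
        raise UnboundError("variable is not bound")
    return value


def coalesce(*values: Optional[str]) -> str:
    # Models COALESCE: return the first argument that is bound.
    for value in values:
        if value is not None:
            return value
    raise UnboundError("no argument is bound")


def bind_name_plain(first, middle, last) -> Optional[str]:
    # Models bind(concat(str(?firstName), ...) as ?name): any error
    # leaves ?name unbound (None) for that row.
    try:
        return sparql_str(first) + sparql_str(middle) + sparql_str(last)
    except UnboundError:
        return None


def bind_name_coalesce(first, middle, last) -> str:
    # Models the COALESCE version: unbound parts default to "".
    return "".join(coalesce(part, "") for part in (first, middle, last))
```

For person6 from the question (no first name), bind_name_plain(None, "Mannan", "Sathish") returns None, while bind_name_coalesce(None, "Mannan", "Sathish") returns "MannanSathish". As in the query, concat adds no separator, so spaces would have to be added explicitly.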

Related

Assigning specific values to multiple variable in SPARQL

I am trying to query the WikiData database using their SPARQL endpoint. Here I want to assign specific values to certain variables independent from each other. I know that the VALUES keyword can be used for this, however, is there a way of doing this for multiple variables without having to specify each possible combination? For example, the query looks like this:
SELECT ?relation1 ?item1
WHERE {wd:Q31 ?relation1 ?item1 .}
I would like to specify the values for both ?relation1 and ?item1 without having to go over each possible combination.
Thank you.
Edited answer:
Yes, you can use multiple VALUES statements.
This can be written like this:
SELECT *
WHERE {
VALUES ?person {wd:Q35332 wd:Q23844} #Brad Pitt and George Clooney
VALUES ?property {wdt:P27 wdt:P19} #Country of nationality and place of birth
?person ?property ?o ;
rdfs:label ?personLabel .
?o rdfs:label ?oLabel .
FILTER(LANG(?personLabel) = 'en' && LANG(?oLabel) = 'en')
}
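As a rough illustration of why two separate VALUES blocks save typing, the Python sketch below generates the single-block form that would otherwise be needed, with every combination spelled out. The function name single_values_block is hypothetical; the identifiers are the ones from the query above.

```python
from itertools import product

persons = ["wd:Q35332", "wd:Q23844"]
properties = ["wdt:P27", "wdt:P19"]


def single_values_block(persons, properties):
    # Two independent VALUES blocks behave like their cross product,
    # so the equivalent single block must list every (person, property) pair.
    rows = " ".join(f"({p} {q})" for p, q in product(persons, properties))
    return f"VALUES (?person ?property) {{ {rows} }}"


combined = single_values_block(persons, properties)
```

With two persons and two properties the combined block already lists four pairs; the number grows multiplicatively with each added value, whereas the separate VALUES blocks only grow by one entry.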

SPARQL - How to Filter if a property exists, otherwise add as is?

I am trying to select people whose date of birth (wdt:P569) is after 1920 if they have a date of birth, and also include everybody else who doesn't have a date of birth.
SELECT ?politician
WHERE {
VALUES ?occupations {wd:Q193391 wd:Q116}
?politician wdt:P106 ?occupations .
OPTIONAL {
?politician wdt:P569 ?date .
FILTER (year(?date) > 1920)
}
MINUS {?politician wdt:P102 ?o }
FILTER NOT EXISTS {?politician wdt:P570|wdt:P509|wdt:P20 ?o }
}
It can probably be solved using something like OPTIONAL(FILTER(BIND(IF(BOUND())))), but I can't figure it out.
SELECT ?politician
WHERE {
VALUES ?occupations {wd:Q82955 wd:Q212238 wd:Q193391 wd:Q116}
?politician wdt:P106 ?occupations .
OPTIONAL {?politician wdt:P569 ?date .}
FILTER (!bound(?date) || year(?date) > 1920)
MINUS {?politician wdt:P102 ?o }
FILTER NOT EXISTS {?politician wdt:P570|wdt:P509|wdt:P20 ?o }
}
Thank you AKSW!
The OPTIONAL block's job is to "initialize", a.k.a. bind, a variable when a match exists; otherwise the variable stays unbound.
The FILTER( !BOUND(?date) || year(?date) > 1920 ) says: keep every match where ?date was not "initialized", and, where it was "initialized", keep only the matches for which year(?date) > 1920 evaluates to true.
and while I'm on it:
VALUES ?occupations {wd:Q82955 wd:Q212238 wd:Q193391 wd:Q116} is roughly JavaScript's equivalent of saying var x = (a || b || c || d).
More precisely: if any of those values matches, then ?occupations becomes bound to it as well.
Grouping your properties into VALUES makes the query more efficient and there's less of a chance it will time out.
For most practical purposes MINUS and FILTER NOT EXISTS can be considered identical (they differ in edge cases).
Both of them check whether the match at hand (at that point of the execution) has any of the listed properties (wdt:P570, wdt:P509, ...). But unlike VALUES, the alternatives here are combined with |, which means OR.
I used both MINUS and FILTER NOT EXISTS because using just one of them made the query time out.
(It might have become more efficient because wdt:P102 produces by far the most matches, leaving too big a chunk for the following operators to sort through.)
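The difference between the question's query (FILTER inside OPTIONAL) and the working one (FILTER after OPTIONAL) can be modeled with a small Python sketch. This is only an approximation of the semantics under simplifying assumptions: each row is a (politician, birth year) pair, None stands for an unbound ?date, and the sample data is invented.

```python
# Hypothetical sample rows: (politician, year of birth or None if no wdt:P569).
people = [("a", 1900), ("b", 1950), ("c", None)]


def filter_inside_optional(rows):
    # OPTIONAL { ?politician wdt:P569 ?date . FILTER (year(?date) > 1920) }
    # A failing filter only makes the optional part not match, so ?date is
    # left unbound and the row itself is still returned.
    return [(p, y if y is not None and y > 1920 else None) for p, y in rows]


def filter_after_optional(rows):
    # OPTIONAL { ?politician wdt:P569 ?date . }
    # FILTER (!bound(?date) || year(?date) > 1920)
    # Rows with a bound date that is too early are removed entirely.
    return [(p, y) for p, y in rows if y is None or y > 1920]


inside = filter_inside_optional(people)
after = filter_after_optional(people)
```

In this model the first variant still returns politician "a" (born 1900, but with ?date unbound), while the second drops "a" and keeps "b" and "c", which is the behaviour the question asks for.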

Matching user inputs with triplestore data

I am trying to develop a decision support system for soybean disease analysis using Protégé, a Fuseki server and PHP. I want to take input from a user and return a disease name based on matching a literal. The query I am trying is:
$sparql = "PREFIX : <http://localhost/soy_test1#>
select ?Diaporthe_stem_canker where {?Diaporthe_stem_canker :hasTOC ' . TOC . ' . ' . TOC .' :hasTOC 'July'}";
It always returns Diaporthe_stem_canker no matter what the input is. I am trying to find what's the error in the sparql query. Any help would be appreciated.
Thanks.
Edit
The ontology I have is as follows:
A plant soybean is described by some plant descriptors. Plant descriptor has several types with some attribute values.
Example:
:PlantDescriptor :hasType "EnvironmentalDescriptors" .
:PlantDescriptor :hasAttribute :TOC .
:PlantDescriptor :hasTOC "July" .
:TOC rdfs:subClassOf :Attribute .
where PlantDescriptor and Attribute are the concepts of the ontology.
I am trying to take an input from the user for the attribute "TOC", which I want to match whether it's "July" or not; if it matches with July then it will return the disease name Diaporthe-stem-canker.
The string part is for matching the user input text with the value "July". That's why I have used it as a php variable inside SPARQL Query.
The query that I used in my Query engine is like this:
PREFIX : <http://localhost/soy_test1#>
select distinct ?P ?X ?Time
where {?PlantDescriptor :hasType ?P .
?PlantDescriptor :hasAttribute ?X.
?X :hasTOC "July" .
?X :hasTOC ?Time .}
which returns the value "July" under ?Time, the url for Attribute concept and the type of the PlantDescriptor for the corresponding Attribute.
Hope I have made my point more clear.
Thanks in advance.
The example isn't entirely clear because you say you are searching for Diaporthe-stem-canker, but that's not in your sample data. Assuming Diaporthe-stem-canker is a PlantDescriptor, the following query could get you what you need:
SELECT ?disease
WHERE {
?disease :hasTOC "July" .
}
And in your case the "July" string is inserted into the query string, so you may have something like the following if you were using PHP (no idea what your syntax is):
$sparql = "PREFIX : <http://localhost/soy_test1#>
select ?disease where {?disease :hasTOC \"" . $date . "\" . }";
...where $date holds the user's input, e.g. July. The escaped quotes make the value a string literal in the resulting query.
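Since the value comes from user input, it is worth quoting and escaping it before splicing it into the query. The sketch below shows one way to do that; it is written in Python rather than PHP purely for illustration, and the function names sparql_string_literal and build_query are hypothetical. It assumes the input should always be treated as a plain string literal.

```python
PREFIX = "PREFIX : <http://localhost/soy_test1#>"


def sparql_string_literal(text: str) -> str:
    # Escape backslashes first, then quotes and line breaks, and wrap the
    # result in double quotes so it is a valid SPARQL string literal.
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def build_query(user_input: str) -> str:
    # The user input only ever appears inside the quoted literal.
    literal = sparql_string_literal(user_input)
    return f"{PREFIX}\nSELECT ?disease WHERE {{ ?disease :hasTOC {literal} . }}"


query = build_query("July")
```

With the input July this produces the pattern ?disease :hasTOC "July" . and an input containing a double quote stays inside the literal instead of breaking the query.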

Sparql - Order by to return empty values last

I use AllegroGraph and Sparql 1.1.
I need to do an ascending sort on a column and make the SPARQL query return empty values last.
Sample data:
<http://mydomain.com/person1> <http://mydomain.com/name> "John"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>
<http://mydomain.com/person2> <http://mydomain.com/name> "Abraham"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
<http://mydomain.com/person2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>
<http://mydomain.com/person3> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>
Here I need the SPARQL query to return Abraham, followed by John, and then person3, which does not have a name attribute.
Query I use:
select ?name ?person {
?person <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person>.
optional {?person <http://mydomain.com/name> ?name.}
} order by asc(?name )
Current output is person3 (null), followed by Abraham and John.
Please let me know your thoughts.
I don't have AllegroGraph at hand but AFAIK it supports multiple order conditions:
select ?name ?person {
?person <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mydomain.com/person> .
optional {?person <http://mydomain.com/name> ?name . }
} order by (!bound(?name)) asc(str(?name))
The first condition sorts on whether ?name is bound (false sorts before true, so rows with a name come first); when this condition does not find a difference, the second condition is used. Note the use of str() to convert rdf:XMLLiteral to a datatype for which comparison is supported.
(You may also want to add . at the end of each row in your ntriples data.)
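The two-level ordering can be mimicked in Python with a tuple sort key, which may help to see why it works. This is just a sketch of the idea on the sample data from the question, with None standing for an unbound ?name; the function name unbound_last is made up.

```python
# (person, name) rows; None models an unbound ?name.
rows = [("person1", "John"), ("person2", "Abraham"), ("person3", None)]


def unbound_last(row):
    person, name = row
    # First key: False (bound) sorts before True (unbound).
    # Second key: the name itself, used only to order the bound rows.
    return (name is None, name or "")


ordered = sorted(rows, key=unbound_last)
```

The result lists person2 (Abraham), then person1 (John), then person3 with no name, matching the order requested in the question.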

How to know if a string is a proper name of person or a place name using DBpedia?

I am using SPARQL queries on DBpedia in a Prolog project and I have a doubt. I would like to know whether a word is, most probably, a NAME OF A PERSON (something like: John, Mario) or a PLACE (like a city: Rome, London, New York).
I have implemented the following two queries: the first gives me the number of persons having a specific name, and the second gives me the number of places having a specific name.
1) Query for a PERSON NAME:
select COUNT(?person) where {
?person a dbpedia-owl:Person .
{ ?person foaf:givenName "John"@en }
UNION
{ ?person foaf:surname "John"@en }
}
For the name John, I obtain the following output: callret-0: 7313, so I think that it has found 7313 instances for the proper name John. Is it right?
2) Query for a PLACE NAME:
select COUNT(?place) where {
?place a dbpedia-owl:Place .
{ ?x rdfs:label "John"@en }
}
The problem is that, as you can see in the previous “place” query, I have inserted John as parameter, which is not a place name but a proper name of persons, but I obtain the following strange result: callret-0: 81900104
The problem is that, in this way, if I compare the output of the previous two queries, it seems that John is a place and not a person name! This is not good for my scope; I have tried with other personal names and it always happens that the place query gives me a bigger output than the name query.
Why? What am I missing? Are there some errors in my queries? How can I solve it to have a correct result?
Actually, when I run the query you provided:
select COUNT(?place) where {
?place a dbpedia-owl:Place .
{ ?x rdfs:label "John"@en }
}
I get the result 93027312, not 81900104, but that does not really matter much. The strange results arise because ?x and ?place don't have to be bound to the same thing, so you are getting all the dbpedia-owl:Places and counting them, but the number of result rows is the number of dbpedia-owl:Places multiplied by the number of things with rdfs:label "John"@en:
select COUNT(?place) where { ?place a dbpedia-owl:Place }
=> 646023
select COUNT(?x) where { ?x rdfs:label "John"@en }
=> 144
646023 × 144 = 93027312
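The multiplication above is what happens whenever two patterns share no variable. A small Python sketch, with invented placeholder data, shows the same effect on a scale that is easy to check by hand; the last line simply recomputes the DBpedia figures quoted above.

```python
from itertools import product

# Hypothetical stand-ins for the two independent result sets.
places = ["place1", "place2", "place3"]
things_labelled_john = ["thingA", "thingB"]

# No shared variable: every place is paired with every labelled thing.
unjoined = list(product(places, things_labelled_john))

# Shared variable: the same resource must be a place AND carry the label.
joined = [p for p in places if p in things_labelled_john]

# The counts reported for DBpedia in the answer.
dbpedia_rows = 646023 * 144
```

Here the unjoined form yields 3 × 2 = 6 rows, the joined form yields none, and the product of the two DBpedia counts is the 93027312 seen in the result.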
If you actually ask for dbpedia-owl:Places that have the rdfs:label "John"@en, you'll get no results:
select COUNT(?place) as ?numPlaces where {
?place a dbpedia-owl:Place ;
rdfs:label "John"@en .
}
Also, you might consider using dbpprop:name instead of rdfs:label. Some results seem like they are more useful that way. For instance, let us find places called "Springfield". If we ask for places with that name we get no results:
select * where {
?place a dbpedia-owl:Place ;
rdfs:label "Springfield"@en .
}
However, if we modify the query and use dbpprop:name, we get 17. Some of these are duplicates though, so you might have to do something else to remove duplicates. The point, though, is that dbpprop:name got some results, and rdfs:label didn't.
select * where {
?place a dbpedia-owl:Place ;
dbpprop:name "Springfield"@en .
}
You can even use dbpprop:name in working with the names of persons, although it's not as useful, because the dbpprop:name value for most persons is their entire name. To find persons with the given name John using dbpprop:name requires a query like:
select * where {
?person a dbpedia-owl:Person ;
dbpprop:name ?name .
FILTER( STRSTARTS( str( ?name ), "John" ) )
}
(or you could use CONTAINS instead of STRSTARTS), but this becomes much more expensive, because it has to select all persons and their names, and then filter through that set. Being able to select persons based on a specific name (e.g., with foaf:givenName) is much more efficient.