Incomplete sem:sparql results for Property Paths when "optimize=0" - sparql

Consider the following subClassOf relations:
m1
|_ m1_1
   |_ m1_1_1
The following query correctly returns m1_1 and m1_1_1:
sem:sparql('
SELECT *
WHERE {
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf>+ <http://example.org/people#m1> .
}', (),
("optimize=1"))
However, the following query only returns m1_1:
sem:sparql('
SELECT *
WHERE {
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf>+ <http://example.org/people#m1> .
}', (),
("optimize=0"))
It's also the case with *.
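For reference, the result expected from subClassOf+ is the transitive closure of the hierarchy above. Below is a minimal Python sketch of that expectation; it is not how MarkLogic evaluates property paths, and the SUBCLASS_OF list and the function name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory copy of the three-class hierarchy from the question:
# each pair is (subclass, superclass).
SUBCLASS_OF = [("m1_1", "m1"), ("m1_1_1", "m1_1")]


def transitive_subclasses(edges, root):
    """Return every class that reaches root by one or more subClassOf steps."""
    children = defaultdict(set)
    for sub, sup in edges:
        children[sup].add(sub)
    found, stack = set(), [root]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


print(sorted(transitive_subclasses(SUBCLASS_OF, "m1")))
```

With the + operator both m1_1 and m1_1_1 should come back; with * the root m1 itself would be included as well, since * also matches a path of length zero.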
After upgrading to MarkLogic 8.0-5.2, the previous sem:sparql queries return correct results but not when the object is a parameter. E.g. the following query doesn't return anything:
sem:sparql('
SELECT *
WHERE {
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf>+ ?item .
}',
(map:new ((map:entry ("item", sem:iri("http://example.org/people#m1"))))),
("optimize=0"))

Here are some possible workarounds:
sem:sparql('
SELECT *
WHERE {
BIND (?param AS ?item) .
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf>+ ?item .
}',
(map:new ((map:entry ("param", sem:iri("http://example.org/people#m1"))))),
("optimize=0"))
Or
sem:sparql('
SELECT *
WHERE {
?s <http://www.w3.org/2000/01/rdf-schema#subClassOf>+ ?item .
FILTER (?item = ?param) .
}',
(map:new ((map:entry ("param", sem:iri("http://example.org/people#m1"))))),
("optimize=0"))

Checking with a fairly complex RDFS ontology using MarkLogic 8.0-5.2, this seems to work OK. Is it possible to upgrade? There was some significant semantics work around 8.0-4.0.


SPARQL returning empty result when passing MIN(?date1) from subquery into outer query, with BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)

<This question is now resolved, see comment by Valerio Cocchi>
I am trying to pass a variable from a subquery, that takes the minimum date of a set of dates ?date1 belonging to ?p and passes this to the outer query, which then takes another date ?date2 belonging to ?p (there can be at most 1 ?date2 for every ?p) and subtracts ?minDate from ?date2 to get an integer value for the number of years between. I am getting a blank value for this, i.e. ?diffDate returns no value.
I am using Fuseki version 4.3.2. Here is an example of the query:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}
and an example of the kind of result I am getting:
| ?p    | ?minDate                              | ?date2                               | ?diffDate |
| <123> | "20012-11-22T00:00:00"^^xsd:dateTime  | "2008-08-18T00:00:00"^^xsd:dateTime  |           |
I would expect that ?diffDate would give me an integer value. Am I missing something fundamental about how subqueries work in SPARQL?
It seems you have encountered quite an obscure part of the SPARQL spec, namely how BIND works.
Normally SPARQL is evaluated without regard for the position of atoms, i.e.
SELECT *
WHERE {
?a :p1 ?b .
?b :p2 ?c .}
is the same query as:
SELECT *
WHERE {
?b :p2 ?c .
?a :p1 ?b .}
However, BIND is position dependent, so e.g.:
SELECT *
WHERE {
?a :p1 ?b .
BIND(:john AS ?a)}
is not a valid query, whereas:
SELECT *
WHERE {
BIND(:john AS ?a)
?a :p1 ?b .
}
is entirely valid. The same applies to variables used inside of the BIND, which must be declared before the BIND appears.
See here for more.
To go back to your problem, your BIND is using the ?minDate variable before it has been bound, which is why it fails to produce a value for ?diffDate.
This query should do the trick:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate) #Put the BIND after all the variables it uses are bound.
}
Alternatively, you could evaluate the difference in the SELECT, like so:
SELECT ?p ?minDate ?date2 (YEAR(?minDate) - YEAR(?date2) AS ?diffDate)
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}
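To see why the position of the BIND matters, here is a toy Python model of the two evaluation orders. It is only a sketch: solutions are plain dicts, the join and extend helpers are simplified stand-ins for the SPARQL algebra, and the sample dates are assumed values (the year 2012 is a guess at what the question's data meant).

```python
from datetime import datetime


def extend_year_diff(solutions):
    """Mimic BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)."""
    out = []
    for sol in solutions:
        new = dict(sol)
        if "minDate" in sol and "date2" in sol:
            new["diffDate"] = sol["minDate"].year - sol["date2"].year
        # Otherwise the expression raises an error and ?diffDate stays unbound.
        out.append(new)
    return out


def join(left, right):
    """Join two solution sequences on the variables they share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in set(l) & set(r)):
                joined.append({**l, **r})
    return joined


# Assumed sample data: one row from the outer pattern, one from the subquery.
outer = [{"p": "123", "date2": datetime(2008, 8, 18)}]
subquery = [{"p": "123", "minDate": datetime(2012, 11, 22)}]

bind_first = join(extend_year_diff(outer), subquery)  # BIND before the subquery
bind_last = extend_year_diff(join(outer, subquery))   # BIND after the subquery

print("diffDate" in bind_first[0])
print(bind_last[0]["diffDate"])
```

In the first order ?minDate is joined in afterwards, so it shows up in the result while ?diffDate stays empty, which matches the output in the question.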

SPARQL select from namespace

It's easy to select all statements from fuseki with a query like:
SELECT * { ?s ?o ?z}
but how to obtain all statements from a certain namespace prefix?
This can be done by testing the URI. If by "from a certain namespace prefix" you mean the triples whose subject is in the namespace, filter on ?s:
PREFIX ns: <....>
SELECT * {
?s ?o ?z
FILTER (isURI(?s) && STRSTARTS(str(?s), str(ns:) ) )
}
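If the namespace changes from call to call, the query text can be generated. The following is a minimal Python sketch; the function name and the example namespace are assumptions, and it only builds the query string without sending it to Fuseki.

```python
def build_namespace_query(namespace_iri: str) -> str:
    """Build a SELECT that keeps only triples whose subject IRI starts with namespace_iri."""
    # Reject characters that cannot appear inside <...> in SPARQL.
    if any(ch in namespace_iri for ch in '<>"{}|^`\\ '):
        raise ValueError(f"not a usable IRI: {namespace_iri!r}")
    return (
        f"PREFIX ns: <{namespace_iri}>\n"
        "SELECT * {\n"
        "  ?s ?p ?o\n"
        "  FILTER (isURI(?s) && STRSTARTS(STR(?s), STR(ns:)))\n"
        "}"
    )


# Hypothetical namespace, used only for illustration.
print(build_namespace_query("http://example.org/people#"))
```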

SPARQL Matching Literals with **ANY** Language Tags without run into timeout

I need to select the entities that have a "taxon rank (P105)" of "species (Q7432)" and a label that matches a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org.
This query works fine and returns the entity with a satisfactory response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"@de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem is that my requirement is not to specify the language, but the labels in the underlying dataset (Wikidata) carry language tags, so I need a way to get literal equality for any language.
I tried several possible solutions, but every query I tried resulted in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here is the list of what I tried (and I always get a TIMEOUT):
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution or some SPARQL syntax that doesn't end in a TIMEOUT message.
PS: With reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt, I don't understand whether language ranges or wildcards could help in some way.
EDIT
I successfully tested (without hitting a timeout) a similar query on DBpedia using the Virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved a similar problem: I wanted to find an entity by a label string in any language. I recommend not using FILTER, because it is too slow. Use UNION instead, like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"@de . }
UNION { ?entity rdfs:label "Topinambur"@en . }
UNION { ?entity rdfs:label "Topinambur"@fr . }
}
GROUP BY ?entity
LIMIT 100
This solution is not perfect, because you have to enumerate all the languages, but it is fast and reliable. A list of all available Wikidata languages is here.
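The enumeration does not have to be typed by hand. Here is a minimal Python sketch that generates such a query from a list of language tags. It uses a VALUES block instead of chained UNIONs, which should match the same language-tagged literals; the function name and the tag check are assumptions, and the rdfs:, wdt: and wd: prefixes are assumed to be predefined, as they are on query.wikidata.org.

```python
import re

# Loose check for language tags such as "de" or "zh-min-nan" (an assumption,
# not a full BCP 47 validator).
LANG_TAG = re.compile(r"[A-Za-z]+(-[A-Za-z0-9]+)*")


def build_label_query(text, languages, limit=100):
    """Build a query matching rdfs:label "text"@lang for every tag in languages."""
    for lang in languages:
        if not LANG_TAG.fullmatch(lang):
            raise ValueError(f"unexpected language tag: {lang!r}")
    # Escape backslashes and double quotes so the text stays a valid literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    literals = " ".join(f'"{escaped}"@{lang}' for lang in languages)
    return (
        "SELECT DISTINCT ?entity WHERE {\n"
        f"  VALUES ?label {{ {literals} }}\n"
        "  ?entity rdfs:label ?label ;\n"
        "          wdt:P105 wd:Q7432 .\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(build_label_query("Topinambur", ["de", "en", "fr"]))
```

The language list itself still has to come from somewhere, for example the list of Wikidata languages mentioned above.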
This answer proposes three options:
1) Be more specific. The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
2) Wait until they implement full-text search.
3) Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
It seems that even the query below sometimes works on wikidata.dbpedia.org without timing out:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Two hours ago I've removed this statement on Wikidata:
wd:Q161378 rdfs:label "topinambur"@ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Building on @quick's answer, here is the same approach for lexemes rather than labels. First, identify the relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"@', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
?item rdfs:label ?word .
BIND(LANG(?word) AS ?language)  # ?language would otherwise be unbound in GROUP BY
}
GROUP BY ?language

Select inside another select

I have the following query where I'm trying to use the id of an element from one graph to retrieve some values in another graph. However this doesn't work:
select * from <mygraph2#> where {
?s ?p ?id in {select ?id from <mygraph1#> where { ?id ?t "MyService" }
}
You don't select from a subselect. You just execute the sub-select, and it provides some variables to the enclosing query. I'm not sure where you found an example like what you showed in the question. It's not what any of the examples in the standard (see Section 12, Subqueries) look like. Your query would be something like this (note that this isn't actually legal, since you can't use FROM in subqueries):
select * from <mygraph2#> where {
?s ?p ?id
{ select ?id from <mygraph1#>
where { ?id ?t "MyService" } }
}
However, if the graphs that you're selecting from are available as named graphs in the dataset, there's no real need for the sub-select here. You can just do
select * where {
graph <mygraph2#> { ?s ?p ?id }
graph <mygraph1#> { ?id ?t "MyService" }
}

Getting Wrong Result in Sparql

I am trying to implement a SPARQL query that returns some results. Here is what I have so far.
The data I am querying looks like this:
Subject:
<http://rhizomik.net/semanticxbrl/0001397832_agph-20110930/Context_9ME_30-Sep-2011/ConvertibleNotesPayableTextBlock/>
Predicate:
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
Object:
<http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock>
My Query is
PREFIX ab: <http://www.atlanticgreenpower.com/20110930#>
SELECT ?node ?val_type ?value
WHERE {
?node ab:val_type ?val_type .
?node ab:value ?value .
}
I want to get all subjects, predicates and objects in the result. I am new to SPARQL, please help me out.
In SPARQL everything with a ? in front of it is a variable. If you want everything from the subject, predicate and object you can do it like this:
SELECT ?s ?p ?o WHERE { ?s ?p ?o }
But then you list everything. When you want to only list the values of your given subject you will have to do it like this:
SELECT * WHERE { <http://rhizomik.net/semanticxbrl/0001397832_agph-20110930/Context_9ME_30-Sep-2011/ConvertibleNotesPayableTextBlock/> ?p ?o }
And your query could easily be formatted to something like this:
PREFIX ab: <http://www.atlanticgreenpower.com/20110930#>
SELECT ?s ?valType ?value WHERE {
?s ab:val_type ?valType ;
ab:value ?value .
}
Here ; means that the next predicate-object pair shares the same subject, while . ends the triple pattern. If you want more information about how SPARQL works or how you can query, check out the Euclid project with webinar recording: Querying linked data, 2013-03-04, or just test it yourself on a SPARQL endpoint such as FactForge.
To reflect my comment below:
SELECT * WHERE { ?s ?p <http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock> }
will select everything with the given object.
Try this:
SELECT * WHERE
{
?s ?p <http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock>
}