SPARQL query to sort by property - sparql

Assume having the following triples in the triplestore, i.e. five resources that have both a "hierarchical structure" as well as a "horizontal order":
<kiwi> rico:isOrWasIncludedIn <fruits> .
<apple> rico:isOrWasIncludedIn <fruits> .
<plum> rico:isOrWasIncludedIn <fruits> .
<orange> rico:isOrWasIncludedIn <fruits> .
<banana> rico:isOrWasIncludedIn <fruits> .
<orange> rico:followsOrFollowed <plum> .
<banana> rico:followsOrFollowed <kiwi> .
<apple> rico:followsOrFollowed <orange> .
<plum> rico:followsOrFollowed <banana> .
How would I query the triplestore with SPARQL to return the resources that are included in <fruits> in correct order like:
<kiwi>
<banana>
<plum>
<orange>
<apple>

To isolate those iris you can write a query like this:
PREFIX rico: <long form goes here>
SELECT DISTINCT ?iri WHERE
{
?iri rico:isOrWasIncludedIn <fruits> .
}
I think I understand your request for order by. You could use a sub-query and bind a sequence of numbers to the entities in order - 1,2,3 etc. Then order by this new variable.

Related

SPARQL how to return date value that has no prefix?

I'm trying to query some data from https://data.niod.nl/PoolParty/wiki/WO2_Thesaurus using SPARQL queries.
One thing I am unable to accomplish is to return the value of startDate (or endDate in some cases). A startDate may look like <startDate xmlns="http://schema.org/" rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1942-05-03</startDate>, see the RDF fragment below:
<rdf:Description rdf:about="https://data.niod.nl/WO2_Thesaurus/events/6556">
<dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2018-07-17T11:56:18Z</dcterms:created>
<dcterms:creator rdf:resource="https://cultureelerfgoed.poolparty.biz/user/jongmal"/>
<dcterms:source xml:lang="nl">Erik Schumacher, 1942. Oorlog op alle fronten. Leven in bezet Nederland (Houten: Spectrum, 2017)</dcterms:source>
<startDate xmlns="http://schema.org/" rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1942-05-03</startDate>
<skos:broader rdf:resource="https://data.niod.nl/WO2_Thesaurus/2030"/>
</rdf:Description>
Compared to the other elements, which have prefixes such as rdf, dcterms, or skos, startDate does not have a prefix.
My current query (without startDate) looks like this:
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
SELECT ?s ?prefLabel ?scopeNote WHERE {
?s skos:inScheme <https://data.niod.nl/WO2_Thesaurus/events/4329> .
?s skos:prefLabel ?prefLabel .
?s skos:scopeNote ?scopeNote .
FILTER (lang(?prefLabel)= "nl") .
}
LIMIT 50
So, my question is, how do I add the value of startDate to the result of my query?

Matching two words together in sparql

How to List the laureate awards (given by their label) for which the description of the contribution (given by nobel:motivation) contains the word "human" together with the word "peace" (i.e., both words must be there).
I have use the bds:search namespace from the the full-text search feature of Blazegraph.
After visiting this link i have composed this query
Free text search in sparql when you have multiword and scaping character
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bds: <http://www.bigdata.com/rdf/search#>
PREFIX nobel: <http://data.nobelprize.org/terms/>
SELECT ?awards ?description
WHERE {
?entity rdfs:label ?awards .
?entity nobel:motivation ?description .
FILTER ( bds:search ( ?description, '"human" AND "peace"' ) )
}
This query is returning me the following error on execution shown in image.
Error Image
How to correct this query and get the desired result?
You may take a look at the specification of this dataset or download an RDF dump of the dataset
Use bds:search to search for "human" category.Then apply filter and contain function to "peace".
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bds: <http://www.bigdata.com/rdf/search#>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX bif: <http://www.openlinksw.com/schemas/bif#>
SELECT ?awards ?description
WHERE {
?entity rdfs:label ?awards .
?entity nobel:motivation ?description .
?description bds:search "human" .
FILTER (CONTAINS(?description, "peace"))
}

Length of path between nodes, allowing for multiple relation types (SPARQL)

I have to write a SPARQL query that returns the length of the path between two nodes (:persA and :persD) that are connected by these relationships:
#prefix : <http://www.example.org/> .
:persA :knows :knowRelation1 .
:knowRelation1 :hasPerson :persB .
:persB :knows :knowRelation2 .
:knowRelation2 :hasPerson :persC .
:persC :knows :knowRelation3 .
:knowRelation3 :hasPerson :persD .
I tried with this query:
PREFIX : <http://www.example.org/>
SELECT (COUNT(?mid) AS ?length)
WHERE
{
:persA (:knows | :hasPerson)* ?mid .
?mid (:knows | :hasPerson)+ :persD .
}
the result seems to be a infinite loop.
Any advice/examples of how this can be done?
After fixing some syntax errors in your initial post, the provided triples and query work for me in GraphDB Free 8.2 and BlazeGraph 2.1.1. I have since applied these edits to your post itself.
added a trailing / to your definition of the empty prefix
added a trailing . to your prefix definition line (required if you want to start with a #)
fixed the spelling of length (OK, that's just a cosmetic fix)
.
#prefix : <http://www.example.org/> .
:persA :knows :knowRelation1 .
:knowRelation1 :hasPerson :persB .
:persB :knows :knowRelation2 .
:knowRelation2 :hasPerson :persC .
:persC :knows :knowRelation3 .
:knowRelation3 :hasPerson :persD .
.
PREFIX : <http://www.example.org/>
SELECT (COUNT(?mid) AS ?length)
WHERE
{ :persA (:knows|:hasPerson)* ?mid .
?mid (:knows|:hasPerson)+ :persD
}
result:
length
"6"^^xsd:integer

Pairwise comparison with SPARQL

I'd like to compare a collection of objects pairwise for a given similarity metric. The metric will be defined explicitly such that some properties much match exactly and some others can only be so different to each other (i.e., comparing floats: no more than a 50% SMAPE between them).
How would I go about constructing such a query? The output would ideally be an Nx2 table, where each row contains two IRIs for the comparable objects. Duplicates (i.e., 1==2 is a match as well as 2==1) are admissible but if we can avoid them that would be great as well.
I would like to run this on all pairs with a single query. I would probably be able to figure out how to do it for a given object, but when querying across all objects simultaneously this problem becomes much more difficult.
Does anyone have insights into how to perform this?
The idea is this:
PREFIX ex: <http://example.org/ex#>
SELECT DISTINCT ?subject1 ?subject2
WHERE {
?subject1 ex:belongs ex:commonCategory .
?subject2 ex:belongs ex:commonCategory .
?subject1 ex:exactProperty ?e .
?subject2 ex:exactProperty ?e .
?subject1 ex:approxProperty ?a1 .
?subject2 ex:approxProperty ?a2 .
FILTER ( ?subject1 > ?subject2 ) .
FILTER ( (abs(?a1-?a2)/(abs(?a1)+abs(?a2))) < 0.5 )
}
E.g., on DBpedia:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX umbel-rc: <http://umbel.org/umbel/rc/>
SELECT DISTINCT ?subject1 ?subject2
WHERE {
?subject1 rdf:type umbel-rc:Actor .
?subject2 rdf:type umbel-rc:Actor .
?subject1 dbo:spouse ?spouse1 .
?subject2 dbo:spouse ?spouse2 .
?subject1 dbo:wikiPageID ?ID1 .
?subject2 dbo:wikiPageID ?ID2 .
FILTER ( ?subject1 > ?subject2 ) .
FILTER ( ?spouse1 = ?spouse2 ) .
FILTER ( abs(?ID1-?ID2)/xsd:float(?ID1+?ID2) < 0.05 )
}
Thus, probably, Zsa Zsa Gabor and Magda Gabor are the same person.
Both were spouses of George Sanders and their wikiPageID's are not very different from each other.
Some explanations:
The ?subject1 > ?subject2 clause removes "permutation duplicates";
On the usage of xsd:float see this question.

How do i fit that sparql calculation?

this is my actual problem:
?var0 is a group variable and ?var1 is not. But whenever I try to validate the syntax, there comes the following error message:
Non-group key variable in SELECT: ?var1 in expression ( sum(?var0) / ?var1 )
The complete Query:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX cz: <http://www.vs.cs.hs-rm.de/ontostor/SVC#Cluster>
PREFIX n: <http://www.vs.cs.hs-rm.de/ontostor/SVC#Node>
SELECT ( (SUM(?var0) / ?var1) AS ?result)
WHERE{
?chain0 rdf:type rdfs:Property .
?chain0 rdfs:domain <http://www.vs.cs.hs-rm.de/ontostor/SVC#Cluster> .
?chain0 rdfs:range <http://www.vs.cs.hs-rm.de/ontostor/SVC#Node> .
?this ?chain0 ?arg0 .
?arg0 n:node_realtime_cpu ?var0 .
?this cz:node_count ?var1 .
}
My question is how to correct that calculation to fit the SPARQL syntax?
The immediate problem is that ?var1 is not grouped on, so a fix would be to simply append
GROUP BY ?var1
at the end of your query.
However, whether that gives you the calculation you actually want is another matter.
It's not quite clear what you're trying to calculate, but it looks as if you're attempting to determine the average node_realtime_cpu for a cluster. If that is the case, you can probably do your calculation by just using SPARQL's AVG function instead:
SELECT ( AVG(?var0) AS ?result)
WHERE{
?chain0 rdf:type rdfs:Property .
?chain0 rdfs:domain <http://www.vs.cs.hs-rm.de/ontostor/SVC#Cluster> .
?chain0 rdfs:range <http://www.vs.cs.hs-rm.de/ontostor/SVC#Node> .
?this ?chain0 ?arg0 .
?arg0 n:node_realtime_cpu ?var0 .
}
GROUP BY ?this // grouping on the cluster identifier so we get an average _per cluster_
Yet another alternative would be to keep your query as-is, but group on two variables:
GROUP BY ?this ?var1
Which is best depends on what your data looks like and what, exactly, you're trying to calculate.