Property Path equivalent to OPTION(TRANSITIVE) statement in SPARQL - sparql

I'm a naive user trying to replicate a query that results in the following type of string:
derives_from some (epithelial cell and (part_of some (uterine cervix and (part_of some (Homo sapiens and (has disease some adenocarcinoma)))))
I'm at a hackathon, we have no ontology/SPARQL expert, and we're just trying to get these related fields out of this ontology and into SOLR. We're desperate!
The webpage this is from
http://www.ontobee.org/ontology/CLO?iri=http://purl.obolibrary.org/obo/CLO_000368)
helpfully provides all the SPARQL queries that are used on the page. I think this is the relevant query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?ref ?refp ?label ?o FROM <http://purl.obolibrary.org/obo/merged/CLO> WHERE {
?ref ?refp ?o .
FILTER ( ?refp IN ( owl:equivalentClass, rdfs:subClassOf ) ) .
OPTIONAL { ?ref rdfs:label ?label } .
{
{
SELECT ?s ?o FROM <http://purl.obolibrary.org/obo/merged/CLO> WHERE {
?o ?p ?s .
FILTER ( ?p IN ( rdf:first, rdf:rest, owl:intersectionOf, owl:unionOf, owl:someValuesFrom, owl:hasValue, owl:allValuesFrom, owl:complementOf, owl:inverseOf, owl:onClass, owl:onProperty ) )
}
}
OPTION ( TRANSITIVE, t_in( ?s ), t_out( ?o ), t_step( ?s ) as ?link ).
FILTER ( ?s= <http://purl.obolibrary.org/obo/CLO_0003684> )
}
}
ORDER BY ?label
However, I can't even run that to check, because my SPARQL endpoint doesn't support Virtuoso. http://sparql.bioontology.org
So, it spits back an error on OPTION(TRANSITIVE).
Can anyone show me the equivalent standard pathway language? There are varying pathway lengths between the target nodes.

Virtuoso's transitivity operator is more powerful than what standard SPARQL provides, so in the general case it will not always be possible to express the same thing in a single standard SPARQL query. However, I believe that in this case it is possible.
The following property path would be the equivalent (disclaimer, I haven't tested this query, but it should give you the general idea):
?o (rdf:first|rdf:rest|owl:intersectionOf|owl:unionOf|owl:someValuesFrom|owl:hasValue|owl:allValuesFrom|owl:complementOf|owl:inverseOf|owl:onClass|owl:onProperty)+ ?s .
The full query would become something like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?ref ?refp ?label ?o FROM <http://purl.obolibrary.org/obo/merged/CLO> WHERE {
?ref ?refp ?o .
FILTER ( ?refp IN ( owl:equivalentClass, rdfs:subClassOf ) ) .
OPTIONAL { ?ref rdfs:label ?label } .
{
{
SELECT ?s ?o FROM <http://purl.obolibrary.org/obo/merged/CLO> WHERE {
?o (rdf:first|rdf:rest|owl:intersectionOf|owl:unionOf|owl:someValuesFrom|owl:hasValue|owl:allValuesFrom|owl:complementOf|owl:inverseOf|owl:onClass|owl:onProperty)+ ?s .
}
}
FILTER ( ?s= <http://purl.obolibrary.org/obo/CLO_0003684> )
}
}
ORDER BY ?label
Note, by the way, that the FILTER condition on your variable ?s is outside the subselect, which might make this query a bit of a performance hog. Since you are not using ?s anywhere else in the query, you can simplify this part of the query further, eliminating the FILTER, like so:
{
{
SELECT ?o FROM <http://purl.obolibrary.org/obo/merged/CLO> WHERE {
?o (rdf:first|rdf:rest|owl:intersectionOf|owl:unionOf|owl:someValuesFrom|owl:hasValue|owl:allValuesFrom|owl:complementOf|owl:inverseOf|owl:onClass|owl:onProperty)+ <http://purl.obolibrary.org/obo/CLO_0003684> .
}
}
}

Related

SPARQL CONSTRUCT subClassOf

i want to create a construct query with triples of subjects, that has a certain subclass.
That works fine:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX AS: <http://www.w3.org/ns/activitystreams#>
CONSTRUCT {?s ?p ?o}
WHERE {
?s rdf:type/rdfs:subClassOf* AS:Create ;
?p ?o .
}
But now i want to ask for more than one type!
Something like
WHERE {
?s rdf:type/rdfs:subClassOf* AS:Create|AS:Announce ;
?p ?o .
}
any idea ?
You can use a VALUES clause for this:
VALUES ?cls {AS:Create AS:Announce} ?s rdf:type/rdfs:subClassOf* ?cls ;

Get structure of RDF graph by SPARQL query

How can I get only triples which represent graph structure - class and properties hierarchy (i.e. without individuals, property values)?
It seems that I need rdf:type, owl:class and etc. triplets. So that is my variant:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
select ?s ?p ?o
where
{
{
graph <http://graph.org/gr>
{
?s rdf:type ?o.
?s ?p ?o.
}
FILTER
(?o IN (owl:Class, owl:DatatypeProperty, owl:AnnotationProperty, owl:ObjectProperty, owl:DataRange, owl:Ontology,
owl:DataRange,owl:DeprecatedClass,owl:DeprecatedProperty,owl:OntologyProperty,rdfs:Class,owl:Restriction,owl:InverseFunctionalProperty,
owl:FunctionalProperty,owl:AllDisjointClasses,rdf:Property, rdfs:Datatype )
)
}
UNION
{
graph <http://graph.org/gr>
{
?s ?p ?o.
}
FILTER
(?p IN (rdfs:subClassOf,rdfs:subPropertyOf,rdfs:domain,rdfs:range,rdfs:label,rdfs:comment,rdfs:member,
rdf:first,rdf:rest,owl:allValuesFrom,owl:someValuesFrom,owl:AnnotationProperty,owl:equivalentClass,
owl:equivalentProperty,owl:hasValue,owl:OntologyProperty,owl:SymmetricProperty,owl:TransitiveProperty,
owl:versionInfo,owl:priorVersion,owl:oneOf,owl:maxCardinality,owl:minCardinality,owl:inverseOf,
owl:incompatibleWith,owl:intersectionOf,owl:imports,owl:backwardCompatibleWith,owl:AllDifferent,
owl:differentFrom,owl:disjointWith,owl:distinctMembers,owl:complementOf,owl:cardinality,owl:unionOf,owl:onProperty))
}
}

SPARQL Matching Literals with **ANY** Language Tags without run into timeout

I need to select the entity that have a "taxon rank (P105)" of "species (Q7432)" which have a label that match a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org;
this query goes fine and return the entity to me with satisfying response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"#de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem here is that my requisite is not to specify the language but the lexical forms of the labels in the underlying dataset ( wikidata) has language tags so i need a way to get Literal Equality for any language.
I tried some possible solution but I didn't find any query that didn't result in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here the list of what I tried (..and I always get TIMEOUT) :
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution or some SPARQL syntax the doesn't end up to a TIMEOUT message.
PS: with reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt I don't understand if language ranges or ```wildcards`` could help in some way.
EDIT
I successfully tested (without falling timeout) a similar query in DbPedia by using virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved similar problem - want to find entity with label string in any language. I recommend do not use FILTER, because it is too slow. Rather use UNION like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"#de . }
UNION { ?entity rdfs:label "Topinambur"#en . }
UNION { ?entity rdfs:label "Topinambur"#fr . }
}
GROUP BY ?entity
LIMIT 100
Try it!
This solution is not perfect, because you have to enumater all languages, but is fast and reliable. List of all available wikidata language are here.
This answer proposes three options:
Be more specific.
The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
Wait until they implement full-text search.
Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
Try it!
It seems that even the query below sometime works on wikidata.dbpedia.org without falling into timeout:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Try it!
Two hours ago I've removed this statement on Wikidata:
wd:Q161378 rdfs:label "topinambur"#ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Working further on from #quick's answer, and showing it for lexemes rather than labels. First identifying relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"#', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Try it!
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
Try it!
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?item rdfs:label ?word ;
}
GROUP BY ?language

Sesame repository not being updated using INSERT despite no error

I am trying to update a Sesame repository with data from dbpedia. I have the following query:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
INSERT{?s ?p ?o}
WHERE {
SERVICE <http://dbpedia.org/sparql>{
{:Rotavirus_vaccine ?p ?o.
}
UNION
{
?s ?p :Rotavirus_vaccine.
}
}
}
This query doesn't show any error doesn't update the repository. On the other hand, splitting the UNION into two separate queries and then updating the repository one by one works. Why do the queries work in isolation but not in union? The code of an individual query is:
INSERT{:Rotavirus_vaccine ?p ?o}
WHERE {
SERVICE <http://dbpedia.org/sparql>{
{:Rotavirus_vaccine ?p ?o.
}
}
}
?s ?p ?o must all be defined in a row for the template be meaningful. Any time a variable is not bound, no update is done.
In one branch of the UNION, ?s ?p are defined and in the other ?p ?o. So all 3 are not defined in the same row.
Either add a BIND or a FILTER e.g for the first part:
{:Rotavirus_vaccine ?p ?o. BIND(:Rotavirus_vaccine AS ?s) }
{?s ?p ?o. FILTER(?s = :Rotavirus_vaccine }
or use this
INSERT{:Rotavirus_vaccine ?p ?o.
?s ?p :Rotavirus_vaccine.
}
because exactly one of those is defined for each case.
I've been able to execute a functional query by using BIND for both clauses of the UNION. The code is:
INSERT{?s ?p ?o}
WHERE
{
SERVICE <dbpedia.org/sparql>
{
{:Rotavirus_vaccine ?p ?o. BIND(:Rotavirus_vaccine AS ?s)}
UNION {?s ?p :Rotavirus_vaccine. BIND(:Rotavirus_vaccine AS ?o) }
}
}

How the pass the output of one sparql query as a input to another sparql query

I am trying get the dbpedia movie link using the movie name in the first query and pass that link in the second query to get the movies similar to this movie.For e.g Lagaan.Now instead of passing the link manually in the second query is there a way to combine the two queries and pass the output of first query as an input to the second query.i.e:the link of the movie lagaan.Also,if the first query gives multiple links eg:if i am searching for Harry potter it will return multiple harry potter series links so,it should handle that case as well.
Query1
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
select distinct ?film where {
?film a dbpedia-owl:Film .
?film rdfs:label ?label .
filter regex( str(?label), "Lagaan", "i")
}
limit 10
Query 2
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
select ?similar (count(?p) as ?similarity) where {
values ?movie { <http://dbpedia.org/resource/Lagaan> }
?similar ?p ?o ; a dbpedia-owl:Film .
?movie ?p ?o .
}
group by ?similar ?movie
having count(?p) > 35
order by desc(?similarity)
Edited query:
select ?film ?similar (count(?p) as ?similarity) where {
{
select distinct ?film where {
?film a dbpedia-owl:Film .
?film rdfs:label ?label .
filter regex( str(?label), "Lagaan", "i")
}
}
?similar ?p ?o ; a dbpedia-owl:Film .
?film ?p ?o .
}
group by ?similar ?film
having count(?p) > 35
order by desc(?similarity)
corrected query as told by Joshua Taylor
select ?film ?other (count(*) as ?similarity) {
{
select ?film where {
?film a dbpedia-owl:Film ; rdfs:label ?label .
filter contains(lcase(?label),"lagaan")
}
limit 1
}
?film ?p ?o .
?other a dbpedia-owl:Film ; ?p ?o .
}
group by ?film ?other
having count(?p) > 25
order by desc(?similarity)
is there a way to combine the two queries and pass the output of first
query as an input to the second query.
SPARQL 1.1 defines subqueries. The results of inner queries are available to outer queries, so they are "passed" to them. In your case, you would have something along the lines of:
select ?similarMovie (... as ?similarity) where {
{ #-- QUERY 1, find one or more films
select distinct ?film where {
#-- ...
}
}
#-- QUERY 2, find films similar to ?film
#-- ...
}