SPARQL: multiple NOT EXISTS conditions

I'm running a SPARQL query that returns all URIs whose English label contains the keyword "apple" and that do not belong to a subclass of dbo:Species:
select distinct ?s
where
{
?s a owl:Thing . ?s rdfs:label ?label .
filter(langmatches(lang(?label), 'en')) ?label bif:contains '"apple"' .
filter not exists {?s rdf:type/rdfs:subClassOf* dbo:Species }
}
I want to exclude many more classes in the same way, so I want to filter them out like so:
filter not exists {?s rdf:type/rdfs:subClassOf* dbo:Species AND filter not exists {?s rdf:type/rdfs:subClassOf* dbo:Organisation AND filter not exists {?s rdf:type/rdfs:subClassOf* dbo:SomeOtherSubclass
How do I chain MULTIPLE ANDs together?

You can do this:
FILTER NOT EXISTS {
VALUES ?clazz { dbo:Species dbo:Organisation dbo:SomeOtherSubclass }
?s rdf:type/rdfs:subClassOf* ?clazz.
}
No guarantees on how well this performs though.
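An alternative sketch: several FILTER clauses in the same group are implicitly ANDed, so each excluded class can get its own FILTER NOT EXISTS. This assumes the prefixes the endpoint already defines (owl:, rdfs:, rdf:, dbo:, and Virtuoso's bif:), and dbo:SomeOtherSubclass is just the placeholder from the question.

```sparql
SELECT DISTINCT ?s
WHERE {
  ?s a owl:Thing ;
     rdfs:label ?label .
  FILTER(langMatches(lang(?label), 'en'))
  ?label bif:contains '"apple"' .   # Virtuoso-specific full-text match

  # Every filter in a group must hold, so these act as a chain of ANDs.
  FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:Species }
  FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:Organisation }
  FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:SomeOtherSubclass }
}
```

Whether this or the VALUES form runs faster depends on the store, so it is worth timing both.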

Related

SPARQL query - multiple OPTIONALS

I want to retrieve the entities of the classes along with some information - if exists (e.g. label, comment, image)
I run the following query:
PREFIX schema: <http://schema.org/>
SELECT DISTINCT ?entity ?label ?comment ?image
WHERE {
?entity a ?class .
OPTIONAL {?entity rdfs:label ?label } .
OPTIONAL {?entity schema:image ?image} .
OPTIONAL {?entity rdfs:comment ?comment} .
OPTIONAL {?entity rdfs:label ?label} .
FILTER (langMatches(lang(?comment), "EN") && langMatches(lang(?label), "EN")) .
}
Although some entities do have this extra information, it is not returned
(even if I rewrite the query to put all the optional patterns in a single OPTIONAL block).
How could I compose such a query?
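One likely cause, sketched here under the assumption that labels and comments carry language tags: a FILTER at the top level is evaluated after the OPTIONALs, and langMatches(lang(?comment), "EN") raises an error when ?comment is unbound, so every row with a missing label or comment is dropped. Moving each language test into its own OPTIONAL keeps the entity either way. The rdfs: declaration is added because the original query omits it.

```sparql
PREFIX schema: <http://schema.org/>
PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?entity ?label ?comment ?image
WHERE {
  ?entity a ?class .
  # Each language test only constrains its own optional part.
  OPTIONAL { ?entity rdfs:label ?label .
             FILTER(langMatches(lang(?label), "en")) }
  OPTIONAL { ?entity rdfs:comment ?comment .
             FILTER(langMatches(lang(?comment), "en")) }
  OPTIONAL { ?entity schema:image ?image }
}
```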

Sparql find entities with dbo type and sort them by count

I have to write a SPARQL query against the DBpedia endpoint which needs to:
Find all the entities containing "vienna" in the label and "city" in the abstract
Filter them keeping only the ones that have at least one dbo rdf:type
Sort the results by count of dbo types (e.g. if an entity has 5 dbo rdf:type it has to be shown before entities with 4 dbo rdf:type)
I made several attempts; the closest to the desired result is:
select distinct (str(?s) as ?s) count(?t) as ?total where {{ ?s rdfs:label "vienna"@en. ?s rdf:type ?t.}
UNION { ?s rdfs:label ?l. ?s rdf:type ?t . ?l <bif:contains> '("vienna")'
. FILTER EXISTS { ?s dbo:abstract ?cc. ?cc <bif:contains> '("city")'. FILTER(lang(?cc) = "en").}}
FILTER (!strstarts(str(?s), str("http://dbpedia.org/resource/Category:")))
. FILTER (!strstarts(str(?s), str("http://dbpedia.org/property/")))
. FILTER (!strstarts(str(?s), str("http://dbpedia.org/ontology/")))
. FILTER (strstarts(str(?t), str("http://dbpedia.org/ontology/"))).}
LIMIT 50
This (wrongly) counts the rdf:type values before actually filtering them. I don't want to count types that are not in the dbo (ontology) namespace.
The idea is to use a subquery in which you search for the entities and to do the counting in the outer query:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s (count(*) AS ?cnt)
WHERE
{ { SELECT DISTINCT ?s
WHERE
{ ?s rdfs:label ?l .
?l <bif:contains> '"vienna"'
FILTER langMatches(lang(?l), "en")
FILTER EXISTS { ?s dbo:abstract ?cc .
?cc <bif:contains> '"city"'
FILTER langMatches(lang(?cc), "en")
}
?s rdf:type ?t
FILTER ( ! strstarts(str(?s), str("http://dbpedia.org/resource/Category:")) )
FILTER ( ! strstarts(str(?s), str("http://dbpedia.org/property/")) )
FILTER ( ! strstarts(str(?s), str(dbo:)) )
FILTER strstarts(str(?t), str(dbo:))
}
}
?s ?p ?o
FILTER strstarts(str(?p), str(dbo:))
}
GROUP BY ?s
ORDER BY DESC(?cnt)
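The outer pattern above counts triples whose predicate is in the dbo: namespace. If the goal is to rank by the number of dbo: types, as the question states, a variant along these lines may be closer. It is only a sketch: it reuses the Virtuoso-specific bif:contains predicate, leaves out the IRI-prefix filters on ?s for brevity, and has not been run against the live endpoint.

```sparql
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s (COUNT(DISTINCT ?t) AS ?cnt)
WHERE {
  ?s rdfs:label ?l .
  ?l <bif:contains> '"vienna"' .
  FILTER langMatches(lang(?l), "en")
  FILTER EXISTS {
    ?s dbo:abstract ?cc .
    ?cc <bif:contains> '"city"' .
    FILTER langMatches(lang(?cc), "en")
  }
  # Keep only dbo: types; entities without one drop out of the result.
  ?s rdf:type ?t .
  FILTER strstarts(str(?t), str(dbo:))
}
GROUP BY ?s
ORDER BY DESC(?cnt)
```

COUNT(DISTINCT ?t) is used so that an entity with several matching labels does not have its types counted more than once.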

How to return all S->P->O triples from a starting resource to a specified path depth?

My goal is to graphically represent the S->P->O relations within a depth two edges from the specified resource, p:Person_1. I want all relations within that path length to be returned from my query as ?s, ?p, ?o for further processing in my graphical application.
I tried the first query below which gives me my first set of ?s ?p ?o with repeats, then ?p2, ?o2, ?p3, ?o3 as additional columns in the result. I want to bind ?p2 and ?p3 to ?p, ?o2 and ?o3 to ?o.
SELECT *
WHERE {
p:Person_1 ?p ?o .
BIND("p:Person_1" as ?s)
OPTIONAL{
?o ?p2 ?o2 .
}
OPTIONAL{
?o2 ?p3 ?o3 .
}
}
Then, based on How do I construct get the whole sub graph from a given resource in RDF Graph?, I tried using CONSTRUCT to return the graph.
PREFIX p: <http://www.example.org/person/>
PREFIX x: <example.org/foo/>
construct { ?s ?p ?o }
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:|!x:)* ?s .
?s ?p ?o .
}
I am using Virtuoso and I get the error:
Virtuoso 37000 Error SP031: SPARQL compiler: Variable ?_::trans_subj_9_3 in T_IN list is not a value from some triple
I could post-process the result from my first query but I want to learn how to do this correctly with SPARQL, preferably on Virtuoso.
Update after testing the advice from @AKSW:
Both CONSTRUCT and SELECT statements work with the pattern suggested.
CONSTRUCT { ?s ?p ?o }
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar)* ?s .
?s ?p ?o .
} LIMIT 100
and:
SELECT ?s ?p ?o
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar)* ?s .
?s ?p ?o .
} LIMIT 100
The SELECT returns several duplicates. Adding DISTINCT to remove them fails with the error below, which I assume is due to the datatype of some of the returned values:
Virtuoso 22023 Error SR066: Unsupported case in CONVERT (DATETIME -> IRI_ID)
It appears some post-SPARQL processing is in order.
This gets me most of the way there. Still hoping I can find a solution for SPARQL that is like Cypher's "number of hops away" :
OPTIONAL MATCH path=s-[*1..3]-(o)
Here is a SPARQL query that works in Virtuoso. Note that the {n,m} path quantifier is not part of the W3C SPARQL 1.1 standard, so this query is likely to fail in other triplestores.
PREFIX p: <http://www.example.org/person/>
PREFIX x: <example.org/foo/>
# CONSTRUCT {?s ?p ?o} # If you wish to return the graph
SELECT ?s ?p ?o # To return the triples
FROM <http://localhost:8890/MYGRAPH>
where { p:Person_1 (x:foo|!x:bar){1,3} ?s .
?s ?p ?o .
} LIMIT 100
See also K. Idehen's wiki entry here: http://linkedwiki.com/exampleView.php?ex_id=141
And thanks to @Joshua Taylor for advice in the same area.
Working Drafts of SPARQL 1.1 Property Paths included the {n,m} operator for handling this issue, which was implemented (and will remain supported) in Virtuoso. Here's a tweak to @tim's response.
Live SPARQL Query Results Page using the DBpedia endpoint (which is a Virtuoso instance).
Live SPARQL Query Definition Page that opens up query source code in the default DBpedia query editor.
Actual Query Example:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s AS ?Entity
?o AS ?Category
WHERE {
?s rdf:type <http://dbpedia.org/ontology/AcademicJournal> ;
rdf:type{1,3} ?o
}
LIMIT 100
Should you be looking for LinkedIn-like presentation of Contact Networks and Degrees of Separation between individuals, here is an example using Virtuoso-specific SPARQL Extensions that solve this particular issue:
SELECT ?o AS ?WebID
((SELECT COUNT (*) WHERE {?o foaf:knows ?xx})) AS ?contact_network_size
?dist AS ?DegreeOfSeparation
<http://www.w3.org/People/Berners-Lee/card#i> AS ?knowee
WHERE
{
{
SELECT ?s ?o
WHERE
{
?s foaf:knows ?o
}
} OPTION (TRANSITIVE, t_distinct, t_in(?s), t_out(?o), t_min (1), t_max (4), t_step ('step_no') AS ?dist) .
FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>)
FILTER (isIRI(?s) and isIRI(?o))
}
ORDER BY ?dist DESC (?contact_network_size)
LIMIT 500
Note: this approach is the only way (at the current time) to expose actual relational hops between entities in an Entity Relationship Graph that includes Transitive relations.
Bearing in mind that the r{n,m} operator was dropped from the final SPARQL 1.1 specification (but will remain supported in Virtuoso), you can use r/r?/r? instead of r{1,3} if you want to work strictly from the current spec:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT (?s AS ?Entity)
(?o AS ?Category)
WHERE {
?s rdf:type <http://dbpedia.org/ontology/AcademicJournal> ;
rdf:type / rdf:type? / rdf:type? ?o
}
LIMIT 100
Here's a live example, against the DBpedia instance hosted in Virtuoso.
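Applying the same idea to the original depth-two question, a sketch in standard SPARQL 1.1 might look like the following. It assumes the p: and x: prefixes and the graph name from the question, and uses (x:|!x:) as a wildcard step: it matches the IRI x: itself or any predicate other than it, which covers every predicate. Note that (x:foo|!x:bar), used earlier, is not a full wildcard because it never matches x:bar. The query has not been tested on Virtuoso.

```sparql
PREFIX p: <http://www.example.org/person/>
PREFIX x: <example.org/foo/>

SELECT ?s ?p ?o
FROM <http://localhost:8890/MYGRAPH>
WHERE {
  # Zero or one hop: ?s is p:Person_1 itself or one outgoing edge away.
  p:Person_1 (x:|!x:)? ?s .
  # Every outgoing triple of ?s is then at most two edges from the start.
  ?s ?p ?o .
}
```

For a larger depth, further optional steps can be chained, for example (x:|!x:)?/(x:|!x:)? for up to three edges; in that case duplicates become possible and may need to be removed afterwards.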

sparql how to count variable pairs

I have the following query that gets instances of a class and their label/names. I want to count how many total results there are. However, I do not know how to formulate the count statement.
select ?s ?l {
?s a <http://dbpedia.org/ontology/Ship> .
{?s <http://www.w3.org/2000/01/rdf-schema#label> ?l}
union
{?s <http://xmlns.com/foaf/0.1/name> ?l}
}
I have tried
select ?s ?l (count (?s) as ?count) {
?s a <http://dbpedia.org/ontology/Ship> .
{?s <http://www.w3.org/2000/01/rdf-schema#label> ?l}
union
{?s <http://xmlns.com/foaf/0.1/name> ?l}
}
But that gives a count for each ?s ?l pair; instead I need to know how many ?s ?l pairs there are in total. Or maybe I should not use count at all? As mentioned, all I need to know is how many results the query returns in total (regardless of the hard limit imposed by the server; e.g., DBpedia returns a maximum of 50000 results per query).
Any suggestions please?
Many thanks!
To count the number of matches, use
SELECT (COUNT(*) AS ?count)
WHERE {
?s a <http://dbpedia.org/ontology/Ship> .
?s <http://www.w3.org/2000/01/rdf-schema#label> | <http://xmlns.com/foaf/0.1/name> ?l .
}
Note I'm using the property path "or" (|) to get the union of the properties.
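If the same ?s/?l pair can be reached through both properties, COUNT(*) can count it twice. A sketch that counts distinct pairs instead, using a subquery and assuming the Ship restriction from the question (the prefix declarations are added for readability):

```sparql
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT (COUNT(*) AS ?count)
WHERE {
  {
    # The inner query removes duplicate subject/label pairs before counting.
    SELECT DISTINCT ?s ?l
    WHERE {
      ?s a dbo:Ship ;
         rdfs:label|foaf:name ?l .
    }
  }
}
```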

Behaviour of variables projected out of the sub-select

Given a list of entities (with Persons among them) and their properties how should the following query behave:
select *
where
{
?s ?p ?o.
{
SELECT ?ps WHERE
{
?ps a <http://www.example.org/schema/Person> .
}
limit 1
}
#?ps ?p ?o.
filter (?s =?ps)
}
I tested this in three triple stores. Two of them filter on ?ps with the above query, so the result contains the triples for one person (plus the ?ps column).
The third one returns all triples in the database because "The variable "ps" that is projected out of the sub-select does not join with anything in the top-level query."
Still, since it's projected out and I use it in a FILTER, I would expect the filter to apply.
Uncommenting the line " #?ps ?p ?o. " does indeed display the triples for one person.
The filter will be applied.
The FILTER applies to the whole block. The results of "?s ?p ?o" are joined with the results for ?ps; there is no common variable, so at this point the join is a cross product, but that's OK. That yields solutions with four bindings (?s, ?p, ?o, ?ps), and the filter is then applied to them.
You could write:
WHERE {
?s ?p ?o.
{
SELECT ?s
WHERE { ?s a <http://www.example.org/schema/Person> . }
limit 1
}
}