SPARQL delete if exists

Let's assume the following graph, given as a Turtle file:
pf:A a CLASSA ;
    rdfs:label "a" .
pf:B a CLASSB ;
    rdfs:label "b" ;
    pf:has A .
My goal is a query that deletes A and, if B exists, also deletes B.
The current query expects both A and B to exist; if A exists without B, nothing is deleted because the WHERE clause is not satisfied.
DELETE {
?A ?p ?o .
?B ?p ?o .
}
WHERE {
?s ?p ?o
{
SELECT ?b WHERE
{
?b a CLASSB .
?b pf:has A;
}
}
}
Now my question: does SPARQL offer a way to test whether B exists and, if it does, delete it as well?
If I wrap it in OPTIONAL {}, it deletes all instances of CLASSB from the Turtle file.
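One way to get this behaviour, sketched here under assumptions, is to collect the subjects to delete with a UNION instead of OPTIONAL. One branch matches A unconditionally and the other matches any CLASSB node that points at A, so an empty second branch does not stop the first from matching. The pf: namespace IRI below is hypothetical, and CLASSA, CLASSB and A are assumed to live in that namespace.

```sparql
# Hypothetical namespace; substitute the real IRI for pf:
PREFIX pf: <http://example.org/pf#>

DELETE { ?s ?p ?o }
WHERE {
  {
    # Branch 1: A itself, whether or not any B exists
    VALUES ?s { pf:A }
  }
  UNION
  {
    # Branch 2: only CLASSB nodes linked to A via pf:has
    ?s a pf:CLASSB ;
       pf:has pf:A .
  }
  # All triples whose subject was selected by either branch
  ?s ?p ?o .
}
```

This sketch removes only triples that have A or a matching B as subject. Triples where another node points at A in the object position are left in place.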

Related

SPARQL: DELETE if subject contains 1 triple and has predicate

I was wondering if anyone knew of a way to DELETE a triple based on 2 conditions:
Subject has a triple count of 1.
The only triple on the subject matches a predicate.
The obstacle I'm coming across with these 2 conditions is that in order to COUNT the number of statements on a subject, you must do GROUP BY ?s. Then you do not have the ability to filter any ?p value.
The most likely solution would be a subquery, but I am not sure how this would be structured.
I would write it like this:
DELETE { ?s ?p ?o }
WHERE {
{
SELECT ?s (COUNT(*) AS ?c) {
?s ?p ?o
}
GROUP BY ?s
}
FILTER (?c = 1)
?s ?p ?o
}
Does something like this work?
I assume that by "Subject has a triple count of 1" you mean that there is only one triple with this subject. For the case of 1, this is a bit easier since we can just check that "there does not exist another triple for the same subject."
DELETE { ?s ?p ?o }
WHERE {
?s ?p ?o
#-- the value of ?p is the predicate of interest.
#-- Alternatively, this could be a FILTER involving
#-- the variable ?p .
values ?p { <the-predicate> }
#-- There is no other triple with ?s as the subject
#-- that has a different predicate or object. (I.e.,
#-- the only triple with ?s as a subject is the one
#-- with ?p and ?o .)
filter not exists {
?s ?pp ?oo
filter (?pp != ?p || ?oo != ?o)
}
}
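For counts other than 1, the grouping approach still works once the aggregate lives in a subquery, because the outer pattern can then constrain ?p freely. A minimal sketch, assuming the same <the-predicate> placeholder as above; the inner variable names ?anyP and ?anyO are illustrative only:

```sparql
DELETE { ?s ?p ?o }
WHERE {
  {
    # Subjects that occur in exactly one triple
    SELECT ?s
    WHERE { ?s ?anyP ?anyO }
    GROUP BY ?s
    HAVING (COUNT(*) = 1)
  }
  # Restrict the remaining triple to the predicate of interest
  VALUES ?p { <the-predicate> }
  ?s ?p ?o
}
```

Changing the constant in the HAVING clause changes which subjects are selected; the predicate restriction in the outer pattern stays the same.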

Find orphan nodes with SPARQL

I am trying to find orphan nodes (nodes which do not have any incoming relations) with SPARQL in a Fuseki database.
I tried several queries, none of which return correct results.
I tried the following:
Query 1 (found on LinkedIn)
select ?o ?isOrphan where { GRAPH <http://localhost:8080/catalog/-1305288727> {
?s ?p ?o .
FILTER(!isLiteral(?o))
bind(!(EXISTS {?o ?p1 ?o2}) as ?isOrphan)}}
Query 2
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?p ?s } .
}
}
Query 3 - unbound variable pp in FILTER
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?pp ?s } .
}
}
Any help is highly appreciated.
This query finds each entity that is the subject of any triple, and then checks that this entity is not the object of any triple.
SELECT ?orphan
FROM <http://localhost:8080/catalog/-1305288727>
WHERE {
?orphan ?p1 [] .
FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
}
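Because ?orphan ?p1 [] matches once per outgoing triple, a node with several outgoing triples appears several times in the result. A variant sketch that keeps the GRAPH form used in the question and adds DISTINCT; blank nodes are treated like any other node here:

```sparql
SELECT DISTINCT ?orphan
WHERE {
  GRAPH <http://localhost:8080/catalog/-1305288727> {
    # Candidate: any node that is the subject of at least one triple
    ?orphan ?p1 [] .
    # Keep it only if no triple in the same graph has it as object
    FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
  }
}
```

Since the FILTER sits inside the GRAPH block, the incoming-link check is evaluated against that same named graph.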

SPARQL returning empty result when passing MIN(?date1) from subquery into outer query, with BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)

<This question is now resolved, see comment by Valerio Cocchi>
I am trying to pass a variable out of a subquery. The subquery takes the minimum of a set of dates ?date1 belonging to ?p and passes it to the outer query as ?minDate. The outer query then takes another date ?date2 belonging to ?p (there is at most one ?date2 per ?p) and computes the number of years between ?minDate and ?date2 as an integer. However, ?diffDate comes back blank, i.e. it has no value.
I am using Fuseki version 4.3.2. Here is an example of the query:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate)
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}
and an example of the kind of result I am getting:
| ?p    | ?minDate                             | ?date2                              | ?diffDate |
| <123> | "20012-11-22T00:00:00"^^xsd:dateTime | "2008-08-18T00:00:00"^^xsd:dateTime |           |
I would expect that ?diffDate would give me an integer value. Am I missing something fundamental about how subqueries work in SPARQL?
It seems you have encountered quite an obscure part of the SPARQL spec, namely how BIND works.
Normally SPARQL is evaluated without regard for the position of atoms, i.e.
SELECT *
WHERE {
?a :p1 ?b .
?b :p2 ?c .}
is the same query as:
SELECT *
WHERE {
?b :p2 ?c .
?a :p1 ?b .}
However, BIND is position dependent, so e.g.:
SELECT *
WHERE {
?a :p1 ?b .
BIND(:john AS ?a)}
is not a valid query, whereas:
SELECT *
WHERE {
BIND(:john AS ?a)
?a :p1 ?b .
}
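is entirely valid. A similar ordering rule applies to variables used inside the BIND expression: they must already be bound by patterns that appear before the BIND. Otherwise the query still parses, but the expression raises an error and the target variable is left unbound.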
See here for more.
To go back to your problem, your BIND is using the ?minDate variable before it has been bound, which is why it fails to produce a value for ?diffDate.
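The effect can be reproduced in isolation. In this minimal sketch (the variable names ?early and ?late are made up for illustration), the first BIND refers to ?x before anything has bound it, while the second refers to it afterwards:

```sparql
SELECT ?x ?early ?late
WHERE {
  # ?x has no value yet, so the expression errors and ?early stays unbound
  BIND(?x + 1 AS ?early)
  # ?x receives its value here
  VALUES ?x { 1 }
  # ?x is bound at this point, so ?late can be computed
  BIND(?x + 1 AS ?late)
}
```

Following the SPARQL 1.1 algebra, this should return a single row in which ?x and ?late have values and ?early is empty, which mirrors the blank ?diffDate column in the question.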
This query should do the trick:
SELECT ?p ?minDate ?date2 ?diffDate
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
BIND((YEAR(?minDate) - YEAR(?date2)) AS ?diffDate) #Put the BIND after all the variables it uses are bound.
}
Alternatively, you could evaluate the difference in the SELECT, like so:
SELECT ?p ?minDate ?date2 (YEAR(?minDate) - YEAR(?date2) AS ?diffDate)
{
?p a abc:P;
abc:hasAnotherDate ?date2.
{
SELECT ?p (MIN(?date1) as ?minDate)
WHERE
{
?p a abc:P;
abc:hasDate ?date1.
} group by ?p
}
}

Querying two RDF graphs in SPARQL

I have two RDF knowledge bases
KB1 (path/to/file1.rdf), which includes the following two triples
a b c
a f e
and KB2 (path/to/file2.rdf) with the following triple:
c t p
I want to get all paths, including ones like ?a ?b ?c followed by ?c ?t ?p, since c is common to both.
How can I do that in SPARQL?
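With two separate KBs, this is what we call a "federated query". Here is an example, where the two URI_for_... placeholders stand for the IRIs of SPARQL endpoints that expose each file: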
SELECT * WHERE {
SERVICE URI_for_path/to/file1.rdf {
?a ?b ?c .
OPTIONAL {
SERVICE URI_for_path/to/file2.rdf {
?c ?t ?p . } }
}
}
Note that ?t and ?p are bound only when a matching triple exists.
The simplest way is to load both files into a single KB, which allows a simple query:
SELECT * WHERE {
?a ?b ?c .
?c ?t ?p .
}
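If the store supports SPARQL 1.1 Update, one way to get both files into a single KB is the LOAD operation. A sketch, assuming the store is allowed to dereference file: IRIs (many deployments restrict or disable this) and that the placeholder paths are replaced with real absolute locations:

```sparql
# Load both documents into the default graph of the store
LOAD <file:///path/to/file1.rdf> ;
LOAD <file:///path/to/file2.rdf>
```

Once both loads succeed, the simple two-pattern query above runs against the merged data.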

SPARQL Subquery Graph Name

I have the following SPARQL query, which contains a sub-select. The data contains multiple graphs and I want to know which graph the values for ?b and ?m come from:
select ?b, ?m, ?g1
where {
{
select ?o1, ?o2, ?e
where{
graph ?g{
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> ?o1.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> ?o2.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_iteration> '0'^^xsd:decimal.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_exposureday> ?e.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_pid1> ?o1.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_pid2> ?o2.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_acttype1> '5'^^xsd:decimal.
?s1 <http://ndssl.bi.vt.edu/chicago/vocab/contactnetwork_acttype2> '5'^^xsd:decimal
}
}ORDER BY ASC(?e) LIMIT 1
}
{
graph ?g1 {
?b <http://ndssl.bi.vt.edu/chicago/vocab/getInfectedBy> ?o1.
?m <http://ndssl.bi.vt.edu/chicago/vocab/getInfectedBy>* ?b.
}
}
}
The second graph pattern contains a transitive property path, and the query returns the following correct result:
b m g1
----------------------------------------------------- ----------------------------------------------------- -------------------------------------------------------
<http://ndssl.bi.vt.edu/chicago/person/pid#446734805> <http://ndssl.bi.vt.edu/chicago/person/pid#446753456> <http://ndssl.bi.vt.edu/chicago/dendrogram/replicate1/>
However, I want to see the intermediate nodes and count the path length of the transitive relationship. If I remove graph ?g1 from the query, it shows the intermediate nodes, as follows:
b m
--------------------------------------------------- ---------------------------------------------------
http://ndssl.bi.vt.edu/chicago/person/pid#446718746 http://ndssl.bi.vt.edu/chicago/person/pid#446718746
http://ndssl.bi.vt.edu/chicago/person/pid#446734805 http://ndssl.bi.vt.edu/chicago/person/pid#446734805
http://ndssl.bi.vt.edu/chicago/person/pid#446734805 http://ndssl.bi.vt.edu/chicago/person/pid#446753456
The purpose of the query is to find the graph name for each matching ?b and ?m, which is why I want to use graph ?g1. Is it possible to show the intermediate nodes while keeping the graph keyword? I am using Virtuoso.
Since you're not using ?g, the first GRAPH statement is not necessary. Note also that the second GRAPH statement only uses ?o1, so the following query should do what you want. You may also want to check the SPARQL syntax in your SELECT clause: standard SPARQL separates projected variables with spaces, not commas.
PREFIX ndssl: <http://ndssl.bi.vt.edu/chicago/vocab/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?b ?m ?g1
WHERE {
{
SELECT ?o1
WHERE {
?s ndssl:dendrogram_infector_pid ?o1 .
?s ndssl:dendrogram_infectee_pid ?o2 .
?s ndssl:dendrogram_iteration '0'^^xsd:decimal .
?s ndssl:dendrogram_exposureday ?e .
?s1 ndssl:contactnetwork_pid1 ?o1 .
?s1 ndssl:contactnetwork_pid2 ?o2 .
?s1 ndssl:contactnetwork_acttype1 '5'^^xsd:decimal .
?s1 ndssl:contactnetwork_acttype2 '5'^^xsd:decimal
} ORDER BY ASC(?e) LIMIT 1
}
GRAPH ?g1 {
?b ndssl:getInfectedBy ?o1 .
?m ndssl:getInfectedBy* ?b .
}
}
In the endpoint provided there isn't a match for ?b or ?m, regardless of whether a GRAPH statement is used or not.
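For the path-length part of the question, a commonly used workaround is to count the nodes that lie between the two ends. The sketch below has not been tested against Virtuoso. It assumes every node has at most one ndssl:getInfectedBy link and that there are no cycles, so the path between two nodes is unique; ?mid and ?pathLength are illustrative names:

```sparql
PREFIX ndssl: <http://ndssl.bi.vt.edu/chicago/vocab/>

SELECT ?m ?b (COUNT(?mid) AS ?pathLength)
WHERE {
  # ?mid ranges over ?m itself and every node reachable from it
  ?m   ndssl:getInfectedBy* ?mid .
  # ?mid must still be at least one step away from ?b
  ?mid ndssl:getInfectedBy+ ?b .
}
GROUP BY ?m ?b
```

Under those assumptions, each (?m, ?b) pair gets one row whose count equals the number of steps from ?m to ?b. Pairs where ?m equals ?b (a zero-length path) do not appear in the result.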