Find orphan nodes with SPARQL

I am trying to find orphan nodes (nodes that do not have any incoming relations) with SPARQL in a Fuseki database.
I tried several queries, none of which returns correct results.
I tried the following:
Query 1 (got this from LinkedIn)
select ?o ?isOrphan where { GRAPH <http://localhost:8080/catalog/-1305288727> {
?s ?p ?o .
FILTER(!isLiteral(?o))
bind(!(EXISTS {?o ?p1 ?o2}) as ?isOrphan)}}
Query 2
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?p ?s } .
}
}
Query 3 - unbound variable pp in FILTER
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?pp ?s } .
}
}
Any help is highly appreciated.

This query finds each entity that is the subject of any triple, and then checks that this entity is not the object of any triple.
SELECT ?orphan
FROM <http://localhost:8080/catalog/-1305288727>
WHERE {
?orphan ?p1 [] .
FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
}
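If the same node is the subject of several triples, the query above returns it once per matching triple. A minimal sketch of a variant, assuming the same named graph as in the question, that adds DISTINCT and scopes the pattern with GRAPH instead of FROM:

```sparql
SELECT DISTINCT ?orphan
WHERE {
  GRAPH <http://localhost:8080/catalog/-1305288727> {
    # Every node that is the subject of at least one triple ...
    ?orphan ?p1 [] .
    # ... and never appears in the object position of any triple.
    FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
  }
}
```

Whether FROM or GRAPH is the better fit depends on how the Fuseki dataset is configured; the two forms are shown as alternatives, not as a correction.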

Related

How to query the path between two IRI within the same sparql endpoint

How can I query the paths between two IRIs within the same SPARQL endpoint, and can I control the length and direction of the path? For example, given some triples and two input IRIs, IRI_A and IRI_B, can all the paths between IRI_A and IRI_B be detected? I know that SPARQL can use the statement below; however, with it the length and direction of the path cannot be controlled.
SELECT ?s ?p ?o
WHERE {
GRAPH ?g {
BIND ( <http://localhost/XXX/IRI_A> AS ?start ) .
BIND ( <http://localhost/XXX/IRI_B> AS ?end ) .
?start (<>|!<>)* ?s .
?s ?p ?o .
?o (<>|!<>)* ?end .
}
}
I want to find all the paths between two IRIs.
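One way to control both length and direction, sketched here under the assumption that a small fixed number of hops is enough, is to spell the path out as explicit triple patterns instead of using the * operator. The IRIs are the placeholders from the question; the variable names are illustrative. This sketch returns paths of exactly two hops, once with both edges followed forward and once with the first edge reversed:

```sparql
SELECT ?p1 ?mid ?p2 ?firstHopDirection
WHERE {
  GRAPH ?g {
    {
      # Forward, forward: IRI_A -> ?mid -> IRI_B
      <http://localhost/XXX/IRI_A> ?p1 ?mid .
      ?mid ?p2 <http://localhost/XXX/IRI_B> .
      BIND ("forward" AS ?firstHopDirection)
    }
    UNION
    {
      # Backward, forward: IRI_A <- ?mid -> IRI_B
      ?mid ?p1 <http://localhost/XXX/IRI_A> .
      ?mid ?p2 <http://localhost/XXX/IRI_B> .
      BIND ("backward" AS ?firstHopDirection)
    }
  }
}
```

Longer paths need one extra triple pattern per hop, so this approach only suits short, bounded lengths.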

SPARQL: DELETE if subject contains 1 triple and has predicate

I was wondering if anyone knew of a way to DELETE a triple based on 2 conditions:
Subject has a triple count of 1.
The only triple on the subject matches a predicate.
The obstacle I'm coming across with these 2 conditions is that in order to COUNT the number of statements on a subject, you must do GROUP BY ?s. Then you do not have the ability to filter any ?p value.
The most likely solution would be a subquery, but I am not sure how this would be structured.
I would write it like this:
DELETE { ?s ?p ?o }
WHERE {
{
SELECT ?s (COUNT(*) AS ?c) {
?s ?p ?o
}
GROUP BY ?s
}
FILTER (?c = 1)
?s ?p ?o
}
Does something like this work?
I assume that by "Subject has a triple count of 1" you mean that there is only one triple with this subject. For the case of 1, this is a bit easier since we can just check that "there does not exist another triple for the same subject."
DELETE { ?s ?p ?o }
WHERE {
?s ?p ?o
#-- the value of ?p is the predicate of interest.
#-- Alternatively, this could be a FILTER involving
#-- the variable ?p .
values ?p { <the-predicate> }
#-- There is no other triple with ?s as the subject
#-- that has a different predicate or object. (I.e.,
#-- the only triple with ?s as a subject is with ?p
#-- and ?o .)
filter not exists {
?s ?pp ?oo
filter (?pp != ?p || ?oo != ?o)
}
}
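Before running a DELETE like this, it can help to preview which triples would be removed. A sketch that reuses the same WHERE clause in a SELECT, with <the-predicate> still standing in for the real predicate IRI:

```sparql
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
  # Placeholder from the answer above; substitute the predicate of interest.
  VALUES ?p { <the-predicate> }
  # Keep ?s only if it has no triple other than (?s ?p ?o).
  FILTER NOT EXISTS {
    ?s ?pp ?oo .
    FILTER (?pp != ?p || ?oo != ?o)
  }
}
```

If the rows returned are the ones expected, the same pattern can be used unchanged in the DELETE.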

SPARQL query to remove "#en"

I have a large SKOS taxonomy that has some incorrect notation properties. Most of the properties are xsd:string but some appear with a "#en" language string. I want to modify the triples so as to remove the language string from these triples and convert them to xsd:string.
I tried the query below. It doesn't report any errors and commits successfully.
DELETE { ?s ?p ?o }
INSERT { ?s ?p ?o2 }
WHERE
{
?s skos:notation ?o .
BIND(STRDT(STR(?o), xsd:string) AS ?o2)
}
However, the query does not result in any change to the triples data. Can anyone suggest where I might be going wrong?
Variable ?p in your query appears to be unbound. Try:
DELETE { ?s skos:notation ?o }
INSERT { ?s skos:notation ?o2 }
WHERE
{
?s skos:notation ?o .
BIND(STRDT(STR(?o), xsd:string) AS ?o2)
}
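The update above rewrites every skos:notation value, including those that are already xsd:string. A sketch of a narrower variant, assuming the affected literals carry an English language tag, that touches only those literals; the PREFIX lines are the standard SKOS and XSD namespaces, added so the update is self-contained:

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

DELETE { ?s skos:notation ?o }
INSERT { ?s skos:notation ?o2 }
WHERE
{
  ?s skos:notation ?o .
  # Only literals tagged "en" (or a subtag such as "en-GB") are rewritten.
  FILTER (langMatches(lang(?o), "en"))
  # STR() drops the language tag; STRDT() re-types the plain lexical form.
  BIND (STRDT(STR(?o), xsd:string) AS ?o2)
}
```

To cover literals with any language tag rather than just English, the filter could instead test that lang(?o) is not the empty string.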

EXISTS and join with distinct values in two SPARQL queries return different results while they should do the same thing?

When running the following two queries on DBpedia, the results differ.
The first query gives 68 while the second gives 42. The only difference is the line
filter(exists {[] <http://dbpedia.org/ontology/nationality> ?o.})
replaced by a join to ensure that the object of dbpo:country is also an object of dbpo:nationality:
{select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o.}}
First Query:
select count(*){
{select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
filter(exists {[] <http://dbpedia.org/ontology/nationality> ?o.})
filter(exists {?s <http://dbpedia.org/ontology/nationality> []})
}}.
}
Second Query:
select count(*){
{select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
{select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o.}}
filter(exists {?s <http://dbpedia.org/ontology/nationality> []})
}}.
}
The result of the first query seems to be the correct one.
You've got a DISTINCT in the subquery within the second full query, which is causing some results not to be carried through to the final result set.
Note the result of this query, which drops that keyword from the subquery, matches your first, i.e., 68 --
select count(*)
{ { select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
{ select ?o { [] <http://dbpedia.org/ontology/nationality> ?o. } }
filter ( exists { ?s <http://dbpedia.org/ontology/nationality> [] } )
} } }
I can't spare the time to investigate which result rows from the first and third queries are not found in the second, but I imagine that if you dig further into the descriptions of all these ?s and ?o, you will be able to find the answer.
A key hint — SPARQL queries are evaluated from inside-out (also described as from bottom-up, but this is confusing because it's not the literal bottom, but the lowest sub-query). That means that select ?o { [] <http://dbpedia.org/ontology/nationality> ?o. } (or select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o. }) is evaluated before the rest of the query -- while the filter clauses are evaluated after the main select.

Number of triples of specific group instances?

I have found another problem in SPARQLing DBpedia. I am trying to get the number of triples for a specific group of class instances.
Number of triples of class Politician:
SELECT * WHERE {?s ?p ?o FILTER (?s = dbo:Politician || ?o = dbo:Politician)}
But what about the total number of triples for a specific group of politicians, for example the number of triples about German politicians? How can I get that?
Thank you for your help!
revised answer
This will get the count of entities who are described as being Politicians from Germany —
SELECT COUNT(*)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
}
— and this will get the count of all records where those entities who are described as being Politicians from Germany appear as Subject —
SELECT COUNT(*)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
It is possible that you're looking for a bit more info, to include all records where the entities who are described as being Politicians from Germany appear as either Subject or Object (not just as Subject):
SELECT COUNT(*)
{ { ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
UNION
{ ?o a dbo:Politician .
?o dbo:nationality dbr:Germany .
?s ?p ?o .
}
}
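The UNION above counts a triple twice when both its subject and its object are German politicians. A sketch that counts each triple once by wrapping the UNION in a SELECT DISTINCT subquery; the prefix declarations are the usual DBpedia ones, the variable name ?numberOfTriples is illustrative, and the aliased COUNT form is used for portability across SPARQL 1.1 engines:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT (COUNT(*) AS ?numberOfTriples)
WHERE {
  {
    # DISTINCT collapses triples matched by both UNION branches.
    SELECT DISTINCT ?s ?p ?o
    WHERE {
      { ?s a dbo:Politician .
        ?s dbo:nationality dbr:Germany .
        ?s ?p ?o .
      }
      UNION
      { ?o a dbo:Politician .
        ?o dbo:nationality dbr:Germany .
        ?s ?p ?o .
      }
    }
  }
}
```

On a public endpoint such as DBpedia, the DISTINCT step may be expensive enough to hit a timeout, so the result should be treated with that caveat in mind.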
original answer
I think you are currently aiming for this, which counts all triples with dbo:Politician as either Subject or Object (which is currently 41105, without timeout), but note that this query doesn't count "entities which are politicians" which is (I think) what you're really after!
SELECT ( COUNT ( * ) AS ?NumberOfTriples )
WHERE
{ { dbo:Politician ?p ?o }
UNION
{ ?s ?p dbo:Politician }
}
If you want to count the number of "entities which are politicians" (i.e., rdf:type dbo:Politician) (currently 41078), you need a different query, like this --
SELECT ( COUNT ( DISTINCT ?s ) AS ?NumberOfPoliticians )
WHERE
{ ?s rdf:type dbo:Politician }
This should be clarified by a look at the { dbo:Politician ?p ?o } triples --
SELECT *
WHERE
{ dbo:Politician ?p ?o }