Number of triples of specific group instances? - sparql

I have found another problem in SPARQLing dbpedia. I am trying to get number of triples for specific group of class instances.
Number of triples of class Politician:
SELECT * WHERE {?s ?p ?o FILTER (?s = dbo:Politician OR ?o = dbo:Politician)}
But what about summary number of all triples for a specific group of politicians? For example number of triples of german politician. How is possible to get?
Thank you for your help!

revised answer
This will get the count of entities who are described as being Politicians from Germany —
SELECT COUNT(*)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
}
— and this will get the count of all records where those entities who are described as being Politicians from Germany appear as Subject —
SELECT COUNT(*)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
It is possible that you're looking for a bit more info, to include all records where the entities who are described as being Politicians from Germany appears as either Subject or Object (not just as Subject) —
SELECT COUNT(*)
{ { ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
UNION
{ ?o a dbo:Politician .
?o dbo:nationality dbr:Germany .
?s ?p ?o .
}
}
original answer
I think you are currently aiming for this, which counts all triples with dbo:Politician as either Subject or Object (which is currently 41105, without timeout), but note that this query doesn't count "entities which are politicians" which is (I think) what you're really after!
SELECT ( COUNT ( * ) AS ?NumberOfTriples )
WHERE
{ { dbo:Politician ?p ?o }
UNION
{ ?s ?p dbo:Politician }
}
If you want to count the number of "entities which are politicians" (i.e., rdf:type dbo:Politician) (currently 41078), you need a different query, like this --
SELECT ( COUNT ( DISTINCT ?s ) AS ?NumberOfPoliticians )
WHERE
{ ?s rdf:type dbo:Politician }
This should be clarified by a look at the { dbo:Politician ?p ?o } triples --
SELECT *
WHERE
{ dbo:Politician ?p ?o }

Related

SPARQL: DELETE if subject contains 1 triple and has predicate

I was wondering if anyone knew of a way to DELETE a triple based on 2 conditions:
Subject has a triple count of 1.
The only triple on the subject matches a predicate.
The obstacle I'm coming across with these 2 conditions is that in order to COUNT the number of statements on a subject, you must do GROUP BY ?s. Then you do not have the ability to filter any ?p value.
The most likely solution would be a subquery, but I am not sure how this would be structured.
I would write it like this:
DELETE { ?s ?p ?o }
WHERE {
{
SELECT ?s (COUNT(*) AS ?c) {
?s ?p ?o
}
GROUP BY ?s
}
FILTER (?c = 1)
?s ?p ?o
}
Does something like this work?
I assume that by "Subject has a triple count of 1" you mean that there is only one triple with this subject. For the case of 1, this is a bit easier since we can just check that "there does not exist another triple for the same subject."
DELETE { ?s ?p ?o }
WHERE {
?s ?p ?o
#-- the value of ?p is the predicate of interest.
#-- Alternatively, this could be a FILTER involving
#-- the variable ?p .
values ?p { <the-predicate> }
#-- There is no other triple with ?s as the subject
#-- that has a different subject or object. (I.e.,
#-- the only triple with ?s as a subject is with ?p
#-- and ?o .
filter not exists {
?s ?pp ?oo
filter (?pp != ?p || ?oo != ?o)
}
}

Find orphan nodes with SPARQL

I am trying to find orphan nodes (nodes which do not have any incoming relations) with SPARQL in a Fuseki database.
I tried several queries which all do not return correct results.
I tried the following:
Query 1 (got this from linkedIn)
select ?o ?isOrphan where { GRAPH <http://localhost:8080/catalog/-1305288727> {
?s ?p ?o .
FILTER(!isLiteral(?o))
bind(!(EXISTS {?o ?p1 ?o2}) as ?isOrphan)}}
Query 2
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?p ?s } .
}
}
Query 3 - unbound variable pp in FILTER
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?pp ?s } .
}
}
Any help is highly appreciated.
This query finds each entity that is the subject of any triple, and then checks that this entity is not the object of any triple.
SELECT ?orphan
FROM <http://localhost:8080/catalog/-1305288727>
WHERE {
?orphan ?p1 [] .
FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
}

ُEXISTS and join with distinct values in two SPARQL queries return different results while they should do the same thing?

When running the following two queries on DBpedia the result is different.
First query gives 68 while the second gives 42. The only difference is the line
filter(exists {[] <http://dbpedia.org/ontology/nationality> ?o.})
replaced by join to ensure that the object of dbpo:country is in dbpo:nationality
{select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o.}}
First Query:
select count(*){
{select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
filter(exists {[] <http://dbpedia.org/ontology/nationality> ?o.})
filter(exists {?s <http://dbpedia.org/ontology/nationality> []})
}}.
}
Second Query:
select count(*){
{select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
{select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o.}}
filter(exists {?s <http://dbpedia.org/ontology/nationality> []})
}}.
}
The result of the first query seems to be the correct one.
You've got a DISTINCT in the subquery within the second full query, which is causing some results not to be carried through to the final result set.
Note the result of this query, which drops that keyword from the subquery, matches your first, i.e., 68 --
select count(*)
{ { select distinct ?s ?o
{ ?o1 <http://dbpedia.org/ontology/successor> ?s .
?o1 <http://dbpedia.org/ontology/governor> ?o2 .
?o2 <http://dbpedia.org/ontology/country> ?o
{ select ?o { [] <http://dbpedia.org/ontology/nationality> ?o. } }
filter ( exists { ?s <http://dbpedia.org/ontology/nationality> [] } )
} } }
I can't spare the time to investigate which result rows from the first and third queries are not found in the second, but I imagine that if you dig further into the descriptions of all these ?s and ?o, you will be able to find the answer.
A key hint — SPARQL queries are evaluated from inside-out (also described as from bottom-up, but this is confusing because it's not the literal bottom, but the lowest sub-query). That means that select ?o { [] <http://dbpedia.org/ontology/nationality> ?o. } (or select distinct ?o { [] <http://dbpedia.org/ontology/nationality> ?o. }) is evaluated before the rest of the query -- while the filter clauses are evaluated after the main select.

Sparql CONSTRUCT with DISTINCT

PREFIX content: <http://example.com/content#>
construct { ?s content:field ?o}
WHERE { ?s content:field ?o }
90% of all the ?o I get here are the same URI <http://example.com/name>.
I'm trying to find a way to filter out all quads that have the same value for ?o, so in the end I get a list of quads which are unique by its ?o
I tried DISTINCT ?o CONSTRUCT{...} but from what I saw you cant use DISTINCT on a CONSTRUCT.
How would you filter the returned list of quads
I'm trying to find a way to filter out all quads that have the same
value for ?o, so in the end I get a list of quads which are unique by
its ?o
if it does not matter which exact value is bound to ?s, then a sub-select with a group by ?o is the way to go. Use (SAMPLE(?s) as ?subj) e.g. something like:
`
PREFIX content: <http://example.com/content#>
construct { ?s content:field ?o}
WHERE {
{ select ?o (SAMPLE(?subj) as ?s)
{ ?subj content:field ?o }
group by ?o
}
}
`

Is it possible to Filter Graphs in a way that they at most contain requested Data?

Let me start with an example query to explain my problem:
SELECT ?g ?s ?p ?o WHERE
{
{GRAPH ?g
{ ?s ?p ?o.
OPTIONAL{ ?s
ab:temperature ?temperature.}
FILTER (?temperature = 20)
FILTER NOT EXISTS {?s ab:person ?person}
}
}
}
This query gives me all graphs (in this case representing context data) that have a temperature of 20 but don't have a person associated. My problem is I want to query the graphs for certain optional properties but they shouldn't have any other properties. At the time of the query I only know the OPTIONAL part but I don't know which additional property might be there. Is there an easy way to do this with SPARQL or is that something that would be easier to check after I received the graph and converted it to an object which I can handle with my programm?
If i understand your question correctly, you are searching for graphs that only have that subjects with some properties but not others. In that case i'd run something like this:
SELECT ?g ?s ?p ?o WHERE {
GRAPH ?g {
?s ?p ?o.
FILTER NOT EXISTS {
?s ?bad [] .
FILTER (?bad NOT IN ( ab:temperature, ... ) )
}
}
}