SPARQL: how to get the number of triples?

I'm doing a small exercise in SPARQL. Using the DBpedia endpoint, I need to count the number of triples.
This is my query:
# Get the number of triples
SELECT (COUNT(*) as ?Triples) WHERE { ?s ?p ?o}
-------------------------------------------------------
OUTPUT:
( ?Triples = 1625382483 )
Are the query and the result right?
Is this how you get the number of triples?

You can sanity check many things by executing queries directly on a SPARQL endpoint, rather than going through Jena or other intermediary clients. For instance, running your query on the DBpedia SPARQL form returns the count of all triples in that triplestore (currently 1,625,382,483).
If you want the count of triples only within the DBpedia named graph (currently 438,336,517), you'll need to specify that graph, either in the SPARQL form's "Default Data Set Name (Graph IRI)" field or directly in the query, as in --
SELECT (COUNT(*) as ?Triples)
WHERE
{ GRAPH <http://dbpedia.org>
{ ?s ?p ?o }
}
-- or --
SELECT (COUNT(*) as ?Triples)
FROM <http://dbpedia.org>
WHERE { ?s ?p ?o }
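
If you send the query from a script instead of the form, the graph can also be passed as a request parameter. The following is only a minimal sketch: it builds the GET URL without sending it, the endpoint address https://dbpedia.org/sparql and the helper name build_count_url are assumptions for illustration, and it relies on the query and default-graph-uri parameters defined by the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

COUNT_QUERY = "SELECT (COUNT(*) AS ?Triples) WHERE { ?s ?p ?o }"

def build_count_url(endpoint, graph_iri=None):
    """Return a GET URL asking `endpoint` for a triple count.

    `build_count_url` is a hypothetical helper, not part of any library.
    If `graph_iri` is given, it is sent as the protocol-level default graph,
    which plays the same role as a FROM clause in the query.
    """
    params = [("query", COUNT_QUERY)]
    if graph_iri is not None:
        params.append(("default-graph-uri", graph_iri))
    return endpoint + "?" + urlencode(params)

# Assumed endpoint address; adjust for the store you are querying.
url_all = build_count_url("https://dbpedia.org/sparql")
url_graph = build_count_url("https://dbpedia.org/sparql", "http://dbpedia.org")
print(url_all)
print(url_graph)
```

The first URL counts everything the endpoint treats as its default data set; the second restricts the count to the named graph.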

Related

Multiple MINUS clause in a sparql query

I am trying to run the following SPARQL query on the public DBpedia endpoint:
select distinct ?o ?olabel where {
{ select distinct ?s where {
?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>/<http://www.w3.org/2000/01/rdf-schema#subClassOf>{,6} <http://dbpedia.org/ontology/FictionalCharacter>
. ?s <http://dbpedia.org/property/creator> ?oo } limit 100000 }
. ?s <http://dbpedia.org/property/creator> ?o
MINUS { ?o <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>/<http://www.w3.org/2000/01/rdf-schema#subClassOf>{,6} <http://dbpedia.org/class/yago/Communicator109610660>}
# MINUS { ?o <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>/<http://www.w3.org/2000/01/rdf-schema#subClassOf>{,6} <http://dbpedia.org/class/yago/Creator109614315>}
. ?o <http://www.w3.org/2000/01/rdf-schema#label> ?olabel . FILTER (lang(?olabel)=''||lang(?olabel)='en')
} LIMIT 100
It works.
But when I uncomment the commented line, the query fails.
From what I understand, multiple MINUS clauses are allowed in SPARQL.
Can anyone advise me on how to rewrite my query so that DBpedia takes the multiple MINUS clauses into account?
One of the nice things about standards is that we can look things up in the specifications, so after looking at productions 66, 56, and 54 at https://www.w3.org/TR/sparql11-query/#grammar I can confirm that multiple MINUS clauses are allowed by the standard.
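
To make concrete what chained MINUS clauses are supposed to compute, here is a rough in-memory sketch of the MINUS semantics from the SPARQL 1.1 specification. It does not explain why the endpoint rejects the query; the solution lists and the ex: names are made up for illustration.

```python
def minus(left, right):
    """Keep each solution in `left` unless some solution in `right`
    shares at least one variable with it and agrees on every shared one."""
    def excludes(mu, other):
        shared = mu.keys() & other.keys()
        return bool(shared) and all(mu[v] == other[v] for v in shared)
    return [mu for mu in left if not any(excludes(mu, o) for o in right)]

# Hypothetical solutions for ?o (the creators found by the main pattern).
creators = [{"o": "ex:Alice"}, {"o": "ex:Bob"}, {"o": "ex:Carol"}]
communicators = [{"o": "ex:Alice"}]  # hypothetical matches of the first MINUS
yago_creators = [{"o": "ex:Bob"}]    # hypothetical matches of the second MINUS

# Two MINUS clauses in a row are applied one after the other.
remaining = minus(minus(creators, communicators), yago_creators)
print(remaining)
```

Each MINUS simply removes more solutions from what the previous step left, so two clauses in sequence are well defined.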

How to filter the simple Subject in a SPARQL Query

I guess I am stuck at the basics with SPARQL. Can someone help?
I simply want to filter all subjects containing "Mountain" in an RDF database.
Prefix lgdr:<http://linkedgeodata.org/triplify/> Prefix lgdo:<http://linkedgeodata.org/ontology/>
Select * where {
?s ?p ?o .
filter (contains(?s, "Mountain"))
} Limit 1000
The query leads to an error:
Virtuoso 22023 Error SL001: The SPARQL 1.1 function CONTAINS() needs a string value as first argument
You can get it to "work" using:
Prefix lgdr:<http://linkedgeodata.org/triplify/> Prefix lgdo:<http://linkedgeodata.org/ontology/>
Select * where {
?s ?p ?o .
filter (contains(str(?s), "Mountain"))
} Limit 1000
Note the additional str in the query.
However, that results in
Virtuoso S1T00 Error SR171: Transaction timed out
and I am not sure how to deal with that.
But in principle it works. When you use
Limit 1
you get
s p o
http://linkedgeodata.org/ontology/MountainRescue http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Class
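
The reason str is needed is that ?s is bound to an IRI, which is a different kind of term from a string. The following toy model is only a sketch of that distinction: the IRI class and the helper functions are hypothetical stand-ins, not how Virtuoso or any other store implements STR() and CONTAINS().

```python
class IRI:
    """Hypothetical stand-in for an RDF IRI term (not a string literal)."""
    def __init__(self, value):
        self.value = value

def sparql_str(term):
    """Rough analogue of STR(): return the lexical form of a term."""
    return term.value if isinstance(term, IRI) else str(term)

def contains(haystack, needle):
    """Rough analogue of CONTAINS(): rejects non-string first arguments."""
    if not isinstance(haystack, str):
        raise TypeError("CONTAINS() needs a string value as first argument")
    return needle in haystack

subject = IRI("http://linkedgeodata.org/ontology/MountainRescue")

# Passing the IRI directly fails, much like the first query.
try:
    contains(subject, "Mountain")
    direct_call_failed = False
except TypeError:
    direct_call_failed = True

# Converting to a string first works, like the second query.
with_str = contains(sparql_str(subject), "Mountain")
print(direct_call_failed, with_str)
```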

Excluding triples based on substring

I have a graph created by sponging web pages. I'm trying to write a query to exclude all triples where the subject is a web page. I can get these triples simply by
FILTER (STRENDS((STR(?i)), "html"))
but what's the best way to exclude them? Basically, I need all {?s ?p ?o} minus the subset {?i ?p ?o}.
This
SELECT *
{ ?s ?p ?o
FILTER ( ! STRENDS( STR(?s), "html") )
}
filters the outcome of the pattern, keeping only the triples whose subject does not end in "html".
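
Negating the filter gives the same result as computing all triples and then subtracting the "html" subset. A small in-memory sketch of that equivalence, using made-up example.org data:

```python
# Made-up triples; two of the three subjects are web pages.
triples = [
    ("http://example.org/page.html", "ex:title", "A page"),
    ("http://example.org/thing/1", "ex:label", "A thing"),
    ("http://example.org/about.html", "ex:title", "About"),
]

def is_web_page(subject):
    """Mirror of STRENDS(STR(?s), "html")."""
    return subject.endswith("html")

# "All triples minus the html subset" ...
html_triples = [t for t in triples if is_web_page(t[0])]
by_difference = [t for t in triples if t not in html_triples]

# ... is the same as keeping the triples that fail the test.
by_negated_filter = [t for t in triples if not is_web_page(t[0])]
print(by_negated_filter)
```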

Meaning of the very simple SPARQL query on LinkedGeoData endpoint?

I am a newbie to SPARQL. Though I have read some materials about RDF and SPARQL, I still cannot understand the meaning of the mysterious SPARQL query on the LinkedGeoData SPARQL endpoint
Prefix lgdr:<http://linkedgeodata.org/triplify/>
Prefix lgdo:<http://linkedgeodata.org/ontology/>
Select * { ?s ?p ?o . }
Limit 1000
What does this oversimplified where condition ?s ?p ?o mean?
The query you ask about will return up to 1,000 triples from the endpoint with no filter or condition applied; that is, { ?s ?p ?o . } matches any triple.
It's similar to SELECT * FROM aView in SQL if aView were a union of all, or most, of the tables in a SQL database.
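
To see why a pattern made only of variables matches everything, here is a minimal sketch of triple pattern matching over an in-memory list. The match function and the sample data are invented for illustration and are far simpler than what a real store does.

```python
from itertools import islice

def match(pattern, triples, limit=None):
    """Return variable bindings for each triple that matches `pattern`.

    Terms starting with '?' are variables and match anything;
    any other term must be equal to the value in that position.
    """
    def bind(triple):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    return None
            elif term != value:
                return None
        return binding

    results = (b for b in map(bind, triples) if b is not None)
    return list(islice(results, limit))

# Made-up data for illustration.
triples = [
    ("ex:a", "rdf:type", "ex:Mountain"),
    ("ex:a", "rdfs:label", "Alp"),
    ("ex:b", "rdf:type", "ex:Lake"),
]

everything = match(("?s", "?p", "?o"), triples)          # no constraint at all
typed = match(("?s", "rdf:type", "?o"), triples)         # predicate is fixed
first_two = match(("?s", "?p", "?o"), triples, limit=2)  # like LIMIT 2
print(len(everything), len(typed), len(first_two))
```

With three variables there is nothing to compare against, so every triple produces a binding, and the limit only caps how many are returned.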

Behaviour of variables projected out of the sub-select

Given a list of entities (with Persons among them) and their properties, how should the following query behave?
select *
where
{
?s ?p ?o.
{
SELECT ?ps WHERE
{
?ps a <http://www.example.org/schema/Person> .
}
limit 1
}
#?ps ?p ?o.
filter (?s =?ps)
}
I tested this in three triple stores. Two of them filter on ?ps with the above query, so the result is the triples for one person (plus a ?ps column).
The third one returns all triples in the database because "The variable "ps" that is projected out of the sub-select does not join with anything in the top-level query."
Still, since it's projected out and I use it in a FILTER, I would expect the filter to be applied.
Uncommenting line " #?ps ?p ?o. " will indeed display triples for one person.
The filter will be applied.
The FILTER applies to the whole block. The results of "?s ?p ?o" are joined with the results for ?ps (the join is a cross product at this point, since there is no common variable, but that's OK). That yields solutions with four bindings: ?s, ?p, ?o, and ?ps. The filter then applies to those solutions.
You could write:
WHERE {
?s ?p ?o.
{
SELECT ?s
WHERE { ?s a <http://www.example.org/schema/Person> . }
limit 1
}
}
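
The two formulations can be compared on a toy dataset. This is only an in-memory sketch with made-up ex: data, evaluating the patterns by hand rather than with a SPARQL engine: the original query is a cross product followed by a filter, while the rewritten one joins on the shared variable ?s.

```python
# Made-up data: one person and one company.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:acme", "ex:name", "Acme"),
]

# Solutions of the outer pattern ?s ?p ?o.
outer = [{"s": s, "p": p, "o": o} for s, p, o in triples]

# Sub-select with LIMIT 1: at most one person.
persons = [s for s, p, o in triples if p == "rdf:type" and o == "ex:Person"][:1]

# Original query: the sub-select projects ?ps, so there is no shared variable.
# The join is a cross product, and the FILTER (?s = ?ps) is applied afterwards.
sub_ps = [{"ps": s} for s in persons]
crossed = [{**a, **b} for a in outer for b in sub_ps]
filtered = [mu for mu in crossed if mu["s"] == mu["ps"]]

# Rewritten query: the sub-select projects ?s, so the join itself restricts ?s.
sub_s = [{"s": s} for s in persons]
joined = [{**a, **b} for a in outer for b in sub_s if a["s"] == b["s"]]

print(len(crossed), len(filtered), len(joined))
```

Both versions end up with the triples of the one selected person; the first one additionally carries the ?ps column.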