SPARQL: extract distinct values from DBpedia

I want to extract the graphs of 5 individuals of type Film (i.e. movies) from DBpedia.
My query is:
ParameterizedSparqlString qs = new ParameterizedSparqlString(
"construct { ?s ?p ?o } " +
"where { ?s a <http://dbpedia.org/ontology/Film> . " +
"?s ?p ?o } OFFSET 0 LIMIT 5");
I get the following result:
1- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Film .
2- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Thing .
3- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.wikidata.org/entity/Q386724 .
4- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Wikidata:Q11424 .
5- http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Work .
Problem:
The same film is returned 5 times, because the classes Film, Thing, Q386724, Wikidata:Q11424 and Work are all equivalent classes (or a subclass relation exists between them).
My question:
I want the following triple to be returned only once:
<http://dbpedia.org/resource/1001_Inventions_and_the_World_of_Ibn_Al-Haytham>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://dbpedia.org/ontology/Film> .
and filter out the other 4 triples.
How can I do this?
Thank you in advance

I think the following should work for you:
CONSTRUCT {?s ?p ?o}
WHERE {
{ SELECT DISTINCT ?s
WHERE {
?s a <http://dbpedia.org/ontology/Film> .
} LIMIT 5
}
?s ?p ?o .
}
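To see why the subquery matters, here is a minimal Python sketch over a made-up in-memory triple list (the data and the helper names are illustrative assumptions, not DBpedia's contents or any library's API). It contrasts a limit that counts triples with a limit that first picks distinct subjects and only then fetches their triples. Note that a real endpoint gives no ordering guarantee without ORDER BY; the list order here just stands in for whatever order the store returns.

```python
# Toy triples (subject, predicate, object); made up for illustration only.
TRIPLES = [
    ("film:A", "rdf:type", "dbo:Film"),
    ("film:A", "rdf:type", "owl:Thing"),
    ("film:A", "rdf:type", "dbo:Work"),
    ("film:B", "rdf:type", "dbo:Film"),
    ("film:B", "rdfs:label", "B"),
    ("person:C", "rdf:type", "dbo:Person"),
]

def films(triples):
    """Distinct subjects typed as dbo:Film, in first-seen order."""
    seen = []
    for s, p, o in triples:
        if p == "rdf:type" and o == "dbo:Film" and s not in seen:
            seen.append(s)
    return seen

def limit_on_triples(triples, limit):
    """Like LIMIT on the outer pattern: the limit counts triples."""
    film_set = set(films(triples))
    return [t for t in triples if t[0] in film_set][:limit]

def limit_on_subjects(triples, limit):
    """Like the subquery: pick `limit` films first, then fetch all their triples."""
    chosen = set(films(triples)[:limit])
    return [t for t in triples if t[0] in chosen]
```

With this toy data, `limit_on_triples(TRIPLES, 3)` returns three triples that all describe the same film, while `limit_on_subjects(TRIPLES, 2)` returns every triple of two different films.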

I think you want this query
construct {?s a <http://dbpedia.org/ontology/Film> .}
where { ?s a <http://dbpedia.org/ontology/Film>. }
limit 5

Related

Find orphan nodes with SPARQL

I am trying to find orphan nodes (nodes which do not have any incoming relations) with SPARQL in a Fuseki database.
I tried several queries, none of which return correct results.
I tried the following:
Query 1 (got this from LinkedIn)
select ?o ?isOrphan where { GRAPH <http://localhost:8080/catalog/-1305288727> {
?s ?p ?o .
FILTER(!isLiteral(?o))
bind(!(EXISTS {?o ?p1 ?o2}) as ?isOrphan)}}
Query 2
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?p ?s } .
}
}
Query 3 - unbound variable pp in FILTER
SELECT ?source ?s ?p ?o
WHERE { GRAPH <http://localhost:8080/catalog/-1305288727>{
?s ?p ?o .
FILTER EXISTS {?source ?pp ?s } .
}
}
Any help is highly appreciated.
This query finds each entity that is the subject of any triple, and then checks that this entity is not the object of any triple.
SELECT ?orphan
FROM <http://localhost:8080/catalog/-1305288727>
WHERE {
?orphan ?p1 [] .
FILTER NOT EXISTS { ?linkingNode ?p2 ?orphan . }
}
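The logic of that query can be sketched in plain Python as a set difference: everything that occurs as a subject, minus everything that occurs as an object. This is only an illustration on made-up triples (the node names are assumptions), not a replacement for running the query in Fuseki.

```python
def find_orphans(triples):
    """Nodes that are the subject of some triple but the object of none."""
    subjects = {s for s, _, _ in triples}
    objects = {o for _, _, o in triples}
    return subjects - objects

# Toy graph: a -> b -> c, d -> c, and e pointing at itself.
EXAMPLE = [
    ("a", "p", "b"),
    ("b", "p", "c"),
    ("d", "q", "c"),
    ("e", "p", "e"),
]
```

Here `find_orphans(EXAMPLE)` yields `a` and `d`. The self-referencing node `e` is not reported, which matches the SPARQL version, since `?linkingNode` may be bound to the node itself.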

Number of triples of specific group instances?

I have run into another problem while querying DBpedia with SPARQL. I am trying to get the number of triples for a specific group of class instances.
Number of triples of class Politician:
SELECT * WHERE {?s ?p ?o FILTER (?s = dbo:Politician || ?o = dbo:Politician)}
But what about the total number of triples for a specific group of politicians, for example the number of triples about German politicians? How can I get that?
Thank you for your help!
revised answer
This will get the count of entities who are described as being Politicians from Germany —
SELECT (COUNT(*) AS ?count)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
}
— and this will get the count of all records where those entities who are described as being Politicians from Germany appear as Subject —
SELECT (COUNT(*) AS ?count)
{ ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
It is possible that you're looking for a bit more info, to include all records where the entities described as being Politicians from Germany appear as either Subject or Object (not just as Subject):
SELECT (COUNT(*) AS ?count)
{ { ?s a dbo:Politician .
?s dbo:nationality dbr:Germany .
?s ?p ?o .
}
UNION
{ ?o a dbo:Politician .
?o dbo:nationality dbr:Germany .
?s ?p ?o .
}
}
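One thing to watch with the UNION form: SPARQL's UNION keeps duplicates, so a triple whose subject and object are both German politicians matches both branches and is counted twice. The following Python sketch shows the effect on made-up data (the resource names and the `GERMAN_POLITICIANS` set are hypothetical stand-ins for the two type and nationality patterns).

```python
# Hypothetical set of entities matching "a dbo:Politician ; dbo:nationality dbr:Germany".
GERMAN_POLITICIANS = {"dbr:A", "dbr:B"}

# Toy triples; the first one has a German politician on both ends.
TRIPLES = [
    ("dbr:A", "dbo:successor", "dbr:B"),
    ("dbr:A", "dbo:birthPlace", "dbr:Berlin"),
    ("dbr:X", "dbo:spouse", "dbr:B"),
]

def union_count(triples, group):
    """Mimics COUNT(*) over the UNION: each branch contributes its own rows."""
    as_subject = [t for t in triples if t[0] in group]
    as_object = [t for t in triples if t[2] in group]
    return len(as_subject) + len(as_object)

def distinct_count(triples, group):
    """Counts each matching triple once, whichever position matched."""
    return len({t for t in triples if t[0] in group or t[2] in group})
```

On this data the UNION-style count is 4 while only 3 distinct triples are involved. Whether that matters depends on what you want the number to mean.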
original answer
I think you are currently aiming for this, which counts all triples with dbo:Politician as either Subject or Object (which is currently 41105, without timeout), but note that this query doesn't count "entities which are politicians" which is (I think) what you're really after!
SELECT ( COUNT ( * ) AS ?NumberOfTriples )
WHERE
{ { dbo:Politician ?p ?o }
UNION
{ ?s ?p dbo:Politician }
}
If you want to count the number of "entities which are politicians" (i.e., rdf:type dbo:Politician) (currently 41078), you need a different query, like this --
SELECT ( COUNT ( DISTINCT ?s ) AS ?NumberOfPoliticians )
WHERE
{ ?s rdf:type dbo:Politician }
This should be clarified by a look at the { dbo:Politician ?p ?o } triples --
SELECT *
WHERE
{ dbo:Politician ?p ?o }

How to pass the output of one SPARQL query as input to another SPARQL query

I am trying to get the DBpedia link for a movie from its name in a first query, and then pass that link to a second query to get movies similar to it, e.g. Lagaan. Instead of pasting the link into the second query manually, is there a way to combine the two queries and pass the output of the first query (i.e. the link of the movie Lagaan) as input to the second? Also, the first query may return multiple links: if I search for Harry Potter, it returns links for the whole Harry Potter series, so that case should be handled as well.
Query1
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
select distinct ?film where {
?film a dbpedia-owl:Film .
?film rdfs:label ?label .
filter regex( str(?label), "Lagaan", "i")
}
limit 10
Query 2
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
select ?similar (count(?p) as ?similarity) where {
values ?movie { <http://dbpedia.org/resource/Lagaan> }
?similar ?p ?o ; a dbpedia-owl:Film .
?movie ?p ?o .
}
group by ?similar ?movie
having (count(?p) > 35)
order by desc(?similarity)
Edited query:
select ?film ?similar (count(?p) as ?similarity) where {
{
select distinct ?film where {
?film a dbpedia-owl:Film .
?film rdfs:label ?label .
filter regex( str(?label), "Lagaan", "i")
}
}
?similar ?p ?o ; a dbpedia-owl:Film .
?film ?p ?o .
}
group by ?similar ?film
having (count(?p) > 35)
order by desc(?similarity)
Corrected query, as suggested by Joshua Taylor:
select ?film ?other (count(*) as ?similarity) {
{
select ?film where {
?film a dbpedia-owl:Film ; rdfs:label ?label .
filter contains(lcase(?label),"lagaan")
}
limit 1
}
?film ?p ?o .
?other a dbpedia-owl:Film ; ?p ?o .
}
group by ?film ?other
having (count(?p) > 25)
order by desc(?similarity)
is there a way to combine the two queries and pass the output of first
query as an input to the second query.
SPARQL 1.1 defines subqueries. The results of inner queries are available to outer queries, so they are "passed" to them. In your case, you would have something along the lines of:
select ?similarMovie (... as ?similarity) where {
{ #-- QUERY 1, find one or more films
select distinct ?film where {
#-- ...
}
}
#-- QUERY 2, find films similar to ?film
#-- ...
}
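As a rough mental model of how the inner result feeds the outer pattern, here is a small Python sketch on made-up triples. The data, the function names, and the assumption that every subject in the list is a film are all illustrative; the real work would be done by the endpoint.

```python
from collections import Counter

# Toy triples; every subject here is assumed to be a film.
TRIPLES = [
    ("film:Lagaan", "rdfs:label", "Lagaan"),
    ("film:Lagaan", "dbo:director", "person:D"),
    ("film:Lagaan", "dbo:country", "India"),
    ("film:Swades", "rdfs:label", "Swades"),
    ("film:Swades", "dbo:director", "person:D"),
    ("film:Swades", "dbo:country", "India"),
    ("film:Other", "rdfs:label", "Other"),
    ("film:Other", "dbo:country", "India"),
]

def inner_query(triples, needle):
    """QUERY 1: films whose label contains `needle`, case-insensitively."""
    return sorted({s for s, p, o in triples
                   if p == "rdfs:label" and needle.lower() in o.lower()})

def outer_query(triples, needle):
    """QUERY 2: for each film from QUERY 1, count shared (predicate, object) pairs."""
    similarity = Counter()
    for film in inner_query(triples, needle):
        film_props = {(p, o) for s, p, o in triples if s == film}
        for s, p, o in triples:
            if (p, o) in film_props:
                similarity[(film, s)] += 1
    return similarity
```

Because the outer loop runs once per film returned by `inner_query`, a search term that matches several films (the Harry Potter case) simply produces one group of results per matched film. As in the SPARQL version, a film is also compared with itself.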

Getting Wrong Result in Sparql

I am trying to implement a SPARQL query that returns some results. Here is what I have so far.
The data I am querying looks like this:
Subject:
<http://rhizomik.net/semanticxbrl/0001397832_agph-20110930/Context_9ME_30-Sep-2011/ConvertibleNotesPayableTextBlock/>
Predicate:
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
Object:
<http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock>
My Query is
PREFIX ab: <http://www.atlanticgreenpower.com/20110930#>
SELECT ?node ?val_type ?value
WHERE {
?node ab:val_type ?val_type .
?node ab:value ?value .
}
I want to get all subjects, predicates and objects in the result. I am new to SPARQL, please help me out.
In SPARQL everything with a ? in front of it is a variable. If you want everything from the subject, predicate and object you can do it like this:
SELECT ?s ?p ?o WHERE { ?s ?p ?o }
But then you list everything. When you want to only list the values of your given subject you will have to do it like this:
SELECT * WHERE { <http://rhizomik.net/semanticxbrl/0001397832_agph-20110930/Context_9ME_30-Sep-2011/ConvertibleNotesPayableTextBlock/> ?p ?o }
And your query could easily be formatted to something like this:
PREFIX ab: <http://www.atlanticgreenpower.com/20110930#>
SELECT ?s ?valType ?value WHERE {
?s ab:val_type ?valType ;
ab:value ?value .
}
Here ; lets you add another predicate and object for the same subject, while . ends the triple pattern (not the whole query). If you want more information about how SPARQL works or how to write queries, check out the Euclid project's webinar recording "Querying linked data" (2013-03-04), or just try it yourself on a SPARQL endpoint such as the one from FactForge.
To reflect my comment below:
SELECT * WHERE { ?s ?p <http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock> }
will select everything with the given object.
Try this:
SELECT * WHERE
{
?s ?p <http://www.atlanticgreenpower.com/20110930#ConvertibleNotesPayableTextBlock>
}

DBPedia Person dataset

I need to extract persons' names, including their variations, from DBpedia.
My SPARQL request:
select distinct ?o where {
{ ?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>;
<http://xmlns.com/foaf/0.1/name> ?o }
union
{
?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>;
rdfs:label ?o
}
FILTER (langMatches(lang(?o),"en"))}
DBpedia returns only 50,000 names per SPARQL request.
Is there perhaps a dataset of persons that contains all name variations?
The existing persons_en.nt dataset contains only foaf:name, but I need the other name variations. Sometimes they are listed in rdfs:label (e.g. for Maria Sharapova).
Found the answer in another post. Run the following SPARQL:
SELECT ?o WHERE {
{
select distinct ?o
where {
?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>;
rdfs:label ?o
FILTER (langMatches(lang(?o),"en"))
}
ORDER BY ASC(?o)
}
} OFFSET 800000 LIMIT 50000
In the Java program:
Query query = QueryFactory.create(queryString);
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
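The query above fetches a single window (OFFSET 800000 LIMIT 50000). To collect everything, the offset would be stepped by the page size until a page comes back empty. The Python sketch below shows only that loop; `fetch_page` is a hypothetical stand-in that slices a local sorted list instead of calling the endpoint, and the names are assumptions for illustration.

```python
def fetch_page(rows, offset, limit):
    """Hypothetical stand-in for one endpoint call using ORDER BY + OFFSET/LIMIT."""
    return sorted(rows)[offset:offset + limit]

def fetch_all(rows, page_size):
    """Step the offset by page_size until an empty page signals the end."""
    results = []
    offset = 0
    while True:
        page = fetch_page(rows, offset, page_size)
        if not page:
            break
        results.extend(page)
        offset += page_size
    return results
```

The stable ordering is what makes the pages line up; without it, consecutive windows could overlap or skip rows.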