Select all people in DBpedia associated with a particular concept/occupation/profession - SPARQL

I am trying to select all people on Wikipedia (via DBpedia) who have a particular profession or occupation, or who are known for a particular field (i.e. the value contains the name of the occupation or profession).
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
}
LIMIT 100
This produces roughly what I want, but I also want to include each person's profession or occupation (as ?profession or ?occupation). However, the following doesn't return anything:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/profession> ?profession.
}
LIMIT 100
Using ?occupation instead doesn't work either, since some people don't have a profession entry but do have an occupation. I just want all people (and the list of people they influenced) for a particular profession/occupation, e.g. "psychology" or "architect". Basically, I want a table:
Person | Profession/Occupation | Known For | Influenced Person | Profession/Occupation | Known For
The query below is obviously wrong, but I want something like this:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?s <http://dbpedia.org/ontology/profession> ?profession. # or ?fields/?occupation
?p <http://dbpedia.org/ontology/influenced> ?influenced.
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?s <http://dbpedia.org/ontology/profession> ?profession. # or ?fields/?occupation
}
LIMIT 100

I don't know the exact reason*, but it seems that people in DBpedia who have dbo:influenced or dbo:knownFor don't have dbo:profession or dbo:occupation.
So, your query won't work, and you will have to find some other way to find the information you need, possibly by looking at dbp:fields or rdf:type.
* My guess is that dbo:influenced and dbo:knownFor are used for scientists, and scientists don't have an occupation or profession in DBpedia.

DBpedia is extracted from Wikipedia infoboxes. If you take one example from the results of the first query and check it on Wikipedia, you will see that no profession or occupation appears in its infobox. For example, consider http://dbpedia.org/page/Jerry_Coyne, whose Wikipedia page is https://en.wikipedia.org/wiki/Jerry_Coyne. The infobox on that page contains no information about profession or occupation, because it uses the template https://en.wikipedia.org/wiki/Template:Infobox_scientist, which has no occupation/profession field. The template https://en.wikipedia.org/wiki/Template:Infobox_person, on the other hand, has both occupation and known for. Therefore, the query below will work:
SELECT * WHERE {
?p a <http://dbpedia.org/ontology/Person> .
?p <http://dbpedia.org/ontology/knownFor> ?knownFor.
?p <http://dbpedia.org/ontology/occupation> ?occupation.
}
LIMIT 100
But this query only covers data extracted from pages that use https://en.wikipedia.org/wiki/Template:Infobox_person, and it will not return any page that uses https://en.wikipedia.org/wiki/Template:Infobox_scientist.
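If people from both kinds of infobox are needed, one possible approach is to accept any of several properties through a property path alternative and to make the remaining patterns optional. The query below is only a sketch: it assumes that dbo:profession, dbo:occupation and dbp:fields are the properties that carry this information, the search string "psycholog" is purely illustrative, and the variable names ?role, ?influencedRole and ?influencedKnownFor are made up for this example. It may need tighter restrictions to finish within the endpoint's time limit.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>

SELECT ?p ?role ?knownFor ?influenced ?influencedRole ?influencedKnownFor
WHERE {
  ?p a dbo:Person .
  # Assumption: any one of these three properties may hold the profession/field
  ?p dbo:profession|dbo:occupation|dbp:fields ?role .
  # str() lets the filter work on both URIs and literals
  FILTER regex(str(?role), "psycholog", "i")

  # Optional, so that people without these properties are still returned
  OPTIONAL { ?p dbo:knownFor ?knownFor }
  OPTIONAL {
    ?p dbo:influenced ?influenced .
    OPTIONAL { ?influenced dbo:profession|dbo:occupation|dbp:fields ?influencedRole }
    OPTIONAL { ?influenced dbo:knownFor ?influencedKnownFor }
  }
}
LIMIT 100
```

The columns line up with the table asked for in the question: person, profession/occupation, known for, and the same three for each influenced person.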

Related

Retrieve the subjects that have at most two properties

I wanted to retrieve subjects which have fewer than three properties. I tried the following query:
SELECT ?s
WHERE {
?s ?p ?o .
}
group by ?s
having (count(?p) < 3)
limit 10
But it takes quite a long time. Is there a more optimized way to do the same thing?

Obtain the average number of distinct values for each of the properties

I am trying to do the following with a SPARQL query:
For each of the properties, obtain the average number of distinct values that they take for the instances (e.g., what is the average number of occupations for a Formula 1 Driver, what is the average number of teams that they have participated in, etc.)
You would need a query like:
SELECT ?p (AVG(?ct) AS ?avg)
WHERE {
SELECT ?s ?p (COUNT(DISTINCT ?o) AS ?ct)
WHERE {
?s ?p ?o .
#more restrictions...
}
GROUP BY ?s ?p }
GROUP BY ?p
Notice that this approach ignores instances where the count is 0, e.g. an F1 driver with no profession.
Also, the query is likely to time out unless you add some more restrictions to reduce the size of the matched data.
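As an illustration of such a restriction, the subquery can be limited to instances of a single class. This is a sketch that assumes Formula 1 drivers are typed as dbo:FormulaOneRacer on the endpoint being queried; substitute whatever class actually applies to your data.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?p (AVG(?ct) AS ?avg)
WHERE {
  {
    # Inner query: number of distinct values per (instance, property) pair
    SELECT ?s ?p (COUNT(DISTINCT ?o) AS ?ct)
    WHERE {
      ?s a dbo:FormulaOneRacer .   # assumed class name; restricts the matched data
      ?s ?p ?o .
    }
    GROUP BY ?s ?p
  }
}
GROUP BY ?p
```

As before, an instance that has no value at all for a property contributes nothing to that property's average.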

Finding specific person in dbpedia using sparql query

So I'm learning SPARQL and trying to get a specific person out of DBpedia using a SPARQL query. I'm using their endpoint at https://dbpedia.org/sparql to run this query. The person I'm trying to find is Albert Einstein. I found on his page that he has a dbp:name property, so I wrote a query to find Albert Einstein by his name:
select ?person where {?person <http://dbpedia.org/property/name> "Albert Einstein"} LIMIT 100
Unfortunately, this gives me just an empty result set (I think), as I'm just seeing an empty table with the header 'Person'.
What's going wrong?
I'm not sure why your query doesn't work, but this one does:
select ?person where {?person foaf:name "Albert Einstein"@en} LIMIT 100
Though this returns all entities named "Albert Einstein", which includes the scientist, but also an album of the same name. If you want just people, you can use:
select ?person
where {
?person foaf:name "Albert Einstein"@en.
?person a foaf:Person.
}
LIMIT 100
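A likely reason the original query returns nothing is the language tag: in RDF, "Albert Einstein" and "Albert Einstein"@en are different literals, so a plain literal does not match a value stored with a tag. Assuming the dbp:name value for this resource is stored with an English tag (not verified here), the original query could be written like this:

```sparql
PREFIX dbp: <http://dbpedia.org/property/>

SELECT ?person
WHERE {
  # Assumption: the name literal carries the @en language tag
  ?person dbp:name "Albert Einstein"@en .
}
LIMIT 100
```

To match regardless of the tag, the name can be bound to a variable and compared with FILTER (str(?name) = "Albert Einstein"), though that tends to be slower because every name has to be inspected.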

Sparql - Traverse to top (broadest)

I have an internal triples dataset.
I am trying to implement a typeahead feature for an application using the dataset.
I am trying to figure out how I can conditionally traverse to the top concept.
In this picture, if I searched for Dreamworks, it would be 3 layers down (Business Organizations -> Private Company -> Dreamworks).
If I do something else, obviously it will be at different layers of the graph.
I am trying to get the value "Organization" or "Person" as the top level.
This very straightforward query works.
SELECT DISTINCT ?subjectPrefLabel ?a ?b ?o
WHERE
{
?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
?subject skos:broader/skos:broader/skos:broader/skosxl:prefLabel/skosxl:literalForm ?a
FILTER regex(?subjectPrefLabel, "Dreamworks", 'i')
}
ORDER BY ?subjectPrefLabel
However, this is obviously unsustainable, since the developer would need to know how many levels down the hierarchy the concept sits in order to issue the correct number of skos:broader steps.
Is there any way I can conditionally discover the highest level of the hierarchy?
As this is an internal Ontology, I cannot share any data, so I know it will be hard to work with, any advice is appreciated.
Using the Kleene star operator * in combination with a FILTER requiring that there is no broader concept should return the top concepts only:
SELECT DISTINCT ?subjectPrefLabel ?a ?b ?o WHERE {
?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
?subject skos:broader* ?concept.
?concept skosxl:prefLabel/skosxl:literalForm ?a
FILTER NOT EXISTS { ?concept skos:broader ?supConcept }
FILTER regex(?subjectPrefLabel, "Dreamworks", 'i')
}
ORDER BY ?subjectPrefLabel
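If the dataset happens to mark its top concepts explicitly, the FILTER NOT EXISTS test can be replaced by a direct pattern. This is only a sketch and assumes the internal data uses skos:topConceptOf, which may not be the case; the prefixes are the standard SKOS and SKOS-XL namespaces, and ?topLabel and ?scheme are names chosen for this example.

```sparql
PREFIX skos:   <http://www.w3.org/2004/02/skos/core#>
PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>

SELECT DISTINCT ?subjectPrefLabel ?topLabel
WHERE {
  ?subject skosxl:prefLabel/skosxl:literalForm ?subjectPrefLabel .
  # Zero or more broader steps, so a subject that is itself a top concept also matches
  ?subject skos:broader* ?concept .
  # Assumption: top concepts are declared with skos:topConceptOf
  ?concept skos:topConceptOf ?scheme .
  ?concept skosxl:prefLabel/skosxl:literalForm ?topLabel .
  FILTER regex(?subjectPrefLabel, "Dreamworks", "i")
}
ORDER BY ?subjectPrefLabel
```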

How to get all semantic data that is common between three classes using SPARQL?

I have a scenario where the user can select multiple classes, and I need to build a dynamic SPARQL query that fetches all the data satisfying the relationships that exist (if any) between all the selected classes. How do I achieve this?
Suppose the user selects three classes: Population, Drugname and SideEffects, and then tries to find all the data these three classes have in common. How do I build a SPARQL query to achieve this? I need a sample SPARQL query that joins these three classes.
If you want to retrieve properties for which some resources have the same value (and so you probably want to retrieve that value, too), you can use a query like this:
select ?p ?o where {
dbpedia:Bob_Dylan ?p ?o .
dbpedia:Tom_Waits ?p ?o .
dbpedia:The_Byrds ?p ?o .
}
SPARQL results from DBpedia:
p                                                  o
---------------------------------------------------------------------------------------------------------------------
http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://www.w3.org/2002/07/owl#Thing
http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://dbpedia.org/ontology/Agent
http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://schema.org/MusicGroup
http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://dbpedia.org/class/yago/YagoLegalActor
http://www.w3.org/1999/02/22-rdf-syntax-ns#type    http://dbpedia.org/class/yago/YagoLegalActorGeo
http://purl.org/dc/terms/subject                   http://dbpedia.org/resource/Category:Rock_and_Roll_Hall_of_Fame_inductees
http://dbpedia.org/ontology/recordLabel            http://dbpedia.org/resource/Asylum_Records
http://dbpedia.org/ontology/genre                  http://dbpedia.org/resource/Rock_music
http://dbpedia.org/property/genre                  http://dbpedia.org/resource/Rock_music
http://dbpedia.org/property/label                  http://dbpedia.org/resource/Asylum_Records
http://dbpedia.org/property/wordnet_type           http://www.w3.org/2006/03/wn/wn20/instances/synset-musician-noun-1
You could also use a query like this to find the subjects that are related to all three of these individuals by some particular property (but on DBpedia it doesn't have any answers):
select ?s ?p where {
?s ?p dbpedia:Bob_Dylan, dbpedia:Tom_Waits, dbpedia:The_Byrds .
}
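For the three classes named in the question, a join might look roughly like the sketch below. Everything in it is hypothetical: the ex: namespace, the class IRIs, and the assumption that a population links to a drug and a drug links to a side effect in that direction. The linking properties are left as variables, so the query returns whatever relationships actually exist.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?population ?p1 ?drug ?p2 ?sideEffect
WHERE {
  ?population a ex:Population .    # hypothetical class IRIs
  ?drug       a ex:Drugname .
  ?sideEffect a ex:SideEffects .
  # Assumed link direction: population -> drug -> side effect
  ?population ?p1 ?drug .
  ?drug       ?p2 ?sideEffect .
}
```

When building such a query dynamically, one type pattern per selected class plus one linking pattern per pair of classes to be joined gives the general shape.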