Use DBPedia to load all People along with some data - sparql

I am using DBpedia for the first time. I would like to download all the people in the dataset along with the properties commonName, nationality, birthDate, and knownFor (I will eventually put the results into an Excel spreadsheet using some sort of scripting language, I think).
This was my first attempt at a query to do this job, but it does not work. I pieced it together from other bits of code I've seen on the internet. Does anyone know how to fix this? Thanks
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
EDIT: NEW VERSION OF CODE: RETURNS COMMONNAME (WITH NON-ENGLISH DUPLICATES). STILL MISSING OTHER PROPERTIES.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName . FILTER(lang(?commonName) = 'en')
}
LIMIT 30

First, your query has a bunch of syntax issues:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
# ^ you probably want to use the dbpedia-owl properties, which are
#   in <http://dbpedia.org/ontology/>
SELECT ?person ?commonName ?nationality ?knownFor ? birthDate
# ^ space between ? and varname (in "? birthDate")
WHERE {
?person a type:Person .
?person prop:commonName ?commonNameFilter(lang(?commonName) = 'en') .
# ^ This needs to be "?commonName . FILTER(..."
#   and the same thing applies to your other
#   filters
?person prop:nationality ?nationality(lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor(lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
It's easier to build some of these queries incrementally, because then you can find out what properties some of the resources actually have, and then you can extend your query some more. The public endpoint has a number of predefined namespaces, and using those will make it easier for others to read your query. So, you can start by asking for people:
SELECT * WHERE {
?person a dbpedia-owl:Person .
}
LIMIT 10
Seeing that that's working, you can look at some of the returned instances and see that they have dbpedia-owl:commonName properties, and then extend the query:
SELECT * WHERE {
?person a dbpedia-owl:Person ;
dbpedia-owl:commonName ?commonName .
}
LIMIT 10
It's easy enough to extend this with the dbpedia-owl:birthDate property. I don't see a nationality predicate on the instances that I've looked at, so I'm not sure what you were basing the nationality query on. While I saw some use of the knownFor property, I didn't see it on many instances, so if you make it a required property, you're going to exclude lots of people. This sort of incremental approach will probably help you out in the long run, though.
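Since the goal is to end up with a spreadsheet, one way to use the incrementally built query is to ask the endpoint for CSV, which Excel can open. The following is a minimal Python sketch rather than a tested pipeline. It assumes the public endpoint is at https://dbpedia.org/sparql and accepts Virtuoso's format request parameter; the helper names (build_query, build_url, download_csv) are made up for illustration. Properties that many people lack are wrapped in OPTIONAL so that they don't exclude rows.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: URL of the public DBpedia endpoint; adjust for a mirror.
ENDPOINT = "https://dbpedia.org/sparql"


def build_query(limit=100):
    """Return a SPARQL query with one required and three optional properties."""
    return f"""
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?person ?commonName ?birthDate ?nationality ?knownFor WHERE {{
  ?person a dbpedia-owl:Person ;
          dbpedia-owl:commonName ?commonName .
  FILTER(lang(?commonName) = 'en')
  OPTIONAL {{ ?person dbpedia-owl:birthDate ?birthDate }}
  OPTIONAL {{ ?person dbpedia-owl:nationality ?nationality }}
  OPTIONAL {{ ?person dbpedia-owl:knownFor ?knownFor }}
}}
LIMIT {int(limit)}
""".strip()


def build_url(query):
    """Encode the query as a GET request that asks for CSV output."""
    # Assumption: the endpoint honours a "format" parameter (Virtuoso does).
    return ENDPOINT + "?" + urlencode({"query": query, "format": "text/csv"})


def download_csv(path, limit=100):
    """Fetch the results and save them to a file (needs network access)."""
    with urlopen(build_url(build_query(limit))) as response:
        data = response.read()
    with open(path, "wb") as out:
        out.write(data)


if __name__ == "__main__":
    # Print the request URL; call download_csv("people.csv") to fetch data.
    print(build_url(build_query(limit=10)))
```

Start with a small limit while checking which columns actually come back filled, then raise it once the query returns what you expect.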
Finding Properties
While the browseable ontology provides a nice way to find classes, I'm not sure whether there's such a nice way of finding properties. However, you can do something in a brute force manner. E.g., to find all the properties that have actually been used for Persons, you can run a query like the following. (Note: this query takes a while to execute, so if you use it, you should probably download the results.)
select distinct ?p where {
[] a dbpedia-owl:Person ;
?p [] .
}
I'll note that dbpedia-owl:nationality does appear in that list.
To get all the properties for everything, you can download the ontology, and run a query like:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
select * where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
}
I ran this locally using Jena's ARQ:
$ arq --query properties.sparql --data dbpedia_3.8.owl
----------------------------------------------------------------------------
| p |
============================================================================
| <http://dbpedia.org/ontology/regionServed> |
| <http://dbpedia.org/ontology/coachedTeam> |
| <http://dbpedia.org/ontology/legalForm> |
| <http://dbpedia.org/ontology/goldenCalfAward> |
| <http://dbpedia.org/ontology/composer> |
| <http://dbpedia.org/ontology/owningOrganisation> |
| <http://dbpedia.org/ontology/branchFrom> |
| <http://dbpedia.org/ontology/iso6393Code> |
...
| <http://dbpedia.org/ontology/classification> |
| <http://dbpedia.org/ontology/bgafdId> |
| <http://dbpedia.org/ontology/currencyCode> |
| <http://dbpedia.org/ontology/onChromosome> |
| <http://dbpedia.org/ontology/course> |
| <http://dbpedia.org/ontology/frequentlyUpdated> |
| <http://dbpedia.org/ontology/distance> |
| <http://dbpedia.org/ontology/volume> |
| <http://dbpedia.org/ontology/description> |
----------------------------------------------------------------------------
That won't provide the rdfs:domain and rdfs:range, but you could also ask for these, or for just those properties with rdfs:domain dbpedia-owl:Person (but note that this won't get all the properties that could be used with Person, since the declared domain could be more or less specific):
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix dbpedia-owl: <http://dbpedia.org/ontology/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?p ?range where {
{ ?p a owl:ObjectProperty }
UNION
{ ?p a owl:DatatypeProperty }
?p rdfs:domain dbpedia-owl:Person ; rdfs:range ?range .
}
$ arq --query properties.sparql --data dbpedia_3.8.owl | head
--------------------------------------------------------------------------------------------------------
| p | range |
========================================================================================================
| dbpedia-owl:restingPlacePosition | <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> |
| dbpedia-owl:opponent | dbpedia-owl:Person |
| dbpedia-owl:employer | dbpedia-owl:Organisation |
| dbpedia-owl:hometown | dbpedia-owl:Settlement |
| dbpedia-owl:militaryBranch | dbpedia-owl:MilitaryUnit |
| dbpedia-owl:school | dbpedia-owl:EducationalInstitution |
| dbpedia-owl:ethnicity | dbpedia-owl:EthnicGroup |
...
| dbpedia-owl:sex | xsd:string |
| dbpedia-owl:hipSize | xsd:double |
| dbpedia-owl:individualisedPnd | xsd:nonNegativeInteger |
| dbpedia-owl:weddingParentsDate | xsd:date |
| dbpedia-owl:birthName | xsd:string |
| dbpedia-owl:networth | xsd:double |
| dbpedia-owl:birthYear | xsd:gYear |
| dbpedia-owl:bustSize | xsd:double |
| dbpedia-owl:description | xsd:string |
--------------------------------------------------------------------------------------------------------

I'm not very good at SPARQL yet, but I do see some syntax issues here. I rewrote the query to look like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?person, ?commonName, ?nationality, ?knownFor, ?birthDate
WHERE {
?person a type:Person .
?person prop:commonName ?commonName .
FILTER (lang(?commonName) = 'en') .
?person prop:nationality ?nationality .
FILTER (lang(?nationality) = 'en') .
?person prop:knownFor ?knownFor .
FILTER (lang(?knownFor) = 'en') .
?person prop:birthDate ?birthDate .
}
Now it at least runs without an error, but I'm not seeing any results, and I'm not sure why.

Related

How can I match string labels from an external SPARQL endpoint with string labels in my local ontology?

PREFIX ex: <http://www.semanticweb.org/caeleanb/ontologies/twittermap#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?name (STR(?name) AS ?strip_name) ?name2
WHERE {
?p ex:displayName ?name2 .
SERVICE <http://dbpedia.org/sparql> {
{ ?s rdfs:label ?name .
?s rdf:type foaf:Person . }
UNION
{ ?x rdfs:label ?name .
?x dbo:wikiPageRedirects ?s . }
}
FILTER(STR(?name) = ?name2) .
}
This is the SPARQL query that I am sending to my Stardog SPARQL endpoint (hosted locally). I know a bunch of the prefixes are missing but I promise it's not a prefix problem, Stardog holds the rest of the prefixes.
The query is supposed to find all foaf:Persons which share a label string with a display name resource in my own ontology. The reason for the UNION is to try to catch all pages which redirect to the Person as well (but let me know if that part looks screwed up).
Basically the problem is that the above query gives me output like:
name strip_name name2
John McCain (en) | John McCain | John McCain
John McCain (de) | John McCain | John McCain
John McCain (es) | John McCain | John McCain
...
But John McCain is not a placeholder, he is the only one who shows up, just in a bunch of different languages. However, I expect many more to show up, since I can run queries like:
PREFIX ex: <http://www.semanticweb.org/caeleanb/ontologies/twittermap#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?name (STR(?name) AS ?strip_name) ?name2
WHERE {
?p ex:displayName ?name2 .
SERVICE <http://dbpedia.org/sparql> {
{ ?s rdfs:label ?name .
?s rdf:type foaf:Person .
FILTER(?name="Bill Nye"@en)}
UNION
{ ?x rdfs:label ?name .
?x dbo:wikiPageRedirects ?s .
FILTER(?name="Bill Nye"@en)}
}
FILTER(STR(?name) = ?name2) .
}
and get output like:
name strip_name name2
Bill Nye (en) | Bill Nye | Bill Nye
I can repeat this trick with other names that I know should match in my ontology like "Jamie Oliver" and "Donald Trump" but for some reason the only name that shows up in the general version of the query is John McCain. Can anyone explain this to me? I'm sure I don't understand some part of the SERVICE block but I've tried a lot of different configurations and cannot get this query to work properly. Thanks for any help.

SPARQL select query with condition?

I'm trying to write a SPARQL query that returns the domain names and method name for a given Java class from the RDF below. For example:
Select DomainNames, MethodName where JavaClass = 'MyJavaClass'.
This is just a pseudo-query. I need help writing a similar query in SPARQL. Thanks.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:DOL="http://www.MyOnt.com/something/v1#"
xmlns:DC="http://purl.org/dc/dcmitype/"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<rdf:Description rdf:about="http://www.MyOnt.com/something/data/MyJavaClass">
<DOL:belongsTo>
<rdf:Description rdf:about="http://www.MyOnt.com/something/data/MyDomain">
<DOL:domainName>MyDomainValue2</DOL:domainName>
<DOL:domainName>MyDomainValue</DOL:domainName>
</rdf:Description>
</DOL:belongsTo>
<DOL:hasMethod>
<rdf:Description rdf:about="http://www.MyOnt.com/something/data/MyMethod">
<DOL:returnType>MethodReturnType</DOL:returnType>
</rdf:Description>
</DOL:hasMethod>
<foaf:name>MyJavaClass</foaf:name>
</rdf:Description>
</rdf:RDF>
It's generally easier to understand what the SPARQL query should look like if you first put the data into Turtle, which has a syntax very similar to SPARQL. Here's what your data is in Turtle:
@prefix DOL: <http://www.MyOnt.com/something/v1#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix DC: <http://purl.org/dc/dcmitype/> .
<http://www.MyOnt.com/something/data/MyDomain>
DOL:domainName "MyDomainValue2" , "MyDomainValue" .
<http://www.MyOnt.com/something/data/MyJavaClass>
DOL:belongsTo <http://www.MyOnt.com/something/data/MyDomain> ;
DOL:hasMethod <http://www.MyOnt.com/something/data/MyMethod> ;
foaf:name "MyJavaClass" .
<http://www.MyOnt.com/something/data/MyMethod>
DOL:returnType "MethodReturnType" .
Once you've done that, the query looks almost exactly like the data, except with variables in it. The only catch here is that since you're looking for domains and methods, you need to use a union (assuming that you want to bind domains and methods as different variables).
prefix DOL: <http://www.MyOnt.com/something/v1#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
select ?domain ?method {
?class foaf:name "MyJavaClass" .
{ ?class DOL:belongsTo ?domain }
union
{ ?class DOL:hasMethod ?method }
}
---------------------------------------------------------------------------------------------------
| domain | method |
===================================================================================================
| <http://www.MyOnt.com/something/data/MyDomain> | |
| | <http://www.MyOnt.com/something/data/MyMethod> |
---------------------------------------------------------------------------------------------------
If you're willing to have the domain and method bound to the same variable, you can use an alternation property path to select either a domain or a method:
prefix DOL: <http://www.MyOnt.com/something/v1#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
select ?domainOrMethod {
?class foaf:name "MyJavaClass" ;
DOL:belongsTo|DOL:hasMethod ?domainOrMethod
}
--------------------------------------------------
| domainOrMethod |
==================================================
| <http://www.MyOnt.com/something/data/MyDomain> |
| <http://www.MyOnt.com/something/data/MyMethod> |
--------------------------------------------------
As another alternative, you could use a values block to specify the properties that you want to follow (hasMethod or belongsTo), in which case you can select that as well in order to know which type of value you have:
prefix DOL: <http://www.MyOnt.com/something/v1#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
select ?property ?value {
values ?property { DOL:belongsTo DOL:hasMethod }
?class foaf:name "MyJavaClass" ;
?property ?value
}
------------------------------------------------------------------
| property | value |
==================================================================
| DOL:belongsTo | <http://www.MyOnt.com/something/data/MyDomain> |
| DOL:hasMethod | <http://www.MyOnt.com/something/data/MyMethod> |
------------------------------------------------------------------
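To see why the values version returns one row per property, it can help to mimic the matching by hand. The Python sketch below is only an illustration, not a SPARQL engine: it hard-codes the example triples as tuples, and the names TRIPLES and follow are invented here. It finds the resource whose foaf:name is "MyJavaClass" and then follows each of the listed properties from it.

```python
# Namespace strings taken from the example data.
DATA = "http://www.MyOnt.com/something/data/"
DOL = "http://www.MyOnt.com/something/v1#"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# The example data as (subject, predicate, object) tuples.
TRIPLES = [
    (DATA + "MyDomain", DOL + "domainName", "MyDomainValue2"),
    (DATA + "MyDomain", DOL + "domainName", "MyDomainValue"),
    (DATA + "MyJavaClass", DOL + "belongsTo", DATA + "MyDomain"),
    (DATA + "MyJavaClass", DOL + "hasMethod", DATA + "MyMethod"),
    (DATA + "MyJavaClass", FOAF_NAME, "MyJavaClass"),
    (DATA + "MyMethod", DOL + "returnType", "MethodReturnType"),
]


def follow(class_name, properties):
    """Return (property, value) pairs for resources named class_name."""
    # Step 1: bind ?class by matching "?class foaf:name <class_name>".
    classes = {s for s, p, o in TRIPLES if p == FOAF_NAME and o == class_name}
    # Step 2: keep triples whose subject is ?class and whose predicate
    # is one of the allowed properties (the role of the values block).
    return [(p, o) for s, p, o in TRIPLES if s in classes and p in properties]


rows = follow("MyJavaClass", [DOL + "belongsTo", DOL + "hasMethod"])
for prop, value in rows:
    print(prop, value)
```

Because the property is part of each row, the caller can tell a domain from a method without needing two separate variables.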

SPARQL query to get all class label with namespace prefix defined

I want to get all the classes stored in a Sesame repository.
This is my query:
SELECT ?class ?classLabel
WHERE {
?class rdf:type rdfs:Class.
?class rdfs:label ?classLabel.
}
It returns the URI of every class along with its label. For example,
"http://example.com/A_Class" "A_Class"
"http://example.com/B_Class" "B_Class"
But if a namespace prefix is already defined, I want the label to include that prefix. For example, if I have already defined the namespace prefix "ex" for "http://example.com/", the result should become
"http://example.com/A_Class" "ex:A_Class"
"http://example.com/B_Class" "ex:B_Class"
You want to add a URI prefix to the front of a label string?
I think you might be confused about what URI prefixes do. They're just shorthand for full URIs, and aren't part of the URI, and they don't have any bearing on strings.
You can do something like what you want with
SELECT ?class (CONCAT("ex:", ?classLabel) AS ?label)
WHERE {
?class rdf:type rdfs:Class.
?class rdfs:label ?classLabel.
}
But the prefix won't depend on the prefix of the URI.
You could have a chain of IF()s that tests the starting characters of STR(?class), but it will get ugly quickly:
BIND(IF(STRSTARTS(STR(?class), "http://example.com/"), "ex:", "other:") as ?prefix)
then
SELECT ... (CONCAT(?prefix, ?classLabel) AS ?label)
There's almost certainly an easier way to get what you want than doing it in SPARQL though :)
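As one possible "easier way", the prefixing can be done on the client after the query returns plain URIs. This is a minimal Python sketch under that assumption; the function name to_prefixed_name and the NAMESPACES table are hypothetical, and the result is only a display string (the local part is not checked for being a valid prefixed name).

```python
def to_prefixed_name(uri, namespaces):
    """Return 'prefix:local' using the longest matching namespace.

    If no namespace matches, the URI is returned unchanged.
    """
    best = None
    for prefix, ns in namespaces.items():
        # Prefer the longest namespace so nested namespaces resolve sensibly.
        if uri.startswith(ns) and (best is None or len(ns) > len(best[1])):
            best = (prefix, ns)
    if best is None:
        return uri
    prefix, ns = best
    return f"{prefix}:{uri[len(ns):]}"


# Hypothetical prefix table matching the question's example.
NAMESPACES = {"ex": "http://example.com/"}

for class_uri in ["http://example.com/A_Class", "http://other.org/C_Class"]:
    print(class_uri, to_prefixed_name(class_uri, NAMESPACES))
```

Keeping the prefix table in the client means new namespaces only require a new dictionary entry, not a change to the query.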
I agree with the other answers that say you're trying to do a strange thing, at least as it's stated in the question. Associating the prefix with the rdfs:label isn't all that useful, since the resulting string may or may not represent the actual URI of the class. Assuming that what you're trying to do is actually figure out a prefixed name for a URI given some set of prefix definitions, you can do this in SPARQL without needing lots of nested IFs; you can do it by using a VALUES and a FILTER.
Given data like this, in which a number of classes are defined (I don't define any labels here, since they're not necessary for this example),
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex1: <http://example1.com/> .
@prefix ex2: <http://example2.com/> .
@prefix ex3: <http://example3.com/> .
@prefix ex4: <http://example4.com/> .
ex1:A a rdfs:Class .
ex1:B1 a rdfs:Class .
ex2:A a rdfs:Class .
ex2:B2 a rdfs:Class .
ex3:A a rdfs:Class .
ex3:B3 a rdfs:Class .
ex3:A a rdfs:Class .
ex3:B4 a rdfs:Class .
ex4:A a rdfs:Class .
ex4:B4 a rdfs:Class .
you can write a query like this that selects all classes, checks whether their URI begins with one of a number of defined namespaces, and if so, returns the class and the prefixed form:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?class (group_concat(?prefixedName ; separator = "") as ?prefName) where {
values (?prefix ?ns) {
( "ex1:" <http://example1.com/> )
( "ex2:" <http://example2.com/> )
( "ex3:" <http://example3.com/> )
}
?class a rdfs:Class .
bind( if( strStarts( str(?class), str(?ns) ),
concat( ?prefix, strafter( str(?class), str(?ns) )),
"" )
as ?prefixedName )
}
group by ?class
order by ?class
which produces as results:
$ arq --data data.n3 --query query.sparql
---------------------------------------
| class | prefName |
=======================================
| <http://example1.com/A> | "ex1:A" |
| <http://example1.com/B1> | "ex1:B1" |
| <http://example2.com/A> | "ex2:A" |
| <http://example2.com/B2> | "ex2:B2" |
| <http://example3.com/A> | "ex3:A" |
| <http://example3.com/B3> | "ex3:B3" |
| <http://example3.com/B4> | "ex3:B4" |
| <http://example4.com/A> | "" |
| <http://example4.com/B4> | "" |
---------------------------------------

How do I build a subject search in SPARQL

I am trying to get all the books about the author William Shakespeare in the Project Gutenberg data (http://www4.wiwiss.fu-berlin.de/gutendata/).
I can use this query in order to return all the items in which Shakespeare is the author :
PREFIX dc:<http://purl.org/dc/elements/1.1/>
PREFIX foaf:<http://xmlns.com/foaf/0.1/>
SELECT ?bookTitle
WHERE {
?author foaf:name "Shakespeare, William, 1564-1616".
?book dc:creator ?author;
dc:title ?bookTitle
}
But I'd rather get the list of books of criticism about Shakespeare, using the dcterms:subject
http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/subject/Shakespeare_William_1564-1616.
I would really appreciate your help!
I agree with RobV's comment that it's surprising that after writing the first query there would be much difficulty in writing the second. At any rate, the change is relatively trivial. I checked using both dc:subject and dcterms:subject, and it appears that the Gutenberg data is stored with dc:subject, so I've used that in this query, not dcterms:subject. This is the query; you just need to add dc:subject <.../Shakespeare...> to the pattern, and remove the author constraint. A few prefixes make the query easier to write, and the results easier to read, too.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX gp: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/people/>
PREFIX gs: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/subject/>
PREFIX gr: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/>
SELECT ?book ?author ?bookTitle WHERE {
?book dc:creator ?author ;
dc:title ?bookTitle ;
dc:subject gs:Shakespeare_William_1564-1616 .
}
With a query like this, you should end up with results like:
--------------------------------------------------------------------------------------------
| book | author | bookTitle |
============================================================================================
| gr:etext14827 | gp:Guizot_François_Pierre_Guillaume_1787-1874 | "Étude sur Shakspeare" |
| gr:etext5429 | gp:Johnson_Samuel_1709-1784 | "Preface to Shakespeare" |
--------------------------------------------------------------------------------------------

Find lists containing ALL values in a set?

How can I find lists that contain each of a set of items in SPARQL? Let's say I have this data:
<http://foo.org/test> <http://foo.org/name> ( "new" "fangled" "thing" ) .
<http://foo.org/test2> <http://foo.org/name> ( "new" "york" "city" ) .
How can I find the items whose lists contain both "new" and "york"?
The following SPARQL doesn't work, since filter works on each binding of ?t, not the set of all of them.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s ?p WHERE {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t
FILTER( ?t = "new" && ?t = "york")
}
Finding lists with multiple required values
If you're looking for lists that contain all of a number of values, you'll need to use a more complicated query. This query finds all the ?s values that have a ?list value, and then keeps only those for which there is no word that is missing from the list. The list of words is specified using a values block.
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s {
?s <http://foo.org/name> ?list .
filter not exists { #-- It's not the case
values ?word { "new" "york" } #-- that any of the words
filter not exists { #-- are not
?list rdf:rest*/rdf:first ?word #-- in the list.
}
}
}
--------------------------
| s |
==========================
| <http://foo.org/test2> |
--------------------------
Find alternative strings in a list
On the other hand, if you're just trying to search for one of a number of options, you're using the right property path, but you've got the wrong filter expression. You want strings equal to "new" or "york"; no string is equal to both "new" and "york". You just need filter(?t = "new" || ?t = "york") or, better yet, in: filter(?t in ("new", "york")). Here's a full example with results:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?s ?t {
?s <http://foo.org/name>/rdf:rest*/rdf:first ?t .
filter(?t in ("new","york"))
}
-----------------------------------
| s | t |
===================================
| <http://foo.org/test2> | "new" |
| <http://foo.org/test2> | "york" |
| <http://foo.org/test> | "new" |
-----------------------------------
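The difference between the two queries is the difference between "all" and "any". The Python sketch below restates both over the example data, which is hard-coded as ordinary lists (an assumption for illustration; the names LISTS and WORDS are invented). The nested filter not exists corresponds to a double negation, which is equivalent to all().

```python
# The two RDF lists from the question, keyed by subject.
LISTS = {
    "http://foo.org/test": ["new", "fangled", "thing"],
    "http://foo.org/test2": ["new", "york", "city"],
}
WORDS = ["new", "york"]

# "There is no word that is not in the list", as in the nested
# filter not exists query.
contains_all = [
    s for s, items in LISTS.items()
    if not any(w not in items for w in WORDS)
]

# The same condition written directly with all().
contains_all_direct = [
    s for s, items in LISTS.items()
    if all(w in items for w in WORDS)
]

# One row per list element that is one of the words, as in
# filter(?t in ("new", "york")).
contains_any = [
    (s, t) for s, items in LISTS.items() for t in items if t in WORDS
]

print(contains_all)
print(contains_any)
```

Only test2 survives the "all" check, while the "any" version also reports test because its list contains "new".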