Events in dbpedia.org without start and end Date - sparql

I'm trying to fetch every event in dbpedia.org along with its startDate and endDate (startDate and endDate are attributes defined in http://schema.org/Event).
Here is the SPARQL query:
PREFIX db-owl: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
?r a db-owl:Event .
OPTIONAL{ ?r db-owl:startDate ?c }
} LIMIT 2000
(I'm using the Virtuoso SPARQL Query Editor to test it on the http://dbpedia.org/sparql endpoint.)
Test it yourself and you will see that only a very small subset of those events have a startDate.
For example, if I make my query a bit more specific, like this:
PREFIX db-owl: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
?r a db-owl:MilitaryConflict .
OPTIONAL{ ?r db-owl:startDate ?c }
} LIMIT 2000
You can see that startDate is not specified for any MilitaryConflict.
How can this be possible?
Am I missing something, or is the English version of DBpedia simply missing crucial information such as an event's startDate?
Note: If I run the same queries on the French DBpedia endpoint (http://fr.dbpedia.org/sparql), I get totally different results, i.e., a bunch of MilitaryConflict resources have their corresponding startDate filled in.

Sure, few (if any) of these have dbo:startDate properties, but lots of them have dbo:date properties, and even more have dbp:date properties. (The ontology values tend to be much cleaner than the raw values, but both can be useful.) E.g., have a look at the results of:
SELECT * {
?r a dbo:MilitaryConflict .
OPTIONAL{ ?r dbo:startDate ?startDate }
OPTIONAL{ ?r dbo:date ?oDate }
OPTIONAL{ ?r dbp:date ?pDate }
} LIMIT 2000
Figuring out what data is and isn't available in DBpedia is something of a trial and error process. I found the dbo:date properties by browsing some of those instances, e.g., 2015 Douma market massacre has a value for that property.
Regarding the note in the question: "If I run the same queries on the French DBpedia endpoint (http://fr.dbpedia.org/sparql), I get totally different results, i.e., a bunch of MilitaryConflict resources have their corresponding startDate filled in."
This does make the question a bit more interesting, but I'd guess that it comes down to different infoboxes being used on the French Wikipedia, which help to get more of that information into the French DBpedia.
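The fallback from dbo:startDate to dbo:date to dbp:date discussed in this answer could be folded into a single query with COALESCE. The following is only a minimal Python sketch that builds such a query string; the helper name build_date_query and the variable names ?d0, ?d1, ... are made up for illustration, and the generated query has not been checked against the live endpoint.

```python
# Sketch: build a SPARQL query that tries several date properties in order
# and keeps the first one that is bound. Helper and variable names are
# hypothetical; only the prefixes and properties come from the thread above.
PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbp: <http://dbpedia.org/property/>\n"
)


def build_date_query(rdf_class, date_properties, limit=2000):
    """Return a query with one OPTIONAL per property and a COALESCE over them."""
    variables = [f"?d{i}" for i in range(len(date_properties))]
    optionals = [
        f"  OPTIONAL {{ ?r {prop} {var} }}"
        for prop, var in zip(date_properties, variables)
    ]
    select = f"SELECT ?r (COALESCE({', '.join(variables)}) AS ?date)"
    where = "\n".join([f"  ?r a {rdf_class} ."] + optionals)
    return f"{PREFIXES}{select}\nWHERE {{\n{where}\n}} LIMIT {limit}"


if __name__ == "__main__":
    print(build_date_query(
        "dbo:MilitaryConflict",
        ["dbo:startDate", "dbo:date", "dbp:date"],
    ))
```

Note that a resource with several values for one of these properties will still produce several rows, so the output may need de-duplication.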

Related

Sparql identifying Powerstations more precisely

I would like to get all power stations, and also the type of each power station (nuclear, water, coal, etc.), from DBpedia.
I noticed that there are no specific types of power stations, so I am querying all power stations and trying to figure out the type of each one from its name. (I will not catch all of them, but enough.)
My query so far:
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
#PREFIX georss: <http://www.georss.org/georss/>
select distinct *
{
?name rdf:type dbp-ont:PowerStation .
?name geo:lat ?lat .
?name geo:long ?long
OPTIONAL { ?name dbo:installedCapacity ?installedCapacity }
OPTIONAL { ?name dbo:openingDate ?openingDate }
OPTIONAL { ?name dbo:closingDate ?closingDate }
} limit 100
Is there a way to have a new field named 'nuclear' that has a value of '1' if its name contains 'nuclear' ?
First thing is to remember that DBpedia data is a moving target, just like the Wikipedia data from which it is derived. Updates to Wikipedia will eventually (typically, in 3-9 months) be part of DBpedia. More quickly (typically in a few hours, if not minutes or seconds; sometimes days, for various reasons), they'll be part of DBpedia-Live.
The long-term solution for giving every powerstation a type as you wish, is to edit Wikipedia.
For your specific immediate workaround, note that a large number of OPTIONAL clauses can make your query take much longer, and so may eventually mean that DBpedia will not return the data you want. You may need to spin up your own mirror (such as DBpedia or DBpedia-Live in the Amazon EC2 Cloud).
Finally, as @AKSW noted in comments, the line below should deliver your ?nuclear variable with a 1 if that string appeared in the ?name. Just put it after your OPTIONAL lines.
BIND ( IF ( CONTAINS ( LCASE ( STR ( ?name ) ), "nuclear" ), 1, 0 ) AS ?nuclear )
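If several station types are needed, one such BIND line per keyword could be generated rather than typed by hand. This is a small Python sketch under that assumption; the helper name bind_keyword_flags is hypothetical, and each keyword is assumed to be a plain lowercase-able word that is also a valid SPARQL variable name.

```python
# Sketch: emit one BIND line per keyword, in the same shape as the line above.
# The helper name is made up; keywords must be usable as variable names.
def bind_keyword_flags(keywords, source_var="?name"):
    lines = []
    for keyword in keywords:
        key = keyword.lower()
        lines.append(
            f'BIND ( IF ( CONTAINS ( LCASE ( STR ( {source_var} ) ), "{key}" ), 1, 0 ) AS ?{key} )'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the output after the OPTIONAL lines of the query.
    print(bind_keyword_flags(["nuclear", "coal", "hydro"]))
```

Matching on the resource name is only a heuristic, as the question already notes, so some stations will stay unclassified.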

Get movie(s) based on book(s) from DBpedia

I am new to SPARQL and trying to fetch a movie adapted from specific book from dbpedia. This is what I have so far:
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT *
WHERE
{
<http://dbpedia.org/page/2001:_A_Space_Odyssey> a ?type.
?type onto:basedOn ?book .
?book a onto:Book
}
I can't get any results. How can I do that?
When using any web resource, in your case the property :basedOn, you need to make sure that you have declared the right prefix. If you are querying the DBpedia SPARQL endpoint, you can use dbo:basedOn directly, even without declaring it, as dbo is among the predefined prefixes. Alternatively, if you want to use your own short name, or if you are using another SPARQL client, make sure that whatever prefix you choose is declared as http://dbpedia.org/ontology/.
Then, to get more results, you may want to leave the type of the subject of this triple pattern unrestricted, as there could be movies that are not actually typed as such. So, a query like this
select distinct *
{
?movie dbo:basedOn ?book .
?book a dbo:Book .
}
will give you lots of good results, but not all. For example, the resource from your example will be missing. You can easily check the available properties between these two resources with a query like this:
select ?p
{
{<http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey> }
UNION
{ <http://dbpedia.org/resource/2001:_A_Space_Odyssey> ?p <http://dbpedia.org/resource/2001:_A_Space_Odyssey_(film)>}
}
You'll get only one result:
http://www.w3.org/2000/01/rdf-schema#seeAlso
(note that the URI is with 'resource', not with 'page')
Then you may search for any path between the two resources, using the method described here, or find a combination of other patterns that would increase the number of results.
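The 'resource' versus 'page' point is easy to trip over when URIs are copied from a browser. A tiny Python sketch of the rewrite follows; the function name page_to_resource is made up for illustration, and it only handles the English DBpedia host used in this question.

```python
# Sketch: turn a DBpedia HTML page URL into the resource URI used in the data.
# Only the http://dbpedia.org/page/ form from the question is handled.
PAGE_PREFIX = "http://dbpedia.org/page/"
RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def page_to_resource(uri):
    if uri.startswith(PAGE_PREFIX):
        return RESOURCE_PREFIX + uri[len(PAGE_PREFIX):]
    return uri  # already a resource URI, or something else entirely


if __name__ == "__main__":
    print(page_to_resource("http://dbpedia.org/page/2001:_A_Space_Odyssey"))
```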

How to query for publication date of books using SPARQL in DBPEDIA

I am trying to retrieve the publication date and the number of pages for books in DBpedia. I tried the following query and it gives empty results. I see that these are properties under Book (http://mappings.dbpedia.org/server/ontology/classes/Book), but I could not retrieve them.
I would like to know if there is an error in the code or if DBpedia does not store these dates for books.
SELECT ?book ?genre ?date ?numberOfPages
WHERE {
?book rdf:type dbpedia-owl:Book .
?book dbp:genre ?genre .
?book dbp:firstPublicationDate ?date .
OPTIONAL {?book dbp:numberOfPages ?numberOfPages .}
}
The dbp:firstPublicationDate pattern does not work for two reasons:
First, as pointed out in the first answer, you used the wrong prefix.
But even if you correct it, you'll still get no results. The best thing to do then is to test with the minimum number of patterns; in your case you should ask for books with a first publication date, using two triple patterns only. If you still don't get results, you should test how <http://dbpedia.org/ontology/firstPublicationDate> is actually used with a query like this:
SELECT ?class (COUNT (DISTINCT ?s) AS ?instances)
WHERE {
?s <http://dbpedia.org/ontology/firstPublicationDate> ?date ;
a ?class
}
GROUP BY ?class
ORDER BY DESC(?instances)
LIMIT 1000
Mapping based properties are using the namespace http://dbpedia.org/ontology/, thus, the prefix must be dbo instead of dbp, which stands for http://dbpedia.org/property/.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?book ?genre ?date ?numberOfPages
WHERE {
?book a dbo:Book ;
dbp:genre ?genre ;
dbo:firstPublicationDate ?date .
OPTIONAL {?book dbp:numberOfPages ?numberOfPages .}
}
Some additional comments:
put the prefixes to the SPARQL query such that others here can run it without any exceptions (also in the future) - the current SPARQL query uses dbpedia-owl but this one is not pre-defined on the official DBpedia anymore - it's called dbo instead
which brings me to the second point -> if you're using a public SPARQL endpoint, show its URL
you can start debugging your own SPARQL query by simply starting with only parts of it and adding more triple patterns then, e.g. in your case you could check if there is any triple with the property with
PREFIX dbp: <http://dbpedia.org/property/>
SELECT * WHERE {?book dbp:firstPublicationDate ?date } LIMIT 10
Update
As Ivo Velitchkov noticed in his answer below, the property dbo:firstPublicationDate is only used for mangas, etc., i.e. written work that was published periodically. Thus, the result will be empty.
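The debugging advice given in this answer (start with part of the query and add triple patterns one at a time) can be mechanised. Below is a minimal Python sketch that only generates the successive query strings; the helper name incremental_queries is hypothetical, and actually sending the queries to an endpoint is left out.

```python
# Sketch: produce queries containing the first 1, 2, ... n triple patterns,
# so the pattern that empties the result set can be spotted. The helper name
# is made up; running the queries against an endpoint is not shown.
def incremental_queries(prefixes, patterns, limit=10):
    queries = []
    for n in range(1, len(patterns) + 1):
        body = "\n".join("  " + pattern for pattern in patterns[:n])
        queries.append(f"{prefixes}\nSELECT * WHERE {{\n{body}\n}} LIMIT {limit}")
    return queries


if __name__ == "__main__":
    prefixes = (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbp: <http://dbpedia.org/property/>"
    )
    patterns = [
        "?book a dbo:Book .",
        "?book dbp:genre ?genre .",
        "?book dbo:firstPublicationDate ?date .",
    ]
    for query in incremental_queries(prefixes, patterns):
        print(query, end="\n\n")
```

The first query in the list that returns no rows points at the pattern to investigate.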

Querying WikiData, difference between p and wdt default prefix

I am new to wikidata and I can't figure out when I should use -->
wdt prefix (http://www.wikidata.org/prop/direct/)
and when I should use -->
p prefix (http://www.wikidata.org/prop/).
in my sparql queries. Can someone explain what each of these mean and what is the difference?
Things in the p: namespace are used to select statements. Things in the wdt: namespace are used to select entities. Entity selection, with wdt:, allows you to simplify or summarize more complex queries involving statement selection.
When you see a p: you are usually going to see a ps: or pq: shortly following. This is because you rarely want a list of statements; you usually want to know something about those statements.
This example is a two-step process showing you all the graffiti in Wikidata:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti p:P31 ?statement . # entities that have a P31 statement
?statement ps:P31 wd:Q17514 . # which state something is graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Two different versions of the P31 property are used here, housed in different namespaces. Each version comes with different expectations about how it will connect to other items. Things in the p: namespace connect entities to statements, and things in the ps: namespace connect statements to values. In the example, p:P31 is used to select statements about an entity. The entity will be graffiti, but we do not specify that until the next line, where ps:P31 is used to select the values (objects) of the statements, specifying that those values should be graffiti.
So, that's kind of complicated! The wdt: namespace is supposed to make this kind of query simpler. The example could be rewritten as:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti wdt:P31 wd:Q17514 . # entities that are graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This is now one line shorter because we are no longer looking for statements about graffiti, but for graffiti itself. The dual p: and ps: linkages are summarized with a wdt: version of the same P31 property. However, be aware:
This technique only returns "truthy" statements: those with the best non-deprecated rank for the property (preferred rank if any exists, otherwise normal rank), and without their qualifiers. (The "t" in wdt: stands for "truthy".)
Because of this, information available to wdt: is sometimes missing some facts. Often in my experience a p: and ps: query will return a few more results than a wdt: query.
If you go to the Wikidata item page for Barack Obama at https://www.wikidata.org/wiki/Q76 and scroll down, you see the entry for the "spouse" property P26:
Think of the p: prefix as a way to get to the entire white box on the right side of the image.
In order to get to the information inside the white box, you need to dig deeper.
In order to get to the main part of the information ("Michelle Obama"), you combine the p: prefix with the ps: prefix like this:
SELECT ?spouse WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
}
The variable ?s is an abstract statement node (aka the white box).
You can get the same information with only one triple in the body of the query by using wdt::
SELECT ?spouse WHERE {
wd:Q76 wdt:P26 ?spouse .
}
So why would you ever use p:?
You might have noticed that the white box also contains meta information ("start time" and "place of marriage").
In order to get to the meta information, you combine the p: prefix with the pq: prefix.
The following example query returns all the information together with the statement node:
SELECT ?s ?spouse ?time ?place WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
?s pq:P580 ?time .
?s pq:P2842 ?place .
}
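To make the "white box" picture concrete, here is a toy in-memory model of the same triples in Python. It is only a sketch: the statement node id "wds:stmt1" is invented, the values are illustrative, and the place-of-marriage qualifier is left out, so nothing here is a copy of Wikidata's actual data.

```python
# Toy model of the statement structure described above. The statement node id
# is made up and the values are illustrative, not fetched from Wikidata.
TRIPLES = [
    ("wd:Q76", "p:P26", "wds:stmt1"),          # entity -> statement node
    ("wds:stmt1", "ps:P26", "wd:Q13133"),      # statement node -> main value
    ("wds:stmt1", "pq:P580", "1992-10-03"),    # statement node -> qualifier
    ("wd:Q76", "wdt:P26", "wd:Q13133"),        # direct shortcut to the value
]


def objects(subject, predicate, triples=TRIPLES):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def spouse_direct(entity):
    """One hop, like the wdt:P26 query: values only, no statement node."""
    return objects(entity, "wdt:P26")


def spouse_with_qualifiers(entity):
    """Two hops, like the p:/ps:/pq: query: value plus qualifier per statement."""
    rows = []
    for statement in objects(entity, "p:P26"):
        for value in objects(statement, "ps:P26"):
            rows.append({
                "statement": statement,
                "spouse": value,
                "start_time": objects(statement, "pq:P580"),
            })
    return rows


if __name__ == "__main__":
    print(spouse_direct("wd:Q76"))
    print(spouse_with_qualifiers("wd:Q76"))
```

Both functions reach the same spouse value, but only the two-hop version passes through the statement node, which is where the qualifiers hang.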
They're simply XML namespace prefixes, basically a shortcut for full URIs. So given wdt:Apples, the full URI is http://www.wikidata.org/prop/direct/Apples and given p:fruitType the URI is http://www.wikidata.org/prop/fruitType.
Prefixes/namespaces have no other meaning, they are simply ways to define the name of something with URL format. However conventions, such as defining properties in http://www.wikidata.org/prop/, are useful to separate the meanings of terms, so 'direct' is likely a sub-type of property as well (in this case having to do with wikipedia dumps).
For the specifics, you'd need to hope the authors have exposed some naming convention, or be caught in a loop of "was it p:P51 or p:P15 or maybe wdt:P51?". And may luck be with you because the "semantics" of semantic technology have been lost.

Filtering results based on specific properties with specific values (cause timeout connection to DBpedia)

I'm trying to make a SPARQL query using Prolog and DBpedia. My objective is to tag all persons in a text, so to retrieve famous people I wrote this query, which removes results such as music groups (Band) and organisations, since I want to tag only real people and not abstract entities.
select ?person where{
{
?person a dbpedia-owl:Person; rdfs:label "Name Surname"@it .
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name"@it; foaf:surname "Surname"@it .
}
UNION
{
?person a dbpedia-owl:Person; foaf:name "Name Surname"@it .
}
FILTER NOT EXISTS {
{ ?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Band .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
UNION
{
?subject <http://airpedia.org/ontology/type_with_conf#10> dbpedia-owl:Organisation .
?subject rdfs:label ?artistName .
FILTER ( str(?artistName) = "Name Surname" )
}
}
}
I use the Italian (it.) version of DBpedia; if you run this query, use that version, otherwise the results will not match mine.
So, for example, if I search for "Metallica" as a person, I don't want to get any results, because it is a Band or an Organisation (in this case Metallica is typed as an Organisation too).
This works well: see "Metallica Query Results" and, for "Michael Jackson", "Michael Jackson Query results".
My problem arises when I search for someone who is not a singer or a music band. For example, if I try "Jim Carrey", I get the error "transaction timed out".
I think I get this problem because those properties are undefined for Jim Carrey. I tried putting an OPTIONAL marker in each subquery of the first filter, but I get the same error.
I put the code in a pastebin file so you can find all three queries.
I know that I should not use static strings in a query and that there are much better ways, but I need this because I compose the query with Prolog and then send it to the online SPARQL endpoint, so I must do it this way.
To @Joshua: I tried removing the FILTER on the string inside the FILTER NOT EXISTS, but then it does not work anymore. Thanks for helping me anyway.
Excuse me for so much editing; I resolved part of the original problem but have not found a complete solution.
First problem: filtering results based on specific properties with specific values. (Works)
Second: the first works only for things with that specific property (as shown above), like Metallica, Michael Jackson, The Beatles, ..., but not for those without the properties in the filter.
(I can't use more than two links because I'm a newbie, so I will put a pastebin link with the three queries and their results in the comments.)