Sparql identifying Powerstations more precisely

I would like to get all powerstations and also the type of powerstation (Nuclear, Water, Coal etc...) from DBpedia.
I noticed that there are no specific types of powerstations, so I am querying all powerstations and trying to figure out the type of each powerstation from its name. (I will not catch all of them, but enough of them.)
My query so far :
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
#PREFIX georss: <http://www.georss.org/georss/>
select distinct *
{
?name rdf:type dbp-ont:PowerStation .
?name geo:lat ?lat .
?name geo:long ?long
OPTIONAL { ?name dbo:installedCapacity ?installedCapacity }
OPTIONAL { ?name dbo:openingDate ?openingDate }
OPTIONAL { ?name dbo:closingDate ?closingDate }
} limit 100
Is there a way to have a new field named 'nuclear' that has a value of '1' if its name contains 'nuclear' ?

First thing is to remember that DBpedia data is a moving target, just like the Wikipedia data from which it is derived. Updates to Wikipedia will eventually (typically, in 3-9 months) be part of DBpedia. More quickly (typically in a few hours, if not minutes or seconds; sometimes days, for various reasons), they'll be part of DBpedia-Live.
The long-term solution for giving every powerstation the type you want is to edit Wikipedia.
For your specific immediate workaround, note that a large number of OPTIONAL clauses can make your query take much longer, and so may eventually mean that DBpedia will not return the data you want. You may need to spin up your own mirror (such as DBpedia or DBpedia-Live in the Amazon EC2 Cloud).
Finally, as @AKSW noted in the comments, the line below should give you a ?nuclear variable that is 1 if the string "nuclear" appears in ?name, and 0 otherwise. Just put it after your OPTIONAL lines.
BIND ( IF ( CONTAINS ( LCASE ( STR ( ?name ) ), "nuclear" ), 1, 0 ) AS ?nuclear )
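For reference, here is a sketch of how the question's query might look with that BIND in place. It assumes the standard dbo: prefix for the DBpedia ontology (the question declares dbp-ont: but then uses dbo:) and spells out the rdf: and geo: prefixes that the public DBpedia endpoint otherwise predefines. Treat it as an untested illustration rather than a definitive query.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?name ?lat ?long ?installedCapacity ?openingDate ?closingDate ?nuclear
WHERE {
  ?name rdf:type dbo:PowerStation ;
        geo:lat  ?lat ;
        geo:long ?long .
  OPTIONAL { ?name dbo:installedCapacity ?installedCapacity }
  OPTIONAL { ?name dbo:openingDate ?openingDate }
  OPTIONAL { ?name dbo:closingDate ?closingDate }
  # ?name is the resource IRI, so STR() turns it into a string before matching
  BIND ( IF ( CONTAINS ( LCASE ( STR ( ?name ) ), "nuclear" ), 1, 0 ) AS ?nuclear )
}
LIMIT 100
```

The same pattern can be repeated with other keywords (one BIND per keyword and a different variable name each time) if you want flags for more than one kind of powerstation.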

Related

Sparql Query Geometry doing strange things

I'm writing a SPARQL query to find places within a specific radius of a given point, using the DBpedia endpoint (Snorql).
My first solution (already doing at some other Endpoints) was this:
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT *
WHERE {
?resource rdfs:label ?label .
?resource geo:lat ?lat .
?resource geo:long ?long .
?resource geo:geometry ?coordinates .
FILTER(bif:st_within(?coordinates, bif:st_geomFromText("POINT(10.2788 47.4093)"), 1)) .
FILTER (lang(?label)= "de") .
}
I noticed that it doesn't give me any results. Then I tried the same thing with the given rounded values in geo:lat and geo:long:
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT *
WHERE {
?resource rdfs:label ?label .
?resource geo:lat ?lat .
?resource geo:long ?long .
?resource geo:geometry ?coordinates .
FILTER(bif:st_within(bif:st_point(?long, ?lat), bif:st_geomFromText("POINT(10.2788 47.4093)"), 1)) .
FILTER (lang(?label)= "de") .
}
Now I'm getting 2 results. When I increase the radius in the first query to 21, there are plenty of results, but when I decrease it to 20, there are none. Is there a mistake in my first query?
Thank you very much,
SaW
As I answered on Confluence...
Interesting!
Your original query (with addition of FROM <http://dbpedia.org> clause) does return the expected results when run against the LOD Cloud Cache, which is now on a somewhat older Virtuoso engine. This looks like a regression in newer versions.
To check things on DBpedia, I started with your query, and added a couple of BINDs using the st_distance() function to the WHERE clause --
BIND
( bif:st_distance
( ?coordinates,
bif:st_geomFromText("POINT(10.2788 47.4093)")
) AS ?coord_distance
) .
BIND
( bif:st_distance
( bif:st_point(?long, ?lat),
bif:st_geomFromText("POINT(10.2788 47.4093)")
) AS ?latlong_distance
)
}
I also added a final ORDER BY ?coord_distance to the query.
My results on DBpedia.org/sparql clearly show two entities within your desired radius of 1, and the calculated distances are the same whether based on ?coordinates or st_point(?long, ?lat) but they are not delivered unless bif:st_within specifies a radius of 21 or greater -- and those results include a number of other entities that are within the larger radius.
I've raised this to Virtuoso Development, and it's being tracked internally as bug#18399.
... and as followed up there ...
st_within() uses st_distance(), so given that the srid is 4326 (as is typical for DBpedia geodata), "the haversine function is used to compute a great circle distance in kilometers on Earth." You can divide your distance in meters by 1000 (or multiply it by 0.001) to get the distance in kilometers for use in the st_within() call.
The computing time is dependent on the instance host, other load on the instance, etc. The public DBpedia instance's response time may well be longer than you can tolerate. You can set up your own mirror in a local server or in the cloud (AMIs based on DBpedia 2016-10 Snapshot [current DBpedia.org/sparql], or DBpedia-Live [current live.DBpedia.org/sparql]), which you can put on any AWS instance type -- so you can give it as much processor and/or RAM as you like.
Note that the LOD Cloud Cache instance may be upgraded to a newer Virtuoso engine at any time, so you should not rely on this delivering the desired results via st_within(). A slightly adjusted DBpedia query will deliver what I think you want using only the st_distance() function (here calculated from ?coordinates, but you could also use the more complex construction based on ?long and ?lat), and not the malfunctioning st_within().
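The adjusted query itself is not reproduced in this thread. As a rough sketch of what an st_distance()-only version could look like, assuming the same point and the 1 km radius from the question (bif: is Virtuoso's built-in prefix, so this only runs on Virtuoso endpoints):

```sparql
PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?resource ?label ?distance
WHERE {
  ?resource rdfs:label   ?label ;
            geo:geometry ?coordinates .
  FILTER ( lang(?label) = "de" )
  # distance in kilometers between the stored geometry and the reference point
  BIND ( bif:st_distance ( ?coordinates,
                           bif:st_geomFromText("POINT(10.2788 47.4093)")
                         ) AS ?distance )
  # keep only resources within the desired radius (1 km here)
  FILTER ( ?distance <= 1 )
}
ORDER BY ?distance
```

Because this computes a distance for every candidate instead of using st_within(), it may well be slower, so expect to tune it or run it against your own mirror.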

How to query for publication date of books using SPARQL in DBPEDIA

I am trying to retrieve the publication date and the number of pages for books in DBpedia. I tried the following query and it gives empty results. I see that these are properties under Book (http://mappings.dbpedia.org/server/ontology/classes/Book) but could not retrieve them.
I would like to know if there is an error in the code or if dbpedia does not store these dates related to books.
SELECT ?book ?genre ?date ?numberOfPages
WHERE {
?book rdf:type dbpedia-owl:Book .
?book dbp:genre ?genre .
?book dbp:firstPublicationDate ?date .
OPTIONAL {?book dbp:numberOfPages ?numberOfPages .}
}
The dbp:firstPublicationDate does not work for two reasons:
First, as pointed out in the first answer, you used the wrong prefix.
But even if you correct it, you'll see that you still get no results. The best thing to do then is to test with the minimum number of patterns; in your case you should ask for books with a first publication date, using two triple patterns only. If you still don't get results, you should check how <http://dbpedia.org/ontology/firstPublicationDate> is actually used, with a query like this:
SELECT ?class (COUNT (DISTINCT ?s) AS ?instances)
WHERE {
?s <http://dbpedia.org/ontology/firstPublicationDate> ?date ;
a ?class
}
GROUP BY ?class
ORDER BY DESC(?instances)
LIMIT 1000
Mapping based properties are using the namespace http://dbpedia.org/ontology/, thus, the prefix must be dbo instead of dbp, which stands for http://dbpedia.org/property/.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?book ?genre ?date ?numberOfPages
WHERE {
?book a dbo:Book ;
dbp:genre ?genre ;
dbo:firstPublicationDate ?date .
OPTIONAL {?book dbp:numberOfPages ?numberOfPages .}
}
Some additional comments:
put the prefixes to the SPARQL query such that others here can run it without any exceptions (also in the future) - the current SPARQL query uses dbpedia-owl but this one is not pre-defined on the official DBpedia anymore - it's called dbo instead
which brings me to the second point -> if you're using a public SPARQL endpoint, show its URL
you can start debugging your own SPARQL query by simply starting with only parts of it and adding more triple patterns then, e.g. in your case you could check if there is any triple with the property with
PREFIX dbp: <http://dbpedia.org/property/>
SELECT * WHERE {?book dbp:firstPublicationDate ?date } LIMIT 10
Update
As Ivo Velitchkov noticed in his answer below, the property dbo:firstPublicationDate is only used for mangas, etc., i.e. written work that was published periodically. Thus, the result will be empty.
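To find out which date-like properties books do carry, the same debugging approach can be turned around. The following is only a sketch: the variable names are made up for illustration, the "date" substring test is a crude heuristic, and the query may time out on the public endpoint.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?property (COUNT(DISTINCT ?book) AS ?books)
WHERE {
  ?book a dbo:Book ;
        ?property ?value .
  # keep only properties whose IRI mentions "date"
  FILTER ( CONTAINS ( LCASE ( STR ( ?property ) ), "date" ) )
}
GROUP BY ?property
ORDER BY DESC(?books)
LIMIT 100
```

Whatever properties come back can then be plugged into the original query in place of dbo:firstPublicationDate.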

Events in dbpedia.org without start and end Date

I'm trying to fetch every event in dbpedia.org along with its respective startDate and endDate (startDate and endDate are attributes defined in http://schema.org/Event).
Here is the sparql request:
PREFIX db-owl: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
?r a db-owl:Event .
OPTIONAL{ ?r db-owl:startDate ?c }
} LIMIT 2000
( I'm using Virtuoso Sparql Query editor to test it on the http://dbpedia.org/sparql endpoint )
Test it yourself and you will see that only a very small subset of those events have a specified startDate.
For example, if I specialize my query a bit more, like this:
PREFIX db-owl: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
?r a db-owl:MilitaryConflict .
OPTIONAL{ ?r db-owl:startDate ?c }
} LIMIT 2000
You can see that startDate is specified for NO MilitaryConflict.
How can this be possible?
Am I missing something, or is it just that the English version of DBpedia is missing some crucial information such as an event's startDate?
Note: If i run the same queries on the french dbpedia endpoint (http://fr.dbpedia.org/sparql) , i get totally different results, i.e: a bunch of MilitaryConflict are filled with their corresponding startDate
Sure, none of these have dbo:startDate properties, but lots of them have dbo:date properties, and even more have dbp:date properties. (The ontology values tend to be much cleaner than the raw values, but both can be useful.) E.g., have a look at the results of:
SELECT * {
?r a dbo:MilitaryConflict .
OPTIONAL{ ?r dbo:startDate ?startDate }
OPTIONAL{ ?r dbo:date ?oDate }
OPTIONAL{ ?r dbp:date ?pDate }
} LIMIT 2000
Figuring out what data is and isn't available in DBpedia is something of a trial and error process. I found the dbo:date properties by browsing some of those instances, e.g., 2015 Douma market massacre has a value for that property.
Note: If i run the same queries on the french dbpedia endpoint (http://fr.dbpedia.org/sparql) , i get totally different results, i.e: a bunch of MilitaryConflict are filled with their corresponding startDate
This does make the question a bit more interesting, but I'd guess that it comes down to different infoboxes being used on the French Wikipedia, which helps to get more of that information into the French DBpedia.
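If you just want one date column regardless of which property supplied it, COALESCE can pick the first bound value. This is a sketch under the assumption that the ontology values should be preferred over the raw ones; ?when is a made-up variable name, and conflicts with several date values will still produce several rows.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>

SELECT ?r ?when
WHERE {
  ?r a dbo:MilitaryConflict .
  OPTIONAL { ?r dbo:startDate ?startDate }
  OPTIONAL { ?r dbo:date ?oDate }
  OPTIONAL { ?r dbp:date ?pDate }
  # first bound value wins: startDate, then dbo:date, then the raw dbp:date
  BIND ( COALESCE ( ?startDate, ?oDate, ?pDate ) AS ?when )
}
LIMIT 2000
```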

Sparql: getting all politicians who ruled a city

I'm new to sparql and I'm trying to understand how to get the resources I need for building a query. I started trying to get all the politicians that ruled a city or a country, and at the moment I could do just the following:
I started by following the links in snorql (in the prefixes) and looking for an entity by adding "politician" at the end. I found one :
PREFIX : <http://dbpedia.org/resource/>
So I wrote http://dbpedia.org/resource/Politician and the resource does exist. I tried to use it in this way:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?thing WHERE {
?thing a :Politician .
?thing dbo:birthPlace dbpedia:Italy.
}
LIMIT 50
Run in virtuoso.
Even if I remove the second line of the SELECT, I have no results. But if I change the first line with: ?thing a dbo:Person. or even if I remove it, I get the people born in Italy. But not just the politicians. A second problem is I don't need the politicians that were born but ruled that place. How or where can I find that kind of "relations/descriptors"? Now I am just googling and copy-pasting some existing examples, but I would like to understand how to look for more specific things.
Thanks in advance
Your first query isn't working because Politician is not part of the default (:) namespace, but instead it is present in DBpedia Ontology namespace (dbo).
So, your query should be:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?thing WHERE {
?thing a dbo:Politician .
?thing dbo:birthPlace dbpedia:Italy.
}
LIMIT 50
To list all politicians who ruled Italy, you would need to know which predicate stands for "ruled". Once you have it, you can construct a query.
To list all predicates present in the database, you can write something like this:
SELECT DISTINCT ?b WHERE {
?a ?b ?c .
}
It will list all predicates, although on a dataset the size of DBpedia it can be slow.
I would recommend browsing through one or two politicians and looking at the predicates they have, to check whether one works for you.
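A narrower variant of the "list all predicates" idea is to count only the predicates that are actually used on politicians. This is just a sketch (the variable names are illustrative, and the public endpoint may cut it off or time out), but it tends to be a quicker way to spot candidates for "ruled" than scanning every predicate in the dataset.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?predicate (COUNT(*) AS ?uses)
WHERE {
  ?politician a dbo:Politician ;
              ?predicate ?value .
}
GROUP BY ?predicate
ORDER BY DESC(?uses)
LIMIT 100
```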

Querying WikiData, difference between p and wdt default prefix

I am new to wikidata and I can't figure out when I should use -->
wdt prefix (http://www.wikidata.org/prop/direct/)
and when I should use -->
p prefix (http://www.wikidata.org/prop/).
in my sparql queries. Can someone explain what each of these mean and what is the difference?
Things in the p: namespace are used to select statements. Things in the wdt: namespace are used to select entities. Entity selection, with wdt:, allows you to simplify or summarize more complex queries involving statement selection.
When you see a p: you are usually going to see a ps: or pq: shortly following. This is because you rarely want a list of statements; you usually want to know something about those statements.
This example is a two-step process showing you all the graffiti in Wikidata:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti p:P31 ?statement . # P31 ("instance of") statements about an entity
?statement ps:P31 wd:Q17514 . # which state something is graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Two different versions of the P31 property are used here, housed in different namespaces. Each version comes with different expectations about how it will connect to other items. Things in the p: namespace connect entities to statements, and things in the ps: namespace connect statements to values. In the example, p:P31 is used to select statements about an entity. The entity will be graffiti, but we do not specify that until the next line, where ps:P31 is used to select the values (objects) of the statements, specifying that those values should be graffiti.
So, that's kind of complicated! The wdt: namespace is supposed to make this kind of query simpler. The example could be rewritten as:
SELECT ?graffiti ?graffitiLabel
WHERE
{
?graffiti wdt:P31 wd:Q17514 . # entities that are graffiti
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This is now one line shorter because we are no longer looking for statements about graffiti, but for graffiti itself. The dual p: and ps: linkages are summarized with a wdt: version of the same P31 property. However, be aware:
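This technique only returns "truthy" statements (that is what the "t" in wdt: stands for): for each property, the statements with the best rank, meaning the preferred-rank statements if there are any and the normal-rank statements otherwise. Deprecated statements are never included.
As a consequence, information available to wdt: is sometimes missing some facts. Often in my experience a p: and ps: query will return a few more results than a wdt: query, because it sees statements of every rank.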
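To see where the difference in result counts comes from, the p: and ps: query can be restricted to the statements that the "truthy" wdt: form is built from. The sketch below assumes it is run on the Wikidata Query Service, where the prefixes are predeclared and best-rank statements are typed as wikibase:BestRank; with that restriction it should return roughly what the wdt: version returns.

```sparql
SELECT ?graffiti ?graffitiLabel
WHERE
{
  ?graffiti p:P31 ?statement .
  ?statement ps:P31 wd:Q17514 ;
             a wikibase:BestRank .   # keep only best-rank ("truthy") statements
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```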
If you go to the Wikidata item page for Barack Obama at https://www.wikidata.org/wiki/Q76 and scroll down, you see the entry for the "spouse" property P26:
Think of the p: prefix as a way to get to the entire white box on the right side of the image.
In order to get to the information inside the white box, you need to dig deeper.
In order to get to the main part of the information ("Michelle Obama"), you combine the p: prefix with the ps: prefix like this:
SELECT ?spouse WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
}
The variable ?s is an abstract statement node (aka the white box).
You can get the same information with only one triple in the body of the query by using wdt::
SELECT ?spouse WHERE {
wd:Q76 wdt:P26 ?spouse .
}
So why would you ever use p:?
You might have noticed that the white box also contains meta information ("start time" and "place of marriage").
In order to get to the meta information, you combine the p: prefix with the pq: prefix.
The following example query returns all the information together with the statement node:
SELECT ?s ?spouse ?time ?place WHERE {
wd:Q76 p:P26 ?s .
?s ps:P26 ?spouse .
?s pq:P580 ?time .
?s pq:P2842 ?place .
}
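One caveat with the query above: because both qualifier patterns are required, a spouse statement that lacks either qualifier would drop out of the results entirely. A hedged variant wraps each qualifier in OPTIONAL, and spells out the prefixes so it can also be run outside the Wikidata Query Service:

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p:  <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?s ?spouse ?time ?place WHERE {
  wd:Q76 p:P26 ?s .                  # statement node for "spouse"
  ?s ps:P26 ?spouse .                # main value of the statement
  OPTIONAL { ?s pq:P580 ?time }      # qualifier: start time, if present
  OPTIONAL { ?s pq:P2842 ?place }    # qualifier: place of marriage, if present
}
```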
They're simply XML namespace prefixes, basically a shortcut for full URIs. So given wdt:Apples, the full URI is http://www.wikidata.org/prop/direct/Apples and given p:fruitType the URI is http://www.wikidata.org/prop/fruitType.
Prefixes/namespaces have no other meaning, they are simply ways to define the name of something with URL format. However conventions, such as defining properties in http://www.wikidata.org/prop/, are useful to separate the meanings of terms, so 'direct' is likely a sub-type of property as well (in this case having to do with wikipedia dumps).
For the specifics, you'd need to hope the authors have exposed some naming convention, or be caught in a loop of "was it p:P51 or p:P15 or maybe wdt:P51?". And may luck be with you because the "semantics" of semantic technology have been lost.