Easiest way to query if a name exists in Wikidata? - sparql

I'm trying to create the simplest possible query to check if a name exists in Wikidata. For example, I just want to see if the common name "Jack Smith" is in Wikidata.
Following the example query in this StackOverflow answer, I created the following SPARQL query (run it):
SELECT distinct ?item ?itemLabel ?itemDescription WHERE{
?item ?label "Jack Smith" .
?article schema:about ?item .
?article schema:inLanguage "en" .
?article schema:isPartOf <https://en.wikipedia.org/>.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10
However, it returns zero results.
On the other hand, if I search for "Jack Smith" through the Wikidata search webpage, I get back 1200+ results with a lot of people named "Jack Smith".
The behavior with the SPARQL interface is inconsistent. When I run a SPARQL search for "Abraham Lincoln" (run it), I correctly get back an entry for the American president by that name.
My specific questions are:
Why is my SPARQL query returning zero results for "Jack Smith"?
What is the simplest API call to check if a name (e.g. Jack Smith) exists in Wikidata? It could be SPARQL or some other REST API.
Thank you.

The query returns zero results because no item matching "Jack Smith" also has an English-language article that is part of the English Wikipedia (https://en.wikipedia.org/).
Every triple pattern in the body of a SPARQL query restricts the results; it is not just a way to introduce new variables.
Look at this query:
SELECT ?person ?dateOfDeath
WHERE {
?person a :Person .
?person :date_of_death ?dateOfDeath .
}
This returns only people who have a date of death (that is, people who are dead), along with that date.
If you want every person, with the date of death where one exists and an empty value for people who are still alive, wrap that pattern in OPTIONAL:
SELECT ?person ?dateOfDeath
WHERE {
?person a :Person .
OPTIONAL {?person :date_of_death ?dateOfDeath }
}
In terms of your second question, I'd try something like this:
SELECT ?boolean
WHERE{
BIND(EXISTS{?item ?label "Jack Smith"} AS ?boolean)
}
This can of course also be issued as, say, a cURL request:
curl https://query.wikidata.org/bigdata/namespace/wdq/sparql -X POST --data 'query=SELECT%20%3Fboolean%0AWHERE%7B%0ABIND%28EXISTS%7B%3Fitem%20%3Flabel%20%22Jack%20Smith%22%7D%20AS%20%3Fboolean%29%0A%20%20%7D'
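The same kind of check can be sketched in Python. This is a minimal sketch under a few assumptions that are not in the answer above: it uses an ASK query instead of BIND(EXISTS ...), it matches rdfs:label with an explicit language tag (Wikidata labels are language-tagged literals, so an untagged "Jack Smith" may fail to match a label), and it assumes the public endpoint https://query.wikidata.org/sparql and the wbsearchentities action of the Wikidata API. The function names are made up for illustration. The code only builds the request URLs; sending them is left to whatever HTTP client you prefer.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # assumed public SPARQL endpoint
WIKIDATA_API = "https://www.wikidata.org/w/api.php"  # assumed MediaWiki API entry point


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def label_exists_query(name: str, lang: str = "en") -> str:
    """Build an ASK query that is true if some item has `name` as its label in `lang`."""
    return f'ASK {{ ?item rdfs:label "{escape_literal(name)}"@{lang} }}'


def sparql_url(query: str) -> str:
    """Return a GET URL that asks the SPARQL endpoint for a JSON response."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def search_url(name: str, lang: str = "en") -> str:
    """Return a wbsearchentities URL; this search covers labels and aliases."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": lang,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


if __name__ == "__main__":
    print(label_exists_query("Jack Smith"))
    print(sparql_url(label_exists_query("Jack Smith")))
    print(search_url("Jack Smith"))
```

An ASK query returns a single boolean rather than rows, which fits a pure existence check. The search URL is closer to what the Wikidata search page does, so it is probably the better choice when partial matches and aliases should count.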

Related

Wikidata Query: Find American authors of children’s fiction

I want to find all children's fiction writers using a Wikidata SPARQL query, but I couldn't figure out how. Can someone help, please? The following is my approach, but I don't think it is the correct way.
SELECT ?item ?itemLabel {
?item wdt:P31 wd:Q5. #find humans
?item wdt:P106 wd: #humans whose occupation is a novelist
[another condition needed] #children's fiction.
SERVICE wikibase:label {bd:serviceParam wikibase:language 'en'.}
} LIMIT 10
There is not one correct way, especially not in Wikidata where not all items of the same kind necessarily have the same properties.
One way would be to find the authors of works that are intended for (P2360) children:
# it’s a literary work (incl. any subclasses)
?book wdt:P31/wdt:P279* wd:Q7725634 .
# the literary work is intended for children
?book wdt:P2360 wd:Q7569 .
# the literary work has an author
?book wdt:P50 ?author .
# the author is a US citizen
?author wdt:P27 wd:Q30 .
Instead of getting all works that belong to the class "literary work" or any of its subclasses, you could decide to use only the class "fiction literature" (Q38072107) instead; with the risk that not all relevant works use this class.
Another way would be to find all authors that have "children’s writer" (Q4853732), or any of its subclasses, as occupation:
?author wdt:P106/wdt:P279* wd:Q4853732 .
?author wdt:P27 wd:Q30 .
As the different ways might find different results, you could use them in the same query, using UNION:
SELECT DISTINCT ?author ?authorLabel
WHERE {
{
# way 1
}
UNION
{
# way 2
}
UNION
{
# way 3
}
SERVICE wikibase:label {bd:serviceParam wikibase:language 'en'.}
}
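To make the UNION template concrete, here is a minimal Python sketch that fills it with the two patterns given in this answer and returns the complete query text. The helper name union_query and the optional limit parameter are assumptions made for illustration; the triple patterns are copied from the answer unchanged.

```python
from textwrap import indent

# Way 1 from the answer: authors of literary works intended for children.
BY_WORK = """\
?book wdt:P31/wdt:P279* wd:Q7725634 .
?book wdt:P2360 wd:Q7569 .
?book wdt:P50 ?author .
?author wdt:P27 wd:Q30 ."""

# Way 2 from the answer: occupation is "children's writer" or one of its subclasses.
BY_OCCUPATION = """\
?author wdt:P106/wdt:P279* wd:Q4853732 .
?author wdt:P27 wd:Q30 ."""


def union_query(patterns, limit=None):
    """Join graph patterns with UNION and wrap them in a SELECT DISTINCT query."""
    groups = "\n  UNION\n".join(
        "  {\n" + indent(pattern, "    ") + "\n  }" for pattern in patterns
    )
    query = (
        "SELECT DISTINCT ?author ?authorLabel\n"
        "WHERE {\n"
        f"{groups}\n"
        "  SERVICE wikibase:label { bd:serviceParam wikibase:language 'en'. }\n"
        "}"
    )
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


if __name__ == "__main__":
    print(union_query([BY_WORK, BY_OCCUPATION], limit=10))
```

Because every branch binds ?author, DISTINCT collapses authors that are found by more than one branch.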

Get all properties, sub-properties and label IDs from Wikidata item

I am trying to write a query that returns all possible information from a wikidata page, as https://www.wikidata.org/wiki/Q1299.
Ideally, I would like to retrieve all the information on that page that is available in English.
So I am trying this query:
SELECT ?wdLabel ?ps_Label ?wdpqLabel ?pq_Label WHERE {
VALUES ?artist {
wd:Q1299
}
?artist ?p ?statement.
?statement ?ps ?ps_.
?wd wikibase:claim ?p;
wikibase:statementProperty ?ps.
OPTIONAL {
?statement ?pq ?pq_.
?wdpq wikibase:qualifier ?pq.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
which works quite fine but I would like to retrieve all wikidata ids on the ps_Label column.
For example here I have Paul McCartney as string and I would like to also have the wikidata ID associated to that item, which is Q2599
has part Paul McCartney start time 1960-01-01T00:00:00Z
has part Paul McCartney end time 1970-01-01T00:00:00Z
has part Paul McCartney object has role singer
has part Paul McCartney object has role instrumentalist
I want something similar to the query below, but I can't merge the two because I am missing some hint about sub-properties here.
SELECT ?propUrl ?propLabel ?valUrl ?valLabel WHERE {
wd:Q1299 ?propUrl ?valUrl.
?property ?ref ?propUrl;
rdf:type wikibase:Property;
rdfs:label ?propLabel.
?valUrl rdfs:label ?valLabel.
FILTER((LANG(?valLabel)) = "en")
FILTER((LANG(?propLabel)) = "en")
}
ORDER BY (?propUrl) (?valUrl)
Thanks a lot for your help!
This isn’t exactly what you’re asking for, but if you really just want all data on a single object, it is far easier and more economical to use the data access provided by https://www.wikidata.org/wiki/Special:EntityData/Q2599.json
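As a rough illustration of that approach, the Python sketch below builds the Special:EntityData URL for an item and walks the returned JSON to list the Q-ids of item-valued statements together with the qualifier properties attached to them. The function names are made up for this example, and SAMPLE is a hand-written fragment that imitates the layout of the JSON (entities, claims, mainsnak, datavalue, qualifiers); it is not real output. Fetching and decoding the document, for example with urllib.request and json, is left out.

```python
ENTITY_DATA = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def entity_data_url(qid):
    """Return the JSON export URL for one entity."""
    return ENTITY_DATA.format(qid=qid)


def item_valued_claims(document, qid):
    """Yield (property_id, value_qid, qualifier_property_ids) per item-valued statement."""
    claims = document["entities"][qid].get("claims", {})
    for prop, statements in claims.items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            value = snak.get("datavalue", {}).get("value")
            # Only entity-valued snaks carry a dict with an "id" key;
            # strings, times and quantities are skipped here.
            if isinstance(value, dict) and "id" in value:
                yield prop, value["id"], sorted(statement.get("qualifiers", {}))


# Hand-written sample that imitates the layout of the JSON; not real output.
SAMPLE = {
    "entities": {
        "Q1299": {
            "claims": {
                "P527": [
                    {
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "P527",
                            "datavalue": {
                                "value": {"entity-type": "item", "id": "Q2599"},
                                "type": "wikibase-entityid",
                            },
                        },
                        "qualifiers": {"P580": [], "P582": []},
                    }
                ]
            }
        }
    }
}

if __name__ == "__main__":
    print(entity_data_url("Q1299"))
    print(list(item_valued_claims(SAMPLE, "Q1299")))
```

The export contains ids rather than labels for property and value items, so turning P527 or Q2599 into readable names still needs a label lookup.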

Wikidata sparql query returns 0 result

I’m new to query languages and linked data so thanks a lot for the help. I also have a similar question about sparql on dbpedia
dbpedia sparql query returns 0 result
I would like to look up all the art movements in wikidata with the associated artists (founder/inventor/creator, known for), date start, date end, country. Here is my query:
PREFIX wdno: <http://www.wikidata.org/prop/novalue/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?art ?artLabel ?start ?end ?countryLabel ?influencebyLabel WHERE {
?art wdt:P31 wd:Q968159 ;
wdt:P571 ?start ;
wdt:P576 ?end;
wdt:P17 ?country ;
wdt:P737 ?influenceby .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
When I run the query for only artLabel, it returns 300-odd results. But only a few movements have a start date, so my result set shrinks when I include the start date, and when I include the remaining properties, very few records have all the information. My question is: how can I generate results in which rows with empty cells are kept instead of discarded?
Also, what’s the difference between this wikidata result and the dbpedia result?
Thanks
Answered in the comments...
Ettore Rizza wrote:
You need to add optional clauses for the properties that are not always filled.
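A minimal Python sketch of that advice, applied to the query from the question: each property gets its own OPTIONAL block, so a movement that lacks one value still appears with an empty cell. The helper name optional_query and the PROPERTIES mapping are assumptions for illustration; the property ids and variable names come from the question.

```python
# Property id -> variable name, taken from the query in the question.
PROPERTIES = {"P571": "start", "P576": "end", "P17": "country", "P737": "influenceby"}


def optional_query(class_qid, properties):
    """Build a query in which every listed property sits in its own OPTIONAL block."""
    variables = " ".join(f"?{name}" for name in properties.values())
    optionals = "\n".join(
        f"  OPTIONAL {{ ?art wdt:{pid} ?{name} . }}" for pid, name in properties.items()
    )
    return (
        f"SELECT ?art ?artLabel {variables} WHERE {{\n"
        f"  ?art wdt:P31 wd:{class_qid} .\n"
        f"{optionals}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }\n'
        "}"
    )


if __name__ == "__main__":
    print(optional_query("Q968159", PROPERTIES))
```

Using one OPTIONAL per property matters here: putting all four properties into a single OPTIONAL block would bind them only for movements that have all four values.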

How to check for a sub-property at all levels expanded from a SPARQL * wildcard?

In Wikidata, I want to find an item's country. Either directly if the item has a country directly, or by climbing up the P131s (located in the administrative territorial entity) until I find a country. Here is the query:
?item wdt:P131*/wdt:P17 ?country.
The query above works fine... except when a sub-division used to belong to another country, like for Q25270 (Prishtina). In such case, the result can be anachronistic. That's what I want to fix.
Great news: in such cases we should only consider the unique P131 (located in the administrative territorial entity) that has no P582 (end time) sub-property attached to it, and the problem is solved!
My question: how to alter my query above to achieve that?
Example: Let's say MyItem is in MyStreet is in MyTown is in MyRegion is in MyCountry, I must make sure that MyStreet, MyTown, and MyRegion do not have a P582 (end time).
(If "sub-property" is not the correct term, please let me know the right term and I will fix the question, thanks!)
An attempt
The query below works in most cases, but unfortunately it has a bug: It finds the wrong country in cases where the current country was also the country in the past (for instance Alsace belonged to France until 1871 then to Germany and currently to France again).
SELECT DISTINCT ?country WHERE {
wd:Q6556803 wdt:P131* ?area .
?area wdt:P17 ?country .
OPTIONAL {
wd:Q6556803 wdt:P131*/p:P131 [
pq:P582 ?endTime; ps:P131/wdt:P131* ?area
] .
} .
FILTER( !BOUND( ?endTime ) ) .
}
Wikidata uses different properties for direct links and links with extra information. So, for the statement "Prishtina is located in the administrative territorial entity Socialist Autonomous Province of Kosovo", there's the simple triple:
wd:Q25270 wdt:P131 wd:Q646035
And the long form with additional information (the end time):
wd:Q25270 p:P131 wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b .
wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b ps:P131 wd:Q646035 ;
pq:P582 "1990-01-01T00:00:00Z"
So, we need to filter out all paths with an end time (pq:P582):
SELECT DISTINCT ?s ?sLabel ?country ?countryLabel {
VALUES ?s {
wd:Q25270
}
?s wdt:P131* ?area .
?area wdt:P17 ?country .
FILTER NOT EXISTS {
?s p:P131/(ps:P131/p:P131)* ?statement .
?statement ps:P131 ?area .
?s p:P131/(ps:P131/p:P131)* ?intermediateStatement .
?intermediateStatement (ps:P131/p:P131)* ?statement .
?intermediateStatement pq:P582 ?endTime .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
limit 50
Here, ?intermediateStatement is a statement with an end time on the path from ?s to a country.
This query does seem to time out if there is more than one value set for ?s. Also, the query does not take into account that there might exist multiple links from an item to an area where one has a timestamp and the other doesn't (both paths will be filtered out).

Get nationality of person using DBPedia and SPARQL

I have the following SPARQL query:
SELECT ?nationalityLabel WHERE {
dbpedia:Henrik_Ibsen dbpedia-owl:nationality ?nationality .
?nationality rdfs:label ?nationalityLabel .
}
I have checked that Henrik Ibsen exists and that he has the nationality ontology/property on him:
http://dbpedia.org/page/Henrik_Ibsen
And this is an ontology:
http://dbpedia.org/ontology/nationality
A very similar query to this listed here works:
https://stackoverflow.com/a/10248653/1680130
The problem I have is that the query doesn't return any result.
If I could get help solving this it would be great.
Summarized solution:
Both answers were great, so I upvoted both, but I went with Joshua's in the end because it pointed out that the dbpedia-owl data is cleaner. The optimal solution, in my opinion:
First check with dbpedia-owl for birth-place:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
If found then get the demonym:
select ?label {
dbpedia:Norway dbpedia-owl:demonym ?label
filter langMatches(lang(?label),"en")
}
If above fails then do the "dirty" query:
SELECT
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
filter langMatches(lang(?nationality),"en")
}
Of course, if "dirty" means the data is correct but less often present, the order might be better the other way around, because people can be born in one country but be nationals of another.
Kristian's answer is right that the property is dbpprop:nationality that Henrik Ibsen has. You're right that there is a dbpedia-owl:nationality property, too, but Henrik Ibsen doesn't have a value for it, unfortunately. The value of dbpprop:nationality that Henrik Ibsen has, though, is a string, which is a literal, and literals cannot be the subjects of triples in RDF, so ?nationality rdfs:label ?nationalityLabel in your query will never match.
The DBpedia ontology data (dbpedia-owl) tends to be cleaner than the dbpprop data, so you might prefer a solution using dbpedia-owl properties that Henrik Ibsen does have. In this case, you might look to dbpedia-owl:birthPlace. Then you could get the name of the country among the birth places:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
}
You might want to narrow the permissible languages:
select ?label {
dbpedia:Henrik_Ibsen
dbpedia-owl:birthPlace
[ a dbpedia-owl:Country ;
rdfs:label ?label ]
filter langMatches(lang(?label),"en")
}
Those queries will produce the name of the country, but if you want the corresponding demonym, you can get the dbpedia-owl:demonym value of the country, if it's available. It's probably best to make the demonym optional, since a cursory investigation suggests that lots of countries in DBpedia don't have a value for it, so the name of the country may be the only option. E.g.,
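select ?name ?demonym {
dbpedia:Henrik_Ibsen dbpedia-owl:birthPlace ?country .
?country a dbpedia-owl:Country ; rdfs:label ?name .
filter langMatches(lang(?name),"en")
# keep the demonym filter inside the optional block; outside it,
# rows with no demonym would be dropped because ?demonym is unbound
optional { ?country dbpedia-owl:demonym ?demonym
filter langMatches(lang(?demonym),"en") }
}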
Two things are wrong with the query:
It's dbpprop:nationality
The label doesn't appear to exist, and unless you make that variable optional, it will eliminate the row altogether. EDIT: *Joshua Taylor's answer reminded me that the label doesn't exist because the dbpprop:nationality value is a literal, which cannot be used as a subject resource; therefore, there will never be a label for the dbpprop:nationality value. Instead, where the data exists, you would use dbpedia-owl:nationality, which you did originally. It just so happens that Henrik_Ibsen has no dbpedia-owl:nationality value associated with him.*
Updated query (updated).
SELECT
#### ?label #### See Edit
?nationality
WHERE {
dbpedia:Henrik_Ibsen dbpprop:nationality ?nationality .
#### OPTIONAL { ?nationality rdfs:label ?label . } #### See Edit.
}