Wikidata Query Service - Incomplete Result: Wikipedia URL missing - sparql

I am using the Wikidata Query Service (https://query.wikidata.org) to obtain the (former) spouses of Daniel Craig with this query:
PREFIX schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?s ?sLabel ?s_wpurl1 ?s_wpurl2 WHERE {
?s wdt:P26 wd:Q4547 .
OPTIONAL {
?s_wpurl1 schema:about ?s .
?s_wpurl1 schema:inLanguage "en" .
FILTER (SUBSTR(str(?s_wpurl1), 1, 25) = "https://en.wikipedia.org/")
} .
OPTIONAL {
?s_wpurl2 schema:about ?s .
?s_wpurl2 schema:inLanguage "de" .
FILTER (SUBSTR(str(?s_wpurl2), 1, 25) = "https://de.wikipedia.org/")
} .
SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
}
}
As expected, the result set consists of two results: Q134077 (Rachel Weisz) and Q62510 (Heike Makatsch).
But the requested en.wikipedia.org and de.wikipedia.org article URLs are only listed for Rachel Weisz, although the respective articles exist for Heike Makatsch and are linked to the item representing her (see https://www.wikidata.org/wiki/Q62510).
The URLs for that item are also missing from other queries (where other items' URLs are listed), so the problem is with the item and not with the query.
Why are Wikipedia article URLs missing in a SPARQL result set for an item where this information exists?
Update:
Now (a day later) the service is returning the URLs.
Does anybody know when the service is expected to leave beta status and to become fully reliable?

The query service was missing some sitelinks because of faulty data loading. This has been fixed, so the links should now be returned.

Related

Wikidata COUNT(*) query times out

I have a straightforward query that counts how many humans have an English Wikipedia page.
prefix schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?item ?article
WHERE
{
?item wdt:P31 wd:Q5 . # Must be of a human
?article schema:about ?item ; # Must have a Wikipedia article
schema:inLanguage "en" ; # Article must be in English
schema:isPartOf <https://en.wikipedia.org/> . # Wikipedia article must be regular article
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } # Helps get the label in your language, if not, then en language
}
I get expected output as follows:
wd:Q11124 <https://en.wikipedia.org/wiki/Stephen_Breyer>
wd:Q10727 <https://en.wikipedia.org/wiki/Steve_Leo_Beleck>
wd:Q10065 <https://en.wikipedia.org/wiki/Taichang_Emperor>
wd:Q9605 <https://en.wikipedia.org/wiki/Sarah_Allen_(software_developer)>
However, if I change the SELECT statement from
SELECT ?item ?article
to
SELECT (count(?item) as ?count)
I get a timeout error. Please note that the count works if I specify only the "human" condition and leave out the English Wikipedia article condition. So, clearly, some kind of background join is causing the query to time out.
However, this is a fairly trivial join, so the timeout is surprising.
Please let me know what I may be missing here.
Thanks!

getting labels from Wikidata in graphDB

I have a list of art styles in GraphDB, and I am trying to use the SERVICE keyword to get their labels from Wikidata with this query:
PREFIX gp: <http://www.semanticweb.org/kandd/group76/final_project#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?movement ?label
WHERE{
?artist gp:hasArtStyle ?movement.
SERVICE <https://query.wikidata.org/sparql>{
?movement rdfs:label ?label .
FILTER (langMatches( lang(?label), "EN" ) )
}
}
Note that gp is a namespace that only exists in my graph, not anywhere on the internet. Also note that ?movement is bound to valid Wikidata URIs such as http://www.wikidata.org/entity/Q186030.
Yet the response I get is:
Error 500: error
Query evaluation error: org.eclipse.rdf4j.query.QueryEvaluationException: org.eclipse.rdf4j.query.QueryEvaluationException: java.io.IOException: Unkown record type: 83 (HTTP status 500)
What am I doing wrong?
Remember that your query is evaluated from the inside out, meaning that the SERVICE part is handled first, and only then the part that uses your own property.
Currently, your query against Wikidata is very general. You ask for everything that has an rdfs:label and then filter for the English labels among everything it returns.
Given this, my guess is that your query simply times out. Instead, I would try something like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT *
WHERE{
SERVICE <https://query.wikidata.org/sparql>{
?artist wdt:P101 wd:Q186030 ; #Field of Work is contemporary art
wdt:P31 wd:Q5 ; #instance of Human
rdfs:label ?name . #get the label
FILTER (langmatches(lang(?name), "en"))
}
}
If I try this in GraphDB, it returns 156 results.
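One way to keep the remote part narrow while still starting from your own data is to fetch the movement IRIs locally first and then send only those IRIs to Wikidata in a VALUES clause. The following is a minimal Python sketch of that idea, not a tested GraphDB recipe: the helper name build_label_query is hypothetical, and it assumes the list of IRIs has already been obtained from the local graph by a separate query.

```python
def build_label_query(movement_iris, lang="en"):
    """Return a federated SPARQL query that asks Wikidata for labels of the given IRIs only."""
    for iri in movement_iris:
        # Reject characters that are not allowed inside a SPARQL IRI reference.
        if any(ch in iri for ch in '<>"{}|^`\\ '):
            raise ValueError(f"not a safe IRI: {iri!r}")
    values = " ".join(f"<{iri}>" for iri in movement_iris)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?movement ?label WHERE {\n"
        "  SERVICE <https://query.wikidata.org/sparql> {\n"
        f"    VALUES ?movement {{ {values} }}\n"
        "    ?movement rdfs:label ?label .\n"
        f'    FILTER (langMatches(lang(?label), "{lang}"))\n'
        "  }\n"
        "}"
    )


# The IRI below is the example given in the question.
query = build_label_query(["http://www.wikidata.org/entity/Q186030"])
print(query)
```

With the VALUES clause in place, the remote endpoint only has to look up labels for the listed items instead of scanning every rdfs:label it holds.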

SPARQL wikidata query: Obtain number of languages that associated wikipedia article is available in

Is it possible to use SPARQL on wikidata to extract the number of languages of a wikipedia article associated with a wikidata item?
I am new to SPARQL and wikidata. I have tried to find examples online, but no luck so far. I am able to extract the url for the wikipedia article. But I wonder if it is possible to count the number of languages for which the url exists. Here is my code so far:
prefix schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?museum ?museumLabel ?article WHERE {
?museum wdt:P17 wd:Q39; # countries
wdt:P31 ?type.
?type (wdt:P279*) wd:Q207694.
# If available, get the "en" entry, use native language as fallback:
SERVICE wikibase:label { bd:serviceParam wikibase:language "en,de". }
# get wikipedia article in english
OPTIONAL {
?article schema:about ?museum .
?article schema:inLanguage "en" .
FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/")
}
}
order by ?museum
I would like to add another column to my output that gives me the number of languages a wikipedia article exists in.
PREFIX schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?museum ?museumLabel (COUNT(DISTINCT ?article) as ?wiki_article_count) WHERE {
?museum wdt:P17 wd:Q39; # countries
wdt:P31 ?type.
?type (wdt:P279*) wd:Q207694.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en,de". }
OPTIONAL {
?article schema:about ?museum .
}
}
GROUP BY ?museum ?museumLabel
ORDER BY DESC( ?wiki_article_count )
Building on the query given in the question, one can obtain the number of linked wiki pages (I am not sure about the correct nomenclature).
Please try it in the Wikidata Query Service here: https://w.wiki/334g
For example, the Wikidata item https://www.wikidata.org/wiki/Q194626 (Kunstmuseum Basel) has 22 linked wiki pages: 21 Wikipedia language editions and one multilingual site.
I would be thankful for any comments on this answer and will improve it accordingly.
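Because the count above includes every linked page, not only Wikipedia language editions, it may help to separate the two groups after fetching the article URLs. The following Python sketch shows one way to do that on the client side. The function name split_sitelinks and the sample URLs are made up for illustration and are not real query output.

```python
from urllib.parse import urlparse


def split_sitelinks(article_urls):
    """Separate Wikipedia language editions from other linked wiki sites."""
    wikipedia_langs = set()
    other_sites = set()
    for url in article_urls:
        host = urlparse(url).hostname or ""
        if host.endswith(".wikipedia.org"):
            # "de.wikipedia.org" -> "de"
            wikipedia_langs.add(host[: -len(".wikipedia.org")])
        else:
            other_sites.add(host)
    return wikipedia_langs, other_sites


# Hypothetical sample, not real query output.
sample = [
    "https://en.wikipedia.org/wiki/Kunstmuseum_Basel",
    "https://de.wikipedia.org/wiki/Kunstmuseum_Basel",
    "https://commons.wikimedia.org/wiki/Category:Kunstmuseum_Basel",
]
langs, others = split_sitelinks(sample)
print(len(langs), sorted(others))
```

The number of entries in the first set is then the number of Wikipedia languages, and the second set holds the remaining sites such as multilingual projects.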

SPARQL Federated Query Not Returning All Solutions

This is an evolution of this question.
Basically I am having trouble getting all the solutions to a SPARQL query from a remote endpoint. I have read through section 2.4 here because it seems to describe a situation almost identical to mine.
The idea is that I want to filter my results from DBPedia based on information in my local RDF graph. The query is here:
PREFIX ns1: <http://www.semanticweb.org/caeleanb/ontologies/twittermap#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT *
WHERE {
?p ns1:displayName ?name .
SERVICE <http://dbpedia.org/sparql> {
?s rdfs:label ?name .
?s rdf:type foaf:Person .
}
}
And the only result I get is dbpedia:John_McCain (for ?s). I think this is because John McCain is the only match in the first 'x' results, but I can't figure out how to get the query to return all matches. For example, if I add a filter like:
SERVICE <http://dbpedia.org/sparql> {
?s rdfs:label ?name .
?s rdf:type foaf:Person .
FILTER(?name = "John McCain"@en || ?name = "Jamie Oliver"@en)
}
Then it correctly returns BOTH dbpedia:Jamie_Oliver and dbpedia:John_McCain. There are dozens of other matches like Jamie Oliver that do not come through unless I specifically add it to a Filter like this.
Can someone explain a way to extract the rest of the matches? Thanks.
It looks like the cause of this issue is that the SERVICE block is attempting to pull all foaf:Persons from DBPedia, and then filter them based on my local Stardog db. Since there is a 10,000 result limit when querying DBPedia, only matches which occur in that set of 10,000 arbitrary Persons will be found. To fix this, I wrote a script to put together a FILTER block containing every string name in my Stardog db and attached it to the SERVICE block to filter remotely and thereby avoid hitting the 10,000 result limit. It looks something like this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ns1: <http://www.semanticweb.org/caeleanb/ontologies/twittermap#>
CONSTRUCT {
?s rdf:type ns1:Person ,
ns1:Politician .
}
WHERE {
?s rdfs:label ?name .
?s rdf:type dbo:Politician .
FILTER(?name IN ("John McCain"@en, ...))
}
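The script that assembles the FILTER block is not shown above. A minimal Python sketch of how such a block could be generated is given here. The function names sparql_literal and build_name_filter are hypothetical, and the sketch assumes the names have already been read from the local store and that all of them are English labels.

```python
def sparql_literal(text, lang="en"):
    """Quote text as a SPARQL string literal with a language tag."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}'


def build_name_filter(names, var="?name", lang="en"):
    """Return a FILTER clause restricting var to the given names."""
    literals = ", ".join(sparql_literal(n, lang) for n in names)
    return f"FILTER({var} IN ({literals}))"


print(build_name_filter(["John McCain", "Jamie Oliver"]))
# prints: FILTER(?name IN ("John McCain"@en, "Jamie Oliver"@en))
```

Escaping backslashes and quotes matters here, because a single name containing a double quote would otherwise break the whole generated query.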

Returning properties from dbpedia Virtuoso

When I look at the HTML page: http://dbpedia.org/page/Bill_Nye I can see a lot of properties that are not returned in the following simple query from the Virtuoso page (http://pt.dbpedia.org/sparql):
prefix foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?s ?p WHERE {
?e foaf:name "Bill Nye"@en .
?e ?s ?p.
}
No results return when I try to access one of the properties I can see on this page- say foaf:depiction:
prefix foaf: <http://xmlns.com/foaf/0.1/>
SELECT $depiction WHERE {
?s foaf:name "Bill Nye"@en.
?e foaf:depiction ?depiction
}
When I run them via the SPARQL endpoint at http://dbpedia.org/sparql, after encoding
SELECT ?s ?p WHERE { ?e foaf:name "Bill Nye"@en.?e ?s ?p. }
I get
http://dbpedia.org/sparql?query=SELECT%20%3Fs%20%3Fp%20WHERE%20%7B%20%3Fe%20foaf%3Aname%20%22Bill%20Nye%22@en.%3Fe%20%3Fs%20%3Fp.%20%7D&format=json
and a result that looks like all the properties shown at http://dbpedia.org/page/Bill_Nye. I would love an explanation of the difference. Is it simply the Virtuoso interface, or something more? I'm pretty new to the semantic web, so please be gentle.
Please note that you are sending the queries to two different Virtuoso installations:
pt.dbpedia.org/sparql: the endpoint of the Portuguese-language international chapter (DBpedia Portuguese).
dbpedia.org/sparql: the main SPARQL endpoint for DBpedia, containing data from multiple languages, but in an English-centric way.
You will also get different results from es.dbpedia.org, it.dbpedia.org, el.dbpedia.org, etc.
The internationalization (I18n) chapters do not load exactly the same data sets as the main DBpedia. Please see:
http://mappings.dbpedia.org/index.php/DBpedia_datasets
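To compare the two installations directly, the same query can be sent to both endpoints with identical parameters. The Python sketch below only builds the request URLs and does not contact either server. The helper name request_url is hypothetical, and the format=json parameter is taken from the URL in the question.

```python
from urllib.parse import urlencode

QUERY = (
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
    'SELECT ?s ?p WHERE { ?e foaf:name "Bill Nye"@en . ?e ?s ?p . }'
)


def request_url(endpoint, query, fmt="json"):
    """Build a GET URL for a SPARQL endpoint, percent-encoding the query."""
    return endpoint + "?" + urlencode({"query": query, "format": fmt})


main_url = request_url("http://dbpedia.org/sparql", QUERY)
pt_url = request_url("http://pt.dbpedia.org/sparql", QUERY)
print(main_url)
print(pt_url)
```

Letting urlencode do the escaping also encodes the @ of the language tag and any # that appears in a query, which is easy to get wrong by hand.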
I believe the following query (ordered) will enable you to more easily identify the missing results --
prefix foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?s ?p ?o WHERE
{
?s foaf:name "Bill Nye"@en .
?s ?p ?o.
}
ORDER BY ?p
Here you can easily see that all the missing results have predicates of the form
is ********** of
These are just eye candy in the human-readable HTML view. They represent inverse relationships, that is, triples in which http://dbpedia.org/resource/Bill_Nye is the object of some other subject:
<something else> ?p <http://dbpedia.org/resource/Bill_Nye>
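The distinction between the two directions can be illustrated with a toy example in Python. The triples below are invented for illustration and are not real DBpedia data. The first function mirrors the pattern used in the query above, and the second mirrors the inverse pattern that the HTML page renders as "is ... of".

```python
BILL = "http://dbpedia.org/resource/Bill_Nye"

# Hypothetical triples for illustration only; not real DBpedia data.
TRIPLES = [
    (BILL, "foaf:name", "Bill Nye"),
    (BILL, "dbo:birthPlace", "http://example.org/SomePlace"),
    ("http://example.org/SomeShow", "dbo:presenter", BILL),
]


def outgoing(resource, triples):
    """Pairs matched by the pattern <resource> ?p ?o."""
    return [(p, o) for s, p, o in triples if s == resource]


def incoming(resource, triples):
    """Pairs matched by the pattern ?s ?p <resource>."""
    return [(s, p) for s, p, o in triples if o == resource]


print(outgoing(BILL, TRIPLES))
print(incoming(BILL, TRIPLES))
```

A query that only binds the resource in subject position returns the first list. The entries of the second list appear on the page but need a separate pattern with the resource in object position.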