Wikidata limit query to specific property constraints - sparql

I'm trying to list video games along with the data size (P3575). I want to limit the results to only show those in megabytes (Q79735) as a property constraint of data size.
Here is what I have now, which lists everything (mainly Megabytes, Gigabytes, block):
SELECT ?item ?itemLabel ?Size ?dataSizeLabel WHERE {
?item wdt:P31 wd:Q7889.
?item p:P3575 ?nodedataSize.
?nodedataSize psv:P3575 ?valuenodedataSize.
?valuenodedataSize wikibase:quantityAmount ?Size;
wikibase:quantityUnit ?dataSize.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
If I add:
?item wdt:P2305 wd:Q79735
then I get no results.
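A possible approach, offered as a sketch rather than a verified answer: P2305 appears to be a qualifier used on the constraint statements of property pages, not a statement on the video game items, which would explain the empty result. The unit is already reachable through the value node, so it can be constrained there. The query below reuses the variable names from the question and assumes wd:Q79735 is the megabyte item, as stated above; it has not been run against the live endpoint.

```sparql
SELECT ?item ?itemLabel ?Size ?dataSizeLabel WHERE {
  ?item wdt:P31 wd:Q7889 .
  ?item p:P3575 ?nodedataSize .
  ?nodedataSize psv:P3575 ?valuenodedataSize .
  ?valuenodedataSize wikibase:quantityAmount ?Size ;
                     wikibase:quantityUnit ?dataSize .
  # Keep only values whose unit is the megabyte item named in the question.
  FILTER(?dataSize = wd:Q79735)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Writing wd:Q79735 directly as the object of wikibase:quantityUnit should be equivalent; the FILTER form just keeps ?dataSize bound so that ?dataSizeLabel is still filled in.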

Related

SPARQL and counting items

I'm using SPARQL and Wikidata query service to try and determine: The actors that got the Oscar award sorted by the total number of (any) awards they received in decreasing order with the list of all their awards.
So far this is what I have that works but doesn't quite do what the question is asking.
SELECT ?item ?itemLabel
WHERE {
?item wdt:P31 wd:Q5 .
?item wdt:P166 wd:Q103916 .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 1000
And this is what I have that I am trying to get to work, but it is not working yet. I'm new to this and any help is appreciated. I have updated the query below because I've gotten a bit closer:
SELECT ?item ?itemLabel (COUNT (DISTINCT ?year) AS ?count)
WHERE {
?item wdt:P31 wd:Q5 .
?item wdt:P166 wd:Q103916 .
?item p:P585 ?year .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?item ?itemLabel
ORDER BY ?count
LIMIT 100
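One way this could be approached, as an untested sketch: P585 is normally attached to the award statement as a qualifier, so ?item p:P585 ?year probably matches nothing useful on the person. Counting the values of wdt:P166 instead gives the number of distinct awards, and GROUP_CONCAT can collect their names into a list. The variable names ?award, ?awardName and ?awards are introduced here for illustration.

```sparql
SELECT ?item ?itemLabel
       (COUNT(DISTINCT ?award) AS ?count)
       (GROUP_CONCAT(DISTINCT ?awardName; separator=", ") AS ?awards)
WHERE {
  ?item wdt:P31 wd:Q5 .
  ?item wdt:P166 wd:Q103916 .   # same award filter as in the question
  ?item wdt:P166 ?award .       # every award the person has received
  # English award names, fetched explicitly so they can be aggregated.
  # OPTIONAL keeps awards without an English label in the count.
  OPTIONAL {
    ?award rdfs:label ?awardName .
    FILTER(LANG(?awardName) = "en")
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?item ?itemLabel
ORDER BY DESC(?count)
LIMIT 100
```

Note that DISTINCT ?award counts an award once even if it was received several times; counting the p:P166 statement nodes instead would count each win separately.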

How to get only the first value from an optional property?

As with the SQL aggregates MAX, MIN or FIRST, I want only one value per item, without duplicated rows.
Real Wikidata case
Here the OPTIONAL clause expands the result from 253 to 257 rows:
# Countries and its codes
SELECT ?code ?item ?itemLabel ?osmId
WHERE
{
?item wdt:P297 ?code.
OPTIONAL{?item wdt:P402 ?osmId .}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code
I need only one (any) osmId. How to do something like FIRST{OPTIONAL{?item wdt:P402 ?osmId .}} ?
NOTES:
it is not a duplicate of How to get only the most recent value from a Wikidata property?
it is not a duplicate of Why does this Wikidata SPARQL query only work for the first element in a list?
... neither is exactly what is needed for a simple "any first" value.
Here is a community wiki answer (please edit it to improve it!)
# Countries and its codes
SELECT ?code ?item ?itemLabel
(MAX(?osmId) as ?osmId_max) (COUNT(?code) as ?osmId_n)
WHERE
{
?item wdt:P297 ?code.
OPTIONAL{?item wdt:P402 ?osmId .}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code ?item ?itemLabel
ORDER BY ?code
The COUNT(?code) is only there to flag the rows where osmId is not a unique ID.
Is there another simple solution that keeps only the first value?
Using SAMPLE
As @ValerioCocchi suggested, we can use SAMPLE instead of MAX:
SELECT ?code ?item ?itemLabel (SAMPLE(?osmId) as ?osmId_sample)
WHERE
{
?item wdt:P297 ?code.
OPTIONAL{?item wdt:P402 ?osmId .}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code ?item ?itemLabel
ORDER BY ?code
SAMPLE uses slightly less CPU time, but the main reason to use it is that you don't care which value is returned. In Wikidata this applies when the property value is supposed to be unique but a few items carry erroneous extra values that you can ignore.
NOTE about the osmId: the advantage of MAX in this particular query, where the numeric ID follows a temporal sequence, is that it may return a "fresher" ID. But in OpenStreetMap (OSM) the strategy can be the inverse: the oldest ID is the most stable one. So SAMPLE also makes sense when you don't know which strategy is better.
Using FILTER
@StanislavKralin's suggestion:
SELECT ?code ?item ?itemLabel ?osmId
WHERE
{
?item wdt:P297 ?code.
OPTIONAL{
?item wdt:P402 ?osmId
FILTER NOT EXISTS {
?item wdt:P402 ?osmId, ?osmId_ .
FILTER (?osmId_ > ?osmId)
}
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code
Seems more verbose.

Timeout when retrieving results from Wikidata

This query for retrieving movie posters from Wikidata results in "Query timeout limit reached" errors on query.wikidata.org:
#defaultView:ImageGrid
SELECT ?item ?itemLabel ?pic ?fileTitle ?width ?height
WHERE
{
?item wdt:P31/wdt:P279* wd:Q11424 .
?item wdt:P3383 ?pic .
BIND(STRAFTER(wikibase:decodeUri(STR(?pic)), "http://commons.wikimedia.org/wiki/Special:FilePath/") AS ?fileTitle)
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
}
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:endpoint "commons.wikimedia.org";
wikibase:api "Generator";
wikibase:limit "once";
mwapi:generator "allpages";
mwapi:gapfrom ?fileTitle;
mwapi:gapnamespace 6; # NS_FILE
mwapi:gaplimit 1;
mwapi:prop "imageinfo";
mwapi:iiprop "dimensions".
?size wikibase:apiOutput "imageinfo/ii/#size".
?width wikibase:apiOutput "imageinfo/ii/#width".
?height wikibase:apiOutput "imageinfo/ii/#height".
}
}
ORDER BY ?item
LIMIT 100
OFFSET 0
The query works with a low limit (fewer than 100 results) and with ORDER BY removed. However, ORDER BY is necessary to ensure that subsequent queries retrieve all results, and adding ORDER BY ?item causes the timeout error again. Ideally, I would like to keep only posters with a minimum width, but adding FILTER(?width>1500) times out no matter how low the limit is.
Any suggestions for optimising the query to work consistently and reliably? Even if it only returns a single value, the query can be repeated until all values have been retrieved.
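A possible direction, sketched under assumptions and not verified against the live endpoint: do the ordering and paging in a subquery, so that the MediaWiki API service is only called for the current slice of items rather than for every film with a poster. The subquery structure and the xsd:integer cast are additions for illustration; the service parameters are copied unchanged from the query above.

```sparql
#defaultView:ImageGrid
SELECT ?item ?itemLabel ?pic ?fileTitle ?width ?height
WHERE
{
  # Page through the items first; only this slice reaches the API call.
  {
    SELECT ?item ?pic WHERE {
      ?item wdt:P31/wdt:P279* wd:Q11424 .
      ?item wdt:P3383 ?pic .
    }
    ORDER BY ?item
    LIMIT 100
    OFFSET 0
  }
  BIND(STRAFTER(wikibase:decodeUri(STR(?pic)), "http://commons.wikimedia.org/wiki/Special:FilePath/") AS ?fileTitle)
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "commons.wikimedia.org";
                    wikibase:api "Generator";
                    wikibase:limit "once";
                    mwapi:generator "allpages";
                    mwapi:gapfrom ?fileTitle;
                    mwapi:gapnamespace 6; # NS_FILE
                    mwapi:gaplimit 1;
                    mwapi:prop "imageinfo";
                    mwapi:iiprop "dimensions".
    ?width wikibase:apiOutput "imageinfo/ii/@width".
    ?height wikibase:apiOutput "imageinfo/ii/@height".
  }
  # Applied after paging, so a page may return fewer than 100 rows.
  # The cast is a precaution in case the API output arrives as a string.
  FILTER(xsd:integer(?width) > 1500)
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en"
  }
}
```

Increasing OFFSET by 100 on each run would walk through the whole set. Whether the engine actually evaluates the subquery before the service call depends on the query optimizer, so this may still time out.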

How to find items with no value for a property

With the Wikidata Query Service (which I am new to), I am trying to find items that have no value for a property. In this case, I am looking for instances of (P31) humans (Q5) with no sex or gender (P21). My code is really basic:
SELECT ?item ?itemLabel WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
?item wdt:P21 wd:Q6581072.
?item wdt:P31 wd:Q5.
}
LIMIT 100
Line 3 restricts it to finding things with female as the sex or gender. What could I replace it with that would make it only find things with no value for P21? The guides that I've found and a bit of googling don't seem to have stuff about looking for things without a value for a given property.
As discussed in the comments...
SELECT ?item ?itemLabel
WHERE
{
SERVICE wikibase:label
{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
FILTER NOT EXISTS { ?item wdt:P21 ?val }
?item wdt:P31 wd:Q5
}
LIMIT 100
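For comparison, the same condition can also be written with MINUS, which removes every solution that has any P21 value. This is a sketch of an alternative formulation, not a claim that it performs better on the Wikidata Query Service.

```sparql
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q5 .
  # Drop every human that has at least one sex-or-gender value.
  MINUS { ?item wdt:P21 ?val }
  SERVICE wikibase:label
  { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 100
```

Both forms should give the same items here, because ?item is shared between the main pattern and the negated one.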

Filter by type in Wikidata

This SPARQL request looks for all cities called "Berlin" in Wikidata:
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?type (a | wdt:P279) wd:Q515. # Sub-type of city
?item wdt:P31 ?type.
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
PROBLEM: It returns zero results.
Meanwhile, the request below correctly finds Q64 (capital and city-state of Germany), but it also returns a lot of other things called Berlin, so I want to filter on cities (then in a future phase I will order these cities by population, but that is outside the scope of this question):
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
?item rdfs:label "Berlin"@en.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note: My code for getting instances of subclasses of city (Berlin is a big city which is subclass of city) seems to work correctly, as illustrated by the results of this query.
It was a Wikidata bug.
According to Wikidata's Jura1, it was a bug in Wikidata caused by someone's experiments with "preferred rank".
Discussion at https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2016/09#P31_inconsistency
The bug has been fixed just now.
You can only query for data that is contained in the dataset.
If you try an alternative of your query
SELECT DISTINCT ?item ?itemLabel ?itemDescription ?type1 ?type2 WHERE {
?item rdfs:label "Berlin"@en.
optional{?item rdf:type ?type1 }
optional{?item wdt:P279 ?type2 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
it returns no types, neither connected by rdf:type nor wdt:P279.
If you look at the entity for the capital and city-state Berlin, you can see that it has "instance of" information, but that property is https://www.wikidata.org/wiki/Property:P31. None of its values links to wd:Q515, so I wonder where you got this idea.
But to be honest, I don't know that much about Wikidata and to me, it's not clear why no rdf:type is used, but a common pattern for RDF datasets is to use
?s rdf:type/rdfs:subClassOf* SUPER_CLASS .
if we assume that there is rdf:type information available.
If you check the types wd:Q64 is an instance of
SELECT DISTINCT ?type ?typeLabel WHERE {
wd:Q64 (a | wdt:P31) ?type.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?typeLabel
None of them are City (wd:Q515) or a sub-class of it.
Looks like a data issue. Perhaps you should contact Wikidata.
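For reference, the rdf:type/rdfs:subClassOf* pattern mentioned above translates to wdt:P31/wdt:P279* in Wikidata's vocabulary. The query below is a sketch of the original request written that way; whether it returns Q64 depends on the "instance of" data being correct at the time it is run.

```sparql
SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
  ?item rdfs:label "Berlin"@en .
  # Instance of city (Q515) or of any direct or indirect subclass of it.
  ?item wdt:P31/wdt:P279* wd:Q515 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Because P279* also matches zero steps, this covers items typed directly as city as well as items typed as one of its subclasses, which the single-step (a | wdt:P279) pattern in the question does not.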