Get all Wikidata items that have more than 10 languages? - sparql

I'm trying to get the most famous movies in the world from Wikidata with SPARQL.
I have the following query:
SELECT ?item WHERE {
?item wdt:P31 wd:Q11424.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
which returns ALL movies (about 214143).
I basically only need movies that have, let's say, more than 10 language editions on Wikipedia, as I'm guessing these will be the most famous ones.
Is there a way to do this inside the query itself, without checking all entries?

A naive answer to your question is:
SELECT ?movie (count(?wikipage) AS ?count) WHERE {
hint:Query hint:optimizer "None" .
?movie wdt:P31 wd:Q11424 .
?wikipage schema:about ?movie .
?wikipage schema:isPartOf/wikibase:wikiGroup "wikipedia"
} GROUP BY ?movie HAVING (?count > 10) ORDER BY DESC(?count)
Alternatively, you could consider the total number of sitelinks. Sitelinks include links to Wikipedia as well as links to Wikiquote, Wikivoyage, etc. The advantage is that the total number of sitelinks is precomputed.
SELECT ?movie ?sitelinks WHERE {
?movie wdt:P31 wd:Q11424 .
?movie wikibase:sitelinks ?sitelinks .
FILTER (?sitelinks > 10)
} ORDER BY DESC(?sitelinks)
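If you want to run the sitelinks query from a script rather than from the query editor, the following is a minimal sketch using only the Python standard library. The helper names (`build_query`, `build_url`, `fetch_movies`), the added `LIMIT`, and the User-Agent string are assumptions made for illustration; the endpoint is the public Wikidata Query Service.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(min_sitelinks: int, limit: int = 100) -> str:
    """Return the sitelinks query from the answer, with a LIMIT added."""
    return (
        "SELECT ?movie ?sitelinks WHERE {\n"
        "  ?movie wdt:P31 wd:Q11424 .\n"
        "  ?movie wikibase:sitelinks ?sitelinks .\n"
        f"  FILTER (?sitelinks > {int(min_sitelinks)})\n"
        "} ORDER BY DESC(?sitelinks)\n"
        f"LIMIT {int(limit)}"
    )


def build_url(query: str) -> str:
    """Encode the query as a GET URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return WDQS_ENDPOINT + "?" + params


def fetch_movies(min_sitelinks: int = 10, limit: int = 100):
    """Run the query (needs network access); return (item IRI, sitelinks) pairs."""
    request = urllib.request.Request(
        build_url(build_query(min_sitelinks, limit)),
        # Hypothetical User-Agent; replace it with one that identifies your tool.
        headers={"User-Agent": "movie-sitelinks-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return [
        (row["movie"]["value"], int(row["sitelinks"]["value"]))
        for row in data["results"]["bindings"]
    ]
```

Calling `fetch_movies(10, 5)` would request the five movies with the most sitelinks; the actual rows depend on the live data, so none are shown here.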
See also these questions:
Get Wikipedia URLs (sitelinks) in Wikidata SPARQL query
Wikidata results sorted by something similar to a PageRank
As @TallTed and @AKSW have pointed out, the number of labels in different languages may differ from the number of Wikipedia articles in different languages. Here is a comparison.
Top 5 movies by Wikipedia articles
| title | articles | sitelinks | labels |
|---------------------|----------|-----------|--------|
| Avatar | 92 | 103 | 99 |
| Titanic | 86 | 100 | 101 |
| The Godfather | 79 | 103 | 82 |
| Slumdog Millionaire | 72 | 75 | 80 |
| Forrest Gump | 71 | 101 | 84 |
Top 5 movies by sitelinks
| title | articles | sitelinks | labels |
|---------------|----------|-----------|--------|
| Avatar | 92 | 103 | 99 |
| The Godfather | 79 | 103 | 82 |
| Forrest Gump | 71 | 101 | 84 |
| Titanic | 86 | 100 | 101 |
| The Matrix | 67 | 94 | 77 |
Top 5 movies by labels
| title | articles | sitelinks | labels |
|------------------------------|----------|-----------|--------|
| The 25th Reich | 2 | 2 | 227 |
| Time Is But Brief | 0 | 0 | 224 |
| Michael Moore in TrumpLand | 6 | 6 | 222 |
| Magnus - The Mozart of Chess | 1 | 1 | 221 |
| Lee Chong Wei | 1 | 1 | 196 |


compare two columns in PostgreSQL show only highest value

This is my table
I'm trying to find which urban area has the highest girls-to-boys ratio.
Thank you in advance for helping me.
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| Ran | 100 | 120 |
| Ran | 110 | 105 |
| dhanr | 80 | 73 |
| dhanr | 140 | 80 |
| mohan | 180 | 73 |
| mohan | 25 | 26 |
This is the query I used, but I did not get the expected results
SELECT urban, Max(allboys) as high_girls,Max(allgirls) as high_boys
from table_urban group by urban
Expected results
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| dhanr | 220 | 153 |
First of all, your expected result doesn't seem correct: the girls-to-boys ratio is highest in "mohan", not in "dhanr", if what you are really looking for is the highest ratio rather than the highest number of girls.
You need to group first and compute the sums, then compute the ratio (divide one by the other), and take the first row.
SELECT foo.urban AS urban, foo.girls::numeric / foo.boys AS ratio -- cast avoids integer division
FROM (
    SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
    FROM table_urban
    GROUP BY urban
) AS foo
ORDER BY ratio DESC
LIMIT 1
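A shorter variant without the subquery:
SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
FROM table_urban
GROUP BY urban
-- PostgreSQL does not accept output aliases inside an ORDER BY expression
ORDER BY SUM(allgirls)::numeric / SUM(allboys) DESC
LIMIT 1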
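To check the sum-then-divide approach against the sample rows from the question, here is a small sketch that loads them into an in-memory SQLite database. SQLite stands in for PostgreSQL here purely so the example is self-contained; the `* 1.0` plays the role that a cast such as `::numeric` would play in PostgreSQL.

```python
import sqlite3

# Sample rows copied from the question: (urban, allgirls, allboys).
ROWS = [
    ("Ran", 100, 120),
    ("Ran", 110, 105),
    ("dhanr", 80, 73),
    ("dhanr", 140, 80),
    ("mohan", 180, 73),
    ("mohan", 25, 26),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_urban (urban TEXT, allgirls INTEGER, allboys INTEGER)"
)
conn.executemany("INSERT INTO table_urban VALUES (?, ?, ?)", ROWS)

# Sum per area first, then divide. Multiplying by 1.0 forces
# floating-point division instead of integer division.
ratios = conn.execute(
    """
    SELECT urban, girls, boys, girls * 1.0 / boys AS ratio
    FROM (
        SELECT urban, SUM(allgirls) AS girls, SUM(allboys) AS boys
        FROM table_urban
        GROUP BY urban
    ) AS totals
    ORDER BY ratio DESC
    """
).fetchall()

for urban, girls, boys, ratio in ratios:
    print(f"{urban}: {girls} girls / {boys} boys = {ratio:.2f}")
```

With these rows the order is mohan (205/99), dhanr (220/153), Ran (210/225), which matches the observation that "mohan" has the highest ratio.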

How to exclude values from output in SPARQL query for Wikidata?

I'm trying to write a SPARQL query using Wikidata Query Service to retrieve all prime ministers of the Netherlands from 1970. However, I want to filter the output by checking if a prime minister worked for a university. If a minister worked for a university, it should not be in the output.
I think I must use the FILTER NOT EXISTS expression, but do not know how to properly write this line. Can someone please help me out?
See below for my query and output:
SELECT ?pmLabel ?start ?companyLabel
WHERE
{
?pm wdt:P39 wd:Q3058109.
?pm p:P39 ?posHeld.
?pm wdt:P108 ?company.
?posHeld ps:P39 wd:Q3058109.
?posHeld pq:P580 ?start.
FILTER(year(?start) > 1970)
# FILTER NOT EXISTS(?company (something) "Universit")
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } # labels
}
ORDER BY DESC(?start)
+----------------------+------------------+------------------------------+
| primeMinisterLabel | start | companyLabel |
+----------------------+------------------+------------------------------+
| Mark Rutte | 14 October 2010 | Unilever |
| Mark Rutte | 14 October 2010 | Calvé |
| Jan Peter Balkenende | 22 July 2002 | Erasmus University Rotterdam |
| Jan Peter Balkenende | 22 July 2002 | Vrije Universiteit Amsterdam |
| Ruud Lubbers | 4 November 1982 | United Nations |
| Ruud Lubbers | 4 November 1982 | Harvard University |
| Ruud Lubbers | 4 November 1982 | Tilburg University |
| Ruud Lubbers | 4 November 1982 | Hollandia |
| Dries van Agt | 19 December 1977 | Kyoto University |
| Dries van Agt | 19 December 1977 | Radboud University Nijmegen |
| Dries van Agt | 19 December 1977 | Kwansei Gakuin University |
| Dries van Agt | 19 December 1977 | Ritsumeikan University |
+----------------------+------------------+------------------------------+
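The table shows why the exclusion has to apply per person rather than per row: dropping only the rows whose employer looks like a university would still leave Ruud Lubbers in the output (United Nations, Hollandia). The sketch below replays that logic in Python on the rows above. The function name and the label-based test are assumptions for illustration, and the embedded `FILTER NOT EXISTS` fragment is an untested sketch of how the same idea might be phrased in the query; matching on the label text is a simplification.

```python
# (prime minister, employer) pairs copied from the result table in the question.
ROWS = [
    ("Mark Rutte", "Unilever"),
    ("Mark Rutte", "Calvé"),
    ("Jan Peter Balkenende", "Erasmus University Rotterdam"),
    ("Jan Peter Balkenende", "Vrije Universiteit Amsterdam"),
    ("Ruud Lubbers", "United Nations"),
    ("Ruud Lubbers", "Harvard University"),
    ("Ruud Lubbers", "Tilburg University"),
    ("Ruud Lubbers", "Hollandia"),
    ("Dries van Agt", "Kyoto University"),
    ("Dries van Agt", "Radboud University Nijmegen"),
    ("Dries van Agt", "Kwansei Gakuin University"),
    ("Dries van Agt", "Ritsumeikan University"),
]

# Untested sketch: the same person-level test written as a SPARQL fragment.
# It uses rdfs:label because labels from the label service cannot be filtered.
NOT_EXISTS_SKETCH = """
FILTER NOT EXISTS {
  ?pm wdt:P108 ?employer .
  ?employer rdfs:label ?employerLabel .
  FILTER(LANG(?employerLabel) = "en" &&
         CONTAINS(LCASE(?employerLabel), "universit"))
}
"""


def exclude_university_staff(rows, needle="universit"):
    """Drop every person who has at least one employer matching `needle`."""
    excluded = {pm for pm, employer in rows if needle in employer.lower()}
    return [(pm, employer) for pm, employer in rows if pm not in excluded]


kept = exclude_university_staff(ROWS)
for pm, employer in kept:
    print(pm, "|", employer)
```

On the rows from the question, only the two Mark Rutte rows remain.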

Get all Countries Name from DBpedia using SPARQL

I want a list of country names from DBpedia.
I am using http://dbpedia.org/snorql/ to execute my query, but so far I have not found all the country names that are available in DBpedia.
For Example : dbr:United_Kingdom, dbr:India, dbr:United_States, etc.
Is the problem
You don't know how to write SPARQL in general? (That's OK, it's hard to get started.)
You don't know about the classes and predicates in DBpedia? (That's OK too. I have to check each time.)
Something else?
This gets all UN member nations. Getting "all countries" is probably just a matter of finding the right class in their ontology.
select distinct ?s
where { ?s a <http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations> }
I happen to know that New York City is the largest city in the United States, and that DBpedia has a largestCity predicate and a New_York_City instance. So I wrote a query that should only get the United States as the subject, and then asked for all connected predicates and objects. You should look in that for an object that meets your expectations for defining "all countries." If you don't find one, you may have to union a few other triple patterns into one query.
I have also restricted the objects to those containing either of two terms that may be relevant for you: "country" or "nation".
select distinct *
where { ?s a <http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations> ;
dbo:largestCity dbr:New_York_City ;
?p ?o
filter(isURI(?o))
filter((regex(lcase(str(?o)), "country")) || (regex(lcase(str(?o)), "nation")))
}
This gives the following, which should help you write a follow-up question that isn't specific to the United States.
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+
| s | p | o |
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=US |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://nationalatlas.gov/ |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/class/yago/Country108544813 |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G7_nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://schema.org/Country |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2000/01/rdf-schema#seeAlso | http://dbpedia.org/resource/Anti-miscegenation_laws |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:Member_states_of_the_United_Nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2002/07/owl#sameAs | http://transparency.270a.info/classification/country/US |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/ontology/Country |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G8_nations |
| http://dbpedia.org/resource/United_States | http://www.w3.org/2002/07/owl#sameAs | http://linked-web-apis.fit.cvut.cz/resource/united_states_of_america_country |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://www.nationalcenter.org/HistoricalDocuments.html |
| http://dbpedia.org/resource/United_States | http://dbpedia.org/ontology/wikiPageExternalLink | http://news.bbc.co.uk/2/hi/americas/country_profiles/1217752.stm |
| http://dbpedia.org/resource/United_States | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://umbel.org/umbel/rc/Country |
| http://dbpedia.org/resource/United_States | http://purl.org/dc/terms/subject | http://dbpedia.org/resource/Category:G20_nations |
+--------------------------------------------+---------------------------------------------------+------------------------------------------------------------------------------+
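One way to act on that table is to pick out the objects of rdf:type, since those are the classes the United States belongs to, and try each as the class in the first query. The sketch below does this on a subset of the rows above; the helper names are hypothetical, and which class best matches "all countries" is left for you to judge from the results.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# (predicate, object) pairs: a subset of the result table above.
PAIRS = [
    (RDF_TYPE, "http://dbpedia.org/class/yago/Country108544813"),
    ("http://purl.org/dc/terms/subject",
     "http://dbpedia.org/resource/Category:G7_nations"),
    (RDF_TYPE, "http://schema.org/Country"),
    ("http://purl.org/dc/terms/subject",
     "http://dbpedia.org/resource/Category:Member_states_of_the_United_Nations"),
    ("http://www.w3.org/2002/07/owl#sameAs",
     "http://transparency.270a.info/classification/country/US"),
    (RDF_TYPE, "http://dbpedia.org/ontology/Country"),
    (RDF_TYPE, "http://dbpedia.org/class/yago/WikicatMemberStatesOfTheUnitedNations"),
    (RDF_TYPE, "http://umbel.org/umbel/rc/Country"),
]


def candidate_classes(pairs):
    """Return the objects of rdf:type, i.e. classes worth trying."""
    return [obj for pred, obj in pairs if pred == RDF_TYPE]


def instances_query(class_iri):
    """Build the 'all instances of this class' query used earlier in the answer."""
    return f"select distinct ?s\nwhere {{ ?s a <{class_iri}> }}"


for class_iri in candidate_classes(PAIRS):
    print(instances_query(class_iri))
    print()
```

Each printed query can be pasted into the SNORQL page to compare how many subjects each class returns.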

Selecting Food Calories from SPARQL

I've started looking into SPARQL and I admit I find it very obscure and difficult.
I need to output the calories for all the foods.
I don't really understand the difference between the wd: wdt: p: and ps:
Similarly I am unclear on when to use P000 or Q000 ?
The best I got so far is:
SELECT ?food ?calories ?foodLabel ?caloriesLabel WHERE {
?food wdt:P31 wd:Q2095.
?food wdt:P31 ?calories.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Which gives : Results
Any suggestion would be welcome
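On the prefix question: each prefix is shorthand for a different IRI namespace, so `wd:` names an entity, `wdt:` links an item directly to a value, `p:` links an item to a statement node, and `ps:` links that statement node to its value. Identifiers starting with P are properties and those starting with Q are items. The sketch below simply expands prefixed names into full IRIs to make the difference visible; the `expand` helper is hypothetical.

```python
# Namespace IRIs behind the prefixes used on the Wikidata Query Service.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",          # an entity (item or property)
    "wdt": "http://www.wikidata.org/prop/direct/",    # item -> value, directly
    "p": "http://www.wikidata.org/prop/",             # item -> statement node
    "ps": "http://www.wikidata.org/prop/statement/",  # statement node -> value
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'wdt:P31' into its full IRI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


for name in ["wd:Q2095", "wdt:P31", "p:P31", "ps:P31"]:
    print(f"{name:10} -> {expand(name)}")
```

Seen this way, the second pattern in the query above (`?food wdt:P31 ?calories`) asks for the same property as the first one, so `?calories` is bound to the class of the food again rather than to an energy value.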
If you are willing to download a 1 GB RDF file and publish it at your own triplestore, this looks promising:
https://datahub.io/dataset/open-food-facts
Specifically, http://en.openfoodfacts.org/data/en.openfoodfacts.org.products.rdf (if English is your preferred language).
I say download and publish because Googling suggests these data aren't already exposed at any public endpoint.
You will likely have to use different classes and predicates compared to Wikidata.
Hey, this is pretty neat!
SELECT *
WHERE
{ GRAPH <http://openfoodfacts.org>
{ ?s a <http://data.lirmm.fr/ontologies/food#FoodProduct> ;
<http://data.lirmm.fr/ontologies/food#name> ?fn ;
<http://data.lirmm.fr/ontologies/food#energyPer100g> ?energy
}
}
LIMIT 9
gives
+------------------------------------------------------------------------------------------------------+---------------------------------------------+--------+
| s | fn | energy |
+------------------------------------------------------------------------------------------------------+---------------------------------------------+--------+
| http://world-en.openfoodfacts.org/product/9400547030446/heinz-big-n-chunky-chicken-corn | "Heinz Big'n Chunky Chicken & Corn" | "245" |
| http://world-en.openfoodfacts.org/product/9400550004847/sour-patch-kids-pascall | "Sour Patch Kids" | "1440" |
| http://world-en.openfoodfacts.org/product/9400550602487/brunch-mixed-berry-bar-cadbury | "Brunch Mixed Berry Bar" | "1810" |
| http://world-en.openfoodfacts.org/product/9400550646276/pascall-family-pack-sweets-fruit-bursts | "Pascall Family Pack Sweets Fruit Bursts" | "1427" |
| http://world-en.openfoodfacts.org/product/9400553011477/choc-thins-griffin-s | "Choc Thins" | "2010" |
| http://world-en.openfoodfacts.org/product/9400553438786/gingernuts-griffin-s | "Gingernuts" | "1810" |
| http://world-en.openfoodfacts.org/product/9400563448614/nice-natural-rosted-nut-bar-chocolate | "Nice & Natural Rosted Nut Bar - Chocolate" | "2060" |
| http://world-en.openfoodfacts.org/product/9400563740589/nut-bar-caramel-cashew-flavour-nice-natural | "Nut Bar - Caramel Cashew Flavour" | "1960" |
| http://world-en.openfoodfacts.org/product/9400563741784/superfruits-cranberry-blueberry-nice-natural | "Superfruits - Cranberry & Blueberry" | "1520" |
+------------------------------------------------------------------------------------------------------+---------------------------------------------+--------+

sparql use subquery as result/variable for outer query

I want to use the result of a nested query as a variable/subgraph for the outer query.
My use case: I have a one-to-many relation between products and offers, and I want to get all (or selected) products together with the offer count, minimum price, and condition from the offer records.
Here is my query
SELECT ?ProductID ?priceMIN ?total ?condition
WHERE {
{
SELECT ?ProductID (COUNT(?cond) AS ?total) (MIN(?pr) AS ?priceMIN ) ?condition WHERE{
?ProductID ^mod:isOfferOf ?oid.
?oid dprop:priceMIN ?pr.
?oid dprop:total ?cond.
BIND (if( ?cond = 1, "New", "Used") AS ?condition).
} GROUP BY ?ProductID ?condition
}
VALUES ?ProductID { prod:Rl5RVl5R prod:Rl5RVl5Q prod:Rl5RVl5W prod:Rl5RVl5Y prod:Rl5RVl5U }
}
and I am getting this data:
ProductID |priceMIN |condition |total
product:Rl5RVl5R | 3267 | Used | 1
product:Rl5RVl5R | 3216 | New | 4
product:Rl5RVl5Y | 327 | New | 1
product:Rl5RVl5Q | 323 | New | 1
product:Rl5RVl5Q | 3268 | Used | 1
product:Rl5RVl5W | 326 | New | 1
product:Rl5RVl5W | 3271 | Used | 4
product:Rl5RVl5U | 325 | New | 2
product:Rl5RVl5U | 3270 | Used | 1
Now I want to assign this value like
product:Rl5RVl5U dprop:newTotal ?total
product:Rl5RVl5U dprop:newMin ?priceMIN
product:Rl5RVl5U dprop:usedTotal ?total
product:Rl5RVl5U dprop:usedMin ?priceMIN
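To make the target shape concrete, the following sketch folds the New/Used rows of each product from the output above into one record with separate `newTotal`, `newMin`, `usedTotal`, and `usedMin` values. It works on the query result in Python rather than inside SPARQL, and the `pivot` helper is a hypothetical name.

```python
# (product, priceMIN, condition, total) rows copied from the output above.
ROWS = [
    ("product:Rl5RVl5R", 3267, "Used", 1),
    ("product:Rl5RVl5R", 3216, "New", 4),
    ("product:Rl5RVl5Y", 327, "New", 1),
    ("product:Rl5RVl5Q", 323, "New", 1),
    ("product:Rl5RVl5Q", 3268, "Used", 1),
    ("product:Rl5RVl5W", 326, "New", 1),
    ("product:Rl5RVl5W", 3271, "Used", 4),
    ("product:Rl5RVl5U", 325, "New", 2),
    ("product:Rl5RVl5U", 3270, "Used", 1),
]


def pivot(rows):
    """Fold the New/Used rows of each product into a single record."""
    products = {}
    for product, price_min, condition, total in rows:
        record = products.setdefault(product, {})
        key = condition.lower()  # "new" or "used"
        record[f"{key}Total"] = total
        record[f"{key}Min"] = price_min
    return products


for product, record in pivot(ROWS).items():
    for prop, value in record.items():
        print(f"{product} dprop:{prop} {value}")
```

A product with no used offers, such as product:Rl5RVl5Y, simply ends up without the `usedTotal` and `usedMin` keys.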