How to get the Wikidata ID (Q…) using just a label? - sparql

Is there any way to get the Wikidata Q identifier of an entity using just the label as a string?
For example, "Java":
Q251
I want to search only the English labels.

PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?qid
WHERE {
# make input string into a language-tagged string
BIND( STRLANG("Java", "en") AS ?label ) .
# find all items that have this language-tagged string as a label
?item rdfs:label ?label .
# extract the last path segment of the URI
BIND(STRAFTER(STR(?item), STR(wd:)) AS ?qid) .
}
If you can append @en to the input string ("Java"@en), you can use:
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?qid
WHERE {
?item rdfs:label "Java"@en .
BIND(STRAFTER(STR(?item), STR(wd:)) AS ?qid) .
}
If you are fine with the URI (e.g., http://www.wikidata.org/entity/Q251) instead of the QID (e.g., Q251) as result, you can use:
SELECT DISTINCT ?item
WHERE {
?item rdfs:label "Java"@en .
}
If you want to ignore case, you can use UCASE or LCASE in a FILTER.
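When the label comes from user input, the query text has to be assembled in client code. The following is a minimal Python sketch of that step, not an official client: `escape_literal` and `build_qid_query` are hypothetical helper names, and the case-insensitive branch assumes that comparing `LCASE(STR(?label))` is acceptable for your data. That branch has to inspect every label, so it may be slow or time out on a large public endpoint.

```python
import re


def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_qid_query(label: str, lang: str = "en", ignore_case: bool = False) -> str:
    """Return a SPARQL query that looks up QIDs by label (hypothetical helper)."""
    # Accept only simple language tags such as "en" or "zh-min-nan".
    if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", lang):
        raise ValueError(f"suspicious language tag: {lang!r}")
    if ignore_case:
        # Compare lower-cased lexical forms; LANG() restricts the match to one language.
        match = (
            "  ?item rdfs:label ?label .\n"
            f'  FILTER(LANG(?label) = "{lang}" && LCASE(STR(?label)) = "{escape_literal(label.lower())}")\n'
        )
    else:
        # Exact match on a language-tagged literal, as in the queries above.
        match = f'  ?item rdfs:label "{escape_literal(label)}"@{lang} .\n'
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?qid\n"
        "WHERE {\n"
        + match
        + "  BIND(STRAFTER(STR(?item), STR(wd:)) AS ?qid) .\n"
        "}\n"
    )


if __name__ == "__main__":
    print(build_qid_query("Java"))
```

The returned string can be sent to the endpoint with whatever HTTP or SPARQL client you already use.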

Related

SPARQL query for specific information

I am struggling a lot to create some SPARQL queries. I need 3 specific things, and this is what I have so far:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
select distinct ?title ?author ?country ?genre ?language
where {
?s rdf:type dbo:Book;
dbp:title ?title;
dbp:author ?author;
dbp:country ?country;
dbp:genre ?genre;
dbp:language ?language.
}
This query gives me a list of all books. What I really need is the ability to add some filters to it. There are 3 things I want to filter by:
specific title name (e.g., search for title with "harry potter")
specific author name (e.g., search for author with "J. K. Rowling")
specific genre (e.g., search for genre with "adventure")
I've been struggling with this for too long and I simply cannot define these 3 queries. I am trying to implement a function that will execute a SPARQL statement using parameters passed from a user form. I found a few examples here and on the web, but I just cannot build these 3 specific queries.
As noted, not every book has every property, and some of your properties may not exist at all. For instance, I changed dbp:genre to dbo:literaryGenre, based on the description of Harry Potter and the Goblet of Fire. See query form, and results.
SELECT *
WHERE
{ ?s rdf:type dbo:Book .
?s rdfs:label ?bookLabel .
FILTER(LANGMATCHES(LANG(?bookLabel), 'en'))
?s dbo:author ?author .
?author rdfs:label ?authorLabel .
FILTER(LANGMATCHES(LANG(?authorLabel), 'en'))
?authorLabel bif:contains "Rowling"
OPTIONAL { ?s dbp:country ?country .
?country rdfs:label ?countryLabel .
FILTER(LANGMATCHES(LANG(?countryLabel), 'en')) }
OPTIONAL { ?s dbo:literaryGenre ?genre .
?genre rdfs:label ?genreLabel .
FILTER(LANGMATCHES(LANG(?genreLabel), 'en')) }
OPTIONAL { ?s dbp:language ?language .
?language rdfs:label ?languageLabel .
FILTER(LANGMATCHES(LANG(?languageLabel), 'en')) }
}
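Since the question asks for a function that takes form parameters, here is a rough Python sketch of how the three filters could be assembled. It is an illustration under assumptions, not the answerer's method: `build_book_query` and `escape_literal` are hypothetical names, and it swaps the Virtuoso-specific `bif:contains` for the standard SPARQL 1.1 `CONTAINS` on lower-cased labels, which is portable but may be slower. The genre pattern stays OPTIONAL unless a genre filter is supplied.

```python
def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_book_query(title=None, author=None, genre=None, limit=100) -> str:
    """Build a DBpedia book query with optional substring filters (hypothetical helper)."""
    lines = [
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "PREFIX dbo: <http://dbpedia.org/ontology/>",
        "SELECT DISTINCT ?s ?bookLabel ?authorLabel ?genreLabel",
        "WHERE {",
        "  ?s rdf:type dbo:Book ;",
        "     rdfs:label ?bookLabel ;",
        "     dbo:author ?author .",
        "  ?author rdfs:label ?authorLabel .",
        "  FILTER(LANGMATCHES(LANG(?bookLabel), 'en'))",
        "  FILTER(LANGMATCHES(LANG(?authorLabel), 'en'))",
    ]
    genre_pattern = (
        "?s dbo:literaryGenre ?genre . "
        "?genre rdfs:label ?genreLabel . "
        "FILTER(LANGMATCHES(LANG(?genreLabel), 'en'))"
    )
    # Require a genre only when the caller filters on it; otherwise keep it optional.
    lines.append(f"  {genre_pattern}" if genre else f"  OPTIONAL {{ {genre_pattern} }}")
    # Add one case-insensitive substring filter per supplied form field.
    for var, needle in (("?bookLabel", title), ("?authorLabel", author), ("?genreLabel", genre)):
        if needle:
            lines.append(
                f'  FILTER(CONTAINS(LCASE(STR({var})), "{escape_literal(needle.lower())}"))'
            )
    lines += ["}", f"LIMIT {int(limit)}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_book_query(title="harry potter", author="Rowling"))
```

Leaving a parameter as `None` simply omits that filter, so one function covers all three searches and their combinations.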

how to find classes from dbpedia?

I need a SPARQL query that, given free text (user input),
finds all the DBpedia classes related to it.
How do I do it?
Also asked here. The accepted answer said:
When you say classes, do you mean types? If yes, try something like
SELECT ?uri ?label ?type
WHERE {
?uri rdfs:label ?label .
?uri <http://dbpedia.org/ontology/type> ?type .
FILTER regex(str(?label), "Leipzig") .
}
limit 10
I couldn't let this go...
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX virtrdf: <http://www.openlinksw.com/schemas/virtrdf#>
SELECT ?s1c AS ?c1
COUNT (*) AS ?c2
?c3
WHERE
{
QUAD MAP virtrdf:DefaultQuadMap
{
GRAPH ?g
{
?s1 ?s1textp ?o1 .
?o1 bif:contains '"dbpedia"' .
}
}
?s1 a ?s1c .
OPTIONAL { ?s1c rdfs:label ?c3
FILTER(langMatches(LANG(?c3),"EN"))}
}
GROUP BY ?s1c ?c3
ORDER BY DESC (2) ASC (3)
The earlier answer gets you partial results.

How to get film cost currency in Wikidata SPARQL?

I know how to get the cost of specified films:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item (GROUP_CONCAT( ?_cost; SEPARATOR = "~~~") AS ?budget)
WHERE {
VALUES ?selectedMovies { wd:Q24515019 wd:Q20762698 }
?item wdt:P31/wdt:P279* wd:Q11424 filter (?item = ?selectedMovies).
OPTIONAL {
?item wdt:P2130 ?_cost.
}
}
GROUP BY ?item
But when I try to get the currency for cost, I get nothing:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item (GROUP_CONCAT( ?_cost; SEPARATOR = "~~~") AS ?budget)
(GROUP_CONCAT( ?_currency; SEPARATOR = "~~~") AS ?_currency)
WHERE {
VALUES ?selectedMovies { wd:Q24515019 wd:Q20762698 }
?item wdt:P31/wdt:P279* wd:Q11424 filter (?item = ?selectedMovies).
OPTIONAL {
?item wdt:P2130 ?_cost.
?_cost wdt:P2237 ?_currency.
}
}
GROUP BY ?item
I checked, on the page of these movies, the cost currency is present.
So, how can I get the currency?
Well, it's a bit more complicated as you're asking for properties of units:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item (GROUP_CONCAT( DISTINCT ?cost_with_unit; SEPARATOR = "~~~") AS ?budget)
WHERE {
VALUES ?selectedMovies { wd:Q24515019 wd:Q20762698 }
?item wdt:P31/wdt:P279* wd:Q11424 filter (?item = ?selectedMovies).
OPTIONAL {
?item wdt:P2130 ?_cost.
# get the node to the cost statement
?item p:P2130 ?stmnode.
# then its value node
?stmnode psv:P2130 ?valuenode.
# then its unit, i.e. currency as entity
?valuenode wikibase:quantityUnit ?unit.
# then finally, its label
?unit rdfs:label ?unitLabel.
FILTER(LANGMATCHES(LANG(?unitLabel), 'en'))
# put everything together
BIND(CONCAT(str(?_cost), " ", str(?unitLabel)) as ?cost_with_unit)
}
}
GROUP BY ?item
Update
To get the ISO 4217 code of the unit, replace
# then finally, its label
?unit rdfs:label ?unitLabel.
FILTER(LANGMATCHES(LANG(?unitLabel), 'en'))
with
# then finally, the ISO 4217 code of the unit, e.g. USD
?unit wdt:P498 ?unitLabel .

SPARQL Matching Literals with **ANY** Language Tags without running into a timeout

I need to select the entities that have a "taxon rank (P105)" of "species (Q7432)" and a label that matches a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org;
this query works fine and returns the entity with a satisfactory response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"@de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem is that I must not specify the language, yet the labels in the underlying dataset (Wikidata) carry language tags, so I need a way to test literal equality for any language.
I tried several possible solutions, but every query I tried resulted in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here is the list of what I tried (I always get a TIMEOUT):
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution, or some SPARQL syntax that doesn't end in a TIMEOUT message.
PS: with reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt I don't understand whether language ranges or wildcards could help in some way.
EDIT
I successfully tested (without hitting a timeout) a similar query on DBpedia, using the Virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved a similar problem: I wanted to find an entity by its label string in any language. I recommend not using FILTER, because it is too slow. Use UNION instead, like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"@de . }
UNION { ?entity rdfs:label "Topinambur"@en . }
UNION { ?entity rdfs:label "Topinambur"@fr . }
}
GROUP BY ?entity
LIMIT 100
This solution is not perfect, because you have to enumerate all languages, but it is fast and reliable. A list of all available Wikidata languages is here.
This answer proposes three options:
Be more specific.
The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
Wait until they implement full-text search.
Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
It seems that even the query below sometimes works on wikidata.dbpedia.org without timing out:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Two hours ago I removed this statement from Wikidata:
wd:Q161378 rdfs:label "topinambur"@ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Building on @quick's answer, and showing it for lexemes rather than labels. First, identify the relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"@', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
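?item rdfs:label ?word .
# ?language is not bound by any pattern here, so take it from the label's language tag
BIND(LANG(?word) AS ?language)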
}
GROUP BY ?language
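The long VALUES lists above are tedious to write by hand. As a small sketch, assuming you already have the language codes as a Python list (for example from the first query), a hypothetical helper such as `tagged_values` could generate the clause:

```python
import re


def tagged_values(word: str, language_codes, var: str = "?word") -> str:
    """Build a VALUES clause holding one language-tagged literal per code.

    Hypothetical helper; it only assembles text and does not contact any endpoint.
    """
    # Escape characters that would terminate or corrupt the quoted literal.
    literal = word.replace("\\", "\\\\").replace('"', '\\"')
    terms = []
    for code in language_codes:
        # Accept only simple language tags such as "de" or "zh-min-nan".
        if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", code):
            raise ValueError(f"suspicious language tag: {code!r}")
        terms.append(f'"{literal}"@{code}')
    return f"VALUES {var} {{ " + " ".join(terms) + " }"


if __name__ == "__main__":
    print(tagged_values("topinambur", ["de", "en", "fr"]))
```

The result can be pasted or interpolated into the WHERE block in place of the hand-written list.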

querying dbpedia sparql for more results

I want to obtain some data from DBpedia.
I have entity URLs and want to get some location information about them.
Currently I run a query like this:
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE
{
<{0}> rdfs:label ?label .
OPTIONAL {
<{0}> geo:lat ?lat ;
geo:long ?long .
} .
OPTIONAL {
<{0}> dbo:Country ?dboCountry
} .
OPTIONAL {
<{0}> dbpedia-owl:country ?dbpediaContry .
?dbpediaContry dbpprop:cctld ?ccTLD
}.
OPTIONAL {
<{0}> dbpprop:country ?dbpropContry
}
FILTER ( lang(?label) = "en" )
}
for each URL (replacing {0} with the URL).
But I would like to optimize it and get results for several entities in one query.
Also, is it possible to avoid repeating the URL on each line?
Regards
Piotr
Hmm, it looks like I have already found an answer to both questions.
Do you know about rubber duck debugging (http://en.wikipedia.org/wiki/Rubber_duck_debugging)?
The solution is:
SELECT DISTINCT *
WHERE {
?uri rdfs:label ?label .
OPTIONAL { ?uri geo:lat ?lat .
?uri geo:long ?long} .
FILTER (?uri IN ({0}, {1}, ...) )
}
Maybe this will be helpful for somebody else.
Or maybe someone knows a better solution?
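One possible alternative, sketched in Python under assumptions: instead of `FILTER (?uri IN (...))`, the standard SPARQL 1.1 `VALUES` clause can bind `?uri` to the whole list up front. `build_location_query` is a hypothetical helper name, the example URIs are illustrative, and the character check is a conservative guess at what must not appear inside `<...>`, not a full IRI validator.

```python
import re

# Characters that are not allowed inside a SPARQL IRI reference <...>.
_FORBIDDEN = re.compile(r'[<>"{}|\\^`\s]')


def build_location_query(uris) -> str:
    """Build one query that fetches label and coordinates for several entities."""
    uris = list(uris)
    if not uris:
        raise ValueError("at least one URI is required")
    terms = []
    for uri in uris:
        if _FORBIDDEN.search(uri):
            raise ValueError(f"unsafe character in URI: {uri!r}")
        terms.append(f"<{uri}>")
    return "\n".join(
        [
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>",
            "SELECT DISTINCT ?uri ?label ?lat ?long",
            "WHERE {",
            # Bind ?uri once, so the URL is not repeated in every pattern.
            "  VALUES ?uri { " + " ".join(terms) + " }",
            "  ?uri rdfs:label ?label .",
            '  FILTER(LANG(?label) = "en")',
            "  OPTIONAL { ?uri geo:lat ?lat ; geo:long ?long . }",
            "}",
        ]
    )


if __name__ == "__main__":
    print(
        build_location_query(
            ["http://dbpedia.org/resource/Warsaw", "http://dbpedia.org/resource/Berlin"]
        )
    )
```

Because each result row carries `?uri`, the caller can map rows back to the entities it asked about.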