Translate nouns using SPARQL with Babelnet - sparql

The following query translates a word to a certain language:
SELECT DISTINCT ?translation WHERE {
?entries a lemon:LexicalEntry .
?entries rdfs:label "apple"#en .
?entries lemon:sense ?sense .
?sense lexinfo:translation ?translation .
filter contains(str(?translation),"HI")
}
But how can I retrieve the label for the translation, which is a LexicalSense as far I can tell

The way up and the way down is one and the same (as Heraclitus had said):
SELECT DISTINCT ?label WHERE {
?original_entry rdfs:label "apple"#en .
?original_entry lemon:sense ?original_sense .
?original_sense
lexinfo:translation
?translated_sense .
?translated_entry lemon:sense ?translated_sense .
?translated_entry rdfs:label ?label .
FILTER (lang(?label) = "hi")
}
Try it!
This page describes data model and provides some example queries.

Related

Retrieve properties and their descriptions from instances of the same class

I want to retrieve all distinct object properties of instances with the same type (class), starting with two initial seeds (wd:Q963 and wd:Q42320). First, I ask for the type (and maybe subtype) of such seeds. Second, all instances of the same class of the seeds are retrieved. Third, properties of the instances are retrieved. Finally, I want to retrieve descriptions of such properties and if possible alternative labels. My query is as follows:
select distinct ?property ?description ?label where{
{
wd:Q963 wdt:P31 ?typesSubject .
?instancesS (wdt:P31|wdt:P279) ?typesSubject .
?instancesS ?property ?unknown .
}
UNION
{
wd:Q42320 wdt:P31 ?typesObject .
?instancesO (wdt:P31|wdt:P279) ?typesObject .
?unknown ?property ?instancesO .
}
?claimPredicate wikibase:directClaim ?property .
?claimPredicate schema:description ?description .
?claimPredicate rdfs:label ?label .
FILTER(strstarts(str(?property),str(wdt:)))
FILTER(strstarts(str(?unknown),str(wd:)))
FILTER(LANG(?description) = "en").
FILTER(LANG(?label) = "en").
}
The problem is that my actual query takes a lot of time and it fails in the public Wikidata endpoint. Does anyone can provide me some hints to optimize such a query?
To be honest, I can't understand the aim of your query. I suppose you are interested in semantic similarity or something like.
Basically, you could reduce the number of joins, retrieving only unique wdt-predicates with nested SELECT DITINCT.
SELECT ?property ?claimPredicateLabel ?claimPredicateDescription
WHERE {
hint:Query hint:optimizer "None" .
{
SELECT DISTINCT ?property {
VALUES (?s) {(wd:Q963) (wd:Q42320)}
?s wdt:P31/^(wdt:P31|wdt:P279) ?instances .
?instances ?property ?unknown .
}
}
?claimPredicate wikibase:directClaim ?property .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Try it!
This is fast enough (~ 3s) even with SERVICE wikibase:label.
Also, you don't need FILTER(strstarts(str(?property),str(wdt:))) after ?claimPredicate wikibase:directClaim ?property.
As for hint:Query hint:optimizer "None", this hint forces Blazegraph to follow standard bottom-up evaluation order. In this particular query, hint:Query hint:optimizer "Runtime" or hint:SubQuery hint:runOnce true should also work.

how to find classes from dbpedia?

I need a sparql query that given a free text (user input),
it finds me from dbpedia all the classes related to it.
How do it?
Also asked here. Accepted answer said --
When you say classes, are you mean about types? If yes, try something like
SELECT ?uri ?label ?type
WHERE {
?uri rdfs:label ?label .
?uri <http://dbpedia.org/ontology/type> ?type .
FILTER regex(str(?label), "Leipzig") .
}
limit 10
I couldn't let this go...
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX virtdrf: <http://www.openlinksw.com/schemas/virtrdf#>
SELECT ?s1c AS ?c1
COUNT (*) AS ?c2
?c3
WHERE
{
QUAD MAP virtrdf:DefaultQuadMap
{
GRAPH ?g
{
?s1 ?s1textp ?o1 .
?o1 bif:contains '"dbpedia"' .
}
}
?s1 a ?s1c .
OPTIONAL { ?s1c rdfs:label ?c3
FILTER(langMatches(LANG(?c3),"EN"))}
}
GROUP BY ?s1c ?c3
ORDER BY DESC (2) ASC (3)
The earlier answer gets you partial results.

SPARQL Matching Literals with **ANY** Language Tags without run into timeout

I need to select the entity that have a "taxon rank (P105)" of "species (Q7432)" which have a label that match a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org;
this query goes fine and return the entity to me with satisfying response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"#de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem here is that my requisite is not to specify the language but the lexical forms of the labels in the underlying dataset ( wikidata) has language tags so i need a way to get Literal Equality for any language.
I tried some possible solution but I didn't find any query that didn't result in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here the list of what I tried (..and I always get TIMEOUT) :
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution or some SPARQL syntax the doesn't end up to a TIMEOUT message.
PS: with reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt I don't understand if language ranges or ```wildcards`` could help in some way.
EDIT
I successfully tested (without falling timeout) a similar query in DbPedia by using virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved similar problem - want to find entity with label string in any language. I recommend do not use FILTER, because it is too slow. Rather use UNION like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"#de . }
UNION { ?entity rdfs:label "Topinambur"#en . }
UNION { ?entity rdfs:label "Topinambur"#fr . }
}
GROUP BY ?entity
LIMIT 100
Try it!
This solution is not perfect, because you have to enumater all languages, but is fast and reliable. List of all available wikidata language are here.
This answer proposes three options:
Be more specific.
The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
Wait until they implement full-text search.
Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
Try it!
It seems that even the query below sometime works on wikidata.dbpedia.org without falling into timeout:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Try it!
Two hours ago I've removed this statement on Wikidata:
wd:Q161378 rdfs:label "topinambur"#ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Working further on from #quick's answer, and showing it for lexemes rather than labels. First identifying relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"#', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Try it!
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
Try it!
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"#eo "topinambur"#ko "topinambur"#bfi "topinambur"#nl "topinambur"#uk "topinambur"#cy "topinambur"#pt "topinambur"#zh "topinambur"#br "topinambur"#bg "topinambur"#ms "topinambur"#tg "topinambur"#se "topinambur"#ta "topinambur"#non "topinambur"#it "topinambur"#zh-min-nan "topinambur"#nan "topinambur"#fi "topinambur"#jbo "topinambur"#ml "topinambur"#ja "topinambur"#ku "topinambur"#bn "topinambur"#ar "topinambur"#nb "topinambur"#es "topinambur"#pl "topinambur"#nn "topinambur"#sk "topinambur"#da "topinambur"#de "topinambur"#cs "topinambur"#fr "topinambur"#sv "topinambur"#eu "topinambur"#he "topinambur"#la "topinambur"#en "topinambur"#ru }
?item rdfs:label ?word ;
}
GROUP BY ?language

DBpedia SPARQL filter does not apply to all results

A FILTER NOT EXISTS allows some results through when combined with OPTIONAL triples.
My query:
SELECT DISTINCT * WHERE
{
{
?en rdfs:label "N'Djamena"#en .
BIND("N'Djamena" AS ?name) .
}
UNION {
?en rdfs:label "Port Vila"#en .
BIND("Port Vila" AS ?name) .
}
UNION {
?en rdfs:label "Atafu"#en .
BIND("Atafu" AS ?name) .
}
FILTER NOT EXISTS { ?en rdf:type skos:Concept } .
OPTIONAL { ?en owl:sameAs ?es . FILTER regex(?es, "es.dbpedia") . }
OPTIONAL { ?en owl:sameAs ?pt . FILTER regex(?pt, "pt.dbpedia") . }
}
LIMIT 100
This query gets the three places as expected, but it also pulls back "Category:Atafu", which should be filtered out by virtue of having "rdf:type skos:Concept".
When used without the OPTIONAL lines, I get the three places expected. When used with those clauses non-optionally, I get only two of the countries, because Atafu doesn't have a page in Portuguese.
I can also move the FILTER NOT EXISTS statement into each of the UNION'd country blocks, but that seems to hurt the server's response time.
Why does the FILTER NOT EXISTS clause filter out "Category:N'Djamena" and Category:Port_Vila but not "Category:Atafu" when followed by OPTIONAL?
I really have no idea why your query doesn't work. I'd have to chalk it up to some weird Virtuoso thing. There's definitely something strange going on. For instance, if you remove the bind for the last name, you'll get the resources you're expecting:
SELECT DISTINCT * WHERE
{
{
?en rdfs:label "N'Djamena"#en .
BIND("N'Djamena" AS ?name) .
}
UNION {
?en rdfs:label "Port Vila"#en .
BIND("Port Vila" AS ?name) .
}
UNION {
?en rdfs:label "Atafu"#en .
}
FILTER NOT EXISTS { ?en rdf:type skos:Concept }
OPTIONAL { ?en owl:sameAs ?es . FILTER regex(?es, "es.dbpedia") }
OPTIONAL { ?en owl:sameAs ?pt . FILTER regex(?pt, "pt.dbpedia") . }
}
LIMIT 100
SPARQL results
It's really pretty weird. Here's a modified version of your query that gets the results you're looking for. It uses values instead of union, which makes the query simpler. It should be logically equivalent, though, so I'm not sure why it makes a difference.
select distinct * where {
values ?label { "N'Djamena"#en "Port Vila"#en "Atafu"#en }
?en rdfs:label ?label .
optional { ?en owl:sameAs ?pt . filter regex(?pt, "pt.dbpedia") }
optional { ?en owl:sameAs ?es . filter regex(?es, "es.dbpedia") }
filter not exists { ?en a skos:Concept }
bind(str(?label) as ?name)
}
SPARQL results
I'd actually clean up the string matching though, since regular expressions are probably more power than you need here. You just want to check whether the value starts with a given substring:
select ?en ?label (str(?label) as ?name) ?es ?pt where {
values ?label { "N'Djamena"#en "Port Vila"#en "Atafu"#en }
?en rdfs:label ?label .
optional { ?en owl:sameAs ?pt . filter strstarts(str(?pt), "http://pt.dbpedia") }
optional { ?en owl:sameAs ?es . filter strstarts(str(?es), "http://es.dbpedia") }
filter not exists { ?en a skos:Concept }
}
SPARQL results

querying dbpedia sparql for more results

I want to obtain some data from dbpedia.
I have entities urls and want to get some information about localization.
Now i call query like this:
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE
{
<{0}> rdfs:label ?label .
OPTIONAL {
<{0}> geo:lat ?lat ;
geo:long ?long .
} .
OPTIONAL {
<{0}> dbo:Country ?dboCountry
} .
OPTIONAL {
<{0}> dbpedia-owl:country ?dbpediaContry .
?dbpediaContry dbpprop:cctld ?ccTLD
}.
OPTIONAL {
<{0}> dbpprop:country ?dbpropContry
}
FILTER ( lang(?label) = "en" )
}
for each url (replace {0} with url).
But I would like to optimize it and get result for more entities in one query.
Also is it possible to not set url in each line?
Regards
Piotr
Hmm, it looks I already have found an answer for both questions.
Do you know this(http://en.wikipedia.org/wiki/Rubber_duck_debugging)
Solution is:
SELECT DISTINCT *
WHERE {
?uri rdfs:label ?label .
OPTIONAL { ?uri geo:lat ?lat .
?uri geo:long ?long} .
FILTER (?uri IN ({0}, {1}, ...) )
}
Maybe it will be helpful for somebody else?
Or maybe someone knows better solution?