Validating rdf:Seq with SPARQL

An rdf:Seq is an RDF container whose members are attached through the container membership properties rdf:_1, rdf:_2, rdf:_3, ...
Can it be validated using a SPARQL 1.1 query? For example, if the sequence is valid, an ASK query should return true, otherwise false.
The constraints that I can think of right now:
must start with rdf:_1
no repeated properties (rdf:_2, rdf:_2)
no "holes" in the sequence (rdf:_2, rdf:_4)
I suspect a completely generic solution can be difficult due to the absence of recursion in SPARQL. But a pragmatic solution that validates a finite number of members (say 30 or 50) would be acceptable as well.

One could simply check whether some item shares its position with another item, or whether some item has no direct predecessor. Unfortunately, this requires some ugly and expensive string operations, so it will probably be slow on large datasets. The query below also ignores that two items at the same position might actually denote the same resource (owl:sameAs).
Sample Data:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/> .
ex:a a rdf:Seq;
rdf:_1 ex:a1 ;
rdf:_2 ex:a2 .
ex:b a rdf:Bag;
rdf:_1 ex:b1 ;
rdf:_1 ex:b2 .
ex:c a rdf:Alt;
rdf:_2 ex:c1 .
ex:d a rdf:Seq;
rdf:_0 ex:d1 ;
rdf:_1 ex:d2 .
Query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?container ?containerMembershipProperty ?item ?error WHERE {
{
?container a rdf:Seq .
} UNION {
?container a rdf:Bag .
} UNION {
?container a rdf:Alt .
}
{
BIND("Duplicated Index" AS ?error)
?container ?containerMembershipProperty ?item .
FILTER strstarts(str(?containerMembershipProperty),"http://www.w3.org/1999/02/22-rdf-syntax-ns#_")
FILTER EXISTS {
?container ?containerMembershipProperty ?item2 .
FILTER (?item!=?item2)
}
} UNION {
BIND("Missing Predecessor" AS ?error)
?container ?containerMembershipProperty ?item .
BIND(xsd:integer(STRAFTER(STR(?containerMembershipProperty),"http://www.w3.org/1999/02/22-rdf-syntax-ns#_")) - 1 AS ?index)
BIND(IRI(CONCAT("http://www.w3.org/1999/02/22-rdf-syntax-ns#_", STR(?index))) AS ?previousContainerMembershipProperty)
FILTER (?index >= 1)
FILTER NOT EXISTS {
?container ?previousContainerMembershipProperty ?item3
}
} UNION {
BIND("Illegal Index" AS ?error)
?container ?containerMembershipProperty ?item .
BIND(xsd:integer(STRAFTER(STR(?containerMembershipProperty),"http://www.w3.org/1999/02/22-rdf-syntax-ns#_")) AS ?index)
FILTER (?index < 1)
}
}
ORDER BY ?container ?containerMembershipProperty ?item
Result:
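+----------------------+-----------------------------+-----------------------+-----------------------+
| container            | containerMembershipProperty | item                  | error                 |
+----------------------+-----------------------------+-----------------------+-----------------------+
| http://example.org/b | rdf:_1                      | http://example.org/b1 | "Duplicated Index"    |
| http://example.org/b | rdf:_1                      | http://example.org/b2 | "Duplicated Index"    |
| http://example.org/c | rdf:_2                      | http://example.org/c1 | "Missing Predecessor" |
| http://example.org/d | rdf:_0                      | http://example.org/d1 | "Illegal Index"       |
+----------------------+-----------------------------+-----------------------+-----------------------+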

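The three checks in the query above (duplicated index, missing predecessor, illegal index) can also be sketched outside SPARQL. The Python below is only an illustration under assumptions: triples are plain (subject, predicate, object) string tuples, the helper names membership_index and validate_containers are made up for this sketch, and the rdf:Seq / rdf:Bag / rdf:Alt type test from the query is skipped for brevity.

```python
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"


def membership_index(predicate):
    """Return n for the property rdf:_n, or None for any other predicate."""
    prefix = RDF_NS + "_"
    if not predicate.startswith(prefix):
        return None
    suffix = predicate[len(prefix):]
    # Only plain ASCII digit strings count as an index in this sketch.
    if suffix.isascii() and suffix.isdigit():
        return int(suffix)
    return None


def validate_containers(triples):
    """Return sorted (container, index, item, error) tuples, one per violation."""
    members = defaultdict(lambda: defaultdict(set))  # container -> index -> items
    for s, p, o in triples:
        n = membership_index(p)
        if n is not None:
            members[s][n].add(o)

    errors = []
    for container, by_index in members.items():
        for n, items in by_index.items():
            for item in items:
                if len(items) > 1:
                    errors.append((container, n, item, "Duplicated Index"))
                if n < 1:
                    errors.append((container, n, item, "Illegal Index"))
                elif n > 1 and (n - 1) not in by_index:
                    errors.append((container, n, item, "Missing Predecessor"))
    return sorted(errors)


# The sample data from the answer, without the rdf:type triples.
SAMPLE = [
    (EX + "a", RDF_NS + "_1", EX + "a1"),
    (EX + "a", RDF_NS + "_2", EX + "a2"),
    (EX + "b", RDF_NS + "_1", EX + "b1"),
    (EX + "b", RDF_NS + "_1", EX + "b2"),
    (EX + "c", RDF_NS + "_2", EX + "c1"),
    (EX + "d", RDF_NS + "_0", EX + "d1"),
    (EX + "d", RDF_NS + "_1", EX + "d2"),
]

for row in validate_containers(SAMPLE):
    print(row)
```

A container is valid in this sketch exactly when it contributes no rows, which corresponds to wrapping the SPARQL pattern in an ASK and negating the answer.
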
Related

Join has cross-product in SPARQL query

I am trying to come up with a trivial, minimal, functional example of a federated query that does not use any web interface, just a local query engine (in my case AG free tier).
This at least doesn't throw any errors and returns results, but I'm getting a warning that the query contains a cross-product. I guess the problem is that nothing actually relates the results from the two endpoints.
join has cross-product. Unique LHS variables: ?actor_1_1, ?birthDate, ?spouseURI_1_2, ?spouseName; Unique RHS variables ?actor_2_1, ?gender
But how do I relate them then? In a generic manner, since I guess this information needs to be passed to the query-resolver, and eventually I'll query other databases.
PREFIX dbpo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?birthDate ?spouseName ?gender {
{ SERVICE <https://dbpedia.org/sparql>
{ SELECT ?birthDate ?spouseName WHERE {
?actor rdfs:label "Arnold Schwarzenegger"@en ;
dbpo:birthDate ?birthDate ;
dbpo:spouse ?spouseURI .
?spouseURI rdfs:label ?spouseName .
FILTER ( lang(?spouseName) = "en" )
}
}
}
{ SERVICE <https://query.wikidata.org/sparql>
{ SELECT ?gender WHERE {
?actor wdt:P1559 "Arnold Alois Schwarzenegger"@de .
?actor wdt:P21 ?gender .
}
}
}
}
I'd appreciate any help here, I haven't had as hard a time getting into a subject since forever.
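To illustrate what the warning means, here is a minimal Python sketch of how a join over solution mappings behaves. It is not how any particular engine is implemented; the function name join, the placeholder values, and the shared person key are all assumptions made for this illustration.

```python
def join(left, right):
    """Join two lists of solution mappings (dicts of variable -> value).

    Two mappings are compatible when they agree on every shared variable.
    With no shared variable, every pair is compatible: a cross-product.
    """
    result = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[v] == r[v] for v in shared):
                result.append({**l, **r})
    return result


# Placeholder values, not real endpoint results.
left = [{"spouseName": "name-1"}, {"spouseName": "name-2"}]
right = [{"gender": "gender-1"}, {"gender": "gender-2"}]
print(len(join(left, right)))  # 4: nothing shared, so 2 x 2 rows

# Hypothetical: both sides also expose a common "person" key.
left_keyed = [
    {"person": "p1", "spouseName": "name-1"},
    {"person": "p2", "spouseName": "name-2"},
]
right_keyed = [
    {"person": "p1", "gender": "gender-1"},
    {"person": "p3", "gender": "gender-2"},
]
print(join(left_keyed, right_keyed))  # only the p1 row survives
```

In the query above, neither subquery projects ?actor, so the two SERVICE blocks share no variable at all. Relating them would need a variable bound to the same term on both sides, which in turn presumably needs some link between the two datasets' identifiers.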

Ask for the presence of all specified values?

I have exactly these triples in GraphDB:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
<http://example.com/greeting>
a <http://example.com/word> ;
rdfs:label "hello" .
I want to know if there is a thing in my triplestore with the label "hello" and another with the label "goodbye"
PREFIX : <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK
WHERE
{ VALUES ?l { "goodbye" "hello" }
?s a :thing ;
rdfs:label ?l
}
I am being told that yes, this is true, as if it were saying that at least one of those is true. But I want to know whether all of those patterns are true.
Can I do that in a SPARQL ASK?
I also tried the following but got the same (unwanted) result:
PREFIX : <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK
WHERE
{ VALUES (?l) { ("goodbye") ("hello") }
?s a :thing ;
rdfs:label ?l
}
Sanity check: the answer to this ASK is false
PREFIX : <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK
WHERE
{ VALUES (?l) { ("goodbye") }
?s a :thing ;
rdfs:label ?l
}
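Rather than asking "are all of these present", ask "is any of these not present", and then negate the result, either in your application or with another FILTER NOT EXISTS:
PREFIX : <http://example.com/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# true if at least one of the listed labels is missing
ask {
values ?label { "hello" "goodbye" }
filter not exists {
?s a :thing ; rdfs:label ?label
}
}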
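The logic behind this answer ("all present" is the same as "none missing") can be sketched in a few lines of Python. The function names and the one-label store are illustrative assumptions, not part of any SPARQL API.

```python
def any_present(required, labels_in_store):
    """What the VALUES join in the original ASK computes: at least one match."""
    return any(label in labels_in_store for label in required)


def missing(required, labels_in_store):
    """The FILTER NOT EXISTS side: required labels that have no match."""
    return [label for label in required if label not in labels_in_store]


def all_present(required, labels_in_store):
    """All labels are present exactly when none is missing."""
    return not missing(required, labels_in_store)


store = {"hello"}  # the only label in the question's data
required = ["goodbye", "hello"]

print(any_present(required, store))  # True
print(missing(required, store))      # ['goodbye']
print(all_present(required, store))  # False
```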
My group is developing tools to convert tabular data about hospital records into RDF triples and then perform various cleanups and aggregations.
I have started this "answer" by writing out an illustration of this workflow with sample data.
Working implementations of the suggestions from @Joshua Taylor and @AKSW are at the bottom.
The tabular data is first converted into "shortcut triples", which instantiate a minimal number of classes and link all literal values to those classes, even if the literal values are really "more about" something else.
So tabular data like this:
+-------+------------+----------+----------+
| EncID | EncDate | DiagCode | CodeType |
+-------+------------+----------+----------+
| 102 | 12/05/2015 | J44.9 | ICD-10 |
| 103 | 11/25/2015 | 602.9 | ICD-9 |
| 102 | 12/05/2015 | I50.9 | ICD-10 |
+-------+------------+----------+----------+
First becomes triples like this (ignoring the EncDates and CodeTypes.)
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX turbo: <http://example.org/ontologies/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
INSERT DATA {
GRAPH turbo:encounters_from_karma {
turbo:5f62d61cee174283a4f875ccb8bb91a1 rdf:type obo:OGMS_0000097 .
turbo:5f62d61cee174283a4f875ccb8bb91a1 turbo:ScEnc2DiagCode "J44.9" .
turbo:5f62d61cee174283a4f875ccb8bb91a1 turbo:ScEnc2DiagCodeRegText "ICD-10" .
turbo:5f62d61cee174283a4f875ccb8bb91a1 turbo:ScEnc2EncID "102" .
turbo:81fcbb5c5bd141c9bde7f23321648ff7 rdf:type obo:OGMS_0000097 .
turbo:81fcbb5c5bd141c9bde7f23321648ff7 turbo:ScEnc2DiagCode "I50.9" .
turbo:81fcbb5c5bd141c9bde7f23321648ff7 turbo:ScEnc2DiagCodeRegText "ICD-10" .
turbo:81fcbb5c5bd141c9bde7f23321648ff7 turbo:ScEnc2EncID "102" .
turbo:820dd597229244ab853ed845dd740f1f rdf:type obo:OGMS_0000097 .
turbo:820dd597229244ab853ed845dd740f1f turbo:ScEnc2DiagCode "602.9" .
turbo:820dd597229244ab853ed845dd740f1f turbo:ScEnc2DiagCodeRegText "ICD-9" .
turbo:820dd597229244ab853ed845dd740f1f turbo:ScEnc2EncID "103" . }
}
And is then expanded like this
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX turbo: <http://example.org/ontologies/>
INSERT {
GRAPH turbo:expanded_encounters {
?NewEnc rdf:type obo:OGMS_0000097 .
?NewEnc turbo:previousUriText ?previousUriText .
?NewEnc obo:OBI_0000299 ?DiagCrid .
?DiagCrid rdf:type turbo:DiagCrid .
?DiagCrid obo:BFO_0000051 ?DiagSymb .
?DiagSymb rdf:type turbo:EncounterDiagCodeSymbol .
?DiagSymb turbo:thingLiteralValue ?DiagSymbVal .
}
}
WHERE
{ GRAPH turbo:encounters_from_karma
{ ?EncFromKarma
rdf:type obo:OGMS_0000097 ;
turbo:ScEnc2DiagCode ?DiagSymbVal
BIND(str(?EncFromKarma) AS ?previousUriText)
BIND(uri(concat("http://example.org/ontologies/", struuid())) AS ?NewEnc)
BIND(uri(concat("http://example.org/ontologies/", struuid())) AS ?DiagCrid)
BIND(uri(concat("http://example.org/ontologies/", struuid())) AS ?DiagSymb)
}
}
And therefore looks like this:
@prefix turbo: <http://purl.obolibrary.org/obo/> .
@prefix obo: <http://example.org/ontologies/> .
<http://example.org/ontologies/b9dc5b08-cf1b-465e-8773-4b19bfbcf803>
a <http://purl.obolibrary.org/obo/OGMS_0000097> ;
turbo:OBI_0000299 <http://example.org/ontologies/8a04f52f-22d2-4aab-bacf-d96e1c7fe900> ;
obo:previousUriText "http://example.org/ontologies/5f62d61cee174283a4f875ccb8bb91a1" .
obo:8a04f52f-22d2-4aab-bacf-d96e1c7fe900
a obo:DiagCrid ;
turbo:BFO_0000051 obo:6738d8c0-8bb8-4078-8430-5e9294e5af15 .
obo:6738d8c0-8bb8-4078-8430-5e9294e5af15
a obo:EncounterDiagCodeSymbol ;
obo:thingLiteralValue "J44.9" .
obo:d3a8a700-2eb9-420d-a863-d47462fa393c
a turbo:OGMS_0000097 ;
turbo:OBI_0000299 obo:c12acc26-6dbe-486d-9ae9-9f34c9561aea ;
obo:previousUriText "http://example.org/ontologies/81fcbb5c5bd141c9bde7f23321648ff7" .
obo:c12acc26-6dbe-486d-9ae9-9f34c9561aea
a obo:DiagCrid ;
turbo:BFO_0000051 obo:3b784151-b369-4594-9ce3-285f5fe60850 .
obo:3b784151-b369-4594-9ce3-285f5fe60850
a obo:EncounterDiagCodeSymbol ;
obo:thingLiteralValue "I50.9" .
obo:af0e949a-99e4-48cd-885b-7cb1aa3dd265
a turbo:OGMS_0000097 ;
turbo:OBI_0000299 obo:c2da52ec-7331-4011-b8f8-6fbf8b419708 ;
obo:previousUriText "http://example.org/ontologies/820dd597229244ab853ed845dd740f1f" .
obo:c2da52ec-7331-4011-b8f8-6fbf8b419708
a obo:DiagCrid ;
turbo:BFO_0000051 obo:7f8399ef-5fcd-447c-80a4-18dfb160e99c .
obo:7f8399ef-5fcd-447c-80a4-18dfb160e99c
a obo:EncounterDiagCodeSymbol ;
obo:thingLiteralValue "602.9" .
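The expansion step can be mirrored in Python to make the shape of the result easier to see. This is a rough sketch only: triples are plain tuples of prefixed-name strings, uuid.uuid4() stands in for struuid(), and the names BASE, new_uri and expand are assumptions made for this illustration.

```python
import uuid

BASE = "http://example.org/ontologies/"


def new_uri():
    """Mint a fresh URI, in the spirit of uri(concat(..., struuid()))."""
    return BASE + str(uuid.uuid4())


def expand(shortcuts):
    """Expand (encounter_uri, diag_code) pairs into the longer triple pattern."""
    triples = []
    for enc_uri, diag_code in shortcuts:
        new_enc, crid, symb = new_uri(), new_uri(), new_uri()
        triples += [
            (new_enc, "rdf:type", "obo:OGMS_0000097"),
            (new_enc, "turbo:previousUriText", enc_uri),
            (new_enc, "obo:OBI_0000299", crid),
            (crid, "rdf:type", "turbo:DiagCrid"),
            (crid, "obo:BFO_0000051", symb),
            (symb, "rdf:type", "turbo:EncounterDiagCodeSymbol"),
            (symb, "turbo:thingLiteralValue", diag_code),
        ]
    return triples


shortcuts = [
    (BASE + "5f62d61cee174283a4f875ccb8bb91a1", "J44.9"),
    (BASE + "81fcbb5c5bd141c9bde7f23321648ff7", "I50.9"),
    (BASE + "820dd597229244ab853ed845dd740f1f", "602.9"),
]

for triple in expand(shortcuts):
    print(triple)
```

Each shortcut encounter yields seven triples and three freshly minted URIs, so the three sample rows expand to 21 triples.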
Finally, I can check for the correct transformation with the suggestions from @Joshua Taylor or @AKSW:
@Joshua Taylor (returns true if any expected pair is missing, so false means the transformation is complete):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX turbo: <http://example.org/ontologies/>
ASK
WHERE
{ { GRAPH turbo:expanded_encounters
{ VALUES ( ?previousUriTextVal ?DiagSymbVal ) {
( "http://example.org/ontologies/5f62d61cee174283a4f875ccb8bb91a1" "J44.9" )
( "http://example.org/ontologies/820dd597229244ab853ed845dd740f1f" "602.9" )
( "http://example.org/ontologies/81fcbb5c5bd141c9bde7f23321648ff7" "I50.9" )
}
FILTER NOT EXISTS { ?NewEnc rdf:type obo:OGMS_0000097 ;
turbo:previousUriText ?previousUriTextVal ;
obo:OBI_0000299 ?DiagCrid .
?DiagCrid rdf:type turbo:DiagCrid ;
obo:BFO_0000051 ?DiagSymb .
?DiagSymb rdf:type turbo:EncounterDiagCodeSymbol ;
turbo:thingLiteralValue ?DiagSymbVal
}
}
}
}
@AKSW (returns true when three distinct matching encounters are found):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX turbo: <http://example.org/ontologies/>
ASK
WHERE
{ GRAPH turbo:expanded_encounters
{ FILTER ( ?count = 3 )
{ SELECT (COUNT(DISTINCT ?NewEnc) AS ?count)
WHERE
{ VALUES ( ?previousUriTextVal ?DiagSymbVal ) {
( "http://example.org/ontologies/5f62d61cee174283a4f875ccb8bb91a1" "J44.9" )
( "http://example.org/ontologies/820dd597229244ab853ed845dd740f1f" "602.9" )
( "http://example.org/ontologies/81fcbb5c5bd141c9bde7f23321648ff7" "I50.9" )
}
?NewEnc rdf:type obo:OGMS_0000097 ;
turbo:previousUriText ?previousUriTextVal ;
obo:OBI_0000299 ?DiagCrid .
?DiagCrid rdf:type turbo:DiagCrid ;
obo:BFO_0000051 ?DiagSymb .
?DiagSymb rdf:type turbo:EncounterDiagCodeSymbol ;
turbo:thingLiteralValue ?DiagSymbVal
}
}
}
}
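The two checks are not quite equivalent, which a small Python sketch can show. The rows below are hypothetical (encounter, previousUriText, code) results, and the function names are made up for this illustration; the point is only that a distinct count can reach 3 while an expected pair is still missing.

```python
BASE = "http://example.org/ontologies/"

EXPECTED = {
    (BASE + "5f62d61cee174283a4f875ccb8bb91a1", "J44.9"),
    (BASE + "820dd597229244ab853ed845dd740f1f", "602.9"),
    (BASE + "81fcbb5c5bd141c9bde7f23321648ff7", "I50.9"),
}


def missing_pairs(rows, expected):
    """FILTER NOT EXISTS style: expected pairs with no matching encounter."""
    found = {(prev, code) for _enc, prev, code in rows}
    return expected - found


def distinct_match_count(rows, expected):
    """COUNT(DISTINCT ?NewEnc) style: encounters matching any expected pair."""
    return len({enc for enc, prev, code in rows if (prev, code) in expected})


# Hypothetical faulty expansion: the J44.9 encounter was expanded twice
# and the I50.9 encounter not at all.
rows = [
    ("enc-1", BASE + "5f62d61cee174283a4f875ccb8bb91a1", "J44.9"),
    ("enc-2", BASE + "5f62d61cee174283a4f875ccb8bb91a1", "J44.9"),
    ("enc-3", BASE + "820dd597229244ab853ed845dd740f1f", "602.9"),
]

print(distinct_match_count(rows, EXPECTED))  # 3, so the count check passes
print(missing_pairs(rows, EXPECTED))         # but the I50.9 pair is missing
```

If duplicates like this can occur in practice, the FILTER NOT EXISTS variant is the stricter of the two.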

sparql exclude multiple type hierarchy

In DBpedia I select some pages whose labels start with 'A'. Here I'm using an additional filter by subject to narrow the set. In the original version there are other conditions (and the result set is much bigger).
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX purl: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://dbpedia.org/page/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT
?pageType
WHERE
{
{
?page rdfs:label ?label .
?page a ?pageType .
?page <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Banking> .
}
FILTER ( strstarts(str(?pageType), 'http://dbpedia.org/ontology') )
}
LIMIT 1000
Here I select only the page types, to keep the rest of the question clear.
This is the whole set. Now I want to exclude some pages, starting with all agents (persons, organizations, etc.):
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX purl: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://dbpedia.org/page/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT
?pageType
WHERE
{
{
?page rdfs:label ?label .
?page a ?pageType .
?page <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Banking> .
MINUS { ?page a dbo:Agent }
}
FILTER ( strstarts(str(?pageType), 'http://dbpedia.org/ontology') )
}
LIMIT 1000
The result.
OK. Then I want to exclude more types, for example dbo:WrittenWork. I tried different approaches but was unable to find the correct one.
This returns nothing:
WHERE
{
{
?page rdfs:label ?label .
?page a ?pageType .
?page <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Banking> .
MINUS { ?page a dbo:Agent }
MINUS { ?page a dbo:WrittenWork }
}
This behaves as if no filter were set:
WHERE
{
{
?page rdfs:label ?label .
?page a ?pageType .
?page <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Banking> .
MINUS { ?page a dbo:Agent, dbo:WrittenWork }
}
The question is:
How can I exclude pages of certain types (both direct types and superclasses)?
It looks like this is a working answer (how to exclude multiple types):
{
?page purl:subject ?id .
?page a ?pageType .
FILTER NOT EXISTS {
?page a/rdfs:subClassOf* ?skipClasses .
FILTER(?skipClasses in (dbo:Agent, dbo:Place, dbo:Work))
}
}
In this example, every page that is a dbo:Agent, dbo:Place or dbo:Work (directly or through a subclass) will be filtered out.
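What the a/rdfs:subClassOf* path does can be sketched in Python: collect each type together with all of its ancestors, then drop the page if any of them is in the skip list. The class hierarchy and page names below are a toy example invented for this sketch, not actual DBpedia data.

```python
def ancestors_or_self(cls, parents):
    """The class itself plus everything reachable via subClassOf (zero or more steps)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def pages_to_keep(page_types, parents, skip_classes):
    """Keep pages none of whose types (or their superclasses) is skipped."""
    kept = []
    for page, types in page_types.items():
        reachable = set()
        for t in types:
            reachable |= ancestors_or_self(t, parents)
        if not (reachable & skip_classes):
            kept.append(page)
    return kept


# Toy subClassOf edges (child -> parents), hypothetical names.
PARENTS = {
    "ex:Bank": ["ex:Company"],
    "ex:Company": ["ex:Organisation"],
    "ex:Organisation": ["ex:Agent"],
    "ex:Book": ["ex:WrittenWork"],
    "ex:WrittenWork": ["ex:Work"],
}
PAGES = {
    "ex:SomeBank": ["ex:Bank"],
    "ex:SomeBook": ["ex:Book"],
    "ex:SomeConcept": ["ex:Topic"],
}

print(pages_to_keep(PAGES, PARENTS, {"ex:Agent", "ex:Work"}))  # ['ex:SomeConcept']
```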

How to get `filter not exists` query working with ARC2?

I have an ARC2-based RDF store set up and filled with some data. The following query returns the expected results (all URIs of things of type vcard:Individual or foaf:Person) when I run it without the FILTER NOT EXISTS part:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX app: <http://example.com#>
SELECT ?person
WHERE {
?person rdf:type ?type .
FILTER (?type = vcard:Individual || ?type = foaf:Person)
FILTER NOT EXISTS {
?person app:id ?id
}
}
When I try to exclude those things having the app:id property by running the full query, it fails with the following message:
Incomplete FILTER in ARC2_SPARQLPlusParser
Incomplete or invalid Group Graph pattern. Could not handle " FILTER NOT EXISTS { " in ARC2_SPARQLPlusParser
When testing the query in the query validator on sparql.org, there are no issues. Is there something wrong with my query that I am missing or is this a shortcoming of ARC2?
Can you think of an alternative query that I can try with ARC2 to get the desired result?
ARC2 does not support SPARQL 1.1, so you have to use the common OPTIONAL plus FILTER(!BOUND()) pattern:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX app: <http://example.com#>
SELECT ?person
WHERE {
?person rdf:type ?type .
FILTER (?type = vcard:Individual || ?type = foaf:Person)
OPTIONAL {
?person app:id ?id
}
FILTER ( !BOUND(?id) )
}
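The idea behind this pattern can be sketched in Python: OPTIONAL keeps every person and leaves ?id unbound where no app:id exists, and the filter then keeps only the unbound rows. The names and data below are hypothetical, and the sketch assumes at most one id per person.

```python
def persons_without_id(persons, ids_by_person):
    """Emulate OPTIONAL { ?person app:id ?id } followed by FILTER(!BOUND(?id))."""
    # OPTIONAL: every person stays in the result; a missing id is None (unbound).
    rows = [(person, ids_by_person.get(person)) for person in persons]
    # FILTER(!BOUND(?id)): keep only rows where the id stayed unbound.
    return [person for person, person_id in rows if person_id is None]


persons = ["ex:alice", "ex:bob", "ex:carol"]
ids_by_person = {"ex:alice": "42"}

print(persons_without_id(persons, ids_by_person))  # ['ex:bob', 'ex:carol']
```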

SPARQL Matching Literals with **ANY** Language Tags without run into timeout

I need to select the entities that have a "taxon rank (P105)" of "species (Q7432)" and a label that matches a literal string such as "Topinambur".
I'm testing the queries on https://query.wikidata.org.
This query works fine and returns the entity with a satisfying response time:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?entity rdfs:label "Topinambur"@de .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
The problem is that my requirement is not to specify the language, but the labels in the underlying dataset (Wikidata) carry language tags, so I need a way to get literal equality for any language.
I tried some possible solutions, but every query I tried resulted in the following:
TIMEOUT message com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired
Here is the list of what I tried (and I always get a TIMEOUT):
1) based on this answer I tried:
SELECT * WHERE {
?entity rdfs:label ?label FILTER ( str( ?label ) = "Topinambur") .
?entity wdt:P105 wd:Q7432.
}
LIMIT 100
2) based on some other documentation I tried:
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label FILTER regex(?label, "^Topinambur") .
}
LIMIT 100
3) and
SELECT * WHERE {
?entity wdt:P105 wd:Q7432.
?entity rdfs:label ?label .
FILTER langMatches( lang(?label), "*" )
FILTER (?label = "Topinambur")
}
LIMIT 100
What I'm looking for is a performant solution or some SPARQL syntax that doesn't end in a TIMEOUT message.
PS: With reference to http://www.rfc-editor.org/rfc/bcp/bcp47.txt, I don't understand whether language ranges or wildcards could help in some way.
EDIT
I successfully tested (without hitting the timeout) a similar query on DBpedia, using the Virtuoso query editor at:
https://dbpedia.org/sparql
Default Data Set Name (Graph IRI):http://dbpedia.org
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource
WHERE {
?resource rdfs:label ?label . FILTER ( str( ?label ) = "Topinambur").
?resource rdf:type dbo:Species
}
LIMIT 100
I am still very interested in understanding the performance problem that I experience on Wikidata and what is the best syntax to use.
I solved a similar problem: I wanted to find an entity by its label string in any language. I recommend not using FILTER, because it is too slow. Use UNION instead, like this:
SELECT ?entity WHERE {
?entity wdt:P105 wd:Q7432.
{ ?entity rdfs:label "Topinambur"@de . }
UNION { ?entity rdfs:label "Topinambur"@en . }
UNION { ?entity rdfs:label "Topinambur"@fr . }
}
GROUP BY ?entity
LIMIT 100
This solution is not perfect, because you have to enumerate all languages, but it is fast and reliable. A list of all available Wikidata languages is here.
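Since the UNION branches only differ in the language tag, they can be generated rather than typed by hand. The following Python is a small sketch of that idea; the function names and the three-language list are assumptions for illustration, and the wdt:/wd:/rdfs: prefixes are assumed to be predefined by the endpoint.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for use inside a SPARQL string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def union_label_query(text, languages):
    """Build a query with one UNION branch per language tag."""
    literal = escape_literal(text)
    branches = [
        '{ ?entity rdfs:label "%s"@%s . }' % (literal, lang)
        for lang in languages
    ]
    body = "\n  UNION ".join(branches)
    return (
        "SELECT DISTINCT ?entity WHERE {\n"
        "  ?entity wdt:P105 wd:Q7432 .\n"
        "  " + body + "\n"
        "}\n"
        "LIMIT 100"
    )


print(union_label_query("Topinambur", ["de", "en", "fr"]))
```

Each branch stays an exact-match triple pattern, so the endpoint can presumably still answer it from its indexes instead of scanning and filtering all labels.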
This answer proposes three options:
1) Be more specific. The ?entity wdt:P171+ wd:Q25314 pattern seems to be sufficiently selective in your case.
2) Wait until they implement full-text search.
3) Use Quarry (example query).
Another option is to use Virtuoso full-text search capabilities on wikidata.dbpedia.org:
SELECT ?s WHERE {
?resource rdfs:label ?label .
?label bif:contains "'topinambur'" .
BIND ( IRI ( REPLACE ( STR(?resource),
"http://wikidata.dbpedia.org/resource",
"http://www.wikidata.org/entity"
)
) AS ?s
)
}
It seems that even the query below sometimes works on wikidata.dbpedia.org without hitting the timeout:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?resource WHERE {
?resource rdfs:label ?label .
FILTER ( STR(?label) = "Topinambur" ) .
}
Two hours ago I removed this statement from Wikidata:
wd:Q161378 rdfs:label "topinambur"@ru .
I'm not a botanist, but 'topinambur' is definitely not a word in Russian.
Working further on from @quick's answer, and showing it for lexemes rather than labels. First, identify the relevant language codes:
SELECT (GROUP_CONCAT(?mword; separator=" ") AS ?mwords) {
BIND(1 AS ?dummy)
VALUES ?word { "topinambur" }
{
SELECT (COUNT(?lexeme) AS ?count) ?language_code {
?lexeme dct:language / wdt:P424 ?language_code .
}
GROUP BY ?language_code
HAVING (?count > 100)
ORDER BY DESC(?count)
}
BIND(CONCAT('"', ?word, '"@', ?language_code) AS ?mword)
}
GROUP BY ?dummy
Followed by the verbose query
SELECT (COUNT(?lexeme) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
?lexeme dct:language ?language ;
ontolex:lexicalForm / ontolex:representation ?word .
}
GROUP BY ?language
For querying on labels, do something similar to:
SELECT (COUNT(?item) AS ?count) ?language (GROUP_CONCAT(?word; separator=" ") AS ?words) {
VALUES ?word { "topinambur"@eo "topinambur"@ko "topinambur"@bfi "topinambur"@nl "topinambur"@uk "topinambur"@cy "topinambur"@pt "topinambur"@zh "topinambur"@br "topinambur"@bg "topinambur"@ms "topinambur"@tg "topinambur"@se "topinambur"@ta "topinambur"@non "topinambur"@it "topinambur"@zh-min-nan "topinambur"@nan "topinambur"@fi "topinambur"@jbo "topinambur"@ml "topinambur"@ja "topinambur"@ku "topinambur"@bn "topinambur"@ar "topinambur"@nb "topinambur"@es "topinambur"@pl "topinambur"@nn "topinambur"@sk "topinambur"@da "topinambur"@de "topinambur"@cs "topinambur"@fr "topinambur"@sv "topinambur"@eu "topinambur"@he "topinambur"@la "topinambur"@en "topinambur"@ru }
?item rdfs:label ?word .
# labels have no dct:language, so take the language from the literal's tag
BIND(LANG(?word) AS ?language)
}
GROUP BY ?language