SPARQL group concat union - sparql

I'm trying to GROUP_CONCAT a UNION of two sets of triples.
Is this not allowed?
PREFIX bo: <https://webfiles.uci.edu/jenniyk2/businessontology#>
SELECT (GROUP_CONCAT(DISTINCT ?m2;SEPARATOR = ", ") AS ?comp)
WHERE
{
{{SELECT ?m2 ?c ?p
WHERE { ?c rdfs:label ?m. ?c2 rdfs:label ?m2. ?so bo:owner ?p.
?so bo:sharesIn ?c. ?so2 bo:owner ?p. ?so2 bo:sharesIn ?c2. }
}
UNION
{SELECT ?m2 ?c ?p
WHERE { ?c rdfs:label ?m. ?c2 rdfs:label ?m2. ?dir bo:isPartOf ?c.
?dir bo:isDirectedBy ?p. ?dir2 bo:isPartOf ?c2. ?dir2 bo:isDirectedBy ?p.}
}}
GROUP BY ?c
HAVING (COUNT(?m2) >1)}
It says there's an error in the very last line.
Found group. Was expecting one of : BIND, BLANK_NODE_LABEL, DECIMAL,
DOUBLE, FALSE, FILTER, GEO, GRAPH, INTEGER, MINUS, NIL-SYMBOL, OPTIONAL,
Q_IRI_REF, QNAME, QNAME_NS, SERVICE, STRING_LITERAL1, STRING_LITERAL2,
STRING_LITERAL_LONG1, STRING_LITERAL_LONG2, TEXTINDEX, TRUE, UNION, VALUES,
VARNAME or punctuation '(', '+', '-', '.', '[', '[]', '{', '}'.

The GROUP BY and HAVING clauses should be outside the WHERE clause, not inside it as they currently are. To make your query syntactically correct, remove the closing } from the last line and add it on the line above your GROUP BY:
PREFIX bo: <https://webfiles.uci.edu/jenniyk2/businessontology#>
SELECT (GROUP_CONCAT(DISTINCT ?m2;SEPARATOR = ", ") AS ?comp)
WHERE
{
{{SELECT ?m2 ?c ?p
WHERE { ?c rdfs:label ?m. ?c2 rdfs:label ?m2. ?so bo:owner ?p.
?so bo:sharesIn ?c. ?so2 bo:owner ?p. ?so2 bo:sharesIn ?c2. }
}
UNION
{SELECT ?m2 ?c ?p
WHERE { ?c rdfs:label ?m. ?c2 rdfs:label ?m2. ?dir bo:isPartOf ?c.
?dir bo:isDirectedBy ?p. ?dir2 bo:isPartOf ?c2. ?dir2 bo:isDirectedBy ?p.}
}}}
GROUP BY ?c
HAVING (COUNT(?m2) >1)
As Joshua also indicated, the query can be written a lot simpler, but this should solve your immediate problem.

Jeen Broekstra's answer identifies the typographical problem that causes your syntax error, but I think it's worth mentioning that this query can be simplified a bit, because simpler code from the beginning makes it easier to spot typos and syntax errors when they arise. If I read it correctly, it looks like the body of the query simplifies to this:
?c rdfs:label ?m .
?c2 rdfs:label ?m2 .
{ ?so bo:owner ?p ;
bo:sharesIn ?c .
?so2 bo:owner ?p ;
bo:sharesIn ?c2 . }
union
{ ?dir bo:isPartOf ?c ;
bo:isDirectedBy ?p .
?dir2 bo:isPartOf ?c2 ;
bo:isDirectedBy ?p }
You're looking for a ?c that's connected by one of two certain types of path to a ?c2, and you want to group by ?c and concatenate the distinct labels of ?c2. I see two ways that this could be simplified further. The first is with the use of property paths, since you don't actually use the values of many of these variables. With property paths, the body could become:
?c rdfs:label ?m .
?c2 rdfs:label ?m2 .
{ ?c ^bo:sharesIn/bo:owner/^bo:owner/bo:sharesIn ?c2 }
union
{ ?c ^bo:isPartOf/bo:isDirectedBy/^bo:isDirectedBy/bo:isPartOf ?c2 }
As another alternative, since the structure of the pattern in each union alternative is similar, you could abstract a bit with values:
?c rdfs:label ?m .
?c2 rdfs:label ?m2 .
values (?sharesIn_isPartOf ?owner_isDirectedBy) {
(bo:sharesIn bo:owner)
(bo:isPartOf bo:isDirectedBy)
}
?x ?owner_isDirectedBy ?y ;
?sharesIn_isPartOf ?c .
?x2 ?owner_isDirectedBy ?y ;
?sharesIn_isPartOf ?c2 .

Related

Sparql query issue with DBpedia retrieving

I'm trying to run this Sparql statement to read data from DBpedia; However, it only return the column names and no row data.
If anybody can let me know what's the issue. I'll be more appreciated
Sparql query below:
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?actor ?movie ?director ?movie_date
WHERE {
?m dc:subject <http://dbpedia.org/resource/Category:American_films> .
?m rdfs:label ?movie .
FILTER(LANG(?movie) = "en")
?m dbp:released ?movie_date .
FILTER(DATATYPE(?movie_date) = xsd:date)
?m dbp:starring ?a .
?a rdfs:label ?actor .
FILTER(LANG(?actor) = "en")
?m dbp:director ?d .
?d rdfs:label ?director .
FILTER(LANG(?director) = "en")
}
The problem is not all the queried triples exist. If at least one of the relations doesnt exist, no result will be given.
So with the OPTIONAL keyword, answers will be given only if they exist and wont block your query if they don't. It simply leaves the missing value empty.
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?actor ?movie ?director ?movie_date
WHERE {
?m dc:subject <http://dbpedia.org/resource/Category:American_films> .
OPTIONAL{
?m rdfs:label ?movie .
FILTER(LANG(?movie) = "en") }
OPTIONAL {
?m dbp:released ?movie_date .
FILTER(DATATYPE(?movie_date) = xsd:date) }
OPTIONAL{
?m dbp:starring ?a .
?a rdfs:label ?actor .
FILTER(LANG(?actor) = "en") }
OPTIONAL{
?m dbp:director ?d .
?d rdfs:label ?director .
FILTER(LANG(?director) = "en") }
}

Aggregate properties

I'm developing my own Fuseki endpoint from some DBpedia data.
I'm in doubt on how to aggregate properties related to a single resource.
SELECT ?name ?website ?abstract ?genre ?image
WHERE{
VALUES ?s {<http://dbpedia.org/resource/Attack_Attack!>}
?s foaf:name ?name ;
dbo:abstract ?abstract .
OPTIONAL { ?s dbo:genre ?genre } .
OPTIONAL { ?s dbp:website ?website } .
OPTIONAL { ?s dbo:image ?image } .
FILTER LANGMATCHES(LANG(?abstract ), "en")
}
SPARQL endpoint: http://dbpedia.org/sparql/
This query returns 2 matching results. They are different just for the dbo:genre value. There is a way I can query the knowledge base and retrieving a single result with a list of genres?
#chrisis's query works well on the DBpedia SPARQL Endpoint, which is based on Virtuoso.
However, if you are using Jena Fuseki, you should use more conformant syntax:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT
?name
(SAMPLE(?website) AS ?sample_website)
(SAMPLE(?abstract) AS ?sample_abstract)
(SAMPLE(?image) AS ?sample_image)
(GROUP_CONCAT(?genre; separator=', ') AS ?genres)
WHERE {
VALUES (?s) {(<http://dbpedia.org/resource/Attack_Attack!>)}
?s foaf:name ?name ;
dbo:abstract ?abstract .
OPTIONAL { ?s dbo:genre ?genre } .
OPTIONAL { ?s dbp:website ?website } .
OPTIONAL { ?s dbo:image ?image} .
FILTER LANGMATCHES(LANG(?abstract ), "en")
} GROUP BY ?name
The differences from the #chrisis's query are:
Since GROUP_CONCAT is an aggregation function, it might be used with GROUP BY only;
Since GROUP BY is used, all non-grouping variables should be aggregated (e.g. via SAMPLE);
GROUP_CONCAT syntax is slightly different.
In Fuseki, these AS in the projection are in fact superfluous: see this question and comments.
Yes, the GROUP_CONCAT() function is what you want.
SELECT ?name ?website ?abstract (GROUP_CONCAT(?genre,',') AS ?genres) ?image
WHERE{
<http://dbpedia.org/resource/Attack_Attack!> a dbo:Band ;
foaf:name ?name;
dbo:abstract ?abstract .
OPTIONAL{ <http://dbpedia.org/resource/Attack_Attack!> dbo:genre ?genre } .
OPTIONAL{ <http://dbpedia.org/resource/Attack_Attack!> dbp:website ?website} .
OPTIONAL{ <http://dbpedia.org/resource/Attack_Attack!> dbo:image ?image} .
FILTER LANGMATCHES(LANG(?abstract ), "en")
}

Sparql of DbPedia based upon name not subject

I am trying to query dbpedia to get some people data and I don't have subjects just names of the people I want to query and their birth/death dates.
I am trying to do a query along these lines. I want the name, birth date, death date and thumbnail of everyone with the surname Presley. What I then intend to do is loop through the results returned and find the best match for Elvis Presley 1935-1977 which is the data I have.
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?Name ?thumbnail ?birthDate ?deathDate WHERE {
{
dbo:name ?Name ;
dbo:birthDate ?birthDate ;
dbo:birthDate ?deathDate ;
dbo:thumbnail ?thumbnail ;
FILTER contains(?Name#en, "Presley")
}
What is the best way to construct my sparql query?
UPDATE:
I have put together this query which seems to work to some extent but I don't entirely understand it, and I can't figure out the contains, but it does at least run and return results.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?subject ?thumbnail ?birthdate ?deathdate WHERE {
{
?subject rdfs:label "Elvis Presley"#en ;
dbo:thumbnail ?thumbnail ;
dbo:birthDate ?birthdate ;
dbo:deathDate ?deathdate ;
a owl:Thing .
}
UNION
{
?altName rdfs:label "Elvis Presley"#en ;
dbo:thumbnail ?thumbnail ;
dbo:birthDate ?birthdate ;
dbo:deathDate ?deathdate ;
dbo:wikiPageRedirects ?s .
}
}
Some entities might not have all of that information, so it's better to use optional. You can use foaf:surname to check for surname directly.
select * where {
?s foaf:surname "Presley"#en
optional { ?s dbo:name ?name }
optional { ?s dbo:birthDate ?birth }
optional { ?s dbo:deathDate ?death }
optional { ?s dbo:thumbnail ?thumb }
}

How to use Sparql Contains to match similar String?

I'm trying to grab some definition in dbpedia inside my thesaurus.
Although can find country that have a label that match my country, i don't get all of them. So i try to match similar label with contains but it does not work.
Any idea why.
SELECT distinct ?idbcountry ?label ?labelDb ?def
WHERE {
?idbcountry a skos:Concept .
?idbcountry rdfs:label ?label .
?idbcountry skos:inScheme iadb:IdBCountries .
FILTER(lang(?label) = "en")
Service <http://dbpedia.org/sparql> {
?s a <http://dbpedia.org/ontology/Country> .
?s rdfs:label ?labelDb .
FILTER(CONTAINS (?labelDb, ?label)).
?s rdfs:comment ?def .
FILTER(lang(?def) = "en") .
FILTER(lang(?labelDb) = "en") .
}}
The exact matching query that works is as follows:
SELECT distinct ?idbcountry ?label ?def
WHERE {
?idbcountry a skos:Concept .
?idbcountry rdfs:label ?label .
?idbcountry skos:inScheme iadb:IdBCountries .
FILTER(lang(?label) = "en")
Service <http://dbpedia.org/sparql> {
?s a <http://dbpedia.org/ontology/Country> .
?s rdfs:label ?label .
?s rdfs:comment ?def
FILTER(lang(?def) = "en")
}
}
EDIT1
Data Samples:
<http://thesaurus.iadb.org/publicthesauri/10157002136735779158437>
rdf:type skos:Concept ;
dct:created "2015-03-27T16:43:48.052-04:00"^^xsd:dateTime ;
rdfs:label "BO"#en ;
rdfs:label "Bolivia"#en ;
rdfs:label "Bolivia"#es ;
rdfs:label "Bolivie"#fr ;
rdfs:label "Bolívia"#pt ;
skos:altLabel "BO"#en ;
skos:definition "Bolivia (/bəˈlɪviə/, Spanish: [boˈliβja], Quechua: Buliwya, Aymara: Wuliwya), officially known as the Plurinational State of Bolivia (Spanish: Estado Plurinacional de Bolivia locally: [esˈtaðo pluɾinasjoˈnal de βoˈliβja]), is a landlocked country located in western-central South America."#en ;
skos:inScheme :IdBCountries ;
skos:prefLabel "Bolivia"#en ;
skos:prefLabel "Bolivia"#es ;
skos:prefLabel "Bolivie"#fr ;
skos:prefLabel "Bolívia"#pt ;
skos:topConceptOf :IdBCountries ;
<http://xmlns.com/foaf/0.1/focus> <http://dbpedia.org/resource/Bolivia> ;
Without seeing your data, we can't know why your query isn't working. However, using contains is pretty straightforward. It's just a matter of contains(string,substring). As Jeen said, we can't reproduce your problem without knowing what your data looks like, but here's an example of contains in action:
select distinct ?country ?label {
?country a dbpedia-owl:Country ; #-- select countries
rdfs:label ?label . #-- and get labels
filter langMatches(lang(?label),"en") #-- but only English labels
filter contains(?label,"land") #-- containing "land"
}
SPARQL results

How to SPARQL optional

I have to use this Spaql query to retrive information about a person, my problem is to break the optional construct up into multiple optional constructs.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?label ?abstract ?placeOfBirth
?birthPlace ?birthDate ?page ?thumbnail
WHERE {
<http://dbpedia.org/resource/Ismail_Kadare> rdfs:label ?label ;
dbo:abstract ?abstract ;
foaf:page ?page .
OPTIONAL {
<http://dbpedia.org/resource/Ismail_Kadare> dbpprop:placeOfBirth ?placeOfBirth ;
dbpprop:birthPlace ?birthPlace ;
dbo:birthDate ?birthDate ;
dbo:thumbnail ?thumbnail .
}
FILTER (LANG(?label) = 'en')
FILTER (LANG(?abstract) = 'en')
}
LIMIT 1
Splitting the OPTIONAL pattern into parts
The pattern
<http://dbpedia.org/resource/Ismail_Kadare> dbpprop:placeOfBirth ?placeOfBirth ;
dbpprop:birthPlace ?birthPlace ;
dbo:birthDate ?birthDate ;
dbo:thumbnail ?thumbnail .
is shorthand for four triple patterns:
<http://dbpedia.org/resource/Ismail_Kadare> dbpprop:placeOfBirth ?placeOfBirth .
<http://dbpedia.org/resource/Ismail_Kadare> dbpprop:birthPlace ?birthPlace .
<http://dbpedia.org/resource/Ismail_Kadare> dbo:birthDate ?birthDate .
<http://dbpedia.org/resource/Ismail_Kadare> dbo:thumbnail ?thumbnail .
Instead of OPTIONAL { …first pattern… }, you just need to use four optional blocks, one for each of the four triple patterns:
optional { <http://dbpedia.org/resource/Ismail_Kadare> dbpprop:placeOfBirth ?placeOfBirth }
optional { <http://dbpedia.org/resource/Ismail_Kadare> dbpprop:birthPlace ?birthPlace }
optional { <http://dbpedia.org/resource/Ismail_Kadare> dbo:birthDate ?birthDate }
optional { <http://dbpedia.org/resource/Ismail_Kadare> dbo:thumbnail ?thumbnail }
Other issues
It's worth nothing that language matching is a bit more complicated than string matching, so rather than
FILTER (LANG(?label) = 'en')
FILTER (LANG(?abstract) = 'en')
you should really be using
filter(langMatches(lang(?label),'en'))
filter(langMatches(lang(?abstract),'en'))
which allows you to retrieve results that use different language tags that are all English.
select distinct and limit 1 aren't both necessary
Notice that select distinct ensures that you don't have any duplicate rows in your results. However, limit 1 means that you'll only have one result at most anyhow, so there won't be any duplicates to remove.
Standard Namespaces
It looks like you're querying against DBpedia, so it might be worthwhile to use the same namespace prefixes that the public endpoint defines, so that you can copy and paste queries and experiment more easily. Doing that (and using a values ?x { dbpedia:Ismail_Kadare } to avoid some typing, we end up with this query:
select ?label ?abstract ?placeOfBirth ?birthPlace ?birthDate ?page ?thumbnail
where {
values ?x { dbpedia:Ismail_Kadare }
?x rdfs:label ?label ;
dbpedia-owl:abstract ?abstract ;
foaf:page ?page .
optional { ?x dbpprop:placeOfBirth ?placeOfBirth }
optional { ?x dbpprop:birthPlace ?birthPlace }
optional { ?x dbpedia-owl:birthDate ?birthDate }
optional { ?x dbpedia-owl:thumbnail ?thumbnail }
filter langMatches(lang(?label),'en')
filter langMatches(lang(?abstract),'en')
}
limit 1
The DBpedia endpoint won't return anything for that query, but that's because http://dbpedia.org/resource/Ismail_Kadare doesn't have a foaf:page property, not because the query is malformed. I don't know whether you're actually running this against DBpedia or not, so that may or not matter.