Grouping SPARQL results for the same property? - sparql

I have the following SPARQL query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?subsidiary ?employees
WHERE {
<http://dbpedia.org/resource/Google> dbo:subsidiary ?subsidiary;
dbo:numberOfEmployees ?employees .
}
And I receive the following response:
{
"head": {
"vars": [ "subsidiary" , "employees" ]
} ,
"results": {
"bindings": [
{
"subsidiary": { "type": "uri" , "value": "http://dbpedia.org/resource/YouTube" } ,
"employees": { "datatype": "http://www.w3.org/2001/XMLSchema#integer" , "type": "typed-literal" , "value": "53861" }
} ,
{
"subsidiary": { "type": "uri" , "value": "http://dbpedia.org/resource/Motorola_Mobility" } ,
"employees": { "datatype": "http://www.w3.org/2001/XMLSchema#integer" , "type": "typed-literal" , "value": "53861" }
} ,
{
"subsidiary": { "type": "uri" , "value": "http://dbpedia.org/resource/Picnik" } ,
"employees": { "datatype": "http://www.w3.org/2001/XMLSchema#integer" , "type": "typed-literal" , "value": "53861" }
}
]
}
}
The number of employees is returned multiple times because there are multiple subsidiaries associated with the queried resource. Is it possible to collapse all these subsidiaries into a list and get something more like:
{
"head": {
"vars": [ "subsidiary" , "employees" ]
} ,
"results": {
"bindings": [
{
"subsidiary": { "type": "uri" , "value": [
"http://dbpedia.org/resource/YouTube",
"http://dbpedia.org/resource/Motorola_Mobility",
"http://dbpedia.org/resource/Picnik"
]
} ,
"employees": { "datatype": "http://www.w3.org/2001/XMLSchema#integer" , "type": "typed-literal" , "value": "53861" }
}
]
}
}

Not really. There isn't an array datatype in the SPARQL results format. You can create this association by post-processing the results; ORDER BY ?subsidiary ?employees will put the rows that you want to collapse next to each other.
There is also GROUP_CONCAT, but it produces a single string: the concatenation of the values in a group, joined by a separator (a single space by default, configurable with SEPARATOR). It would need parsing. An engine may also provide extension aggregate functions.
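The post-processing mentioned above can be sketched in Python. This is only a minimal illustration, assuming the response has already been parsed from the SPARQL JSON results format into a dict; the function name `collapse_bindings` and its parameters are made up for this example.

```python
def collapse_bindings(results, group_var, list_var):
    """Group SPARQL JSON result rows by one variable and collect another into a list."""
    grouped = {}
    for row in results["results"]["bindings"]:
        key = row[group_var]["value"]
        grouped.setdefault(key, []).append(row[list_var]["value"])
    return grouped


# Response from the question, trimmed to the fields used here.
response = {
    "head": {"vars": ["subsidiary", "employees"]},
    "results": {"bindings": [
        {"subsidiary": {"type": "uri", "value": "http://dbpedia.org/resource/YouTube"},
         "employees": {"type": "typed-literal", "value": "53861"}},
        {"subsidiary": {"type": "uri", "value": "http://dbpedia.org/resource/Motorola_Mobility"},
         "employees": {"type": "typed-literal", "value": "53861"}},
        {"subsidiary": {"type": "uri", "value": "http://dbpedia.org/resource/Picnik"},
         "employees": {"type": "typed-literal", "value": "53861"}},
    ]},
}

# One key per distinct employee count, each mapped to the list of subsidiaries.
collapsed = collapse_bindings(response, "employees", "subsidiary")
```

If you go the GROUP_CONCAT route instead, the same list could be recovered by splitting the returned string on the separator, provided the separator cannot occur inside the values.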

Related

GraphDB Lucene connector: indexing rdfs:label values of a single language

I'm going to pose a question about indexes in GraphDB Lucene connector.
In the context of a multilingual rdf resource, how is it possible to index the rdfs:label values of a single language (for example english) ?
I tried with this:
PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
PREFIX : <http://www.ontotext.com/connectors/lucene#>
INSERT DATA {
inst:lexicalEntryIndex :createConnector '''
{
"types": [
"http://www.w3.org/ns/lemon/ontolex#LexicalEntry"
],
"fields": [
{
"fieldName": "type",
"propertyChain": [
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
"http://www.w3.org/2000/01/rdf-schema#label"
],
"languages": [
"en"
]
}
]
}
''' .
}
but all the languages are indexed.
Thanks in advance,
Andrea
The GraphDB Lucene Connector documentation clearly demonstrates how to index a single language.
Here is a sample snippet showing how to do it:
PREFIX luc: <http://www.ontotext.com/connectors/lucene#>
PREFIX luc-index: <http://www.ontotext.com/connectors/lucene/instance#>
INSERT DATA {
luc-index:my_index luc:createConnector '''
{
"types": ["http://www.ontotext.com/example#gadget"],
"fields": [
{
"fieldName": "name",
"propertyChain": [
"http://www.ontotext.com/example#name"
]
},
{
"fieldName": "nameLanguage",
"propertyChain": [
"http://www.ontotext.com/example#name",
"lang()"
]
}
], "entityFilter":"?nameLanguage in (\\"en\\")"
}
''' .
}
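If you generate the connector definition from code, the doubled backslashes in the entityFilter are easy to get wrong: a backslash starts an escape sequence inside a SPARQL string literal, so the JSON escape \" has to be written as \\". Here is a rough Python sketch of that step. The helper name `build_create_connector_update` is hypothetical, and it assumes the JSON never contains three consecutive single quotes.

```python
import json


def build_create_connector_update(index_name, config):
    """Return a SPARQL update that creates a Lucene connector from a Python dict."""
    config_json = json.dumps(config, indent=2)
    # Double every backslash emitted by the JSON encoder so that the
    # SPARQL parser hands the original JSON text to the connector.
    escaped = config_json.replace("\\", "\\\\")
    return (
        "PREFIX luc: <http://www.ontotext.com/connectors/lucene#>\n"
        "PREFIX luc-index: <http://www.ontotext.com/connectors/lucene/instance#>\n"
        "INSERT DATA {\n"
        f"  luc-index:{index_name} luc:createConnector '''\n{escaped}\n''' .\n"
        "}"
    )


config = {
    "types": ["http://www.ontotext.com/example#gadget"],
    "fields": [
        {"fieldName": "name",
         "propertyChain": ["http://www.ontotext.com/example#name"]},
        {"fieldName": "nameLanguage",
         "propertyChain": ["http://www.ontotext.com/example#name", "lang()"]},
    ],
    # Plain quotes here; the JSON and SPARQL escaping is added by the helper.
    "entityFilter": '?nameLanguage in ("en")',
}

update = build_create_connector_update("my_index", config)
print(update)
```

The printed update should contain the same `(\\"en\\")` form as the snippet above.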

How to run raw Elasticsearch queries on GraphDB?

I'm querying with SPARQL like this:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?entity {
?search a inst:test ;
:query '''{
{
"bool" : {
"should" : [ {
"query_string" : {
"query" : "Vie"
}
}]
}
}
}
''' ;
:entities ?entity .
}
and this is the connector:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
INSERT DATA {
inst:test :createConnector '''
{
"elasticsearchNode": "localhost:9200",
"types": [
"http://test.com#Person"
],
"fields": [
{
"fieldName": "age",
"propertyChain": [
"http://test.com#age"
],
"analyzed": false
},
{
"fieldName": "learn",
"propertyChain": [
"http://test.com#learn"
],
"fielddata": true
}
]
}
''' .
}
Data
PREFIX test: <http://test.com#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
insert data {
test:5 rdf:type test:Person;
test:age 11112;
test:learn "Vie" .
}
But I always get Error 500:
Query evaluation error: Invalid json for raw query given.
I have tried many times but can't get a raw Elasticsearch query to work.
How can I run a raw Elasticsearch query on GraphDB? Thanks a lot.
There is a bug in the documentation, which is why this error is thrown; I'll raise an issue against the GraphDB documentation. The provided example isn't valid JSON. Change the query to:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select ?entity {
?search a inst:test ;
:query '''{
"query" : {
"bool" : {
"should" : [ {
"query_string" : {
"query" : "Vie"
}
}]
}
}
}
''' ;
:entities ?entity .
}
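A quick way to catch this kind of problem before it reaches GraphDB is to run the raw query body through a JSON parser first. A small Python sketch (the helper name `is_valid_json` is just for illustration):

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Body from the failing query: the outer object opens with a bare "{" instead of a key.
broken_query = '''{
  {
    "bool": {"should": [{"query_string": {"query": "Vie"}}]}
  }
}'''

# Corrected body: the bool clause is the value of a top-level "query" key.
fixed_query = '''{
  "query": {
    "bool": {"should": [{"query_string": {"query": "Vie"}}]}
  }
}'''

print(is_valid_json(broken_query))  # False
print(is_valid_json(fixed_query))   # True
```

This only checks JSON syntax; whether Elasticsearch accepts the query is a separate matter.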

SPARQL CONSTRUCT: restrict scope of BIND variable?

I am using Jena ARQ with this query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sc: <http://iiif.io/api/presentation/2#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX as: <http://www.w3.org/ns/activitystreams#>
CONSTRUCT {?collectionIRI rdf:type sc:Collection .
?collectionIRI as:items ?manifest .
?manifest rdf:type sc:Manifest}
WHERE {GRAPH ?g {?manifest rdf:type sc:Manifest .
BIND (URI(STR(CONCAT("https://api.repository.org/orgs/example.domain.org/collections/" , STRUUID()))) AS ?collectionIRI) .
FILTER(regex(str(?manifest),"example.domain.org"))
}}
The intention is to create a single ?collectionIRI by BINDing STRUUID() to a domain name (e.g. "example.domain.org" - a query parameter).
With this query every ?manifest in the result set is bound to a different ?collectionIRI and the result looks like this:
{
  "@graph": [
    {
      "@id": "https://api.repository.org/orgs/example.domain.org/collections/0342aaf2-b772-48ea-be90-298a555ef5ab",
      "@type": "sc:Collection",
      "items": "http://example.domain.org/unescoCzechReformation/AIPDIG-NKCR__X_C_23______3CKR674-cs/"
    },
    {
      "@id": "https://api.repository.org/orgs/example.domain.org/collections/04782c88-81cf-4d7b-8ac6-f15d83f094a5",
      "@type": "sc:Collection",
      "items": "http://example.domain.org/unescoCzechReformation/AIPDIG-NKCR__VIII_G_26___38E5KUD-cs/"
    }
  ]
}
instead of the desired result:
{
  "@graph": [
    {
      "@id": "https://api.repository.org/orgs/example.domain.org/collections/0342aaf2-b772-48ea-be90-298a555ef5ab",
      "@type": "sc:Collection",
      "items": [
        "http://example.domain.org/unescoCzechReformation/AIPDIG-NKCR__VIII_B_6____30XYOX6-cs/",
        "http://example.domain.org/unescoCzechReformation/AIPDIG-NKCR__I_E_45______2KDU8A5-cs/"
      ]
    }
  ]
}
Is it possible to restrict BIND scope with CONSTRUCT?

SPARQL query to access inside a JSON-LD array

Here is my JSON-LD:
"locationCreated": {
  "@type": "Place",
  "@id": "http://test.fr/city/pacific-grove",
  "geo": {
    "@type": "GeoCoordinates",
    "@id": "http://test.fr/city/pacific-grove/geo",
    "latitude": "36.6177374",
    "longitude": "-121.9166215"
  },
  "name": "Pacific Grove",
  "alternateName": "Pacific Grove, CA, USA"
}
I want to get the latitude and longitude values with SPARQL. Currently, when I run my query, I only get the URI "http://test.fr/city/pacific-grove" of locationCreated.
SPARQL Query:
prefix schema: <http://schema.org/>
select ?locationCreated
where {
?x schema:locationCreated ?locationCreated
}
LIMIT 100
Thank you for your answer.
This should work given your sample data:
prefix schema: <http://schema.org/>
select ?locationCreated ?lat ?long
where {
?x schema:locationCreated ?locationCreated .
?locationCreated schema:geo ?geoData .
?geoData schema:latitude ?lat .
?geoData schema:longitude ?long .
}
LIMIT 100
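Each triple pattern in that query follows one level of nesting in your JSON-LD. As a rough illustration of the same path (not a replacement for the SPARQL), here is the equivalent walk over the parsed JSON in Python. The identifier and type keys are left out for brevity, and the outer braces are added so the fragment parses on its own.

```python
import json

doc = json.loads('''
{
  "locationCreated": {
    "geo": {
      "latitude": "36.6177374",
      "longitude": "-121.9166215"
    },
    "name": "Pacific Grove",
    "alternateName": "Pacific Grove, CA, USA"
  }
}
''')

place = doc["locationCreated"]  # ?x schema:locationCreated ?locationCreated
geo = place["geo"]              # ?locationCreated schema:geo ?geoData
lat = float(geo["latitude"])    # ?geoData schema:latitude ?lat
lon = float(geo["longitude"])   # ?geoData schema:longitude ?long

print(lat, lon)
```

Note that the coordinates are strings in the sample data, so the SPARQL query will return them as plain literals too unless they are typed in the source.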

Getting JSON/Dictionary of all properties in DBPedia for a page/resource from Wikipedia Infobox

I'm trying to get a representation of the infobox of articles on Wikipedia in a Python project. I had tried using the Wikipedia API, but the data it outputs is dirty, so I'm trying to move to DBpedia. I need to be able to query by page name, and receive a dictionary of the property names and their values for that page.
For example, for the query for London, the returned dictionary would contain:
{dbpedia-owl:PopulatedPlace/areaMetro : 8382.0,
dbpedia-owl:PopulatedPlace/areaTotal : 1572.0
.....
dbpedia-owl:populationDensity : 5285.0
.....
}
etc., and from this I would be able to read all the keys that were in the Infobox. I did try using the SPARQL query of
describe <http://dbpedia.org/resource/London>
but that returned tonnes of unnecessary data (the full set of triples associated with London), which is many orders of magnitude more than I need.
How can I write a query to just get the infobox properties, as above?
You might be able to get what you want by selecting properties and objects where the property IRI begins with something you're interested in (e.g., http://dbpedia.org/ontology/). You could use a query like the following. (It takes advantage of the fact that a prefix by itself, e.g., dbpedia-owl:, is still a legal IRI, and you can use str on it. You could also just use the string http://dbpedia.org/ontology/.)
select ?p ?o where {
dbpedia:London ?p ?o
filter strstarts(str(?p),str(dbpedia-owl:))
}
The JSON results aren't quite in the format you're looking for, but are like this:
{ "head": { "link": [], "vars": ["p", "o"] },
"results": { "distinct": false, "ordered": true, "bindings": [
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://mapoflondon.uvic.ca/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.british-history.ac.uk/place.aspx?region=1" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.london.gov.uk/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.museumoflondon.org.uk/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.tfl.gov.uk/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.visitlondon.com/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "https://london.gov.uk/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink" } , "o": { "type": "uri", "value": "http://www.britishpathe.com/workspace.php?id=2449&delete_record=75105/" }},
{ "p": { "type": "uri", "value": "http://dbpedia.org/ontology/thumbnail" } , "o": { "type": "uri", "value": "http://commons.wikimedia.org/wiki/Special:FilePath/Greater_London_collage_2013.png?width=300" }},
...
That sort of makes sense though, because there's not necessarily a unique value for each property, so a Python dict as in the question probably isn't the best result format (but it'd be easy to create one where multiple values are put into a list).
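For example, a dict of lists could be built from the parsed JSON results along these lines. This is just a sketch: `bindings_to_dict` is a made-up helper name, and the sample is a three-row excerpt of the results shown above.

```python
from collections import defaultdict


def bindings_to_dict(results, key_var="p", value_var="o"):
    """Map each property IRI to the list of all values bound to it."""
    props = defaultdict(list)
    for row in results["results"]["bindings"]:
        props[row[key_var]["value"]].append(row[value_var]["value"])
    return dict(props)


sample = {
    "head": {"link": [], "vars": ["p", "o"]},
    "results": {"distinct": False, "ordered": True, "bindings": [
        {"p": {"type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink"},
         "o": {"type": "uri", "value": "http://www.london.gov.uk/"}},
        {"p": {"type": "uri", "value": "http://dbpedia.org/ontology/wikiPageExternalLink"},
         "o": {"type": "uri", "value": "http://www.tfl.gov.uk/"}},
        {"p": {"type": "uri", "value": "http://dbpedia.org/ontology/thumbnail"},
         "o": {"type": "uri", "value": "http://commons.wikimedia.org/wiki/Special:FilePath/Greater_London_collage_2013.png?width=300"}},
    ]},
}

infobox = bindings_to_dict(sample)
```

Every value ends up in a list, even when a property has only one value, which keeps the calling code uniform.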
Also note that the properties that begin with dbpedia-owl: are actually the DBpedia Ontology properties, which have much cleaner data than the raw infobox values, for which properties beginning with dbpprop: are used. You can read more about the different datasets at 4.3. Infobox Data. A query for the raw properties would be pretty much the same though:
select ?p ?o where {
dbpedia:London ?p ?o
filter strstarts(str(?p),str(dbpprop:))
}
To get the entire data of a page in JSON format, you can also use the following method.
Suppose you want the JSON data for Taj_Mahal and you have the link:
http://dbpedia.org/resource/Taj_Mahal
Change this URL by replacing /resource/ with /data/ and appending the .json extension, as shown below:
http://dbpedia.org/data/Taj_Mahal.json
This returns all the DBpedia data matching 'Taj_Mahal' as JSON.
Then expand the 'http://dbpedia.org/resource/Taj_Mahal' key in the JSON to get only the data related to that page.
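In Python, the URL rewrite and the final lookup might look like the sketch below. Both helper names are hypothetical, the download step is left out, and the `document` dict is only a placeholder standing in for the parsed response, assuming it is keyed by resource IRI as described above.

```python
def data_url(resource_url):
    """Turn a DBpedia /resource/ URL into the matching /data/ JSON URL."""
    return resource_url.replace("/resource/", "/data/", 1) + ".json"


def properties_of(document, resource_url):
    """Pick the entry for one resource out of the parsed JSON document."""
    return document.get(resource_url, {})


resource = "http://dbpedia.org/resource/Taj_Mahal"
print(data_url(resource))  # http://dbpedia.org/data/Taj_Mahal.json

# Placeholder document (not real DBpedia data): top-level keys are resource IRIs.
document = {
    "http://dbpedia.org/resource/Taj_Mahal": {"http://example.org/someProperty": []},
    "http://example.org/other-resource": {},
}
page_data = properties_of(document, resource)
```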