SPARQL - Insert data from remote endpoint


How can I query a remote endpoint (such as the DBpedia or Wikidata endpoints) and insert the resulting triples into a local graph? So far, I know that there are commands like INSERT, ADD, COPY etc. which can be used for such tasks. What I don't understand is how to address a remote endpoint while updating my local graph. Could someone provide a minimal example or the main steps?
I'm using Apache Jena Fuseki v2 on Windows and this is my query so far:
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wd: <http://www.wikidata.org/entity/>
INSERT
{ GRAPH <???> { ?s ?p ?o } } #don't know what to insert here for "GRAPH"
WHERE
{ GRAPH <???> #don't know what to insert here for "GRAPH" either
{ #a working example query for wikidata:
?s wdt:P31 wd:Q5. #humans
?s wdt:P54 wd:Q43310. #members of the German national football team
?s wdt:P1344 wd:Q79859. #part of world cup 2014
?s ?p ?o.
}
}
The local endpoint I'm querying is http://localhost:3030/mylocaldb/update. I've read that /update is necessary to edit the database (I'm not sure if I understood that correctly, though).
Is my approach correct so far? Or is something more needed, such as additional scripting outside SPARQL?

Taken from the SPARQL 1.1 Update W3C specs:
The syntax is
( WITH IRIref )?
INSERT QuadPattern
( USING ( NAMED )? IRIref )*
WHERE GroupGraphPattern
If the INSERT template specifies GRAPH blocks then these will be the graphs affected. Otherwise, the operation will be applied to the default graph, or, respectively, to the graph specified in the WITH clause, if one was specified. If no USING (NAMED) clause is present, then the pattern in the WHERE clause will be matched against the Graph Store, otherwise against the dataset specified by the USING (NAMED) clauses. The matches against the WHERE clause create bindings to be applied to the template for determining triples to be inserted (following the same rules as for DELETE/INSERT).
So this basically means, you can omit the GRAPH definition from the INSERT part if you want to store it in the default graph, otherwise it will be the graph in which you want to store the data.
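As a rough illustration of that rule, here is a small Python sketch that assembles an INSERT update string with or without a GRAPH block in the template. The helper name build_insert_update and the graph IRI http://example.org/graph/players are made up for this example; they are not part of the spec or of Jena.

```python
from typing import Optional


def build_insert_update(where_pattern: str,
                        target_graph: Optional[str] = None,
                        template: str = "?s ?p ?o") -> str:
    """Assemble a SPARQL 1.1 INSERT ... WHERE update string.

    With target_graph=None the template has no GRAPH block, so the
    triples end up in the default graph. Otherwise the template is
    wrapped in GRAPH <target_graph> { ... }.
    """
    if target_graph is None:
        insert_part = "{ " + template + " }"
    else:
        insert_part = "{ GRAPH <" + target_graph + "> { " + template + " } }"
    return "INSERT " + insert_part + "\nWHERE { " + where_pattern + " }"


# Hypothetical usage: same pattern, two different destinations.
into_default_graph = build_insert_update("?s ?p ?o")
into_named_graph = build_insert_update(
    "?s ?p ?o", target_graph="http://example.org/graph/players")
print(into_default_graph)
print(into_named_graph)
```

The only difference between the two generated updates is the GRAPH wrapper in the INSERT template; the WHERE clause is untouched.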
Regarding the WHERE clause, usually you would have to use the SERVICE keyword here to apply federated querying on the Wikidata endpoint (https://query.wikidata.org/bigdata/namespace/wdq/sparql):
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wd: <http://www.wikidata.org/entity/>
INSERT
{ ?s ?p ?o }
WHERE
{ SERVICE <https://query.wikidata.org/bigdata/namespace/wdq/sparql>
{ #a working example query for wikidata:
?s wdt:P31 wd:Q5. #humans
?s wdt:P54 wd:Q43310. #members of the German national football team
?s wdt:P1344 wd:Q79859. #part of world cup 2014
?s ?p ?o.
}
}
I tested it with Apache Jena and it inserts 4462 triples into my local dataset.
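On the scripting part of the question: no scripting is strictly needed, since the update can be pasted into the Fuseki UI, but if you want to send it from a program, a minimal Python sketch could look like the one below. It assumes the /update URL from the question, no authentication, and a dataset that accepts updates. The function names are my own. The request follows the SPARQL 1.1 Protocol option of POSTing the update text directly with the application/sparql-update content type.

```python
import urllib.request

# Update endpoint taken from the question; adjust to your own dataset.
FUSEKI_UPDATE_URL = "http://localhost:3030/mylocaldb/update"


def build_update_request(update: str,
                         endpoint: str = FUSEKI_UPDATE_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a SPARQL update."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )


def send_update(update: str, endpoint: str = FUSEKI_UPDATE_URL) -> int:
    """Send the update and return the HTTP status code.

    Raises urllib.error.HTTPError / URLError if the server rejects the
    update or cannot be reached.
    """
    with urllib.request.urlopen(build_update_request(update, endpoint)) as response:
        return response.status
```

Calling send_update with the INSERT ... SERVICE text above as its argument would run the federated update on the local server; the remote Wikidata call is made by the server itself, not by the script.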

Related

Converting a bnode to a string in graphdb

I have an RDF file where the resources are identified with nodeIDs instead of URIs. I have imported them into Ontotext GraphDB, and would like to generate URIs based on the nodeID (which I preserved during import). For example, I am trying to map this triple
_:C00456 rdf:type skos:Concept
to this:
<https://example.com/data/C00456> rdf:type skos:Concept
Unfortunately, if ?s is a BNODE, STR(?s) is an empty string in graphdb. xsd:string(?s), ditto. IRI(?s), you guessed it. Is there any function that will expose the form of the bnode as a string, so that I can build a URI from it? I went through the list of functions in the sparql 1.1 specification and could not see any.
PS It would have been nice if graphdb would just convert the nodeIDs to URIs during import (I specified a prefix for relative names), but it went by the book and turned them into bnodes. If I overlooked something, I'll be glad to be set right.

You can use the spif:buildString function to convert a BNode to String and then to IRI.
Here is a sample query:
PREFIX spif: <http://spinrdf.org/spif#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT {?sIRI ?p ?o} WHERE {
?s ?p ?o .
FILTER (isBlank(?s))
BIND (IRI(spif:buildString("http://my/namespace/{?1}", ?s)) as ?sIRI)
} LIMIT 10
The function is documented here:
https://graphdb.ontotext.com/documentation/10.0/sparql-functions-reference.html?highlight=buildstring#sparql-spin-functions-and-magic-predicates
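If rewriting the data before import is an option, the same mapping can also be done outside the store. The following Python sketch is not a GraphDB feature; it assumes the file has first been serialised as N-Triples (one triple per line) and that the blank node labels are usable as the tail of an IRI. The function names and the namespace argument are my own choices.

```python
def bnode_to_iri(term: str, namespace: str) -> str:
    """Turn a blank node label such as _:C00456 into <namespace + C00456>."""
    if term.startswith("_:"):
        return "<" + namespace + term[2:] + ">"
    return term  # IRIs and literals are returned unchanged


def rewrite_ntriples_line(line: str, namespace: str) -> str:
    """Rewrite blank nodes in subject/object position of one N-Triples line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return line  # blank line or comment: leave untouched
    # Subject and predicate never contain whitespace in N-Triples;
    # everything after them is the object plus the closing dot.
    subject, predicate, rest = stripped.split(None, 2)
    obj = rest.rstrip()
    if obj.endswith("."):
        obj = obj[:-1].rstrip()
    return "{} {} {} .".format(
        bnode_to_iri(subject, namespace), predicate, bnode_to_iri(obj, namespace))


# The triple from the question, written out as N-Triples.
example = ("_:C00456 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
           "<http://www.w3.org/2004/02/skos/core#Concept> .")
print(rewrite_ntriples_line(example, "https://example.com/data/"))
```

Because literals start with a double quote and IRIs with an angle bracket, only terms that begin with _: are touched.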

A problem with querying a graph with SPARQL on Bioportal

I am querying an ontology on the Bioportal endpoint. The ontology (NIF) is stored as a graph, so I put it in the FROM clause as the endpoint instructed.
SELECT DISTINCT ?p
FROM <http://bioportal.bioontology.org/ontologies/NIF>
WHERE{
?p a rdf:Property
}
limit 100
However, the results came back showing a few properties related to NIF and others belonging to a different vocabulary, SKOS (Simple Knowledge Organization System).
The Bioportal documentation says that it maps some properties to SKOS properties, so I thought the results might be fine.
However, I wanted to check that I am querying the correct graph. So I used the query below to count the triples, since I know that NIF has around 3.6 million triples.
SELECT (count (*) as ?nodes)
FROM <http://bioportal.bioontology.org/ontologies/NIF>
WHERE{
?s ?p ?o
}
This returned 7984 both with and without the FROM clause, so I guess I am using COUNT incorrectly.
How can I make sure that I am querying only the NIF ontology? And how do I count its nodes (triples)?
Thanks :)

Try using the SERVICE keyword.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT (count (*) as ?nodes)
WHERE
{
SERVICE <http://bioportal.bioontology.org/ontologies/NIF>
{
?s ?p ?o
}
}
If this fails, the service you are connecting to may be incorrect or down.
Try the example below, which connects to DBpedia:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT (count (*) as ?nodes)
WHERE
{
SERVICE <http://DBpedia.org/sparql>
{
?s ?p ?o
}
}
By the way, I can't access the URL http://bioportal.bioontology.org/ontologies/NIF. It seems to be unavailable or down.
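One thing worth keeping apart here: SERVICE expects the URL of a SPARQL endpoint, while FROM (or the protocol's default-graph-uri parameter) names a graph inside the endpoint's dataset. The Python sketch below only builds request URLs and sends nothing. The endpoint http://example.org/sparql is a placeholder, not the real Bioportal endpoint, and whether a given server honours default-graph-uri is an assumption you would have to check. The per-graph count query is one way to see which graphs the endpoint actually holds and how large each one is.

```python
from urllib.parse import urlencode

# Count triples per named graph, largest first.
COUNT_PER_GRAPH = """
SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples)
"""


def build_query_url(endpoint, query, default_graph=None):
    """Build a SPARQL 1.1 Protocol GET URL for a query.

    default_graph, if given, is sent as default-graph-uri, which is the
    protocol-level counterpart of a FROM clause.
    """
    params = [("query", query)]
    if default_graph is not None:
        params.append(("default-graph-uri", default_graph))
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint; the graph IRI is the one from the question.
url = build_query_url(
    "http://example.org/sparql",
    "SELECT (COUNT(*) AS ?nodes) WHERE { ?s ?p ?o }",
    default_graph="http://bioportal.bioontology.org/ontologies/NIF",
)
print(url)
```

If the count is identical with and without the graph restriction, as in the question, the per-graph query can help show whether the graph IRI exists in the dataset at all.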

How to reference different repositories in a SPARQL query (federated)?

I would like to know which triples are in one repository but not in another repository.
To do this, I need to reference the two repositories in a federated query.
I'm using AllegroGraph.

Example on FactForge (i.e., using SPARQL 1.1 federated queries, not AllegroGraph's federated stores):
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
CONSTRUCT { ?s ?p ?o }
{
VALUES (?s) {(dbr:Yekaterinburg)}
?s ?p ?o.
MINUS
{
SERVICE <http://dbpedia.org/sparql>
{
VALUES (?s) {(dbr:Yekaterinburg)}
?s ?p ?o
}
}
}
Obviously, one can't subtract DBpedia from DBpedia Live in this way: at least one result set should be relatively small. It seems that AllegroGraph's federated stores can't help with such a subtraction task. You should probably divide your triples into named graphs rather than into separate stores.
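For small result sets, what the MINUS query computes is a plain set difference over (subject, predicate, object) triples, so it can also be done on the client after fetching both sides. A toy Python sketch follows; the data and the function name are invented for illustration. Note that blank nodes cannot be compared across stores this way.

```python
def triples_only_in_first(first, second):
    """Return the triples present in `first` but absent from `second`."""
    return set(first) - set(second)


# Hypothetical toy data standing in for the two repositories.
local_repo = [
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:q", "ex:c"),
]
remote_repo = [
    ("ex:a", "ex:p", "ex:b"),
]

difference = triples_only_in_first(local_repo, remote_repo)
print(difference)
```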

With AllegroGraph there are 2 ways this can be achieved:
(1) AllegroGraph allows you to federate multiple local and/or remote triple stores into a single virtual store which can then be queried as a single triple store.
(2) You can use SPARQL 1.1 federated queries which use the SERVICE keyword. Here is an example from the Learning SPARQL book:
PREFIX gp: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/people/>
SELECT ?dbpProperty ?dbpValue ?gutenProperty ?gutenValue
WHERE
{
SERVICE <http://DBpedia.org/sparql>
{
<http://dbpedia.org/resource/Joseph_Hocking> ?dbpProperty ?dbpValue .
}
SERVICE <http://wifo5-04.informatik.uni-mannheim.de/gutendata/sparql>
{
gp:Hocking_Joseph ?gutenProperty ?gutenValue .
}
}

Default RDFS inference in Virtuoso 7.x

This is a question about simple RDFS inference in Virtuoso 7.1 and DBpedia. I have a Virtuoso instance which was installed using this link as a reference. Now if I query the endpoint with the following query :
Select ?s
where { ?s a <http://dbpedia.org/ontology/Cricketer> . }
I get a list of cricketers that are present in DBpedia. Suppose I want all the athletes (all sports, cricketers included, where Cricketer is rdfs:subClassOf Athlete), I just try the query
Select ?s
where { ?s a <http://dbpedia.org/ontology/Athlete> . }
For this I get all the correct answers. However I have an issue with rdfs:subPropertyOf. For example the property <http://dbpedia.org/ontology/capital> is the sub-property of <http://dbpedia.org/ontology/administrativeHeadCity>. So suppose I want all the capitals and the administrative head cities and I issue the query
Select ?s ?o
where { ?s <http://dbpedia.org/ontology/administrativeHeadCity> ?o . }
I get zero results. Why is it that subproperty inference isn't working in DBpedia? Is there something else that I have missed?

You've missed a couple of things.
First, Virtuoso is at 7.2.4 as of April 2016, and this version is strongly recommended over the old version from 2014, for many reasons.
@AKSW's advice about property paths will work much better with this later version, too.
Then, you can use inference on the DBpedia endpoint (including your local mirror) through the input:inference pragma, as in the query below:
DEFINE input:inference "http://dbpedia.org/resource/inference/rules/dbpedia#"
SELECT ?place ?HeadCity
WHERE
{
?place <http://dbpedia.org/ontology/administrativeHeadCity> ?HeadCity
}
ORDER BY ?place ?HeadCity
You can also see a list of predefined inference rule sets.
And... more of the relevant documentation.
(ObDisclaimer: I work for OpenLink Software, producer of Virtuoso.)

There is no automatic inference enabled in DBpedia. DBpedia itself is a dataset loaded into Virtuoso.
The reason that you get all instances with a superclass like dbo:Athlete is that subclass-inheritance is fully materialized in the current DBpedia dataset:
(s rdf:type c1), (c1 rdfs:subClassOf c2) -> (s rdf:type c2)
That means that for each individual x, the DBpedia dataset contains all the classes C it belongs to - in fact also the superclasses.
That procedure was not done for subproperty-inheritance, i.e.,
(s p1 o), (p1 rdfs:subPropertyOf p2) -> (s p2 o)
You can solve that problem with SPARQL 1.1 property paths:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?o WHERE {
?p rdfs:subPropertyOf* <http://dbpedia.org/ontology/administrativeHeadCity> .
?s ?p ?o .
}
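To make it clearer what the property path does, here is a small Python model of it on toy data. It first collects every property reachable through zero or more rdfs:subPropertyOf steps (zero steps means the property itself is included), then returns the triples that use any of those properties. The abbreviated names such as dbo:capital and the sample triples are illustrative only and are not fetched from DBpedia.

```python
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"


def subproperties_of(triples, prop):
    """All p such that p rdfs:subPropertyOf* prop (prop itself included)."""
    found = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == RDFS_SUBPROPERTY_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def values_with_inference(triples, prop):
    """(subject, object) pairs stated with prop or any of its subproperties."""
    props = subproperties_of(triples, prop)
    return {(s, o) for s, p, o in triples if p in props}


# Hypothetical toy graph mirroring the situation in the question.
toy_graph = [
    ("dbo:capital", RDFS_SUBPROPERTY_OF, "dbo:administrativeHeadCity"),
    ("dbr:Germany", "dbo:capital", "dbr:Berlin"),
]

pairs = values_with_inference(toy_graph, "dbo:administrativeHeadCity")
print(pairs)
```

Without the closure step, a lookup for dbo:administrativeHeadCity on this toy graph finds nothing, which is the zero-result situation from the question.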

Obtain linked resources with SPARQL query on DBpedia

Hello everybody, I'm using a SPARQL query to retrieve the properties and values of a specified resource. For example, if I ask for Barry White, I obtain birth place, associatedBand, recordLabels and so on.
For other instances, however, such as "Hammerfall", I obtain only these results:
Query results
But I want the properties and values shown on this page: Correct results.
My query is:
PREFIX db: <http://dbpedia.org/resource/>
PREFIX prop: <http://dbpedia.org/property/>
PREFIX onto: <http://dbpedia.org/ontology/>
SELECT ?property ?value
WHERE { db:Hammerfall ?property ?value }
Can anyone tell me how to access the correct resource and obtain the correct properties and values in every case?

select ?p ?o { dbpedia:HammerFall ?p ?o }
The particular prefix doesn't matter; I just used dbpedia: because it's predefined on the endpoint as http://dbpedia.org/resource/, just like your db:. The issue is that HammerFall has a majuscule F in the middle, but your query uses a minuscule f.
As an alternative, since the results for Hammerfall (with a minuscule f) do include
http://dbpedia.org/ontology/wikiPageRedirects http://dbpedia.org/resource/HammerFall
you could use a property path to follow any wikiPageRedirects paths:
select ?p ?v {
dbpedia:Hammerfall dbpedia-owl:wikiPageRedirects* ?hammerfall .
?hammerfall ?p ?v
}
See Retrieving dbpedia-owl:type value of resource with dbpedia-owl:wikiPageRedirect value? for more about that approach.
