How to completely delete an RDF node/instance from a graph database? - sparql

Good day, I am using GraphDB to store some triples. This particular RDF node uses a regular URI, http://example/regular/uri. I want not only to delete all properties attached to this node but also to delete the node itself, so that http://example/regular/uri no longer appears anywhere in the graph database.
So far I am only able to delete all properties, but not the actual RDF node itself. It seemed rather simple, but the more I research online, the more this seems impossible short of clearing the complete graph.
I have tried simple "delete where" queries as shown in example 11 of the SPARQL documentation, and I have also tried "delete where" queries using the wildcard operator.
Is there a way to delete such RDF nodes?
Thanks in advance!

A node exists in a graph only as long as at least one triple has that node in subject or object position. So the easiest way is to issue two delete statements: one deleting all statements with the node in subject position, and one deleting all statements with the node in object position. If you need or want to do it in a single operation, you can do that as well, for example with a UNION of both patterns.

Here is a sample that deletes uri://node/to/delete from uri://my/graph:
DELETE { GRAPH <uri://my/graph> {
    ?s ?p ?o .
}}
USING <uri://my/graph>
WHERE {
    {
        ?s ?p ?o . VALUES ?s { <uri://node/to/delete> }
    } UNION {
        ?s ?p ?o . VALUES ?o { <uri://node/to/delete> }
    }
}
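The two-statement approach mentioned in this answer could look roughly like the sketch below. It reuses the same placeholder IRIs (uri://my/graph and uri://node/to/delete) and sends both operations as one SPARQL Update request, separated by a semicolon. It has not been run against GraphDB, so treat it as an illustration rather than a tested recipe.

```sparql
# First operation: remove every triple that has the node as its subject.
DELETE WHERE {
  GRAPH <uri://my/graph> { <uri://node/to/delete> ?p ?o }
} ;
# Second operation: remove every triple that has the node as its object.
DELETE WHERE {
  GRAPH <uri://my/graph> { ?s ?p <uri://node/to/delete> }
}
```

Once both operations have run, no triple in that graph mentions the node any more, which is all that "deleting the node" can mean in RDF.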

Related

SparQL: Replace subject URIs

We have graphs containing oa:Annotations with certain properties. Since I am working on a local copy of the server, it would be useful to change these URIs to point to localhost. Based on the book I am reading, I thought the update below should work, but it does not seem to change anything. The server still returns a 204, and I am using the writing port and URL (/update), so I should definitely be able to change things. There is no error message.
PREFIX oa: <http://www.w3.org/ns/oa#>
DELETE
    { GRAPH ?g { ?oldIRI ?p ?o } }
INSERT
    { GRAPH ?g { ?newIRI ?p ?o } }
WHERE
{
    GRAPH ?g {
        ?oldIRI a oa:Annotation .
        ?oldIRI ?p ?o .
    }
    BIND(
        CONCAT("http://localhost:80",
               SUBSTR( STR(?oldIRI),
                       34,
                       STRLEN(STR(?oldIRI)) )
        ) AS ?newIRI
    )
    FILTER(CONTAINS(?oldIRI, "part_of_old_url"))
}
Any idea why this does not have the effect I hoped for? The book I use as a reference has "recipes" for changing properties and their values, but no example of changing subjects, so I assume there is a more general problem?
Update: using STR()
As suggested in the comments, I used CONTAINS(STR(?oldIRI), "part_of_old_url") to convert oldIRI to a string. I am not fully aware of all changes, but this is what I can say: (I have backups, no worry :D)
PREFIX oa: <http://www.w3.org/ns/oa#>
SELECT *
WHERE {
    GRAPH ?g {
        ?iri a oa:Annotation .
    }
} LIMIT 100
This query has zero results. It's a default query I often used to get some annotation uris for looking into things.
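One possible explanation, offered as a sketch rather than a verified diagnosis: CONTAINS expects string arguments, so applying it to a bare IRI raises an error and the FILTER discards every solution, which would explain why the original update changed nothing. With STR() the WHERE clause matches, but CONCAT returns a literal, and the SPARQL Update specification skips any template triple with a literal in subject position. The DELETE would then go through while the INSERT silently adds nothing, which fits the empty result. A variant that wraps the new value in IRI() might look like this (the offset 34 and "part_of_old_url" are taken from the question unchanged):

```sparql
PREFIX oa: <http://www.w3.org/ns/oa#>
DELETE { GRAPH ?g { ?oldIRI ?p ?o } }
INSERT { GRAPH ?g { ?newIRI ?p ?o } }
WHERE {
  GRAPH ?g {
    ?oldIRI a oa:Annotation .
    ?oldIRI ?p ?o .
  }
  # CONTAINS needs a string, so the IRI is converted with STR() first.
  FILTER(CONTAINS(STR(?oldIRI), "part_of_old_url"))
  # CONCAT yields a literal; IRI() turns it back into an IRI so that it
  # is legal in subject position of the INSERT template.
  BIND(IRI(CONCAT("http://localhost:80", SUBSTR(STR(?oldIRI), 34))) AS ?newIRI)
}
```

This only rewrites triples where the annotation is the subject. Triples that point to the annotation as an object would need a second, mirrored update.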

How to retrieve and delete a JSON-LD tree from a triple store using SPARQL?

I need to save and delete JSON-LD trees in a triple store. These trees often have subtrees that are represented as blank nodes. It is easy to transform them into RDF and use SPARQL to save them. But, when I need to retrieve/delete them, it is impossible to get all blank nodes. I am using the following query to get all blank nodes (but only up to 3 levels deep):
CONSTRUCT {
    <instanceURI> ?p ?o.
    ?o ?p1 ?o1.
    ?o1 ?p2 ?o2.
}
WHERE {
    <instanceURI> ?p ?o.
    OPTIONAL {
        FILTER (isBlank(?o))
        ?o ?p1 ?o1.
        OPTIONAL {
            FILTER (isBlank(?o1))
            ?o1 ?p2 ?o2.
        }
    }
}
In another question (How to recursively expand blank nodes in SPARQL construct query?) they suggest the use of DESCRIBE. But DESCRIBE returns more things than I want. Can I make a subquery to the graph DESCRIBE returns? If so, how?
This use case (retrieve/delete JSON-LD instances with arbitrary blank nodes) seems to be very common for people working with JSON-LD. A solution for it shouldn't be hard to implement. Am I doing something wrong? Is there a standard way of doing this?

Setting to query only Default Graph and exclude Named Graphs

In the GraphDB documentation, I see that "the dataset’s default graph contains the merge of the database’s default graph AND all the database named graphs." This means that "if a statement ex:x ex:y ex:z exists in the database in the graph ex:g" then a query such as SELECT * { ?s ?p ?o } will return the triple ex:x ex:y ex:z
I am wondering if there is a setting which can be triggered either via the web interface or via the RDF4J/OpenRDF API which will disable this behavior in a specified GraphDB repository. That is, for the purposes of my project I would prefer to have triples which are stored in named graphs to only appear in results which specifically query that named graph.
I have not seen anything like this searching through the documentation or on the settings available on the web interface, but maybe somebody here knows something I don't.
EDIT: I am not looking for a SPARQL solution to this problem. I know that I can query just the default graph using SPARQL, but I want to be able to use the query SELECT * { ?s ?p ?o } and only see results which are in the default graph by default.
GraphDB and RDF4J interpret the default graph differently from Jena. The only easy way to query only the explicit statements in the default graph is to use the special graph sesame:nil. The SPARQL-based solution is:
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
SELECT ?s ?p ?o
FROM sesame:nil
WHERE {
?s ?p ?o .
} LIMIT 100
I don't think there is an easy non-SPARQL solution, such as changing a configuration option or even using this special graph over the SPARQL Graph Store protocol.

Pure-SPARQL migration of data from one endpoint to another?

It looks like this question has been raised before, but subsequently deleted?!
For data in one SQL table, I can easily replicate the structure and then migrate the data to another table (or database?).
CREATE TABLE new_table
AS (SELECT * FROM old_table);
SELECT *
INTO new_table [IN externaldb]
FROM old_table
WHERE condition;
Is there something analogous for RDF/SPARQL? Something that combines a select and an insert into one SPARQL statement?
Specifically, I use Karma, which publishes data to an embedded OpenRDF/Sesame endpoint. There's a text box on the GUI for the endpoint, so I can change it to a free-standing RDF4J, since RDF4J is a fork of Sesame.
Unfortunately, I get an error like invalid SPARQL endpoint from Karma when I put the address for a Virtuoso, Stardog or Blazegraph endpoint in the endpoint text box. I suspect it might be possible to modify and recompile Karma, or (more realistically), I could write a small tool with the Jena or RDF4J libraries to select into RAM or scratch disk space and then insert into the other endpoint.
But if there's a pure-SPARQL solution, I'd sure like to hear it.
In SPARQL, you can only specify the source endpoint. Therefore, a partial pure-SPARQL solution would be to run the following update on your target triplestore:
INSERT { ?s ?p ?o }
WHERE { SERVICE <http://source/sparql>
    {
        ?s ?p ?o
    }
}
This will copy over all triples from the (remote) source's default graph to your target store, but it doesn't copy over any named graphs. To copy over any named graphs as well, you can execute this in addition:
INSERT { GRAPH ?g { ?s ?p ?o } }
WHERE { SERVICE <http://source/sparql>
    {
        GRAPH ?g {
            ?s ?p ?o
        }
    }
}
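To mirror the "WHERE condition" part of the SQL in the question, the pattern inside SERVICE can be narrowed so that only matching triples are copied. The sketch below assumes a hypothetical class IRI, http://example.org/SomeClass, standing in for whatever condition applies. It keeps the placeholder endpoint http://source/sparql from above and has not been tested against any particular store.

```sparql
INSERT { ?s ?p ?o }
WHERE {
  SERVICE <http://source/sparql> {
    # Hypothetical condition: only copy resources typed as SomeClass.
    ?s a <http://example.org/SomeClass> .
    ?s ?p ?o .
  }
}
```

As with the unfiltered version, this runs on the target store and pulls from the source, so the target must support federated queries (SERVICE) inside updates.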
If you're not hung up on pure SPARQL, though, different toolkits and frameworks offer all sorts of options. For example, using RDF4J's Repository API you could wrap both source and target in a SPARQLRepository proxy (or use an HTTPRepository if either one is an actual RDF4J store), and then run copy API operations. There are many different ways to do that; one possible approach (disclaimer: I didn't test this code fragment) is this:
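// Imports assume RDF4J package names (org.eclipse.rdf4j).
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sparql.SPARQLRepository;
import org.eclipse.rdf4j.repository.util.RDFInserter;

SPARQLRepository source = new SPARQLRepository("http://source/sparql");
source.initialize();
SPARQLRepository target = new SPARQLRepository("http://target/sparql");
target.initialize();
// Stream every statement exported from the source straight into the target.
try (RepositoryConnection sourceConn = source.getConnection();
     RepositoryConnection targetConn = target.getConnection()) {
    sourceConn.export(new RDFInserter(targetConn));
}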

Apache Marmotta Delete All Graphs using SPARQL Update

I am trying to clear all graphs loaded into my Apache Marmotta instance. I have tried several SPARQL queries to do so, but I am not able to remove the RDF/XML graph that I imported. What is the appropriate syntax to do so?
Try this query:
DELETE WHERE { ?x ?y ?z }
Be careful, as it deletes every triple in the database, including the built-in ones of Marmotta.
A couple of things I did for experimenting:
I downloaded the source code of Marmotta and used the Silver Searcher tool for searching for DELETE queries with the following command:
ag '"DELETE '
This did not help much.
I navigated to the Marmotta installation directory and watched the debug log:
tail -f marmotta-home/log/marmotta-main.log
This showed that the parser is not able to process the query DELETE DATA { ?s ?p ?o }. The exception behind the "error while executing update" was:
org.openrdf.sail.SailException: org.openrdf.rio.RDFParseException: Expected an RDF value here, found '?' [line 8]
[followed by a long stacktrace]
This shows that the parser does not allow variables in the query after DELETE DATA.
Based on a related Stack Overflow answer, I tried CLEAR / CLEAR GRAPH / DROP / DROP GRAPH, but they did not work.
I tried many combinations of DELETE, *, and ?s ?p ?o, and accidentally got it working with the DELETE WHERE construct. According to the W3C documentation:
The DELETE WHERE operation is a shortcut form for the DELETE/INSERT operation where bindings matched by the WHERE clause are used to define the triples in a graph that will be deleted.
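To make the quoted definition concrete, the short form DELETE WHERE { ?x ?y ?z } corresponds to the following long form, in which the same pattern appears once as the delete template and once as the WHERE clause. This follows from the SPARQL 1.1 Update specification. Whether Marmotta handles both forms identically was not checked here.

```sparql
# Long form of: DELETE WHERE { ?x ?y ?z }
# The WHERE clause finds the bindings; the DELETE template removes the
# triples built from those bindings.
DELETE { ?x ?y ?z }
WHERE  { ?x ?y ?z }
```

By contrast, DELETE DATA takes only concrete triples and no variables, which matches the parse error shown in the log above.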