Apache Marmotta Delete All Graphs using SPARQL Update

I am trying to clear all graphs loaded into my Apache Marmotta instance. I have tried several SPARQL queries to do so, but I am not able to remove the RDF/XML graph that I imported. What is the appropriate syntax to do so?

Try this query:
DELETE WHERE { ?x ?y ?z }
Be careful: it deletes every triple in the database, including Marmotta's built-in ones.
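If you only want to empty one specific graph rather than the whole store, a scoped variant of the same construct should also work (the graph IRI below is just a placeholder for whatever context you imported into):
DELETE WHERE { GRAPH <http://example.org/context> { ?s ?p ?o } }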
A couple of things I did while experimenting:
I downloaded the source code of Marmotta and used the Silver Searcher tool (ag) to search for DELETE queries with the following command:
ag '"DELETE '
This did not help much.
I navigated to the Marmotta installation directory and watched the debug log:
tail -f marmotta-home/log/marmotta-main.log
This showed that the parser is not able to process the query DELETE DATA { ?s ?p ?o }. The exception behind the "error while executing update" was:
org.openrdf.sail.SailException: org.openrdf.rio.RDFParseException: Expected an RDF value here, found '?' [line 8]
[followed by a long stacktrace]
This shows that the parser does not allow variables after DELETE DATA: the SPARQL 1.1 grammar only permits concrete (ground) triples there.
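In other words, DELETE DATA is only meant for removing fully specified triples. A purely illustrative example (the IRIs and literal are made up) would be:
DELETE DATA { <http://example.org/book1> <http://purl.org/dc/elements/1.1/title> "A new book" }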
Based on a related StackOverflow answer, I tried CLEAR / CLEAR GRAPH / DROP / DROP GRAPH, but they did not work.
I tried many combinations of DELETE, *, and ?s ?p ?o, and accidentally managed to get it working with the DELETE WHERE construct. According to the W3C documentation:
The DELETE WHERE operation is a shortcut form for the DELETE/INSERT operation where bindings matched by the WHERE clause are used to define the triples in a graph that will be deleted.
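So the shortcut above is equivalent to spelling out the template and the pattern explicitly:
DELETE { ?x ?y ?z }
WHERE  { ?x ?y ?z }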

Related

How to completely delete an RDF node/instance from a graph database?

Good day, I am using GraphDB to store some triples. This particular RDF node uses a regular URI, http://example/regular/uri. What I wish to do is not only delete all properties attached to this node, but also delete the node itself (with the result that http://example/regular/uri no longer appears in the graph database).
So far I am only able to delete all the properties, but not the actual RDF node itself. It seemed rather simple, but the more I research online, the more this seems impossible short of clearing the complete graph.
I have tried simple "delete where" queries as shown in example 11 of the SPARQL documentation, and I have also tried simple "delete where" queries using the wildcard operator.
Is there a way to delete such RDF nodes?
Thanks in advance!
A node exists in a graph as long as there are one or more triples with that node in subject or object position. So the easiest way would be to issue two delete statements, one deleting all statements with the node in subject position and one deleting all statements with the node in object position (sketched after the sample below). But if you need/want to do it in a single operation, you can do that as well with filters.
Here is a sample that deletes <uri://node/to/delete> from <uri://my/graph>:
DELETE {
  GRAPH <uri://my/graph> {
    ?s ?p ?o .
  }
}
USING <uri://my/graph>
WHERE {
  {
    ?s ?p ?o . VALUES ?s { <uri://node/to/delete> }
  } UNION {
    ?s ?p ?o . VALUES ?o { <uri://node/to/delete> }
  }
}
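A minimal sketch of the two-statement approach mentioned above, using the same example node and graph; both updates can be sent in a single request, separated by a semicolon:
# Delete triples where the node is the subject, then where it is the object
DELETE WHERE { GRAPH <uri://my/graph> { <uri://node/to/delete> ?p ?o } } ;
DELETE WHERE { GRAPH <uri://my/graph> { ?s ?p <uri://node/to/delete> } }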

Setting to query only Default Graph and exclude Named Graphs

In the GraphDB documentation, I see that "the dataset’s default graph contains the merge of the database’s default graph AND all the database named graphs." This means that "if a statement ex:x ex:y ex:z exists in the database in the graph ex:g" then a query such as SELECT * { ?s ?p ?o } will return the triple ex:x ex:y ex:z.
I am wondering if there is a setting which can be triggered either via the web interface or via the RDF4J/OpenRDF API which will disable this behavior in a specified GraphDB repository. That is, for the purposes of my project I would prefer to have triples which are stored in named graphs to only appear in results which specifically query that named graph.
I have not seen anything like this searching through the documentation or on the settings available on the web interface, but maybe somebody here knows something I don't.
EDIT: I am not looking for a SPARQL solution to this problem. I know that I can query just the default graph using SPARQL, but I want to be able to use the query SELECT * { ?s ?p ?o } and only see results which are in the default graph by default.
GraphDB/RDF4J have a different interpretation than Jena of how to query the default graph. The only easy way to query only the explicit statements in the default graph is to use the special graph sesame:nil. The SPARQL-based solution is to write:
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
SELECT ?s ?p ?o
FROM sesame:nil
WHERE {
    ?s ?p ?o .
} LIMIT 100
I don't think there is any easy non-SPARQL-based solution, such as changing a configuration option or using this special graph over the SPARQL Graph Store Protocol.
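Conversely, when you do want the statements from one specific named graph, scoping the pattern with GRAPH works in any SPARQL store (the graph IRI below is just a placeholder):
SELECT ?s ?p ?o
WHERE {
    GRAPH <http://example.org/my-graph> {
        ?s ?p ?o .
    }
} LIMIT 100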

Pure-SPARQL migration of data from one endpoint to another?

It looks like this question has been raised before, but subsequently deleted?!
For data in one SQL table, I can easily replicate the structure and then migrate the data to another table (or database?).
CREATE TABLE new_table
AS (SELECT * FROM old_table);
SELECT *
INTO new_table [IN externaldb]
FROM old_table
WHERE condition;
Is there something analogous for RDF/SPARQL? Something that combines a select and an insert into one SPARQL statement?
Specifically, I use Karma, which publishes data to an embedded OpenRDF/Sesame endpoint. There's a text box on the GUI for the endpoint, so I can change it to a free-standing RDF4J, since RDF4J is a fork of Sesame.
Unfortunately, I get an error like invalid SPARQL endpoint from Karma when I put the address for a Virtuoso, Stardog or Blazegraph endpoint in the endpoint text box. I suspect it might be possible to modify and recompile Karma, or (more realistically), I could write a small tool with the Jena or RDF4J libraries to select into RAM or scratch disk space and then insert into the other endpoint.
But if there's a pure-SPARQL solution, I'd sure like to hear it.
In SPARQL, a remote endpoint can only be referenced as a source (via the SERVICE clause), not as a target. Therefore, a partial pure-SPARQL solution would be to run the following update on your target triple store:
INSERT { ?s ?p ?o }
WHERE {
    SERVICE <http://source/sparql> {
        ?s ?p ?o
    }
}
This will copy over all triples from the (remote) source's default graph to your target store, but it doesn't copy over any named graphs. To copy over any named graphs as well, you can execute this in addition:
INSERT { GRAPH ?g { ?s ?p ?o } }
WHERE {
    SERVICE <http://source/sparql> {
        GRAPH ?g {
            ?s ?p ?o
        }
    }
}
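If the target endpoint accepts several operations in one update request, the two updates can also be combined into a single request, separated by a semicolon (same placeholder endpoints as above):
INSERT { ?s ?p ?o }
WHERE { SERVICE <http://source/sparql> { ?s ?p ?o } } ;
INSERT { GRAPH ?g { ?s ?p ?o } }
WHERE { SERVICE <http://source/sparql> { GRAPH ?g { ?s ?p ?o } } }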
If you're not hung up on pure SPARQL though, different toolkits and frameworks offer you all sorts of options. For example, using RDF4J's Repository API you could wrap both source and target in a SPARQLRepository proxy (or use an HTTPRepository if either one is an actual RDF4J store), and then run copy operations via the API. There are many different ways to do that; one possible approach (disclaimer: I didn't test this code fragment) is this:
// Proxy repositories talking to the two SPARQL endpoints
SPARQLRepository source = new SPARQLRepository("http://source/sparql");
source.initialize();
SPARQLRepository target = new SPARQLRepository("http://target/sparql");
target.initialize();

try (RepositoryConnection sourceConn = source.getConnection();
     RepositoryConnection targetConn = target.getConnection()) {
    // Stream every statement from the source and insert it into the target
    sourceConn.export(new RDFInserter(targetConn));
}

How to suppress objectProperty in Fuseki?

I have a problem deleting one objectProperty from a graph on my Fuseki server.
I have tried many ways to delete the objectProperty, without results.
I tried to use s-delete to remove it:
./s-delete http://localhost:3030/ds 'DELETE {?s <http://www.semanticweb.org/ds/dependsOfExchange> ?o}'
or
./s-delete http://localhost:3030/ds 'DELETE {GRAPH ?g {?s <http://www.semanticweb.org/ds/dependsOfExchange> ?o} WHERE{?s <http://www.semanticweb.org/ds/dependsOfExchange> ?o}}'
I tried to find some information about how to use s-delete correctly to remove an objectProperty or dataProperty from the data stored in my Fuseki server, but I haven't found anything useful. And there are no update or delete tools accessible from the browser.
Thanks in advance.
s-delete executes part of the SPARQL Graph Store Protocol, which operates on whole graphs only. It is an HTTP DELETE on a named graph.
To modify part of a graph, you need SPARQL Update, and so use s-update.
The delete command, using the DELETE WHERE short form, might be:
DELETE WHERE {?s <http://www.semanticweb.org/ds/dependsOfExchange> ?o}
or the GRAPH version.
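A sketch of the GRAPH version, which removes the property from every named graph in the dataset:
DELETE WHERE { GRAPH ?g { ?s <http://www.semanticweb.org/ds/dependsOfExchange> ?o } }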

Bizarre results with bif:contains - corrupted full text index?

I recently built a Virtuoso database (version 07.10.3207) using dbpedia data. I'm trying to build some queries for it, and encountering very strange results. For one example:
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?s, ?p, ?o where {
    ?s ?p ?o .
    ?s rdfs:label "Almond"@en .
    ?o bif:contains "mythical"
}
This yields a hit. One might expect it to mean that the comment field (the field that matches "mythical") for Almond contains the word "mythical". However, it does not. It is, in fact:
"The almond (/??m?nd/) (Prunus dulcis, syn. Prunus amygdalus, Amygdalus communis, Amygdalus dulcis) (or badam in Indian English, from Persian: ??????) is a species of tree native to the Middle East and South Asia. "Almond" is also the name of the edible and widely cultivated seed of this tree."@en
Many other queries yield similarly strange results.
Trying the same queries on the public dbpedia endpoint does not yield these bizarre results, so I know it's somehow a problem with my database. I guessed that it might have to do with some corruption of the full text indices.
I tried the following, without a super-clear understanding of what exactly they might do, based on other notes I was able to find:
DB.DBA.RDF_OBJ_FT_RULE_ADD(null, null, 'All');
DB.DBA.VT_INC_INDEX_DB_DBA_RDF_OBJ();
DB.DBA.RDF_OBJ_FT_RECOVER();
DB.DBA.VT_INDEX_DB_DBA_RDF_OBJ();
Thus far, no dice. I'm sort of wondering if it might have to do with the mangled characters in the comment field - the online dbpedia endpoint renders them properly, while my Virtuoso installation just gives question marks, as seen above. No idea even how to begin approaching this though.
I did include SQL_UTF8_EXECS = 1 in virtuoso.ini (and subsequently restart the server), which still left me with question marks in the results.
Actually, it does not appear to have anything to do with those question marks; I ran the following query:
select ?s, ?p, ?o where {
    ?s ?p ?o .
    ?o bif:contains "mythical" .
    FILTER (!regex(?o, "mythical", "i"))
}
A pseudorandom selection of hits, none of which contain "mythical" or "?":
"Asgrrr"
"403 BC"
"Potential infinity"
"Beauty and the Beast (talk show)"
"Alberta highway highway 22"
The same query, run at http://dbpedia.org/sparql, returns nothing (as it should).
Any ideas?
Rebuilding the database did not correct the problem. I was, however, able to get a working version by taking the following steps. Some of them may have been unnecessary, but given how long it takes, I haven't done controlled experiments to narrow things down to the minimum.
First, delete the database and associated files, to start with a blank slate.
Edit virtuoso.ini to uncomment/include:
SQL_UTF8_EXECS = 1
Start up Virtuoso, and then from isql issue the following commands:
DB.DBA.RDF_OBJ_FT_RULE_ADD (null, null, 'All');
DB.DBA.VT_BATCH_UPDATE ('DB.DBA.RDF_OBJ', 'OFF', null);
DB.DBA.VT_INC_INDEX_DB_DBA_RDF_OBJ ();
DB.DBA.RDF_OBJ_FT_RECOVER ();
COMMIT WORK;
CHECKPOINT;
CHECKPOINT_INTERVAL(60000);
Then, load the data.
Then, call:
COMMIT WORK;
CHECKPOINT;
DB.DBA.VT_INC_INDEX_DB_DBA_RDF_OBJ();
CHECKPOINT;
COMMIT WORK;
CHECKPOINT;
CHECKPOINT_INTERVAL(60);
COMMIT WORK;
Enjoy your fully-text-searchable database!