TDB Jena Querying - SPARQL

I am trying to query with Jena in Java using TDB. I have an N3 file named song.n3 that I want to load into TDB. I created a TDB directory inside my Java1 folder (my NetBeans project folder), and I have the path to the actual N3 file as the source. When I run the code below I get the error "java.lang.NoClassDefFoundError". Debugging shows the error is raised by the line Dataset dataset = TDBFactory.createDataset(directory);. I am not sure why this error occurs; could it be because my directory is empty, with no model?
public static void main(String[] args) throws IOException {
    String directory = "./tdb";
    Dataset dataset = TDBFactory.createDataset(directory);
    Model tdb = dataset.getDefaultModel();

    String source = "C:\\Users\\Name\\Documents\\NetBeansProjects\\Java1\\src\\song.n3";
    FileManager.get().readModel(tdb, source, "N3");

    String queryString = "PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT * WHERE { ?x owl:sameas ?y }";
    Query query = QueryFactory.create(queryString);
    QueryExecution qe = QueryExecutionFactory.create(query, tdb);
    ResultSet results = qe.execSelect();
    ResultSetFormatter.out(System.out, results, query);
    qe.close();
}

This is most likely a problem with your CLASSPATH. When I use TDB I run the following script to load the Jena-TDB libraries into my classpath:
#!/bin/bash
CP="."
for i in ./TDB-0.8.9/lib/*.jar ; do
    CP=$CP:$i
done
export CLASSPATH=$CP
It is bash but very easy to translate into a Windows script. Bottom line: make sure that all the jars in the /lib/ directory are on the CLASSPATH. It would also help if you posted the complete java.lang.NoClassDefFoundError message showing which class was not found; that would give you a hint of what is missing. It is probably one of the logging libs that are not shipped inside the Jena distribution.
Also, watch out for that owl:sameas predicate. SPARQL and RDF are case sensitive and the correct predicate is owl:sameAs.
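Once the classpath is sorted out, a minimal sketch of the corrected program might look like the following (the class name and file path are placeholders, and the imports assume a current Apache Jena release with org.apache.jena packages; TDB 0.8.x still used the com.hp.hpl.jena packages):

import java.io.IOException;

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.tdb.TDBFactory;
import org.apache.jena.util.FileManager;

public class SongQuery {
    public static void main(String[] args) throws IOException {
        // Open (or create) the TDB store and load the N3 file into the default model
        Dataset dataset = TDBFactory.createDataset("./tdb");
        Model tdb = dataset.getDefaultModel();
        FileManager.get().readModel(tdb, "song.n3", "N3");

        // Note the capital A in owl:sameAs
        String queryString =
                "PREFIX owl: <http://www.w3.org/2002/07/owl#> " +
                "SELECT * WHERE { ?x owl:sameAs ?y }";
        try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(queryString), tdb)) {
            ResultSetFormatter.out(System.out, qe.execSelect());
        }
    }
}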

Related

GraphDB: use similarity search with embedded repository

I have been using GraphDB's (version 9.2) similarity search from the workbench. Now I also want to use this feature with an embedded repository using graphdb-free-runtime 9.2.1. However, I have no clue how this feature can be used through the APIs provided by the runtime. My questions are:
In http://graphdb.ontotext.com/documentation/standard/semantic-similarity-searches.html it is mentioned that similarity search is a plugin, but I could not find any such plugin within the runtime. Do I have to load it from some external resource? Where from?
Is there any documentation or an example of how to create a similarity index programmatically?
Would it be an option to run the workbench as some kind of server and access the similarity search via a REST API? However, the REST API documentation at http://graphdb.ontotext.com/documentation/9.0/free/using-the-workbench-rest-api.html does not mention any API for similarity searches. Am I missing something?
Any hints or pointers are welcome.
You can add the similarity plugin at runtime by setting the "graphdb.extra.plugins" property to the directory where the similarity plugin is located (you can find it in a GraphDB distribution under dist/lib/plugins), either as a JVM option or programmatically:
-Dgraphdb.extra.plugins=directory
System.setProperty("graphdb.extra.plugins", "directory");
You can create an index programmatically using SPARQL. For example, to create a similarity text index "allNews", execute the following update:
PREFIX : <http://www.ontotext.com/graphdb/similarity/>
PREFIX inst: <http://www.ontotext.com/graphdb/similarity/instance/>
PREFIX pred: <http://www.ontotext.com/graphdb/similarity/psi/>

INSERT {
    inst:allNews :createIndex "-termweight idf" ;
                 :analyzer "org.apache.lucene.analysis.en.EnglishAnalyzer" ;
                 :documentID ?documentID .
    ?documentID :documentText ?documentText .
} WHERE {
    SELECT ?documentID ?documentText {
        ?documentID ?p ?documentText .
        FILTER(isLiteral(?documentText))
    }
}
To delete the index "allNews", execute the following update:
PREFIX : <http://www.ontotext.com/graphdb/similarity/>
PREFIX inst: <http://www.ontotext.com/graphdb/similarity/instance/>

INSERT { inst:allNews :deleteIndex '' } WHERE {}
To rebuild the index "allNews", execute:
PREFIX : <http://www.ontotext.com/graphdb/similarity/>
PREFIX inst: <http://www.ontotext.com/graphdb/similarity/instance/>

INSERT { inst:allNews :rebuildIndex '' } WHERE {}
followed by the create query!
To list all created indexes, execute the following query:
PREFIX : <http://www.ontotext.com/graphdb/similarity/>
PREFIX inst: <http://www.ontotext.com/graphdb/similarity/instance/>

SELECT ?index ?status ?type
WHERE {
    ?index :status ?status .
    ?index :type ?type .
}
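For the embedded case, the same updates can be sent through the standard RDF4J API that the free runtime exposes. A rough sketch, assuming the repository has already been configured under a local repository manager (the plugin directory, base directory, and repository id below are placeholders):

import java.io.File;

import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.manager.LocalRepositoryManager;

public class SimilarityIndexExample {
    public static void main(String[] args) {
        // Make the similarity plugin visible to the embedded engine
        // (directory copied from dist/lib/plugins of a GraphDB distribution)
        System.setProperty("graphdb.extra.plugins", "/opt/graphdb/plugins");

        // Open the embedded repository through the RDF4J repository manager
        LocalRepositoryManager manager = new LocalRepositoryManager(new File("repositories"));
        manager.init();
        Repository repository = manager.getRepository("my-repo");

        // The same :createIndex update that the workbench runs
        String createIndex =
                "PREFIX : <http://www.ontotext.com/graphdb/similarity/> " +
                "PREFIX inst: <http://www.ontotext.com/graphdb/similarity/instance/> " +
                "INSERT { inst:allNews :createIndex '-termweight idf' ; " +
                "                      :analyzer 'org.apache.lucene.analysis.en.EnglishAnalyzer' ; " +
                "                      :documentID ?documentID . " +
                "         ?documentID :documentText ?documentText . } " +
                "WHERE { SELECT ?documentID ?documentText { " +
                "          ?documentID ?p ?documentText . FILTER(isLiteral(?documentText)) } }";

        try (RepositoryConnection connection = repository.getConnection()) {
            connection.prepareUpdate(createIndex).execute();
        }

        manager.shutDown();
    }
}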

How to protect my system, which runs the Sesame triplestore, from injections when querying using SPARQL?

Title says it all. Is there something equivalent to SQL's prepared statements?
(assuming you are using a recent version of RDF4J, and not Sesame)
To prevent vulnerabilities due to injection, a simple approach is to use a prepared query, and use Query#setBinding to inject actual user input values into your query. For example:
// some input keyword to inject
String keyword = "foobar";

TupleQuery query = con.prepareTupleQuery(
        "PREFIX ex: <http://example.org/> "
        + "SELECT ?document WHERE { ?document ex:keyword ?keyword . }");

// inject the input keyword
query.setBinding("keyword", factory.createLiteral(keyword));

// execute the query
TupleQueryResult result = query.evaluate();
For more advanced control, RDF4J also offers the SparqlBuilder, a fluent API for creating SPARQL queries in Java. For example:
String keyword = "foobar";

Prefix ex = SparqlBuilder.prefix("ex", Rdf.iri("http://example.org/"));
Variable document = SparqlBuilder.var("document");

SelectQuery query = Queries.SELECT().prefix(ex).select(document)
        .where(GraphPatterns.tp(document, ex.iri("keyword"), Rdf.literalOf(keyword)));
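The built query can then be rendered to a string and prepared on the connection just like before. A small sketch (reusing the con connection from the first example, which is an assumption):

// Render the builder output to SPARQL text and prepare it on the connection
String sparql = query.getQueryString();
TupleQuery tupleQuery = con.prepareTupleQuery(sparql);

try (TupleQueryResult result = tupleQuery.evaluate()) {
    while (result.hasNext()) {
        System.out.println(result.next().getValue("document"));
    }
}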

Apache Jena - Is it possible to write to output the BASE directive?

I just started using Apache Jena; in their introduction they explain how to write out the created model. As input I'm using a Turtle file containing some data about OWL ontologies, and I'm using the @base directive so that I can use relative URIs in the syntax:
@base <https://valbuena.com/ontology-test/> .
And then writing my data as:
<sensor/AD590/1> a sosa:Sensor ;
    rdfs:label "AD590 #1 temperature sensor"@en ;
    sosa:observes <room/1#Temperature> ;
    ssn:implements <MeasureRoomTempProcedure> .
Apache Jena is able to read that @base directive and expands the relative URIs to their full versions, but when I write the model out Jena does not write the @base directive or the relative URIs. The output is shown as:
<https://valbuena.com/ontology-test/sensor/AD590/1> a sosa:Sensor ;
    rdfs:label "AD590 #1 temperature sensor"@en ;
    sosa:observes <https://valbuena.com/ontology-test/room/1#Temperature> ;
    ssn:implements <https://valbuena.com/ontology-test/MeasureRoomTempProcedure> .
My code is the following:
Model m = ModelFactory.createOntologyModel();
String base = "https://valbuena.com/ontology-test/";

InputStream in = FileManager.get().open("src/main/files/example.ttl");
if (in == null) {
    System.out.println("file error");
    return;
} else {
    m.read(in, null, "TURTLE");
}

m.write(System.out, "TURTLE");
There are multiple read and write methods that take the base as a parameter:
On read() I've found that if @base isn't declared in the data file it must be given in the read call; otherwise it can be set to null.
On write() the base parameter is optional; whether I specify the base or not (even as null or as a URI), the output is always the same: the @base never appears and all relative URIs are written as full URIs.
I'm not sure if this is a bug or it's just not possible.
First - consider using a prefix like ":" -- this is not the same as base but makes the output nice as well.
You can configure the base with (current version of Jena):
RDFWriter.create()
    .source(model)
    .lang(Lang.TTL)
    .base("http://base/")
    .output(System.out);
It seems that the introduction tutorial of the Jena RDF API is out of date: it shows the reading method I used above (FileManager), which has since been replaced by RDFDataMgr. The FileManager way does not handle the base directive well.
After experimenting I've found that the base directive works well with:
Model model = ModelFactory.createDefaultModel();
RDFDataMgr.read(model,"src/main/files/example.ttl");
model.write(System.out, "TURTLE", base);
or
Model model = ModelFactory.createDefaultModel();
model.read("src/main/files/example.ttl");
model.write(System.out, "TURTLE", base);
Although model.write() is described as legacy in the RDF output documentation (whereas model.read() is considered common in the RDF input documentation, which I don't quite understand), it is the only method I have found that accepts the base parameter (required to put the @base directive back in the output); the RDFDataMgr write methods don't include it.
Thanks to @AndyS for providing a simpler way to read the data, which led to fixing the problem.
@AndyS's answer allowed me to write relative URIs to the file, but did not include the base in use for the RDF/XML variations. To get the xml:base directive added correctly, I had to use the following:
RDFDataMgr.read(graph, is, Lang.RDFXML);

Map<String, Object> properties = new HashMap<>();
properties.put("xmlbase", "http://example#");

Context cxt = new Context();
cxt.set(SysRIOT.sysRdfWriterProperties, properties);

RDFWriter.create()
    .source(graph)
    .format(RDFFormat.RDFXML_PLAIN)
    .base("http://example#")
    .context(cxt)
    .output(os);

SPARUL query to drop most graphs, using Jena API

I am trying to clear most of the graphs contained in my local Virtuoso triple store, using Apache Jena, as part of my clean-up process before and after my unit tests. I think something like the following should work: first, I retrieve the graph URIs to be deleted; then I execute a SPARUL DROP operation for each of them.
String sparqlEndpointUsername = ...;
String sparqlEndpointPassword = ...;
String sparqlQueryString = ...; // returns the URIs of the graphs to be deleted

HttpAuthenticator authenticator = new SimpleAuthenticator(sparqlEndpointUsername,
        sparqlEndpointPassword.toCharArray());

ResultSet resultSetToReturn = null;

try (QueryEngineHTTP queryEngine = new QueryEngineHTTP(sparqlEndpoint, sparqlQueryString, authenticator)) {
    resultSetToReturn = queryEngine.execSelect();
    resultSetToReturn = ResultSetFactory.copyResults(resultSetToReturn);

    while (resultSetToReturn.hasNext()) {
        String graphURI = resultSetToReturn.next().getResource("g").getURI();
        UpdateRequest request = UpdateFactory.create();
        request.add("DROP GRAPH <" + graphURI + ">");

        Dataset dataset = ...; // how can I create a dataset pointing to my local Virtuoso installation?

        // And perform the operations.
        UpdateAction.execute(request, dataset);
    }
}
Questions:
As shown in this example, ARQ needs a dataset to operate on. How would I create this dataset pointing to my local Virtuoso installation for an update operation?
Is there perhaps an alternative to my approach? Would using another approach (apart from Jena) be a better idea?
Please note that I am not trying to delete all graphs. I am deleting only the graphs whose names are returned through the SPARQL query defined in the beginning (3rd line).
Your question appears to be specific to Virtuoso, and meant to remove all RDF data, so you could use Virtuoso's built-in RDF_GLOBAL_RESET() function.
This is not a SPARQL/SPARUL query; it is usually issued through an SQL connection -- which could be JDBC, ODBC, ADO.NET, OLE DB, iSQL, etc.
That said, as you are connecting through a SPARUL-privileged connection, you should be able to use Virtuoso's (limited) SQL-in-SPARQL support, a la --
SELECT ( bif:RDF_GLOBAL_RESET() AS reset )
WHERE { ?s ?p ?o }
LIMIT 1
(Executing this through an unprivileged connection like the default SPARQL endpoint will result in an error like Virtuoso 37000 Error SP031: SPARQL compiler: Function bif:RDF_GLOBAL_RESET() can not be used in text of SPARQL query due to security restrictions.)
(ObDisclaimer: OpenLink Software produces Virtuoso, and employs me.)
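If you already have an SQL connection to Virtuoso, the reset can be issued through JDBC. A minimal sketch, assuming the standard Virtuoso JDBC driver class virtuoso.jdbc4.Driver, the default port 1111, and placeholder credentials:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class VirtuosoReset {
    public static void main(String[] args) throws Exception {
        // Requires the Virtuoso JDBC driver jar on the classpath
        Class.forName("virtuoso.jdbc4.Driver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:virtuoso://localhost:1111", "dba", "dba");
             Statement stmt = conn.createStatement()) {
            // Wipes ALL RDF data in the quad store -- only use on disposable test instances
            stmt.execute("RDF_GLOBAL_RESET()");
        }
    }
}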
You can build a single SPARQL Update request:
DROP GRAPH <g1> ;
DROP GRAPH <g2> ;
DROP GRAPH <g3> ;
... ;
because in SPARQL Update, one HTTP request can carry several update operations, separated by ;.
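With Jena, that could look roughly like this (a sketch; the update endpoint URL is a placeholder, and the graph URIs are assumed to come from the SELECT query in the question):

import java.util.List;

import org.apache.jena.update.UpdateExecutionFactory;
import org.apache.jena.update.UpdateFactory;
import org.apache.jena.update.UpdateProcessor;
import org.apache.jena.update.UpdateRequest;

public class DropGraphs {
    public static void dropAll(List<String> graphUris, String updateEndpoint) {
        // Collect all DROP operations into a single update request
        UpdateRequest request = UpdateFactory.create();
        for (String graphUri : graphUris) {
            request.add("DROP GRAPH <" + graphUri + ">");
        }
        // Send the whole request to the remote SPARQL Update endpoint in one HTTP call
        UpdateProcessor processor = UpdateExecutionFactory.createRemote(request, updateEndpoint);
        processor.execute();
    }
}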

Jena SPARQL API using inference rules file

I am working with the Jena SPARQL API, and I want to execute queries on my RDF files after applying inference rules. I created a .rul file that contains all my rules; now I want to apply those rules and execute my queries. When I used OWL, I proceeded this way:
OntModel model1 = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);

// read the RDF/XML files
model1.read("./files/ontology.owl", "RDF/XML");
model1.read("./files/data.rdf", "RDF/XML");

// create a new query
String queryString = ".....my query";

Query query = QueryFactory.create(queryString);
QueryExecution qe = QueryExecutionFactory.create(query, model1);
ResultSet results = qe.execSelect();
ResultSetFormatter.out(System.out, results, query);
I want to do the same thing with inference rules, i.e., load my .rul file like this:
model1.read( "./files/rules.rul", "RDF/XML" );
This doesn't work with a .rul file; the rules are not executed. Any ideas on how to load a .rul file? Thanks in advance.
Jena rules aren't RDF, and you don't read them into a model.
RDFS is RDF, and it is implemented internally using rules.
To build an inference model:
Model baseData = ...
List<Rule> rules = Rule.rulesFromURL("file:YourRulesFile") ;
Reasoner reasoner = new GenericRuleReasoner(rules);
Model infModel = ModelFactory.createInfModel(reasoner, baseData) ;
See ModelFactory for other ways to build models (e.g., RDFS inference) directly.
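Putting it together with the query code from the question, a minimal sketch might look like this (the data file, rules file, and query are placeholders, and the imports assume a current Apache Jena release):

import java.util.List;

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;

public class RuleInferenceExample {
    public static void main(String[] args) {
        // Load the base data (RDF/XML here, as in the question)
        Model baseData = ModelFactory.createDefaultModel();
        baseData.read("./files/data.rdf", "RDF/XML");

        // Parse the custom rules file and build a rule reasoner from it
        List<Rule> rules = Rule.rulesFromURL("file:./files/rules.rul");
        Reasoner reasoner = new GenericRuleReasoner(rules);

        // The inference model exposes both asserted and inferred triples
        Model infModel = ModelFactory.createInfModel(reasoner, baseData);

        // Query the inference model instead of the raw data model
        String queryString = "SELECT * WHERE { ?s ?p ?o } LIMIT 10";
        try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(queryString), infModel)) {
            ResultSetFormatter.out(System.out, qe.execSelect());
        }
    }
}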