Why does the OSPG.dat file grow so much more than all other files? - indexing

Dear Jena Community,
I'm running Jena Fuseki Version 4.4.0 as a container on an OpenShift Cluster.
OS Version Info (cat /etc/os-release):
NAME="Red Hat Enterprise Linux"
VERSION="8.5 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.5"
...
Hardware Info (from Jena Fuseki initialization log):
[2023-01-27 20:08:59] Server INFO Memory: 32.0 GiB
[2023-01-27 20:08:59] Server INFO Java: 11.0.14.1
[2023-01-27 20:08:59] Server INFO OS: Linux 3.10.0-1160.76.1.el7.x86_64 amd64
[2023-01-27 20:08:59] Server INFO PID: 1
Disk Info (df -h):
Filesystem Size Used Avail Use% Mounted on
overlay 99G 76G 18G 82% /
tmpfs 64M 0 64M 0% /dev
tmpfs 63G 0 63G 0% /sys/fs/cgroup
shm 64M 0 64M 0% /dev/shm
/dev/mapper/docker_data 99G 76G 18G 82% /config
/data 1.0T 677G 348G 67% /usr/app/run
tmpfs 40G 24K 40G 1%
My dataset is built using TDB2, and currently has the following RDF Stats:
Triples: 65KK (Approximately 65 million)
Subjects: ~20KK (Approximately 20 million)
Objects: ~8KK (Approximately 8 million)
Graphs: ~213K (Approximately 213 thousand)
Predicates: 153
The files corresponding to this dataset alone on disk sum up to approximately 671GB (measured with du -h).
From these, the largest files are:
/usr/app/run/databases/my-dataset/Data-0001/OSPG.dat: 243GB
/usr/app/run/databases/my-dataset/Data-0001/nodes.dat: 76GB
/usr/app/run/databases/my-dataset/Data-0001/POSG.dat: 35GB
/usr/app/run/databases/my-dataset/Data-0001/nodes.idn: 33GB
/usr/app/run/databases/my-dataset/Data-0001/POSG.idn: 29GB
/usr/app/run/databases/my-dataset/Data-0001/OSPG.idn: 27GB
I've looked into several documentation pages, source code, forums, ... but nowhere was I able to find an explanation for why OSPG.dat is so much larger than all the other files.
I've been using Jena for quite some time now, and I'm well aware that its indexes grow significantly during usage, especially when triples are being added across multiple requests (transactional workloads).
Even so, the size of this particular file (OSPG.dat) surprised me, as in my prior experience the indexes would never get larger than the nodes.dat file.
Is there a reasonable explanation for this based on the content of the dataset or the way it was generated? Could this be an indexing bug within TDB2?
Thank you for your support!
For completeness, here is the assembler configuration for my dataset:
@prefix : <http://base/#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix root: <http://dev-test-jena-fuseki/$/datasets#> .
@prefix tdb2: <http://jena.apache.org/2016/tdb#> .
tdb2:GraphTDB rdfs:subClassOf ja:Model .
ja:ModelRDFS rdfs:subClassOf ja:Model .
ja:RDFDatasetSink rdfs:subClassOf ja:RDFDataset .
<http://jena.hpl.hp.com/2008/tdb#DatasetTDB>
rdfs:subClassOf ja:RDFDataset .
tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
<http://jena.apache.org/text#TextDataset>
rdfs:subClassOf ja:RDFDataset .
ja:RDFDatasetZero rdfs:subClassOf ja:RDFDataset .
:service_tdb_my-dataset
rdf:type fuseki:Service ;
rdfs:label "TDB my-dataset" ;
fuseki:dataset :ds_my-dataset ;
fuseki:name "my-dataset" ;
fuseki:serviceQuery "sparql" , "query" ;
fuseki:serviceReadGraphStore "get" ;
fuseki:serviceReadWriteGraphStore
"data" ;
fuseki:serviceUpdate "update" ;
fuseki:serviceUpload "upload" .
ja:ViewGraph rdfs:subClassOf ja:Model .
ja:GraphRDFS rdfs:subClassOf ja:Model .
tdb2:DatasetTDB rdfs:subClassOf ja:RDFDataset .
<http://jena.hpl.hp.com/2008/tdb#GraphTDB>
rdfs:subClassOf ja:Model .
ja:DatasetTxnMem rdfs:subClassOf ja:RDFDataset .
tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
ja:RDFDatasetOne rdfs:subClassOf ja:RDFDataset .
ja:MemoryDataset rdfs:subClassOf ja:RDFDataset .
ja:DatasetRDFS rdfs:subClassOf ja:RDFDataset .
:ds_my-dataset rdf:type tdb2:DatasetTDB2 ;
tdb2:location "run/databases/my-dataset" ;
tdb2:unionDefaultGraph true ;
ja:context [ ja:cxtName "arq:optFilterPlacement" ;
ja:cxtValue "false"
] .
Updates:
Currently, I'm trying to recreate the dataset from an N-Quads backup (~15GB after decompression) in the hope that the index sizes will decrease. Nonetheless, as this dataset will continue to grow during system usage, it would be helpful to understand what caused this astounding growth in this particular index.
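For reference, the offline rebuild boils down to running the TDB2 bulk loader against a fresh location. The bulk loader typically produces much smaller files than a database grown through many small update transactions, since TDB2's copy-on-write B+trees leave superseded blocks behind until the database is compacted. A minimal sketch (the target directory and backup filename are placeholders, not the actual paths):

```shell
# Stop Fuseki first: the loader needs exclusive access to the database directory.
tdb2.tdbloader --loc /usr/app/run/databases/my-dataset-new backup.nq

# Then point tdb2:location at the new directory (or swap directories) and restart.
```

Alternatively, if your Jena version provides it, tdb2.tdbcompact can reclaim space in place without a full reload.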

Related

Fuseki Named Graph persistence with inference

I am trying to persist multiple named graphs with inference in Fuseki.
I am referring to this excellent article, but facing some issues in Scenario 2: named graphs and no online updates.
My assembler configuration looks like this:
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
## ---------------------------------------------------------------
## Updatable TDB dataset with all services enabled.
:service_tdb_all rdf:type fuseki:Service ;
rdfs:label "TDB onekg-metadata-dev" ;
fuseki:name "onekg-metadata-dev" ;
fuseki:serviceQuery "" ;
fuseki:serviceQuery "sparql" ;
fuseki:serviceQuery "query" ;
fuseki:serviceUpdate "" ;
fuseki:serviceUpdate "update" ;
fuseki:serviceUpload "upload" ;
fuseki:serviceReadWriteGraphStore "data" ;
fuseki:serviceReadGraphStore "get" ;
fuseki:dataset :dataset ;
.
# Location of the TDB dataset
:tdb_dataset_readwrite
a tdb:DatasetTDB ;
tdb:location "reasoning-dir" ;
.
# Inference dataset
:dataset a ja:RDFDataset ;
ja:defaultGraph :model_inf .
# Inference Model
:model_inf a ja:InfModel ;
ja:baseModel :graph ;
ja:reasoner [
ja:reasonerURL
<http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
] .
# Intermediate graph referencing to the union graph
:graph rdf:type tdb:GraphTDB ;
tdb:dataset :tdb_dataset_readwrite ;
.
I am uploading some triples into a named graph and can query the triples (without inference) as expected.
However, if I restart and try to query again, the named graph is missing and I cannot query anything from there.
I would like to get some help here. Thank you in advance.

Configure Apache Jena RDF reasoner

I am trying to setup a reasoner for Apache Jena.
But I cannot get it working.
When I have a SKOS concept with skos:broader, I cannot find that concept in the results by searching on skos:narrower.
Currently my config.ttl looks like see below.
Do I need to load the schema for skos in the configuration to make this work?
## Licensed under the terms of http://www.apache.org/licenses/LICENSE-2.0
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model base data is in TDB.
<#service2> rdf:type fuseki:Service ;
rdfs:label "skos" ;
fuseki:name "skos" ;
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ;
fuseki:dataset <#dataset> ;
.
<#dataset> rdf:type ja:RDFDataset ;
ja:defaultGraph <#model_inf> ;
.
<#model_inf> a ja:InfModel ;
ja:baseModel <#tdbGraph> ;
ja:reasoner [
ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner>
] .
## Base data in TDB.
<#tdbDataset> rdf:type tdb:DatasetTDB ;
tdb:location "run/databases/skos" ;
# If the unionDefaultGraph is used, then the "update" service should be removed.
# tdb:unionDefaultGraph true ;
.
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:dataset <#tdbDataset> .
My data would look something like this.
I would expect <http://example.com/1> to have <http://example.com/2> as skos:narrower, since it is the inverse of skos:broader.
<http://example.com/2>
a skos:Concept ;
skos:broader <http://example.com/1> ;
skos:prefLabel "schaapskooi"@nl .
<http://example.com/1>
a skos:Concept ;
skos:prefLabel "restant landschapstermen"@nl .
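Yes, loading the schema is needed: the axiom skos:narrower owl:inverseOf skos:broader lives in the SKOS vocabulary itself, so the reasoner can only apply it if those schema triples are in the model, and owl:inverseOf is beyond what the RDFS rule reasoner processes. A sketch of one way to do this, assuming a local copy of the SKOS vocabulary in skos.rdf (the filename is an assumption) and switching to an OWL rule reasoner:

```turtle
<#model_inf> a ja:InfModel ;
    ja:baseModel <#tdbGraph> ;
    # Load the SKOS vocabulary so the reasoner sees
    # skos:narrower owl:inverseOf skos:broader.
    ja:content [ ja:externalContent <file:skos.rdf> ] ;
    ja:reasoner [
        # The RDFS reasoner does not handle owl:inverseOf;
        # an OWL rule reasoner does.
        ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
    ] .
```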

Currently in a locked region: Fuseki + full text search + inference

I've recently started playing with the full text search in the Fuseki 0.2.8 snapshot.
I have an InfModel backed by a TDB dataset, which I've added a Lucene text index to. I have tested it out with some search queries like this:
prefix text: <http://jena.apache.org/text#>
select distinct ?s where { ?s text:query ('stu' 16) }
This works great, until I have two or more simultaneous queries to Fuseki, then occasionally I get:
Error 500: Currently in a locked region
Fuseki - version 0.2.8-SNAPSHOT (Build date: 20130820-0755)
I've tried testing the endpoint with 10 concurrent users sending queries at random intervals; over a two-minute period, around 30% of the queries returned the 500 error above.
I have also tried disabling inference by replacing this section (full assembler file below):
<#dataset_fulltext> rdf:type text:TextDataset ;
text:dataset <#dataset_inf> ;
##text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
with this:
<#dataset_fulltext> rdf:type text:TextDataset ;
##text:dataset <#dataset_inf> ;
text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
and there are no exceptions generated when the TextDataset is using #tdbDataset rather than #dataset_inf.
Are there any problems with my set up, or is this a bug in Fuseki?
Here is my current assembler file:
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix dc: <http://purl.org/dc/terms/> .
[] rdf:type fuseki:Server ;
# Timeout - server-wide default: milliseconds.
# Format 1: "1000" -- 1 second timeout
# Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
# See java doc for ARQ.queryTimeout
ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "12000,50000" ] ;
fuseki:services (
<#service1>
) .
# Custom code.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
# TDB
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
## Initialize text query
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset rdfs:subClassOf ja:RDFDataset .
# Lucene index
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model base data in TDB.
<#service1> rdf:type fuseki:Service ;
rdfs:label "TDB/text service" ;
fuseki:name "dataset" ; # http://host/dataset
fuseki:serviceQuery "query" ;
fuseki:serviceUpdate "update" ;
fuseki:serviceUpload "upload" ;
fuseki:serviceReadWriteGraphStore "data" ;
fuseki:serviceReadGraphStore "get" ;
fuseki:dataset <#dataset_fulltext> ;
.
<#dataset_inf> rdf:type ja:RDFDataset ;
ja:defaultGraph <#model_inf> .
<#model_inf> rdf:type ja:Model ;
ja:baseModel <#tdbGraph> ;
ja:reasoner [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] .
<#tdbDataset> rdf:type tdb:DatasetTDB ;
tdb:location "Data" .
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:dataset <#tdbDataset> .
# Dataset with full text index.
<#dataset_fulltext> rdf:type text:TextDataset ;
text:dataset <#dataset_inf> ;
##text:dataset <#tdbDataset> ;
text:index <#indexLucene> .
# Text index description
<#indexLucene> a text:TextIndexLucene ;
text:directory <file:Lucene> ;
##text:directory "mem" ;
text:entityMap <#entMap> ;
.
# Mapping in the index
# URI stored in field "uri"
# rdfs:label is mapped to field "text"
<#entMap> a text:EntityMap ;
text:entityField "uri" ;
text:defaultField "text" ;
text:map (
[ text:field "text" ; text:predicate dc:title ]
[ text:field "text" ; text:predicate dc:description ]
) .
And here is the full stack trace for one of the exceptions in Fuseki's log:
16:27:01 WARN Fuseki :: [2484] RC = 500 : Currently in a locked region
com.hp.hpl.jena.sparql.core.DatasetGraphWithLock$JenaLockException: Currently in a locked region
at com.hp.hpl.jena.sparql.core.DatasetGraphWithLock.checkNotActive(DatasetGraphWithLock.java:72)
at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.begin(DatasetGraphTrackActive.java:44)
at org.apache.jena.query.text.DatasetGraphText.begin(DatasetGraphText.java:102)
at org.apache.jena.fuseki.servlets.HttpAction.beginRead(HttpAction.java:117)
at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:236)
at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:195)
at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:80)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeLifecycle(SPARQL_ServletBase.java:185)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeAction(SPARQL_ServletBase.java:166)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.execCommonWorker(SPARQL_ServletBase.java:154)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:73)
at org.apache.jena.fuseki.servlets.SPARQL_Query.doGet(SPARQL_Query.java:61)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:298)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)
Any advice appreciated.
Thanks,
Stuart.
This looks like it is probably a bug, which I have filed as JENA-522; if you have further details on the bug, please add a comment there.
The issue is that the dataset with inference implicitly uses ARQ's standard in-memory Dataset implementation, and this does not support transactions.
However, text datasets (which correspond to DatasetGraphText internally, as seen in your stack trace) require the wrapped dataset to support transactions, and where it does not, they wrap it with DatasetGraphWithLock. It is this wrapper that appears to be encountering the problem with the lock; the documentation states that it should support multiple readers, but having followed the logic of the code I'm not sure that it actually allows this.

Fuseki Error 500: DatasetGraph.delete(Quad)

I have set up a Fuseki endpoint in the hope of hitting a Jena SDB database (really a MySQL database in the backend) with some SPARQL, both select and update/delete queries. SPARQL select queries work just fine, but whenever I try to run an update query such as an insert or delete, I get the following from Fuseki:
Error 500: DatasetGraph.delete(Quad)
Fuseki - version 0.2.6-SNAPSHOT (Build date: 2012-12-06T08:26:37-0500)
I have started up my endpoint by running the following script:
#!/bin/bash
export FusekiInstallDir=/data/jena-fuseki-0.2.6-SNAPSHOT
export FusekiPort=3030
export FusekiJVMArgs="-cp ./fuseki-server.jar:lib/ReconnectingSDB-0.1-SNAPSHOT.jar:lib/jena-sdb-1.3.6-SNAPSHOT.jar:lib/mysql-connector-java-5.1.16-bin.jar:lib/jena-arq-2.9.5-SNAPSHOT.jar -Xmx768M"
export Date=`date +%Y-%m-%d`
export FusekiLogFile=$FusekiInstallDir/FusekiLog-$Date.log
export FusekiConfigFile=$FusekiInstallDir/fuseki.ttl
export FusekiServiceName=/VIVO
# Check to see if logfile exists
if [ ! -f $FusekiLogFile ]; then
touch $FusekiLogFile
fi
# Check to see if config file exists
if [ ! -f $FusekiConfigFile ]; then
echo "ERROR - Fuseki failed to start - no configuration file - $FusekiConfigFile" >> $FusekiLogFile
exit 1
fi
# Execute Java calling the package for Fuseki
java $FusekiJVMArgs org.apache.jena.fuseki.FusekiCmd --desc $FusekiConfigFile --update --port=$FusekiPort $FusekiServiceName >> $FusekiLogFile 2>&1 &
Which accesses the following configuration file (fuseki.ttl):
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix sdb: <http://jena.hpl.hp.com/2007/sdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix jumble: <http://rootdev.net/vocab/jumble#> .
@prefix : <#> .
[] rdf:type fuseki:Server ;
# Services available. Only explicitly listed services are configured.
# If there is a service description not linked from this list, it is ignored.
fuseki:services (
<#service1>
) .
# Declaration additional assembler items.
# [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
[] ja:loadClass "net.rootdev.fusekisdbconnect.SDBConnect" .
jumble:SDBConnect rdfs:subClassOf ja:RDFDataset .
<#dataset> rdf:type jumble:SDBConnect ;
rdfs:label "SDB" ;
sdb:layout "layout2" ;
jumble:defaultUnionGraph "true" ; # will switch option on globally, in fact
sdb:connection [
rdf:type sdb:SDBConnection ;
# Using MySQL
sdb:sdbHost "LASP-DB-DEV" ;
sdb:sdbType "MySQL" ;
sdb:sdbName "db" ;
sdb:sdbUser "user" ;
sdb:sdbPassword "pass" ;
sdb:driver "com.mysql.jdbc.Driver" ;
#sdb:jdbcUrl "jdbc:mysql://localhost/vivodb?autoReconnect=true"
] ;
.
## ---------------------------------------------------------------
## Vivo dataset in sdb
<#service1> rdf:type fuseki:Service ;
fuseki:name "VIVO" ; # http://host:port/ds
fuseki:serviceQuery "query" ; # SPARQL query service
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ;
# A separate read-only graph store endpoint:
fuseki:serviceReadGraphStore "get" ; # SPARQL Graph store protocol (read only)
fuseki:dataset <#dataset> ;
.
Does anyone have any ideas on what might be causing this? Below is the resulting FusekiLog file if that is helpful. Thanks in advance!
12:32:40 INFO Server :: Dataset from assembler
12:32:40 WARN SDBConnect :: Hooking in to update and query engines
12:32:40 WARN SDBConnect :: Hooking in to assemblers
12:32:40 WARN SDBConnect :: Hooking in to assemblers
12:32:41 INFO Server :: Dataset path = /VIVO
12:32:41 INFO Server :: Fuseki 0.2.6-SNAPSHOT 2012-12-06T08:26:37-0500
12:32:41 INFO Server :: Started 2013/03/20 12:32:41 MDT on port 3030
12:32:52 INFO Fuseki :: [1] GET http://sorce-dp2:3030/VIVO/query?query=PREFIX+rdf%3A+++%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+rdfs%3A++%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+xsd%3A+++%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0D%0APREFIX+owl%3A+++%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0APREFIX+swrl%3A++%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F11%2Fswrl%23%3E%0D%0APREFIX+swrlb%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F11%2Fswrlb%23%3E%0D%0APREFIX+vitro%3A+%3Chttp%3A%2F%2Fvitro.mannlib.cornell.edu%2Fns%2Fvitro%2F0.7%23%3E%0D%0APREFIX+bibo%3A+%3Chttp%3A%2F%2Fpurl.org%2Fontology%2Fbibo%2F%3E%0D%0APREFIX+dcelem%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0D%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0D%0APREFIX+event%3A+%3Chttp%3A%2F%2Fpurl.org%2FNET%2Fc4dm%2Fevent.owl%23%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+geo%3A+%3Chttp%3A%2F%2Faims.fao.org%2Faos%2Fgeopolitical.owl%23%3E%0D%0APREFIX+pvs%3A+%3Chttp%3A%2F%2Fvivoweb.org%2Fontology%2Fprovenance-support%23%3E%0D%0APREFIX+ero%3A+%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2F%3E%0D%0APREFIX+scires%3A+%3Chttp%3A%2F%2Fvivoweb.org%2Fontology%2Fscientific-research%23%3E%0D%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0APREFIX+vitro-public%3A+%3Chttp%3A%2F%2Fvitro.mannlib.cornell.edu%2Fns%2Fvitro%2Fpublic%23%3E%0D%0APREFIX+vivo%3A+%3Chttp%3A%2F%2Fvivoweb.org%2Fontology%2Fcore%23%3E%0D%0APREFIX+laspcms%3A+%3Chttp%3A%2F%2Fsorce-dp2%3A8080%2Flaspcms%2F%3E%0D%0APREFIX+vsto%3A+%3Chttp%3A%2F%2Fescience.rpi.edu%2Fontology%2Fvsto%2F2%2F0%2Fvsto.owl%23%3E%0D%0A%0D%0ASELECT+%3Fname%0D%0AWHERE%0D%0A%7B%0D%0A%3Fthing+a+vsto%3ADataset+.%0D%0A%3Fthing+rdfs%3Alabel+%3Fname%0D%0A%7D%0D%0A&output=text&stylesheet=%2Fxml-to-html.xsl
12:32:52 INFO Fuseki :: [1] Query = PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX swrl: <http://www.w3.org/2003/11/swrl#> PREFIX swrlb: <http://www.w3.org/2003/11/swrlb#> PREFIX vitro: <http://vitro.mannlib.cornell.edu/ns/vitro/0.7#> PREFIX bibo: <http://purl.org/ontology/bibo/> PREFIX dcelem: <http://purl.org/dc/elements/1.1/> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX event: <http://purl.org/NET/c4dm/event.owl#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX geo: <http://aims.fao.org/aos/geopolitical.owl#> PREFIX pvs: <http://vivoweb.org/ontology/provenance-support#> PREFIX ero: <http://purl.obolibrary.org/obo/> PREFIX scires: <http://vivoweb.org/ontology/scientific-research#> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX vitro-public: <http://vitro.mannlib.cornell.edu/ns/vitro/public#> PREFIX vivo: <http://vivoweb.org/ontology/core#> PREFIX laspcms: <http://sorce-dp2:8080/laspcms/> PREFIX vsto: <http://escience.rpi.edu/ontology/vsto/2/0/vsto.owl#> SELECT ?name WHERE { ?thing a vsto:Dataset . ?thing rdfs:label ?name }
12:32:52 INFO Fuseki :: [1] OK/select
12:32:52 INFO Fuseki :: [1] 200 OK
12:35:06 INFO Fuseki :: [2] POST http://sorce-dp2:3030/VIVO/update
12:35:06 WARN SPARQL_Update$HttpActionUpdate :: Transaction still active in endWriter - no commit or abort seen (forced abort)
12:35:06 WARN SPARQL_Update$HttpActionUpdate :: Exception in forced abort (trying to continue)
com.hp.hpl.jena.sparql.core.DatasetGraphWithLock$JenaLockException: Can't abort a locked update
at com.hp.hpl.jena.sparql.core.DatasetGraphWithLock._abort(DatasetGraphWithLock.java:97)
at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.abort(DatasetGraphTrackActive.java:63)
at org.apache.jena.fuseki.servlets.HttpAction.endWrite(HttpAction.java:121)
at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:246)
at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:232)
at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:118)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommonWorker(SPARQL_ServletBase.java:117)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:67)
at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:84)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:598)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:442)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1033)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:369)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:967)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:358)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:851)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:293)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)
at java.lang.Thread.run(Thread.java:662)
12:35:06 WARN Fuseki :: [2] RC = 500 : DatasetGraph.delete(Quad)
java.lang.UnsupportedOperationException: DatasetGraph.delete(Quad)
at com.hp.hpl.jena.sparql.core.DatasetGraphBase.delete(DatasetGraphBase.java:81)
at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.delete(DatasetGraphTrackActive.java:131)
at com.hp.hpl.jena.sparql.core.DatasetGraphWrapper.delete(DatasetGraphWrapper.java:76)
at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.execDelete(UpdateEngineWorker.java:438)
at com.hp.hpl.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:357)
at com.hp.hpl.jena.sparql.modify.request.UpdateModify.visit(UpdateModify.java:97)
at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:57)
at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:334)
at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:327)
at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:307)
at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:257)
at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:242)
at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:232)
at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:118)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommonWorker(SPARQL_ServletBase.java:117)
at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:67)
at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:84)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:598)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:442)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1033)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:369)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:967)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:358)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:851)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:293)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)
at java.lang.Thread.run(Thread.java:662)
12:35:06 INFO Fuseki :: [2] 500 DatasetGraph.delete(Quad)
You seem to be running
Fuseki - version 0.2.2-incubating-SNAPSHOT
despite
export FusekiInstallDir=/data/jena-fuseki-0.2.4
It could be a known bug that is now fixed. Could you upgrade to 0.2.5?
The config says:
jumble:SDBConnect rdfs:subClassOf ja:RDFDataset .
but
<#ufvivo_dataset_read> rdf:type sdb:DatasetStore ;
sdb:store <#VIVOStore>
.
<#VIVOStore> rdf:type jumble:SDBConnect;
so jumble:SDBConnect is not a dataset (may not matter - the code follows the <#VIVOStore> link).
There is no
sdb:DatasetStore rdfs:subClassOf ja:RDFDataset .
but your local init class may do something here.
There is a tdb.lock file; you need to remove it.
Use the following command to remove that file:
rm -f run/system/tdb.lock

Can I configure Jena Fuseki with inference and TDB?

I want to configure Fuseki with an inference model supported by TDB.
I have been able to configure it with a Memory Model, but not with a TDB Model where I could update triples.
I am using the following assembler description:
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
<#dataset> rdf:type ja:RDFDataset ;
ja:defaultGraph <#infModel> .
<#infModel> a ja:InfModel ;
ja:baseModel <#tdbGraph>;
ja:reasoner
[ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner>].
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:location "DB" ;
.
It works fine and it is able to do RDFS inference and even to insert new triples.
However, once I stop and restart the server, it raises the following exception:
Error 500: Invalid id node for subject (null node): ([000000000000001D], [00000000000000AF], [000000000000003D])
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:location "DB" ;
.
Get rid of the semicolon after the second statement and terminate with a full stop, i.e.:
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:location "DB".