I am having a problem with my Virtuoso RDF Store.
Even though I am uploading new RDF files, every time I run a SPARQL query the only values retrieved come from old RDF files (the ones I manually deleted a long time ago).
This is an example of the SPARQL query I am doing:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl-time: <http://www.w3.org/2006/time#>
PREFIX cdc-owl: <http://www.contextdatacloud.org/ontology/>
SELECT *
FROM <miOnt:move>
WHERE
{
?ws rdf:type cdc-owl:WeatherSituation.
?ws cdc-owl:hasWeatherTime ?time.
?time owl-time:inXSDDateTime "2015-06-16T09:00:00".
?ws cdc-owl:hasTemperature ?temperature
}
And these are the results I am obtaining (as can be seen, they come from the old files):
Any idea why this is happening?
This is what my repository looks like:
From what I can see, you have loaded your files into the Virtuoso WebDAV (file) repository, but you have not loaded the RDF therein into the Virtuoso (RDF) Quad Store.
See this guide to the bulk loader, and this page of RDF loading methods.
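For a single file, one of the loading methods can also be scripted from Java: a plain SPARQL 1.1 LOAD issued against the update endpoint, provided the endpoint permits updates. Below is a minimal sketch using Jena's remote update API; the endpoint URL, the file URL (which must be something the Virtuoso server itself can fetch, e.g. a WebDAV URL), and the target graph IRI are all made-up placeholders. For whole directories of files, the bulk loader described in the linked guide is the better route.

import org.apache.jena.update.UpdateExecutionFactory;
import org.apache.jena.update.UpdateFactory;
import org.apache.jena.update.UpdateProcessor;
import org.apache.jena.update.UpdateRequest;

public class LoadIntoQuadStore {
    public static void main(String[] args) {
        // Placeholder values: adjust the file URL, target graph, and endpoint.
        String fileUrl  = "http://example.org/exports/weather.rdf";  // must be fetchable by the server
        String graphIri = "http://example.org/weather";              // or the graph used in the query, e.g. miOnt:move
        String endpoint = "http://localhost:8890/sparql";            // Virtuoso's default SPARQL endpoint

        UpdateRequest request = UpdateFactory.create(
                "LOAD <" + fileUrl + "> INTO GRAPH <" + graphIri + ">");

        // Requires an endpoint (or authenticated connection) with SPARQL update rights.
        UpdateProcessor processor = UpdateExecutionFactory.createRemote(request, endpoint);
        processor.execute();
    }
}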
(ObDisclaimer: I work for OpenLink Software, producer of Virtuoso.)
To avoid a possible "XY problem", let me explain my real goal: I am trying to change the capitalization of language tags in an rdf4j repo, using SPARQL. But although rdf4j stores language tags as written when they were defined, it knows enough to treat them as case-insensitive, as the standard dictates. So it treats my attempted edit as a no-op:
Set-up:
INSERT DATA { test:a skos:prefLabel "hello"@EN }
Attempt:
DELETE { test:a skos:prefLabel "hello"@EN }
INSERT { test:a skos:prefLabel "hello"@en }
WHERE
{ test:a skos:prefLabel "hello"@EN }
Result:
This query does nothing. The language tag is still spelled EN.
Interestingly, this also fails if I execute two separate queries:
Query 1:
DELETE DATA { test:a skos:prefLabel "hello"@EN }
Query 2:
INSERT DATA { test:a skos:prefLabel "hello"@en }
Evidently, deleted strings remain in an internal cache, so my INSERT query resurrects "hello"@EN instead. A restart will clear the cache, but that's not the best UX...
Now, with some older versions of rdf4j I could clear this internal cache with the magic command CLEAR SILENT GRAPH <urn:uri:cache>. But this does not appear to work with rdf4j 2.3.3, which is what we are stuck with at the moment. Is there still a way to clear the string cache without a restart, or to change the capitalization of language tags in any other way?
PS I found this interesting thread about the handling of case in language tags; but it has brought me no closer to a solution.
At first glance this looks like a bug to me, an unintended consequence of a fix we did donkey's years ago for allowing preservation of case in language tags (https://openrdf.atlassian.net/browse/SES-1659).
I'm not sure there are any SPARQL-only workarounds for this, so please feel free to log a bug report/feature request at https://github.com/eclipse/rdf4j/issues.
Having said that, RDF4J does of course have functionality for normalizing language tags. In particular, the RDF parsers can be configured to normalize language tags (see the Rio configuration documentation), and in addition there is a utility method Literals.normalizeLanguageTag which you can use to convert any language tag to a standard canonical form.
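A rough illustration of both facilities follows; the exact setting name (I believe it is BasicParserSettings.NORMALIZE_LANGUAGE_TAGS) and the file format are assumptions on my part, so check the Rio configuration documentation for your version:

import org.eclipse.rdf4j.model.util.Literals;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFParser;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.helpers.BasicParserSettings;

public class LanguageTagNormalization {
    public static void main(String[] args) {
        // 1. Utility method: canonicalize a single language tag ("EN" should become "en").
        String canonical = Literals.normalizeLanguageTag("EN");
        System.out.println(canonical);

        // 2. Parser setting: ask Rio to normalize language tags while parsing,
        //    so re-importing your data yields the canonical form.
        RDFParser parser = Rio.createParser(RDFFormat.TURTLE);
        parser.getParserConfig().set(BasicParserSettings.NORMALIZE_LANGUAGE_TAGS, true);
        // parser.parse(inputStream, baseUri); // then feed the statements to an RDFHandler
    }
}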
I am facing issues fetching connected items from the Library of Congress RDF triples files through SPARQL in Java. Can anyone help me, please?
When I fetch the data using the SPARQL Fuseki server it works fine, but in Java it does not show the related data using nodes like the one below.
Data:
_:bnode11385494358076243167 <http://www.loc.gov/mads/rdf/v1#variantLabel> "Theft"@en .
Query:
String queryString =
    "SELECT ?variantLabel " +
    "WHERE { " +
    "  _:bnode11385494358076243167 <http://www.loc.gov/mads/rdf/v1#variantLabel> ?variantLabel . " +
    "}";
I have a reasonable implementation of Jena over MongoDB, providing impls for Graph and DatasetGraph. SPARQL queries are converted into the appropriate query expressions in MongoDB and material, at least on a triple-match-by-triple-match basis, is vended in a high-performance way. This is not a surprise; indexes do what they're supposed to do. Graph is wrapped with an RDFS reasoner Model and all is fine.
I am interested in now exploring ways to optimize filtering push-down into MongoDB. For example, this SPARQL:
?s a:attested "2017-06-01T00:00:00Z"^^xsd:dateTime .
results in this setup of a MongoDB find expression:
{ "P" : "a:attested", "O" : { "$date" : 1496275200000 } }
And all is good. But this SPARQL:
?s a:attested ?theDate .
FILTER (?theDate = "2017-06-01T00:00:00Z"^^xsd:dateTime)
causes ARQ to pass only the predicate to Graph::find():
{ "P" : "a:attested" }
and a whole lot of things are pulled from the DB and the filtering is done in ARQ itself. The FILTER expression above is obviously not needed for a simple equality but it proves the point.
The TDB documentation says that "... TDB uses the OpExecutor extension point of ARQ." But the link for OpExecutor goes to a To-Do.
Can anyone point at examples where something can be hooked or otherwise accessed around the time ARQ calls Graph::find() (ExtendedIterator<Triple> find(Triple m))? It is at this point that my implementation jumps in to craft the query, and if I can see whether filters exist, then I can "improve" the restriction on the query. At this time, it is not so important that I also stop the filtering from happening again in ARQ itself.
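To sketch the OpExecutor route mentioned in the TDB documentation (a guess at the shape of the hook rather than a worked solution; class and package names assume a recent Apache Jena, and MongoOpExecutor is a hypothetical name): you can install a custom OpExecutorFactory and get a look at FILTER expressions before evaluation falls through to per-triple find() calls.

import org.apache.jena.query.ARQ;
import org.apache.jena.sparql.algebra.op.OpFilter;
import org.apache.jena.sparql.engine.ExecutionContext;
import org.apache.jena.sparql.engine.QueryIterator;
import org.apache.jena.sparql.engine.main.OpExecutor;
import org.apache.jena.sparql.engine.main.OpExecutorFactory;
import org.apache.jena.sparql.engine.main.QC;

// Hypothetical executor that inspects FILTERs before the stock BGP evaluation runs.
class MongoOpExecutor extends OpExecutor {
    MongoOpExecutor(ExecutionContext execCxt) {
        super(execCxt);
    }

    @Override
    protected QueryIterator execute(OpFilter opFilter, QueryIterator input) {
        // opFilter.getExprs() holds the FILTER expressions; if the sub-op is a BGP
        // over the MongoDB-backed graph, equality constraints could be translated
        // into the find() document here. Anything not handled falls back to ARQ.
        return super.execute(opFilter, input);
    }
}

public class InstallExecutor {
    public static void install() {
        // Register globally; QC.setFactory can also target a specific dataset's context.
        OpExecutorFactory factory = MongoOpExecutor::new;
        QC.setFactory(ARQ.getContext(), factory);
    }
}

If I remember the class correctly, OpExecutor also has an execute(OpBGP, ...) overload, so the same subclass would be a place to see a whole basic graph pattern at once instead of receiving it one find() call at a time.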
I'd like to copy all the data from an existing Sesame repository into a new one. I need the migration so that I can use OWL inferencing on my triplestore, which is not possible with OWLIM on an in-memory repository (the type of my existing repository).
What is the most efficient way to copy all triples from a repository into a new one?
UPDATE 1:
I'm curious to understand why using SPARQL INSERT cannot be a valid approach. I tried this code under the SPARQL Update section of a new repository:
PREFIX : <http://dbpedia.org/resource/>
INSERT{?s ?p ?o}
WHERE
{
SERVICE <http://my.ip.ad.here:8080/openrdf-workbench/repositories/rep_name>
{
?s ?p ?o
}
}
I get the following error:
org.openrdf.query.UpdateExecutionException: org.openrdf.sail.SailException: org.openrdf.query.QueryEvaluationException:
Is there an error in the query or can the data not be inserted this way? I've inserted data from DBpedia using queries of similar structure.
Manually (Workbench)
Open the repository you want to copy from.
Select 'Export'.
Choose a suitable export format ('TriG' or 'BinaryRDF' are good choices, as both preserve context information), and hit the 'download' button to save the data export to local disk.
Open the repository you want to copy to.
Select 'Add'.
Choose 'select the file containing the RDF data you wish to upload'.
Click 'Browse' to find the data export file on your local disk.
Make sure 'use Base URI as context identifier' is not selected.
Hit 'upload', and sit back.
Programmatically
First, open RepositoryConnections to both repositories:
RepositoryConnection source = sourceRepo.getConnection();
RepositoryConnection target = targetRepo.getConnection();
Then, read all statements from source and immediately insert into target:
target.add(source.getStatements(null, null, null, true));
Either basic method should work just fine for any repository up to about several million triples in size. Of course there are plenty of more advanced methods for larger bulk sizes.
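Wrapped into a self-contained sketch (the server URL and repository IDs are placeholders, and this assumes two HTTP-accessible repositories and a Sesame version with explicit transactions, i.e. 2.7 or later):

import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.http.HTTPRepository;

public class CopyRepository {
    public static void main(String[] args) throws Exception {
        // Placeholder server URL and repository IDs.
        Repository sourceRepo = new HTTPRepository("http://localhost:8080/openrdf-sesame", "old-repo");
        Repository targetRepo = new HTTPRepository("http://localhost:8080/openrdf-sesame", "new-repo");
        sourceRepo.initialize();
        targetRepo.initialize();

        RepositoryConnection source = sourceRepo.getConnection();
        RepositoryConnection target = targetRepo.getConnection();
        try {
            target.begin();
            // 'true' also copies inferred statements, as in the snippet above.
            target.add(source.getStatements(null, null, null, true));
            target.commit();
        } finally {
            source.close();
            target.close();
        }
    }
}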
For my current project I need to load a dataset and different ontologies and expose everything as Linked Data using Fuseki with TDB and Pubby. Pubby takes a data set from a single location and creates URIs based on that location; therefore, if we need multiple different locations (as in the case of 2–3 separate ontologies), that is easy to do with Pubby by adding another data set.
The concept of dataset seems to also apply to Fuseki.
Essentially I will need to expose three types of URIs:
www.mywebsite.com/project/data
www.mywebsite.com/project/data/structure
www.mywebsite.com/project/ontology
In order to create such URIs with Pubby 0.3.3, you have to specify lines like these:
conf:dataset [
conf:sparqlEndpoint <sparql_endpoint_url_ONE>;
conf:sparqlDefaultGraph <sparql_default_graph_name_ONE>;
conf:datasetBase <http://mywebsite.com/project/>;
conf:datasetURIPattern "(data)/.*";
(...)
]
Each data set specified in Pubby will take its data from a certain URL (typically a SPARQL endpoint).
For the ontologies you will have a second dataset that uses a datasetURIPattern like this one:
conf:dataset [
conf:sparqlEndpoint <sparql_endpoint_url_TWO>;
conf:sparqlDefaultGraph <sparql_default_graph_name_TWO>;
conf:datasetBase <http://mywebsite.com/project/>;
conf:datasetURIPattern "(ontology)/.*";
(...)
]
As you can see, the differences lie in the following: conf:sparqlEndpoint (the SPARQL endpoint), conf:sparqlDefaultGraph (the default graph), and conf:datasetURIPattern (needed for creating the actual URIs with Pubby).
It is not, however, clear to me how I can have separate URIs for the data sets when using Fuseki. With Sesame, for example, I just create two different repositories, and this trick works like a charm when publishing data with Pubby; it is not immediately clear how to do the same with Fuseki.
The examples from the official Fuseki documentation present a single dataset (read-only or not, etc), but none of them seem to present such a scenario. There is no immediate example where there is a clear separation between the TBox and the ABox, even though this is a fundamental principle of Linked Data (see Keeping ABox and TBox Split).
As far as I understand this should be possible, but how? Also, is it correct that the TBox and the ABox can later be reunited under a single SPARQL endpoint by using tdb:unionDefaultGraph true?
The dataset concept is not unique to Jena Fuseki; it's quite central in SPARQL. A dataset is a collection of named graphs and a default graph. The prefix of a URI has nothing to do with where triples about it are stored (whether in a named graph or in the default graph).
It sounds like you want to keep your ABox triples in one named graph and your TBox triples in another. Then, if the default graph is the union of the named graphs, you'll see the contents of both in the default graph. You can do that in Fuseki.
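In Fuseki itself that union behaviour is a matter of the TDB assembler configuration (tdb:unionDefaultGraph true, as you suspected). As a rough plain-Jena/TDB illustration of the same layout, with made-up paths, file names, and the graph IRIs from the question:

import org.apache.jena.query.Dataset;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.tdb.TDB;
import org.apache.jena.tdb.TDBFactory;

public class AboxTboxUnion {
    public static void main(String[] args) {
        Dataset ds = TDBFactory.createDataset("target/project-tdb"); // made-up path

        ds.begin(ReadWrite.WRITE);
        try {
            Model tbox = RDFDataMgr.loadModel("ontology.ttl"); // made-up file names
            Model abox = RDFDataMgr.loadModel("data.ttl");
            ds.addNamedModel("http://mywebsite.com/project/ontology", tbox); // TBox graph
            ds.addNamedModel("http://mywebsite.com/project/data", abox);     // ABox graph
            ds.commit();
        } finally {
            ds.end();
        }

        ds.begin(ReadWrite.READ);
        try {
            QueryExecution qe = QueryExecutionFactory.create(
                    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }", ds);
            // Query-time equivalent of tdb:unionDefaultGraph true in the assembler:
            // treat the default graph as the union of all named graphs.
            qe.getContext().set(TDB.symUnionDefaultGraph, true);
            ResultSetFormatter.out(qe.execSelect()); // sees TBox and ABox together
            qe.close();
        } finally {
            ds.end();
        }
    }
}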