virtuoso sparql engine - sparql

I'd like to ask if anyone knows how a sparql query is actually executed in virtuoso opensource edition. Is the sparql query mapped to an sql query? How the RDF data are accessed?
Should I read the source code for this?
Thanks for any hint!

Is the sparql query mapped to an sql query?
Yes, it is
How the RDF data are accessed?
You can find some details on rdf-data storage here: Implementing a SPARQL compliant RDF Triple Store using a SQL-ORDBMS
But if you are writing RDF-aware app you should just use the SPARQL interface and ignore all implementation details. It's fast and feature-rich.
Should I read the source code for this?
Well, it depends on what you need to achieve. It's not necessary for usage, but if you are planning to provide patches then reading source is a must :)

Related

SPARQL consider additional triples in a query

So, I need to run SPARQL query over a semantic database but some of the triples are not going to be in the database but are going to be provided by webservices (and not as a SPARQL endpoint). I would want to be able to run a SELECT query that take into consideration those additional triples but without having to insert them in the database, is there a way to do that ?
This is not part of the SPARQL spec, so "no" is the general answer.
That said, Virtuoso (possibly among others) lets you include an external RDF source (a/k/a webservice) as part of the FROM (among other methods), to be dereferenced during SPARQL query processing.
Such webservice need not be a SPARQL endpoint, but best performance will result if it provides RDF (though serialization may vary).
The Virtuoso Sponger can also be invoked on the fly to derive RDF from many document formats (with an obvious performance hit). To pursue, please raise this to the OpenLink Community Forum.

Using dotnetRDF library to query Large RDF file by SPARQL

i want to query an Ontology which is defined in an RDF file using SPARQL and dotnetRDF library. The problem is that the file is large, so it's not very practical to load the entire file in memory. What should i do ?
Thanks in advance
As AKSW says in the comment, the best approach would be to load your RDF file into a triple store and then run your SPARQL queries against that. dotNetRDF ships with support for several triple stores as listed at https://github.com/dotnetrdf/dotnetrdf/wiki/UserGuide-Storage-Providers. However, all you really need is a triple store that supports the SPARQL protocol and then you will be able to run your queries from dotNetRDF code using the SparqlRemoteEndpointclass as described at https://github.com/dotnetrdf/dotnetrdf/wiki/UserGuide-Querying-With-SPARQL#remote-query.
As for which triple store to use, Jena with Fuseki is probably a good open-source choice.

Mount a SPARQL endpoint for use with custom ontologies and triple RDFs

I've been trying to figure out how to mount a SPARQL endpoint for a couple of days, but as much as I read I can not understand it.
Comment my intention: I have an open data server mounted on CKAN and my goal is to be able to use SPARQL queries on the data. I know I could not do it directly on the datasets themselves, and I would have to define my own OWL and convert the data I want to use from CSV format (which is the format they are currently in) to RDF triple format (to be used as linked data).
The idea was to first test with the metadata of the repositories that can be generated automatically with the extension ckanext-dcat, but is that I really do not find where to start. I've searched for information on how to install a Virtuoso server for the SPARQL, but the information I've found leaves a lot to be desired, not to say that I can find nowhere to explain how I could actually introduce my own OWLs and RDFs into Virtuoso itself.
Someone who can lend me a hand to know how to start? Thank you
I'm a little confused. Maybe this is two or more questions?
1. How to convert tabular data, like CSV, into the RDF semantic format?
This can be done with an R2RML approach. Karma is a great GUI for that purpose. Like you say, a conversion like that can really be improved with an underlying OWL ontology. But it can be done without creating a custom ontology, too.
I have elaborated on this in the answer to another question.
2. Now that I have some RDF formatted data, how can I expose it with a SPARQL endpoint?
Virtuoso is a reasonable choice. There are multiple ways to deploy it and multiple ways to load the data, and therefore LOTs of tutorial on the subject. Here's one good one, from DBpedia.
If you'd like a simpler path to starting an RDF triplestore with a SPARQL endpoint, Stardog and Blazegraph are available as JARs, and RDF4J can easily be deployed within a container like Tomcat.
All provide web-based graphical interfaces for loading data and running queries, in addition to SPARQL REST endpoints. At least Stardog also provides command-line tools for bulk loading.

Convert user query to SPARQL Query

I'm building a movie ontology for my class project. My problem is how do I integrate user entered query (Through provided web page) to SPARQL Query and get the list of answers from the database through the ontology.
I have some knowledge of mapping the ontology with my database. Please provide me a solution for this.
Thanks in advance.
(I'm using Protege with ontopro plugin for mapping)
Thanks
If you mean converting user queries to machine understandable queries, is not an easy task. This is a question in the domain of Question Answering (QA) and needs sophisticated work.
Three steps of building a QA system including question analysis, document (database) analysis and answer extraction. Each of these three, also consists of different tasks. For example, question analysis includes feature selection and extraction, building a classifier and evaluating it to get the Expected Answer Type (EAT).
But if you have a static format for your queries, then you can ask users to propose their queries in that format and then, you have an easier job of matching user queries to SPARQL queries and getting the answers.
Personally, I think you should start from the point of having the SPARQL query or use a keyword-based interface instead of a text-based interface if you want to make your job easy as this method is suitable for small projects.
What do you mean by user query? a text search? a faceted search? a mix? something else? For me, this is not clear and a bit of detail would help understanding.
Also an example would actually be ideal. Like the "user query" and the "sparql equivalent" if possible.

Manipulating RDF in Jena

currently, I found out that I can query using model (Model) syntax in Jena in a rdf after loading the model from a file, it gives me same output if I apply a sparql query. So, I want to know that , is it a good way to that without sparql? Though I have tested it with a small rdf file. I also want to know if I use Virtuoso can i manipulate using model syntax without sparql?
Thanks in Advance.
I'm not quite sure if I understand your question. If I can paraphrase, I think you're asking:
Is it OK to query and manipulate RDF data using the Jena Model API instead of using
SPARQL? Does it make a difference if the back-end store is Virtuoso?
Assuming that's the right re-phrasing of the question, then the first part is definitively yes: you can manipulate RDF data through the Model and OntModel APIs. In fact, I would say that's what the majority of Jena users do, particularly for small queries or updates. I find personally that going direct to the API is more succinct up to a certain point of complexity; after that, my code is clearer and more concise if I express the query in SPARQL. Obviously circumstances will have an effect: if you're working with a mixture of local stores and remote SPARQL endpoints (for which sending a query string is your only option) then you may find the consistency of always using SPARQL makes your code clearer.
Regarding Virtuoso, I don't have any direct experience to offer. As far as I know, the Virtuoso Jena Provider fully implements the features of the Model API using a Virtuoso store as the storage layer. Whether the direct API or using SPARQL queries gives you a performance advantage is something you should measure by benchmark with your data and your typical query patterns.