Using the dotNetRDF library to query a large RDF file with SPARQL

I want to query an ontology defined in an RDF file using SPARQL and the dotNetRDF library. The problem is that the file is large, so it is not practical to load the entire file into memory. What should I do?
Thanks in advance.

As AKSW says in the comment, the best approach would be to load your RDF file into a triple store and then run your SPARQL queries against that. dotNetRDF ships with support for several triple stores, as listed at https://github.com/dotnetrdf/dotnetrdf/wiki/UserGuide-Storage-Providers. However, all you really need is a triple store that supports the SPARQL protocol; you will then be able to run your queries from dotNetRDF code using the SparqlRemoteEndpoint class, as described at https://github.com/dotnetrdf/dotnetrdf/wiki/UserGuide-Querying-With-SPARQL#remote-query.
As for which triple store to use, Jena with Fuseki is probably a good open-source choice.
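The remote-query approach can be illustrated independently of dotNetRDF. Below is a minimal Python sketch of what a SPARQL protocol client does: it sends the query text to the endpoint and receives only the result rows, so the large RDF file never has to be loaded by the client. The endpoint URL assumes a local Fuseki dataset hypothetically named "ds", and the helper names (build_request, parse_bindings, run_query) are made up for illustration. In dotNetRDF, the SparqlRemoteEndpoint class mentioned above plays this role.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local Fuseki dataset named "ds"; adjust to your own endpoint.
ENDPOINT = "http://localhost:3030/ds/sparql"


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query to the endpoint; only the result rows travel back."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

With a triple store running, run_query(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 10") would return up to ten rows as dictionaries keyed by variable name.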

Related

Mount a SPARQL endpoint for use with custom ontologies and triple RDFs

I've been trying to figure out how to set up a SPARQL endpoint for a couple of days, but however much I read, I cannot understand it.
To explain my intention: I have an open data server running on CKAN, and my goal is to be able to run SPARQL queries on the data. I know I cannot do it directly on the datasets themselves; I would have to define my own OWL ontology and convert the data I want to use from CSV (the format it is currently in) to RDF triples (to be used as linked data).
The idea was to test first with the repository metadata that can be generated automatically with the ckanext-dcat extension, but I really cannot find where to start. I've searched for information on how to install a Virtuoso server for SPARQL, but what I've found leaves a lot to be desired, and I cannot find anything that explains how I could actually load my own OWL and RDF files into Virtuoso.
Could someone lend me a hand with how to get started? Thank you.
I'm a little confused. Maybe this is two or more questions?
1. How to convert tabular data, like CSV, into the RDF semantic format?
This can be done with an R2RML approach. Karma is a great GUI for that purpose. Like you say, a conversion like that can really be improved with an underlying OWL ontology. But it can be done without creating a custom ontology, too.
I have elaborated on this in the answer to another question.
2. Now that I have some RDF formatted data, how can I expose it with a SPARQL endpoint?
Virtuoso is a reasonable choice. There are multiple ways to deploy it and multiple ways to load the data, and therefore lots of tutorials on the subject. Here's one good one, from DBpedia.
If you'd like a simpler path to starting an RDF triplestore with a SPARQL endpoint, Stardog and Blazegraph are available as JARs, and RDF4J can easily be deployed within a container like Tomcat.
All provide web-based graphical interfaces for loading data and running queries, in addition to SPARQL REST endpoints. At least Stardog also provides command-line tools for bulk loading.
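For the CSV-to-RDF step in point 1, the basic idea can be sketched without any tooling. The following Python sketch is not R2RML and not what Karma does; it is a hand-rolled mapping in which each CSV row becomes a subject and each column becomes a property, written out as N-Triples. The namespace http://example.org/ and the function names are assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace for the generated resources and properties.
BASE = "http://example.org/"


def literal(value):
    """Serialize a string as an N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def csv_to_ntriples(text, key_column):
    """Yield one N-Triples line per non-empty, non-key cell of the CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        subject = "<" + BASE + "resource/" + quote(row[key_column], safe="") + ">"
        for column, value in row.items():
            # Skip the key itself, empty cells, and cells without a header.
            if column is None or column == key_column or not value:
                continue
            predicate = "<" + BASE + "property/" + quote(column, safe="") + ">"
            yield subject + " " + predicate + " " + literal(value) + " ."
```

The resulting lines can be saved to a .nt file and loaded into whichever triple store you choose. A real conversion would normally map columns to properties from an existing vocabulary or your own ontology rather than minting one property per column.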

Is it possible to load multiple ontologies with Jena?

I am new to Jena and I am implementing an application to manipulate RDF data. I have already implemented some basic functions to view classes, properties, and other things.
I wonder whether it is possible to load multiple ontologies with Jena and then run SPARQL queries on them. (I already know it is possible in principle, because Protégé does it.)
Thanks for your interest.
I'm happy to answer any questions or clarify anything.

Virtuoso SPARQL engine

I'd like to ask if anyone knows how a SPARQL query is actually executed in Virtuoso Open-Source Edition. Is the SPARQL query mapped to an SQL query? How are the RDF data accessed?
Should I read the source code for this?
Thanks for any hint!
Is the SPARQL query mapped to an SQL query?
Yes, it is.
How are the RDF data accessed?
You can find some details on RDF data storage here: Implementing a SPARQL compliant RDF Triple Store using a SQL-ORDBMS
But if you are writing an RDF-aware app, you should just use the SPARQL interface and ignore all implementation details. It's fast and feature-rich.
Should I read the source code for this?
Well, it depends on what you need to achieve. It's not necessary for usage, but if you are planning to provide patches, then reading the source is a must :)
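To give a feel for what mapping SPARQL to SQL means, here is a toy Python sketch using SQLite. It is not Virtuoso's actual schema or the SQL it generates; the table name, column names, and data are invented for illustration. The general idea it shows is that each triple pattern becomes one alias of a triples table, and a variable shared between patterns becomes a join condition.

```python
import sqlite3

# Toy data: one table holding every triple, as in a SQL-backed RDF store.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:rex", "ex:name", "Rex"),
]

# SPARQL being mimicked:
#   SELECT ?name WHERE { ?x rdf:type ex:Person . ?x ex:name ?name }
# Each triple pattern becomes one alias of the table; the shared variable ?x
# becomes the join condition t1.s = t2.s.
SQL = """
SELECT t2.o AS name
FROM triples AS t1
JOIN triples AS t2 ON t1.s = t2.s
WHERE t1.p = 'rdf:type' AND t1.o = 'ex:Person' AND t2.p = 'ex:name'
ORDER BY name
"""


def person_names():
    """Load the toy triples into SQLite and run the hand-translated query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
        conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)
        return [row[0] for row in conn.execute(SQL)]
    finally:
        conn.close()
```

A production store adds much more on top of this (identifier encoding, indexes, a graph column, query optimization), which is why using the SPARQL interface and ignoring these details is usually the right call.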

semantic web + linked data integration

I'm new to the Semantic Web.
I'm trying to build a sample application where I can query data from different data sources in one query.
I have created a small RDF file that contains references to DBpedia resources for defining localities. My question is: how can I get the data contained in my file together with information from the description of the remote resource (for example, the name of the person from the local file and the total population of a city, dbpedia-owl:populationTotal, from the remote RDF file)?
I don't really understand the SPARQL query language. I tried to use the Jena ARQ API with the SERVICE keyword, but it doesn't solve the problem.
Any help, please?
I guess you are looking for something like the Semantic Web Client Library, which tries to leverage the GGG (Giant Global Graph). Admittedly, the standard exploration algorithm of this framework follows rdfs:seeAlso links. Nevertheless, the general approach seems to be what you are looking for: you create a local graph consisting of your seed graph, traverse the relations up to a certain depth (e.g., three steps), resolve the URIs, and load that content into your local triple store. Using more advanced technologies such as SPARQL federation might be something for later ;)
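The traversal idea described above can be sketched in a few lines of Python. This is not the Semantic Web Client Library's algorithm, only a breadth-first expansion under stated assumptions: the dictionary FAKE_WEB stands in for dereferencing URIs over HTTP, and all resource names and values in it are invented.

```python
from collections import deque

# Stand-in for the web: maps a URI to the triples obtained by dereferencing it.
# A real implementation would fetch and parse RDF over HTTP instead.
FAKE_WEB = {
    "ex:Springfield": [
        ("ex:Springfield", "ex:populationTotal", "12345"),
        ("ex:Springfield", "ex:country", "ex:Freedonia"),
    ],
    "ex:Freedonia": [("ex:Freedonia", "ex:capital", "ex:Capitol")],
    "ex:Capitol": [("ex:Capitol", "ex:mayor", "ex:someone")],
}

# The local seed graph: a person who lives in a remotely described city.
SEED = [("ex:alice", "ex:livesIn", "ex:Springfield")]


def dereference(uri):
    """Return the triples published at a URI (empty for literals/unknowns)."""
    return FAKE_WEB.get(uri, [])


def crawl(seed_triples, lookup, max_depth):
    """Expand a seed graph breadth-first by resolving the objects of triples."""
    graph = set(seed_triples)
    seen = set()
    queue = deque((o, 1) for (_, _, o) in seed_triples)
    while queue:
        uri, depth = queue.popleft()
        if uri in seen or depth > max_depth:
            continue
        seen.add(uri)
        for triple in lookup(uri):
            graph.add(triple)
            queue.append((triple[2], depth + 1))
    return graph
```

Calling crawl(SEED, dereference, 1) pulls in only the city's own description, while larger depths follow further links. The merged graph can then be queried locally like any other RDF data.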
I have retrieved data from two different sources using a SPARQL query with named graphs.
I used Jena ARQ to execute the SPARQL query.

Manipulating RDF in Jena

I recently found out that, after loading a model from an RDF file, I can query it using the Model API in Jena, and it gives me the same output as a SPARQL query. So I want to know: is it a good approach to do this without SPARQL? I have only tested it with a small RDF file, though. I also want to know whether, if I use Virtuoso, I can manipulate the data using the Model API without SPARQL.
Thanks in advance.
I'm not quite sure if I understand your question. If I can paraphrase, I think you're asking:
Is it OK to query and manipulate RDF data using the Jena Model API instead of using SPARQL? Does it make a difference if the back-end store is Virtuoso?
Assuming that's the right rephrasing of the question, the answer to the first part is definitely yes: you can manipulate RDF data through the Model and OntModel APIs. In fact, I would say that's what the majority of Jena users do, particularly for small queries or updates. Personally, I find that going direct to the API is more succinct up to a certain point of complexity; after that, my code is clearer and more concise if I express the query in SPARQL. Obviously, circumstances will have an effect: if you're working with a mixture of local stores and remote SPARQL endpoints (for which sending a query string is your only option), then you may find that the consistency of always using SPARQL makes your code clearer.
Regarding Virtuoso, I don't have any direct experience to offer. As far as I know, the Virtuoso Jena Provider fully implements the features of the Model API using a Virtuoso store as the storage layer. Whether the direct API or SPARQL queries give you a performance advantage is something you should measure by benchmarking with your data and your typical query patterns.
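The trade-off between direct API access and SPARQL can be illustrated with a toy example. The Python sketch below is not Jena; the match function is a made-up stand-in that is similar in spirit to pattern lookups such as Jena's listStatements, with None as a wildcard, and the data is invented. It shows why a simple lookup is compact through an API, while a multi-step join starts to need hand-written loops that a single SPARQL graph pattern would express directly.

```python
# Toy in-memory graph; all names are invented for illustration.
GRAPH = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in GRAPH
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def name_of(resource):
    """Simple lookup: one API call is enough."""
    return [o for (_, _, o) in match(s=resource, p="ex:name")]


def names_of_friends_of_friends(resource):
    """Two-hop join written as nested loops over API calls.

    In SPARQL the same request would be a single graph pattern, e.g.
      SELECT ?n WHERE { ex:alice ex:knows ?f . ?f ex:knows ?g . ?g ex:name ?n }
    """
    names = []
    for _, _, friend in match(s=resource, p="ex:knows"):
        for _, _, friend_of_friend in match(s=friend, p="ex:knows"):
            names.extend(name_of(friend_of_friend))
    return names
```

As the number of joined patterns grows, the loop version gets longer with every hop, whereas the query version only gains one more line in the pattern.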