Does RDF4J's AST allow for SPARQL Query Rewriting? - sparql

According to a couple answers here and here, it seems that it might be possible to use the AST produced by RDF4J's SyntaxTreeBuilder to modify a parsed SPARQL query in memory and then serialize it back to a SPARQL string. However, I have not found any implementations that handle this serialization given the AST. The closest thing of course is the SparqlQueryRenderer but my use case demands working with the AST, not the SPARQL algebra model.
Naturally, I could imagine implementing a class that rebuilds the SPARQL query string as an AST visitor, but I would rather use a different library if it is not already supported or implemented for RDF4J.
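For what it's worth, the visitor approach the question describes is simple in principle, even though RDF4J doesn't ship it. Here is a minimal, library-free sketch in Python: the node classes are hypothetical stand-ins for an AST like the one SyntaxTreeBuilder produces, not RDF4J's real classes.

```python
# Sketch: serializing a parsed-query AST back to a query string via the
# visitor pattern. Node classes are illustrative stand-ins, not RDF4J's.

class TriplePattern:
    def __init__(self, s, p, o):
        self.s, self.p, self.o = s, p, o

    def accept(self, visitor):
        return visitor.visit_triple(self)

class SelectQuery:
    def __init__(self, variables, patterns):
        self.variables = variables
        self.patterns = patterns

    def accept(self, visitor):
        return visitor.visit_select(self)

class SparqlRenderer:
    """Rebuilds the query string by walking the AST nodes."""

    def visit_triple(self, node):
        return f"{node.s} {node.p} {node.o} ."

    def visit_select(self, node):
        body = "\n  ".join(p.accept(self) for p in node.patterns)
        return f"SELECT {' '.join(node.variables)} WHERE {{\n  {body}\n}}"

# Rewriting then serializing: mutate the AST, re-render the string.
ast = SelectQuery(["?pizza"], [TriplePattern("?pizza", "ex:hasTopping", "ex:Onion")])
print(ast.accept(SparqlRenderer()))
```

In a real RDF4J implementation the same shape would be an `ASTVisitor` subclass whose visit methods append to a string builder.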

Related

Can SHACL be implemented using SPARQL?

I'm trying to find a way to implement SHACL validations using SPARQL in my AWS Neptune Graph database. Is there a way to do so?
Well, it depends on what you mean by "implement". ;-)
You cannot implement all of SHACL with SPARQL alone, but you could implement some subset; not with a single query, though. You could, for example, write a query that collects the constraints of your shapes, and then use those results to generate a query that gets the relevant parts of your data; you could then examine those results and produce a validation report. And if you are doing things programmatically, you could of course also implement those parts that cannot be expressed through SPARQL (e.g., literal string patterns).
All that is somewhat "academic". There are open source SHACL implementations that you could use as a Neptune client (e.g., pySHACL if you are using Python and RDFLib). That would be a better and certainly a far more practical way.
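To make the "generate a query from a constraint" idea concrete, here is a sketch of compiling a single `sh:minCount` constraint into a SPARQL query that finds violating focus nodes. The function and argument names are invented for illustration; this is nowhere near a full SHACL-to-SPARQL compiler.

```python
# Sketch: turn one SHACL minCount constraint (target class + property path)
# into a SPARQL query whose results are the violating focus nodes.

def min_count_violations_query(target_class, path, min_count):
    """Build a query that returns nodes of target_class with fewer than
    min_count values for the given path."""
    return (
        "SELECT ?focus WHERE {\n"
        f"  ?focus a {target_class} .\n"
        "  OPTIONAL { ?focus " + path + " ?v }\n"
        "}\n"
        "GROUP BY ?focus\n"
        f"HAVING (COUNT(?v) < {min_count})"
    )

# e.g. every ex:Person must have at least one ex:name
print(min_count_violations_query("ex:Person", "ex:name", 1))
```

Each result row would then become one entry in the validation report.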

Extension of SPARQL syntax

I am trying to extend SPARQL by introducing new clauses in the syntax. Is there a way to do this using GraphDB?
I need a headstart for this
GraphDB uses the Eclipse RDF4J SPARQL parser, which is open source, so you can extend that easily enough. The parser is defined as a JavaCC grammar file (see sparql.jjt in the RDF4J GitHub repository), so the easiest way to extend it is to add your extensions to that grammar file and then recompile the parser using JavaCC.
However, extending the parser is the easy part: once you've parsed something, you still need to transform it into an algebra model that the underlying triplestore can actually do something with.
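The split between the two parts can be illustrated without JavaCC. Suppose you invent a trailing `INFER <profile>` clause (the keyword is made up here purely for illustration): recognizing it is a few lines of token work, but you are then left holding a value the query engine has no idea what to do with — and that mapping is where the real effort goes.

```python
# Sketch: recognizing a hypothetical new clause is simple; deciding what the
# engine should do with the parsed result is the hard part. "INFER" is an
# invented keyword, not part of SPARQL or GraphDB.
import re

def parse_extended(query):
    """Split off a trailing 'INFER <profile>' clause.
    Returns (standard_sparql, profile_or_None) -- the easy, parsing step."""
    m = re.search(r"\bINFER\s+(\w+)\s*$", query)
    if not m:
        return query, None
    return query[:m.start()].rstrip(), m.group(1)

q, profile = parse_extended("SELECT ?s WHERE { ?s a ex:Pizza } INFER rdfs")
print(q)        # the standard query the engine already understands
print(profile)  # the new information you must now map onto engine behaviour
```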

Is Graql compiled or translated to gremlin?

I'm wrapping my head around Grakn a little to understand its added value, and I wonder: is Graql compiled or translated to Gremlin traversal steps?
This makes me wonder about the difference in expressivity between SPARQL and Graql, given that the former has so far not been fully translated into Gremlin; it seems to be an open problem. Is Graql fundamentally simpler than SPARQL, and would that explain why it can be fully translated, if that is the case? If not, are there any limitations in translating it to Gremlin steps at this point?
I'll try to shine some light on your questions.
To begin with, Graql was designed to be a high-level, human-readable query language. The main idea was to abstract the vertex-and-edge graph data structure into concepts that are specific to a given user-defined domain. That way users don't need to worry about the underlying graph representation and low-level Gremlin constructs; instead they can work with high-level terms they defined themselves and are familiar with.
Now, implementation-wise, Graql is an abstraction over Gremlin that translates the high-level queries into Gremlin traversals, which can then be executed against a specific graph. However, the mapping between Graql and Gremlin is not 1-1. In fact, Graql operates with some subset of Gremlin that allows it to capture the intended behaviours of the Graql language. It was never our intention to find such a 1-1 mapping, as the goal was to translate high-level queries into queries understandable by the underlying graph processor.
Now, the efficiency of the traversal generation. Graql queries can be decomposed into properties (has, isa, sub, etc.) and fragments. Each fragment has a defined Gremlin counterpart, and each property can possibly contain multiple fragments. The fragment translation is unambiguous; however, there is a lot of freedom in picking and arranging the fragments that go into a property. Keeping in mind that queries contain multiple properties, this makes the arrangement a strictly non-trivial task. To perform this arrangement, which in Gremlin is handed to the user, we implemented a query processor. The idea of the processor is to pick an arrangement and ordering of the fragments such that the resulting query execution is as fast as possible. This is reminiscent of SQL query processors, and the motivation is exactly the same: to abstract the query optimisation away from the user.
We are actively working on the query-planning component, and although it gives no guarantee of producing the optimal plan in all cases, we are trying to make the produced plans converge to optimal solutions.
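As a tiny illustration of the arrangement problem described above (and nothing like Grakn's actual planner), a planner's simplest possible strategy is a greedy "most selective fragment first" ordering over per-fragment cost estimates:

```python
# Sketch: ordering fragments by estimated cost so that cheap, selective
# fragments run first and later fragments touch fewer graph elements.
# A stand-in for a real cost-based query planner; fragment names invented.

def plan(fragments):
    """fragments: list of (name, estimated_cost) pairs.
    Returns fragment names in greedy cheapest-first execution order."""
    return [name for name, cost in sorted(fragments, key=lambda f: f[1])]

fragments = [("out-isa", 50.0), ("has-value", 2.0), ("in-sub", 10.0)]
print(plan(fragments))
```

A real planner must also respect variable-binding dependencies between fragments, which is exactly what makes the task non-trivial.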

Understanding the difference between SPARQL and semantic reasoning using Pellet

I have a pizza ontology that defines different types of pizzas, ingredients and relations among them.
I just want to understand several basic things:
1. Is it correct that I should apply SPARQL if I want to obtain information without reasoning? E.g. which pizzas contain onion?
2. What is the difference between SPARQL and reasoning algorithms like Pellet? Which queries cannot be answered by SPARQL, but can be answered by Pellet? Some examples of queries (question-like) for the pizza ontology would be helpful.
3. As far as I understand, to use SPARQL from Java with Jena, I should save my ontology in RDF/XML format. However, to use Pellet with Jena, which format do I need to select? Pellet uses OWL 2...
SPARQL is a query language, that is, a language for formulating questions in. Reasoning, on the other hand, is the process of deriving new information from existing data. These are two different, complementary processes.
To retrieve information from your ontology you use SPARQL, yes. You can do this without reasoning, or in combination with a reasoner, too. If you have a reasoner active it means your queries can be simpler, and in some cases reasoners can derive information that is not really retrievable at all with just a query.
Reasoners like Pellet don't really answer queries, they just reason: they figure out what implicit information can be derived from the raw facts, and they can do things like verifying that things are consistent (i.e. that there are no logical contradictions in your data). Pellet can figure out that if you own a Toyota, which is of type Car, you own a Vehicle (because a Car is a type of Vehicle). Or it can figure out that if you define a pizza to have the ingredient "Parmesan", you have a pizza of type "Cheesy" (because it knows Parmesan is a type of Cheese). So you use a reasoner like Pellet to derive this kind of implicit information, and then you use a query language like SPARQL to actually ask: "OK, give me an overview of all Cheesy pizzas that also have anchovies".
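The Toyota and Parmesan examples above boil down to repeatedly applying a subclass rule until nothing new can be derived. Here is a toy forward-chaining sketch of just that one rule — a minimal caricature, not Pellet, which implements full OWL 2 DL reasoning:

```python
# Toy illustration of what a reasoner adds: from explicit facts plus
# subclass axioms, derive type memberships that were never stated.

subclass_of = {"Car": "Vehicle", "Parmesan": "Cheese"}
facts = {("myToyota", "Car"), ("topping1", "Parmesan")}

def infer(facts, subclass_of):
    """Forward-chain the rule: (x, C) and C subClassOf D  =>  (x, D)."""
    derived = set(facts)
    changed = True
    while changed:  # repeat until fixpoint: no new facts derivable
        changed = False
        for thing, cls in list(derived):
            if cls in subclass_of and (thing, subclass_of[cls]) not in derived:
                derived.add((thing, subclass_of[cls]))
                changed = True
    return derived

print(("myToyota", "Vehicle") in infer(facts, subclass_of))
```

A SPARQL query asking for all Vehicles against the derived facts now finds myToyota, even though the raw data only ever said it was a Car.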
APIs like Jena are toolkits that treat RDF as an abstract model. Which syntax format you save your file in is immaterial, it can read almost any RDF syntax. As soon as you have it read in a Jena model you can execute the Pellet reasoner on it - it doesn't matter which syntax your original file was in. Details on how to do this can be found in the Jena documentation.

Manipulating RDF in Jena

Currently I have found that I can query RDF using the Model syntax in Jena after loading the model from a file, and it gives me the same output as applying a SPARQL query. So I want to know: is it a good way to do that without SPARQL? Though I have only tested it with a small RDF file. I also want to know, if I use Virtuoso, can I manipulate data using the Model syntax without SPARQL?
Thanks in Advance.
I'm not quite sure if I understand your question. If I can paraphrase, I think you're asking:
Is it OK to query and manipulate RDF data using the Jena Model API instead of using
SPARQL? Does it make a difference if the back-end store is Virtuoso?
Assuming that's the right re-phrasing of the question, then the first part is definitely yes: you can manipulate RDF data through the Model and OntModel APIs. In fact, I would say that's what the majority of Jena users do, particularly for small queries or updates. I find personally that going direct to the API is more succinct up to a certain point of complexity; after that, my code is clearer and more concise if I express the query in SPARQL. Obviously circumstances will have an effect: if you're working with a mixture of local stores and remote SPARQL endpoints (for which sending a query string is your only option) then you may find the consistency of always using SPARQL makes your code clearer.
Regarding Virtuoso, I don't have any direct experience to offer. As far as I know, the Virtuoso Jena Provider fully implements the features of the Model API using a Virtuoso store as the storage layer. Whether the direct API or using SPARQL queries gives you a performance advantage is something you should measure by benchmark with your data and your typical query patterns.