Simple Jena SPARQL query not working

What am I doing wrong here?
public class SimpleSearchTest {
    public static void main(String[] args) throws Exception {
        Model model = ModelFactory.createDefaultModel();
        model.getGraph().add(new Triple(Node.createURI("a"), Node.createURI("b"), Node.createURI("c")));
        String queryString = "SELECT ?p ?o WHERE { <a> ?p ?o }";
        Query query = QueryFactory.create(queryString);
        QueryExecution qExec = QueryExecutionFactory.create(query, model);
        ResultSetFormatter.out(qExec.execSelect());
    }
}
I am expecting
-------------
| p   | o   |
=============
| <b> | <c> |
-------------
But instead I am getting no results:
---------
| p | o |
=========
---------
I am sure it is something dumb...

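I think the problem is your <a>: it is not an absolute URI, and the SPARQL parser resolves relative IRIs against a base IRI, so the <a> in the query most likely ends up different from the node you created from the bare string "a" (though it's odd that you don't get a warning). If you change your code as follows: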
model.getGraph().add(new Triple(Node.createURI("http://example.com/a"), Node.createURI("b"), Node.createURI("c")));
String queryString = "SELECT ?p ?o WHERE { <http://example.com/a> ?p ?o }";
you get the result you are expecting.
Parenthetically, by creating the test graph with Node.createURI() you are using the lower-level internal Graph API rather than the more commonly used Model API. It's perfectly fine to do this, but the Graph API generally assumes you know what you are doing, and may have fewer checks against doing the unexpected.
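To see why a bare <a> in a query might not match a node created from the string "a", it can help to look at how a relative reference is resolved against a base. The sketch below uses only java.net.URI from the JDK, not Jena. The base http://example.com/base/ and the class name RelativeIriDemo are made up for illustration, and it assumes the query parser resolves relative IRIs in the standard way; the base Jena actually uses may well be different.

```java
import java.net.URI;

public class RelativeIriDemo {
    // Resolve a (possibly relative) reference against a base IRI.
    public static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/base/"; // hypothetical base IRI

        // A relative reference picks up the base ...
        String resolved = resolve(base, "a");
        System.out.println(resolved);             // http://example.com/base/a

        // ... so it is no longer equal to the bare string "a".
        System.out.println(resolved.equals("a")); // false

        // An absolute IRI is left untouched by resolution.
        System.out.println(resolve(base, "http://example.com/a")); // http://example.com/a
    }
}
```

Using absolute IRIs on both sides, as in the corrected code, avoids depending on whatever base happens to be in effect.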

Related

How to convert a class expression (from restriction unionOf) to a string?

A SPARQL query returns a result with restrictions with allValuesFrom and unionOf. I need to concatenate these values, but when I use the bind or str functions, the result is blank.
I tried the bind, str and group_concat functions, but none of them worked. group_concat returns a blank node.
SELECT DISTINCT ?source ?is_succeeded_by
WHERE {
?source rdfs:subClassOf ?restriction .
?restriction owl:onProperty j.0:isSucceededBy .
?restriction owl:allValuesFrom ?is_succeeded_by .
FILTER (REGEX(STR(?source), 'gatw-Invoice_match'))
}
Result of the SPARQL query in Protégé:
You can hardly obtain strings like 'xxx or yyy' programmatically in Jena:
that is Manchester Syntax, an OWL-API native format, which Jena does not support.
Any anonymous class expression is actually a blank node; raw RDF has no built-in symbols like 'or'.
To represent an anonymous class expression as a string, you can use ONT-API,
a Jena-based implementation of OWL-API, which therefore supports both SPARQL and Manchester Syntax.
Here is an example based on the pizza ontology:
// use pizza, since no example data provided in the question:
IRI pizza = IRI.create("https://raw.githubusercontent.com/owlcs/ont-api/master/src/test/resources/ontapi/pizza.ttl");
// get OWLOntologyManager instance from ONT-API
OntologyManager manager = OntManagers.createONT();
// as extended Jena model:
OntModel model = manager.loadOntology(pizza).asGraphModel();
// prepare query that looks like the original, but for pizza
String txt = "SELECT DISTINCT ?source ?is_succeeded_by\n" +
        "WHERE {\n" +
        " ?source rdfs:subClassOf ?restriction . \n" +
        " ?restriction owl:onProperty :hasTopping . \n" +
        " ?restriction owl:allValuesFrom ?is_succeeded_by .\n" +
        " FILTER (REGEX(STR(?source), 'Am'))\n" +
        "}";
Query q = new Query();
q.setPrefixMapping(model);
q = QueryFactory.parse(q, txt, null, Syntax.defaultQuerySyntax);
// from owlapi-parsers package:
OWLObjectRenderer renderer = new ManchesterOWLSyntaxOWLObjectRendererImpl();
// from ont-api (although it is a part of internal API, it is public):
InternalObjectFactory iof = new SimpleObjectFactory(manager.getOWLDataFactory());
// exec SPARQL query:
try (QueryExecution exec = QueryExecutionFactory.create(q, model)) {
    ResultSet res = exec.execSelect();
    while (res.hasNext()) {
        QuerySolution qs = res.next();
        List<Resource> vars = Iter.asStream(qs.varNames()).map(qs::getResource).collect(Collectors.toList());
        if (vars.size() != 2)
            throw new IllegalStateException("For the specified query and valid OWL must not happen");
        // Resource (Jena) -> OntCE (ONT-API) -> ONTObject (ONT-API) -> OWLClassExpression (OWL-API)
        OWLClassExpression ex = iof.getClass(vars.get(1).inModel(model).as(OntClass.class)).getOWLObject();
        // format: 'class local name' ||| 'superclass string in ManSyn'
        System.out.println(vars.get(0).getLocalName() + " ||| " + renderer.render(ex));
    }
}
The output:
American ||| MozzarellaTopping or PeperoniSausageTopping or TomatoTopping
AmericanHot ||| HotGreenPepperTopping or JalapenoPepperTopping or MozzarellaTopping or PeperoniSausageTopping or TomatoTopping
Used env: ont-api:2.0.0, owl-api:5.1.11, jena-arq:3.13.1
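If adding ONT-API is not an option and the expression is known to be a flat union of named classes, the 'xxx or yyy' string can also be put together by hand once the member IRIs of the owl:unionOf list have been collected (for example with a property path such as owl:unionOf/rdf:rest*/rdf:first). Below is a minimal sketch in plain Java, with no Jena or OWL-API involved. The class UnionLabelDemo, its methods, and the http://example.org/pizza# namespace are made up for illustration, and it does not handle nested or other kinds of class expressions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UnionLabelDemo {
    // Local name = everything after the last '#' or '/' in the IRI.
    public static String localName(String iri) {
        int cut = Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'));
        return iri.substring(cut + 1);
    }

    // Join the local names of the union members with " or ".
    public static String renderUnion(List<String> memberIris) {
        return memberIris.stream()
                .map(UnionLabelDemo::localName)
                .collect(Collectors.joining(" or "));
    }

    public static void main(String[] args) {
        // Hypothetical member IRIs, as they might come back from a query.
        List<String> members = Arrays.asList(
                "http://example.org/pizza#MozzarellaTopping",
                "http://example.org/pizza#TomatoTopping");
        System.out.println(renderUnion(members)); // MozzarellaTopping or TomatoTopping
    }
}
```

For anything beyond a flat union, a real renderer such as the one shown above is the safer route.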

Does Jena TDB load all data into memory every time?

I am new to Jena. I am trying to work with the YAGO dataset using TDB. The dataset is about 200M, and every time I run the same query it takes about 5 minutes to load the data before giving the results. Am I misunderstanding some part of TDB? Here is my code.
String directory = "tdb";
Dataset dataset = TDBFactory.createDataset(directory);
dataset.begin(ReadWrite.WRITE);
Model tdb = dataset.getDefaultModel();
//String source = "yagoMetaFacts.ttl";
//FileManager.get().readModel(tdb, source);
String queryString = "SELECT DISTINCT ?p WHERE { ?s ?p ?o. }";
Query query = QueryFactory.create(queryString);
try (QueryExecution qexec = QueryExecutionFactory.create(query, tdb)) {
    ResultSet results = qexec.execSelect();
    ResultSetFormatter.out(System.out, results, query);
}
dataset.commit();
dataset.end();
There are two ways to load data into TDB: via the API or via the command line (CMD). Many thanks to @ASKW and @AndyS.
1 Load data via API
This code needs to be executed only once; the readModel line in particular takes a long time.
String directory = "tdb";
Dataset dataset = TDBFactory.createDataset(directory);
dataset.begin(ReadWrite.WRITE);
Model tdb = dataset.getDefaultModel();
String source = "yagoMetaFacts.ttl";
FileManager.get().readModel(tdb, source);
dataset.commit(); //Important!! This is to commit the data to tdb.
dataset.end();
After the data is loaded into TDB, we can use the following code to query it. It is not necessary to load the data again.
String directory = "path\\to\\tdb";
Dataset dataset = TDBFactory.createDataset(directory);
Model tdb = dataset.getDefaultModel();
String queryString = "SELECT DISTINCT ?p WHERE { ?s ?p ?o. }";
Query query = QueryFactory.create(queryString);
try (QueryExecution qexec = QueryExecutionFactory.create(query, tdb)) {
    ResultSet results = qexec.execSelect();
    ResultSetFormatter.out(System.out, results, query);
}
2 Load data via CMD
To load data
>tdbloader --loc=path\to\tdb path\to\dataset.ttl
To query
>tdbquery --loc=path\to\tdb --query=q1.rq
q1.rq is the file that stores the query.
You should get results like this:
-------------------------------------------------------
| p                                                   |
=======================================================
| <http://yago-knowledge.org/resource/hasGloss>       |
| <http://yago-knowledge.org/resource/occursSince>    |
| <http://yago-knowledge.org/resource/occursUntil>    |
| <http://yago-knowledge.org/resource/byTransport>    |
| <http://yago-knowledge.org/resource/hasPredecessor> |
| <http://yago-knowledge.org/resource/hasSuccessor>   |
| <http://www.w3.org/2000/01/rdf-schema#comment>      |
-------------------------------------------------------

Simple SPARQL query does not return any results

I am just getting up and running with Blazegraph in embedded mode. I load a few sample triples and am able to retrieve them with a "select all" query:
SELECT * WHERE { ?s ?p ?o }
This query returns all my sample triples:
[s=<<<http://github.com/jschmidt10#person_Thomas>, <http://github.com/jschmidt10#hasAge>, "30"^^<http://www.w3.org/2001/XMLSchema#int>>>;p=blaze:history:added;o="2017-01-15T16:11:15.909Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>]
[s=<<<http://github.com/jschmidt10#person_Tommy>, <http://github.com/jschmidt10#hasLastName>, "Test">>;p=blaze:history:added;o="2017-01-15T16:11:15.909Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>]
[s=<<<http://github.com/jschmidt10#person_Tommy>, <http://www.w3.org/2002/07/owl#sameAs>, <http://github.com/jschmidt10#person_Thomas>>>;p=blaze:history:added;o="2017-01-15T16:11:15.909Z"^^<http://www.w3.org/2001/XMLSchema#dateTime>]
[s=<http://github.com/jschmidt10#person_Thomas>;p=<http://github.com/jschmidt10#hasAge>;o="30"^^<http://www.w3.org/2001/XMLSchema#int>]
[s=<http://github.com/jschmidt10#person_Tommy>;p=<http://github.com/jschmidt10#hasLastName>;o="Test"]
[s=<http://github.com/jschmidt10#person_Tommy>;p=<http://www.w3.org/2002/07/owl#sameAs>;o=<http://github.com/jschmidt10#person_Thomas>]
Next I try a simple query for a particular subject:
SELECT * WHERE { <http://github.com/jschmidt10#person_Thomas> ?p ?o }
This query yields no results. It seems that none of my queries for a URI are working. I am able to get results when I query for a literal (e.g. ?s ?p "Test").
The API I am using to create my query is BigdataSailRepositoryConnection.prepareQuery().
Code snippet (Scala) that executes and generates the query:
val props = BasicRepositoryProvider.getProperties("./graph.jnl")
val sail = new BigdataSail(props)
val repo = new BigdataSailRepository(sail)
repo.initialize()
val query = "SELECT ?p ?o WHERE { <http://github.com/jschmidt10#person_Thomas> ?p ?o }"
val cxn = repo.getConnection
cxn.begin()
var res = cxn.
prepareTupleQuery(QueryLanguage.SPARQL, query).
evaluate()
while (res.hasNext) println(res.next)
cxn.close()
repo.shutDown()
Have you checked the way you filled the database? You might have characters that are getting encoded strangely, or, as it looks here, excess brackets in your URIs.
In the printed output, your URIs appear with extra angle brackets. You are likely using:
val subject = valueFactory.createURI("<http://some.url/some/entity>")
when you should be doing this (without angle brackets):
val subject = valueFactory.createURI("http://some.url/some/entity")
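If the URI strings come from a source that already wraps them in angle brackets (for example text copied from N-Triples or a SPARQL query), they can be cleaned up before being handed to createURI. Here is a minimal sketch in plain Java; the class IriStrings and the method stripAngleBrackets are hypothetical helpers, not part of Blazegraph or Sesame.

```java
public class IriStrings {
    // Remove one pair of enclosing angle brackets, if present.
    public static String stripAngleBrackets(String s) {
        String t = s.trim();
        if (t.length() >= 2 && t.startsWith("<") && t.endsWith(">")) {
            return t.substring(1, t.length() - 1);
        }
        return t;
    }

    public static void main(String[] args) {
        // Bracketed input is unwrapped ...
        System.out.println(stripAngleBrackets("<http://some.url/some/entity>")); // http://some.url/some/entity
        // ... and already clean input is returned unchanged.
        System.out.println(stripAngleBrackets("http://some.url/some/entity"));   // http://some.url/some/entity
    }
}
```

The cleaned string is what would then be passed to valueFactory.createURI(...), so that the stored URI matches the <...> term written in the query.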

Searching semantically tagged documents in MarkLogic

Can anyone please point me to some simple examples of semantic tagging and querying semantically tagged documents in MarkLogic?
I am fairly new to this area, so some beginner-level examples will do.
When you say "semantically tagged" do you mean regular XML documents that happen to have some triples in them? The discussion and examples at http://docs.marklogic.com/guide/semantics/embedded are pretty good for that.
Start by enabling the triple index in your database. Then insert a test doc. This is just XML, but the sem:triple element represents a semantic fact.
xdmp:document-insert(
'test.xml',
<test>
<source>AP Newswire</source>
<sem:triple date="1972-02-21" confidence="100">
<sem:subject>http://example.org/news/Nixon</sem:subject>
<sem:predicate>http://example.org/wentTo</sem:predicate>
<sem:object>China</sem:object>
</sem:triple>
</test>)
Then query it. The example query is pretty complicated. To understand what's going on I'd insert variations on that sample document, using different URIs instead of just test.xml, and see how the various query terms match up. Try using just the SPARQL component, without the extra cts query. Try cts:search with no SPARQL, just the cts:query.
xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
at "/MarkLogic/semantics.xqy";
sem:sparql('
SELECT ?country
WHERE {
<http://example.org/news/Nixon> <http://example.org/wentTo> ?country
}
',
(),
(),
cts:and-query((
cts:path-range-query( "//sem:triple/@confidence", ">", 80) ,
cts:path-range-query( "//sem:triple/@date", "<", xs:date("1974-01-01")),
cts:or-query((
cts:element-value-query( xs:QName("source"), "AP Newswire"),
cts:element-value-query( xs:QName("source"), "BBC"))))))
In case you are talking about enriching your content using semantic technology, that is not directly provided by MarkLogic.
You can enrich your content externally, for instance by calling a public service like the one provided by OpenCalais, and then add the enrichments to the content before inserting it.
You can also build lists of lookup values, and then use cts:highlight to mark such terms within your content. That could be as simple as:
let $labels := ("MarkLogic", "StackOverflow")
return
cts:highlight($doc, cts:word-query($labels), <b>{$cts:text}</b>)
Or with a more dynamic replacement using SPARQL:
let $labels := map:new()
let $_ :=
for $result in sem:sparql('
PREFIX demo: <http://www.marklogic.com/ontologies/demo#>
SELECT DISTINCT ?label
WHERE {
?s a demo:person.
{
?s demo:fullName ?label
} UNION {
?s demo:initialsName ?label
} UNION {
?s demo:email ?label
}
}
')
return
map:put($labels, map:get($result, 'label'), 'person')
return
cts:highlight($doc, cts:word-query(map:keys($labels)),
let $result := sem:sparql(concat('
PREFIX demo: <http://www.marklogic.com/ontologies/demo#>
SELECT DISTINCT ?s ?p
{
?s a demo:', map:get($labels, $cts:text), ' .
?s ?p "', $cts:text, '" .
}
'))
return
if (map:contains($labels, $cts:text))
then
element { xs:QName(fn:concat("demo:", map:get($labels, $cts:text))) } {
attribute subject { map:get($result, 's') },
attribute predicate { map:get($result, 'p') },
$cts:text
}
else ()
)
HTH!

How to parse elements from Sparql algebra

I can get the algebra form from sparql query string using ARQ algebra (com.hp.hpl.jena.sparql.algebra):
String queryStr =
"PREFIX foaf: <http://xmlns.com/foaf/0.1/>" +
"SELECT DISTINCT ?name ?nick" +
"{?x foaf:mbox <mailt:person#server> ." +
"?x foaf:name ?name" +
"OPTIONAL { ?x foaf:nick ?nick }}";
Query query = QueryFactory.create(queryStr);
Op op = Algebra.compile(query);
Print the returned value of op:
(distinct
(project (?name ?nick)
(join
(bgp
(triple ?x <http://xmlns.com/foaf/0.1/mbox> <mailt:person#server>)
(triple ?x <http://xmlns.com/foaf/0.1/name> ?nameOPTIONAL)
)
(bgp (triple ?x <http://xmlns.com/foaf/0.1/nick> ?nick)))))
Returned value is an Op type, but I can't find any direct methods that can parse the op into elements, e.g., basic graph patterns of s, p, o, and the relations between these graph patterns.
Any hint is appreciated, thanks.
Why serialise out the algebra at all?
If your aim is to walk the algebra tree and extract the BGPs, you can do this with the OpVisitor interface; there are various implementations of it that will get you started. The method you care about is visit(OpBGP opBgp), since there you can use the methods of the OpBGP class to extract the pattern information.
It might be too late, but since I don't see a final answer, here is code that prints all the BGPs.
Create a class that extends OpVisitorBase and overrides public void visit(final OpBGP), as in the code below.
Then, from your code, simply call:
MyOpVisitorBase.myOpVisitorWalker(op);
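import java.util.List;
import com.hp.hpl.jena.graph.Triple;
import com.hp.hpl.jena.sparql.algebra.Op;
import com.hp.hpl.jena.sparql.algebra.OpVisitorBase;
import com.hp.hpl.jena.sparql.algebra.OpWalker;
import com.hp.hpl.jena.sparql.algebra.op.OpBGP;

public class MyOpVisitorBase extends OpVisitorBase {
    // Walk the whole algebra tree with a fresh visitor instance
    // ("this" is not available in a static method).
    public static void myOpVisitorWalker(Op op) {
        OpWalker.walk(op, new MyOpVisitorBase());
    }

    // Called once for every basic graph pattern found in the tree.
    @Override
    public void visit(final OpBGP opBGP) {
        final List<Triple> triples = opBGP.getPattern().getList();
        for (final Triple triple : triples) {
            System.out.println("Triple: " + triple.toString());
        }
    }
}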
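A side note on the query string in the question: the ?nameOPTIONAL that shows up in the printed algebra comes from concatenating the string pieces without any whitespace between them, so ?name and OPTIONAL run together into one variable name. A minimal sketch of one way to avoid that, in plain Java with no Jena involved; the class QueryStringDemo and the method joinLines are made up for illustration.

```java
public class QueryStringDemo {
    // Join query fragments with newlines so adjacent tokens cannot fuse.
    public static String joinLines(String... lines) {
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        // Plain concatenation glues the two fragments together.
        String glued = "?x foaf:name ?name" + "OPTIONAL { ?x foaf:nick ?nick }";
        System.out.println(glued.contains("?nameOPTIONAL"));  // true

        // Joining with a separator keeps the tokens apart.
        String joined = joinLines(
                "?x foaf:name ?name",
                "OPTIONAL { ?x foaf:nick ?nick }");
        System.out.println(joined.contains("?nameOPTIONAL")); // false
    }
}
```

With the fragments separated, the parser sees OPTIONAL as a keyword rather than as part of a variable name, so the resulting algebra would no longer be the plain join shown in the question.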