I have a problem with the command-line tools of Apache Jena. I want to create a TDB2 database from a big Turtle file. For this I used the tdb2.tdbloader command as follows:
tdb2.tdbloader --loc ~/indexer ~/indexer/test.ttl
My test.ttl file contains entries of the form:
@prefix bbase: <http://data.bibbase.org/ontology/#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dblp: <https://dblp.org/rdf/schema-2017-04-18#> .
<https://dblp.org/rec/conf/romoco/Siegwart13>
a dblp:Publication ;
owl:sameAs <http://dx.doi.org/10.1109/RoMoCo.2013.6614591> ;
dblp:authoredBy <https://dblp.org/pers/s/Siegwart:Roland> ;
dblp:bibtexType bbase:Inproceedings ;
dblp:listedOnTocPage <https://dblp.org/db/conf/romoco/romoco2013> ;
dblp:pageNumbers "98" ;
dblp:primaryElectronicEdition <https://doi.org/10.1109/RoMoCo.2013.6614591> ;
dblp:doi "10.1109/RoMoCo.2013.6614591";
dblp:publicationType dblp:Inproceedings ;
dblp:publishedAsPartOf <https://dblp.org/rec/conf/romoco/2013> ;
dblp:publishedInBook "RoMoCo" ;
dblp:title "Design and navigation of wheeled, running, swimming and flying robots." ;
dblp:yearOfPublication "2013" .
...
Now my problem is that when I query the resulting TDB2 database with the tdb2.tdbquery command, the result is an empty table. My query searches for all entities that have a dblp:doi property, so the result should not be empty, as the example above shows. My query file looks as follows:
PREFIX dblp: <https://dblp.org/rdf/schema-2017-04-18#>
SELECT *
WHERE{
?s dblp:doi ?o .
}
And my tdb2.tdbquery command looks as follows:
./tdb2.tdbquery --loc=~/indexer/Data-0001 --query=~/indexer/query.rq
No matter what I'm doing, my result is always:
---------
| s | o |
=========
---------
If I query the .ttl files directly with the sparql command, in the same manner as I use tdb2.tdbquery, I get a reasonable result containing some entries.
Unfortunately I cannot find an answer to my question, neither in the Jena documentation nor in this forum. Can someone give me an answer or at least a hint what might go wrong?
Thanks in advance for your help!
The --loc parameter of your query should be the location where you created the TDB2 database, i.e. ~/indexer, not the Data-0001 subdirectory inside it.
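As a rough illustration of the point above, the following Python sketch builds the loader and query command lines from one shared location variable, so the two cannot drift apart. The paths are the ones from the question; the helper names are made up for this example, and the tilde is expanded explicitly on the assumption that the shell may not expand `~` when it directly follows `=`. The commands are only printed, not executed.

```python
import os
import shlex

# Database location: the directory passed to tdb2.tdbloader via --loc.
# TDB2 creates Data-0001 inside it; that subdirectory is not the location.
DB_LOCATION = os.path.expanduser("~/indexer")


def loader_command(data_file):
    """Command line that loads data_file into the database (hypothetical helper)."""
    return ["tdb2.tdbloader", "--loc", DB_LOCATION, os.path.expanduser(data_file)]


def query_command(query_file):
    """Command line that queries the same database (hypothetical helper)."""
    return [
        "tdb2.tdbquery",
        "--loc",
        DB_LOCATION,
        "--query=" + os.path.expanduser(query_file),
    ]


load_cmd = loader_command("~/indexer/test.ttl")
query_cmd = query_command("~/indexer/query.rq")

# Print shell-ready command lines; both use the same --loc value.
print(shlex.join(load_cmd))
print(shlex.join(query_cmd))
```

Keeping the location in a single variable is just one way to avoid pointing the query at a different directory than the loader wrote to.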
I am trying to run the SPARQL query below to insert data into my graph, but it fails with the error: Error 400: Bad Request.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX person: <http://example.com/data/person#>
PREFIX contact: <http://example.com/data/contact#>
INSERT
{
person:p4595 contact:email "test1@example.com" ;
contact:mobile "99483-43843" .
person:p6593 contact:email "test2@example.com" ;
contact:mobile "55351-56477" .
}
I have validated that all the IRIs and triples are correct. I am not sure why the query is failing.
Any help would be much appreciated. Thanks.
The INSERT syntax is incorrect. You should use either INSERT DATA {...}, or INSERT {...} followed by WHERE {} (with an empty {}). I have not run the query, but it should be something like this:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX person: <http://example.com/data/person#>
PREFIX contact: <http://example.com/data/contact#>
INSERT DATA
{
person:p4595 contact:email "test1@example.com" ;
contact:mobile "99483-43843" .
person:p6593 contact:email "test2@example.com" ;
contact:mobile "55351-56477" .
}
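The answer names two valid forms but shows only the first. The sketch below, written in Python purely to keep the two update strings side by side, wraps the same ground triples in either `INSERT DATA { ... }` or `INSERT { ... } WHERE { }`. The function names are hypothetical, only the `contact:mobile` triples from the question are used, and nothing is sent to an endpoint.

```python
PREFIXES = """\
PREFIX person: <http://example.com/data/person#>
PREFIX contact: <http://example.com/data/contact#>
"""

# Ground triples only (no variables), taken from the question.
TRIPLES = """\
  person:p4595 contact:mobile "99483-43843" .
  person:p6593 contact:mobile "55351-56477" .
"""


def insert_data(prefixes, triples):
    """Form 1: INSERT DATA takes ground triples and no WHERE clause."""
    return f"{prefixes}INSERT DATA\n{{\n{triples}}}\n"


def insert_where(prefixes, triples):
    """Form 2: INSERT with a template must be followed by WHERE, here an empty one."""
    return f"{prefixes}INSERT\n{{\n{triples}}}\nWHERE {{ }}\n"


update_1 = insert_data(PREFIXES, TRIPLES)
update_2 = insert_where(PREFIXES, TRIPLES)
print(update_1)
print(update_2)
```

For fixed triples like these, the `INSERT DATA` form is usually the simpler choice; the template form matters once the inserted triples depend on variables bound in `WHERE`.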
I'm learning how to query a Turtle (.ttl) file that I retrieved from Wikipedia.
However, I am stuck on writing a query that returns the results I am looking for.
Here is the example ttl file that I used.
# File name : WikipediaGraph.ttl
@prefix ns1: <http://www.example.com/> .
<http://en.wikipedia.org/wiki/Abilene,_Texas> ns1:2010population "117063" ;
ns1:2019population "123420" ;
ns1:areaSqKm "276.4" ;
ns1:areaSqMile "106.7" ;
ns1:popChange "5.43" ;
ns1:popPerSqKm "442" ;
ns1:popPerSqMile "1146" ;
ns1:state <http://en.wikipedia.org/wiki/Texas> .
The sample questions that I want to answer are listed below.
Which state(s) have the most cities in the list?
Which state(s) have the largest population in cities in the list?
Which state(s) have seen their cities in the list grow the most?
However, I am stuck from the beginning, as I can't get a result even from a simple query.
import rdflib
from rdflib import Graph, URIRef, Literal, Namespace, RDF
g=rdflib.Graph()
g.parse("WikipediaGraph.ttl", format='turtle')
results = g.query("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ns1: <http://www.example.com/>
SELECT ?v
WHERE {
?v rdf:type ns1:state .
}
ORDER BY ?NAME
""")
print('Results!')
for row in results:
    print(row)
Could you please advise how I should write the queries to answer the above questions?
I'm interested in obtaining a list of the distinct hierarchies available from statistics.gov.scot. The best-fit hierarchies, which I would like to list, are as follows:
http://statistics.gov.scot/def/hierarchy/best-fit#community-health-partnership
http://statistics.gov.scot/def/hierarchy/best-fit#council-area
http://statistics.gov.scot/def/hierarchy/best-fit#country
These are available through the API section of this sample geography.
Desired results
I would like the query to return:
community-health-partnership
council-area
country
How can I construct a query that would actually produce that? I can get a list of all available geographies via:
PREFIX sdmx: <http://purl.org/linked-data/sdmx/2009/dimension#>
SELECT DISTINCT ?framework
WHERE {
?a sdmx:refArea ?framework .
} LIMIT 10
I was trying something along these lines:
PREFIX fits: <http://statistics.gov.scot/def/hierarchy/best-fit#>
SELECT DISTINCT ?framework
WHERE {
?a fits ?framework .
} LIMIT 10
but naturally this syntax is not correct.
Starting on their SPARQL endpoint, you could do something like this --
DESCRIBE <http://statistics.gov.scot/def/hierarchy/best-fit#country>
Then, based on those results, you might try something like this. Its results aren't exactly what you say you want, but might be better --
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?hierarchy
?label
WHERE
{ ?hierarchy rdfs:subPropertyOf <http://statistics.gov.scot/def/hierarchy/best-fit>
; rdfs:label ?label
}
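The query above returns full IRIs, whereas the desired results are only the part after the `#`. One option, sketched here in Python with the three IRIs listed in the question, is to strip the fragment on the client side. The function name `short_name` is made up for this example.

```python
from urllib.parse import urlsplit

# The three best-fit hierarchy IRIs listed in the question.
HIERARCHIES = [
    "http://statistics.gov.scot/def/hierarchy/best-fit#community-health-partnership",
    "http://statistics.gov.scot/def/hierarchy/best-fit#council-area",
    "http://statistics.gov.scot/def/hierarchy/best-fit#country",
]


def short_name(iri):
    """Return the fragment of an IRI, i.e. the text after '#'."""
    return urlsplit(iri).fragment


names = [short_name(iri) for iri in HIERARCHIES]
for name in names:
    print(name)
```

The same trimming can also be done inside the query with the SPARQL 1.1 functions `STRAFTER` and `STR`, for example by binding `STRAFTER(STR(?hierarchy), "#")` to a new variable.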
I am trying to get all the books about the author William Shakespeare in the Project Gutenberg data (http://www4.wiwiss.fu-berlin.de/gutendata/).
I can use this query to return all the items of which Shakespeare is the author:
PREFIX dc:<http://purl.org/dc/elements/1.1/>
PREFIX foaf:<http://xmlns.com/foaf/0.1/>
SELECT ?bookTitle
WHERE {
?author foaf:name "Shakespeare, William, 1564-1616".
?book dc:creator ?author;
dc:title ?bookTitle
}
But I'd rather obtain the list of criticism books about Shakespeare, using the dcterms:subject
http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/subject/Shakespeare_William_1564-1616.
I would really appreciate your help!
I agree with RobV's comment that it's surprising that after writing the first query there would be much difficulty in writing the second. At any rate, the change is relatively trivial. I checked using both dc:subject and dcterms:subject, and it appears that the Gutenberg data is stored with dc:subject, so I've used that in this query, not dcterms:subject. This is the query; you just need to add dc:subject <.../Shakespeare...> to the pattern, and remove the author constraint. A few prefixes make the query easier to write, and the results easier to read, too.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX gp: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/people/>
PREFIX gs: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/subject/>
PREFIX gr: <http://wifo5-04.informatik.uni-mannheim.de/gutendata/resource/>
SELECT ?book ?author ?bookTitle WHERE {
?book dc:creator ?author ;
dc:title ?bookTitle ;
dc:subject gs:Shakespeare_William_1564-1616 .
}
With a query like this, you should end up with results like:
--------------------------------------------------------------------------------------------
| book | author | bookTitle |
============================================================================================
| gr:etext14827 | gp:Guizot_François_Pierre_Guillaume_1787-1874 | "Étude sur Shakspeare" |
| gr:etext5429 | gp:Johnson_Samuel_1709-1784 | "Preface to Shakespeare" |
--------------------------------------------------------------------------------------------
Apologies if this has been asked before, but it seems an obvious question for which I can't seem to get a working solution. I've loaded some data into Fuseki which contains statements comprising xsd:date information. I have written a query to extract these dates and I would like to find the duration (in days) between two specific dates. A cut-down query is shown below.
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX abc: <http://www.acme.com/ABC/1.0/abc-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entry
((?date1 - ?date2) AS ?interval)
WHERE
{
# Not the real query but you get the idea
?entry a abc:thing ;
abc:abcDateType1 ?date1 ;
abc:abcDateType2 ?date2 .
}
?interval is calculated correctly and seems to be an xsd:duration value. My question is: how do I extract the number of days it contains? Alternatively, is there a better (standard) way of doing this?
Thank you for reading.
You will need to work with the lexical form of the XSD duration. It is likely to be of the form "PnD...", where n is the number of days.
The answer should be to call fn:days-from-duration, or a keyword for that function, but that isn't in the standard distribution at the moment.
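Working with the lexical form, as suggested above, could look like the following Python sketch applied to the query results on the client side. It assumes the interval is a day-time duration such as "P30DT0H0M0.000S" (no year or month components); the function name and the sample strings are made up for this example.

```python
import re

# Matches an optional sign, an optional day component, and an optional
# time part of an xsd:duration lexical form such as "P30DT0H0M0.000S".
_DURATION = re.compile(r"^(-)?P(?:(\d+)D)?(?:T.*)?$")


def days_from_duration(lexical):
    """Return the whole-day component of a day-time duration string."""
    match = _DURATION.match(lexical)
    if match is None:
        # Durations with years or months are deliberately not handled here.
        raise ValueError(f"unsupported duration: {lexical!r}")
    sign, days = match.groups()
    value = int(days) if days else 0
    return -value if sign else value


print(days_from_duration("P30DT0H0M0.000S"))
print(days_from_duration("-P5D"))
```

Durations containing years or months raise an error rather than being converted, since their length in days is not fixed.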
One way would be to add unix-style timestamps related to a date, abstractly:
?blankNode eg:hasDate ?date
?blankNode eg:hasEpoch ?epoch
Then you can easily subtract the timestamp of one day from the other, and divide by 86400 to give the answer in days, e.g.:
SELECT ?entry
(((?time1 - ?time2)/86400) AS ?interval)
WHERE
{
# Not the real query but you get the idea
?entry a abc:thing ;
abc:abcDateType1 ?date1 ;
abc:abcDateType2 ?date2 .
# Look up the epoch timestamp stored alongside each date
?blankNode1 eg:hasDate ?date1 .
?blankNode1 eg:hasEpoch ?time1 .
?blankNode2 eg:hasDate ?date2 .
?blankNode2 eg:hasEpoch ?time2 .
}
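The arithmetic behind this approach can be checked outside SPARQL. The Python sketch below converts two dates to Unix timestamps at midnight UTC, subtracts them, and divides by 86400. The two dates are arbitrary examples and the function names are hypothetical.

```python
import calendar
from datetime import date

SECONDS_PER_DAY = 86400


def epoch_seconds(day):
    """Unix timestamp for midnight UTC at the start of the given date."""
    return calendar.timegm(day.timetuple())


def days_between(date1, date2):
    """Whole days from date2 to date1, computed via epoch timestamps."""
    return (epoch_seconds(date1) - epoch_seconds(date2)) // SECONDS_PER_DAY


# Example dates chosen for illustration only.
interval = days_between(date(2020, 3, 1), date(2020, 2, 1))
print(interval)
```

Using UTC midnight for both timestamps keeps the difference an exact multiple of 86400, so daylight saving changes do not affect the result.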