I want to set up the DBpedia dataset locally - SPARQL

I want to set up the DBpedia dataset locally, but I'm not sure how to do it. I have downloaded mappingbased_objects_en.ttl and infobox_properties_mapped_en.ttl.bz2; is there anything else I need to download?
How can I query this using SPARQL? Do I need to install anything to make it queryable via SPARQL? Is there any database software for SPARQL, like MySQL?
I tried http://dbpedia.org/sparql, but because of the 10,000-result limit per query I want to set up DBpedia on my own system.
Any lead would be appreciated.
Thanks
PS: These two files (mappingbased_objects_en.ttl, infobox_properties_mapped_en.ttl.bz2) don't seem to contain all the entity information. For example, Steve Jobs is not in those files but Tim Cook is, and I'm certain Steve Jobs is present in DBpedia.

You need to load DBpedia into a local triplestore, such as Virtuoso. I explain this in this article, but here is the gist of how to install and query DBpedia locally with the Virtuoso triplestore:
The Virtuoso Open Source Edition can be downloaded from here.
Once Virtuoso is installed, run it and start VOS Database.
Go to the Virtuoso admin page in the browser (you may have to give it a bit of time to start): http://localhost:8890/conductor/
Log in with the default credentials (dba/dba)
In the “Quad Store Upload” tab, for testing, you can upload a .ttl file to a specified named graph IRI, such as “http://localhost:8890/DBPedia”.
Next you can test the triplestore in the SPARQL tab or directly at the local endpoint. For example:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT (COUNT(*) AS ?count) WHERE
{ ?category skos:broader <http://dbpedia.org/resource/Category:Environmental_issues> }
However, the upload might fail for bigger files.
For bigger files, and also for uploading multiple files, it is best to use the bulk upload.
In order to bulk upload files from anywhere (and not just the Virtuoso import folder), you must add your folder to the DirsAllowed property in the Virtuoso configuration file virtuoso.ini. You must restart Virtuoso for the changes in virtuoso.ini to take effect. For example, assuming that the dumps are in /tmp/virtuoso_db/dbpedia/ttl, you can add the path /tmp/virtuoso_db to DirsAllowed.
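The relevant entry in virtuoso.ini (in the [Parameters] section) could then look roughly like this; the exact list of default directories depends on your installation:
[Parameters]
DirsAllowed = ., ../vad, /tmp/virtuoso_db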
Once Virtuoso is back up and running, go to the Interactive SQL (ISQL) window and register the files to be loaded by typing in:
ld_dir('/tmp/virtuoso_db/dbpedia/ttl/','*.ttl','http://localhost:8890/DBPedia');
You can then perform the bulk load of all the registered files by typing in:
rdf_loader_run();
You can monitor the number of triples being uploaded by performing the following SPARQL query on the local endpoint:
SELECT (COUNT(*) AS ?count) WHERE { ?a ?b ?c }
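You can also check the state of each registered file in ISQL (ll_state changes from 0 to 2 once a file has been loaded), and it is a good idea to run a checkpoint after the bulk load so the data is persisted. As a sketch, assuming a standard Virtuoso installation:
SELECT ll_file, ll_state FROM DB.DBA.load_list;
checkpoint;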

Although #firefly's answer is still correct, there is a much simpler way to set up DBpedia locally, provided by DBpedia itself:
git clone https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart.git
cd virtuoso-sparql-endpoint-quickstart
COLLECTION_URI=https://databus.dbpedia.org/dbpedia/collections/latest-core VIRTUOSO_ADMIN_PASSWD=password docker-compose up
Source: https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart
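Once the containers are up, the endpoint should be reachable on localhost (the quickstart exposes Virtuoso's default port 8890, unless you configure it differently). As a quick test, you could send a query with curl, for example:
curl http://localhost:8890/sparql --data-urlencode "query=SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"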

Related

Load OpenStreetMap data into Virtuoso

How can I load OpenStreetMap data for a particular area (e.g. Berlin) into the (open source) Virtuoso triple store running on my local computer (Ubuntu)?
I tried to download the relevant OSM file and process it with Sparqlify in order to convert it to RDF (or Turtle, etc.), which could then later (at least that was my idea) be loaded into Virtuoso using the bulk loading strategy. That did not work.
I would be happy if you could tell me whether there is any other way to convert the OSM files into RDF, or whether there is a totally different approach.
I also thought about using Apache Jena within Java to access the LinkedGeoData endpoint directly, but I guess having the data locally will give me much better performance later when I run SPARQL queries.
Cheers

Fetch data from Factbook offline

Since Factbook's SPARQL endpoint is down, I downloaded the zip file of their data. I do not understand how I can fire SPARQL queries at it. Any idea how this can be achieved? The unzipped version doesn't have a single RDF file, so Fuseki is giving me an error.

RDF4J rdf lucene configuration

I have been trying for some time to configure my Sesame RDF repository (now called RDF4J) to support full-text queries.
I did not find much documentation about this configuration; I think I need to create a template file that I can then use with the console. Here is the little information I found about the topic: https://groups.google.com/forum/#!topic/rdf4j-users/xw2UJCziKl8
Does anybody know anything about configuring RDF4J with Lucene? Any clue would be much appreciated. Otherwise, I would think about swapping the whole repository for another one, for example Virtuoso.
Thanks in advance,
You need to do the following:
Start rdf4j-server. I used rdf4j-server.war (and rdf4j-workbench.war). My URL was http://127.0.0.1:8080/rdf4j-server
Put lucene.ttl (lucene.ttl or this) into ~/.RDF4J/console/templates, where "~" is your home directory
Adjust the settings in this file
Then start the console from the RDF4J distribution
Then execute the following commands in the console:
connect http://127.0.0.1:8080/rdf4j-server
create lucene
Enter lucene repository Id: myRepositoryId
Enter lucene repository name: myRepositoryName
Then you can see the created repository in http://127.0.0.1:8080/rdf4j-workbench. If you add some triples, you can use the Lucene search.
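As an illustration, once the Lucene-enabled repository is created, full-text queries use the LuceneSail query vocabulary. A minimal sketch, assuming the default LuceneSail prefix:
PREFIX search: <http://www.openrdf.org/contrib/lucenesail#>
SELECT ?subject ?score ?snippet WHERE {
  ?subject search:matches [
    search:query "your search terms" ;
    search:score ?score ;
    search:snippet ?snippet
  ] .
}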

Access LinkedMDB offline [duplicate]

I want to query the Linked Movie Database at linkedmdb.org locally.
Is there an RDF or OWL version of it that I can download and query locally instead of remotely?
I tried to query it and got the following error:
org.openjena.riot.RiotException: <E:\Applications\linkedmdb-latest-dump\linkedmdb-latest-dump.nt> Code: 11/LOWERCASE_PREFERRED in SCHEME: lowercase is preferred in this component
org.openjena.riot.system.IRIResolver.exceptions(IRIResolver.java:256)
org.openjena.riot.system.IRIResolver.access$100(IRIResolver.java:24)
org.openjena.riot.system.IRIResolver$IRIResolverNormal.resolveToString(IRIResolver.java:380)
org.openjena.riot.system.IRIResolver.resolveGlobalToString(IRIResolver.java:78)
org.openjena.riot.system.JenaReaderRIOT.readImpl(JenaReaderRIOT.java:121)
org.openjena.riot.system.JenaReaderRIOT.read(JenaReaderRIOT.java:79)
com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:226)
com.hp.hpl.jena.util.FileManager.readModelWorker(FileManager.java:395)
com.hp.hpl.jena.util.FileManager.loadModelWorker(FileManager.java:299)
com.hp.hpl.jena.util.FileManager.loadModel(FileManager.java:250)
ServletExample.runQuery(ServletExample.java:92)
ServletExample.doGet(ServletExample.java:62)
javax.servlet.http.HttpServlet.service(HttpServlet.java:627)
javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
There's a claim that there is a download from this page. Haven't tried it myself, so I don't know whether it's fresh or not.
There is a dump in ntriples format at this address:
http://queens.db.toronto.edu/~oktie/linkedmdb/
If you want to query it, you can load the dump files into a local triple store such as 4store or Jena (using its relational storage support). Other libraries and tools are available, depending on the language you're most familiar with.
If you need more information let me know.
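If you go the Jena route, one rough sketch is to bulk-load the N-Triples dump into a TDB database with Jena's command-line tools and then serve it with Fuseki; the paths and dataset name below are just placeholders:
tdbloader --loc=/data/linkedmdb-tdb linkedmdb-latest-dump.nt
fuseki-server --loc=/data/linkedmdb-tdb /linkedmdb
The local endpoint would then be available at http://localhost:3030/linkedmdb/query (on Fuseki's default port).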

How to install the Jena Semantic Web Framework in the Play Framework

I put the Jena jar files in the lib folder and see the message:
A JPA error occurred (Cannot start a JPA manager without a
properly configured database): No datasource configured
What am I doing wrong?
I found the answer. This was a problem in Play. For some reason, something had been put in front of the class directive from the javax module. I do not know why it happened; I simply removed it and it worked.
This error is not related to Jena, because if you don't choose a model (dataset) when you execute a query, you get a different message: "No dataset description for query", with a com.hp.hpl.jena.query.QueryExecException. But if you chose Jena as a datasource in Play, you might get your message (sorry, I don't know much about Play).
What operations are you doing with Jena?
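Regarding the "No dataset description for query" error mentioned above: if you execute a query without supplying a model or dataset programmatically, ARQ needs the query itself to say where the data comes from, for example via a FROM clause. A minimal sketch, where the file name is just a placeholder:
SELECT ?s ?p ?o
FROM <file:data/example.ttl>
WHERE { ?s ?p ?o }
LIMIT 10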
I don't know Jena well, but it seems that some persistent ontologies might be stored in a database. Would that mean Jena needs a database connection?
Is this error coming from Jena and not from Play?
What are you trying to do in your code before getting this error?
If Jena requires some configuration and resource creation before it can be used, you should think about creating a little Jena Play plugin to initialize your Jena context...