How do I query/update a remote RDF-endpoint with Jena - sparql

I would like to send updates to a remote endpoint over http.
I have found that joseki serves as such an endpoint.
However, how do i send update queries to this endpoint, if I only know the uri of the endpoint?
// To do a select-query you can use this:
QueryExecution qe = QueryExecutionFactory.sparqlService(serviceURI, query);
// (Sidenote:) does the next line have the desired effect of setting the binding?
// After all, sparqlService has no alternative with initialBindang as parameter
qe.setInitialBinding(initialBinding);
result = qe.execSelect();
// But updates do not support this kind of sparqlService method
// Illegal:
// UpdateAction.sparqlServiceExecute(serviceURI, query);
// I can use the following:
UpdateAction.parseExecute(updateQuery, dataset, initialBinding);
// But dataset is a Dataset object, not the uri.
// I don't believe this is the correct way to overcome this:
Dataset dataset = DatasetFactory.create(serviceURI);
Otherwise I would like to hear how to do remote update queries to endpoints for which only the URI is known.
Update:
Resorted to a local jena in the end. This kind of RDF-endpoint accepts insert and delete statements.
I did not succeed in finding a remote RDF-endpoint that would accept modifying queries.

Otherwise I would like to hear how to
do remote update queries to endpoints
for which only the URI is known.
This is handled a little differently depending on the endpoint server. There is a draft sparql/update protocol. But as it is draft and fairly new support is a little variant.
Genrally you can write sparql update queries a little like you would write SQL insert or update statements.
The update commands are Modify, Insert, Delete, Load, Clear but not every imlpementation supports all of these.
As endpoints are often public there is usually some authentication needed before the action is allowed, this is not defined in the spec so is implementation specific.
It is advised that a different URL is used for update statements so that http authentication can be used. 4store, uses /sparql for queries and /update for update queries.
The W3C draft has some examples of how to construct sparql update queries.

Joseki doesn't support remote updates. You probably should have a look at its successor, Fuseki, which supports SPARQL Update.

Related

Prepared Statement support in Apache Ignite Cache API

Is any facility like Prepared statement supported in IgniteCache API to avoid query parsing each time? I saw that a Jira issue has been raised for this , and it says its resolved in version 1.5.0.final,
https://issues.apache.org/jira/browse/IGNITE-1856 , but i could not find any documentation for this in Apache Ignite site. I know that we can use prepared statement by connecting via JDBC Connection but that does not suit my use case.
My Code looks like below ,this query will be called again and again with different parameters,
IgniteCache<Integer,Subscriber> subscriberCache= rocCachemanager.getCache("subscriberCache");
SqlQuery<Integer, Subscriber> sql = new SqlQuery(Subscriber.class,
"from Subscriber where Subscriber.MSISDNNo=? and Subscriber.status='Active'");
sql.setArgs("SomeNumber");
QueryCursor<Entry<Integer,Subscriber>> cursor =ss.query(sql);
Statements are cached automatically, no action required. If your query text does not change, only parameters do, Ignite will not parse the query again.

Jena-Fuseki requires dataset specified

I have Jena-Fuseki server accessed via browser at http://localhost:3030/sparql.html. The query
select * where { }
results in an error:
Error 400: No dataset description in protocol request or in the query string
The query
select * from <http://xmlns.com/foaf/0.1/> where {}
results in an empty table.
Example queries at 2.1 Writing a Simple Query from the SPARQL specification do not require a 'from' clause. How to configure Jena so that examples execute without errors?
How to make a query to know which datasets are present in a database?
The endpoint "/sparql.html" is a general SPARQL query engine and needs to be told where to get the data from. That can be in protocol or with "FROM".
Fuseki can also be configured to have SPARQL services acting on a specific database. The URL will look like
http://localhost:3030/DATASET/sparql
where DATASET is your choice of name. See the documentation on configuration. http://jena.apache.org/documentation/serving_data/
[Jan2015] Fuseki1 requires datasets to be given on the command line or configuration. Fuseki2, soon to be released, has a UI for creating new datasets in a running server as well as the Fuseki1 style configuration.
Its easy to miss the first time you use Fuseki, but you've got to navigate to your dataset, and from there, there's a special query box for that dataset.
start at http://localhost:3030/
click on Control Panel
Select your dataset from the dropdown menu, click "select"
run a query

MySql NetScaler DataStream Content Switching failing to detect select

We are using the new DataStream feature introduced in NetScaler 9 (we're on v10) to do content switching (described here: http://support.citrix.com/proddocs/topic/netscaler/ns-dbproxy-wrapper-con.html). We have a read-only virtual server that balances across several read-only MySql slaves. We use our Content Switching to send all "Selects" over to the read-only server.
the policy is configured as such:
mysql.req.query.command.contains("select")
our users send multi-part queries to our database server. Most often they are simple, like:
use database;
select col1 from table1;
Sometimes they will put comments at the head of the query. for example:
-- this is my query
select col1 from table1;
What we've found is that if the query simply starts with a select, everything works swimmingly. However, in the cases where there is a use statement or comments preceding the query, the content swticher fails to detect that this is a select query and it bypasses our read-only virtual server.
I am about to tell all of our developers that they must fully alias every table in every query and avoid use statements (yes, this is a good thing anyway), and also that they cannot use comments in their sql (that's just silly).
Does anyone know how I can configure my NetScaler DataStream Content Switching to ignore comments and use statements?
The decision on where to send the query is done on the first line received after successful authentication... so ignoring the comment won't work.
You could setup a responder policy which sends back an error message saying "Please don't use SQL Comments in commands sent to the Load Balanced VIP". A bit draconian, but your devs would get the message fairly quick.. but there's no way to ignore the comment, but still base a decision on the select statement. However, I was under the impression that the select statement is up to the first semi colon... so in your example above, it should (in theory) still find the select statement. I'd need to test that to be certain of the behaviour however.
Also - the USE statement is critical. This is the DB on which all subsequent commands are issued.
It would be best practice to NOT use the USE statement, but instead, change the select statement to:
select col1 from database.table1;
Once the USE statement is seen, it prevents any subsequent commands being pipelined down the same connection... So if there are a lot of Use statements, you will not get to enjoy the connection multiplexing functionality that comes with DataStream.
We learned that Block Level comments are acceptable, but single line comments are not.
This is properly ignored:
/* my comment */
These comment styles are treated as part of the query:
-- my comment
# my comment
kind of ridiculous when having SET autocommit=0 is perfectly reasonable. What about in that situation.

Invoking a large set of SQL from a Rails 4 application

I have a Rails 4 application that I use in conjunction with sidekiq to run asynchronous jobs. One of the jobs I normally run outside of my Rails application is a large set of complex SQL queries that cannot really be modeled by ActiveRecord. The connection this set of SQL queries has with my Rails app is that it should be executed anytime one of my controller actions is invoked.
Ideally, I'd queue a job from my Rails application within the controller for Sidekiq to go ahead and run the queries. Right now they're stored in an external file, and I'm not entirely sure what the best way is to have Rails run the said SQL.
Any solutions are appreciated.
I agree with Sharagoz, if you just need to run a specific query, the best way is to send the query string directly into the connection, like:
ActiveRecord::Base.connection.execute(File.read("myquery.sql"))
If the query is not static and you have to compose it, I would use Arel, it's already present in Rails 4.x:
https://github.com/rails/arel
You didn't say what database you are using, so I'm going to assume MySQL.
You could shell out to the mysql binary to do the work:
result = `mysql -u #{user} --password #{password} #{database} < #{huge_sql_filename}`
Or use ActiveRecord::Base.connection.execute(File.read("huge.sql")), but it won't work out of the box if you have multiple SQL statements in your SQL file.
In order to run multiple statements you will need to create an initializer that monkey patches the ActiveRecord::Base.mysql2_connection to allow setting MySQL's CLIENT_MULTI_STATEMENTS and CLIENT_MULTI_RESULTS flags.
Create a new initializer config/initializers/mysql2.rb
module ActiveRecord
class Base
# Overriding ActiveRecord::Base.mysql2_connection
# method to allow passing options from database.yml
#
# Example of database.yml
#
# login: &login
# socket: /tmp/mysql.sock
# adapter: mysql2
# host: localhost
# encoding: utf8
# flags: 131072
#
# #param [Hash] config hash that you define in your
# database.yml
# #return [Mysql2Adapter] new MySQL adapter object
#
def self.mysql2_connection(config)
config[:username] = 'root' if config[:username].nil?
if Mysql2::Client.const_defined? :FOUND_ROWS
config[:flags] = config[:flags] ? config[:flags] | Mysql2::Client::FOUND_ROWS : Mysql2::Client::FOUND_ROWS
end
client = Mysql2::Client.new(config.symbolize_keys)
options = [config[:host], config[:username], config[:password], config[:database], config[:port], config[:socket], 0]
ConnectionAdapters::Mysql2Adapter.new(client, logger, options, config)
end
end
end
Then update config/database.yml to add flags:
development:
adapter: mysql2
database: app_development
username: user
password: password
flags: <%= 65536 | 131072 %>
I just tested this on Rails 4.1 and it works great.
Source: http://www.spectator.in/2011/03/12/rails2-mysql2-and-stored-procedures/
Executing one query is - as outlined by other people - quite simply done through
ActiveRecord::Base.connection.execute("SELECT COUNT(*) FROM users")
You are talking about a 20.000 line sql script of multiple queries. Assuming you have the file somewhat under control, you can extract the individual queries from it.
script = Rails.root.join("lib").join("script.sql").read # ah, Pathnames
# this needs to match the delimiter of your queries
STATEMENT_SEPARATOR = ";\n\n"
ActiveRecord::Base.transaction do
script.split(STATEMENT_SEPARATOR).each do |stmt|
ActiveRecord::Base.connection.execute(stmt)
end
end
If you're lucky, then the query delimiter could be ";\n\n", but this depends of course on your script. We had in another example "\x0" as delimiter. The point is that you split the script into queries to send them to the database. I wrapped it in a transaction, to let the database know that there is coming more than one statement. The block commits when no exception is raised while sending the script-queries.
If you do not have the script-file under control, start talking to those who control it to get a reliable delimiter. If it's not under your control and you cannot talk to the one who controls it, you wouldn't execute it, I guess :-).
UPDATE
This is a generic way to solve this. For PostgreSQL, you don't need to split the statements manually. You can just send them all at once via execute. For MySQL, there seem to be solutions to get the adapter into a CLIENT_MULTI_STATEMENTS mode.
If you want to execute raw SQL through active record you can use this API:
ActiveRecord::Base.connection.execute("SELECT COUNT(*) FROM users")
If you are running big SQL every time, i suggest you to create a sql view for it. It be boost the execution time. The other thing is, if possible try to split all those SQL query in such a way that it will be executed parallely instead of sequentially and then push it to sidekiq queue.
You have to use ActiveRecord::Base.connection.execute or ModelClass.find_by_sql to run custom SQL.
Also, put an eye on ROLLBACK transactions, you will find many places where you dont need such ROLLBACK feature. If you avoid that, the query will run faster but it is dangerous.
Thanks all i can suggest.
use available database tools to handle the complex queries, such as views, stored procedures etc and call them as other people already suggested (ActiveRecord::Base.connection.execute and ModelClass.find_by_sql for example)- it might very well cut down significantly on query preparation time in the DB and make your code easier to handle
http://dev.mysql.com/doc/refman/5.0/en/create-view.html
http://dev.mysql.com/doc/connector-cpp/en/connector-cpp-tutorials-stored-routines-statements.html
abstract your query input parameters into a hash so you can pass it on to sidekiq, don't send SQL strings as this will probably degrade performance (due to query preparation time) and make your life more complicated due to funny SQL driver parsing bugs
run your complex queries in a dedicated named queue and set concurrency to such a value that will prevent your database of getting overwhelmed by the queries as they smell like they could be pretty db heavy
https://github.com/mperham/sidekiq/wiki/API
https://github.com/mperham/sidekiq/wiki/Advanced-Options
have a look at Squeel, its a great addition to AR, it might be able to pull some of the things you are doing
https://github.com/activerecord-hackery/squeel
http://railscasts.com/episodes/354-squeel
I'll assume you use MySQL for now, but your mileage will vary depending on the DB type that you use. For example, Oracle has some good gems for handling stored procedures, views etc, for example https://github.com/rsim/ruby-plsql
Let me know if some of this stuff doesn't fit your use case and I'll expand
I see this post is kind of old. But I would like to add my solution to it. I was in a similar situation; I also needed a way to force feed "PRAGMA foreign_keys = on;" into my sqlite connection (I could not find a previous post that spelled it out how to do it.) Anywho, this worked like a charm for me. It allowed me to write "pretty" sql and still get it executed. Blank lines are ignored by the if statement.
conn = ActiveRecord::Base.establish_connection(adapter:'sqlite3',database:DB_NAME)
sqls = File.read(DDL_NAME).split(';')
sqls.each {|sql| conn.connection.execute(sql<<';') unless sql.strip.size == 0 }
conn.connection.execute('PRAGMA foreign_keys = on;')
I had the same problem with a set of sql statements that I needed to execute all in one call to the server. What worked for me was to set up an initializer for Mysql2 adapter (as explained in infused answer) but also do some extra work to process multiple results. A direct call to ActiveRecord::Base.connection.executewould only retrieve the first result and issue an Internal Error.
My solution was to get the Mysql2 adapter and work directly with it:
client = ActiveRecord::Base.connection.raw_connection
Then, as explained here, execute the query and loop through the results:
client.query(multiple_stms_query)
while client.next_result
result = client.store_result
# do something with it ...
end

Hibernate and dry-running HQL queries statically

I'd like to "dry-run" Hibernate HQL queries. That is I'd like to know what actual SQL queries Hibernate will execute from given HQL query without actually executing the HQL query against real database.
I have access to hibernate mapping for tables, the HQL query string, the dialect for my database. I have also access to database if that is needed.
Now, how can I find out all the SQL queries Hibernate can generate from my HQL without actually executing the query against any database? Are there any tools for this?
Note, that many SQL queries can be generated from one HQL query and the set of generated SQL queries may differ based on the contents of database.
I am not asking how to log SQL queries while HQL query is executing.
Edit: I don't mind connecting to database to fetch some metadata, I just don't want to execute queries.
Edit: I also know what limits and offsets are applied to query. I also have the actual parameters that will be bind to query.
The short answer is "you can't". The long answer is below.
There are two approaches you can take:
A) Look into HQLQueryPlan class, particularly its getSqlStrings() method. It will not get you the exact SQL because further preprocessing is involved before query is actually executed (parameters are bound, limit / offset are applied, etc...) but it may be close enough to what you want.
The thing to keep in mind here is that you'll need an actual SessionFactory instance in order to construct HQLQueryPlan, which means you won't be able to do so without "connecting to any database". You can, however, use in-memory database (SqlLite and the likes) and have Hibernate auto-create necessary schema for it.
B) Start with ASTQueryTranslatorFactory and descend into AST / ANTLR madness. In theory you may be able to hack together a parser that would work without relying on metadata but I have a hardest time imagining what is it you're trying to do for this to be worth it. Perhaps you can clarify? There has to be a better approach.
Update: for an offline, dry-run of some HQL, using HQLQueryPlan directly is a good approach. If you want to intercept every query in the app, while it's running, and record the SQL, you'll have to use proxies and reflection as described below.
Take a look at this answer for Criteria Queries.
For HQL, it's the same concept - you have to cast to Hibernate implementation classes and/or access private members, so it's not a supported method, but it will work with a the 3.2-3.3 versions of Hibernate. Here is the code to access the query from HQL (query is the object returned by session.createQuery(hql_string):
Field f = AbstractQueryImpl.class.getDeclaredField("session");
f.setAccessible(true);
SessionImpl sessionImpl = (SessionImpl) f.get(query);
Method m = AbstractSessionImpl.class.getDeclaredMethod("getHQLQueryPlan", new Class[] { String.class, boolean.class });
m.setAccessible(true);
HQLQueryPlan plan = (HQLQueryPlan) m.invoke(sessionImpl, new Object[] { query.getQueryString(), Boolean.FALSE });
for (int i = 0; i < plan.getSqlStrings().length; ++i) {
sql += plan.getSqlStrings()[i];
}
I would wrap all of that in a try/catch so you can go on with the query if the logging doesn't work.
It's possible to proxy your session and then proxy your queries so that you can log the sql and the parameters of every query (hql, sql, criteria) before it runs, without the code that builds the query having to do anything (as long as the initial session is retrieved from code you control).