java/jdbc timeout in clojure - testing

I am trying to add a timeout to jdbc/query and jdbc/execute!. Somewhere on the web I found that both functions take :timeout as an option. The documentation also says the options are passed to prepare-statement, which takes :timeout as an option.
My function calls look like:
(jdbc/query db-read-spec query {:timeout 2})
(jdbc/execute! db-write-spec query {:timeout 2})
Is this how it is done? If yes, how do I test this?
If there is a different way of doing this that is testable, that works too.

The :timeout option causes .setQueryTimeout to be called on the PreparedStatement used under the hood of clojure.java.jdbc. Note that it is in seconds, not milliseconds: {:timeout 2} means two seconds, whereas a value of 2000 (if you were thinking in milliseconds) would mean 2,000 seconds (just over half an hour), so your query would have to be extremely slow for that to take effect.
JDBC supports several different timeouts across several of its classes. For example, javax.sql.DataSource supports .setLoginTimeout (also in seconds), as does java.sql.DriverManager.
There are also database-specific options you can add to the connection string (which you can add as additional key/value pairs in your "db-spec") to control lower-level timeouts. For example, MySQL supports connectTimeout and socketTimeout in the connection string -- and both of those are in milliseconds. clojure.java.jdbc allows those to be provided in your "db-spec" hash map as :connectTimeout and :socketTimeout keys respectively.
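To make the mapping concrete, here is a rough plain-Java sketch of the same settings (the JDBC URL, credentials, and timeout values are only illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Rough sketch of the JDBC-level timeouts discussed above.
// URL, credentials and timeout values are illustrative only.
public class TimeoutSketch {
    public static void main(String[] args) throws SQLException {
        // Login timeout (seconds) -- applies while establishing the connection.
        DriverManager.setLoginTimeout(5);

        // MySQL-specific properties (milliseconds) go on the connection string;
        // this is what the :connectTimeout / :socketTimeout db-spec keys map to.
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/test?connectTimeout=5000&socketTimeout=10000",
                "user", "secret");

        try (PreparedStatement ps = conn.prepareStatement("select * from account")) {
            // This is the call that clojure.java.jdbc's :timeout option makes (seconds).
            ps.setQueryTimeout(2);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) { /* consume rows */ }
            }
        } finally {
            conn.close();
        }
    }
}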
Note that clojure.java.jdbc is considered "Stable" at this point and all current and future development effort is focused on next.jdbc. next.jdbc makes it easier to use the loginTimeout since it operates on JDBC objects directly, so the whole (Java) API is available as well. It also has built-in support for connection pooling and is, overall, simpler and faster than clojure.java.jdbc.

You can leverage a query hint on MySQL SELECT queries (time in ms):
SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM t1 INNER JOIN t2 WHERE....
Then you can just wrap your queries:
(defn timed-query [db query t]
  (j/query db [(str (subs query 0 6)
                    (format " /*+ MAX_EXECUTION_TIME(%s) */ " t)
                    (subs query 7))]))
and test:
(deftest test-query-timeout
  (is (thrown? Exception
               (timed-query db "select * from Employees where id>5" 1))))
You will need a more complex query for this to work with a 1 ms limit.

I figured out a workaround to test this. Since I use Postgres, I could leverage select pg_sleep(time-in-seconds).
My test looks like:
(is (thrown-with-msg? PSQLException #"ERROR: canceling statement due to user request"
      (fetch-or-save "select pg_sleep(3)")))
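The same test can also be expressed at the plain-JDBC level if that is easier to reason about (a hedged sketch; the connection details are placeholders): .setQueryTimeout plus pg_sleep makes the driver cancel the statement.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of the same timeout test at the JDBC level; connection details are placeholders.
public class PgSleepTimeoutTest {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/test", "user", "secret");
             PreparedStatement ps = conn.prepareStatement("select pg_sleep(3)")) {
            ps.setQueryTimeout(1); // seconds
            try {
                ps.executeQuery();
            } catch (SQLException e) {
                // PostgreSQL reports: "ERROR: canceling statement due to user request"
                System.out.println("Timed out as expected: " + e.getMessage());
            }
        }
    }
}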

Related

DB doesn't respond after N calls

My web app calls the DB (SQL Server 2008) via MyBatis, running on WebSphere. Many times it blocks after N calls (N selects) and the web app stops responding. Then I have to restart WebSphere.
for (Integer idMovimento : idMovimenti)
{
    try {
        Movimenti tmpMov = selectByPrimaryKeyForMovimentiRAC(idMovimento);
        scartInputParam.setIdMovimento(idMovimento);
        List<Scarto> tmpSc = movimentiDAO.selectScarti(scartInputParam);
It blocks in the last select every time.
Any suggestions?
Do I have to configure WAS, or are there other tricks?
And can I retry the query after some time?
SOLVED
I changed my query by adding a join to the select, so I avoid calling the database N times and I put all the results into a list. Then I worked in Java without any needless calls.
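To illustrate the idea (this is not the original MyBatis mapper; the table and column names below are invented), a single joined query replaces the N per-id selects:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: table and column names are invented.
// One joined query replaces the N selects issued inside the loop.
static List<String> loadMovimentiWithScarti(Connection conn) throws SQLException {
    String sql = "SELECT m.id_movimento, s.codice_scarto "
               + "FROM movimenti m "
               + "LEFT JOIN scarti s ON s.id_movimento = m.id_movimento";
    List<String> rows = new ArrayList<>();
    try (PreparedStatement ps = conn.prepareStatement(sql);
         ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // Collect everything in a single round trip, then work on the list in Java.
            rows.add(rs.getInt("id_movimento") + ":" + rs.getString("codice_scarto"));
        }
    }
    return rows;
}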

How to parallelize a REST API crawler in http4s & fs2?

I wrote a sequential REST API crawler in http4s & fs2 here:
https://gist.github.com/NicolasRouquette/656ed7a2d6984ce0995fd78a3aec2566
This is to query a REST API service to get a starting set of IDs, fetch elements for a batch of IDs and continue based on the cross-reference IDs found in these elements until there are no new IDs to fetch and return a map of all elements fetched.
This works; however, the performance is inadequate -- too slow!
Since I don't have access to the server, I tried experimenting with varying batch sizes (10, 50, 100, 200, 500) and even batching all IDs in a single query. Query time increases significantly with batch size.
At large sizes (500 and all), I even got HTTP 500 responses from the server.
I would like to experiment with batching parallel queries in a load-balancing fashion using a pool of threads; however, it is unclear to me how to do this based on the fs2 docs.
Can someone provide suggestions how to achieve this?
Regarding using http4s & fs2: Well, I found this library fairly easy to use for simple client-side programming. Given the emphasis on supporting tasks, streams, etc..., I presume that batching parallel queries should be doable somehow.
fs2.concurrent.join will allow you to run multiple streams concurrently. The specific section in the guide is available at https://github.com/functional-streams-for-scala/fs2/blob/v0.9.7/docs/guide.md#concurrency
For your use case you could take your queue of ids, chunk them, create a http task and then wrap it in a stream. You would then run this stream of streams concurrently with join and combine the results.
def createHttpRequest(ids: Seq[ID]): Task[(ElementMap, Set[ID])] = ???

def fetch(queue: Set[ID]): Task[(ElementMap, Set[ID])] = {
  val resultStreams = Stream.emits(queue.toSeq)
    .vectorChunkN(batchSize)
    .map(createHttpRequest)
    .map(Stream.eval)
  val resultStream = fs2.concurrent.join(maxOpen)(resultStreams)
  resultStream.runFold((Map.empty[ID, Element], Set.empty[ID])) {
    case ((a, b), (_a, _b)) => (a ++ _a, b ++ _b)
  }
}

ScalaQuery's query/queryNA several times slower than JDBC?

In the following performance tests of many queries, this timed JDBC code takes 500-600ms:
val ids = queryNA[String]("select id from account limit 1000").list
val stmt = session.conn.prepareStatement("select * from account where id = ?")
debug.time() {
  for (id <- ids) {
    stmt.setString(1, id)
    stmt.executeQuery().next()
  }
}
However, when using ScalaQuery, the time goes to >2s:
val ids = queryNA[String]("select id from account limit 1000").list
implicit val gr = GetResult(r => ())
val q = query[String,Unit]("select * from account where id = ?")
debug.time() {
  for (id <- ids) {
    q.first(id)
  }
}
After debugging with server logs, this turns out to be due to the fact that the PreparedStatements are being repeatedly prepared and not reused.
This is in fact a performance issue that we've been hitting in our application code, so we're wondering if we're missing something regarding how to reuse prepared statements properly in ScalaQuery, or if dropping down to JDBC is the suggested workaround.
Got an answer from the ScalaQuery mailing list. This is just how ScalaQuery is designed -- it assumes that you're using something that provides statement pooling underneath:
Nowadays ScalaQuery always requests a new PreparedStatement from the Connection. There used to be a cache for PreparedStatements in early versions but I removed it because there are already good solutions for this problem. Every decent connection pool should have an option for PreparedStatement pooling. If you're using a Java EE server, it should have an integrated connection pool. For standalone applications, you can use something like http://sourceforge.net/projects/c3p0/
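For a standalone application, a minimal sketch of enabling c3p0's PreparedStatement cache might look like this (the JDBC URL and credentials are placeholders; the resulting DataSource's connections are what you would hand to ScalaQuery's session factory):

import com.mchange.v2.c3p0.ComboPooledDataSource;
import javax.sql.DataSource;

// Minimal c3p0 setup with PreparedStatement pooling enabled.
// JDBC URL and credentials are placeholders.
public class PooledDataSourceSketch {
    public static DataSource create() {
        ComboPooledDataSource pool = new ComboPooledDataSource();
        pool.setJdbcUrl("jdbc:postgresql://localhost/accounts");
        pool.setUser("user");
        pool.setPassword("secret");
        pool.setMaxStatements(200);             // global PreparedStatement cache size
        pool.setMaxStatementsPerConnection(20); // per-connection cache size
        return pool;
    }
}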

Grails/Hibernate Batch Insert

I am using STS + Grails 1.3.7 and doing batch insertion of thousands of instances of a domain class.
It is very slow because Hibernate executes the SQL statements one by one as separate JDBC calls instead of combining the statements into one batch.
How can I make them into one large statement?
What you can do is flush the Hibernate session every 20 inserts, like this:
int cpt = 0
mycollection.each {
    cpt++
    if (cpt % 20 == 0) {
        it.save(flush: true)
    }
    else {
        it.save()
    }
}
Flushing the Hibernate session executes the SQL statements every 20 inserts.
This is the easiest method, but you can find a more interesting way to do it on Tomas Lin's blog. He explains exactly what you want to do: http://fbflex.wordpress.com/2010/06/11/writing-batch-import-scripts-with-grails-gsql-and-gpars/
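If you want to batch at the JDBC level yourself, in the spirit of the GSQL approach in that post, a sketch could look like the following (the table and column names are invented, and this bypasses GORM entirely):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Sketch only: bypasses GORM and batches the inserts at the JDBC level.
// The table and column names are invented for illustration.
static void insertPlayers(Connection conn, List<String> names) throws SQLException {
    String sql = "INSERT INTO player (name) VALUES (?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        int count = 0;
        for (String name : names) {
            ps.setString(1, name);
            ps.addBatch();
            if (++count % 20 == 0) {
                ps.executeBatch(); // send 20 inserts in one round trip
            }
        }
        ps.executeBatch(); // flush the remainder
    }
}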
Using the withTransaction() method on the domain classes makes the inserts much faster for batch scripts. You can build up all of the domain objects in one collection, then insert them in one block.
For example:
Player.withTransaction {
    for (p in players) {
        p.save()
    }
}
You can see this line in the Hibernate docs:
Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.
When I changed the type of generator, it worked.
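As a hedged illustration in plain JPA/Hibernate terms (rather than the GORM mapping DSL; the entity and sequence names are made up), a sequence-based generator keeps JDBC insert batching available:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

// Illustrative entity: a sequence-based id generator keeps Hibernate's JDBC
// insert batching enabled, whereas GenerationType.IDENTITY disables it.
@Entity
public class Player {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "player_seq")
    @SequenceGenerator(name = "player_seq", sequenceName = "player_seq", allocationSize = 50)
    private Long id;

    private String name;
}

Combined with a hibernate.jdbc.batch_size setting (for example 20) in the configuration, the inserts are then sent to the database in batches rather than one at a time.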

How to flush the io buffer in Erlang?

How do you flush the io buffer in Erlang?
For instance:
> io:format("hello"),
> io:format(user, "hello").
This post seems to indicate that there is no clean solution.
Is there a better solution than in that post?
Sadly, short of properly implementing a flush "command" in the io/kernel subsystems, and making sure that the low-level drivers that implement the actual io support such a command, you really have to simply rely on the system quiescing before closing. A failing, I think.
Have a look at io.erl/io_lib.erl in stdlib and file_io_server.erl/prim_file.erl in kernel for the gory details.
As an example, in file_io_server (which effectively takes the request from io/io_lib and routes it to the correct driver), the command types are:
{put_chars,Chars}
{get_until,...}
{get_chars,...}
{get_line,...}
{setopts, ...}
(i.e. no flush)!
As an alternative, you could of course always close your output (which would force a flush) after every write. A logging module I have does something like this every time, and it doesn't appear to be that slow (it's a gen_server with the logging received via cast messages):
%% Open, write and close on every log message, so nothing stays buffered.
case file:open(LogFile, [append]) of
    {ok, IODevice} ->
        io:fwrite(IODevice, "~n~2..0B ~2..0B ~4..0B, ~2..0B:~2..0B:~2..0B: ~-8s : ~-20s : ~12w : ",
                  [Day, Month, Year, Hour, Minute, Second, Priority, Module, Pid]),
        io:fwrite(IODevice, Msg, Params),
        io:fwrite(IODevice, "~c", [13]),
        file:close(IODevice);
    {error, _Reason} ->
        ok
end
io:put_chars(<<>>) at the end of the script works for me.
You could run
flush().
from the shell, or try
flush() ->
    receive
        _ -> flush()
    after 0 -> ok
    end.
That works more or less like a C flush.