Clojure concurrency: Automating SQL Queries

I have a small program that is supposed to read SQL queries/commands one by one and execute them against a database.
If a query executes successfully, the next query is executed.
If there is an error executing one query, the program should stop executing altogether.
I have the code below, except that execution still continues even when one query throws an exception.
(declare process-query handler-fn)

(defn main
  []
  (loop [queries    (get-all-queries)
         querycount 1]
    (let [q (first queries)]
      (println (format "currently processing query %s" querycount))
      (cond
        (nil? q)
        (println "All queries ran successfully.")

        (= (:status (process-query q querycount)) "OK")
        (recur (rest queries) (inc querycount))

        :else
        (println "An error occurred while running queries")))))

(defn process-query
  [query query-count]
  (let [{query-body :query-body, is-query-running? :is-query-running?} query
        my-agent (agent {:error false, :query-count query-count}
                        :error-handler handler-fn)]
    ;; run the query asynchronously on the agent
    (send my-agent (fn [_] (execute-query! db query-body)))
    ;; poll until the query stops running or the agent reports an error
    (loop [running? (is-query-running?)
           error?   (:error @my-agent)]
      (cond
        error?
        (do (println "Error")
            {:status "ERROR" :error-msg (:error-msg @my-agent)})

        (and (not running?) (not error?))
        (do (println "Success")
            {:status "OK"})

        :else
        (do (Thread/sleep 2000)
            (recur (is-query-running?) (:error @my-agent)))))))

(defn handler-fn
  [agent exception]
  (println (format "an exception occurred: %s" exception))
  (when (instance? java.sql.BatchUpdateException exception)
    (println (.getNextException exception)))
  (send agent (fn [_] {:error true, :error-msg exception}))
  (throw exception))
The reason I'm using an agent is that some of my queries take 4 hours to run, and when that happens the database does not notify the program that the query has completed; the program just hangs. So instead, I constantly poll to check whether the query is done.
Is this the best way to accomplish what I'm trying to do?
Should I be using any other concurrency primitives?
Do I even need concurrency primitives?
I've been thinking about this for a long time now.

I think you will need to use core.async to resolve this kind of workflow.
Take a look at http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html
This lib will let you check your conditions alongside the related asynchronous tasks involved.
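For illustration, here is a rough sketch of how the polling could look with core.async; db, execute-query! and is-query-running? are the hypothetical functions from the question, run-query and the 2000 ms timeout are just example choices.
(require '[clojure.core.async :as async :refer [alts!! thread timeout]])

(defn run-query [query-body is-query-running?]
  ;; run the blocking JDBC call on its own thread; the channel returned by
  ;; `thread` will receive whatever the body returns
  (let [result-ch (thread
                    (try
                      (execute-query! db query-body)
                      {:status "OK"}
                      (catch Exception e
                        {:status "ERROR" :error-msg (.getMessage e)})))]
    (loop []
      (let [[val port] (alts!! [result-ch (timeout 2000)])]
        (cond
          (= port result-ch) val            ; the query finished (or threw)
          (is-query-running?) (recur)       ; still running, keep polling
          :else {:status "ERROR"
                 :error-msg "query neither returned nor is still running"})))))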
A few resources that may help you
http://www.infoq.com/news/2013/07/core-async
https://www.youtube.com/watch?v=AhxcGGeh5ho

The main problem seems to be: on the one hand, you write that the long queries never return, i.e. they don't even throw exceptions. On the other hand, your error detection mechanism for the agent is based on catching an exception.
I think what you need to do is not check (primarily) whether an exception was caught, but whether execute-query has actually returned a valid result when is-query-running? returns false.
Regarding the right concurrency primitive, I would suggest using a future instead of an agent. Futures are simpler than agents, since they only yield a single value instead of changing their state multiple times, and error handling is straightforward: you can catch any exception inside the future body and return it in place of the regular return value.
You can then follow this implementation idea: in the loop, do a deref with a timeout on the future. If the return value of the deref is whatever execute-query! returns regularly, return "OK" (or add a second expression to the future body as a clearly identifiable return value, e.g. the keyword :ok). Otherwise, if the return value of the deref is an exception, return "ERROR" with the :error-msg from the exception like you do now. Finally, if the return value is the timeout value you gave to the deref, call is-query-running?. If it's true, loop another time; if it's false, return "ERROR" with a special :error-msg which communicates that your query ended without either returning or throwing an exception. (And probably call future-cancel so you don't leak threads of never-ending execute-query! calls.)
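A minimal sketch of that idea, reusing the hypothetical db, execute-query! and is-query-running? from the question; the exception is caught inside the future body so that deref returns it as a value instead of rethrowing it:
(defn process-query
  [{:keys [query-body is-query-running?]} query-count]
  (let [fut (future
              (try
                (execute-query! db query-body)
                :ok                            ; clearly identifiable success value
                (catch Exception e e)))]       ; return the exception as a value
    (loop []
      (let [result (deref fut 2000 ::timeout)] ; wait up to 2 s for the future
        (cond
          (= result :ok)
          {:status "OK" :query-count query-count}

          (instance? Throwable result)
          {:status "ERROR" :error-msg (.getMessage ^Throwable result)}

          (is-query-running?)                  ; timed out, but the DB says it is still busy
          (recur)

          :else
          (do (future-cancel fut)              ; avoid leaking a never-returning thread
              {:status "ERROR"
               :error-msg "query ended without returning or throwing"}))))))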

Related

Persist detailed information about failed Item processing

I've got a Job that runs a TaskletStep, then a chunk-based step, and then another TaskletStep.
In each of these steps, errors (in the form of Exceptions) can occur.
The chunk-based step looks like this:
stepBuilderFactory
.get("step2")
.chunk<SomeItem, SomeItem>(1)
.reader(flatFileItemReader)
.processor(itemProcessor)
.writer {}
.faultTolerant()
.skipPolicy { _ , _ -> true } // skip all Exceptions and continue
.taskExecutor(taskExecutor)
.throttleLimit(taskExecutor.corePoolSize)
.build()
The whole job definition:
jobBuilderFactory.get("job1")
.validator(validator())
.preventRestart()
.start(taskletStep1)
.next(step2)
.next(taskletStep2)
.build()
I expected that Spring Batch somehow picks up the Exceptions that occur along the way, so I can create a report including them after the Job has finished processing. Looking at the different contexts, there are also fields that should contain failureExceptions. However, it seems there is no such information (especially for the chunked step).
What would be a good approach if I need information about:
what Exceptions did occur in which Job execution
which Item was the one that triggered it
The JobExecution provides a method to get all failure exceptions that happened during the job. You can use that in a JobExecutionListener#afterJob(JobExecution jobExecution) to generate your report.
In regards to which items caused the issue, this will depend on where the exception happens (during the read, process or write operation). For this requirement, you can use one of the ItemReadListener, ItemProcessListener or ItemWriteListener to keep a record of those items (for example, by adding them to the job execution context so you can access them in the JobExecutionListener#afterJob method for your report).

Handling errors with clojure core.async pipeline

I am trying to understand what's the correct way to handle errors using core.async/pipeline, my pipeline is the following:
input --> xf-run-computation --> first-out
first-out --> xf-run-computation --> last-out
Where xf-run-computation will do an http call and return the response. However, some of these responses will return an error. What's the best way to handle these errors?
My solution is to split the outputs channels in success-values and error-values and then merge them back to a channel:
(let [[success-values1 error-values1] (split fn-to-split first-out)
      [success-values2 error-values2] (split fn-to-split last-out)
      errors (merge [error-values1 error-values2])]
  (pipeline 4 first-out xf-run-computation input)
  (pipeline 4 last-out xf-run-computation success-values1)
  [last-out errors])
So my function will return the last results and the errors.
Generally speaking, what "the" correct way is probably depends on your application needs, but given your problem description, I think there are three things you need to consider:
xf-run-computation returns data that your business logic would see as errors,
xf-run-computation throws an exception and
given that http calls are involved, some runs of xf-run-computation might never finish (or not finish in time).
Regarding point 3., the first thing you should consider is using pipeline-blocking instead of pipeline.
I think your question is mostly related to point 1. The basic idea is that xf-run-computation needs to return a data structure (say a map or a record) which clearly marks a result as an error or a success, e.g. {:title nil :body nil :status "error"}. This gives you several options for dealing with the situation:
all your later code simply ignores input data which has :status "error". I.e., your xf-run-computation would contain a line like (when (not (= (:status input) "error")) (run-computation input)),
you could run a filter on all results between the pipeline-calls and filter them as needed (note that filter can also be used as a transducer in a pipeline, thereby obliterating the old filter> and filter< functions of core.async),
you use async/split like you suggested / Alan Thompson shows in his answer to filter out the error values to a separate error channel. There is no real need to have a second error channel for your second pipeline if you're going to merge the values anyway; you can simply re-use your error channel.
For point 2., the problem is that any exception in xf-run-computation happens on another thread and will not simply propagate back to your calling code. But you can make use of the ex-handler argument to pipeline (and pipeline-blocking). You could either simply filter out all exceptions, put them on a separate exception channel, or try to catch them and turn them into errors (potentially putting them back on the result or another error channel) -- the latter only makes sense if the exception gives you enough information, e.g. an id or something that allows you to tie the exception back to the input which caused it. You could arrange for this in xf-run-computation (i.e. catch any exception thrown from a third-party library like the http call).
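A minimal sketch of the ex-handler idea; run-one and http-call are hypothetical stand-ins for the work done by xf-run-computation, and ex-info is used to attach the offending input to the exception:
(require '[clojure.core.async :as async])

(defn run-one [input]
  ;; hypothetical http call; throws on failure
  (try
    (assoc (http-call input) :status "success")
    (catch Exception e
      (throw (ex-info "computation failed" {:input input} e)))))

(def input     (async/chan 10))
(def first-out (async/chan 10))

(async/pipeline-blocking
  4                        ; parallelism for the blocking http calls
  first-out
  (map run-one)
  input
  true                     ; close first-out when input closes
  (fn [e]                  ; ex-handler: turn the exception into an error value
    {:status "error"
     :input  (:input (ex-data e))
     :error  (ex-message e)}))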
For point 3, the canonical answer in core.async would be to point to a timeout channel, but this doesn't make much sense in relation to pipeline. A better idea is to ensure on your http calls that a timeout is set, e.g. the :timeout option of http-kit or :socket-timeout and :conn-timeout of clj-http. Note that these options will usually result in an exception on timeout.
Here is an example that does what you are suggesting. Beginning with (range 10) it first filters out the multiples of 5, then the multiples of 3.
(ns tst.clj.core
  (:use clj.core
        clojure.test)
  (:require
    [clojure.core.async :as async]
    [clojure.string :as str]))

(defn err-3
  "'fail' for multiples of 3"
  [x]
  (if (zero? (mod x 3))
    (+ x 300)   ; error case
    x))         ; non-error

(defn err-5
  "'fail' for multiples of 5"
  [x]
  (if (zero? (mod x 5))
    (+ x 500)   ; error case
    x))         ; non-error

(defn is-ok?
  "Returns true if the value is not 'in error' (>=100)"
  [x]
  (< x 100))

(def ch-0 (async/to-chan (range 10)))
(def ch-1 (async/chan 99))
(def ch-2 (async/chan 99))

(deftest t-2
  (let [_ (async/pipeline 1 ch-1 (map err-5) ch-0)
        [ok-chan-1 fail-chan-1] (async/split is-ok? ch-1 99 99)
        _ (async/pipeline 1 ch-2 (map err-3) ok-chan-1)
        [ok-chan-2 fail-chan-2] (async/split is-ok? ch-2 99 99)
        ok-vec-2   (async/<!! (async/into [] ok-chan-2))
        fail-vec-1 (async/<!! (async/into [] fail-chan-1))
        fail-vec-2 (async/<!! (async/into [] fail-chan-2))]
    (is (= ok-vec-2 [1 2 4 7 8]))
    (is (= fail-vec-1 [500 505]))
    (is (= fail-vec-2 [303 306 309]))))
Rather than return the errors, I would probably just log them as soon as they are detected and then forget about them.

F#: UnitTest case failed

Consider I have a function like newConvert. I am supposed to receive an error for newCovertor("IL"). To generate this error I used failwith "Invalid Input".
The error is:
System.Exception: Invalid Input
> at FSI_0160.inputChecker(FSharpList`1 numberList) in C:\Users\Salman\Desktop\coursework6input\itt8060-master\coursework6input\BrokenRomanNumbers\Library1.fs:line 140
at FSI_0160.newCovertor(String romanNumber) in C:\Users\Salman\Desktop\coursework6input\itt8060-master\coursework6input\BrokenRomanNumbers\Library1.fs:line 147
at <StartupCode$FSI_0165>.$FSI_0165.main#() in C:\Users\Salman\Desktop\coursework6input\itt8060-master\coursework6input\BrokenRomanNumbers\newtest.fs:line 32
Stopped due to error
I used FsUnit and NUnit; they are loaded, installed and working correctly.
Then I made a test for it using
[<TestFixture>]
type ``Given a Roman number15``() =
    [<Test>]
    member this.``Whether the right convert for this number must be exist``() =
        newCovertor("IL") |> should equal System.Exception
I cannot understand it! The function fails as expected, but the test does not accept it, so why?
newCovertor does not produce a System.Exception value - it throws it - so you never get to the should equal ... part.
to catch the exception with FsUnit you have to wrap it as an action:
(fun () -> newCovertor("IL") |> ignore) |> should throw typeof<System.Exception>
also see the really good docs - there are quite a few ways to achieve this
The reason your version is not working is that NUnit will normally mark a test as failed exactly when an exception is thrown inside the test method's body - which is what happens here.
If you wrap it in an action, then should throw can execute this delayed action inside a try ... with block to catch the expected exception and filter it out (or, in this case, throw an exception if there was none).
btw: just like in C#/VB.net you can also use the ExpectedExceptionAttribute from NUnit at the method level if you like - this way the test framework will handle it for you.

Inconsistency between REPL and test runner

I'm having some issues with testing a clojure macro. When I put the code through the repl, it behaves as expected, but when I try to expect this behavior in a test, I'm getting back nil instead. I have a feeling it has to do with how the test runner handles macroexpansion, but I'm not sure what exactly is going on. Any advice/alternative ways to test this code is appreciated.
Here is a simplified example of the macro I'm trying to test
(defmacro macro-with-some-validation
  [-name- & forms]
  (assert-symbols [-name-])
  `(defn ~-name- [] (println "You passed the validation")))
(macroexpand-1 (read-string "(macro-with-some-validation my-name (forms))"))
;; ->
(clojure.core/defn my-name [] (clojure.core/println "You passed the validation"))
When passed into the repl
(macroexpand-1 (read-string "(macro-with-some-validation 'not-symbol (forms))"))
;; ->
rulesets.core-test=> Exception Non-symbol passed in to function. chibi-1-0-0.core/assert-symbols (core.clj:140)
But when put through a test
(deftest macro-with-some-validation-bad
  (testing "Passing in a non-symbol to the macro"
    (is (thrown? Exception
                 (macroexpand-1 (read-string "(macro-with-some-validation 'not-symbol (forms))"))))))
;; after a lein test ->
FAIL in (macro-with-some-validation-bad) (core_test.clj:50)
Passing in a non-symbol to the macro
expected: (thrown? Exception (macroexpand-1 (read-string "(macro-with-some-validation 'not-symbol (forms))")))
actual: nil
Thanks.
Edit: forgot to include the source for assert-symbols in case it matters
(defn assert-symbol [symbol]
  (if (not (instance? clojure.lang.Symbol symbol))
    (throw (Exception. "Non-symbol passed in to function."))))

(defn assert-symbols [symbols]
  (if (not (every? #(instance? clojure.lang.Symbol %) symbols))
    (throw (Exception. "Non-symbol passed in to function."))))
After changing my read-strings to be ` instead, I'm able to get the code working again. Still strange that read-string wasn't working correctly, though. Thanks for the help.
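For reference, a minimal sketch of what that change might look like, assuming the macro is referred into the test namespace as before:
(deftest macro-with-some-validation-bad
  (testing "Passing in a non-symbol to the macro"
    (is (thrown? Exception
                 ;; build the form with syntax-quote so macro-with-some-validation
                 ;; resolves in this namespace, then expand it
                 (macroexpand-1 `(macro-with-some-validation 'not-symbol (forms)))))))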

Understanding Erlang ODBC application

I'm connecting to a DB source with Erlang ODBC. My code looks like:
main() ->
    Sql = "SELECT 1",
    Connection = connect(),
    case odbc:sql_query(Connection, Sql) of
        {selected, Columns, Results} ->
            io:format("Success!~n Columns: ~p~n Results: ~p~n",
                      [Columns, Results]),
            ok;
        {error, Reason} ->
            {error, Reason}
    end.

connect() ->
    ConnectionString = "DSN=dsn_name;UID=uid;PWD=pqd",
    odbc:start(),
    {ok, Conn} = odbc:connect(ConnectionString, []),
    Conn.
It's ok now. But how can I handle errors, at least in my query? As I understand, they are contained in {error, Reason}, but how can I output them when something goes wrong? I tried adding io:format like in the first clause, but it doesn't work.
Secondly, unfortunately I can't find any reference that explains the syntax well, and I can't understand what ok means in this code (first on line 8, and second on line 16). If I'm right, the second just means that the connection is ok and the variable isn't assigned? But what does it mean on line 8?
ok in line 8 is the return value of the case statement when the call to odbc:sql_query(Connection, Sql) returns a result that can match the expression {selected, Columns, Results}. In this case it is useless since the function io:format(...) already returns ok.
the second ok: {ok, Conn} is a very common Erlang usage: the function returns a tuple {ok,Value} in case of success and {error,Reason} in case of failure. So you can match on the success case and extract the returned value with this single line: {ok, Conn} = odbc:connect(ConnectionString, []),
In this case the function connect() doesn't handle the error case, so this code has 4 different possible behaviors:
It can fail to connect to the database: the process will crash with a badmatch error at line 16.
It connects to the database but the query fails: the main function will return the value {error,Reason}.
It connects to the database and the query returns an answer that doesn't match the tuple {selected, Columns, Results}: the process will crash with a badmatch error at line 4.
It connects to the database and the query returns an answer that matches the tuple {selected, Columns, Results}: the function will print
Success!
Columns: Column
Results: Result
and returns ok
So I found something. The {error, Reason} contains the connection errors, meaning that we specified a wrong DSN name, etc. Regarding my idea to catch query errors, we can read this in the Erlang reference:
Guards All API-functions are guarded and if you pass an argument of
the wrong type a runtime error will occur. All input parameters to
internal functions are trusted to be correct. It is a good programming
practise to only distrust input from truly external sources. You are
not supposed to catch these errors, it will only make the code very
messy and much more complex, which introduces more bugs and in the
worst case also covers up the actual faults. Put your effort on
testing instead, you should trust your own input.
This means we should be careful about what we write. That's not bad.