Clojure - refer to a defrecord function - error-handling

How do I refer a function of a record ?
For the context, I am using Stuart Sierra's component. So I have a record like this :
(defrecord MyComponent []
component/Lifecycle
(start [component]
...)
(stop [component]
...)
However in the README, it is stated :
...you could wrap the body of stop in a try/catch that ignores all
exceptions. That way, errors stopping one component will not prevent
other components from shutting down cleanly.
I would like, however, to use Dire for that. Now how do I reference that stop function to use with Dire ?

There are two natural options:
You could use Dire to handle errors for component/stop (and possibly start):
(dire.core/with-handler! #'com.stuartsierra.component/stop
…)
This way you'd be committing to handling errors for all components you might use in your system and any calls to component/stop made anywhere in your application.
You could introduce a top-level function to handle your component's stop logic, register it with Dire and have your component/stop implementation merely delegate to it, and perhaps handle start similarly:
(defn start-my-component [component]
…)
(defn stop-my-component [component]
…)
(dire.core/with-handler! #'start-my-component
…)
(dire.core/with-handler! #'stop-my-component
…)
(defrecord MyComponent […]
component/Lifecycle
(start [component]
(start-my-component component))
(stop [component]
(stop-my-component component)))

You don't wrap stop, you wrap the body of stop - that is, everything but the argument declaration gets wrapped in your dire/with-handler! block, or whatever other method of error catching you prefer.
(defstruct MyComponent []
component/Lifecycle
(start [component]
(try (/ 1 0)
(catch Exception e)
(finally component))))
Note that however you handle errors, you will break the system if you do not return the component from your start method.

Related

What exactly does it mean for a handler to "decline" to handle a signal?

In the HyperSpec entry for HANDLER-BIND, it says that a handler can decline to handle a signal.
However, the linked glossary entry for decline to handle a signal is not very enlightening:
decline v. (of a handler) to return normally without having handled the condition being signaled, permitting the signaling process to continue as if the handler had not been present.
This definition begs and does not answer the question of what constitutes not returning normally?
Is there a full list of actions that constitute "handling" a signal?
I know from hands-on experience that INVOKE-RESTART appears to fit this criterion. But is that the only way for a handler to "handle" a signal, or are there others?
I think to understand the true meaning of “handle”, you should consider how the condition system works under the hood. Kent Pitman's sample implementation, written during the standardization process, is a good place to start (even though it has kludgy stuff like essentially implementing an entire object system, since CLOS was not yet part of the language).
Roughly speaking, the action of handler-bind is to set up a special variable, which we will call *handler-clusters*, so that it holds lists of lists of (type . function) pairs, corresponding to the list of bindings. A possible definition is
(defmacro handler-bind (bindings &body forms)
`(let ((*handler-clusters* (cons (list (mapcar #'(lambda (x) `(cons ',(car x) ,(cadr x)))
bindings))
*handler-clusters*)))
,#forms))
The signal function, then, goes through the clusters; if it finds one for the correct condition type, it calls the associated function. Definition:
(defun signal (datum &rest arguments)
(let ((condition (coerce-to-condition datum arguments :default 'simple-condition))
(*handler-clusters* *handler-clusters*)) ; save current value
(when (typep condition *break-on-signals*)
(with-simple-restart (continue "Continue the signaling process")
(break "Break caused by *BREAK-ON-SIGNALS*")))
(loop for cluster := (pop *handler-clusters*) do
(loop for binding in cluster do
(when (typep condition (car binding))
(funcall (cdr binding) condition)))))
nil)
where coerce-to-condition is a function that deals with “condition designators”. A subtle point is that we can't simply (loop for cluster in *handler-clusters* do …), because if, during the call of a handler, a condition of the same type as that which is being handled is signaled, the handler would be called recursively, which is probably not desirable. Thus the previous value is saved and we destructively pop the cluster off.
Now, remember that Common Lisp allows closures over block names and tagbody tags. That is, after the definitions
(defvar *transfer-control*)
(defun weird (function)
(tagbody
(go :start)
:tag
(print 'transferred)
(return-from weird)
:start
(setf *transfer-control* (lambda () ; captures the tag :tag
(go :tag)))
(funcall function)))
a form like
(weird (lambda () (funcall *transfer-control*)))
is allowed, and will print 'transferred. The control transfer happens, in a sense, outside of the lexical scope of the tagbody; the ability to (go :tag) has “escaped” its enclosing scope. (It would be an error to funcall *transfer-control* after weird has returned, because the dynamic extent of the tagbody has been exited.)
All this is to say that calling an ordinary Common Lisp function can cause a transfer of control, instead of returning a value. Calling *transfer-control* does nothing but unwind the dynamic environment up to the point of the tagbody, and then jumps to :tag. The function doesn't “return” in the usual sense of the term, because the evaluation of the expression in which it is embedded will be abruptly stopped, never to resume. (With weird and *transfer-control*, we've defined a primitive substitute for catch and throw that simply transfers control but doesn't convey a value at the same time. To see definitions of tagbody, block, and catch in terms of each other, see Henry Baker's “Metacircular Semantics for Common Lisp Special Forms”.)
Therefore, when signal calls the handler for a condition, two things can happen:
The handler transfers control, aborting the evaluation of signal and unwinding the stack to the place to which control was transferred.
The handler does not transfer control, but returns a value. In this case, as the definition above shows, signal will continue looking for a handler until reaching the end of *handler-clusters* or until another handler transfers control. This is called “declining”.
(In a way, it can also do neither or both, by, for instance, calling signal on another condition. The specification calls this deferring.)
For example, the hyperspec gives a sample expansion for handler-case. The form
(handler-case form
(type1 (var1) . body1)
(type2 (var2) . body2) ...)
becomes (ignoring problems of variable capture)
(block return-point
(let ((condition nil))
(tagbody
(handler-bind ((type1 #'(lambda (temp)
(setq condition temp)
(go :handler-tag-1)))
(type2 #'(lambda (temp)
(setq condition temp)
(go :handler-tag-2)))
...)
(return-from return-point form))
:handler-tag-1
(return-from return-point (let ((var1 condition)) . body1))
:handler-tag-2
(return-from return-point (let ((var2 condition)) . body2))
...)))
(I've rewritten the hyperspec's code to be more readable, although unhygienic, and I also fixed an error in the original.)
As you can see, the handlers established by handler-case unconditionally transfer control if called. Thus handler-case handlers definitely “handle” conditions.
Restarts are implemented in a very similar way, by restart-bind setting up the dynamic environment and invoke-restart using it to call a function. Because restarts are just functions, they need not transfer control, and so calling invoke-restart is not always an act of “handling”, although it is if the restart was established by restart-case or with-simple-restart—or, of course, if the restart transfers control.
The glossary describes things exactly but tersely: if a handler returns normally, it has declined to handle the condition. In order to handle the condition, it must instead transfer control so that it never returns. In CL this means it must transfer control 'upwards'.
An example might be
(block done
(handler-bind ((error (lambda (c)
(return-from done c))))
(error "exploded")))
Here the handler for conditions of type error is handling the condition, since it never returns normally but rather returns from the done block.
A full description of this is here.
Apologies for any indentation / paren errors: my lisp machine has emitted smoke so I am typing this on a more primitive system.

What is the difference between an Idempotent and a Deterministic function?

Are idempotent and deterministic functions both just functions that return the same result given the same inputs?
Or is there a distinction that I'm missing?
(And if there is a distinction, could you please help me understand what it is)
In more simple terms:
Pure deterministic function: The output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output. There are no side-effects or other output.
Impure deterministic function: As with a deterministic function that is a pure function: the output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output - however there is other output (side-effects).
Idempotency: The practical definition is that you can safely call the same function multiple times without fear of negative side-effects. More formally: there are no changes of state between subsequent identical calls.
Idempotency does not imply determinacy (as a function can alter state on the first call while being idempotent on subsequent calls), but all pure deterministic functions are inherently idempotent (as there is no internal state to persist between calls). Impure deterministic functions are not necessarily idempotent.
Pure deterministic
Impure deterministic
Pure Nondeterministic
Impure Nondeterministic
Idempotent
Input
Only parameter arguments (incl. this)
Only parameter arguments (incl. this)
Parameter arguments and hidden state
Parameter arguments and hidden state
Any
Output
Only return value
Return value or side-effects
Only return value
Return value or side-effects
Any
Side-effects
None
Yes
None
Yes
After 1st call: Maybe.After 2nd call: None
SQL Example
UCASE
CREATE TABLE
GETDATE
DROP TABLE
C# Example
String.IndexOf
DateTime.Now
Directory.Create(String)Footnote1
Footnote1 - Directory.Create(String) is idempotent because if the directory already exists it doesn't raise an error, instead it returns a new DirectoryInfo instance pointing to the specified extant filesystem directory (instead of creating the filesystem directory first and then returning a new DirectoryInfo instance pointing to it) - this is just like how Win32's CreateFile can be used to open an existing file.
A temporary note on non-scalar parameters, this, and mutating input arguments:
(I'm currently unsure how instance methods in OOP languages (with their hidden this parameter) can be categorized as pure/impure or deterministic or not - especially when it comes to mutating the the target of this - so I've asked the experts in CS.SE to help me come to an answer - once I've got a satisfactory answer there I'll update this answer).
A note on Exceptions
Many (most?) programming languages today treat thrown exceptions as either a separate "kind" of return (i.e. "return to nearest catch") or as an explicit side-effect (often due to how that language's runtime works). However, as far as this answer is concerned, a given function's ability to throw an exception does not alter its pure/impure/deterministic/non-deterministic label - ditto idempotency (in fact: throwing is often how idempotency is implemented in the first place e.g. a function can avoid causing any side-effects simply by throwing right-before it makes those state changes - but alternatively it could simply return too.).
So, for our CS-theoretical purposes, if a given function can throw an exception then you can consider the exception as simply part of that function's output. What does matter is if the exception is thrown deterministically or not, and if (e.g. List<T>.get(int index) deterministically throws if index < 0).
Note that things are very different for functions that catch exceptions, however.
Determinacy of Pure Functions
For example, in SQL UCASE(val), or in C#/.NET String.IndexOf are both deterministic because the output depends only on the input. Note that in instance methods (such as IndexOf) the instance object (i.e. the hidden this parameter) counts as input, even though it's "hidden":
"foo".IndexOf("o") == 1 // first cal
"foo".IndexOf("o") == 1 // second call
// the third call will also be == 1
Whereas in SQL NOW() or in C#/.NET DateTime.UtcNow is not deterministic because the output changes even though the input remains the same (note that property getters in .NET are equivalent to a method that accepts no parameters besides the implicit this parameter):
DateTime.UtcNow == 2016-10-27 18:10:01 // first call
DateTime.UtcNow == 2016-10-27 18:10:02 // second call
Idempotency
A good example in .NET is the Dispose() method: See Should IDisposable.Dispose() implementations be idempotent?
a Dispose method should be callable multiple times without throwing an exception.
So if a parent component X makes an initial call to foo.Dispose() then it will invoke the disposal operation and X can now consider foo to be disposed. Execution/control then passes to another component Y which also then tries to dispose of foo, after Y calls foo.Dispose() it too can expect foo to be disposed (which it is), even though X already disposed it. This means Y does not need to check to see if foo is already disposed, saving the developer time - and also eliminating bugs where calling Dispose a second time might throw an exception, for example.
Another (general) example is in REST: the RFC for HTTP1.1 states that GET, HEAD, PUT, and DELETE are idempotent, but POST is not ( https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html )
Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
So if you use DELETE then:
Client->Server: DELETE /foo/bar
// `foo/bar` is now deleted
Server->Client: 200 OK
Client->Server DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
// the client asks again:
Client->Server: DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
So you see in the above example that DELETE is idempotent in that the state of the server did not change between the last two DELETE requests, but it is not deterministic because the server returned 200 for the first request but 404 for the second request.
A deterministic function is just a function in the mathematical sense. Given the same input, you always get the same output. On the other hand, an idempotent function is a function which satisfies the identity
f(f(x)) = f(x)
As a simple example. If UCase() is a function that converts a string to an upper case string, then clearly UCase(Ucase(s)) = UCase(s).
Idempotent functions are a subset of all functions.
A deterministic function will return the same result for the same inputs, regardless of how many times you call it.
An idempotent function may NOT return the same result (it will return the result in the same form but the value could be different, see http example below). It only guarantees that it will have no side effects. In other words it will not change anything.
For example, the GET verb is meant to be idempotent in HTTP protocol. If you call "~/employees/1" it will return the info for employee with ID of 1 in a specific format. It should never change anything but simply return the employee information. If you call it 10, 100 or so times, the returned format will always be the same. However, by no means can it be deterministic. Maybe if you call it the second time, the employee info has changed or perhaps the employee no longer even exists. But never should it have side effects or return the result in a different format.
My Opinion
Idempotent is a weird word but knowing the origin can be very helpful, idem meaning same and potent meaning power. In other words it means having the same power which clearly doesn't mean no side effects so not sure where that comes from. A classic example of There are only two hard things in computer science, cache invalidation and naming things. Why couldn't they just use read-only? Oh wait, they wanted to sound extra smart, perhaps? Perhaps like cyclomatic complexity?

Alternative to the try (?) operator suited to iterator mapping

In the process of learning Rust, I am getting acquainted with error propagation and the choice between unwrap and the ? operator. After writing some prototype code that only uses unwrap(), I would like to remove unwrap from reusable parts, where panicking on every error is inappropriate.
How would one avoid the use of unwrap in a closure, like in this example?
// todo is VecDeque<PathBuf>
let dir = fs::read_dir(&filename).unwrap();
todo.extend(dir.map(|dirent| dirent.unwrap().path()));
The first unwrap can be easily changed to ?, as long as the containing function returns Result<(), io::Error> or similar. However, the second unwrap, the one in dirent.unwrap().path(), cannot be changed to dirent?.path() because the closure must return a PathBuf, not a Result<PathBuf, io::Error>.
One option is to change extend to an explicit loop:
let dir = fs::read_dir(&filename)?;
for dirent in dir {
todo.push_back(dirent?.path());
}
But that feels wrong - the original extend was elegant and clearly reflected the intention of the code. (It might also have been more efficient than a sequence of push_backs.) How would an experienced Rust developer express error checking in such code?
How would one avoid the use of unwrap in a closure, like in this example?
Well, it really depends on what you wish to do upon failure.
should failure be reported to the user or be silent
if reported, should one failure be reported or all?
if a failure occur, should it interrupt processing?
For example, you could perfectly decide to silently ignore all failures and just skip the entries that fail. In this case, the Iterator::filter_map combined with Result::ok is exactly what you are asking for.
let dir = fs::read_dir(&filename)?;
let todos.extend(dir.filter_map(Result::ok));
The Iterator interface is full of goodies, it's definitely worth perusing when looking for tidier code.
Here is a solution based on filter_map suggested by Matthieu. It calls Result::map_err to ensure the error is "caught" and logged, sending it further to Result::ok and filter_map to remove it from iteration:
fn log_error(e: io::Error) {
eprintln!("{}", e);
}
(|| {
let dir = fs::read_dir(&filename)?;
todo.extend(dir
.filter_map(|res| res.map_err(log_error).ok()))
.map(|dirent| dirent.path()));
})().unwrap_or_else(log_error)

FreeMarker ?has_content on Iterator

How does FreeMarker implement .iterator()?has_content for Iterator ?
Does it attempt to get the first item to decide whether to render,
and does it keep it for the iteration? Or does it start another iteration?
I have found this
static boolean isEmpty(TemplateModel model) throws TemplateModelException
{
if (model instanceof BeanModel) {
...
} else if (model instanceof TemplateCollectionModel) {
return !((TemplateCollectionModel) model).iterator().hasNext();
...
}
}
But I am not sure what Freemarker wraps the Iterator type to.
Is it TemplateCollectionModel?
It doesn't get the first item, it just calls hasNext(), and yes, with the default ObjectWrapper (see the object_wrapper configuration setting) it will be treated as a TemplateCollectionModel, which can be listed only once though. But prior to 2.3.24 (it's unreleased when I write this), there are some glitches to look out for (see below). Also if you are using pure BeansWrapper instead of the default ObjectWrapper, there's a glitch (see below too).
Glitch (2.3.23 and earlier): If you are using the default ObjectWrapper (you should), and the Iterator is not re-get for the subsequent listing, that is, the same already wrapped value is reused, then it will freak out telling that an Iterator can be listed only once.
Glitch 2, with pure BeansWrapper: It will just always say that it has content, even if the Iterator is empty. That's also fixed in 2.3.24, however, you need to create the BeansWrapper itself (i.e., not (just) the Configuration) with 2.3.24 incompatibleImprovements for the fix to be active.
Note that <#list ...>...<#else>...</#list> has been and is working in all cases (even before 2.3.24).
Last not least, thanks for bringing this topic to my attention. I have fixed these in 2.3.24 because of that. (Nightly can be built from here: https://github.com/apache/incubator-freemarker/tree/2.3-gae)

Which one is better for me to use: "defer-panic-recover" or checking "if err != nil { //dosomething}" in golang?

I've made a large program that opens and closes files and databases, perform writes and reads on them etc among other things. Since there no such thing as "exception handling in go", and since I didn't really know about "defer" statement and "recover()" function, I applied error checking after every file-open, read-write, database entry etc. E.g.
_,insert_err := stmt.Run(query)
if insert_err != nil{
mylogs.Error(insert_err.Error())
return db_updation_status
}
For this, I define db_updation_status at the beginning as "false" and do not make it "true" until everything in the program goes right.
I've done this in every function, after every operation which I believe could go wrong.
Do you think there's a better way to do this using defer-panic-recover? I read about these here http://golang.org/doc/articles/defer_panic_recover.html, but can't clearly get how to use them. Do these constructs offer something similar to exception-handling? Am I better off without these constructs?
I would really appreciate if someone could explain this to me in a simple language, and/or provide a use case for these constructs and compare them to the type of error handling I've used above.
It's more handy to return error values - they can carry more information (advantage to the client/user) than a two valued bool.
What concerns panic/recover: There are scenarios where their use is completely sane. For example, in a hand written recursive descent parser, it's quite a PITA to "bubble" up an error condition through all the invocation levels. In this example, it's a welcome simplification if there's a deferred recover at the top most (API) level and one can report any kind of error at any invocation level using, for example
panic(fmt.Errorf("Cannot %v in %v", foo, bar))
If an operation can fail and returns an error, than checking this error immediately and handling it properly is idiomatic in go, simple and nice to check if anything gets handled properly.
Don't use defer/recover for such things: Needed cleanup actions are hard to code, especially if stuff gets nested.
The usual way to report an error to a caller is to return an error as an extra return value. The canonical Read method is a well-known instance; it returns a byte count and an error.
But what if the error is unrecoverable? Sometimes the program simply cannot continue.
For this purpose, there is a built-in function panic that in effect creates a run-time error that will stop the program (but see the next section). The function takes a single argument of arbitrary type—often a string—to be printed as the program dies. It's also a way to indicate that something impossible has happened, such as exiting an infinite loop.
http://golang.org/doc/effective_go.html#errors