Is it possible to override type-error behavior in Common Lisp upon out-of-range values for limited-range variables? - error-handling

Sorry if the question's topic is oddly phrased (for lack of better terminology -- also one of the reasons I didn't find anything Googling this specific topic), so here's what I mean with an example.
Let's say this function foobar is defined:
(defun foobar (x)
  (declare (type (integer -100 100) x))
  (format T "X is ~A~%" x))
So with the declare line, x must be an integer between -100 and 100 inclusive. Thus, doing this yields an error:
CL-USER> (foobar 101)
The value 101 is not of type (INTEGER -100 100).
[Condition of type TYPE-ERROR]
Restarts:
(blah blah blah)
Short of changing the function itself to do the clamping explicitly, is there a way to specify an override behavior such that, without altering the defun of foobar,
(foobar [any-value-over-100])
clamps the argument to 100, and likewise clamps anything below -100 to -100?
Edit: To answer one responder, this is clamping -- keeping a value strictly within a defined minimum and maximum range. In Lisp, here is an example:
CL-USER> (defun clamp (x min max)
           (if (> x max)
               max
               (if (< x min)
                   min
                   x)))
CLAMP
CL-USER> (clamp 5 4 9)
5
CL-USER> (clamp -2 4 9)
4
CL-USER> (clamp 123 4 9)
9
While I can easily just make this a macro and put it at the beginning of any function (and I have an odd feeling this'll ultimately be what I'll have to do), this question is asking whether it's possible to tell the Common Lisp error handler to "just do this with the values instead!", rather than having it interrupt the entire program flow as it normally does.

Type declarations in Common Lisp
Your code:
(defun foobar (x)
  (declare (type (integer -100 100) x))
  (format T "X is ~A~%" x))
The consequences of calling the function above with something like (foobar 120) are entirely undefined in Common Lisp:
it may be completely ignored
it may lead to errors or various runtime problems
it may help the compiler to create better code (this is btw. the main reason for those declarations)
it may be type checked at compile time and/or runtime; only a few Lisp compilers do that.
Portable runtime type checking in Common Lisp
If you want to portably check for runtime type errors use CHECK-TYPE or ASSERT.
(defun foobar (x)
  (check-type x (integer -100 100))
  (format T "X is ~A~%" x))
Advising
Extending functions without changing their source code is called 'advising' in Lisp. Advising ordinary functions is not in the Common Lisp standard, but libraries provide it and it is not difficult to write such a thing yourself.
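For illustration only (advise-clamp is a made-up name, not a library API), a minimal piece of advice that wraps an already-defined function could look like this, reusing the clamp from the question:

;; Sketch: replace the function named NAME with a wrapper that clamps
;; its first argument into [MIN, MAX] before calling the original.
(defun advise-clamp (name min max)
  (let ((original (fdefinition name)))
    (setf (fdefinition name)
          (lambda (x &rest more-args)
            (apply original (clamp x min max) more-args)))))

;; (advise-clamp 'foobar -100 100)
;; (foobar 150) now behaves like (foobar 100)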
Extending Generic Functions
Common Lisp has this mechanism built-in for generic functions. The standard method combination has :before, :after and :around advising.
(defmethod foobar ((x integer))
  (check-type x (integer -100 100))
  (format T "X is ~A~%" x))
In Common Lisp one cannot dispatch on arbitrary types - only on classes. There are classes for basic types like string, integer, ... Here we use that x is an integer.
If you want to clamp foobar's x:
(defmethod foobar :around ((x integer))
  (call-next-method (clamp x -100 100)))
Above is an :around method. It calls the next method, the one above, with a changed argument. This is allowed as long as the changed argument does not alter which methods are applicable (the dispatch).
Alternative approach: Macro
One goal might be to write less code and make it more declarative.
Maybe one wants to write:
(defun-clamped foobar ((x (integer :min -100 :clamped-max 100)))
  (format T "X is ~A~%" x))
Then I would just write the defun-clamped macro, which expands into a normal DEFUN, which does the necessary things.
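A minimal sketch of such a macro, with a simpler argument syntax than the one above and reusing the clamp from the question, might be:

(defmacro defun-clamped (name ((arg min max)) &body body)
  ;; Sketch: define NAME with a single argument ARG that is clamped
  ;; into [MIN, MAX] before BODY runs.
  `(defun ,name (,arg)
     (let ((,arg (clamp ,arg ,min ,max)))
       ,@body)))

;; (defun-clamped foobar ((x -100 100))
;;   (format T "X is ~A~%" x))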

Ignore Declarations
If you compile the function with appropriate optimization settings (notably low safety), many implementations will simply not check the type declaration.
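As a hedged example (behavior is implementation-dependent; see also the warning about this approach further down), something along these lines typically makes SBCL trust the declaration rather than check it:

;; With (safety 0) the compiler may trust the declaration and omit the
;; runtime check entirely -- violating it is then undefined behavior,
;; not a clamp.
(defun foobar (x)
  (declare (optimize (safety 0))
           (type (integer -100 100) x))
  (format T "X is ~A~%" x))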
Redefine Function
Alternatively, you can redefine your function like this:
(defparameter *foobar-orig* (fdefinition 'foobar))

(defun foobar (x)
  (funcall *foobar-orig* (whatever-you-want x)))
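A concrete version of that idea, reusing the clamp from the question, might be:

;; Save the original definition once, then install a wrapper that
;; clamps the argument before delegating to it.
(defparameter *foobar-orig* (fdefinition 'foobar))

(defun foobar (x)
  (funcall *foobar-orig* (clamp x -100 100)))

;; Note: re-evaluating the DEFPARAMETER after the redefinition would
;; capture the wrapper itself and cause infinite recursion.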
Use restarts
Your best way forward is to replace declarations with check-type and establish appropriate handlers, e.g.,
(handler-bind ((type-error
                 (lambda (c)
                   (let ((et (type-error-expected-type c)))
                     (store-value (clamp (type-error-datum c)
                                         (second et) (third et)))))))
  (let ((x 100))
    (check-type x (integer 1 10))
    (print x)))
The standard does not provide for global error handlers, but implementations usually do.
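Applied to the CHECK-TYPE version of foobar from earlier (and assuming the clamp function from the question), the same idea might look like this:

(handler-bind ((type-error
                 (lambda (c)
                   (let ((et (type-error-expected-type c)))
                     (store-value (clamp (type-error-datum c)
                                         (second et) (third et)))))))
  ;; foobar here is the CHECK-TYPE version; 150 is clamped to 100 via
  ;; the STORE-VALUE restart, so this prints "X is 100".
  (foobar 150))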

If I understand you correctly, you want to ensure that the integer stays in that range. If that is the case, I don't think you should handle it with a type error, but with a (let ...), something like this:
(defun ensure-range (x low high)
  (cond ((< x low) low)
        ((> x high) high)
        (t x)))
(defun foobar (x)
  (let ((x (ensure-range x -100 100)))
    (format T "X is ~A~%" x)))

Don't declare x to be between -100 and 100 if you can accept something else. I think that an implementation might be free to allow any kind of memory corruption if a declaration is violated.
So, doing
(declare (optimize (safety 0)))
to avoid the declaration throwing an error is not really a good idea.
You can first clamp the value and then put the rest of the function definition into a LET form.
(defun foo (x)
  (declare (type integer x))
  (let ((x (clamp x -100 100)))
    (declare (type (integer -100 100) ; LET allows declarations, too!
                   x))
    (bar x)))
If you want a macro to do this for you, something like the following should work:
(defmacro defclamp (name (arg min max)
                    &body body)
  `(defun ,name (,arg)
     (declare (type real ,arg))
     (let ((,arg (clamp ,arg ,min ,max)))
       (declare (type (real ,min ,max)
                      ,arg))
       ,@body)))
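Usage would then look like this (note that clamp itself is not standard Common Lisp; use the one from the question or, for example, Alexandria's alexandria:clamp):

(defclamp foobar (x -100 100)
  (format T "X is ~A~%" x))

;; (foobar 150)  prints "X is 100"
;; (foobar -250) prints "X is -100"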

Related

How can I use types so that generic operations are inlined (or "open coded") in sbcl?

SBCL compiler optimizations are based on the idea that if a type is declared, then "open coding" allows generic operations to be replaced with specific ones.
For example
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
Will allow the generic + to be replaced with a single instruction for fixnum.
However, I have found that in practice, this seems to rarely be possible because:
In order for a function to be specialized/optimized it must be inlinable. The definition must be marked explicitly with a (declaim (inline ...)), so the author of a function must anticipate that others might want to inline it. (In theory the compiler could generate multiple versions, but this doesn't seem to be the case.)
Most standard functions do not appear inlineable.
For example, one would expect that the following declaration is sufficient for open coding to take place:
(defun max-integers (array)
  (declare (optimize (speed 3) (space 0) (safety 0)))
  (declare (inline reduce))
  (declare (type (simple-array fixnum (*)) array))
  (reduce (lambda (a b) (if (> b a) b a)) array))
However, the assembly shows it's making a function call to the generic reduce:
; Size: 22 bytes. Origin: #x1001BC8109
; 09: 488B15B0FFFFFF MOV RDX, [RIP-80] ; no-arg-parsing entry point
; #<FUNCTION (LAMBDA
; # ..)>
; 10: B904000000 MOV ECX, 4
; 15: FF7508 PUSH QWORD PTR [RBP+8]
; 18: B8781C3220 MOV EAX, #x20321C78 ; #<FDEFN REDUCE>
; 1D: FFE0 JMP RAX
The conclusion seems to be that the compiler cannot actually do much type optimization, as each usage of reduce, map, etc is a barrier to type propagation, and they are building blocks of everything else.
How can I overcome this and take advantage of optimizations by declaring types?
I really want to avoid writing type specific versions of each function or "macroifying" what should be a function.
I think one answer is that if you want to write FORTRAN-style array-bashing code, write FORTRAN-style array-bashing code. In particular using things like reduce is probably not the way to do this.
For instance if you change your function to the perfectly readable
(defun max-integers/loop (array)
  (declare (optimize (speed 3) (space 0) (safety 0))
           (type (simple-array fixnum (*)) array))
  (loop for i of-type fixnum across array
        maximizing i))
Then SBCL does a far, far better job of optimising it.
It's worth pointing out another confusion in your question: You say that for something like
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
SBCL will optimize + to the machine instruction. No, it won't. The reason it won't is because the fixnum type is not closed under addition: consider what (add most-positive-fixnum 1) should do. If you want to generate very fast code for integers you need to make sure that your integer types are small enough that the compiler can be sure that the operations you're doing on them remain machine integers (or, if you want to live dangerously, cover your code with (the fixnum ...) and set safety to 0 when compiling, which seems to allow the compiler to just return the wrong answer for addition in the way people usually expect computers to do).
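To illustrate (add-small is just a name for this sketch, and the exact code generated depends on the SBCL version and platform): bounding the inputs so the result provably stays a machine integer lets SBCL open-code the addition:

(defun add-small (a b)
  (declare (optimize (speed 3))
           (type (integer 0 1000000) a b))
  ;; The derived result type is (INTEGER 0 2000000), which still fits
  ;; in a fixnum on a typical 64-bit SBCL, so the + can compile to an
  ;; inline machine addition instead of a generic call.
  (+ a b))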
You can't force the implementation to open-code functions that weren't declared INLINE when they were defined -- it simply hasn't saved the information needed.
However, the overhead of calling REDUCE is probably negligible compared to the actual processing. So what you can do is declare the types of a and b, to optimize the callback function.
(reduce (lambda (a b) (declare (type fixnum a b)) (if (> b a) b a)) array)
I guess you were hoping that if it open-coded reduce it would automatically propagate this type from the declaration of array, so you wouldn't need to do this.

Why does this simple LISP function throw an error?

I isolated this function from a larger script and ran it through https://www.jdoodle.com/execute-clisp-online/. Even though there is an error thrown, it seems to follow the rules of LISP unless I'm missing something blatantly obvious.
(defun cannibals-can-eat (state start-state)
  (let ((left-bank-missionaries 2)
        (left-bank-cannibals 5)
        (right-bank-missionaries (- 3 left-bank-missionaries))
        (right-bank-cannibals (- 2 left-bank-cannibals)))
    (if (or (> left-bank-cannibals left-bank-missionaries)
            (> right-bank-cannibals right-bank-missionaries))
        t
        nil)))
The error is sometimes "The variable LEFT-BANK-MISSIONARIES is unbound.", sometimes "unmatched close parenthesis" or "syntax error near unexpected token `('". With this version of the function the error is the latter.
In Common Lisp there are two forms of local variable binding, let and let*:
(let ((var1 exp1)
      (var2 exp2)
      ...
      (varn expn))
  exp)
and
(let* ((var1 exp1)
       (var2 exp2)
       ...
       (varn expn))
  exp)
In the first form, every expression expi is evaluated in the environment in effect before the let. In the second, every expression expi is evaluated in the environment containing all the previous bindings var1 ... var(i-1).
So in your example the binding of right-bank-missionaries uses left-bank-missionaries, which is not yet bound since it is bound in the same let.
Simply use let* to allow the use of every variable immediately after its declaration:
(defun cannibals-can-eat (state start-state)
  (let* ((left-bank-missionaries 2)
         (left-bank-cannibals 5)
         (right-bank-missionaries (- 3 left-bank-missionaries))
         (right-bank-cannibals (- 2 left-bank-cannibals)))
    (or (> left-bank-cannibals left-bank-missionaries)
        (> right-bank-cannibals right-bank-missionaries))))
Note that the final if is useless if you want to return a generalized boolean.
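One way to picture the difference: the let* above behaves roughly as if the bindings were written as nested lets, each of which can see the previous ones:

(let ((left-bank-missionaries 2))
  (let ((left-bank-cannibals 5))
    (let ((right-bank-missionaries (- 3 left-bank-missionaries)))
      (let ((right-bank-cannibals (- 2 left-bank-cannibals)))
        (or (> left-bank-cannibals left-bank-missionaries)
            (> right-bank-cannibals right-bank-missionaries))))))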

How does this Scheme list iterator use call-with-current-continuation?

I'm trying to read this code:
(define list-iter
  (lambda (a-list)
    (define iter
      (lambda ()
        (call-with-current-continuation control-state)))
    (define control-state
      (lambda (return)
        (for-each
         (lambda (element)
           (set! return (call-with-current-continuation
                         (lambda (resume-here)
                           (set! control-state resume-here)
                           (return element)))))
         a-list)
        (return 'list-ended)))
    iter))
Can anyone explain how call-with-current-continuation works in this example?
Thanks
The essence of call-with-current-continuation, or call/cc for short, is the ability to grab checkpoints, or continuations, during the execution of a program. Then, you can go back to those checkpoints by applying them like functions.
Here's a simple example where the continuation isn't used:
> (call/cc (lambda (k) (+ 2 3)))
5
If you don't use the continuation, it's hard to tell the difference. Here are a few where we actually use it:
> (call/cc (lambda (k) (+ 2 (k 3))))
3
> (+ 4 (call/cc (lambda (k) (+ 2 3))))
9
> (+ 4 (call/cc (lambda (k) (+ 2 (k 3)))))
7
When the continuation is invoked, control flow jumps back to where the continuation was grabbed by call/cc. Think of the call/cc expression as a hole that gets filled by whatever gets passed to k.
list-iter is a substantially more complex use of call/cc, and might be a difficult place to begin using it. First, here's an example usage:
> (define i (list-iter '(a b c)))
> (i)
a
> (i)
b
> (i)
c
> (i)
list-ended
> (i)
list-ended
Here's a sketch of what's happening:
list-iter returns a procedure of no arguments i.
When i is invoked, we grab a continuation immediately and pass it to control-state. When that continuation, bound to return, is invoked, we'll immediately return to whoever invoked i.
For each element in the list, we grab a new continuation and overwrite the definition of control-state with that new continuation, meaning that we'll resume from there the next time step 2 comes along.
After setting up control-state for the next time through, we pass the current element of the list back to the return continuation, yielding an element of the list.
When i is invoked again, repeat from step 2 until the for-each has done its work for the whole list.
Invoke the return continuation with 'list-ended. Since control-state isn't updated, it will keep returning 'list-ended every time i is invoked.
As I said, this is a fairly complex use of call/cc, but I hope this is enough to get through this example. For a gentler introduction to continuations, I'd recommend picking up The Seasoned Schemer.
Basically it takes a function f as its parameter and applies f to the current continuation of the program, a value representing "the rest of the computation".
From wikipedia:
(define (f return)
  (return 2)
  3)
(display (f (lambda (x) x))) ; displays 3
(display (call-with-current-continuation f)) ; displays 2
So when f is called normally, its argument (here the identity function) is applied to 2, the result is discarded, and f returns 3. When f is called via call-with-current-continuation, return is the captured continuation: applying it to 2 jumps back to the point where call-with-current-continuation was called, so the expression returns 2. It can be used to emulate early returns, or to suspend and resume execution flow.
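As a small illustration of the early-return use (find-first is just a made-up name for this sketch):

(define (find-first pred lst)
  (call-with-current-continuation
   (lambda (return)
     (for-each (lambda (x)
                 (if (pred x)
                     (return x)))    ; jump straight out with x
               lst)
     #f)))                           ; reached only if nothing matched

;; (find-first even? '(1 3 4 5)) => 4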
If you know C, think about it like this: in C, you can take a pointer to a function. You also have a return mechanism. Suppose the return took a parameter of the same type the function takes. Suppose you could take its address and store that address in a variable or pass it as a parameter, and allow functions to return for you. It can be used to mimic throw/catch, or as a mechanism for coroutines.
This is essentially:
(define (consume)
  (write (call/cc control)))

(define (control ret)
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 1))))
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 2))))
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 3)))))
(consume)
(consume)
(consume)
Hope this makes it easier to understand.

Reorder function arguments in Lisp

I'm interested in an operator, "swap-arg", that takes as input 1) a function f of n variables, and 2) an index k, and then returns the same function except with the first and kth input variables swapped, e.g. (in mathematical notation):
(swap-arg(f,2))(x,y,z,w) = f(z,y,x,w)
Now my first idea is to implement this using rotatef as follows,
(defun swap-args (f k)
  (lambda (L) (f (rotatef (nth k L) (car L)))))
However, this seems inelegant since it uses rotatef on the input. Also, it's O(n), and could be O(n^2) in practice if applied repeatedly to reindex everything.
This seems like a common problem people would have already considered, but I haven't been able to find anything. What's a good way to swap inputs like this? Is there a standard method people use?
Using APPLY:
(defun create-swapped-arg-function (f k)
  "Takes as input a function F of n variables and an index K.
Returns a new function with the first and Kth input variables swapped,
which calls the function F."
  (lambda (&rest args)
    (apply f (progn
               (rotatef (nth k args) (first args))
               args))))
Example:
CL-USER 5 > (funcall (create-swapped-arg-function #'list 2) 0 1 2 3 4 5 6)
(2 1 0 3 4 5 6)
Another way to do it would be to build the source code for such a function, compile it at runtime and return it. That would be useful if these functions are not created often, but called often.
Just for completeness: functions can also take keyword (named) arguments; with those, the function can be called with its keyword arguments in any order.
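For example (move, :from and :to are just illustrative names for this sketch):

(defun move (&key from to)
  (list from to))

;; The call site can pass the keyword arguments in any order:
;; (move :from 'a :to 'b) => (A B)
;; (move :to 'b :from 'a) => (A B)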

Scheme: why this result when redefining a predefined operator?

I received an unexpected result when redefining the + operator in a scheme program using guile. I should point out that this occurred while experimenting to try to understand the language; there's no attempt here to write a useful program.
Here's the code:
(define (f a b) 4)
(define (show)
  (display (+ 2 2)) (display ",") (display (f 2 2)) (newline))
(show)
; guile & mit-scheme: "4,4"
(define (+ a b) 5)
(define (f a b) 5)
(show)
; mit-scheme: "5,5"
; guile: "4,5" - this "4" is the unexpected result
(define (show)
  (display (+ 2 2)) (display ",") (display (f 2 2)) (newline))
(show)
; guile & mit-scheme: "5,5"
In guile the function show uses the predefined definition of + even after I've redefined it, though it uses the new definition of f. I have to redefine show to get it to recognise the new definition of +. In mit-scheme both new definitions are recognised immediately, which is what I was expecting to happen. Also, any further definitions of + are instantly recognised by both interpreters without having to redefine show.
What's going on behind the scenes in guile to make it bind references to these redefined operators differently?
And why the difference between the two interpreters?
It looks like Guile is wrongly assuming that nobody is crazy enough to redefine + and is making the optimization of folding (+ 2 2) => 4, making (display (+ 2 2)) become (display 4). That would explain why you need to redefine show in order to reflect your new +.
Indeed, if you first do (define (+ a b) 4) at the very top of your program, Guile will not do that optimization and you will get 4,4 and 5,5 just like MIT Scheme.
Edit: Actually, it looks like Guile will optimize + to reference its own native + construct, meaning that even if you don't use constants (no constant folding) you will still be unable to redefine + like that.