How can I use types so that generic operations are inlined (or "open coded") in sbcl? - optimization

SBCL compiler optimizations are based on the idea that if a type is declared, then "open coding" allows generic operations to be replaced with specific ones.
For example
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
will allow the generic + to be replaced with a single fixnum-specific instruction.
However, I have found that in practice, this seems to rarely be possible because:
In order for a function to be specialized/optimized across a call, it must be inlinable: its definition must be explicitly preceded by a (declaim (inline ...)), so the author of a function must anticipate that others might want to inline it. (In theory the compiler could generate multiple specialized versions, but this doesn't seem to happen.)
Most standard functions do not appear to be inlinable.
For example, one would expect that the following declaration is sufficient for open coding to take place:
(defun max-integers (array)
  (declare (optimize (speed 3) (space 0) (safety 0)))
  (declare (inline reduce))
  (declare (type (simple-array fixnum (*)) array))
  (reduce (lambda (a b) (if (> b a) b a)) array))
However, the assembly shows it's making a function call to the generic reduce:
; Size: 22 bytes. Origin: #x1001BC8109
; 09: 488B15B0FFFFFF MOV RDX, [RIP-80] ; no-arg-parsing entry point
; #<FUNCTION (LAMBDA
; # ..)>
; 10: B904000000 MOV ECX, 4
; 15: FF7508 PUSH QWORD PTR [RBP+8]
; 18: B8781C3220 MOV EAX, #x20321C78 ; #<FDEFN REDUCE>
; 1D: FFE0 JMP RAX
The conclusion seems to be that the compiler cannot actually do much type optimization, as each usage of reduce, map, etc. is a barrier to type propagation, and these are the building blocks of everything else.
How can I overcome this and take advantage of optimizations by declaring types?
I really want to avoid writing type specific versions of each function or "macroifying" what should be a function.

I think one answer is that if you want to write FORTRAN-style array-bashing code, write FORTRAN-style array-bashing code. In particular using things like reduce is probably not the way to do this.
For instance if you change your function to the perfectly readable
(defun max-integers/loop (array)
  (declare (optimize (speed 3) (space 0) (safety 0))
           (type (simple-array fixnum (*)) array))
  (loop for i of-type fixnum across array
        maximizing i))
Then SBCL does a far, far better job of optimising it.
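For reference, a small usage sketch (in SBCL, make-array with :element-type 'fixnum produces exactly the (simple-array fixnum (*)) the declaration expects):
(max-integers/loop
 (make-array 5 :element-type 'fixnum :initial-contents '(3 1 4 1 5)))
;; => 5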
It's worth pointing out another confusion in your question: You say that for something like
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
SBCL will optimize + to the machine instruction. No, it won't. The reason it won't is because the fixnum type is not closed under addition: consider what (add most-positive-fixnum 1) should do. If you want to generate very fast code for integers you need to make sure that your integer types are small enough that the compiler can be sure that the operations you're doing on them remain machine integers (or, if you want to live dangerously, cover your code with (the fixnum ...) and set safety to 0 when compiling, which seems to allow the compiler to just return the wrong answer for addition in the way people usually expect computers to do).
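If you do want the addition itself open-coded, one approach is to declare argument types narrow enough that the result provably stays a machine integer. A minimal sketch (the name add-u32 is just for illustration, and the "still a fixnum" claim assumes a 64-bit platform):
;; sketch: the sum of two (unsigned-byte 32) values fits in 33 bits,
;; which is still a fixnum on 64-bit SBCL, so + can be open-coded
(defun add-u32 (a b)
  (declare (optimize (speed 3) (safety 0))
           (type (unsigned-byte 32) a b))
  (+ a b))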

You can't force the implementation to open-code functions that weren't declared INLINE when they were defined -- it simply hasn't saved the information needed.
However, the overhead of calling REDUCE is probably negligible compared to the actual processing. So what you can do is declare the types of a and b, to optimize the callback function.
(reduce (lambda (a b) (declare (type fixnum a b)) (if (> b a) b a)) array)
I guess you were hoping that if it open-coded reduce it would automatically propagate this type from the declaration of array, so you wouldn't need to do this.
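Putting that together with the array declaration from the question, a sketch of the whole function might look like this (REDUCE is still a full call, but the callback's comparison now operates on declared fixnums):
(defun max-integers (array)
  (declare (optimize (speed 3) (space 0) (safety 0))
           (type (simple-array fixnum (*)) array))
  (reduce (lambda (a b)
            (declare (type fixnum a b))
            (if (> b a) b a))
          array))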

Related

Is it possible / what are examples of using hygienic macros for the compile time computational optimization?

I've been reading through https://lispcast.com/when-to-use-a-macro, and it states (about Clojure's macros)
Another example is performing expensive calculations at compile time as an optimization
I looked it up, and it seems Clojure has unhygienic macros. Can this also be applied to hygienic ones? I'm particularly talking about Scheme. As far as I understand hygienic macros, they only transform syntax, and the actual execution of code is deferred until runtime no matter what.
Yes. Macro hygiene just refers to whether or not macro expansion can accidentally capture identifiers. Whether or not a macro is hygienic, regular macro expansion (as opposed to reader macro expansion) occurs at compile time. Expansion replaces the macro call with the result of running the macro's code. Two major use cases are transforming syntax (e.g. DSLs), eliminating computations at run time to improve performance, or both.
A few examples come to mind:
You prefer to write your code with angles in degrees but all of the calculations are actually in radians. You could have macros eliminate these trivial, but unnecessary (at run time) conversions, at compile time.
Memoization is a broad example of compute optimization that macros can be used for.
You have a string representing a SQL statement or complex textual math expression which you want to parse and possibly even execute at compile time.
You could also combine the examples and have a memoizing SQL parser. Pretty much any scenario where you have all the necessary inputs at compile time and can therefore compute the result is a candidate.
Yes, hygienic macros can do this sort of thing. As an example, here is a macro called plus in Racket which is like + except that, at macroexpansion time, it sums runs of adjacent literal numbers. It thus does some of the work you might expect to happen at run time at macroexpansion time (so, effectively, at compile time). For instance
(plus a b 1 2 3 c 4 5)
expands to
(+ a b 6 c 9)
Some notes on this macro.
It's probably not very idiomatic Racket, because I'm a mostly-unreformed CL hacker, which means I live in a cave and wear animal skins and say 'ug' a lot. In particular I am sure I should use syntax-parse but I can't understand it.
It might not even be right.
There are subtleties with arithmetic which mean that this macro can return different results than +. In particular, + is defined to add pairwise from left to right, while plus does not in general: all the literals get added first. For instance (assuming you have done (require racket/flonum) and that -max.0 &c have the same values as they do on my machine), (+ -max.0 1.7976931348623157e+308 1.7976931348623157e+308) has a value of 1.7976931348623157e+308, while (plus -max.0 1.7976931348623157e+308 1.7976931348623157e+308) has a value of +inf.0, because the two literals get added first and this overflows.
In general this is a useless thing: it's safe to assume, I think, that any reasonable compiler will do these kind of optimisations for you. The only purpose of it is to show that it's possible to do the detect-and-compile-away compile-time constants.
Remarkably, at least from the point of view of caveman-lisp users like me, you can treat this just like + because of the last clause in the syntax-case: it works to say (apply plus ...) for instance (although no clever optimisation happens in that case, of course).
Here it is:
(require (for-syntax racket/list))

(define-syntax (plus stx)
  (define +/stx (datum->syntax stx +))
  (syntax-case stx ()
    [(_)
     ;; return additive identity
     #'0]
    [(_ a)
     ;; identity with one argument
     #'a]
    [(_ a ...)
     ;; the interesting case: there's more than one argument, so walk over them
     ;; looking for literal numbers. This is probably overcomplicated and
     ;; unidiomatic
     (let* ([syntaxes (syntax->list #'(a ...))]
            [reduced (let rloop ([current (first syntaxes)]
                                 [tail (rest syntaxes)]
                                 [accum '()])
                       (cond
                         [(null? tail)
                          (reverse (cons current accum))]
                         [(and (number? (syntax-e current))
                               (number? (syntax-e (first tail))))
                          (rloop (datum->syntax stx
                                                (+ (syntax-e current)
                                                   (syntax-e (first tail))))
                                 (rest tail)
                                 accum)]
                         [else
                          (rloop (first tail)
                                 (rest tail)
                                 (cons current accum))]))])
       (if (= (length reduced) 1)
           (first reduced)
           ;; make sure the operation is our +
           #`(#,+/stx #,@reduced)))]
    [_
     ;; plus on its own is +, but we want our one. I am not sure this is right
     +/stx]))
It is possible to do this even more aggressively, so that (plus a b 1 2 c 3) is turned into (+ a b c 6). This probably has even more exciting might-get-different-answers implications. It's worth noting what the CL spec says about this:
For functions that are mathematically associative (and possibly commutative), a conforming implementation may process the arguments in any manner consistent with associative (and possibly commutative) rearrangement. This does not affect the order in which the argument forms are evaluated [...]. What is unspecified is only the order in which the parameter values are processed. This implies that implementations may differ in which automatic coercions are applied [...].
So an optimisation like this is clearly legal in CL: I'm not clear that it's legal in Racket (although I think it should be).
(require (for-syntax racket/list))

(define-for-syntax (split-literals syntaxes)
  ;; split a list into literal numbers and the rest
  (let sloop ([tail syntaxes]
              [accum/lit '()]
              [accum/nonlit '()])
    (if (null? tail)
        (values (reverse accum/lit) (reverse accum/nonlit))
        (let ([current (first tail)])
          (if (number? (syntax-e current))
              (sloop (rest tail)
                     (cons (syntax-e current) accum/lit)
                     accum/nonlit)
              (sloop (rest tail)
                     accum/lit
                     (cons current accum/nonlit)))))))

(define-syntax (plus stx)
  (define +/stx (datum->syntax stx +))
  (syntax-case stx ()
    [(_)
     ;; return additive identity
     #'0]
    [(_ a)
     ;; identity with one argument
     #'a]
    [(_ a ...)
     ;; the interesting case: there's more than one argument: split the
     ;; arguments into literals and nonliterals and handle appropriately
     (let-values ([(literals nonliterals)
                   (split-literals (syntax->list #'(a ...)))])
       (if (null? literals)
           (if (null? nonliterals)
               #'0
               #`(#,+/stx #,@nonliterals))
           (let ([sum/stx (datum->syntax stx (apply + literals))])
             (if (null? nonliterals)
                 sum/stx
                 #`(#,+/stx #,@nonliterals #,sum/stx)))))]
    [_
     ;; plus on its own is +, but we want our one. I am not sure this is right
     +/stx]))

Optimization for accessing array in lisp

I am trying to learn how to make type declarations in lisp. I figured out that aref causes problems:
(defun getref (seq k)
  (declare (optimize (speed 3) (safety 0)))
  (declare (type (vector fixnum *) seq) (type fixnum k))
  (aref seq k))
Compiled, it says:
; in: DEFUN GETREF
; (AREF MORE-LISP::SEQ MORE-LISP::K)
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-REF ARRAY SB-INT:INDEX)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (VECTOR FIXNUM), not a SIMPLE-ARRAY.
;
; compilation unit finished
; printed 1 note
And so in every other function, where I want to use aref (and I do, since I need adjustable vectors), this happens too. How do I fix it?
It's not a problem and not an error. It is just information (a note) from the SBCL compiler that it can't optimize the code any further. The code will work just fine. You can safely ignore it.
If you can't use a simple vector (a one-dimensional simple array), then this is the price to pay for it: aref might be slightly slower.
The optimization hint you get comes from the docstring of a deftransform defined in sbcl/src/compiler/generic/vm-tran.lisp:
(deftransform hairy-data-vector-ref ((array index) (simple-array t) *)
"avoid runtime dispatch on array element type"
...)
It has a comment which says:
This and the corresponding -SET transform work equally well on non-simple
arrays, but after benchmarking (on x86), Nikodemus didn't find any cases
where it actually helped with non-simple arrays -- to the contrary, it
only made for bigger and up to 100% slower code.
The code for arrays is quite complex and it is hard to say why and how things are designed as they are. You should probably ask the SBCL developers on sbcl-help; see the mailing lists section on SourceForge.
Currently it seems preferable to favor simple arrays if possible.
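For comparison, here is a sketch of the same accessor declared on a simple-array (the name getref-simple is just for illustration); with that declaration SBCL no longer needs the runtime dispatch the note complains about:
(defun getref-simple (seq k)
  (declare (optimize (speed 3) (safety 0)))
  (declare (type (simple-array fixnum (*)) seq) (type fixnum k))
  (aref seq k))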

Avoiding float to pointer coercion in Common Lisp

I use SBCL (64-bit v1.4.0) for numerical calculation.
After enabling optimization, the following compiler note appears:
note: doing float to pointer coercion (cost 13) to "<return value>"
The code I use is as follows:
(defun add (a b)
  (declare (optimize (speed 3) (safety 0)))
  (declare (double-float a b))
  (the double-float (+ a b)))
I've also tried ftype and got the same note.
On the other hand, following code doesn't show the note:
(defun add-fixnum (a b)
  (declare (optimize (speed 3) (safety 0)))
  (declare (fixnum a b))
  (the fixnum (+ a b)))
I think double-float and fixnum are both 64 bits wide.
Why can't SBCL return a double-float value in a register, as C does? And is there any way to avoid the float to pointer coercion without inline expansion?
The problem is that Lisp data is dynamically typed, and the return value of a function has to include the type information. The type tag in most implementations is stored in the low-order bits of a value.
This allows a special optimization for fixnums. Their type tag is all zeroes, and the value is the integer shifted left by the number of bits in the type tag. When you add these values, the result still has zeroes in the tag bits, so you can perform arithmetic on the values using normal CPU operations.
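For example, with a single zero tag bit the fixnum 3 is stored as the machine word 6 (the value shifted left by one) and 4 as 8; adding the raw words gives 14, which is already the tagged representation of 7, so no untagging or retagging is needed.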
But this doesn't work for floating point values. After performing the CPU operations, it has to add the type tag to the value. This is what it means by "float to pointer coercion" (a more common word for it in many languages is "boxing").
Declaring the return type doesn't avoid this, because the callers don't necessarily have access to the declarations -- Lisp allows you to compile the callers in a separate compilation unit than the functions they call.
If you declare the function INLINE, then this doesn't need to be done, because the callers know the type it's returning, and the hardware value can be returned directly to them without adding the tag.
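A sketch of that approach (the caller add3 is just for illustration, and it must be compiled with the DECLAIM visible, e.g. later in the same file):
(declaim (inline add))
(defun add (a b)
  (declare (optimize (speed 3) (safety 0))
           (double-float a b))
  (+ a b))

;; a caller compiled after the DECLAIM can keep the intermediate
;; double-float unboxed
(defun add3 (a b c)
  (declare (optimize (speed 3) (safety 0))
           (double-float a b c))
  (+ (add a b) c))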
A more detailed explanation can be found in this ancient comp.lang.lisp thread. It's referring to CMUCL, which is what SBCL is derived from (notice that the wording of the warning is exactly the same).

Is it possible to override type-error behavior in Common Lisp upon out-of-range values for limited-range variables?

Sorry if the question's topic is oddly phrased (for lack of better terminology -- also one of the reasons I didn't find anything Googling this specific topic), so here's what I mean with an example.
Let's say this function foobar is defined:
(defun foobar (x)
  (declare (type (integer -100 100) x))
  (format T "X is ~A~%" x))
So with the declare line, x is an integer that must be -100, 100, or any integer in-between. Thus, doing this yields an error:
CL-USER> (foobar 101)
The value 101 is not of type (INTEGER -100 100).
[Condition of type TYPE-ERROR]
Restarts:
(blah blah blah)
Short of changing the function itself to explicitly do clamping, is there a way to specify an override behavior such that doing this, without altering the defun of foobar itself:
(foobar [any-value-over-100])
Clamps it to 100, and likewise with x < -100, without the function body itself having extra lines of code to do so?
Edit: To answer one responder, this is clamping -- keeping a value strictly within a defined minimum and maximum range. In Lisp, this is an example:
CL-USER> (defun clamp (x min max)
           (if (> x max)
               max
               (if (< x min)
                   min
                   x)))
CLAMP
CL-USER> (clamp 5 4 9)
5
CL-USER> (clamp -2 4 9)
4
CL-USER> (clamp 123 4 9)
9
While I can easily just make this a macro and put it at the beginning of any function (and I have an odd feeling this'll ultimately be what I'll have to do), this question is asking whether it's possible to tell the Common Lisp error handler to "just do this with the values instead!", rather than having it interrupt the entire program flow as it normally does.
Type declarations in Common Lisp
Your code:
(defun foobar (x)
  (declare (type (integer -100 100) x))
  (format T "X is ~A~%" x))
The consequences of calling the function above with something like (foobar 120) are entirely undefined in Common Lisp, because the argument violates the type declaration:
it may be completely ignored
it may lead to errors or various runtime problems
it may help the compiler to create better code (this is, by the way, the main reason for those declarations)
it may be type checked at compile time and/or at runtime; only very few Lisp compilers do this.
Portable runtime type checking in Common Lisp
If you want to portably check for runtime type errors use CHECK-TYPE or ASSERT.
(defun foobar (x)
  (check-type x (integer -100 100))
  (format T "X is ~A~%" x))
Advising
Extending functions without changing their source code is called 'advising' in Lisp. This is not in the Common Lisp standard for normal functions, but there should be tools for it and it is not that difficult to write such a thing.
Extending Generic Functions
Common Lisp has this mechanism built-in for generic functions. The standard method combination has :before, :after and :around advising.
(defmethod foobar ((x integer))
  (check-type x (integer -100 100))
  (format T "X is ~A~%" x))
In Common Lisp one cannot dispatch on arbitrary types, only on classes. There are classes for basic types like string, integer, and so on. Here we use the fact that x is an integer.
If you want to clamp foobar's x:
(defmethod foobar :around ((x integer))
  (call-next-method (clamp x -100 100)))
Above is an :around method. It calls the next method, the one above, with a changed argument. This is allowed as long as the class of the argument does not change the dispatch.
Alternative approach: Macro
One goal might be to write less code and have code more declarative.
Maybe one wants to write:
(defun-clamped foobar ((x (integer :min -100 :clamped-max 100)))
  (format T "X is ~A~%" x))
Then I would just write the defun-clamped macro, which expands into a normal DEFUN, which does the necessary things.
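A sketch of such a macro, using a simplified argument syntax of (type min max) rather than keywords, and assuming the CLAMP defined in the question (or ALEXANDRIA:CLAMP):
(defmacro defun-clamped (name ((arg (type min max))) &body body)
  ;; sketch for the single-argument case; the range declaration is made
  ;; only after clamping, so the body sees a value of type (TYPE MIN MAX)
  `(defun ,name (,arg)
     (let ((,arg (clamp ,arg ,min ,max)))
       (declare (type (,type ,min ,max) ,arg))
       ,@body)))

;; usage:
;; (defun-clamped foobar ((x (integer -100 100)))
;;   (format t "X is ~A~%" x))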
Ignore Declarations
If you compile the function with the appropriate settings, the type declaration will be ignored.
Redefine Function
Alternatively, you can redefine your function like this:
(defparameter *foobar-orig* (fdefinition 'foobar))
(defun foobar (x)
  (funcall *foobar-orig* (whatever-you-want x)))
Use restarts
Your best way forward is to replace declarations with check-type and establish appropriate handlers, e.g.,
(handler-bind ((type-error
                 (lambda (c)
                   (let ((et (type-error-expected-type c)))
                     (store-value (clamp (type-error-datum c)
                                         (second et) (third et)))))))
  (let ((x 100))
    (check-type x (integer 1 10))
    (print x)))
The standard does not provide for global error handlers, but implementations usually do.
If I understand you correctly, you want to ensure the integer is in that range. If that is the case, I don't think you should handle it with a type error, but with a (let ...), something like this:
(defun ensure-range (x low high)
  (cond ((< x low) low)
        ((> x high) high)
        (t x)))

(defun foobar (x)
  (let ((x (ensure-range x -100 100)))
    (format T "X is ~A~%" x)))
Don't declare x to be between -100 and 100 if callers may pass something outside that range. I think that an implementation might be free to allow any kind of memory corruption if a declaration is violated.
So, doing
(declare (optimize (safety 0)))
to avoid the declaration throwing an error is not really a good idea.
You can first clamp the value and then put the rest of the function definition into a LET form.
(defun foo (x)
  (declare (type integer x))
  (let ((x (clamp x -100 100)))
    (declare (type (integer -100 100) ; LET allows declarations, too!
                   x))
    (bar x)))
If you want a macro to do this for you, something like the following should work:
(defmacro defclamp (name (arg min max)
                    &body body)
  `(defun ,name (,arg)
     (declare (type real ,arg))
     (let ((,arg (clamp ,arg ,min ,max)))
       (declare (type (real ,min ,max)
                      ,arg))
       ,@body)))

How to know whether a racket variable is defined or not

How can you have different behaviour depending on whether a variable is defined or not in the Racket language?
There are several ways to do this. But I suspect that none of these is what you want, so I'll only provide pointers to the functions (and explain the problems with each one):
namespace-variable-value is a function that retrieves the value of a toplevel variable from some namespace. This is useful only with REPL interaction and REPL code though, since code that is defined in a module is not going to use these things anyway. In other words, you can use this function (and the corresponding namespace-set-variable-value!) to get values (if any) and set them, but the only use of these values is in code that is not itself in a module. To put this differently, using this facility is as good as keeping a hash table that maps symbols to values, only it's slightly more convenient at the REPL since you just type names...
More likely, these kind of things are done in macros. The first way to do this is to use the special #%top macro. This macro gets inserted automatically for all names in a module that are not known to be bound. The usual thing that this macro does is throw an error, but you can redefine it in your code (or make up your own language that redefines it) that does something else with these unknown names.
A slightly more sophisticated way to do this is to use the identifier-binding function -- again, in a macro, not at runtime -- and use it to get information about some name that is given to the macro and decide what to expand to based on that name.
The last two options are the more useful ones, but they're not the newbie-level kind of macros, which is why I suspect that you're asking the wrong question. To clarify, you can use them to write a kind of defined? special form that checks whether some name is defined, but that question is one that would be answered by a macro, based on the rest of the code, so it's not really useful to ask it. If you want something like that, to enable the kind of code you see in other dynamic languages where you use such a predicate, then the best way to go about this is to redefine #%top to do some kind of lookup (hashtable or global namespace) instead of throwing a compilation error -- but again, the difference between that and using a hash table explicitly is mostly cosmetic (and again, this is not a newbie thing).
First, read Eli's answer. Then, based on Eli's answer, you can implement the defined? macro this way:
#lang racket
; The macro
(define-syntax (defined? stx)
  (syntax-case stx ()
    [(_ id)
     (with-syntax ([v (identifier-binding #'id)])
       #''v)]))
; Tests
(define x 3)
(if (defined? x) 'defined 'not-defined) ; -> defined
(let ([y 4])
  (if (defined? y) 'defined 'not-defined)) ; -> defined
(if (defined? z) 'defined 'not-defined) ; -> not-defined
It works for this basic case, but it has a problem: if z is undefined, the branch of the if that considers that it is defined and uses its value will raise a compile-time error, because the normal if checks its condition value at run-time (dynamically):
; This doesn't work because z in `(list z)' is undefined:
(if (defined? z) (list z) 'not-defined)
So what you probably want is a if-defined macro, that tells at compile-time (instead of at run-time) what branch of the if to take:
#lang racket
; The macro
(define-syntax (if-defined stx)
  (syntax-case stx ()
    [(_ id iftrue iffalse)
     (let ([where (identifier-binding #'id)])
       (if where #'iftrue #'iffalse))]))
; Tests
(if-defined z (list z) 'not-defined) ; -> not-defined
(if-defined t (void) (define t 5))
t ; -> 5
(define x 3)
(if-defined x (void) (define x 6))
x ; -> 3