How to translate a LOOP into a DO inside a macro (common lisp)? - iteration

I'm currently reading through Seibel's "Practical common lisp" and found this example macro:
(defmacro check (&rest forms)
  `(progn
     ,@(loop for f in forms collect `(do-stuff ,f ',f))))
(defun test ()
  (check (= (+ 1 2) 3) (= (+ 1 2) 4)))
do-stuff then simply formats its two arguments in a way that allows 'testing' the truth of a form, but that's not important for my question.
What I was interested in was translating the loop into a DO. Unfortunately, I'm lacking the necessary skill to do so well:
(defmacro check (&rest forms)
  `(progn
     ,@(do ((index 0 (1+ index))
            (accumulator nil))
           ((= index (list-length forms)) accumulator)
         (push `(do-stuff ,(nth index forms) ',(nth index forms)) accumulator))))
This does the job, I can also do this (put every form into a variable inside the do):
(defmacro check (&rest forms)
  `(progn
     ,@(do* ((index 0 (1+ index))
             (accumulator nil)
             (f (nth index forms) (nth index forms)))
            ((= index (list-length forms)) accumulator)
         (push `(do-stuff ,f ',f) accumulator))))
My questions are the following:
Is there a more efficient way to write this do loop? Is this a good way to implement it?
Something in the LOOP version is making me wonder if there is not a simple way to extract an element of a list without the need to define an index variable, or to regroup the COLLECTED elements without the need to define an accumulator list...

If you use do you shouldn't use nth. Just iterate over the list, not the indexes.
(do ((l forms (cdr l))
     (accumulator nil))
    ((null l) (nreverse accumulator))
  (let ((f (car l)))
    (push `(do-stuff ,f ',f) accumulator)))
You can also use the built-in dolist:
(let ((accumulator nil))
  (dolist (f forms (nreverse accumulator))
    (push `(do-stuff ,f ',f) accumulator)))
Finally there's mapcar:
(mapcar (lambda (f) `(do-stuff ,f ',f)) forms)
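For completeness, here is a minimal sketch of the whole macro with the mapcar version spliced in. The do-stuff defined here is only a hypothetical stand-in (the question does not show the real one); it prints pass or FAIL followed by the form.

```lisp
;; DO-STUFF is a hypothetical stand-in for the reporting function
;; that the question leaves out.
(defun do-stuff (result form)
  (format t "~:[FAIL~;pass~] ... ~a~%" result form)
  result)

(defmacro check (&rest forms)
  ;; MAPCAR builds one (do-stuff <form> '<form>) call per form, in order,
  ;; and ,@ splices the resulting list into the PROGN.
  `(progn
     ,@(mapcar (lambda (f) `(do-stuff ,f ',f)) forms)))

;; (macroexpand-1 '(check (= (+ 1 2) 3) (= (+ 1 2) 4)))
;; expands to a form equivalent to:
;; (progn (do-stuff (= (+ 1 2) 3) '(= (+ 1 2) 3))
;;        (do-stuff (= (+ 1 2) 4) '(= (+ 1 2) 4)))
```

Because mapcar preserves the order of its input, no nreverse is needed here.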

Is there a more efficient way to write this do loop ? Is this a good way to implement it ?
The complexity of your code is quadratic in the size N of the list: for each item you call nth to access an element, and nth itself walks the list from the start, resulting in O(N*N) execution time. There is a more efficient way to do it (the original LOOP version is an example of a linear algorithm).
Here is a different version where, instead of calling push followed by nreverse, the items are queued at the end of the list during traversal. I added comments to explain what each part does.
By the way, I don't claim that this is more efficient than using nreverse; I think we can't know without testing. Note however that there are as many operations in both cases (cons a new item, and eventually mutate the cdr slot); they are just done either in two passes or in one.
In fact the code below is not very far from being an implementation of MAPCAR for the case where there is only one list to traverse (not the variadic version in the standard).
First, define a helper function that transforms one form:
(defun expand-check (form)
  `(do-stuff ,form ',form))
Recall that you could just (mapcar #'expand-check checks) to have the desired result.
Anyway, here is a DO version:
(defun expand-checks (checks)
  ;; LIST-HOLDER is just a temporary cons cell that lets us handle the
  ;; rest of the queue operations without worrying about the corner
  ;; case of the first item (the invariant is that LAST is always a
  ;; cons cell, never NIL). Here LIST-HOLDER is initially (:HANDLE),
  ;; the first value being discarded later.
  (let ((list-holder (list :handle)))
    ;; DO is sufficient because the iterator values are independent
    ;; of each other (no need to use DO*).
    (do (;; descend the input list
         (list checks (cdr list))
         ;; update LAST so that it is always the last cons cell; this
         ;; is how we can QUEUE items at the end of the list without
         ;; traversing it. This queue implementation was first
         ;; described by Peter Norvig, as far as I know.
         (last list-holder (cdr last)))
        ;; End iteration when LIST is empty
        ((null list)
         ;; In which case, return the rest of the LIST-HOLDER, which
         ;; is the start of the list that was built.
         (rest list-holder))
      ;; BODY of the DO: create a new cons cell at the end of the
      ;; queue by mutating the LAST cons cell.
      (setf (cdr last)
            (list (expand-check
                   (first list)))))))
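The same tail-pointer technique can be packaged as a small general helper. This is only a sketch: the name map-queue is made up for illustration, it handles a single list like the DO version above, and it uses dolist with an explicit tail variable instead of a second DO iterator.

```lisp
(defun map-queue (fn list)
  "Hypothetical helper: like single-list MAPCAR, built with a tail pointer."
  (let* ((head (list :handle))   ; dummy first cons, dropped at the end
         (tail head))            ; always the last cons of the result
    (dolist (x list (rest head))
      ;; SETF assigns its pairs in order: first hang a new cons off the
      ;; current tail, then advance TAIL to that new cons.
      (setf (cdr tail) (list (funcall fn x))
            tail (cdr tail)))))

;; (map-queue #'1+ '(1 2 3)) => (2 3 4)
```

The result comes out in input order in a single pass, so no nreverse is needed.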

Firstly, anything of the form
(loop for v in <list> collect (f v ...))
can easily be expressed with mapcar:
(mapcar (lambda (v)
          (f v ...))
        <list>)
The interesting case is when the loop only collects a value sometimes, or when the iteration is over some more complicated thing.
In that case one nice approach is to factor out the iteration bit and the 'collecting values' bit, using do or whatever to perform the iteration and some other mechanism to collect values.
One such is collecting. So, for instance, you could use dolist to iterate over the list and collecting to collect values. And perhaps we might only want to collect non-nil values or something to make it more interesting:
(collecting
  (dolist (v <list>)
    (when v
      (collect (f v ...)))))
All of these are more verbose than the simple loop case, but for instance collecting can do things which are painful to express with loop.
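Note that collecting is not part of standard Common Lisp. As a minimal sketch of how such a macro could be written (an assumption for illustration, not necessarily how any existing library implements it), a local collect function can append to a list through a tail pointer:

```lisp
(defmacro collecting (&body body)
  "Evaluate BODY; within it, (collect x) appends X to a list, which is returned."
  (let ((head (gensym "HEAD"))
        (tail (gensym "TAIL")))
    `(let* ((,head (list nil))   ; dummy first cons, dropped at the end
            (,tail ,head))
       (flet ((collect (item)
                ;; append ITEM after the current tail, then advance the tail
                (setf (cdr ,tail) (list item)
                      ,tail (cdr ,tail))
                item))
         ,@body
         (cdr ,head)))))

(collecting
  (dolist (v '(1 nil 2 nil 3))
    (when v
      (collect (* 10 v)))))
;; => (10 20 30)
```

Because collect is an ordinary local function, it can be called from anywhere inside the body, including nested loops and conditionals, which is what makes this approach more flexible than a single loop clause.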

Related

How can I use types so that generic operations are inlined (or "open coded") in sbcl?

SBCL compiler optimizations are based on the idea that if a type is declared, then "open coding" allows generic operations to be replaced with specific ones.
For example
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
Will allow the generic + to be replaced with a single instruction for fixnum.
However, I have found that in practice, this seems to rarely be possible because:
In order for a function to be specialized/optimized it must be inlinable. The declaration must be marked explicitly with a (declaim (inline ...)), so the author of a function must anticipate that others might want to inline it. (In theory the compiler could generate multiple versions, but this doesn't seem to be the case.)
Most standard functions do not appear inlineable.
For example, one would expect that the following declaration is sufficient for open coding to take place:
(defun max-integers (array)
  (declare (optimize (speed 3) (space 0) (safety 0)))
  (declare (inline reduce))
  (declare (type (simple-array fixnum (*)) array))
  (reduce (lambda (a b) (if (> b a) b a)) array))
However, the assembly shows it's making a function call to the generic reduce:
; Size: 22 bytes. Origin: #x1001BC8109
; 09: 488B15B0FFFFFF MOV RDX, [RIP-80] ; no-arg-parsing entry point
; #<FUNCTION (LAMBDA
; # ..)>
; 10: B904000000 MOV ECX, 4
; 15: FF7508 PUSH QWORD PTR [RBP+8]
; 18: B8781C3220 MOV EAX, #x20321C78 ; #<FDEFN REDUCE>
; 1D: FFE0 JMP RAX
The conclusion seems to be that the compiler cannot actually do much type optimization, as each usage of reduce, map, etc is a barrier to type propagation, and they are building blocks of everything else.
How can I overcome this and take advantage of optimizations by declaring types?
I really want to avoid writing type specific versions of each function or "macroifying" what should be a function.
I think one answer is that if you want to write FORTRAN-style array-bashing code, write FORTRAN-style array-bashing code. In particular using things like reduce is probably not the way to do this.
For instance if you change your function to the perfectly readable
(defun max-integers/loop (array)
  (declare (optimize (speed 3) (space 0) (safety 0))
           (type (simple-array fixnum (*)) array))
  (loop for i of-type fixnum across array
        maximizing i))
Then SBCL does a far, far better job of optimising it.
It's worth pointing out another confusion in your question: You say that for something like
(defun add (a b)
  (declare (type fixnum a b))
  (+ a b))
SBCL will optimize + to the machine instruction. No, it won't. The reason it won't is because the fixnum type is not closed under addition: consider what (add most-positive-fixnum 1) should do. If you want to generate very fast code for integers you need to make sure that your integer types are small enough that the compiler can be sure that the operations you're doing on them remain machine integers (or, if you want to live dangerously, cover your code with (the fixnum ...) and set safety to 0 when compiling, which seems to allow the compiler to just return the wrong answer for addition in the way people usually expect computers to do).
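A small sketch of the point about integer ranges. The function name add-small is made up for illustration, and the choice of (signed-byte 32) assumes a 64-bit implementation in which (signed-byte 33) fits in a fixnum; whether the compiler then emits a plain machine add is implementation-dependent (in SBCL you can inspect the result with disassemble).

```lisp
;; The sum below is a bignum, so FIXNUM is not closed under +.
(typep (+ most-positive-fixnum 1) 'fixnum)   ; => NIL

;; With narrower argument types the result range is known: the sum of
;; two (signed-byte 32) values always fits in (signed-byte 33).
(defun add-small (a b)
  (declare (type (signed-byte 32) a b))
  (+ a b))

(add-small 2000000000 2000000000)            ; => 4000000000
```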
You can't force the implementation to open-code functions that weren't declared INLINE when they were defined: it simply hasn't saved the information needed.
However, the overhead of calling REDUCE is probably negligible compared to the actual processing. So what you can do is declare the types of a and b, to optimize the callback function.
(reduce (lambda (a b) (declare (type fixnum a b)) (if (> b a) b a)) array)
I guess you were hoping that if it open-coded reduce it would automatically propagate this type from the declaration of array, so you wouldn't need to do this.

Is it possible / what are examples of using hygienic macros for the compile time computational optimization?

I've been reading through https://lispcast.com/when-to-use-a-macro, and it states (about clojure's macros)
Another example is performing expensive calculations at compile time as an optimization
I looked up, and it seems clojure has unhygienic macros. Can this also be applied to hygienic ones? Particularly talking about Scheme. As far as I understand hygienic macros, they only transform syntax, but the actual execution of code is deferred until the runtime no matter what.
Yes. Macro hygiene just refers to whether or not macro expansion can accidentally capture identifiers. Whether or not a macro is hygienic, regular macro expansion (as opposed to reader macro expansion) occurs at compile time. Macro expansion replaces the macro call with the result of executing the macro. Two major use cases for macros are transforming syntax (e.g. DSLs) and enhancing performance by eliminating computations at run time, or both.
A few examples come to mind:
You prefer to write your code with angles in degrees but all of the calculations are actually in radians. You could have macros eliminate these trivial, but unnecessary (at run time) conversions, at compile time.
Memoization is a broad example of compute optimization that macros can be used for.
You have a string representing a SQL statement or complex textual math expression which you want to parse and possibly even execute at compile time.
You could also combine the examples and have a memoizing SQL parser. Pretty much any scenario where you have all the necessary inputs at compile time and can therefore compute the result is a candidate.
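The degrees-to-radians case can be sketched as follows. Note the assumptions: this uses a Common Lisp defmacro (so it is not a hygienic Scheme macro, although nothing in it depends on capturing identifiers), and the name deg->rad is made up for illustration. When the argument is a literal number the conversion happens during macroexpansion; otherwise only the conversion factor is precomputed.

```lisp
(defmacro deg->rad (degrees)
  (if (realp degrees)
      ;; literal angle: do the whole conversion at macroexpansion time
      (* degrees (/ pi 180))
      ;; anything else: precompute the factor, multiply at run time
      `(* ,degrees ,(/ pi 180))))

;; (macroexpand-1 '(deg->rad 180)) returns a number close to PI, so no
;; arithmetic is left to do at run time.
;; (macroexpand-1 '(deg->rad x)) returns (* x <precomputed factor>).
```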
Yes, hygienic macros can do this sort of thing. As an example here is a macro called plus in Racket which is like + except that, at macroexpansion-time, it sums sequences of adjacent literal numbers. So it does some of the work you might expect to be done at run-time at macroexpansion-time (so, effectively, at compile-time). So, for instance
(plus a b 1 2 3 c 4 5)
expands to
(+ a b 6 c 9)
Some notes on this macro.
It's probably not very idiomatic Racket, because I'm a mostly-unreformed CL hacker, which means I live in a cave and wear animal skins and say 'ug' a lot. In particular I am sure I should use syntax-parse but I can't understand it.
It might not even be right.
There are subtleties with arithmetic which mean that this macro can return different results than +. In particular + is defined to add pairwise from left to right, while plus does not in general: all the literals get added first. So, in particular (assuming you have done (require racket/flonum), and +max.0 &c have the same values as they do on my machine), (+ -max.0 1.7976931348623157e+308 1.7976931348623157e+308) has a value of 1.7976931348623157e+308, while (plus -max.0 1.7976931348623157e+308 1.7976931348623157e+308) has a value of +inf.0, because the two literals get added first and this overflows.
In general this is a useless thing: it's safe to assume, I think, that any reasonable compiler will do this kind of optimisation for you. The only purpose of it is to show that it's possible to detect and compile away compile-time constants.
Remarkably, at least from the point of view of caveman-lisp users like me, you can treat this just like + because of the last clause in the syntax-case: it works to say (apply plus ...) for instance (although no clever optimisation happens in that case of course).
Here it is:
(require (for-syntax racket/list))
(define-syntax (plus stx)
(define +/stx (datum->syntax stx +))
(syntax-case stx ()
[(_)
;; return additive identity
#'0]
[(_ a)
;; identity with one argument
#'a]
[(_ a ...)
;; the interesting case: there's more than one argument, so walk over them
;; looking for literal numbers. This is probably overcomplicated and
;; unidiomatic
(let* ([syntaxes (syntax->list #'(a ...))]
[reduced (let rloop ([current (first syntaxes)]
[tail (rest syntaxes)]
[accum '()])
(cond
[(null? tail)
(reverse (cons current accum))]
[(and (number? (syntax-e current))
(number? (syntax-e (first tail))))
(rloop (datum->syntax stx
(+ (syntax-e current)
(syntax-e (first tail))))
(rest tail)
accum)]
[else
(rloop (first tail)
(rest tail)
(cons current accum))]))])
(if (= (length reduced) 1)
(first reduced)
;; make sure the operation is our +
#`(#,+/stx #,@reduced)))]
[_
;; plus on its own is +, but we want our one. I am not sure this is right
+/stx]))
It is possible to do this even more aggressively in fact, so that (plus a b 1 2 c 3) is turned into (+ a b c 6). This has probably even more exciting might-get-different answers implications. It's worth noting what the CL spec says about this:
For functions that are mathematically associative (and possibly commutative), a conforming implementation may process the arguments in any manner consistent with associative (and possibly commutative) rearrangement. This does not affect the order in which the argument forms are evaluated [...]. What is unspecified is only the order in which the parameter values are processed. This implies that implementations may differ in which automatic coercions are applied [...].
So an optimisation like this is clearly legal in CL: I'm not clear that it's legal in Racket (although I think it should be).
(require (for-syntax racket/list))
(define-for-syntax (split-literals syntaxes)
;; split a list into literal numbers and the rest
(let sloop ([tail syntaxes]
[accum/lit '()]
[accum/nonlit '()])
(if (null? tail)
(values (reverse accum/lit) (reverse accum/nonlit))
(let ([current (first tail)])
(if (number? (syntax-e current))
(sloop (rest tail)
(cons (syntax-e current) accum/lit)
accum/nonlit)
(sloop (rest tail)
accum/lit
(cons current accum/nonlit)))))))
(define-syntax (plus stx)
(define +/stx (datum->syntax stx +))
(syntax-case stx ()
[(_)
;; return additive identity
#'0]
[(_ a)
;; identity with one argument
#'a]
[(_ a ...)
;; the interesting case: there's more than one argument: split the
;; arguments into literals and nonliterals and handle appropriately
(let-values ([(literals nonliterals)
(split-literals (syntax->list #'(a ...)))])
(if (null? literals)
(if (null? nonliterals)
#'0
#`(#,+/stx #,@nonliterals))
(let ([sum/stx (datum->syntax stx (apply + literals))])
(if (null? nonliterals)
sum/stx
#`(#,+/stx #,@nonliterals #,sum/stx)))))]
[_
;; plus on its own is +, but we want our one. I am not sure this is right
+/stx]))

Lisp, iterating backwards

Is there a way (with loop or iterate, doesn't matter) to iterate over sequence backwards?
Apart from (loop for i downfrom 10 to 1 by 1 do (print i)), which works with indexes and requires the length, or (loop for elt in (reverse seq)), which requires reversing the sequence (even worse than the first option).
For lists the easiest is (dolist (x (reverse list)) ..) or using the more efficient nreverse if the list can be modified.
For vectors an alternative is dotimes with index calculation, something like:
(let* ((vec #(1 2 3))
       (len (length vec)))
  (dotimes (i len)
    (print (aref vec (- len i 1)))))
Typically lists are iterated over from the start as each cons points to the next. Doing it from the back is inherently inefficient.
If you nevertheless have a list and want fast reverse or random access, an option is to coerce it to a vector using e.g. (coerce my-list 'vector) and then access the elements using aref (or coerce to simple-vector and use svref).
If you are the one building the list, consider creating an adjustable vector with fill-pointer (see make-array documentation) and then use vector-push-extend to add items. That gives fast random access from the beginning.
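A minimal sketch of that last suggestion, with a made-up function name (collect-backwards): items are pushed onto an adjustable vector with a fill pointer, and the vector is then walked from the last index down to 0 without reversing or copying anything.

```lisp
(defun collect-backwards (items)
  "Hypothetical example: store ITEMS in a growable vector, then read it back to front."
  (let ((vec (make-array 0 :adjustable t :fill-pointer 0)))
    (dolist (x items)
      (vector-push-extend x vec))          ; amortised constant-time append
    ;; LENGTH respects the fill pointer, so this visits exactly the
    ;; elements pushed so far, last one first.
    (loop for i from (1- (length vec)) downto 0
          collect (aref vec i))))

(collect-backwards '(a b c))   ; => (C B A)
```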
Iterate can do it:
(iterate (for x :in-sequence #(1 2 3) :downto 0)
  (princ x))
; => 321
As others have noted, this will be very inefficient if used on lists.

Why can't I delete the current element when iterating over a list?

I want to iterate over a list, perform an action with the elements and based on some criteria, I want to get rid of the active element. However, when using the function below I end up in an infinite loop.
(defun foo (list action test)
  (do ((elt (car list) (car list)))
      ((null list))
    (funcall action elt)
    (when (funcall test elt)
      (delete elt list))))
(setq list '(1 2 3 4))
(foo list #'pprint #'oddp)
-> infinite loop
Is it not possible as it points to itself? In the end, elt is (car list) of course.
Is this a correct assessment? And how could I solve this efficiently?
The loop is infinite because you are not iterating over anything: list is never rebound, so elt is always the same first element and the action is applied to it repeatedly. Since the action (pprint here) does not mutate the element, a negative test result stays negative, so the list would never become empty even if the deletion worked the way you attempted it.
DELETE is a destructive function. In Common Lisp destructive operations are allowed to destroy their argument. You are supposed to discard any references to the argument and use only the return value. After the operation is completed there are no guarantees about the state of the argument. In particular, there might be no effect as implementations are also allowed to act identically to a non-destructive counterpart, but usually the component parts of the sequence will be reassembled in some difficult to predict way. You are also destroying a literal in your example, which has undefined behaviour and it should be avoided.
It is generally best to treat lists in Common Lisp as immutable, and destructive operations as a micro-optimization which should only be used when you are sure they won't break anything. For this problem you might want to iterate over the list using LOOP, assembling the result list with a conditional COLLECT. See the LOOP chapter of PCL.
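A minimal sketch of that LOOP approach, using a made-up name (foo/collect): the action is applied to every element, and a fresh list is built from the elements that do not satisfy the test, leaving the argument untouched.

```lisp
(defun foo/collect (list action test)
  "Call ACTION on each element; return a new list without the elements passing TEST."
  (loop for elt in list
        do (funcall action elt)
        unless (funcall test elt)
          collect elt))

;; (foo/collect (list 1 2 3 4) #'pprint #'oddp) prints 1 to 4 and
;; returns (2 4); the original list is not modified.
```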
Actually you can alter the state of your list while iterating over it. You will just have to use rplacd in addition to delete, and control the advancement along the list not in the iteration clause, but inside the do body:
(defun nfoo (lst action test)
  ;; HEAD is a sentinel cons in front of LST, so that deleting the first
  ;; element needs no special case; the result is always (cdr head).
  (let ((head (cons nil lst)))
    (do ((list head))
        ((null (cdr list)) (cdr head))
      (let ((elt (cadr list)))
        (funcall action elt)
        (if (funcall test elt)
            (rplacd list (delete elt (cddr list)))
            (setf list (cdr list)))))))
You should call it via copy-list if you don't want it to destroy the argument list.
If you want to remove not just the elements equal to the elt that passed the test, but all elements that would pass the test, replace the delete call with delete-if and pass it the test function.
(edit:) or, even simpler and more straightforward, like this version (it still relies on the destructive delete, and is useful only for the side effects of action):
(defun foo (list action test)
  (do* ((elt (car list) (car list)))
       ((null list))
    (funcall action elt)
    (if (funcall test elt)
        (setf list (delete elt list))
        (setf list (cdr list)))))
I'm a bit new to lisp, so perhaps I'm missing something in your question. Still, I think I understand what you're asking, and I wonder why you're not using some existing structures for this... namely remove-if-not (or remove-if if I have things backwards) and mapcar...
(mapcar #'pprint (remove-if-not #'oddp '(1 2 3 4)))
The above prints 1 and 3 (and returns (nil nil), but presumably you can ignore that... or you could do a defun that does the above and ends with (values)). (If you wanted the evens, change remove-if-not to remove-if.)
This strikes me as perhaps a more sensible way to go about things, unless you're doing this for pedagogical reasons or I'm missing something... either of which I admit is quite possible. :)
P.S. Hyperspec info on remove-if, remove-if-not, etc.

How does this Scheme list iterator use call-with-current-continuation?

I'm trying to read this code:
(define list-iter
  (lambda (a-list)
    (define iter
      (lambda ()
        (call-with-current-continuation control-state)))
    (define control-state
      (lambda (return)
        (for-each
         (lambda (element)
           (set! return (call-with-current-continuation
                         (lambda (resume-here)
                           (set! control-state resume-here)
                           (return element)))))
         a-list)
        (return 'list-ended)))
    iter))
Can anyone explain how call-with-current-continuation works in this example?
Thanks
The essence of call-with-current-continuation, or call/cc for short, is the ability to grab checkpoints, or continuations, during the execution of a program. Then, you can go back to those checkpoints by applying them like functions.
Here's a simple example where the continuation isn't used:
> (call/cc (lambda (k) (+ 2 3)))
5
If you don't use the continuation, it's hard to tell the difference. Here's a few where we actually use it:
> (call/cc (lambda (k) (+ 2 (k 3))))
3
> (+ 4 (call/cc (lambda (k) (+ 2 3))))
9
> (+ 4 (call/cc (lambda (k) (+ 2 (k 3)))))
7
When the continuation is invoked, control flow jumps back to where the continuation was grabbed by call/cc. Think of the call/cc expression as a hole that gets filled by whatever gets passed to k.
list-iter is a substantially more complex use of call/cc, and might be a difficult place to begin using it. First, here's an example usage:
> (define i (list-iter '(a b c)))
> (i)
a
> (i)
b
> (i)
c
> (i)
list-ended
> (i)
list-ended
Here's a sketch of what's happening:
1. list-iter returns a procedure of no arguments, i.
2. When i is invoked, we grab a continuation immediately and pass it to control-state. When that continuation, bound to return, is invoked, we'll immediately return to whoever invoked i.
3. For each element in the list, we grab a new continuation and overwrite the definition of control-state with that new continuation, meaning that we'll resume from there the next time step 2 comes along.
4. After setting up control-state for the next time through, we pass the current element of the list back to the return continuation, yielding an element of the list.
5. When i is invoked again, repeat from step 2 until the for-each has done its work for the whole list.
6. Invoke the return continuation with 'list-ended. Since control-state isn't updated, it will keep returning 'list-ended every time i is invoked.
As I said, this is a fairly complex use of call/cc, but I hope this is enough to get through this example. For a gentler introduction to continuations, I'd recommend picking up The Seasoned Schemer.
Basically it takes a function f as its parameter, and applies f to the current context/state of the program.
From wikipedia:
(define (f return)
  (return 2)
  3)
(display (f (lambda (x) x))) ; displays 3
(display (call-with-current-continuation f)) ; displays 2
So basically when f is called without the current continuation (cc), its argument (here the identity lambda) is applied to 2, the result is discarded, and f returns 3. When f is called with the current continuation, the argument is the continuation itself: applying it to 2 forces the program to jump back to the point where call-with-current-continuation was called, which thus returns 2. It can be used to generate returns, or to suspend execution flow.
If you know C, think about it like this: in C, you can take a pointer to a function. You also have a return mechanism. Suppose the return took a parameter of the same type the function takes. Suppose you could take its address and store that address in a variable or pass it as a parameter, and allow functions to return for you. It can be used to mimic throw/catch, or as a mechanism for coroutines.
This is essentially:
(define (consume)
  (write (call/cc control)))
(define (control ret)
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 1))))
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 2))))
  (set! ret (call/cc (lambda (resume)
                       (set! control resume)
                       (ret 3)))))
(consume)
(consume)
(consume)
Hope it is easier to understand.