SBCL optimization: function type declaration

If I have a function that accepts a function argument, I can declare that argument to be a function for optimization purposes, say:
(defun foo (f)
  (declare (type function f))
  ...)
However, I can be even more specific:
(defun foo (f)
  (declare (type (function (double-float) double-float) f))
  ...)
i.e. telling it that f accepts one double-float argument and returns one double-float value. SBCL, however, seems to perform better optimization on the former; for the latter it emits a note that it cannot optimize away a possible call to FDEFINITION (compile with an (optimize (speed 3)) declaration to reproduce).
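For reference, a minimal snippet that should reproduce the note on this SBCL version (the funcall body is mine, added for illustration):
(defun foo (f)
  (declare (optimize (speed 3))
           (type (function (double-float) double-float) f))
  (funcall f 1.0d0))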
So, my questions are:
Am I doing something wrong? If SBCL did exactly the same thing for plain function and for (function ...), I would be OK with it, but it actually does worse for the more specific declaration. Or should this be considered a bug in SBCL?
Are function type declarations generally useless for optimization purposes in CL for some reason?
SysInfo: SBCL 1.3.18

From the SBCL Manual (4.2.3 Getting Existing Programs to Run):
Some incorrect declarations can only be detected by run-time type
checking [...] because the SBCL compiler does much more
type inference than other Common Lisp compilers, so an incorrect
declaration can do more damage.
It's possible that this is why your function does worse with the more specific type declaration included.
Further:
The most common problem is with variables whose constant initial value
doesn't match the type declaration. Incorrect constant initial values
will always be flagged by a compile-time type error, and they are
simple to fix once located. Consider this code fragment:
(prog (foo)
  (declare (fixnum foo))
  (setq foo ...)
  ...)
Here foo is given an initial value of nil, but is declared to be a fixnum. Even if it is never read, the initial value of a
variable must match the declared type. There are two ways to fix this
problem. Change the declaration
(prog (foo)
  (declare (type (or fixnum null) foo))
  (setq foo ...)
  ...)
or change the initial value
(prog ((foo 0))
  (declare (fixnum foo))
  (setq foo ...)
  ...)
This is from the manual for the current version of SBCL (1.4), so it may or may not apply to your situation.

Related

If CLOS had a compile-time dispatch, what would happen to this code snippet?

I am reading the Wikipedia article about CLOS.
It says that:
This dispatch mechanism works at runtime. Adding or removing methods thus may lead to changed effective methods (even when the generic function is called with the same arguments) at runtime. Changing the method combination also may lead to different effective methods.
Then, I inserted:
; declare the common argument structure prototype
(defgeneric f (x y))
; define an implementation for (f integer t), where t matches all types
(defmethod f ((x integer) y) 1)
Using SBCL and SLIME, I compiled the regions of code and got the following result:
CL-USER> (f 1 2)
1
Then, I added to the definition:
; define an implementation for (f integer real)
(defmethod f ((x integer) (y real)) 2)
Again, I repeated the process, compiling the new region and using the REPL to eval:
CL-USER> (f 1 2.0)
2
First question: if CLOS had the opposite behavior of run-time dispatch (compile-time dispatch, I suppose), what would the result be?
Second question: I decided to comment out the second method, leaving just the generic function and the first method. Then, I re-compiled the region with Emacs.
When calling the function f in the REPL with (f 1 2.0), I thought I would get 1, since the second method is out. Instead, I got 2.
CL-USER> (f 1 2.0)
2
Why did this happen?
The only way I can get (f 1 2.0) to return 1 again is restarting the SLIME REPL and re-compiling the region (with the second method commented out). Third question: is there a better way to get this result without having to restart the REPL?
Thanks!
These two are actually the same question. The answer is: you are modifying a system while it is running.
If CLOS objects weren't re-definable at run time, this would simply not work, or you'd not be allowed to do it. Try such re-definitions with basic structs (i.e. the things you get when using defstruct), and you will often run into pretty severe warnings or even errors when the change is not compatible. Of course, structs have other limitations, too, e.g. only single dispatch, so it's not so easy to make an exactly analogous example. But try to remove a slot from a defstruct.
Just commenting out some source code doesn't change the fact that you evaluated (compiled and loaded) it before. You are manipulating a running system, and the source code is just that: source. If you want to remove a method from the running system, you can use remove-method (see also How to remove a defmethod for a struct). Most Lisp IDEs have ways to do that interactively, e.g. in SLIME using the SLIME inspector.
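For the example above, removing the second method at run time would look something like this (a sketch, assuming the generic function f and the two methods defined earlier):
;; Find the method specialized on (INTEGER REAL) and remove it:
(remove-method #'f
               (find-method #'f '() (list (find-class 'integer)
                                          (find-class 'real))))
;; After this, (f 1 2.0) dispatches to the first method again and returns 1.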

Common Lisp: How do I set a variable in my parent's lexical scope?

I want to define a function (not a macro) that can set a variable in the scope it is called from.
I have tried:
(defun var-set (var-str val)
  (let ((var-interned
          (intern (string-upcase var-str))))
    (set var-interned val)))
(let ((year "1400"))
  (var-set "year" 1388)
  (labeled identity year))
Which doesn't work because of the scoping rules. Any "hacks" to accomplish this?
In Python, I can use
previous_frame = sys._getframe(1)
previous_frame_locals = previous_frame.f_locals
previous_frame_locals['my-var'] = some_value
Any equivalent API for lisp?
You cannot do that because after compilation the variable might not even exist in any meaningful sense.
E.g., try to figure out by looking at the output of (disassemble (lambda (x) (+ x 4))) where you
would write the new values of x.
You have to tell both the caller and the callee (at compile time!) that the variable is special:
(defun set-x (v)
  (declare (special x))
  (setq x v))

(defun test-set (a)
  (let ((x a))
    (declare (special x))
    (set-x 10)
    x))
(test-set 3)
==> 10
See Dynamic and Lexical variables in Common Lisp for further details on lexical vs dynamic bindings.
You can't. This is why it is called lexical scope: you have access to variable bindings if and only if you can see them. The only way to get at such a binding is to create some object for which it is visible and use that. For instance:
(defun foo (x)
  (bar (lambda (&optional (v nil vp))
         (if vp (setf x v) x))))

(defun bar (a)
  ...
  (funcall a ...))
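A concrete version of the same idea, with a trivial bar filled in for illustration (the multiply-by-ten behavior is mine):
(defun bar (accessor)
  ;; Read the caller's X through the closure, then overwrite it.
  (let ((old (funcall accessor)))
    (funcall accessor (* old 10))
    old))

(defun foo (x)
  (bar (lambda (&optional (v nil vp))
         (if vp (setf x v) x)))
  x)

;; (foo 3) => 30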
Some languages, such as Python, have both a rather rudimentary implementation of variable bindings and a single (or mandated) implementation, which allows you to essentially poke around inside the system to subvert lexical scoping. Some CL implementations may have rudimentary implementations of variable bindings (probably none do), but Common Lisp the language does not mandate such implementations, nor should it.
As an example of the terrible consequences of mandating that some kind of access to lexical variables must be allowed consider this:
(defun outer (n f)
  (if (> n 0)
      (outer (g n) f)
      (funcall f)))
If f could somehow poke at the lexical bindings of outer this would mean that all those bindings would need to exist at the point f was called: tail-call elimination would thus be impossible. If the language mandated that such bindings should be accessible then the language is mandating that tail-call elimination is not possible. That would be bad.
(It is quite possible that implementations, possibly with some debugging declarations, allow such access in some circumstances. That's very different from the language mandating such a thing.)
What are you trying to achieve?
What about a closure? Write the defun inside the let, and create an accessor and a setter function if needed.
(let ((year "1400"))
  (defun current-year ()
    year)
  (defun set-year (val)
    (setf year val)))
and
CL-USER> (current-year)
"1400"
CL-USER> (set-year 1200)
1200
CL-USER> (current-year)
1200
That Python mechanism violates the encapsulation which motivates the existence of lexical scope.
Because a lexical scope is inaccessible by any external means other than invocations of function bodies which are in that scope, a compiler is free to translate a lexical scope into any representation which performs the same semantics. Variables named in the source code of the lexical scope can disappear entirely. For instance, in your example, all references to year can be replaced by copies of the pointer to the "1400" string literal object.
Separately from the encapsulation issue there is also the consideration that a function does not have any access at all to a parent lexical scope, regardless of that scope's representation. It does not exist. Functions do not implicitly pass their lexical scope to children. Your caller may not have a lexical environment at all, and there is no way to know. The essence of the lexical environment is that no aspect of it is passed down to children, other than via the explicit passage of lexical closures.
Python's feature is poorly considered because it makes programs dependent on the representation of scopes. If a compiler like PyPy is to make that code work, it has to constrain its treatment of lexical scopes to mimic the byte code interpreted version of Python.
Firstly, each function has to know who called it, so it has to receive some parameter(s) about that, including a link to the caller's environment. That's going to be a source of inefficiency even in code that doesn't take advantage of it.
The concept of a well-defined "previous frame" means that the compiler cannot merge together frames. Code which expects some variable to be in the third frame up from here will break if those frames are all inlined together due to a nested lexical scope being flattened, or due to function inlining.
As soon as you provide an interface to the parent lexical environment, and applications start using it, you no longer have lexical scoping. You have a form of dynamic scoping with lexical-like visibility rules.
The application logic can implement de facto dynamic scope on top of this API, because you can write a loop which searches for a variable across the chain of lexical scopes. Does my parent have an x variable? If not, does the grandparent, if there is one? You can search the dynamic chain of invocations for the most recent one which binds x, and that is dynamic scope.
There is nothing wrong with dynamic scope, if it is a separate discipline that is not entangled in the implementation of lexical scope.
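Common Lisp already provides exactly that separate discipline, in the form of special variables; a minimal sketch:
(defvar *x* 1)      ; *X* is special, i.e. dynamically scoped

(defun show-x ()
  *x*)              ; reads whatever binding is current at call time

(let ((*x* 2))      ; rebinds *X* dynamically for the extent of the LET
  (show-x))         ; => 2

(show-x)            ; => 1 again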
That said, an API for tracing frames and getting at local variables is the sort of introspection that is very useful in developing a debugger. Another angle on this is that if you work that API into an application, you're using debugging features in production.
;; Register closures under names in a dynamic variable, then go
;; through them explicitly:
(defvar *lexical-variables* '())

(defun get-var (name)
  (let ((var (cdr (assoc name *lexical-variables*))))
    (unless var (error "No lexical variable named ~S" name))
    var))

(defun deref (var)
  (funcall (if (symbolp var)
               (or (cdr (assoc var *lexical-variables*))
                   (error "No lexical variable named ~S" var))
               var)))

(defun (setf deref) (new-value var)
  (funcall (if (symbolp var)
               (or (cdr (assoc var *lexical-variables*))
                   (error "No lexical variable named ~S" var))
               var)
           new-value))

(defmacro with-named-lexical-variable ((&rest vars) &body body)
  (let ((vnew-value (gensym))
        (vsetp (gensym)))
    `(let ((*lexical-variables*
             (list* ,@(mapcar (lambda (var)
                                `(cons ',var
                                       (lambda (&optional (,vnew-value nil ,vsetp))
                                         (if ,vsetp
                                             (setf ,var ,vnew-value)
                                             ,var))))
                              vars)
                    *lexical-variables*)))
       ,@body)))
(defun var-set (var-str val)
  (let ((var-interned (intern (string-upcase var-str))))
    (setf (deref var-interned) val)))
(let ((x 1)
      (y 2))
  (with-named-lexical-variable (x y)
    (var-set "x" 3)
    (setf (deref 'y) 4)
    (mapcar (function deref) '(x y))))
;; -> (3 4)
(let ((year "1400"))
  (with-named-lexical-variable (year)
    (var-set "year" 1388))
  year)
;; --> 1388

How to make SBCL optimize away possible call to FDEFINITION?

Apologies: I don't have sufficient knowledge to rework this as an easy to understand code snippet.
I've been using the SBCL compiler notes as signs of what might be improved, but I'm well out of my depth with this —
; compiling (DEFUN EXECUTE-PARALLEL ...)
; file: /home/dunham/8000-benchmarksgame/bench/spectralnorm/spectralnorm.sbcl-8.sbcl
; in: DEFUN EXECUTE-PARALLEL
; (FUNCALL FUNCTION START END)
; --> SB-C::%FUNCALL THE
; ==>
; (SB-KERNEL:%COERCE-CALLABLE-FOR-CALL FUNCTION)
;
; note: unable to
; optimize away possible call to FDEFINITION at runtime
; because:
; FUNCTION is not known to be a function
—
#+sb-thread
(defun execute-parallel (start end function)
  (declare (type int31 start end))
  (let* ((num-threads 4))
    (loop with step = (truncate (- end start) num-threads)
          for index from start below end by step
          collecting (let ((start index)
                           (end (min end (+ index step))))
                       (sb-thread:make-thread
                        (lambda () (funcall function start end))))
            into threads
          finally (mapcar #'sb-thread:join-thread threads))))

#-sb-thread
(defun execute-parallel (start end function)
  (funcall function start end))
(The program is here. Measurements for similar programs are here.)
Is it practical to make SBCL "optimize away possible call to FDEFINITION" or is that compiler note an explanation rather than an opportunity?
The reason for the possible call to fdefinition is that the compiler does not know that function is a function: it might be the name of one; in general it may be a function designator rather than a function. To keep the compiler quiet, tell it that it is a function with a suitable type declaration: you just need (declare (type function function)).
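A minimal sketch of what that declaration looks like in isolation (the function call-it is mine):
(defun call-it (function start end)
  (declare (type function function)  ; FUNCTION is now known to be a function
           (optimize (speed 3)))
  (funcall function start end))      ; compiles without the FDEFINITION note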
Rainer is right: there is ε chance that this is ever going to be a performance problem, given that you're starting a new thread. In particular, it is fairly likely that adding a declaration will make no difference at all:
without a declaration, the call to funcall will get compiled as something like: check the type of the object; if it is a function, call it; if it is not, call fdefinition on it and call the result;
with a declaration, the overall function looks like: check that the object is a function, signalling an error if not ... call the function.
In both cases, if the object is a function, there is one type check and one call: the type check is just in a different place. In the first case, the code will still work if the object is merely the name of a function, while in the second it won't.
And in both of these cases, this is code where you are calling make-thread: if that is anything like as fast as a function call, even one via fdefinition, I would be really impressed by the threading system! Almost certainly the performance of this function is entirely dominated by the overhead of making threads.
In real code, avoid optimizations like that - unless really needed
Is it practical to make SBCL "optimize away possible call to FDEFINITION" or is that compiler note an explanation rather than an opportunity?
Generally it does not matter, especially since most Lisp code should not be compiled with optimization qualities (speed 3) (safety 0) (space 0), since that may open up the software to runtime errors and crashes depending on the implementation and the program. Calling something via funcall unchecked (without safety) that is neither a function nor a symbol naming a function might be dangerous enough to crash the program.
For a specific benchmark one might check via timings if a type declaration and a specialized fdefinition compilation brings any advantage.
a type declaration
A type declaration to make clear that a variable named fn is referencing an object of type function would be:
(declare (type function fn))
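To compare via timings, a sketch along these lines could be used (all names and iteration counts are mine):
(defun call-plain (f)
  (funcall f 1.0d0))

(defun call-declared (f)
  (declare (type function f)
           (optimize (speed 3)))
  (funcall f 1.0d0))

(let ((f (lambda (x) (declare (double-float x)) (* x 2d0))))
  (time (loop repeat 10000000 do (call-plain f)))
  (time (loop repeat 10000000 do (call-declared f))))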
in the specific benchmark program FDEFINITION won't be called anyway
In the example you have provided, fdefinition will not be called anyway.
(setf foo (lambda (x) x)) ; foo references a function object
(funcall foo 3)
funcall is probably implemented by something like this:
(etypecase f
  ((or cons symbol) (funcall (fdefinition f) ...))
  (function ...))
Since your code passes a function object, there is never the need to call fdefinition.
The optimization benefit then is that the runtime type dispatch can be removed, along with the dead code for the cons and symbol cases.
You ask a question about removing an fdefinition, but your question actually relies on the premise that the SBCL notes are a good way to drive optimisations and improvements. The notes are a good way to spot obvious issues and places where type declarations can help. They do not tell you what actually makes your program slow. The correct way to improve the performance of a program is to 1. think about whether there is a faster algorithm, and 2. measure its performance and work out what is slow.
A single fdefinition call will only matter if it happens in a tight loop (i.e. it is not single but very plural).
In this case it happens to start a thread. If you are starting threads in a tight loop then your performance problem comes from starting threads in a tight loop. Don’t do that.
If you aren’t starting threads in a tight loop (looking at your code, it appears you are not), there are bigger fish to fry. Why waste time on an fdefinition that maybe gets called 4 times per call to execute-parallel when you can optimise the inner function instead.
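For the measuring step, SBCL ships a statistical profiler, sb-sprof; a minimal sketch (work is a stand-in for the code you actually care about):
(require :sb-sprof)

(defun work ()
  (loop repeat 1000000 sum (random 1.0d0)))

(sb-sprof:with-profiling (:report :flat :max-samples 10000)
  (work))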

Common Lisp scoping (dynamic vs lexical)

EDIT: I changed the example code after the first answer, because I came up with a simpler version that raises the same questions.
I am currently learning Common Lisp's scoping properties. After I thought I had a solid understanding, I decided to code up some examples whose outcome I could predict, but apparently I was wrong. I have three questions, each one relating to one of the examples below:
Example 1:
(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))
(fun1 5)
Output:
5
*** - EVAL: variable X has no value
Question: This makes sense. x is statically scoped and fun2 has no way of finding the value of x without having it passed explicitly.
Example 2:
(defvar x 100)

(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))
(fun1 5)
Output:
5
5
Question: I don't understand why x is suddenly visible to fun2 with the value that fun1 gave it, instead of having a value of 100...
Example 3:
(setf x 100)

(defmethod fun1 (x)
  (print x)
  (fun2))

(defmethod fun2 ()
  (print x))
(fun1 5)
Output:
5
100
Question: Should I ignore these results since calling setf on an undeclared variable is apparently undefined? This happens to be what I would expect in my second example...
Any insight would be greatly appreciated...
The effect of setting an undefined variable using setf is undefined in ANSI Common Lisp.
defvar will define a special variable. This declaration is global and also has effect on let bindings. That's the reason that by convention these variables are written as *foo*. If you have ever defined x with defvar, it is declared special and there is no way to declare it lexical later.
let by default provides local lexical variables. If the variable was already declared special (for example because of a defvar), then it just creates a new local dynamic binding.
Update
Example 1
Nothing to see.
Example 2
x has been declared special. All uses of the variable x now use dynamic binding.
When calling the function, you bind x to 5. Dynamically. Other functions can now access this dynamic binding and get that value.
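A sketch mirroring example 2 (with defun instead of defmethod; the behavior is the same):
(defvar x 100)        ; X is now special everywhere

(defun fun2 ()
  (print x))          ; X here is a dynamic reference

(defun fun1 (x)       ; this binding of X is dynamic, not lexical
  (print x)
  (fun2))

(fun1 5)              ; prints 5 twice: FUN2 sees FUN1's dynamic binding
(fun2)                ; prints 100: outside FUN1, the global value is back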
Example 3
This is undefined behavior in Common Lisp: you are setting an undeclared variable. What happens then is implementation dependent. Your implementation (most do something similar) sets the symbol value of x to 100. In fun1, x is lexically bound. In fun2, evaluating x retrieves the symbol value (or possibly a dynamically bound value) of x.
As an example of an implementation that did (does?) something else: CMUCL would also, by default, declare x in example 3 to be special. There, setting an undefined variable also declares it special.
NOTE
In portable standard compliant Common Lisp code the global variables are defined with defvar and defparameter. Both declare these variables to be special. ALL uses of these variables now involve dynamic binding.
Remember:
((lambda (x)
   (sin x))
 10)
is basically the same as
(let ((x 10))
  (sin x))
Which means that variable bindings in let forms and variable bindings in function calls work the same way. If x had been declared special somewhere earlier, both would involve dynamic binding.
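A sketch of that last point (assuming x has been made globally special with defvar):
(defvar x 0)                 ; X is now globally special

(defun peek-x () x)          ; reads the current dynamic binding

((lambda (x) (peek-x)) 10)   ; => 10: the LAMBDA binding is dynamic
(let ((x 10)) (peek-x))      ; => 10: same for LET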
This is specified in the Common Lisp standard. See for example the explanation to the SPECIAL declaration.

What is a programming language with dynamic scope and static typing?

I know the language exists, but I can't put my finger on it. Dynamic scope and static typing?
We can try to reason about what such a language might look like. Obviously something like this (using a C-like syntax for demonstration purposes) cannot be allowed, or at least not with the obvious meaning:
int x_plus_(int y) {
    return x + y;       // requires that x have type int
}

int three_plus_(int y) {
    double x = 3.0;
    return x_plus_(y);  // calls x_plus_ while x has type double
}
So, how to avoid this?
I can think of a few approaches offhand:
Commenters above mention that Fortran pre-'77 had this behavior. That worked because a variable's name determined its type; a function like x_plus_ above would be illegal, because x could never have an integer type. (And likewise one like three_plus_, for that matter, because y would have the same restriction.) Integer variables had to have names beginning with i, j, k, l, m, or n.
Perl uses syntax to distinguish a few broad categories of variables, namely scalars vs. arrays (regular arrays) vs. hashes (associative arrays). Variables belonging to the different categories can have the exact same name, because the syntax distinguishes which one is meant. For example, the expression foo $foo, $foo[0], $foo{'foo'} involves the function foo, the scalar $foo, the array @foo ($foo[0] being the first element of @foo), and the hash %foo ($foo{'foo'} being the value in %foo corresponding to the key 'foo'). Now, to be quite clear, Perl is not statically typed, because there are many different scalar types, and these are not distinguished syntactically. (In particular: all references are scalars, even references to functions or arrays or hashes. So if you use the syntax to dereference a reference to an array, Perl has to check at runtime to see if the value really is an array reference.) But this same approach could be used for a bona fide type system, especially if the type system were a very simple one. With that approach, the x_plus_ function would be using an x of type int, and would completely ignore the x declared by three_plus_. (Instead, it would use an x of type int that had to be provided from whatever scope called three_plus_.) This could either require some type annotations not included above, or it could use some form of type inference.
A function's signature could indicate the non-local variables it uses, and their expected types. In the above example, x_plus_ would have the signature "takes one argument of type int; uses a calling-scope x of type int; returns a value of type int". Then, just like how a function that calls x_plus_ would have to pass in an argument of type int, it would also have to provide a variable named x of type int — either by declaring it itself, or by inheriting that part of the type-signature (since calling x_plus_ is equivalent to using an x of type int) and propagating this requirement up to its callers. With this approach, the three_plus_ function above would be illegal, because it would violate the signature of the x_plus_ method it invokes — just the same as if it tried to pass a double as its argument.
The above could just have "undefined behavior"; the compiler wouldn't have to explicitly detect and reject it, but the spec wouldn't impose any particular requirements on how it had to handle it. It would be the responsibility of programmers to ensure that they never invoke a function with incorrectly-typed non-local variables.
Your professor was presumably thinking of #1, since pre-'77 Fortran was an actual real-world language with this property. But the other approaches are interesting to think about. :-)
I haven't found it written down elsewhere, but the AXIOM CAS (and various forks, including FriCAS, which is still being actively developed) uses a scripting language called SPAD with both a very novel strong static dependent type system and dynamic scoping (although the latter is possibly an unintended implementation bug).
Most of the time the user won't realize that, but when they start trying to build closures as in other functional languages, it reveals its dynamically scoped nature:
FriCAS Computer Algebra System
Version: FriCAS 2021-03-06
Timestamp: Mon May 17 10:43:08 CST 2021
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> foo (x,y) == x + y
Type: Void
(2) -> foo (1,2)
Compiling function foo with type (PositiveInteger, PositiveInteger)
-> PositiveInteger
(2) 3
Type: PositiveInteger
(3) -> foo
(3) foo (x, y) == x + y
Type: FunctionCalled(foo)
(4) -> bar x y == x + y
Type: Void
(5) -> bar
(5) bar x == y +-> x + y
Type: FunctionCalled(bar)
(6) -> (bar 1)
Compiling function bar with type PositiveInteger ->
AnonymousFunction
(6) y +-> #1 + y
Type: AnonymousFunction
(7) -> ((bar 1) 2)
(7) #1 + 2
Type: Polynomial(Integer)
Such behavior is similar to what happens when trying to build a closure using (lambda (x) (lambda (y) (+ x y))) in a dynamically scoped Lisp, such as Emacs Lisp. Actually, the underlying representation of functions is essentially the same as in Lisp in the early days, since AXIOM was first developed on top of an early Lisp implementation on an IBM mainframe.
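For contrast, the same closure works as expected in a lexically scoped Lisp such as Common Lisp:
;; The inner lambda closes over X lexically:
(funcall (funcall (lambda (x) (lambda (y) (+ x y))) 1) 2)  ; => 3
;; Under dynamic scope, X would no longer be bound when the inner
;; lambda is finally called.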
I believe it is, however, a defect (like what happened when JMC implemented the first version of the LISP language), because the implementor made the parser do uncurrying, as in the function definition of bar, but this is unlikely to be useful without the ability to build closures in the language.
It is also worth noticing that SPAD automatically renames the variables in anonymous functions to avoid capture, so its dynamic scoping could be used as a feature, as in other Lisps.
Dynamic scope means that the variable and its type at a specific line of your code depend on the functions called before. This means you cannot know the type at a specific line of your code, because you cannot know which code has been executed before.
Static typing means that you have to know the type at every line of your code, before the code starts to run.
This is irreconcilable.