Why are class slots specified with keywords but accessed with symbols? - oop

I have recently encountered a confusing dichotomy regarding structures in Lisp.
When creating a structure with (defstruct), we specify the slots by keyword (:slotname). But when accessing it, we use local symbols ('slotname).
Why? This makes no sense to me.
Also, doesn't this pollute the keyword package every time you declare a structure?
If I try to access the slots by keyword, I get confusing errors like:
When attempting to read the slot's value (slot-value), the slot :BALANCE is
missing from the object #S(ACCOUNT :BALANCE 1000 :CUSTOMER-NAME "John Doe").
I don't understand this message. It seems to be telling me that something right under my nose doesn't exist.
I have tried declaring the structure using local symbols, and also with uninterned symbols (#:balance), and neither works.

DEFSTRUCT is designed in the language standard in this way:
slot-names are not exposed
there is no specified way to get a list of slot-names of a structure class
there is no specified way to access a slot via a slot-name
thus at runtime there might be no slot-names
access to slots is optimized with accessor functions: static structure layout, inlined accessor functions, ...
Also explicitly:
slot-names are not allowed to be duplicate under string=. Thus slots foo::a and bar::a in the same structure class are not allowed
the effects of redefining a structure are undefined
The goal of structures is to provide fast record-like objects without costly features like redefinition, multiple inheritance, etc.
Thus using SLOT-VALUE to access structure slots is an extension of implementations, not a part of the defined language. SLOT-VALUE was introduced when CLOS was added to Common Lisp. Several implementations provide a way to access a structure slot via SLOT-VALUE. This then also requires that the implementation has kept track of slot names of that structure.
SLOT-VALUE is simply a newer API function, coming from CLOS for CLOS. Structures are an older feature, which was already defined in the first version of Common Lisp, as described in the book CLtL1.

You used make-instance to create a class instance, and then you are showing a struct; I am confused.
Structs automatically build their accessor functions. You create one with make-account, and then you'd use account-balance instead of slot-value.
I don't know what the expected behavior of make-instance on a struct is. While it seemed to work on my SBCL, you are not using structs right.
(defstruct account
  (balance))
(make-account :balance 100)
#S(ACCOUNT :BALANCE 100)
(account-balance *)
100
With classes, you are free to name your accessor functions as you want.
(defclass bank-account ()
  ((balance :initform nil        ; otherwise the slot is unbound
            :initarg :balance    ; keyword accepted by make-instance
            :accessor balance))) ; or account-balance, as you wish
(make-instance 'bank-account :balance 200)
#<BANK-ACCOUNT {1009302A33}>
(balance *)
200
https://lispcookbook.github.io/cl-cookbook/data-structures.html#structures
http://www.lispworks.com/documentation/HyperSpec/Body/m_defstr.htm
the slot :BALANCE is missing from the object #S(ACCOUNT :BALANCE 1000 :CUSTOMER-NAME "John Doe").
The slot name is actually balance and the representation uses the generated initargs. With the class object, the error message might be less confusing:
When attempting to read the slot's value (slot-value), the slot :BALANCE is missing from the object #<BANK-ACCOUNT {1009302A33}>.

First of all, see Rainer's excellent answer on structures. In summary:
Objects defined with defstruct have named accessor functions, not named slots. Further, the field names of these objects, as mentioned in the defstruct form, must be distinct as strings, and so keywords are completely appropriate for use in constructor functions. Any use of slot-value on such objects is implementation-dependent, and indeed whether or not named slots exist at all is entirely implementation-dependent.
You generally want keyword arguments for the constructors for the reasons you want keyword arguments elsewhere: you don't want to have to painfully provide 49 optional arguments so you can specify the 50th. So it's reasonable that the default thing defstruct does is that. But you can completely override this if you want to, using a BOA constructor, which defstruct allows you to do. You can even have no constructor at all! As an example here is a rather perverse structure constructor: it does use keyword arguments, but not the ones you might naively expect.
(defstruct (foo
             (:constructor
              make-foo (&key ((:y x) 1) ((:x y) 2))))
  y
  x)
So the real question revolves around classes defined with defclass, which usually do have named slots and where slot-value does work.
So in this case there are really two parts to the answer.
Firstly, as before, keyword arguments are really useful for constructors because no-one wants to have to remember 932 optional argument defaults. But defclass provides complete control over the mapping between keyword arguments and the slots they initialise, or whether they initialise slots at all or instead are passed to some initialize-instance method. You can just do anything you want here.
Secondly, you really want slot names for objects of classes defined with defclass to be symbols which live in packages. You definitely do not want this to happen:
(in-package "MY-PACKAGE")
(use-package "SOMEONE-ELSES-PACKAGE")
(defclass my-class (someone-elses-class)
  ((internal-implementation-slot ...)))
only to discover that you have just modified the definition of the someone-elses-package::internal-implementation-slot slot in someone-elses-class. That would be bad. So slot names are symbols which live in packages and the normal namespace control around packages works for them too: my-package::internal-implementation-slot and someone-elses-package::internal-implementation-slot are not (usually) the same thing.
Additionally, the whole keyword-symbol-argument / non-keyword-symbol-variable thing is, well, let's just say well-established:
(defun foo (&key (x 1))
  ... x ...)
Finally note, of course, that keyword arguments don't actually have to be keywords: it's generally convenient that they are because you need quotes otherwise, but:
(defclass silly ()
  ((foo :initarg foo
        :accessor silly-foo)
   (bar :initarg bar
        :accessor silly-bar)))
And now
> (silly-foo (make-instance 'silly 'bar 3 'foo 9))
9

Related

Is the term immutable variable just a convention?

In Rust variables are immutable by default, i.e., they don't vary but are not constants (as noted here).
Do they retain the name "variable" just by convention, or is there another reason why the term "variable" is maintained?
It should be noted that the term mut in Rust was hotly debated before stabilization, with some arguing that it should be called excl or uniq. The point is that the mut in let mut x and the mut in &mut x are two completely different things.
let mut x declares that x is mutable, in the sense that it can be re-assigned, but also that one can take a &mut reference of it; which is best called an exclusive or unique reference. It is quite possible in Rust in some cases to mutate through a shared reference in the case of std::cell::Cell, for instance, and not all operations that require an exclusive reference involve mutation. An operation that requires an exclusive reference is simply one that would be unsafe with a shared one; Cell is designed in such a way that it is not, by strictly controlling under what conditions mutation can occur.
In theory, the two functions of let mut x could have different keywords, but they are compressed into one for simplicity. Rust could in theory be designed with mut and excl being different keywords, and allowing for let excl x, which would be a variable wherefrom one could take an exclusive reference, but not mutate.
One can also have variables that are not declared with mut, in particular in function calls. In a signature like fn func(x: u32), x is not mutable, but it is variable, because a different x can be passed every single time.
The let mut x kind of "mutable" is purely a lint and, in theory, unnecessary for Rust to work: any currently working Rust program will continue to work if all non-mutable variables are made mutable. It is simply considered bad practice to do so, and the compiler warns whenever a variable is declared mutable without needing to be, which helps catch unintended bugs. This is absolutely not the case with exclusive and shared references, which must be distinguished and are more than just a lint.
Here "variable" means "factor involved in computation" not "varying". This is from the mathematical principle where expressions like f(x) include x, a variable, as a part of the equation.
In Rust, as in other languages, you'll need variables (e.g. input) that affect how the program runs; otherwise your program would only ever behave in a single, specific way, producing the same output each time.
You'll need to think of what variables change during processing and which do not. Those that do not need to change do not need to be declared mutable.
Regardless of if or when they change, they're still considered variables.
In C++ you'll have things like const int x which is a constant (read-only) variable, so the term can take on all sorts of specific meanings.
Is the term immutable variable just a convention?
By definition, every definition of a word is a convention. Language and the meanings of words change over time and differ from person to person: ask 100 people and you may end up with 100 different definitions of one word. That is why scientific papers often start by defining terms that could be misunderstood, to clarify as much as possible. Rust is no different, which is why we have The Reference.
It has a specific section on variables:
A variable is a component of a stack frame, either a named function
parameter, an anonymous temporary, or a named local variable.
A local variable (or stack-local allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack
frame.
Local variables are immutable unless declared otherwise. For example:
let mut x = ....
Function parameters are immutable unless declared with mut. The mut
keyword applies only to the following parameter. For example: |mut x,
y| and fn f(mut x: Box<i32>, y: Box<i32>) declare one mutable variable
x and one immutable variable y.
Local variables are not initialized when allocated. Instead, the
entire frame worth of local variables are allocated, on frame-entry,
in an uninitialized state. Subsequent statements within a function may
or may not initialize the local variables. Local variables can be used
only after they have been initialized through all reachable control
flow paths.
So there is not much to add: variable is clearly defined in Rust, and it doesn't matter if your own definition, or one you found elsewhere, doesn't match Rust's. In the context of Rust, a variable is what is described above. If you want opinions about this choice, that is off topic as opinion-based. That said, the wiki definitions make Rust's definition look quite standard from both the mathematical and the computer science point of view:
Variable (computer science), a symbolic name associated with a value and whose associated value may be changed
Variable (mathematics), a symbol that represents a quantity in a mathematical expression, as used in many sciences

When does clojure remove a variable?

I was looking at the source for memoize.
Coming from languages like C++/Python, this part hit me hard:
(let [mem (atom {})] (fn [& args] (if-let [e (find @mem args)] ...
I realize that memoize returns a function, but for storing state, it uses a local "variable" mem. But after memoize returns the function, shouldn't that outer let vanish from scope? How can the function still refer to mem?
Why doesn't Clojure delete that outer variable, and how does it manage variable names? Like suppose, I make another memoized function, then memoize uses another mem. Doesn't that name clash with the earlier mem?
P.S.: I was thinking that there must be something more happening in there that prevents that, so I wrote myself an easier version, which goes like http://ideone.com/VZLsJp , but that still works like memoize.
Objects are garbage collectable if no thread can access them, as per usual for JVM languages. If a thread has a reference to the function returned by memoize and the function has a reference to the atom in mem then transitively the atom is still accessible.
But after memoize returns the function, shouldn't that outer let vanish from scope? How can the function still refer to mem?
This is what is called a closure. If a function is defined using a name from its environment, it keeps a reference to that value afterwards - even if the defining environment is gone and the function is the only thing that has access any more.
Like suppose, I make another memoized function, then memoize uses another mem. Doesn't that name clash with the earlier mem?
No, except possibly by confusing programmers. Having multiple scopes each declare their own name mem is very much possible and the usual rules of lexical scoping are used to determine which is meant when mem is read. There are some trickier edge cases such as
(let [foo 2]
  (let [foo (fn [] foo)] ;; In the function definition, foo has the value from the outer scope
                         ;; because the second let has not yet bound the name
    (foo))) ;; => 2.
but generally the idea is pretty simple - the value of a name is the one given in the definition closest in the program text to the place it is used - either in the local scope or in the closest outer scope.
Different invocations of memoize create different closures so that the name mem refers to different atoms in each returned function.
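Since the question mentions coming from Python, the same closure behaviour can be sketched there. This is only an illustration, not Clojure's implementation: the names memoize, mem, slow_square and calls are made up for the example, and a plain dict stands in for the atom.

```python
def memoize(f):
    # A fresh dict is created on every call to memoize, much like
    # (let [mem (atom {})] ...) creates a fresh atom each time.
    mem = {}

    def wrapper(*args):
        # wrapper closes over this particular mem; the dict stays alive
        # for as long as wrapper itself is reachable.
        if args not in mem:
            mem[args] = f(*args)
        return mem[args]

    return wrapper


calls = []  # records every real (non-cached) computation


def slow_square(x):
    calls.append(x)
    return x * x


square_a = memoize(slow_square)
square_b = memoize(slow_square)

square_a(3)  # computed, stored in square_a's mem
square_a(3)  # served from square_a's mem, no new computation
square_b(3)  # computed again: square_b has its own, separate mem
```

After these three calls, calls holds two entries rather than three or one: the second call on square_a hit its cache, while square_b started with an empty mem of its own. The name mem is the same in both closures, but each refers to a different dict.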

How can I attach a type tag to a closure in Scheme?

How can I attach an arbitrary tag to a closure in Scheme?
Here are a couple things I'd like to use this for:
(1) To mark closures that provide an interface to produce a string for what they represent, like what @kud0h asked for here. A general ->string procedure could include code something like this:
(display (if (stringable? x)
             (x 'string)
             x)
         str-port)
(2) More generally, to determine if a closure is an "object" that obeys the rules of a general object interface, or maybe to tell the class of an object (something like what @KPatnode was asking about here).
I can't query a procedure to see if it supports a certain interface by calling it, because if it doesn't support a known interface, calling the procedure will produce unpredictable results, most likely a run-time error.
Chez Scheme has putprop and getprop procedures that allow you to add keys and values to symbols. However, closures can be anonymous, or bound to different symbols, so I'd prefer to attach a calling-convention tag to the closure itself, not a symbol that it's bound to.
The only idea I have right now is to maintain a global hash table of all "stringable" or "object" closures in the system. That seems a little clunky. Is there a simpler, more elegant, or more efficient way?
Racket has applicable structures: you can give a structure type an apply hook to be called if an instance is used as a function.
If you want a more portable solution, you can use a hash table to associate your data with certain procedures. Unless your Scheme provides weak hashtables, though, keep in mind that the hashtable will prevent the procedures from being garbage-collected.
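The side-table idea is not specific to Scheme. As a rough sketch in Python (used here only because its standard library ships a weak-keyed table; tag, has_tag, to_string and make_point are hypothetical names invented for the example), a weak-keyed dictionary lets a tag entry go away together with the procedure it describes:

```python
import weakref

# Side table mapping procedures to their tags. Keys are held weakly, so an
# entry does not keep its procedure alive.
_tags = weakref.WeakKeyDictionary()


def tag(proc, name):
    """Attach a tag to proc without touching proc itself."""
    _tags.setdefault(proc, set()).add(name)
    return proc


def has_tag(proc, name):
    try:
        return name in _tags.get(proc, ())
    except TypeError:  # proc cannot be weakly referenced or hashed
        return False


def to_string(x):
    # Only send the "string" message to x if it has declared support for it.
    if callable(x) and has_tag(x, "stringable"):
        return x("string")
    return str(x)


def make_point(px, py):
    # A closure acting as a small message-passing object.
    def dispatch(message):
        if message == "string":
            return f"({px}, {py})"
        raise ValueError(f"unknown message: {message}")

    return tag(dispatch, "stringable")


point = make_point(1, 2)
```

Here to_string(point) goes through the closure's own "string" message, while an untagged procedure or a plain value falls back to the default printer and is never called with a message it might not understand.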
I think you might, instead of tagging procedures per se, want to look at Racket's object system, which has a concept of interfaces. It sounds quite similar to what you're after.
You could go extreme and redefine lambda syntax. Something like this (but untested by me):
(define *properties* '()) ;; example only
(define-syntax lambda
  (let-syntax ((sys-lambda
                (syntax-rules ()
                  ((_ args body ...)
                   (lambda args body ...)))))
    (syntax-rules ()
      ((_ args body ...)
       (let ((func (sys-lambda args body ...)))
         (set! *properties*
               (cons (cons func '(NO-PROPERTIES))
                     *properties*))
         func)))))

Ocaml naming convention

I am wondering whether there are already established naming conventions for OCaml, especially for names of constructors, variables, functions, and record labels.
For instance, if I want to define a type condition, do you suggest annotating its constructors explicitly (for example Condition_None) so that one knows directly that it is a constructor of condition?
Also, how would you name a variable of this type? c or a_condition? I always hesitate to use a, an or the.
When declaring a function, is it necessary to give it a name from which the types of its arguments can be inferred, for example remove_condition_from_list: condition -> condition list -> condition list?
In addition, I use record a lot in my programs. How do you name a record so that it looks different from a normal variable?
There are really thousands of ways to name something, I would like to find a conventional one with a good taste, stick to it, so that I do not need to think before naming. This is an open discussion, any suggestion will be welcome. Thank you!
You may be interested in the Caml programming guidelines. They cover variable naming, but do not answer your precise questions.
Regarding constructor namespacing : in theory, you should be able to use modules as namespaces rather than adding prefixes to your constructor names. You could have, say, a Constructor module and use Constructor.None to avoid confusion with the standard None constructor of the option type. You could then use open or the local open syntax of ocaml 3.12, or use module aliasing module C = Constructor then C.None when useful, to avoid long names.
In practice, people still tend to use a short prefix, such as the first letter of the type name capitalized, CNone, to avoid any confusion when you manipulate two modules with the same constructor names; this often happens, for example, when you are writing a compiler and have several passes manipulating different AST types with similar types: after-parsing Let form, after-typing Let form, etc.
Regarding your second question, I would favor concision. Inference means the type information can most of the time stay implicit; you don't need to enforce explicit annotation in your naming conventions. It will often be obvious from the context -- or unimportant -- what types are manipulated, e.g. remove cond (l1 @ l2). It's even less useful if your remove value is defined inside a Condition submodule.
Edit: record labels have the same scoping behavior as sum type constructors. If you have defined a {x: int; y : int} record in a Coord submodule, you access fields with foo.Coord.x outside the module, or with an alias foo.C.x, or Coord.(foo.x) using the "local open" feature of 3.12. That's basically the same thing as sum constructors.
Before 3.12, you had to write that module on each field of a record, eg. {Coord.x = 2; Coord.y = 3}. Since 3.12 you can just qualify the first field: {Coord.x = 2; y = 3}. This also works in pattern position.
If you want naming convention suggestions, look at the standard library. Beyond that you'll find many people with their own naming conventions, and it's up to you to decide who to trust (just be consistent, i.e. pick one, not many). The standard library is the only thing that's shared by all Ocaml programmers.
Often you would define a single type, or a single bunch of closely related types, in a module. So rather than having a type called condition, you'd have a module called Condition with a type t. (You should give your module some other name though, because there is already a module called Condition in the standard library!). A function to remove a condition from a list would be Condition.remove_from_list or ConditionList.remove. See for example the modules List, Array, Hashtbl, Map.Make, etc. in the standard library.
For an example of a module that defines many types, look at Unix. This is a bit of a special case because the names are mostly taken from the preexisting C API. Many constructors have a short prefix, e.g. O_ for open_flag, SEEK_ for seek_command, etc.; this is a reasonable convention.
There's no reason to encode the type of a variable in its name. The compiler won't use the name to deduce the type. If the type of a variable isn't clear to a casual reader from the context, put a type annotation when you define it; that way the information provided to the reader is validated by the compiler.

Separate Namespaces for Functions and Variables in Common Lisp versus Scheme

Scheme uses a single namespace for all variables, regardless of whether they are bound to functions or other types of values. Common Lisp separates the two, such that the identifier "hello" may refer to a function in one context, and a string in another.
(Note 1: This question needs an example of the above; feel free to edit it and add one, or e-mail the original author with it and I will do so.)
However, in some contexts, such as passing functions as parameters to other functions, the programmer must explicitly distinguish that he's specifying a function variable, rather than a non-function variable, by using #', as in:
(sort (list '(9 A) '(3 B) '(4 C)) #'< :key #'first)
I have always considered this to be a bit of a wart, but I've recently run across an argument that this is actually a feature:
...the
important distinction actually lies in the syntax of forms, not in the
type of objects. Without knowing anything about the runtime values
involved, it is quite clear that the first element of a function form
must be a function. CL takes this fact and makes it a part of the
language, along with macro and special forms which also can (and must)
be determined statically. So my question is: why would you want the
names of functions and the names of variables to be in the same
namespace, when the primary use of function names is to appear where a
variable name would rarely want to appear?
Consider the case of class names: why should a class named FOO prevent
the use of variables named FOO? The only time I would be referring the
class by the name FOO is in contexts which expect a class name. If, on
the rare occasion I need to get the class object which is bound to the
class name FOO, there is FIND-CLASS.
This argument does make some sense to me from experience; there is a similar case in Haskell with field names, which are also functions used to access the fields. This is a bit awkward:
data Point = Point { x, y :: Double {- lots of other fields as well --} }
isOrigin p = (x p == 0) && (y p == 0)
This is solved by a bit of extra syntax, made especially nice by the NamedFieldPuns extension:
isOrigin2 Point{x,y} = (x == 0) && (y == 0)
So, to the question, beyond consistency, what are the advantages and disadvantages, both for Common Lisp vs. Scheme and in general, of a single namespace for all values versus separate ones for functions and non-function values?
The two different approaches have names: Lisp-1 and Lisp-2. A Lisp-1 has a single namespace for both variables and functions (as in Scheme) while a Lisp-2 has separate namespaces for variables and functions (as in Common Lisp). I mention this because you may not be aware of the terminology since you didn't refer to it in your question.
Wikipedia refers to this debate:
Whether a separate namespace for functions is an advantage is a source of contention in the Lisp community. It is usually referred to as the Lisp-1 vs. Lisp-2 debate. Lisp-1 refers to Scheme's model and Lisp-2 refers to Common Lisp's model. These names were coined in a 1988 paper by Richard P. Gabriel and Kent Pitman, which extensively compares the two approaches.
Gabriel and Pitman's paper titled Technical Issues of Separation in Function Cells and Value Cells addresses this very issue.
Actually, as outlined in the paper by Richard Gabriel and Kent Pitman, the debate is about Lisp-5 against Lisp-6, since several other namespaces already exist: the paper mentions type names, tag names, block names, and declaration names. edit: this seems to be incorrect, as Rainer points out in the comment: Scheme actually seems to be a Lisp-1. The following is largely unaffected by this error, though.
Whether a symbol denotes something to be executed or something to be referred to is always clear from the context. Throwing functions and variables into the same namespace is primarily a restriction: the programmer cannot use the same name for a thing and an action. What a Lisp-5 gets out of this is just that some syntactic overhead for referencing something from a different namespace than what the current context implies is avoided. edit: this is not the whole picture, just the surface.
I know that Lisp-5 proponents like the fact that functions are data, and that this is expressed in the language core. I like the fact that I can call a list "list" and a car "car" without confusing my compiler, and functions are a fundamentally special kind of data anyway. edit: this is my main point: separate namespaces are not a wart at all.
I also liked what Pascal Costanza had to say about this.
I've met a similar distinction in Python (unified namespace) vs Ruby (distinct namespaces for methods vs non-methods). In that context, I prefer Python's approach -- for example, with that approach, if I want to make a list of things, some of which are functions while others aren't, I don't have to do anything different with their names, depending on their "function-ness", for example. Similar considerations apply to all cases in which function objects are to be bandied around rather than called (arguments to, and return values from, higher-order functions, etc, etc).
Non-functions can be called, too (if their classes define __call__, in the case of Python -- a special case of "operator overloading") so the "contextual distinction" isn't necessarily clear, either.
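A small Python sketch of both points; the names double, Greeter, things and total are invented for illustration. Functions and other values share one namespace, so they can sit in the same list without special syntax, and an instance of a class that defines __call__ can be called like a function.

```python
def double(x):
    return 2 * x


class Greeter:
    """Not a function, but callable because it defines __call__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return f"{self.greeting}, {name}"


hello = Greeter("Hello")

# Functions, callable objects and plain data live side by side; no
# equivalent of #' is needed to refer to double as a value.
things = [double, hello, 42, "text"]

# Whether something can be called is a property of the object,
# not of the position in which its name appears.
callables = [t for t in things if callable(t)]

# The flip side of a single namespace: a name refers to one thing at a time.
total = sum                      # total is now a function
total_is_function = callable(total)
total = total([1, 2, 3])         # rebinding: total is now a number
```

The last three lines show the restriction the other answers mention: once total is rebound to a number, the function it used to name is no longer reachable through that name.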
However, my "lisp-oid" experience is/was mostly with Scheme rather than Common Lisp, so I may be subconsciously biased by the familiarity with the uniform namespace that in the end comes from that experience.
The name of a function in Scheme is just a variable with the function as its value. Whether I do (define (x y) (z y)) or (let ((x (lambda (y) (z y)))) ...), I'm defining a function that I can call. So the idea that "a variable name would rarely want to appear there" is kind of specious as far as Scheme is concerned.
Scheme is a characteristically functional language, so treating functions as data is one of its tenets. Having functions be a type of their own that's stored like all other data is a way of carrying on the idea.
The biggest downside I see, at least for Common Lisp, is understandability. We can all agree that it uses different namespaces for variables and functions, but how many does it have? In PAIP, Norvig showed that it has "at least seven" namespaces.
When one of the language's classic books, written by a highly respected programmer, can't even say for certain in a published book, I think there's a problem. I don't have a problem with multiple namespaces, but I wish the language was, at the least, simple enough that somebody could understand this aspect of it entirely.
I'm comfortable using the same symbol for a variable and for a function, but in the more obscure areas I resort to using different names out of fear (colliding namespaces can be really hard to debug!), and that really should never be the case.
There are good things about both approaches. However, I find that when it matters, I prefer having both a function LIST and a variable LIST to having to spell one of them incorrectly.