When does lexical scope binding take place - at run time or compile time? - language-features

The C language performs scope binding at compile time (a variable reference gets a fixed address that never changes); that is an example of static scoping.
Elisp performs scope binding at run time (a variable points to the top of its own personal reference stack; special forms like let/defun/... push a binding onto the stack on entry and pop it off on exit, and captures see whatever binding is on top at call time); that is an example of dynamic scoping.
What type of binding is used in lexical scoping?
Languages such as Common Lisp, Python, R, and JavaScript state that they implement lexical scoping.
What techniques are used to implement it in those languages?
I have heard about environments that are carried along with functions. If I am right, when are these environments created?
Is it possible or usual for a developer to construct an environment and bind it to a function manually? Something like call( bind_env(make_env({buf: [...], len: 5}), myfunc) )

In short, lexical scope binding takes place at compile time (or, more precisely, at the time when the function definition is evaluated). Lexical scoping can also be fully static: this is how the ML-family languages (SML, OCaml, Haskell) do it.
Environments
Every function is defined in some environment. The top-level environment is where all the usual variables, functions (+, -, sin, map, etc.) and syntax (relevant for languages that can extend syntax, like Common Lisp, Scheme, and Clojure) are defined.
Each function creates its own local environment, nested in its enclosing environment (e.g., the top level, or the environment of another function). Function arguments and all local variables live in this environment. If a function references a variable or function that is not defined in its local environment (one that is called free in this environment), it is looked up in the enclosing environment of the function definition (and, if it is not found there, in the enclosing environment of that environment, and so on). This is different from dynamic scoping, where the value would be found in the environment from which the function is called.
I am going to illustrate this using Scheme:
;; y is free in this definition:
(define (foo x)
  (+ x y))

;; Here y is defined in the top-level environment:
(define y 1)

;; Introduce a local y; foo still uses the y from the enclosing (top-level)
;; environment of its definition, so the result is 2 and not 11:
(let ((y 10))
  (foo 1))
=> 2
You can also define a function (or a procedure in Scheme's terms) with a local environment enclosing it:
(define bar
  (let ((y 100))
    (lambda (x) (+ x y))))

(bar 1)
=> 101
Here bar is defined to be a procedure. The variable y is again free in the procedure body, but the enclosing environment is created by the let form, in which y is defined to be 100. So when bar is called, it is that value of y which is fetched, not the top-level one (in a dynamically scoped language the call would have returned 2).
Answering your last question: it is possible to create your own environment manually, but it would be a lot of work and probably wouldn't be very efficient. When the language is implemented (for example, a Scheme interpreter), that is exactly what the language implementer is doing.
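For illustration, here is a minimal Scheme sketch of the idea; make-env and bind-env are made-up names mirroring the question, not a real API. It fakes an environment as an association list and "binds" it to a function by closing over it:

;; A hand-rolled "environment": an association list of names to values.
(define (make-env bindings) bindings)

;; "Bind" an environment to a function by closing over it: the function
;; receives a lookup procedure for resolving its free names.
(define (bind-env env f)
  (lambda args
    (apply f (lambda (name) (cdr (assq name env))) args)))

(define (my-func lookup x)
  (+ x (lookup 'len)))

(define bound (bind-env (make-env '((len . 5))) my-func))
(bound 1)
=> 6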
A good explanation of environments is given in SICP, Chapter 3.
Other comments
AFAIK, since Emacs 24, Elisp supports lexical scoping (opt-in per file) as well as dynamic scoping (similarly to Common Lisp, see below).
Common Lisp uses lexical scoping for local variables (introduced by the let form) and dynamic scoping for global variables (also called special; it is possible to declare a local variable special, but that is rarely done) defined with defvar and defparameter. To distinguish them from lexically scoped variables, their names conventionally have "earmuffs", for example *standard-input*. Top-level functions are also special in CL, which can be rather dangerous: one can unknowingly alter behavior by shadowing a top-level function. That is why the CL standard forbids redefining standard library functions, and implementations often enforce this with package locks.
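A minimal Common Lisp sketch of the contrast, using only standard constructs:

(defvar *y* 1)              ; special: dynamically scoped

(defun get-y () *y*)

(let ((*y* 10))             ; rebinds *y* for the dynamic extent of the let
  (get-y))                  ; => 10, the caller's binding is visible
(get-y)                     ; => 1 again

(defun make-adder (n)       ; n is lexical: the closure captures it
  (lambda (x) (+ x n)))
(funcall (make-adder 5) 1)  ; => 6, regardless of the call site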
Scheme, in contrast, always uses lexical scoping. Dynamic scoping, however, is useful sometimes (Richard Stallman makes a good point on it). To overcome this, many Scheme implementations introduced so-called parameters (implemented using lexical scoping).
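For example, with R7RS/SRFI 39 parameters:

(define p (make-parameter 1))

(define (show-p) (p))

(show-p)                ;; => 1
(parameterize ((p 10))
  (show-p))             ;; => 10: rebinding with dynamic extent
(show-p)                ;; => 1 again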
Languages like Common Lisp, Scheme, Clojure, and Python keep a dynamic reference to a variable: you can construct the variable's name from a string (intern a symbol, in Lisp terms) and look up its value. More static languages, like C, OCaml, or Haskell, cannot do that (unless some form of reflection is used). But this has only a weak connection to what kind of scoping they use.
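For instance, in Common Lisp (a sketch, assuming the current package is the one where *y* was interned):

(defvar *y* 42)
(symbol-value (intern "*Y*"))  ; => 42, the name was constructed at run time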


Why are class slots specified with keywords but accessed with symbols?

I have recently encountered a confusing dichotomy regarding structures in Lisp.
When creating a structure instance with the constructor defined by (defstruct), we specify the slots by keyword (:slotname). But when accessing them, we use plain symbols ('slotname).
Why? This makes no sense to me.
Also, doesn't this pollute the keyword package every time you declare a structure?
If I try to access the slots by keyword, I get confusing errors like:
When attempting to read the slot's value (slot-value), the slot :BALANCE is missing from the object #S(ACCOUNT :BALANCE 1000 :CUSTOMER-NAME "John Doe").
I don't understand this message. It seems to be telling me that something right under my nose doesn't exist.
I have tried declaring the structure using plain symbols, and also with uninterned symbols (#:balance), and these don't work.
DEFSTRUCT is designed in the language standard in this way:
slot-names are not exposed
there is no specified way to get a list of slot-names of a structure class
there is no specified way to access a slot via a slot-name
thus at runtime there might be no slot-names
access to slots is optimized with accessor functions: static structure layout, inlined accessor functions, ...
Also explicitly:
slot names are not allowed to be duplicates under string=; thus slots foo::a and bar::a in the same structure class are not allowed
the effects of redefining a structure are undefined
The goal of structures is to provide fast record-like objects without costly features like redefinition, multiple inheritance, etc.
Thus using SLOT-VALUE to access structure slots is an extension provided by implementations, not a part of the defined language. SLOT-VALUE was introduced when CLOS was added to Common Lisp. Several implementations provide a way to access a structure slot via SLOT-VALUE; this then also requires that the implementation keep track of the slot names of that structure.
SLOT-VALUE is simply a newer API function, coming from CLOS for CLOS. Structures are an older feature, which was defined already in the first version of Common Lisp defined by the book CLtL1.
You used make-instance to create a class instance, and then you are showing a struct; I am confused.
Structs automatically get accessor functions. You create an instance with make-account; then you'd use account-balance instead of slot-value.
I don't know what the expected behavior of make-instance with a struct is. While it seemed to work on my SBCL, you are not using structs correctly.
(defstruct account
  (balance))

(make-account :balance 100)
#S(ACCOUNT :BALANCE 100)

(account-balance *)
100
With classes, you are free to name your accessor functions as you want.
;; (pseudocode)
(defclass bank-account ()
  ((balance :initform nil      ; otherwise it's unbound
            :initarg :balance  ; to use with (make-instance ... :balance ...)
            :accessor balance  ; or account-balance, as you wish
            )))

(make-instance 'bank-account :balance 200)
#<BANK-ACCOUNT {1009302A33}>

(balance *)
200
https://lispcookbook.github.io/cl-cookbook/data-structures.html#structures
http://www.lispworks.com/documentation/HyperSpec/Body/m_defstr.htm
the slot :BALANCE is missing from the object #S(ACCOUNT :BALANCE 1000 :CUSTOMER-NAME "John Doe").
The slot name is actually balance; the printed representation uses the generated initargs. With a class instance, the error message might be less confusing:
When attempting to read the slot's value (slot-value), the slot :BALANCE is missing from the object #<BANK-ACCOUNT {1009302A33}>.
First of all, see Rainer's excellent answer on structures. In summary:
Objects defined with defstruct have named accessor functions, not named slots. Further, the field names of these objects, which are mentioned in the defstruct form, must be distinct as strings, so keywords are completely appropriate for use in constructor functions. Any use of slot-value on such objects is implementation-dependent, and indeed whether or not named slots exist at all is entirely implementation-dependent.
You generally want keyword arguments for constructors for the reasons you want keyword arguments elsewhere: you don't want to have to painfully provide 49 optional arguments just so you can specify the 50th. So it's reasonable that this is the default thing defstruct does. But you can completely override it if you want to, using a BOA constructor, which defstruct allows. You can even have no constructor at all! As an example, here is a rather perverse structure constructor: it does use keyword arguments, but not the ones you might naively expect.
(defstruct (foo
             (:constructor
              make-foo (&key ((:y x) 1) ((:x y) 2))))
  y
  x)
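To see what this does (a sketch; foo-x and foo-y are the accessors defstruct generates by default):

(foo-x (make-foo :y 3))  ; => 3, the :y keyword initializes the X slot
(foo-y (make-foo :y 3))  ; => 2, the Y slot takes its default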
So the real question revolves around classes defined with defclass, which usually do have named slots and where slot-value does work.
So in this case there are really two parts to the answer.
Firstly, as before, keyword arguments are really useful for constructors because no-one wants to have to remember 932 optional argument defaults. But defclass provides complete control over the mapping between keyword arguments and the slots they initialise, or whether they initialise slots at all or instead are passed to some initialize-instance method. You can just do anything you want here.
Secondly, you really want slot names for objects of classes defined with defclass to be symbols which live in packages. You definitely do not want this to happen:
(in-package "MY-PACKAGE")
(use-package "SOMEONE-ELSES-PACKAGE")

(defclass my-class (someone-elses-class)
  ((internal-implementation-slot ...)))
only to discover that you have just modified the definition of the someone-elses-package::internal-implementation-slot slot in someone-elses-class. That would be bad. So slot names are symbols which live in packages and the normal namespace control around packages works for them too: my-package::internal-implementation-slot and someone-elses-package::internal-implementation-slot are not (usually) the same thing.
Additionally, the whole keyword-symbol-argument / non-keyword-symbol-variable thing is, well, let's just say well-established:
(defun foo (&key (x 1))
  ... x ...)
Finally note, of course, that keyword arguments don't actually have to be keywords: it's generally convenient that they are because you need quotes otherwise, but:
(defclass silly ()
  ((foo :initarg foo
        :accessor silly-foo)
   (bar :initarg bar
        :accessor silly-bar)))
And now
> (silly-foo (make-instance 'silly 'bar 3 'foo 9))
9

Is the term immutable variable just a convention?

In Rust variables are immutable by default, i.e., they don't vary but are not constants (as noted here).
Do they retain the name "variable" just by convention, or is there another reason why the term "variable" is maintained?
It should be noted that the term mut in Rust was hotly debated before stabilization, with some arguing that it should be called excl or uniq. The matter is that the mut in let mut x and the mut in &mut x are two completely different things.
let mut x declares that x is mutable, in the sense that it can be re-assigned, but also that one can take a &mut reference to it, which is best called an exclusive or unique reference. It is quite possible in Rust to mutate through a shared reference in some cases (std::cell::Cell, for instance), and not all operations that require an exclusive reference involve mutation. An operation that requires an exclusive reference is simply one that would be unsafe with a shared one; Cell is designed in such a way that it is not, by strictly controlling under what conditions mutation can occur.
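A minimal sketch of that Cell case:

use std::cell::Cell;

fn main() {
    let c = Cell::new(1);         // note: no `mut` needed
    let shared: &Cell<i32> = &c;  // a shared (&) reference
    shared.set(2);                // mutation through the shared reference
    assert_eq!(c.get(), 2);
}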
In theory, the two functions of let mut x could have different keywords, but they are compressed into one for simplicity. Rust could have been designed with mut and excl as different keywords, allowing let excl x: a variable from which one could take an exclusive reference, but which could not itself be mutated.
One can also have variables that are not declared with mut, in particular function parameters. In a signature like fn func(x: u32), x is not mutable, but it is a variable, because a different x can be passed on every call.
The let mut x kind of "mutable" is purely a lint and, in theory, unnecessary for Rust to work: any currently working Rust program will continue to work if all non-mutable variables are made mutable. It is simply considered bad practice, and the compiler warns the programmer whenever a variable is marked mutable unnecessarily; this helps catch unintended bugs. This is absolutely not the case with exclusive and shared references, which must be distinguished and are more than just a lint.
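For instance, this compiles and runs, with a warning:

fn main() {
    let mut x = 1; // rustc warns: variable does not need to be mutable
    println!("{x}");
}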
Here "variable" means "factor involved in computation" not "varying". This is from the mathematical principle where expressions like f(x) include x, a variable, as a part of the equation.
In Rust, like with other languages, you'll need variables (e.g. input) that affects how the program runs, otherwise your program would only ever behave in a singular, specific way, producing the same output each time.
You'll need to think about which values change during processing and which do not. Those that do not need to change do not need to be declared mutable.
Regardless of whether or when they change, they're still considered variables.
In C++ you have things like const int x, which is a constant (read-only) variable, so the term can take on all sorts of specific meanings.
Is the term immutable variable just a convention?
By definition, every definition of a word is a convention. Language and the meanings of words change over time and are unique to every living person; you can take 100 people and end up with 100 different definitions of one word. That is why scientific papers often begin by defining the words that could be misunderstood, trying to clarify as much as possible. Rust is no different; that is why we have The Reference.
It has a specific section for variables:
A variable is a component of a stack frame, either a named function
parameter, an anonymous temporary, or a named local variable.
A local variable (or stack-local allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack
frame.
Local variables are immutable unless declared otherwise. For example:
let mut x = ....
Function parameters are immutable unless declared with mut. The mut
keyword applies only to the following parameter. For example: |mut x,
y| and fn f(mut x: Box<i32>, y: Box<i32>) declare one mutable variable
x and one immutable variable y.
Local variables are not initialized when allocated. Instead, the
entire frame worth of local variables are allocated, on frame-entry,
in an uninitialized state. Subsequent statements within a function may
or may not initialize the local variables. Local variables can be used
only after they have been initialized through all reachable control
flow paths.
So there is not much to add: a variable in Rust is clearly defined. It doesn't matter if your own definition doesn't match, or if you find a definition of "variable" elsewhere that doesn't match Rust's; in the context of Rust, a variable is exactly that. If you want to ask for opinions about this choice, that is off topic, as it is opinion-based. But the wiki definitions make Rust's usage quite standard, from both the mathematical and the computer science points of view:
Variable (computer science), a symbolic name associated with a value and whose associated value may be changed
Variable (mathematics), a symbol that represents a quantity in a mathematical expression, as used in many sciences

When does Clojure remove a variable?

I was looking at the source for memoize.
Coming from languages like C++/Python, this part hit me hard:
(let [mem (atom {})] (fn [& args] (if-let [e (find @mem args)] ...
I realize that memoize returns a function, but for storing state it uses a local "variable" mem. After memoize returns the function, shouldn't that outer let vanish from scope? How can the function still refer to mem?
Why doesn't Clojure delete that outer variable, and how does it manage the variable names? Suppose I make another memoized function; then memoize uses another mem. Doesn't that name clash with the earlier mem?
P.S.: I was thinking that there must be something more happening in there that prevents that, so I wrote myself a simpler version, which goes like http://ideone.com/VZLsJp , but that still works like memoize.
Objects are garbage-collectable if no thread can access them, as is usual for JVM languages. If a thread has a reference to the function returned by memoize, and the function has a reference to the atom in mem, then transitively the atom is still accessible.
After memoize returns the function, shouldn't that outer let vanish from scope? How can the function still refer to mem?
This is what is called a closure. If a function is defined using a name from its environment, it keeps a reference to that value afterwards, even if the defining environment is gone and the function is the only thing that still has access to it.
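A minimal Clojure sketch of both points, the capture and the absence of a name clash:

(defn make-counter []
  (let [state (atom 0)]           ; local binding...
    (fn [] (swap! state inc))))   ; ...captured by the returned closure

(def c1 (make-counter))
(def c2 (make-counter))           ; a fresh, independent state

(c1) ;; => 1
(c1) ;; => 2
(c2) ;; => 1, the two states do not clash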
Suppose I make another memoized function; then memoize uses another mem. Doesn't that name clash with the earlier mem?
No, except possibly by confusing programmers. Having multiple scopes each declare their own name mem is very much possible, and the usual rules of lexical scoping determine which one is meant when mem is read. There are some trickier edge cases, such as
(let [foo 2]
  (let [foo (fn [] foo)] ;; in the function definition, foo has the value from the
                         ;; outer scope, because the second let has not yet bound
                         ;; the name
    (foo)))              ;; => 2
but generally the idea is pretty simple: the value of a name is the one given in the definition closest in the program text to the place where it is used, either in the local scope or in the closest outer scope.
Different invocations of memoize create different closures, so the name mem refers to a different atom in each returned function.

Variable Encapsulation in Case Statement

While modifying an existing program's CASE statement, I had to add a second block in which some logic is repeated to set NetWeaver portal settings. This is done by setting values in a local variable and then assigning that variable to a CHANGING parameter. I copied over the code and did a Pretty Print, expecting the compiler to complain about the unknown variable. To my surprise, however, this code actually compiles just fine:
CASE i_actionid.
  WHEN 'DOMIGO'.
    DATA: ls_portal_actions TYPE powl_follow_up_sty.
    CLEAR ls_portal_actions.
    ls_portal_actions-bo_system = 'SAP_ECC_Common'.
    " [...]
    c_portal_actions = ls_portal_actions.
  WHEN 'EBELN'.
    ls_portal_actions-bo_system = 'SAP_ECC_Common'.
    " [...]
    C_PORTAL_ACTIONS = ls_portal_actions.
ENDCASE.
As far as I have seen in every other programming language, the DATA: declaration in the first WHEN block should be encapsulated and available only inside that block. Does SAP ignore this encapsulation and make the variable available in the entire CASE statement? Is this documented anywhere?
Note that this code compiles just fine, and double-clicking the local variable in the second WHEN block takes me to the data declaration in the first. I have, however, not been able to test that this code executes properly, as our testing environment is down.
In short, you cannot do this. You have the following scopes in an ABAP program within which to declare variables (from local to global):
Form routine: all variables between FORM and ENDFORM
Method: all variables between METHOD and ENDMETHOD
Class: all variables between CLASS and ENDCLASS, but only in the CLASS DEFINITION section
Function module: all variables between FUNCTION and ENDFUNCTION
Program/global: anything not in one of the above is global in the current program, including variables in PBO and PAI modules
Having the ability to define variables locally in a for loop or an if would be really useful, but unfortunately it is not possible in ABAP. The closest you will come to publicly available documentation on this is on help.sap.com: Local Data in the Subroutine
As for the compile process, do not assume that ABAP will optimize away any variables you do not use: it won't. Use the Code Inspector to find and remove them yourself. Since ABAP works the way it does, I personally define all my variables at the start of a modularization unit rather than inline with other code, and I have gone so far as to modify the Pretty Printer to move any inline definitions to the top of the current scope.
Your assumption that a CASE statement defines its own variable scope in ABAP is simply wrong (and it would be wrong for a number of other programming languages as well). It's a bad idea to litter your code with variable declarations, because that makes it awfully hard to read and maintain, but it is possible. DATA statements, like many other declarative statements, are evaluated only at compile time and are completely ignored at runtime. You can find more information about scopes in the online documentation.
Inline variable declarations are now possible with the newest version of SAP NetWeaver. Here is the link to the documentation: DATA - inline declaration. There are also some guidelines on good and bad usage of this new feature.
Here is a quote from this site:
A declaration expression with the declaration operator DATA declares a variable var used as an operand in the current writer position. The declared variable is visible statically in the program from DATA(var) and is valid in the current context. The declaration is made when the program is compiled, regardless of whether the statement is actually executed.
I personally have not had time to check it out yet, for lack of access to such a system.
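For readers on a newer system, a minimal sketch of the inline form (lt_flights is a hypothetical internal table, assumed to be declared elsewhere):

" Classic form: declaration and first assignment are separate statements.
DATA lv_count TYPE i.
lv_count = lines( lt_flights ).

" Inline form: DATA(lv_count2) declares the variable at its first write.
DATA(lv_count2) = lines( lt_flights ).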

What is a programming language with dynamic scope and static typing?

I know the language exists, but I can't put my finger on it: a language with dynamic scope and static typing?
We can try to reason about what such a language might look like. Obviously something like this (using a C-like syntax for demonstration purposes) cannot be allowed, or at least not with the obvious meaning:
int x_plus_(int y) {
    return x + y; // requires that x have type int
}

int three_plus_(int y) {
    double x = 3.0;
    return x_plus_(y); // calls x_plus_ when x has type double
}
So, how to avoid this?
I can think of a few approaches offhand:
Commenters above mention that Fortran pre-'77 had this behavior. That worked because a variable's name determined its type; a function like x_plus_ above would be illegal, because x could never have an integer type. (And likewise one like three_plus_, for that matter, because y would have the same restriction.) Integer variables had to have names beginning with i, j, k, l, m, or n.
Perl uses syntax to distinguish a few broad categories of variables, namely scalars vs. arrays (regular arrays) vs. hashes (associative arrays). Variables belonging to the different categories can have the exact same name, because the syntax distinguishes which one is meant. For example, the expression foo $foo, $foo[0], $foo{'foo'} involves the function foo, the scalar $foo, the array @foo ($foo[0] being the first element of @foo), and the hash %foo ($foo{'foo'} being the value in %foo corresponding to the key 'foo'). Now, to be quite clear, Perl is not statically typed, because there are many different scalar types, and these types are not distinguished syntactically. (In particular: all references are scalars, even references to functions or arrays or hashes. So if you use the syntax to dereference a reference to an array, Perl has to check at runtime to see if the value really is an array-reference.) But this same approach could be used for a bona fide type system, especially if the type system were a very simple one. With that approach, the x_plus_ function would be using an x of type int, and would completely ignore the x declared by three_plus_. (Instead, it would use an x of type int that had to be provided from whatever scope called three_plus_.) This could either require some type annotations not included above, or it could use some form of type inference.
A function's signature could indicate the non-local variables it uses, and their expected types. In the above example, x_plus_ would have the signature "takes one argument of type int; uses a calling-scope x of type int; returns a value of type int". Then, just like how a function that calls x_plus_ would have to pass in an argument of type int, it would also have to provide a variable named x of type int — either by declaring it itself, or by inheriting that part of the type-signature (since calling x_plus_ is equivalent to using an x of type int) and propagating this requirement up to its callers. With this approach, the three_plus_ function above would be illegal, because it would violate the signature of the x_plus_ method it invokes — just the same as if it tried to pass a double as its argument.
The above could just have "undefined behavior"; the compiler wouldn't have to explicitly detect and reject it, but the spec wouldn't impose any particular requirements on how it had to handle it. It would be the responsibility of programmers to ensure that they never invoke a function with incorrectly-typed non-local variables.
Your professor was presumably thinking of #1, since pre-'77 Fortran was an actual real-world language with this property. But the other approaches are interesting to think about. :-)
I haven't seen it written down elsewhere, but the AXIOM CAS (and various forks, including FriCAS, which is still actively developed) uses a scripting language called SPAD with both a very novel strong static dependent type system and dynamic scoping (although the latter is possibly an unintended implementation bug).
Most of the time users won't realize that, but when they start trying to build closures as in other functional languages, its dynamically scoped nature is revealed:
FriCAS Computer Algebra System
Version: FriCAS 2021-03-06
Timestamp: Mon May 17 10:43:08 CST 2021
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> foo (x,y) == x + y
Type: Void
(2) -> foo (1,2)
Compiling function foo with type (PositiveInteger, PositiveInteger)
-> PositiveInteger
(2) 3
Type: PositiveInteger
(3) -> foo
(3) foo (x, y) == x + y
Type: FunctionCalled(foo)
(4) -> bar x y == x + y
Type: Void
(5) -> bar
(5) bar x == y +-> x + y
Type: FunctionCalled(bar)
(6) -> (bar 1)
Compiling function bar with type PositiveInteger ->
AnonymousFunction
(6) y +-> #1 + y
Type: AnonymousFunction
(7) -> ((bar 1) 2)
(7) #1 + 2
Type: Polynomial(Integer)
Such behavior is similar to what happens when one tries to build a closure using (lambda (x) (lambda (y) (+ x y))) in a dynamically scoped Lisp, such as Emacs Lisp. In fact, the underlying representation of functions is essentially the same as in the early days of Lisp, since AXIOM was first developed on top of an early Lisp implementation on an IBM mainframe.
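A sketch of that failure in Emacs Lisp with dynamic binding (the default when lexical-binding is nil):

;; -*- lexical-binding: nil -*-
(defun make-adder (x)
  (lambda (y) (+ x y)))  ; x is looked up dynamically at call time

(funcall (make-adder 1) 2)
;; => error: (void-variable x), because the dynamic binding of x
;;    was unwound when make-adder returned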
I believe it is, however, a defect (like what happened when JMC implemented the first version of the LISP language), because the implementor made the parser perform uncurrying, as in the function definition of bar, but this is unlikely to be useful without the ability to build closures in the language.
It is also worth noticing that SPAD automatically renames the variables in anonymous functions to avoid capture, so its dynamic scoping could be used as a feature, as in other Lisps.
Dynamic scope means that the variable, and its type, at a specific line of your code depend on the functions called before it. This means you cannot know the type at a specific line of your code, because you cannot know which code has been executed before.
Static typing means that you have to know the type at every line of your code before the code starts to run.
This is irreconcilable.