In lambda calculus, can variable be expression in general? - variables

For better understanding of functional programming, I am reading the wiki page for lambda calculus here.
The definition says:
If x is a variable and M ∈ Λ, then (λx.M) ∈ Λ
Intuitively I thought variable are / represented by single-letter id's. But since here we deal with strict math definitions, I just want to double confirm this understanding: in general, can expression be classified as variable?
e.g. if x is a variable, is expression (x + x) a variable in lambda calculus? i.e. is it ok to write (λ(x+x).M) as an lambda calculus abstraction?
(Concern is in some context this is true. e.g. Here: An expression such as 4x^3 is a variable)

No, (x + x) is no variable (indeed it's not even a expression in naive lambda calculus).
I think you mix the terms variables and expressions somehow (or want some kind of pattern-matching?).
So let's follow the core-definition of lambda-calculus and expressions:
The definition itself is not that hard (indeed you linked it yourself with the wiki-page).
It's mentioned right from the start:
you have a set of variables V: (v_1, v_2, ...) (of course you can name them as you want - it's only important that you remmber that these are considered different symbols in your calculus)
the symbols λ, ., ( and )
This is it - thats all of the "Tokens" for this grammar/calculus.
Now there are a couple of rules how you can form Expressions from these:
each Variable is a expression
Abstraction: if E is a expression and x is a Variable then (λx.E) is a expression (here x and E are templates or Metavariables - you have to fill them with some real Expression to make this an Expression!)
Application: if A and B are expressions than (A B) is a expression.
So possible expressions are:
v_50
(λv_4.v_5)
((λv_4.v_5) v_50)
....
This is all when it comes to expressions.
You see: if you don't allow (x+x) as a symbol or name for a variable from the start it can never be a variable - indeed no expression is a variable even if there are some expressions consisting only of one said variable - if you called something expression it will never be a variable (again) ;)
PS: of course there are a couple of conventions to keep the parentheses a bit down - but for a start you don't need those.

Related

Is it possible to preserve variable names when writing and reading term programatically?

I'm trying to write an SWI-Prolog predicate that applies numbervars/3 to a term's anonymous variables but preserves the user-supplied names of its non-anonymous variables. I eventually plan on adding some kind of hook to term_expansion (or something like that).
Example of desired output:
?- TestList=[X,Y,Z,_,_].
> TestList=[X,Y,Z,A,B].
This answer to the question Converting Terms to Atoms preserving variable names in YAP prolog shows how to use read_term to obtain as atoms the names of the variables used in a term. This list (in the form [X='X',Y='Y',...]) does not contain the anonymous variables, unlike the variable list obtained by term_variables, making isolation of the anonymous variables fairly straightforward.
However, the usefulness of this great feature is somewhat limited if it can only be applied to terms read directly from the terminal. I noticed that all of the examples in the answer involve direct user input of the term. Is it possible to get (as atoms) the variable names for terms that are not obtained through direct user input? That is, is there some way to 'write' a term (preserving variable names) to some invisible stream and then 'read' it as if it were input from the terminal?
Alternatively... Perhaps this is more of a LaTeX-ish line of thinking, but is there some way to "wrap" variables inside single quotes (thereby atom-ifying them) before Prolog expands/tries to unify them as variables, with the end result that they're treated as atoms that start with uppercase letters rather than as variables?
You can use the ISO core standard variable_names/1 read and write option. Here is some example code, that replaces anonymous variables in a variable name mapping:
% replace_anon(+Map, +Map, -Map)
replace_anon([_=V|M], S, ['_'=V|N]) :- member(_=W, S), W==V, !,
replace_anon(M, S, N).
replace_anon([A=V|M], S, [A=V|N]) :-
replace_anon(M, S, N).
replace_anon([], _, []).
variable_names/1 is ISO core standard. It was always a read option. It then became a write option as well. See also: https://www.complang.tuwien.ac.at/ulrich/iso-prolog/WDCor3
Here is an example run:
Welcome to SWI-Prolog (threaded, 64 bits, version 7.7.25)
?- read_term(X,[variable_names(M),singletons(S)]),
replace_anon(M,S,N),
write_term(X,[variable_names(N)]).
|: p(X,Y,X).
p(X,_,X)
To use the old numbervars/3 is not recommended, since its not compatible with attribute variables. You cannot use it for example in the presence of CLP(FD).
Is it possible to get (as atoms) the variable names for terms that are not obtained through direct user input?
if you want to get variable names from source files you should read them from there.
The easiest way to do so using term expansion.
Solution:
read_term_from_atom(+Atom, -Term, +Options)
Use read_term/3 to read the next term from Atom.
Atom is either an atom or a string object.
It is not required for Atom to end with a full-stop.
Use Atom as input to read_term/2 using the option variable_names and return the read term in Term and the variable bindings in variable_names(Bindings).
Bindings is a list of Name = Var couples, thus providing access to the actual variable names. See also read_term/2.
If Atom has no valid syntax, a syntax_error exception is raised.
write_term( Term ) :-
numbervars(Term, 0, End),
write_canonical(Term), nl.

What does the operator := mean? [duplicate]

I've seen := used in several code samples, but never with an accompanying explanation. It's not exactly possible to google its use without knowing the proper name for it.
What does it do?
http://en.wikipedia.org/wiki/Equals_sign#In_computer_programming
In computer programming languages, the equals sign typically denotes either a boolean operator to test equality of values (e.g. as in Pascal or Eiffel), which is consistent with the symbol's usage in mathematics, or an assignment operator (e.g. as in C-like languages). Languages making the former choice often use a colon-equals (:=) or ≔ to denote their assignment operator. Languages making the latter choice often use a double equals sign (==) to denote their boolean equality operator.
Note: I found this by searching for colon equals operator
It's the assignment operator in Pascal and is often used in proofs and pseudo-code. It's the same thing as = in C-dialect languages.
Historically, computer science papers used = for equality comparisons and ← for assignments. Pascal used := to stand in for the hard-to-type left arrow. C went a different direction and instead decided on the = and == operators.
In the statically typed language Go := is initialization and assignment in one step. It is done to allow for interpreted-like creation of variables in a compiled language.
// Creates and assigns
answer := 42
// Creates and assigns
var answer = 42
Another interpretation from outside the world of programming languages comes from Wolfram Mathworld, et al:
If A and B are equal by definition (i.e., A is defined as B), then this is written symbolically as A=B, A:=B, or sometimes A≜B.
■ http://mathworld.wolfram.com/Defined.html
■ https://math.stackexchange.com/questions/182101/appropriate-notation-equiv-versus
Some language uses := to act as the assignment operator.
In a lot of CS books, it's used as the assignment operator, to differentiate from the equality operator =. In a lot of high level languages, though, assignment is = and equality is ==.
This is old (pascal) syntax for the assignment operator. It would be used like so:
a := 45;
It may be in other languages as well, probably in a similar use.
A number of programming languages, most notably Pascal and Ada, use a colon immediately followed by an equals sign (:=) as the assignment operator, to distinguish it from a single equals which is an equality test (C instead used a single equals as assignment, and a double equals as the equality test).
Reference: Colon (punctuation).
In Python:
Named Expressions (NAME := expr) was introduced in Python 3.8. It allows for the assignment of variables within an expression that is currently being evaluated. The colon equals operator := is sometimes called the walrus operator because, well, it looks like a walrus emoticon.
For example:
if any((comment := line).startswith('#') for line in lines):
print(f"First comment: {comment}")
else:
print("There are no comments")
This would be invalid if you swapped the := for =. Note the additional parentheses surrounding the named expression. Another example:
# Compute partial sums in a list comprehension
total = 0
values = [1, 2, 3, 4, 5]
partial_sums = [total := total + v for v in values]
# [1, 3, 6, 10, 15]
print(f"Total: {total}") # Total: 15
Note that the variable total is not local to the comprehension (so too is comment from the first example). The NAME in a named expression cannot be a local variable within an expression, so, for example, [i := 0 for i, j in stuff] would be invalid, because i is local to the list comprehension.
I've taken examples from the PEP 572 document - it's a good read! I for one am looking forward to using Named Expressions, once my company upgrades from Python 3.6. Hope this was helpful!
Sources: Towards Data Science Article and PEP 572.
It's like an arrow without using a less-than symbol <= so like everybody already said "assignment" operator. Bringing clarity to what is being set to where as opposed to the logical operator of equivalence.
In Mathematics it is like equals but A := B means A is defined as B, a triple bar equals can be used to say it's similar and equal by definition but not always the same thing.
Anyway I point to these other references that were probably in the minds of those that invented it, but it's really just that plane equals and less that equals were taken (or potentially easily confused with =<) and something new to define assignment was needed and that made the most sense.
Historical References: I first saw this in SmallTalk the original Object Language, of which SJ of Apple only copied the Windows part of and BG of Microsoft watered down from them further (single threaded). Eventually SJ in NeXT took the second more important lesson from Xerox PARC in, which became Objective C.
Well anyway they just took colon-equals assiment operator from ALGOL 1958 which was later popularized by Pascal
https://en.wikipedia.org/wiki/PARC_(company)
https://en.wikipedia.org/wiki/Assignment_(computer_science)
Assignments typically allow a variable to hold different values at
different times during its life-span and scope. However, some
languages (primarily strictly functional) do not allow that kind of
"destructive" reassignment, as it might imply changes of non-local
state.
The purpose is to enforce referential transparency, i.e. functions
that do not depend on the state of some variable(s), but produce the
same results for a given set of parametric inputs at any point in
time.
https://en.wikipedia.org/wiki/Referential_transparency
For VB.net,
a constructor (for this case, Me = this in Java):
Public ABC(int A, int B, int C){
Me.A = A;
Me.B = B;
Me.C = C;
}
when you create that object:
new ABC(C:=1, A:=2, B:=3)
Then, regardless of the order of the parameters, that ABC object has A=2, B=3, C=1
So, ya, very good practice for others to read your code effectively
Colon-equals was used in Algol and its descendants such as Pascal and Ada because it is as close as ASCII gets to a left-arrow symbol.
The strange convention of using equals for assignment and double-equals for comparison was started with the C language.
In Prolog, there is no distinction between assignment and the equality test.

Fast lookup of tree with placeholders?

For an application I'm considering, there would be a large (100,000+) 'database' of trees (think expressions in a programming language, or S-expressions), and I would need to query that database for expressions that match a specific given expression.
Before giving the details of what I'd like to have, note that I'd appreciate any information related to indexing a large set of trees for optimizing lookup by a subtree.
In my specific situation (which would be for a backend to be used by Metamath proof assistants), expressions have the following structure (in Haskell-like notation):
data Expression = Placeholder Id | VarName Id | ConstName Id [Expression]
or as a BNF for an S-expression form:
Expression = '?' Id | Id | '(' Id Expression* ')'
where Id is some kind of identifier.
For example, I could have a database with expressions like
(equiv ?ph ?ps)
(not (in (appl (sqrt) (2)) (Q)))
(equiv (eq ?A ?B) (forall ?x (equiv (in ?x ?A) (in ?x ?B))))
In this context, two expressions match if they can be made equal by substitution of expressions for placeholders. So looking up (equiv (eq A (emptyset)) ?ph) in the above mini-database would result in the first and last expressions.
So again: how would I implement fast lookups in a large set of (expression) trees with placeholders? What kind of index data structure could I use?
I would implement the lookup with a trie. Each key would consist of one of the following:
ConstName Identifier
Variable w/ context info
ConstValue
Placeholder
These should be ordered in some fashion- possibly Placeholder, then all ConstNames (alphabetical), then variables (scope ordering, then argument order), then ConstValues (numerical order). As long as there's a concrete ordering for usage in the trie, you're fine.
Traverse the expression's tree, injecting the appropriate keys into the trie as they are encountered. Do this for all the expressions you want to insert into your data structure. When it comes time to query it, you can traverse the trie in a similar fashion, but with a few new rules.
Everything matches a placeholder node. If it matches some other key as well, then you'll need to explore both branches (easily done via a recursive DFS-like approach).
A placeholder matches everything. This is not equivalent to the previous point- we are talking about placeholders in the query here, the previous bullet is regarding placeholders as trie keys.
Now, this does mean that the search space can somewhat "explode" as you encounter placeholders, but there is one thing you can do to try to mitigate this in practice. Traverse the expression's tree in a breadth-first fashion (both in construction of the trie, and querying). This means if one of the arguments is a placeholder, you won't have to full-depth search every single subtree that matches that expression so far- instead you jump ahead to the next argument- which may not be a placeholder, and will thus greatly prune the search space (compared to matching "everything").
For completeness sake, lets take one of your examples
(not (in (appl (sqrt) (2)) (Q)))
and make a trie entry from that-
not -> in -> apply -> "Q" -> sqrt -> 2
adding (not (in ?ph E)) to this would result in-
not -> in -> apply -> "Q" -> sqrt -> 2
\-> ?ph -> "E"
Continue in this fashion injecting expressions into the trie. Also traverse in this fashion for querying until you reach the ends of your searches into the trie, and return those that matched.
Note- the uniqueness of these entries is based on the assumption you do not have to support variadic functions. If you do, attach to each key some context info (read the next paragraphs for info on how to do this) to distinguish which arguments go to which functions
There is one detail I glossed over- variables. If you only want it to match if they are the exact same variable name, then no work is necessary. But this likely isn't what you want; you probably want it to match generic variables as long as they are "consistent" with each other. The way to do this is to assign each variable an identifier that represents the scope of which it was first defined.
The easiest way to do this is just compose an identifier from the concatenation of the argument ordering of its ancestors. That is, if a variable is first defined as the second argument to a function which is the fifth argument to the root function, then we might label it as (5, 2) or (2, 5), whichever makes more sense intuitively. Either way, this will ensure the variable is given a consistent identifier regardless of other variables / functions elsewhere. Then proceed as normal with this new variable name.

How do the OCaml operators < and > work with non-integer types?

I'm curious how the greater than (>) and less than (<) operators work with types that are not int, float, or double in OCaml.
For instance, I was able to discover that string "a" > "b" but is there some reference that lists the conventions for all non-numerical data types. Furthermore, how do these operators work across types? e.g. Is "a" > true or is "a" < true?
Finally, how would these work across a user-defined data type?
Thanks!
The OCaml <, >, <=, >= operators only work with two values of the same type, so the expression "a" > true is invalid. However, they work for all types (with caveats below). You can find the definitions of these operators in the Stdlib module (formerly known as Pervasives).
The order for these operators is defined only for simple values (integers, characters, strings, byte sequences, and floating). In these cases the documentation says they give "the usual ordering".
The usual ordering for strings and byte sequences is lexicographic order. For strings, case is significant.
For compound values the order is only guaranteed to be consistent with = and to be a consistent ordering.
As far as I can see, order is not defined for simple user-defined types like type abc = A | B | C. I didn't expect this to be the case, but that's what I see in the documentation. In practice, the values of constant constructors like A, B, C, will be ordered in the order of declaration with the first value the smallest.
I also don't see a definition of the order between false and true. Again, this is surprising. In practice, false is less than true.
It is worth noting that comparisons between cyclic values is not guaranteed to terminate. Also, comparison between values that contain functions may raise an exception. These can cause unexpected problems, sometimes serious ones.
$ ocaml
OCaml version 4.02.1
# (+) < (+);;
Exception: Invalid_argument "equal: functional value".
# let rec cycle = 1 :: cycle;;
val cycle : int list = [1; <cycle>]
# cycle < cycle;;
(( Does not terminate ))

Handling expressions in GMP

I introduced myself to the GMP library for high precision arithmetic recently. It seems easy enough to use but in my first program I am running into practical problems. How are expressions to be evaluated. For instance, if I have "1+8*z^2" and z is a mpz_t "large integer" variable, how am I to quickly evaluate this? (I have larger expressions in the program that I am writing.) Currently, I am doing every single operation manually and storing the results in temporary variables like this for the "1+8*z^2" expression:
1) first do mpt_mul(z,z,z) to square z
2) then define an mpz_t variable called "eight" with the value 8.
3) multiply the result from step one by this 8 and store in temp variable.
4) define mpz_t variable called "one" with value 1.
5) add this to the result in step 3 to find final answer.
Is this what I am supposed to be doing? Or is there a better way? It would really help if there was a user's manual for GMP to get people started but there's only the reference manual.
GMP comes with a C++ class interface which provides a more straightforward way of expressing arithmetic expressions. This interface uses C++ operator overloading to allow you to write:
mpz_class z;
1 + 8 * z**2
This is, of course, assuming you're using C++. If you are using C only, you may need to use the C interface to GMP which does not provide operator overloading.
Turns out that there's an unsupported expression parser distributed with GMP in a "expr" subdirectory. It's not part of GMP proper and is subject to change but it is discussed in a README file in that directory. It isn't guaranteed to do the calculation in the fastest way possible, so buyer beware.
So the user must manually evaluate all expressions when using GMP unless they wish to use this library or make their own expression parser.