Initialize variable in OCaml

How does one do the equivalent of int variable; in OCaml? That is, how does one simply declare a variable? According to the OCaml manual, it seems as if one can only declare and initialize a variable in one step. If so, why would that be desired behavior?

Variables in OCaml are immutable, and they must be initialized when declared.
The main reason for this is that uninitialized variables are a source of mistakes:
int x; // not initialized
read_and_use(x); // undefined behavior: x holds garbage
By making sure that your variables are always initialized, you guarantee that no garbage value can appear anywhere in your code.
The other point is immutability (which comes with declarative let bindings):
let x = 4;; (* Declare x *)
let f y = x + y;; (* Use x *)
let x = 5;; (* Declare a new variable that shadows the old x *)
assert (f 10 = 14);; (* f still uses the x = 4 binding it captured *)
Since variables are immutable, declaring one without initializing it would give you a variable whose value is invalid forever. And that's pretty useless.
The fact that variables in OCaml (and most functional languages) are set once and only once may seem odd at first, but it doesn't reduce the language's expressiveness, and it helps make your code clear and safe.
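When you genuinely do need mutation, OCaml makes it explicit and opt-in through ref cells. A minimal sketch:
let counter = ref 0;; (* a ref is an immutable binding to a mutable cell *)
counter := !counter + 1;; (* := assigns into the cell, ! dereferences it *)
assert (!counter = 1);; (* the cell now holds 1 *)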

TL;DR
Simply put: you don't need to declare the types of your functions and variables, because OCaml will just figure them out for you! let x = 3;;
OCaml uses type inference, which means the compiler infers a variable's type from what you assign to it.
Type inference is the ability to automatically deduce, either partially or fully, the type of an expression at compile time. The compiler is often able to infer the type of a variable or the type signature of a function, without explicit type annotations having been given. In many cases, it is possible to omit type annotations from a program completely if the type inference system is robust enough, or the program or language is simple enough.
It's used because it takes the housekeeping out of variable creation: you don't need to spell out what is obvious, and the compiler takes care of it for you. It also pushes you to better understand how your code uses the values it binds.
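As an illustration (assuming only the standard operators), the compiler infers full function signatures from how the arguments are used:
let add a b = a + b;; (* inferred: val add : int -> int -> int *)
let greet name = "hi " ^ name;; (* inferred: val greet : string -> string *)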

Related

Is the term immutable variable just a convention?

In Rust, variables are immutable by default, i.e., they don't vary, but they are also not constants.
Do they retain the name "variable" just by convention, or is there another reason why the term "variable" is maintained?
It should be noted that the term mut in Rust was hotly debated before stabilization, with some arguing that it should be called excl or uniq. The issue is that the mut in let mut x and the mut in &mut x are two completely different things.
let mut x declares that x is mutable, in the sense that it can be re-assigned, but also that one can take a &mut reference of it; which is best called an exclusive or unique reference. It is quite possible in Rust in some cases to mutate through a shared reference in the case of std::cell::Cell, for instance, and not all operations that require an exclusive reference involve mutation. An operation that requires an exclusive reference is simply one that would be unsafe with a shared one; Cell is designed in such a way that it is not, by strictly controlling under what conditions mutation can occur.
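A minimal sketch of that Cell case (safe Rust, standard library only):
use std::cell::Cell;
fn main() {
    let c = Cell::new(1); // note: not declared mut
    let shared: &Cell<i32> = &c; // a shared, not exclusive, reference
    shared.set(2); // interior mutability through the shared reference
    assert_eq!(c.get(), 2);
}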
In theory, the two functions of let mut x could have had different keywords: Rust could have been designed with mut and excl as separate keywords, allowing let excl x, a variable from which one could take an exclusive reference but which could not be reassigned. They were compressed into one keyword for simplicity.
One can also have variables that are not declared with mut, in particular function parameters. In a signature like fn func(x: u32), x is not mutable, but it is variable, because a different x can be passed on every call.
The let mut x type of "mutable" is purely a lint and, in theory, unnecessary for Rust to work: any currently working Rust program will continue to work if all non-mutable variables are made mutable. It's simply considered bad practice, and the compiler warns the programmer whenever a variable is marked mutable without needing to be; this helps catch unintended bugs. This is absolutely not the case with exclusive and shared references, which genuinely need to be distinguished and are more than just a lint.
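To make the "just a lint" point concrete, this compiles and runs; the compiler only emits an unused_mut warning about the needless mut:
fn main() {
    let mut x = 5; // warning: variable does not need to be mutable
    println!("{x}");
}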
Here "variable" means "factor involved in computation" not "varying". This is from the mathematical principle where expressions like f(x) include x, a variable, as a part of the equation.
In Rust, as in other languages, you need variables (e.g. input) that affect how the program runs; otherwise your program would only ever behave in one specific way, producing the same output every time.
You'll need to think of what variables change during processing and which do not. Those that do not need to change do not need to be declared mutable.
Regardless of whether or when they change, they're still considered variables.
In C++ you'll have things like const int x which is a constant (read-only) variable, so the term can take on all sorts of specific meanings.
Is the term immutable variable just a convention?
By definition, every definition of a word is a convention. Language changes over time, and the meaning of a word is unique to every speaker; take 100 people and you will end up with 100 different definitions of one word. That is why scientific papers often begin by defining the terms that could be misunderstood, trying to be as clear as possible. Rust is no different, which is why we have The Reference.
It has a specific section on variables:
A variable is a component of a stack frame, either a named function
parameter, an anonymous temporary, or a named local variable.
A local variable (or stack-local allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack
frame.
Local variables are immutable unless declared otherwise. For example:
let mut x = ....
Function parameters are immutable unless declared with mut. The mut
keyword applies only to the following parameter. For example: |mut x,
y| and fn f(mut x: Box<i32>, y: Box<i32>) declare one mutable variable
x and one immutable variable y.
Local variables are not initialized when allocated. Instead, the
entire frame worth of local variables are allocated, on frame-entry,
in an uninitialized state. Subsequent statements within a function may
or may not initialize the local variables. Local variables can be used
only after they have been initialized through all reachable control
flow paths.
So there is not much to add: "variable" in Rust is clearly defined. It doesn't matter if your own definition doesn't match, or if you find a definition of "variable" elsewhere that doesn't match Rust's; in the context of Rust, a variable is exactly that. If you want to ask for opinions about this choice, that's off-topic as opinion-oriented. In any case, the Wikipedia definitions make Rust's usage quite standard, from both the mathematical and the computer-science point of view:
Variable (computer science), a symbolic name associated with a value and whose associated value may be changed
Variable (mathematics), a symbol that represents a quantity in a mathematical expression, as used in many sciences

Are dynamic types slower in Dart?

I have been wondering if dynamic types are slower in Dart.
Example given:
final dynamic example = "Example";
versus
final String example = "Example";
Yes, using dynamically typed variables in Dart is often slower than using variables typed with an actual type.
However, your example is not using dynamic as type, it is using type inference to infer the String type. That might cost a little extra at compile-time, but at run-time, your two code examples are completely identical. Both variables are typed as String.
A dynamic method invocation may be slower because the run-time system must add extra checks to ensure that the variable can do the things you are trying to do with it.
If you have int x = 2; print(x + 3); the run-time system knows that int has a + operator, and even knows what it is.
If you write dynamic x = 2; print(x + 3);, the run-time system must first check whether x has a + operator before it can call it, and find that operator's definition on the object before calling it. It might not always be slower, some cases optimize better than others, but it can never be faster.
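A minimal sketch of the difference (plain Dart, no packages assumed):
void main() {
  int a = 2; // static: the compiler knows int has a + operator
  dynamic b = 2; // dynamic: the check is deferred to run time
  print(a + 3); // resolved statically
  print(b + 3); // member lookup and check happen at run time
}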
Not all code is performance sensitive, and not all variables can be typed. If you have a variable that holds either a String or a List, and you want to know the length, just writing stringOrList.length is more convenient than stringOrList is String ? stringOrList.length : (stringOrList as List).length. It may be slower depending on the compiler and the target platform.
Well, in your first example (heh), example is inferred to be of type String, not dynamic, so how could it be slower? The style guide even recommends not adding redundant types to variables that can be inferred correctly.

Alternative syntax to __block?

I have a question on the syntax of __block variables. I know you can use __block on a variable in scope so it's not read-only inside the block. However, in one spot in the Apple docs, I saw an alternative:
"Variables in the defining scope are read-only by default when used in a block. If you need to change the value of such a variable, you can use a special syntax:
int count = 0;
float cumulativeValue = 0.0;
UpdateElements( a, N, ^(float element){
    |count, cumulativeValue|
    float value = factor * element;
    ++count;
    cumulativeValue += value;
    return value;
} );
In this example, count and cumulativeValue are modified inside the block, so they are included in comma-separated list of shared variables at the beginning of the block scope.
This syntax seems much cleaner, and I assume you could then modify variables you did not declare but that are still in scope. However, I haven't seen this anywhere else, and the Xcode compiler does not like my basic block. Is this legitimate syntax?
Wow. Haven't seen that syntax in a long time.
That was one of the various syntactic structures explored during the development of blocks. It was eventually rejected because it was too imprecise in declaring intent and the resulting behavior would have been confusing.
Consider a scope with three blocks, two of which declare a variable as readwrite via |a|. There would be no way of knowing from the int a = 5; declaration at the top of the scope that the variable's value is readwrite in some of the blocks' scopes.
As well, it would have made the compiler implementation significantly more difficult. The tradition in C is that a variable's storage type is fixed at the time of declaration. Supporting this syntax would have broken that expectation.
Thus, it was decided to use a storage type modifier akin to volatile or static. __block was used primarily because the __ prefix greatly reduces the amount of code that would break by adding a bare keyword.
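For contrast, a minimal sketch of the syntax that shipped (assuming only Foundation, for the logging):
#import <Foundation/Foundation.h>
int main(void) {
    __block int count = 0; // read-write inside the block
    void (^increment)(void) = ^{
        ++count; // without __block this line would not compile
    };
    increment();
    NSLog(@"count = %d", count); // prints: count = 1
    return 0;
}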
Thanks for asking this. Bug filed and that documentation will be fixed and/or removed eventually.
The | | syntax was inspired by Smalltalk, as was, of course, the term "block".
As bbum points out, marking the decl site is more honest w.r.t. non-block usage and far more in line with C when modeled, as it ended up, as a new (C) object "duration".
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf

What is a programming language with dynamic scope and static typing?

I know the language exists, but I can't put my finger on it.
dynamic scope
and
static typing?
We can try to reason about what such a language might look like. Obviously something like this (using a C-like syntax for demonstration purposes) cannot be allowed, or at least not with the obvious meaning:
int x_plus_(int y) {
return x + y; // requires that x have type int
}
int three_plus_(int y) {
double x = 3.0;
return x_plus_(y); // calls x_plus_ when x has type double
}
So, how to avoid this?
I can think of a few approaches offhand:
Commenters above mention that Fortran pre-'77 had this behavior. That worked because a variable's name determined its type; a function like x_plus_ above would be illegal, because x could never have an integer type. (And likewise one like three_plus_, for that matter, because y would have the same restriction.) Integer variables had to have names beginning with i, j, k, l, m, or n.
Perl uses syntax to distinguish a few broad categories of variables, namely scalars vs. arrays (regular arrays) vs. hashes (associative arrays). Variables belonging to the different categories can have the exact same name, because the syntax distinguishes which one is meant. For example, the expression foo $foo, $foo[0], $foo{'foo'} involves the function foo, the scalar $foo, the array @foo ($foo[0] being the first element of @foo), and the hash %foo ($foo{'foo'} being the value in %foo corresponding to the key 'foo'). Now, to be quite clear, Perl is not statically typed, because there are many different scalar types, and those types are not distinguished syntactically. (In particular: all references are scalars, even references to functions or arrays or hashes. So if you use the syntax to dereference a reference to an array, Perl has to check at runtime to see if the value really is an array-reference.) But this same approach could be used for a bona fide type system, especially if the type system were a very simple one. With that approach, the x_plus_ method would be using an x of type int, and would completely ignore the x declared by three_plus_. (Instead, it would use an x of type int that had to be provided from whatever scope called three_plus_.) This could either require some type annotations not included above, or it could use some form of type inference.
A function's signature could indicate the non-local variables it uses, and their expected types. In the above example, x_plus_ would have the signature "takes one argument of type int; uses a calling-scope x of type int; returns a value of type int". Then, just as a function that calls x_plus_ has to pass in an argument of type int, it would also have to provide a variable named x of type int, either by declaring it itself, or by inheriting that part of the type-signature (since calling x_plus_ is equivalent to using an x of type int) and propagating the requirement up to its callers. With this approach, the three_plus_ function above would be illegal, because it would violate the signature of the x_plus_ method it invokes, just the same as if it tried to pass a double as its argument. (GHC Haskell implements something very close to this; see the sketch at the end of this answer.)
The above could just have "undefined behavior"; the compiler wouldn't have to explicitly detect and reject it, but the spec wouldn't impose any particular requirements on how it had to handle it. It would be the responsibility of programmers to ensure that they never invoke a function with incorrectly-typed non-local variables.
Your professor was presumably thinking of #1, since pre-'77 Fortran was an actual real-world language with this property. But the other approaches are interesting to think about. :-)
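For what it's worth, approach #2 exists in the wild: GHC Haskell's ImplicitParams extension threads dynamically-scoped, statically-typed variables through signatures. A minimal sketch, with names mirroring the pseudocode above:
{-# LANGUAGE ImplicitParams #-}
-- the constraint records that callers must supply an x :: Int dynamically
xPlus :: (?x :: Int) => Int -> Int
xPlus y = ?x + y
threePlus :: (?x :: Int) => Int -> Int
threePlus y = xPlus y -- the requirement propagates to its own callers
main :: IO ()
main = let ?x = 4 in print (threePlus 10) -- prints 14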
I haven't found it written down elsewhere, but the AXIOM CAS (and its various forks, including FriCAS, which is still actively developed) uses a scripting language called SPAD with both a very novel strong static dependent type system and dynamic scoping (although that is possibly an unintended implementation bug).
Most of the time the user won't realize it, but when they start trying to build closures as in other functional languages, the dynamic scoping reveals itself:
FriCAS Computer Algebra System
Version: FriCAS 2021-03-06
Timestamp: Mon May 17 10:43:08 CST 2021
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> foo (x,y) == x + y
Type: Void
(2) -> foo (1,2)
Compiling function foo with type (PositiveInteger, PositiveInteger)
-> PositiveInteger
(2) 3
Type: PositiveInteger
(3) -> foo
(3) foo (x, y) == x + y
Type: FunctionCalled(foo)
(4) -> bar x y == x + y
Type: Void
(5) -> bar
(5) bar x == y +-> x + y
Type: FunctionCalled(bar)
(6) -> (bar 1)
Compiling function bar with type PositiveInteger ->
AnonymousFunction
(6) y +-> #1 + y
Type: AnonymousFunction
(7) -> ((bar 1) 2)
(7) #1 + 2
Type: Polynomial(Integer)
Such behavior is similar to what happens when one tries to build a closure with (lambda (x) (lambda (y) (+ x y))) in a dynamically scoped Lisp, such as Emacs Lisp. In fact the underlying representation of functions is essentially the same as in early Lisp, since AXIOM was first developed on top of an early Lisp implementation on an IBM mainframe.
I believe it is a defect, though (much like the one John McCarthy ran into with the first version of LISP): the implementor made the parser perform uncurrying, as in the function definition of bar, but that is unlikely to be useful without the ability to build closures in the language.
It is also worth noticing that SPAD automatically renames the variables in anonymous functions to avoid captures, so its dynamic scoping could be used as a feature, as in other Lisps.
Dynamic scope means that the variable and its type at a specific line of your code depend on the functions that were called before. This means you cannot know the type at a specific line of your code, because you cannot know which code has been executed before it.
Static typing means that you have to know the type at every line of your code, before the code starts to run.
The two are irreconcilable.

Does static typing mean that you have to cast a variable if you want to change its type?

Are there any other ways of changing a variable's type in a statically typed language like Java and C++, except 'casting'?
I'm trying to figure out what the main difference is in practical terms between dynamic and static typing and keep finding very academic definitions. I'm wondering what it means in terms of what my code looks like.
Make sure you don't get static vs. dynamic typing confused with strong vs. weak typing.
Static typing: Each variable, method parameter, return type etc. has a type known at compile time, either declared or inferred.
Dynamic typing: types are not checked at compile time; values carry their types at run time.
Strong typing: each object at runtime has a specific type, and you can only perform those operations on it that are defined for that type.
Weak typing: runtime objects either don't have an explicit type, or the system attempts to automatically convert types wherever necessary.
These two dichotomies are independent and can be combined freely:
Java is statically and strongly typed
C is statically and weakly typed (pointer arithmetics!)
Ruby is dynamically and strongly typed
JavaScript is dynamically and weakly typed
Generally, static typing means that a lot of errors are caught by the compiler that would be runtime errors in a dynamically typed language; on the other hand, you spend time worrying about types, in many cases unnecessarily (see interfaces vs. duck typing).
Strong typing means that any conversion between types must be explicit, either through a cast or through the use of conversion methods (e.g. parsing a string into an integer). This means more typing work, but has the advantage of keeping you in control of things, whereas weak typing often results in confusion when the system does some obscure implicit conversion that leaves you with a completely wrong variable value that causes havoc ten method calls down the line.
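A small Java illustration of that last point, strong typing with explicit conversions:
String s = "42";
int n = Integer.parseInt(s); // conversion method: new value, new type
double d = n; // numeric widening, the one common implicit case
String t = String.valueOf(n); // back to a string; nothing happens implicitly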
In C++/Java you can't change the type of a variable.
Static typing: A variable has one type assigned at compile time, and that does not change.
Dynamic typing: A variable's type can change at runtime, e.g. in JavaScript:
js> x="5" <-- String
5
js> x=x*5 <-- Number
25
The main difference is that in dynamically typed languages you don't know until you go to use a method at runtime whether that method exists. In statically typed languages the check is made at compile time and the compilation fails if the method doesn't exist.
I'm wondering what it means in terms of what my code looks like.
The type system does not necessarily have any impact on what code looks like, e.g. languages with static typing, type inference and implicit conversion (like Scala for instance) look a lot like dynamically typed languages. See also: What To Know Before Debating Type Systems.
You don't need explicit casting. In many cases implicit conversion works.
For example:
int i = 42;
float f = i; // f ~= 42.0, implicit int-to-float conversion
int b = f; // b == 42, implicit (truncating) float-to-int conversion
class Base {
public:
    virtual ~Base() {} // a virtual member is required for dynamic_cast below
};
class Subclass : public Base {
};
Subclass *subclass = new Subclass();
Base *base = subclass; // Legal: implicit upcast
Subclass *s = dynamic_cast<Subclass *>(base); // == subclass. Performs a runtime type check; if base weren't a Subclass, NULL would be returned instead. (This is type-safe explicit casting.)
You cannot, however, change the type of a variable. You can use unions in C++, though, to achieve some sort of dynamic typing.
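A sketch of that union idea using C++17's std::variant, a type-safe tagged union (a raw union would work too, but without the safety checks):
#include <iostream>
#include <string>
#include <variant>
int main() {
    std::variant<int, std::string> v = 42; // holds an int right now
    v = std::string("now a string"); // the active alternative can change at runtime
    if (auto *s = std::get_if<std::string>(&v)) { // checked access
        std::cout << *s << '\n';
    }
}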
Let's look at Java for the statically typed language and JavaScript for the dynamic one. In Java, for objects, the variable is a reference to an object. The object has a runtime type and the reference has a type. The type of the reference must be the type of the runtime object or one of its ancestors; this is how polymorphism works. Assigning up the hierarchy, toward the ancestors, needs no cast, but you have to cast to go back down, and the compiler ensures these conditions are met. In a language like JavaScript, your variable is just that, a variable. You can have it point to whatever object you want, and you don't know its type until you check.
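For instance (plain Java, nothing beyond java.lang assumed):
Object o = "hello"; // up the hierarchy: no cast needed
String s = (String) o; // down the hierarchy: explicit cast, checked at runtime
// Integer i = (Integer) o; // compiles, but would throw ClassCastException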
For conversions, though, Java has lots of methods, such as Integer.parseInt or Number.floatValue(), that do a conversion and produce an object of a new type with the same relative value. JavaScript also has conversion methods, and they generate new objects too.
Your code should actually not look very different, regardless of whether you are using a statically typed language or not. Just because you can change the data type of a variable in a dynamically typed language doesn't mean it is a good idea to do so.
In VBScript, for example, Hungarian notation is often used to indicate the preferred data type of a variable. That way you can easily spot when the code is mixing types. (This was not the original use of Hungarian notation, but it's pretty useful.)
By keeping to the same data type, you avoid situations where it's hard to tell what the code actually does, and situations where the code simply doesn't work properly. For example:
Dim id
id = Request.QueryString("id") ' this variable is now a string
If id = "42" Then
    id = 142 ' sometimes turned into a number
End If
If id > 100 Then ' will not work properly for strings
Using Hungarian notation you can spot code that is mixing types, like:
lngId = Request.QueryString("id") ' putting a string in a numeric variable
strId = 42 ' putting a number in a string variable