Duck typing, must it be dynamic? - language-design

Wikipedia used to say* about duck-typing:
In computer programming with
object-oriented programming languages,
duck typing is a style of dynamic
typing in which an object's current
set of methods and properties
determines the valid semantics, rather
than its inheritance from a particular
class or implementation of a specific
interface.
(* Ed. note: Since this question was posted, the Wikipedia article has been edited to remove the word "dynamic".)
It says about structural typing:
A structural type system (or
property-based type system) is a major
class of type system, in which type
compatibility and equivalence are
determined by the type's structure,
and not through explicit declarations.
It contrasts structural subtyping with duck-typing as follows:
[Structural systems] contrasts with
... duck typing, in which only the
part of the structure accessed at
runtime is checked for compatibility.
However, the term duck-typing seems to me at least to intuitively subsume structural sub-typing systems. In fact Wikipedia says:
The name of the concept [duck-typing]
refers to the duck test, attributed to
James Whitcomb Riley which may be phrased as
follows: "when I see a bird that walks
like a duck and swims like a duck and
quacks like a duck, I call that bird a
duck."
So my question is: why can't I call structural subtyping duck-typing? Do there even exist dynamically typed languages which can't also be classified as being duck-typed?
Postscript:
As someone named daydreamdrunk on reddit.com so eloquently put it, "If it compiles like a duck and links like a duck ..."
Post-postscript
Many answers seem to be basically just rehashing what I already quoted here, without addressing the deeper question, which is why not use the term duck-typing to cover both dynamic typing and structural sub-typing? If you only want to talk about duck-typing and not structural sub-typing, then just call it what it is: dynamic member lookup. My problem is that nothing about the term duck-typing says to me, this only applies to dynamic languages.

C++ and D templates are a perfect example of duck typing that is not dynamic. It is definitely:
typing in which an
object's current set of methods and
properties determines the valid
semantics, rather than its inheritance
from a particular class or
implementation of a specific
interface.
You don't explicitly specify an interface that your type must inherit from to instantiate the template. It just needs to have all the features that are used inside the template definition. However, everything gets resolved at compile time, and compiled down to raw, inscrutable hexadecimal numbers. I call this "compile time duck typing". I've written entire libraries from this mindset that implicit template instantiation is compile time duck typing and think it's one of the most under-appreciated features out there.
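Here's a minimal sketch of what I mean (the type and function names are mine, purely for illustration): two completely unrelated types instantiate the same function template simply because they happen to have the members the template body uses.
#include <iostream>
#include <string>

// "Compile-time duck typing": any T with the members used in the body
// will do; no common base class or interface is declared anywhere.
template <typename T>
std::string describe(const T& critter) {
    return critter.name() + " says " + critter.sound();
}

// Two unrelated types that merely happen to have the right members.
struct Duck {
    std::string name() const  { return "duck"; }
    std::string sound() const { return "quack"; }
};

struct Robot {
    std::string name() const  { return "robot"; }
    std::string sound() const { return "beep"; }
};

int main() {
    std::cout << describe(Duck{})  << '\n';   // both calls resolved
    std::cout << describe(Robot{}) << '\n';   // entirely at compile time
}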

Structural Type System
A structural type system compares one entire type to another entire type to determine whether they are compatible. For two types A and B to be compatible, A and B must have the same structure – that is, every method on A must have a corresponding method on B with the same signature, and vice versa.
Duck Typing
Duck typing considers two types to be equivalent for the task at hand if they can both handle that task. For two types A and B to be equivalent from the point of view of a piece of code that wants to write to a file, A and B must both implement a write method.
Summary
Structural type systems compare every method signature (entire structure). Duck typing compares the methods that are relevant to a specific task (structure relevant to a task).

Duck typing means: if it just fits, it's OK.
This applies both to dynamically typed languages
def foo obj
obj.quak()
end
and to statically typed, compiled languages
template <typename T>
void foo(T& obj) {
obj.quak();
}
The point is that in both examples, no information about the type has been given. Only when the code is used (either at runtime or at compile time!) are the types checked, and if all requirements are fulfilled, the code works. Values don't have an explicit type at their point of declaration.
Structural typing relies on explicitly typing your values, just as usual - the difference is just that the concrete type is not identified by inheritance but by its structure.
Structurally typed code (Scala-style) for the above example would be
def foo(obj : { def quak() : Unit }) {
obj.quak()
}
Don't confuse this with the fact that some structurally typed languages like OCaml combine this with type inference, which spares us from writing the types explicitly.

I'm not sure if it really answers your question, but...
Templated C++ code looks very much like duck-typing, yet is static, compile-time, structural.
template<typename T>
struct Test
{
void op(T& t)
{
t.set(t.get() + t.alpha() - t.omega(t, t.inverse()));
}
};

It's my understanding that structural typing is used by type inferencers and the like to determine type information (think Haskell or OCaml), while duck typing doesn't care about "types" per se, just that the thing can handle a specific method invocation/property access, etc. (think respond_to? in Ruby or capability checking in Javascript).

There are always going to be examples from some programming languages that violate some definitions of various terms. For example, ActionScript supports doing duck-typing style programming on instances that are not technically dynamic.
var x:Object = new SomeClass();
if ("begin" in x) {
x.begin();
}
In this case we tested if the object instance in "x" has a method "begin" before calling it instead of using an interface. This works in ActionScript and is pretty much duck-typing, even though the class SomeClass() may not itself be dynamic.

There are situations in which dynamic duck typing and similar statically typed code (e.g., in C++) behave differently:
template <typename T>
void foo(T& obj) {
if(obj.isAlive()) {
obj.quak();
}
}
In C++, the object must have both the isAlive and quak methods for the code to compile; for the equivalent code in dynamically typed languages, the object only needs to have the quak method if isAlive() returns true. I interpret this as a difference between structure (structural typing) and behavior (duck typing).
(However, I reached this interpretation by taking Wikipedia's "duck-typing must be dynamic" at face value and trying to make it make sense. The alternate interpretation that implicit structural typing is duck typing is also coherent.)
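To make the C++ side of that concrete, here is a hedged sketch (the Duck and Rock types are mine, just for illustration): a type that lacks quak is rejected at compile time even though its isAlive() can never return true, because the whole template body must type-check against it.
#include <iostream>

template <typename T>
void foo(T& obj) {
    if (obj.isAlive()) {
        obj.quak();
    }
}

struct Duck {
    bool isAlive() const { return true; }
    void quak() const { std::cout << "quak\n"; }
};

struct Rock {
    bool isAlive() const { return false; }
    // no quak(): foo(rock) still fails to compile, because the entire
    // template body is checked against Rock's structure
};

int main() {
    Duck duck;
    foo(duck);      // fine: Duck has both members
    Rock rock;
    // foo(rock);   // compile error: 'quak' is not a member of 'Rock',
                    // even though the quak() call could never run
}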

I see "duck typing" more as a programming style, whereas "structural typing" is a type system feature.
Structural typing refers to the ability of the type system to express types that include all values that have certain structural properties.
Duck typing refers to writing code that just uses the features of values that it is passed that are actually needed for the job at hand, without imposing any other constraints.
So I could use structural types to code in a duck typing style, by formally declaring my "duck types" as structural types. But I could also use structural types without "doing duck typing". For example, if I write interfaces to a bunch of related functions/methods/procedures/predicates/classes/whatever by declaring and naming a common structural type and then using that everywhere, it's very likely that some of the code units don't need all of the features of the structural type, and so I have unnecessarily constrained some of them to reject values on which they could theoretically work correctly.
So while I can see how there is common ground, I don't think duck typing subsumes structural typing. The way I think about them, duck typing isn't even a thing that might have been able to subsume structural typing, because they're not the same kind of thing. Thinking of duck typing in dynamic languages as just "implicit, unchecked structural types" is missing something, IMHO. Duck typing is a coding style you choose to use or not, not just a technical feature of a programming language.
For example, it's possible to use isinstance checks in Python to fake OO-style "class-or-subclass" type constraints. It's also possible to check for particular attributes and methods, to fake structural type constraints (you could even put the checks in an external function, thus effectively getting a named structural type!). I would claim that neither of these options is exemplifying duck typing (unless the structural types are quite fine grained and kept in close sync with the code checking for them).

Related

Why is adding methods to a type different than adding a sub or an operator in perl6?

Making subs/procedures available for reuse is one core function of modules, and I would argue that it is the fundamental way in which a language can be composable and therefore efficient with programmer time:
if you create a type in your module, I can create my own module that adds a sub that operates on your type. I do not have to extend your module to do that.
# your module
class Foo {
has $.id;
has $.name;
}
# my module
sub foo-str(Foo:D $f) is export {
return "[{$f.id}-{$f.name}]"
}
# someone else using yours and mine together for profit
my $f = Foo.new(:id(1234), :name("brclge"));
say foo-str($f);
As seen in Overloading operators for a class this composability of modules works equally well for operators, which to me makes sense since operators are just some kinda syntactic sugar for subs anyway (in my head at least). Note that the definition of such an operator does not cause any surprising change of behavior of existing code, you need to import it into your code explicitly to get access to it, just like the sub above.
Given this, I find it very odd that we do not have a similar mechanism for methods, see e.g. the discussion at How do you add a method to an existing class in Perl 6?, especially since perl6 is such a method-happy language. If I want to extend the usage of an existing type, I would want to do that in the same style as the original module was written in. If there is a .is-prime on Int, it must be possible for me to add a .is-semi-prime as well, right?
I read the discussion at the link above, but don't quite buy the "action at a distance" argument: how is that different from me exporting another multi sub from a module? For example, the Rust way of making this a lexical change (Trait + impl ... for) seems quite hygienic to me, and would be very much in line with the operator approach above.
More interesting (to me at least) than the technicalities is the question of language design: isn't the ability to provide new verbs (subs, operators, methods) for existing nouns (types) a core design goal for a language like perl6? If it is, why would it treat methods differently? And if it does treat them differently for a good reason, does that not mean we are using way too many non-composable methods as verbs where we should be using subs instead?
From a language design perspective, it all comes down to a simple question: which language are we speaking? In Perl 6, this is a question about which we always try to be very clear.
The notion of one's current language in Perl 6 is defined entirely in terms of lexical scope. Sub declarations are lexically scoped. When we import symbols from a module, including extra multi candidates, those are lexically scoped. When we perform language tweaks - such as introducing new operators - those are lexically scoped. Verbs in our current language - that is, subroutine calls - are those with a lexical definition. (Operators are simply sub calls with more interesting parsing.) Since lexical scopes are closed at the end of compile time, the compiler has a complete view of the current language. That's why sub calls to non-existent subs, or references to undeclared variables, are detected and reported at compile time, as well as some basic compile-time type checking; future Perl 6 versions are likely to extend the set of compile-time checks that can be expected. The current language is the static, early-bound, part of Perl 6.
By contrast, a method call is a verb to be interpreted in the target object's language. This is the dynamic, late-bound, part of Perl 6. While the most immediate result of that is the typical polymorphism found in various forms in implementations of OO, thanks to meta-programming even the manner in which a verb is interpreted is up for grabs. For example, a monitor will acquire a lock while it interprets the verb and release it afterwards. Other objects might have been constructed based on things other than Perl 6 code, and so the interpretation of a verb doesn't mean invoking code written as a Perl 6 method. Or the code might be somewhere over the network. Who knows? Well, certainly not the caller, and that's the point, and the power, and the risk, of late binding.
The Perl 6 answer to "I want to extend the range of verbs I can use with this object in my current language" is very simple: use language features that relate to extending the current language! There's even a special syntax, $obj.&foo, that allows for a verb foo to be defined in the current language - by writing a sub - and then invoked much as if it's a method on the object. However, the small syntactic distinction makes it clear to the reader - and to the compiler - what is going on, and which language is getting to define that verb.
Through the use of augment it is possible to extend the language defined by some type of objects. However, it's rarely the best way to do things, given that it will have global effect, and also scatter the definition of the language of the object.
Much of what we do in programming is about building languages. By that I don't mean new syntax; most of our new languages - even in a language as open to mutation as Perl 6 - are just nouns and verbs defined using standard language features. However, in any non-trivial program, we can't keep every detail of every language in mind at once. When I go to the restaurant and order a schnitzel, I don't know how the order will be transported to the kitchen, what the kitchen looks like, whether the schnitzel is hammered out, breaded, and cooked on demand, or just served from a (hopefully not too stale) cache of prepared schnitzels. The kitchen and I have just enough shared meaning to make the right kind of thing happen, but I don't know how they'll precisely react to my request and they need not know what I'll do in the meantime. This kind of thinking is acknowledged by OO itself - at least when we fully embrace it - and at a larger scale by concepts such as bounded contexts, as found in Domain Driven Design.
In summary, Perl 6 tries to help us keep our languages straight: to know what is in our current language, and what we express with only limited understanding. That distinction is encoded by the sub/method distinction, which also turns out to be a sensible place to hang a static/dynamic distinction too.

repository pattern : generics vs polymorphism way of implementation

A generic repository interface looks something like this:
public interface IRepository<T> where T : Entity
{
T Get(int key);
IQueryable<T> Get();
void Save(T obj);
void Delete(T obj);
}
Another way to achieve similar functionality is to use polymorphism, as shown below:
public interface IRepository
{
Entity Get(int key);
IQueryable<Entity> Get();
void Save(Entity obj);
void Delete(Entity obj);
}
In general, my question is: in what scenarios or use cases should we use generics? Can generics be avoided completely if we use polymorphism? Or am I making no sense here, and are these two completely unrelated?
The most crucial difference between your first and second example is called type safety.
I'm assuming that you are asking the question from the point of view of a statically-typed language like C# or Java.
In the version that uses generics, your compiler makes sure you always work with suitable types. In contrast, in the second version (the one that uses a more general type), you’d be expected to check that manually. Also, the compiler will constantly force you to cast your general type (e.g., Entity) to a more specific one (e.g., Customer) to use the services the object provides.
In other words, you have to fight the compiler all the time, since it will consistently require that you cast types for it to be able to validate the code you write.
With Generics
The first example uses a type variable T at the interface level. It means that when you define a variable for this interface type (or implement it), you will also be forced to provide a type argument for T.
For example
IRepository<Customer> custRepo = ...;
This means that wherever T appears, it must be replaced by a Customer, right?
Once you define the type argument for T to be Customer, it is as if your interface declaration would change, in the eyes of the compiler, to be something like this:
public interface IRepository
{
Customer Get(int key);
IQueryable<Customer> Get();
void Save(Customer obj);
void Delete(Customer obj);
}
Therefore, when you use it, you would be forced by the compiler to respect your type argument:
Customer customer = custRepo.Get(10);
customer.setEmail("skywalker@gmail.com");
custRepo.Save(customer);
In the example above, all repository methods work only with the Customer type. Therefore, I cannot pass an incorrect type (e.g., Employee) to the methods since the compiler will enforce type safety everywhere. For example, contrast it with this other example where the compiler catches a type error:
Employee employee = empRepo.Get(10);
custRepo.Save(employee); //Uh oh! Compiler error here
Without Generics
On the other hand, in your second example, all you have done is decide that you’ll use a more generic type. By doing that, you sacrifice type safety in exchange for some flexibility:
For example:
IRepository custRepo = ...;
Entity customer = custRepo.Get(10);
((Customer) customer).setEmail("skywalker@gmail.com"); //Now you'll need casting
custRepo.Save(customer);
In the case above, you're forced to always cast your Entity to a more usable type like Customer to use what it offers. This casting is an unsafe operation and can potentially introduce bugs (if we ever make a mistake in our assumptions about the types we're using).
Also, the repository types do not prevent you from passing the wrong types to the methods, so you could make semantic mistakes:
Entity employee = (Employee) empRepo.Get(10);
custRepo.Save(employee); //Uh Oh!
If you do it this way, you will probably have to make sure in your CustomerRepo that your entity actually is a Customer, to prevent a mistake like the one in the last line of the example above.
In other words, you would be manually implementing the kind of type safety that the compiler gives you automatically when you use generics.
This is starting to sound like we are trying to use our statically-typed language as if it was a dynamically-typed one, right? That's why we have to fight the compiler all the way to enforce this style of programming.
About Parametric Polymorphism
If you want, you could explore the idea that generics are also a form of polymorphism known as parametric polymorphism. I think reading this other answer could help as well. In it, there's a citation to an excellent paper on polymorphism that might help you broaden your understanding of the topic beyond just class and interface inheritance.
Dynamically-Typed Languages vs. Statically-Typed Languages Debate
An exciting conclusion that might help you explore this further is that dynamic languages (e.g., JavaScript, Python, Ruby, etc.), where you don't have to make explicit type declarations, actually work like your generic-free example.
Dynamic languages work as if all your variables were of type Object, and they allow you to do anything you want with that object to avoid continually casting your objects to different types. It is the programmer's responsibility to write tests to ensure the object is always used appropriately.
There has always been a debate between defenders of statically-typed languages and those that prefer dynamically-typed languages. This is what we typically call a holy war.
I believe that this is a topic you may want to explore more deeply to understand the fundamental differences between your two examples and to learn how static typing and the type safety of generics compare to the high flexibility of just using dynamic types.
I recommend reading an excellent paper called Dynamically Typed Languages by Laurence Tratt from Bournemouth University.
Now, in statically-typed languages like C#, or Java, we are typically expected to exploit type safety as best as we can. But nothing prevents us from writing the code as you would do in a dynamic language; it is just that the compiler will constantly fight us. That’s the case with your second example.
If that's a way of programming that resonates more with you and your work style, or if it offers you the flexibility you seek, then perhaps you should consider using a dynamically-typed language. Maybe one that runs on top of your current platform (e.g., IronRuby or IronPython) so that you can also reuse pre-existing code from your current statically-typed language.

Is static typing a subset of dynamic typing?

I was going to add this as a comment to my previous question about type theory, but I felt it probably deserved its own exposition:
If you have a dynamic typing system and you add a "type" member to each object and verify that this "type" is a specific value before executing a function on the object, how is this different than static typing? (Other than the fact that it is run-time instead of compile-time).
Technically, it actually is the other way round: a "dynamically typed" language is a special case of a statically typed language, namely one with only a single type (in the mathematical sense). That at least is the view point of many in the type systems community.
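One way to picture the "single type" view is to write the dynamic language's value representation in a statically typed language. This is only an illustrative sketch in C++ (the Value type and plus function are mine): every expression has the one static type Value, and all the interesting checking happens at run time.
#include <iostream>
#include <stdexcept>
#include <string>
#include <variant>

// The single static type of the "dynamically typed" language.
using Value = std::variant<double, std::string>;

// Every operation inspects the runtime tag; a mismatch is the dynamic
// language's "TypeError", raised at run time rather than compile time.
Value plus(const Value& a, const Value& b) {
    if (std::holds_alternative<double>(a) && std::holds_alternative<double>(b))
        return std::get<double>(a) + std::get<double>(b);
    if (std::holds_alternative<std::string>(a) && std::holds_alternative<std::string>(b))
        return std::get<std::string>(a) + std::get<std::string>(b);
    throw std::runtime_error("type error: cannot add these values");
}

int main() {
    Value x = 1.5, y = 2.5;
    std::cout << std::get<double>(plus(x, y)) << '\n';   // 4
    // plus(x, Value{std::string{"oops"}}) would throw at run time,
    // exactly where a dynamic language would report the error.
}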
Edit regarding static vs dynamic checking: only local properties can be checked dynamically, whereas properties that require some kind of global knowledge cannot. Think of properties such as something being unique, something not being aliased, a computation being free of race conditions. A suitable static type system can verify such properties, because it has the ability to establish certain invariants on the context of the expression that is being checked.
Static typing happens at compile time, not at run time, and that difference is essential!
See B. Pierce's book Types and Programming Languages for more.

What is open recursion?

What is open recursion? Is it specific to OOP?
(I came across this term in this tweet by Daniel Spiewak.)
just copying http://www.comlab.ox.ac.uk/people/ralf.hinze/talks/Open.pdf:
"Open recursion Another handy feature offered by most languages with objects and classes is the ability for one method body to invoke another method of the same object via a special variable called self or, in some langauges, this. The special behavior of self is that it is late-bound, allowing a method defined in one class to invoke another method that is defined later, in some subclass of the first. "
This paper analyzes the possibility of adding OO to ML, with regards to expressivity and complexity. It has the following excerpt on objects, which seems to make this term relatively clear –
3.3. Objects
The simplest form of object is just a record of functions that share a common closure environment that
carries the object state (we can call these simple objects). The function members of the record may or may not
be defined as mutually recursive. However, if one wants to support inheritance with overriding, the structure
of objects becomes more complicated. To enable open recursion, the call-graph of the method functions
cannot be hard-wired, but needs to be implemented indirectly, via object self-reference. Object self-reference
can be achieved either by construction, making each object a recursive, self-referential value (the fixed-point
model), or dynamically, by passing the object as an extra argument on each method call (the self-application
or self-passing model). In either case, we will call these self-referential objects.
The name "open recursion" is a bit misleading at first, because it has nothing to do with recursion as it is normally used (a function calling itself); and to that extent, there is no "closed recursion".
It basically means that a thing refers to itself. I can only guess, but I think the term "open" comes from "open" as in "open for extension".
In that sense an object is open to extension, but still referring to itself.
Perhaps a small example can shed some light on the concept.
Imagine you write a Python class like this one:
class SuperClass:
    def method1(self):
        self.method2()

    def method2(self):
        print("SuperClass")
If you run this:
s = SuperClass()
s.method1()
It will print "SuperClass".
Now we create a subclass from SuperClass and override method2:
class SubClass(SuperClass):
    def method2(self):
        print("SubClass")
and run it:
sub = SubClass()
sub.method1()
Now "SubClass" will be printed.
Still, we only call method1() as before. Inside method1(), method2() is called, but both are bound to the same reference (self in Python, this in Java). When subclassing SuperClass, method2() is overridden, which means that an object of SubClass refers to a different version of that method.
That is open recursion.
In most cases, you override methods and call the overridden methods directly.
The scheme here uses an indirection through the self-reference.
P.S.: I don't think this has been invented but discovered and then explained.
Open recursion allows one method of an object to call other methods of the same object from within, through a special variable like this or self.
In short, open recursion is actually about something not specific to OOP, but something more general.
The relation with OOP comes from the fact that many typical "OOP" PLs have such properties, but it is not essentially tied to any distinguishing feature of OOP.
So there are different meanings, even in the same "OOP" language. I will illustrate this later.
Etymology
As mentioned here, the terminology was likely coined in the famous TAPL by BCP, which illustrates the meaning with concrete OOP languages.
TAPL does not define "open recursion" formally. Instead, it points out that the "special behavior of self (or this) is that it is late-bound, allowing a method defined in one class to invoke another method that is defined later, in some subclass of the first".
Nevertheless, neither "open" nor "recursion" comes from the OOP basis of a language. (Actually, it also has nothing to do with static types.) So the interpretation (or the informal definition, if any) in that source is overspecified in nature.
Ambiguity
The mention in TAPL clearly shows that "recursion" is about "method invocation". However, it is not that simple in real languages, which usually do not have primitive semantic rules for the recursive invocation itself. Real languages (including the ones considered OOP languages) usually specify the semantics of such invocation via the notation of method calls. As syntactic devices, such calls are subject to the evaluation of some kind of expression, which relies on the evaluation of its subexpressions. These evaluations imply the resolution of the method name, under some independent rules. Specifically, such rules are about name resolution, i.e. determining the denotation of a name (typically a symbol, an identifier, or some "qualified" name expression) in the subexpression. Name resolution often respects scoping rules.
OTOH, the "late-bound" property emphasizes how to find the target implementation of the named method. This is a shortcut in the evaluation of specific call expressions, but it is not general enough, because entities other than methods can also have such "special" behavior, which can even make such behavior not special at all.
A notable ambiguity comes from such insufficient treatment: what does a "binding" mean? Traditionally, a binding can be modeled as a pair of a (scoped) name and its bound value, i.e. a variable binding. In the special treatment of "late-bound" ones, the set of allowed entities is smaller: methods instead of all named entities. Besides considerably undermining the abstraction power of the language rules at the meta level (in the language specification), it does not remove the need for the traditional meaning of a binding (because there are other, non-method entities), hence it is confusing. The use of "late-bound" is at least an instance of bad naming. Instead of "binding", a more proper name would be "dispatching".
Worse, the use in TAPL directly mixes the two meanings when dealing with "recursion". The "recursion" behavior is all about finding the entity denoted by some name; it is not specific to method invocation (even in those OOP languages).
The title of the chapter (Case Study: Imperative Objects) also suggests some inconsistency. Obviously, the so-called late binding of method invocation has nothing to do with imperative state, because the resolution of the dispatch does not require mutable invocation metadata. (In the popular implementation approach, the virtual method table need not be modifiable.)
Openness
The use of "open" here looks like a nod to open (lambda) terms. An open term has some names not yet bound, so the reduction of such a term must do some name resolution (to compute the value of the expression), or the term does not normalize (it never terminates in evaluation). There is no difference between "late" and "early" for the original calculi because they are pure and have the Church-Rosser property, so whether binding is "late" or not does not alter the result (if it normalizes).
This is not the same in a language with potentially different dispatch paths. Even if the implicit evaluation implied by the dispatching itself is pure, it is sensitive to the order of other evaluations with side effects, which may depend on the concrete invocation target (for example, one overrider may mutate some global state while another does not). Of course, in a strictly pure language there can be no observable difference even between radically different invocation targets, but a language that rules all of them out is just useless.
Then there is another problem: why is it OOP-specific (as in TAPL)? Given that the openness qualifies "binding" rather than "dispatching of method invocation", there are certainly other ways to get openness.
One notable instance is the evaluation of a procedure body in traditional Lisp dialects. There can be unbound symbols in the body, and they are only resolved when the procedure is called (rather than when it is defined). Since Lisps are significant in PL history and are close to the lambda calculi, attributing "open" specifically to OOP languages (instead of Lisps) is rather strange from the PL tradition. (This is also a case of the "making such behavior not special at all" mentioned above: every name in a function body is just "open" by default.)
It is also arguable that the OOP-style self/this parameter is equivalent to the result of some closure conversion of the (implicit) environment in the procedure. It is questionable to treat such features as primitive in the language semantics.
(It may also be worth noting that the special treatment of function calls, as distinct from symbol resolution in other expressions, was pioneered by Lisp-2 dialects, not by any of the typical OOP languages.)
More cases
As mentioned above, different meanings of "open recursion" may coexist in the same "OOP" language.
C++ is the first instance here, because there are sufficient reasons for them to coexist.
In C++, name resolution is entirely static; normatively it is called name lookup. The rules of name lookup vary across different scopes. Most of them are consistent with the identifier lookup rules in C (except that C allows implicit declarations and C++ does not): you must first declare a name, and only then can the name be looked up later (lexically) in the source code; otherwise the program is ill-formed (and the implementation of the language is required to issue an error). The strict requirement of such name dependency is considerably "closed", because there is no later chance to recover from the error, so you cannot directly have names mutually referencing each other across different declarations.
To work around the limitation, there can be additional declarations whose sole duty is to break the cyclic dependency. Such declarations are called "forward" declarations. Using forward declarations still does not require "open" recursion, because every well-formed use must statically see a previous declaration of that name, so name lookup does not require any additional "late" binding.
However, C++ classes have special name lookup rules: some entities in class scope can be referred to in contexts prior to their declaration. This makes mutually recursive use of names across different declarations possible without any additional "forward" declarations to break the cycle. This is exactly "open recursion" in the TAPL sense, except that it is not about method invocation.
Moreover, C++ does have "open recursion" as described in TAPL: the this pointer and virtual functions. The rules that determine the target (overrider) of a virtual function are independent of the name lookup rules. A non-static member defined in a derived class generally just hides entities with the same name in the base classes. The dispatching rules kick in only on virtual function calls, after name lookup (the order is guaranteed since evaluations of C++ function calls are strict, or applicative). It is also easy to reintroduce a base class name with a using-declaration, without worrying about the kind of entity.
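A minimal sketch of that second sense, mirroring the earlier Python example: name lookup for method2 inside method1 is resolved statically, while the overrider is dispatched through this at run time.
#include <iostream>

struct SuperClass {
    virtual ~SuperClass() = default;
    void method1() {
        this->method2();   // name lookup finds SuperClass::method2 statically;
    }                      // which overrider runs is decided per call, via this
    virtual void method2() { std::cout << "SuperClass\n"; }
};

struct SubClass : SuperClass {
    void method2() override { std::cout << "SubClass\n"; }
};

int main() {
    SubClass sub;
    sub.method1();   // prints "SubClass": the open-recursive call through
                     // the this pointer picks the overrider defined later
}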
Such a design can be seen as an instance of separation of concerns. The name lookup rules allow some generic static analysis in the language implementation without special treatment of function calls.
OTOH, Java has some more complex rules that mix up name lookup and other rules, including how to identify overriders. Name shadowing in Java subclasses is specific to the kind of entity. It is more complicated to distinguish overriding from overloading/shadowing/hiding/obscuring for the different kinds. There is also no counterpart of C++'s using-declarations in the definition of subclasses. Such complexity does not make Java more or less "OOP" than C++, anyway.
Other consequences
Conflating the binding involved in name resolution with the dispatching of method invocation leads not only to ambiguity, complexity and confusion, but also to more difficulties at the meta level. Here "meta" refers to the fact that name binding can expose properties not only in the source language semantics, but also in the meta languages: either the formal semantics of the language or its implementation (say, the code implementing an interpreter or a compiler).
For example, as in traditional Lisps, binding time can be distinguished from evaluation time, because program properties revealed at binding time (value bindings in the immediate contexts) are closer to meta properties than evaluation-time properties (like the concrete value of arbitrary objects). An optimizing compiler can schedule code generation based on binding-time analysis either statically at compile time (when the body is to be evaluated more than once) or deferred to runtime (when compilation is too expensive). There is no such option for languages that blindly assume all resolutions in closed recursion are faster than open ones (and even make them syntactically different in the first place). In that sense, OOP-specific open recursion is not just less handy than advertised in TAPL, but a premature optimization: it gives up metacompilation too early, not in the language implementation, but in the language design.

What features do you wish were in common languages? [closed]

What features do you wish were in common languages? More precisely, I mean features which generally don't exist at all but would be nice to see, rather than, "I wish dynamic typing was popular."
I've often thought that "observable" would make a great field modifier (like public, private, static, etc.)
GameState {
observable int CurrentScore;
}
Then, other classes could declare an observer of that property:
ScoreDisplay {
observe GameState.CurrentScore(int oldValue, int newValue) {
...do stuff...
}
}
The compiler would wrap all access to the CurrentScore property with notification code, and observers would be notified immediately upon the value's modification.
Sure you can do the same thing in most programming languages with event listeners and property change handlers, but it's a huge pain in the ass and requires a lot of piecemeal plumbing, especially if you're not the author of the class whose values you want to observe. In which case, you usually have to write a wrapper subclass, delegating all operations to the original object and sending change events from mutator methods. Why can't the compiler generate all that dumb boilerplate code?
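For contrast, here is roughly the kind of plumbing I mean, sketched by hand in C++ (all the names here, Observable, GameState and so on, are just for illustration); an observable field modifier would let the compiler generate this wrapper for you.
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Hand-written observable wrapper: notifies every registered observer
// whenever the wrapped value is assigned.
template <typename T>
class Observable {
    T value_{};
    std::vector<std::function<void(const T&, const T&)>> observers_;
public:
    void observe(std::function<void(const T&, const T&)> fn) {
        observers_.push_back(std::move(fn));
    }
    Observable& operator=(T newValue) {
        T oldValue = value_;
        value_ = std::move(newValue);
        for (auto& fn : observers_) fn(oldValue, value_);
        return *this;
    }
    const T& get() const { return value_; }
};

struct GameState {
    Observable<int> currentScore;
};

int main() {
    GameState state;
    // The ScoreDisplay side, wired up manually - exactly the boilerplate
    // an observable/observe language feature could generate.
    state.currentScore.observe([](int oldValue, int newValue) {
        std::cout << "score: " << oldValue << " -> " << newValue << '\n';
    });
    state.currentScore = 42;   // prints "score: 0 -> 42"
}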
I guess the most obvious answer is Lisp-like macros. Being able to process your code with your code is wonderfully "meta" and allows some pretty impressive features to be developed from (almost) scratch.
A close second is double or multiple-dispatch in languages like C++. I would love it if polymorphism could extend to the parameters of a virtual function.
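To show why I'd want it, here is the manual double-dispatch dance C++ forces on you today (the Asteroid/Spaceship types are just the usual illustrative example): the language dispatches virtually on the receiver only, so dispatching on the argument as well takes a second virtual round trip.
#include <iostream>

struct Asteroid;
struct Spaceship;

struct Thing {
    virtual ~Thing() = default;
    virtual void collideWith(Thing& other) = 0;   // dispatch on the receiver...
    virtual void collideWith(Asteroid&) = 0;      // ...then on the argument,
    virtual void collideWith(Spaceship&) = 0;     // via a second virtual call
};

struct Asteroid : Thing {
    void collideWith(Thing& other) override { other.collideWith(*this); }
    void collideWith(Asteroid&) override  { std::cout << "asteroid hits asteroid\n"; }
    void collideWith(Spaceship&) override { std::cout << "spaceship hits asteroid\n"; }
};

struct Spaceship : Thing {
    void collideWith(Thing& other) override { other.collideWith(*this); }
    void collideWith(Asteroid&) override  { std::cout << "asteroid hits spaceship\n"; }
    void collideWith(Spaceship&) override { std::cout << "spaceship hits spaceship\n"; }
};

int main() {
    Asteroid a;
    Spaceship s;
    Thing& x = a;
    Thing& y = s;
    x.collideWith(y);   // ends up in Spaceship::collideWith(Asteroid&):
                        // prints "asteroid hits spaceship"
}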
I'd love for more languages to have a type system like Haskell. Haskell utilizes a really awesome type inference system, so you almost never have to declare types, yet it's still a strongly typed language.
I also really like the way you declare new types in Haskell. I think it's a lot nicer than, e.g., object-oriented systems. For example, to declare a binary tree in Haskell, I could do something like:
data Tree a = Node a (Tree a) (Tree a) | Leaf
So the composite data types are more like algebraic types than objects. I think it makes reasoning about the program a lot easier.
Plus, mixing in type classes is a lot nicer. A type class is just a set of operations that a type supports -- sort of like an interface in a language like Java, but more like a mixin in a language like Ruby, I guess. It's kind of cool.
Ideally, I'd like to see a language like Python, but with data types and type classes like Haskell instead of objects.
I'm a big fan of closures / anonymous functions.
my $y = "world";
my $x = sub { print @_, $y };
&$x( 'hello' ); #helloworld
and
my $adder = sub {
my $reg = $_[0];
my $result = {};
return sub { return $reg + $_[0]; }
};
print $adder->(4)->(3);
I just wish they were more commonplace.
Things from Lisp I miss in other languages:
Multiple return values
required, keyword, optional, and rest parameters (freely mixable) for functions
functions as first class objects (becoming more common nowadays)
tail call optimization
macros that operate on the language, not on the text
consistent syntax
To start things off, I wish the standard for strings was to use a prefix if you wanted to use escape codes, rather than their use being the default. E.g. in C# you can prefix with @ for a raw (verbatim) string. Similarly, Python has the r prefix. I'd rather use @/r when I don't want a raw string and need escape codes.
More powerful templates that are actually designed to be used for metaprogramming, rather than C++ templates that are really designed for relatively simple generics and are Turing-complete almost by accident. The D programming language has these, but it's not very mainstream yet.
An immutable keyword. Yes, you can make immutable objects, but that's a lot of pain in most languages.
class JustAClass
{
    private readonly int id;
    private readonly MyClass obj;

    public MyClass Obj
    {
        get
        {
            return obj;
        }
    }
}
At first glance, it seems JustAClass is an immutable class. But that's not the case, because any other object holding the same reference can still modify the obj object.
So it would be better to introduce a new immutable keyword. When immutable is used, that object will be treated as immutable.
I like some of the array manipulation capabilities found in the Ruby language. I wish we had some of that built into .Net and Java. Of course, you can always create such a library, but it would be nice not to have to do that!
Also, static indexers are awesome when you need them.
Type inference. It's slowly making its way into the mainstream languages but it's still not good enough. F# is the gold standard here
I wish there was a self-reversing assignment operator, which rolled back when out of scope. This would be to replace:
type datafoobak = item.datafoobak
item.datafoobak = 'tootle'
item.handledata()
item.datafoobak = datafoobak
with this
item.datafoobak #=# 'tootle'
item.handledata()
One could explicitly roll back such changes, but they'd roll back once out of scope, too. This kind of feature would be a bit error prone, maybe, but it would also make for much cleaner code in some cases. Some sort of shallow clone might be a more effective way to do this:
itemclone = item.shallowclone
itemclone.datafoobak='tootle'
itemclone.handledata()
However, shallow clones might have issues if their functions modified their internal data...though so would reversible assignments.
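For what it's worth, in a language with deterministic destructors the "roll back when out of scope" part can be sketched by hand today; here is a rough C++ version (ScopedOverride and Item are my names, just for illustration):
#include <iostream>
#include <string>
#include <utility>

// RAII guard: saves the old value on construction, restores it on destruction.
template <typename T>
class ScopedOverride {
    T& target_;
    T  saved_;
public:
    ScopedOverride(T& target, T temporary)
        : target_(target), saved_(target) { target_ = std::move(temporary); }
    ~ScopedOverride() { target_ = std::move(saved_); }
    ScopedOverride(const ScopedOverride&) = delete;
    ScopedOverride& operator=(const ScopedOverride&) = delete;
};

struct Item { std::string datafoobak = "original"; };

int main() {
    Item item;
    {
        ScopedOverride<std::string> guard(item.datafoobak, "tootle");
        std::cout << item.datafoobak << '\n';   // "tootle" while in scope
    }                                           // guard rolls the value back here
    std::cout << item.datafoobak << '\n';       // "original" again
}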
I'd like to see single-method and single-operator interfaces:
interface Addable<T> --> HasOperator( T = T + T)
interface Splittable<T> --> HasMethod( T[] = T.Split(T) )
...or something like that...
I envision it as being a typesafe implementation of duck-typing. The interfaces wouldn't be guarantees provided by the original class author. They'd be assertions made by a consumer of a third-party API, to provide limited type-safety in cases where the original authors hadn't anticipated.
(A good example of this in practice would be the INumeric interface that people have been clamoring for in C# since the dawn of time.)
In a duck-typed language like Ruby, you can call any method you want, and you won't know until runtime whether the operation is supported, because the method might not exist.
I'd like to be able to make small guarantees about type safety, so that I can polymorphically call methods on heterogeneous objects, as long as all of those objects have the method or operator that I want to invoke.
And I should be able to verify the existence of the methods/operators I want to call at compile time. Waiting until runtime is for suckers :o)
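In current C++ terms, concepts come pretty close to what I'm describing: the consumer, not the class author, states the requirement, and it is checked at compile time. A rough sketch (Addable and sum are my names, just for illustration):
#include <concepts>
#include <iostream>
#include <string>
#include <vector>

// A "single-operator interface": the consumer asserts that T supports +.
template <typename T>
concept Addable = requires(T a, T b) {
    { a + b } -> std::convertible_to<T>;
};

template <Addable T>
T sum(const std::vector<T>& xs, T zero) {
    T total = zero;
    for (const auto& x : xs) total = total + x;   // only + is required of T
    return total;
}

int main() {
    std::cout << sum(std::vector<int>{1, 2, 3}, 0) << '\n';                      // 6
    std::cout << sum(std::vector<std::string>{"a", "b"}, std::string{}) << '\n'; // ab
    // A type without a suitable operator+ is rejected at compile time,
    // which is exactly the "verify at compile time" part of the wish.
}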
Lisp style macros.
Multiple dispatch.
Tail call optimization.
First class continuations.
Call me silly, but I don't think every feature belongs in every language. It's the "jack of all trades, master of none" syndrome. I like having a variety of tools available, each one of which is the best it can be for a particular task.
Functional functions, like map, flatMap, foldLeft, foldRight, and so on. Type system like scala (builder-safety). Making the compilers remove high-level libraries at compile time, while still having them if you run in "interpreted" or "less-compiled" mode (speed... sometimes you need it).
There are several good answers here, but I will add some:
1 - The ability to get a string representation for the current and caller code, so that I could output a variable name and its value easily, or print the name of the current class, function or a stack trace at any time.
2 - Pipes would be nice too. This feature is common in shells, but uncommon in other types of languages.
3 - The ability to delegate any number of methods to another class easily. This looks like inheritance, but even in the presence of inheritance, once in a while we need some kind of wrapper or stub which cannot be implemented as a child class, and forwarding all methods requires a lot of boilerplate code.
I'd like a language that was much more restrictive and was designed around producing good, maintainable code without any trickiness. Also, it should be designed to give the compiler the ability to check as much as possible at compile time.
Start with a newish VM based heavily OO language.
Remove complexities like Operator Overloading and multiple inheritance if they exist.
Force all non-final variables to Private.
Members should default to "Final" but should have a "Variable" tag to override it. (This may require built-in support for the builder pattern to be fully effective).
Variables should not allow a "Null" value by default, but variables and parameters should have a "nullable" tag that indicates that null is acceptable for that variable.
It would also be nice to be able to avoid some common questionable patterns:
Some built-in way to simplify IOC/DI to eliminate singletons,
Java--eliminate checked exceptions so people stop putting in empty catches.
Finally focus on code readability:
Named Parameters
Remove the ability to create methods more than, say, 100 lines long.
Add some complexity analysis to help detect complicated methods and classes.
I'm sure I haven't named 1/10 of the items possible, but basically I'm talking about something that compiles to the same bytecode as C# or Java, but is so restrictive that a programmer can hardly help but write good code.
And yes, I know there are lint-type tools that will do some of this, but I've never seen them on any project I've worked on (and they wouldn't physically run on the code I'm working on now, for instance) so they aren't being very helpful, and I would love to see a compile actually fail when you type in a 101 line method...