What's so special about message passing in smalltalk - smalltalk

I was going through an introduction to Smalltalk.
In C++, the functions declared inside a class can be called by objects of that class, and similarly in Smalltalk a keyword, termed as message, is written adjacent to the name of the object.
(Don't know much but would also like to ask here whether in response to a message a unique method is there to be executed?)
Basically, to my naive mind, this seems to be only a difference in syntax style. But, I wonder if internally in terms of compilation or memory structure this difference in calling holds any significance.
Thanks in advance.
P.S : I bow down to all of you for your time and answers . Thanks a lot.

The fundamental difference is that in Smalltalk, the receiver of the message has complete control over how that message is handled. It's a true object, not a data structure with functions that operate on it.
That means that in Smalltalk you can send any message to any object. The compiler places no restrictions on that, it's all handled at runtime. In C++, you can only invoke functions that the compiler knows about.
Also, Smalltalk messages are simply symbols (unique character strings), not a function address in memory as in C++. That means it's easy to send messages interactively, or over a network connection. There is a perform: method that lets you send a message given its string name.
An object even receives messages it does not implement. The Virtual Machine detects that case and creates a Message object, and then sends the messageNotUnderstood: message. Again, it's the object's sole responsibility of how to handle that unknown message. Most objects simply inherit the default implementation which raises an error, but an object can also handle it itself. It could, for example, forward those messages to a remote object, or log them to a file, etc.

You call a function in C++ because during the compilation time you know which function will be called (or at least you have a finite set of functions defined in a class hierarchy.
Smalltalk is dynamically typed and late bound, so during the compilation time you have no idea which method is going to be evaluated (if one will be at all). Thus you send a message, and if the object has a method with that selector, it is evaluated. Otherwise, the "message not understood" exception is raised.

There are already good answers here. Let me add some details (originally, part of this was in a comment).
In plain C, the target of each function call is determined at link time (except when you use function pointers). C++ adds virtual functions, for which the actual function that will be invoked by a call is determined at runtime (dynamic dispatch, late binding). Function pointers allow for custom dispatch mechanisms to some degree, but you have to program it yourself.
In Smalltalk, all message sends are dynamically dispatched. In C++ terms this roughly means: All member functions are virtual, and there are no standalone functions (there is always a receiver). Therefore, the Smalltalk compiler never* decides which method will be invoked by a message send. Instead, the invoked method is determined at runtime by the Virtual Machine that implements Smalltalk.
One way to implement virtual function dispatching is virtual function tables. An approximate equivalent in Smalltalk are method dictionaries. However, these dictionaries are mutable, unlike typical virtual function tables, which are generated by the C++ compiler and do not change at runtime. All Smalltalk behaviors (Behavior being a superclass of Class) have such a method dictionary. As #aka.nice pointed out in his answer, the method dictionaries can be queried. But methods can also be added (or removed) while the Smalltalk system runs. When the Smalltalk VM dispatches a message send, it searches the method dictionaries of the receiver's superclass chain for the correct method. There are usually caches in place to avoid the recurring cost of that lookup.
Also note that message passing is the only way for objects to communicate in Smalltalk. Two objects cannot access each other's instance variables, even if they belong to the same class. In C++, you can write code that breaks this encapsulation. Hence, message sending is fundamental in Smalltalk, whereas in C++ it is basically an optional feature.
In C++, Java, and similar languages, there is another form of dispatch, called function overloading. It happens exclusively at compile time and selects a function based on the declared types of the arguments at the call site. You cannot influence it at runtime. Smalltalk obviously does not provide this form of dispatch because it does not have static typing of variables. It can be realized nevertheless using idioms such as double dispatch. Other languages, such as Common Lisp's CLOS or Groovy, provide the even more general multiple dispatch, which means that a method will be selected based on both the receiver's type and the runtime types of all the arguments.
* Some special messages such as ifTrue: ifFalse: whileTrue: are usually compiled directly to conditional branches and jumps in the bytecode, instead of message sends. But in most cases it does not influence the semantics.

Here are a few example of what you would not find in C++
In Smalltalk, you create a new class by sending a message (either to the superclass, or to the namespace depending on the dialect).
In Smalltalk, you compile a new method by sending a message to a Compiler.
In Smalltalk, a Debugger is opened in response to an unhandled exception by sending a message. All the exception handling is implemented in term of sending messages.
In Smalltalk you can query the methods of a Class, or gather all its instances by sending messages.
More trivially, all control structures (branch, loops, ...) are performed by sending messages.
It's messages all the way down.

Related

smalltalk: about method- "withArgs:executeMethod:"

I'm trying to understand the method "withArgs: executeMethod: " in smalltalk, squeak.
1. I am trying to understand what is the role of the method?
2. What arguments need to be passed to it for it to be carried out?
A good way to understand this method is by considering it as a syntactic variant of the general expression
object msg: arg (*)
where object is the receiver of the message with selector msg: and arg its argument. There are of course variants with no or multiple arguments, but the idea is the same.
When object receives this message (*) the Virtual Machine (VM) looks up for the CompiledMethod with selector msg: in the object's hierarchy, and transfers it the control, binding self to object and the formal argument of the method to arg.
Notice that this invocation is managed by the VM, no by the Virtual Image (VI). So, how could we reflect the same in the VI? Well, there are two steps in this behavior (1) find the method and (2) bind its formal receiver and arguments to the actual ones and let it run.
Step (1) is the so called lookup algorithm. It is easily implemented in Smalltalk: just ask the receiver its class, check whether the class includes the selector #msg: and, if not, go to the superclass and repeat. If all checks fail, issue the doesNotUnderstand: message.
Step (2) exactly requires what #withArgs:executeMethod: provides. It allows us to say
object withArgs: {arg} executeMethod: method
where method is the CompiledMethod found in step (1). [We have to use {arg} rather than arg because the plural in withArgs: suggests that the method expects an Array of arguments.]
Why would we want this?
Generally speaking, giving the VI the capability to mimic behavior implemented in the VM is good because it makes metaprogramming easier (and more natural).
More practically, a relevant example of the use of this capability is the implementation of Method Wrappers. Briefly described, given any particular method, you can wrap it (as the wrappee) inside a wrapper method, which also has a preBlock. If you then substitute the original method in the MethodDictionary where it belongs, with the wrapper, you can let the wrapper first execute the preBlock and then the intended method. The first task is easy: just send the message preBlock value. For the second we have the method (the wrappee), the receiver and the arguments (if any). So, to complete the task you only need to send to the receiver withArgs:executeMethod: with the actual argument(s) and the wrappee.
Ah! Let's not forget to mention that one of the reasons for having Method Wrappers is to measure testing coverage.
Note also that withArgs:executeMethod: does not require the second argument, i.e., the method to execute, to be in any class, let alone the class of the receiver. In particular, you could create a CompiledMethod on the fly and execute it on any given object. Of course, it is up to you to make sure that the execution will not crash the VM by, say, using the third ivar of the receiver if the receiver has only two etc. A simple way to create a CompiledMethod without installing it in any class is by asking the Smalltalk compiler to do so (look for senders of newCompiler to learn how to do that).

Can methodSignatureForSelector: create a "catch-all" method signature?

I am trying to make a catch-all proxy that takes any selector with any type of arguments and sends an RPC call down the wire. In this case, the signature of the method is not known because it can be created arbitrarily by the end user.
I see that the method signature requires supplying type encodings for each of the arguments. Is there a type encoding that signifies absolutely anything (pointer, int, everything else)? Otherwise, is there another way to accomplish this effect?
"sort of", but not really... or, at least, not practically.
You can implement the method forwarding protocol such that unrecognized method calls will be wrapped up in an NSInvocation and then you could tear into the NSInvocation, but it really isn't practical for a number of reasons.
First, it only really works for relatively simple argument types. The C ABI is such that complex arguments -- structures, C++ objects, etc.. -- can be encoded on the stack in wonky ways. In fact, they can be encoded in ways where there isn't enough metadata to decode the frames.
Secondly, any kind of a system where "selectors can be created arbitrarily by the user" has a very distinct odor about it; an odor of "you are doing it the hard way". Objective-C, while exceptionally dynamic, was really not designed to support this level of pure meta-object pattern.
As well, any such resulting system is going to be exceptionally fragile. What if the "arbitrary selector" happens to be, say, #selector(hash)?
Can you describe in more detail about this catch-all proxy and what it needs to forward to?
If your proxy only needs to forward each message to one target, then it can do so in its -forwardingTargetForSelector: at runtime. If not (e.g. you need to forward to multiple targets or do other complicated manipulation), you need to implement -forwardInvocation: to handle it. Using -forwardInvocation: to handle calls requires you to implement -messageSignatureForSelector: because it needs to get the method signature in order to be able to create the invocation. (Even if you forward it to another object, that object also needs to either implement the method directly, add the method in response to +resolveInstanceMethod:, or handle it using -forwardInvocation:, all of which requires it to also have the signature.)
A method signature encodes the types of the arguments and return type. The reason that this information is needed for an invocation is that when these arguments are passed, they are laid out at compile-time (probably consecutively) in memory according to their types in the declaration. A large struct parameter is going to take up more space than an int parameter. A double is also probably bigger than an int. An invocation needs to store all of these arguments and let you access or change them by index. There is no way to figure out how the arguments are laid out at runtime unless you knew the types (or at least the sizes of the types).
Also, the message passing mechanism is different for methods that return structs (they call objc_msgSend_stret) from other methods (they call objc_msgSend) (and on some platforms, methods that return doubles use objc_msgSend_fpret). In the former case, the struct is not returned directly, but the location to write to is passed as an extra pointer argument as an out parameter. So knowing the return type is also critical to handling the call and the return value in the invocation.
Even if you are forwarding the invocation to some other object, ultimately that object (or some object it forwards to down the line) has to know the method signature somehow. So why not ask that object for the signature of the selector when you need it?
There is no "safe" signature that will work for all things, because different types have different sizes.

How does an Objective-C method have access to the callee's ivars?

I was reading Apple's documentation, The Objective-C Programming Language (PDF link). On pg. 18, under The Receiver’s Instance Variables, I saw this.
A method has automatic access to the receiving object’s instance
variables. You don’t need to pass them to the method as parameters.
For example, the primaryColor method illustrated above takes no
parameters, yet it can find the primary color for otherRect and return
it. Every method assumes the receiver and its instance variables,
without having to declare them as parameters.
This convention simplifies Objective-C source code. It also supports
the way object-oriented programmers think about objects and messages.
Messages are sent to receivers much as letters are delivered to your
home. Message parameters bring information from the outside to the
receiver; they don’t need to bring the receiver to itself.
I am trying to better understand what they are describing; is this like Python's self parameter, or style?
Objective-C is a strict superset of C.
So Objective-C methods are "just" function pointers, and instances are "just" C structs.
A method has two hidden parameters. The first one is self(the current instance), the second _cmd (the method's selector).
But what the documentation is describing in page 18 is the access to the class instance variables from a method.
It just says a method of a class can access the instance variables of that class.
It's pretty basic from an object-oriented perspective, but not from a C perspective.
It also say that you can't access instance variables from another class instance, unless they are public.
While I would not say that it is a "slam" against Python, it is most certainly referring to the Python style of Object Orientation (which, in honesty, is derived from the "pseudo-object orientation" available in C (whether it is truly OO or not is a debate for another forum)).
It is good to remember that Python has a very different concept of scope from the rest of the world — each method more or less exists in its own little reality. This is contrasted with more "self-aware" languages which either have a "this" variable or an implicit instance construct of some form.

'Calling a method' OR 'sending a message' in Objective C

In C or any ECMAscript based language you 'call a public method or function' on an object. But in documentation for Objective C, there are no public method calls, only the sending of messages.
Is there anything wrong in thinking that when you 'send a message' in ObjC you are actually 'calling a public method on an Object'.?
Theoretically, they're different.
Practically, not so much.
They're different in that in Objective-C, objects can choose to not respond to messages, or forward messages on to different objects, or whatever. In languages like C, function calls are really just jumping to a certain spot in memory and executing code. There's no dynamic behavior involved.
However, in standard use cases, when you send a message to an object, the method that the message represented will usually end up being called. So about 99% of the time, sending a message will result in calling a method. As such, we often say "call a method" when we really mean "send a message". So practically, they're almost always the same, but they don't have to be.
A while ago, I waxed philosophical on this topic and blogged about it: http://davedelong.tumblr.com/post/58428190187/an-observation-on-objective-c
edit
To directly answer your question, there's usually nothing wrong with saying "calling a method" instead of "sending a message". However, it's important to understand that there is a very significant implementation difference.
(And as an aside, my personal preference is to say "invoke a method on an object")
Because of Objective-C's dynamic messaging dispatch, message sending is actually different from calling a C function or a C++ method (although eventually, a C function will be called). Messages are sent through selectors to the receiving object, which either responds to the message by invoking an IMP (a C function pointer) or by forwarding the message to its superclass. If no class in the inheritance chain responds to the message, an exception is thrown. It's also possible to intercept a message and forward it to a wholly different class (this is what NSProxy subclasses do).
When using Objective-C, there isn't a huge difference between message sending and C++-style method calling, but there are a few practical implications of the message passing system that I know of:
Since the message processing happens at runtime, instead of compile time, there's no compile-time way to know whether a class responds to any particular message. This is why you usually get compiler warnings instead of errors when you misspell a method, for instance.
You can safely send any message to nil, allowing for idioms like [foo release] without worrying about checking for NULL.
As #CrazyJugglerDrummer says, message dispatching allows you to send messages to a lot of objects at a time without worrying about whether they will respond to them. This allows informal protocols and sending messages to all objects in a container.
I'm not 100% sure of this, but I think categories (adding methods to already-existing classes) are made possible through dynamic message dispatch.
Message sending allows for message forwarding (for instance with NSProxy subclasses).
Message sending allows you to do interesting low-level hacking such as method swizzling (exchanging implementations of methods at runtime).
No, there's nothing at all wrong with thinking of it like that. They are called messages because they are a layer of abstraction over functions. Part of this comes from Objective C's type system. A better understanding of messages helps:
full source on wikipedia (I've picked out some of the more relevant issues)
Internal names of the function are
rarely used directly. Generally,
messages are converted to function
calls defined in the Objective-C
runtime library. It is not necessarily
known at link time which method will
be called because the class of the
receiver (the object being sent the
message) need not be known until
runtime.
from same article:
The Objective-C model of
object-oriented programming is based
on message passing to object
instances. In Objective-C one does not
call a method; one sends a message. The object to which the
message is directed — the receiver —
is not guaranteed to respond to a
message, and if it does not, it simply
raises an exception.
Smalltalk-style programming
allows messages to go unimplemented,
with the method resolved to its
implementation at runtime. For
example, a message may be sent to a
collection of objects, to which only
some will be expected to respond,
without fear of producing runtime
errors. (The Cocoa platform takes
advantage of this, as all objects in a
Cocoa application are sent the
awakeFromNib: message as the
application launches. Objects may
respond by executing any
initialization required at launch.)
Message passing also does not require
that an object be defined at compile
time.
On a C function call, the compiler replaces the selector with a call to a function, and execution jumps in response to the function call.
In Objective-C methods are dynamically bound to messages, which means that method names are resolved to implementations at runtime. Specifically, the object is examined at runtime to see if it contains a pointer to an implementation for the given selector.
As a consequence, Objective-C lets you load and link new classes and categories while it’s running, and perform techniques like swizzling, categories, object proxies, and others. None of this is possible in C.
Was taught this in my Java class. I would say they only have realistic differences in multithreaded scenarios, where message-passing is a very legitimate and often-used technique.

Is objc_msgSend() the significant piece that makes Objective-C object oriented?

While reading the documentation, I wonder if objc_msgSend() is actually the "core technology" in delivering the functionality for making Objective-C "object oriented". Maybe someone can explain in more detail which other pieces come into place to enable the object oriented paradigm of Objective-C?
Not entirely.
Objective-C is object oriented solely because it encapsulates data and functionality into a single container; a class.
That is pretty much all there is to "object oriented programming".
Now, there are many different kinds of object oriented programming and one critical aspect is whether or not a language uses dynamic or static dispatch.
In a statically dispatched language -- C++ is the best example (yes, I know it has virtual methods that give a form of dynamic dispatch) -- a method call is wired up at compile time and cannot change at runtime. That is, the implementation of the method that will be used to fulfill the method call is fixed during compilation and cannot change at runtime.
With a dynamically dispatched language like Objective-C, the implementation of the method that will be used to fulfill a method call is determined each time the method call happens. Thus, through the use of categories or the runtime's API, it is possible to change a method's implementation while an application is running (this is actually how Key Value Observation works, for example).
objc_msgSend() is the hook that does the dynamic dispatch. It takes a reference to an object or a class & a method name -- a selector or SEL, as it is called -- and looks up the implementation on the object or class that goes by that method name. Once the implementation is found, it is called.
If no implementation is found, objc_msgSend() will then take a series of steps to see if the class or instance wants to handle the unrecognized method call somehow, allowing one object to stand in for another (proxying) or similar functionality.
There is a lot more to it than that. I would suggest you read Apple's documentation for more information.
There's quite a bit more to it.