Inner classes in Smalltalk

I wonder why Smalltalk doesn't make use of Java-style inner classes. This mechanism effectively allows you to define a new instance of a new class on the fly, where you need it, when you need it. It comes in handy when you need an object conforming to some specific protocol but you don't want to create a normal class for it, because it is temporary, local, and very implementation-specific.
As far as I know, it could be done easily, since the syntax for subclassing is a standard message send. And you can pass self to it so it has the notion of the "outer" object. The only issue is anonymity: the class should not be present in the object browser and must be garbage collected when no instances of it exist.
The question is: Has anyone thought of this?

There are really two answers here:
1 - Yes, it is not hard to create anonymous classes that automatically get garbage collected. In Squeak they are called "uniclasses" because the typical use case is for adding methods to a single object. Systems that use this are for example Etoys and Tweak (although in Etoys the classes are actually put into the SystemDict for historic reasons). Here's some Squeak code I recently used for it:
newClass := ClassBuilder new
    newSubclassOf: baseClass
    type: baseClass typeOfClass
    instanceVariables: instVars
    from: nil.
baseClass removeSubclass: newClass.
^newClass
Typically, you would add a convenience method to do this. You can then add methods and create an instance, and when all instances are gone, the class will be gc'ed too.
Note that in Java, the class object is not gc'ed - an inner class is compiled exactly like a regular class, it's only hidden by the compiler. In contrast, in Smalltalk this all happens at runtime, even the compiling of new methods for this class, which makes it comparatively inefficient. There is a much better way to create anonymous precompiled behavior, which brings us to answer 2:
2 - Even though it's not hard, it's rarely used in Smalltalk. The reason for that is that Smalltalk has a much more convenient mechanism. Inner classes in Java are most often used for making up a method on the fly implementing a specific interface. The inner class declaration is only needed to make the compiler happy for type safety. In Smalltalk, you simply use block closures. This lets you create behavior on the fly that you can pass around. The system libraries are structured in a way to make use of block closures.
I personally never felt that inner classes were something Smalltalk needed.
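To make the block-closure point concrete for readers who don't read Smalltalk, here is a minimal sketch in Python, whose closures play a similar role to blocks. The names select and make_threshold_filter are made up for illustration; the idea is only that a piece of behavior which captures local state can be passed around without declaring a class for it.

```python
from typing import Callable, List


def select(items: List[int], predicate: Callable[[int], bool]) -> List[int]:
    """Keep the items for which predicate answers True (in the spirit of select:)."""
    return [item for item in items if predicate(item)]


def make_threshold_filter(limit: int) -> Callable[[int], bool]:
    # The returned closure captures `limit` from the enclosing scope,
    # which is the job an anonymous inner class would do in Java.
    return lambda value: value > limit


big = select([1, 5, 10, 20], make_threshold_filter(5))
```

No interface declaration and no throwaway class are needed; the closure is the whole "object".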

If you are thinking of using inner classes for tests, then you can also take a look at the class ClassFactoryForTestCase.

Creating anonymous classes in Smalltalk is a piece of cake.
More than that, any object whose first three instance variables are properly set to its superclass, its method dictionary, and its instance format can serve as a class (that is, have instances).
So I don't see what the problem is here.
If you are talking about tool support, like browsing or navigating the code contained in such classes, that is a different story, because by default all classes in the system are public, and the system dictionary is a flat namespace of them (yes, some implementations have namespaces). This simple model works quite well most of the time.
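For comparison, the same idea can be sketched in Python, where the built-in type(name, bases, namespace) creates a class at runtime without binding it to any global name. This is an analogue rather than the Squeak mechanism; Base and make_anonymous_subclass are made-up names, and the garbage-collection behavior shown is what I would expect on CPython.

```python
import gc
import weakref


class Base:
    def greet(self):
        return "hello from " + type(self).__name__


def make_anonymous_subclass(base, methods):
    # Build a class at runtime; nothing registers it under a global name.
    return type("Anonymous", (base,), dict(methods))


cls = make_anonymous_subclass(Base, {"shout": lambda self: self.greet().upper()})
instance = cls()
result = instance.shout()

# Track the class without keeping it alive.
probe = weakref.ref(cls)
del cls
gc.collect()
alive_while_instance_exists = probe() is not None  # the instance still refers to its class

del instance
gc.collect()
collected_after_last_instance = probe() is None
```

As in the Squeak case, the class lives exactly as long as something (here, an instance) still refers to it.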

I am pretty sure it could be done with some hacking around the Class and Metaclass protocol. The question pops up quite often from people who have more experience in Java and are becoming interested in Smalltalk. Since inner classes have not been implemented in spite of that, I take it as a sign that most Smalltalk users do not find them useful. This might be because Smalltalk has blocks, which solve, in a simpler manner, many if not all of the problems that led to the introduction of inner classes in Java.

(a) You could send the messages to create a new class from inside the method of another class
(b) I doubt that there is any benefit in hiding the resulting class from the introspection system
(c) The reason you use inner classes in Java is because there are no first-class functions. If you need to pass a piece of code in Smalltalk, you just pass a block. You don't need to wrap it up with some other type of object to do so.

The problem (in Squeak at least) comes from the lack of a clean separation of concerns. It's trivial to create your own subclass and put it in a private SystemDictionary:
myEnv := SystemDictionary new.
myClass := ClassBuilder new
    name: 'MyClass'
    inEnvironment: myEnv
    subclassOf: Object
    type: #normal
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'MyCategory'
    unsafe: false.
But even though you put that class in your own SystemDictionary, the 'MyCategory' category is still added to the system navigation (verifiable by opening a Browser), and, worse, the class organisers aren't created, so when you navigate to MyClass you get a nil pointer.
It's certainly not impossible, theoretically. Right now the tooling's geared towards a single pool of globally visible class definitions.

Related

extending objects at run-time via categories?

Objective-C’s objects are pretty flexible when compared to similar languages like C++ and can be extended at runtime via Categories or through runtime functions.
Any idea what this sentence means? I am relatively new to Objective-C
While technically true, it may be confusing to the reader to call category extension "at runtime." As Justin Meiners explains, categories allow you to add additional methods to an existing class without requiring access to the existing class's source code. The use of categories is fairly common in Objective-C, though there are some dangers. If two different categories add the same method to the same class, then the behavior is undefined. Since you cannot know whether some other part of the system (perhaps even a system library) adds a category method, you typically must add a prefix to prevent collisions (for example rather than swappedString, a better name would likely be something like rnc_swappedString if this were part of RNCryptor for instance.)
As I said, it is technically true that categories are added at runtime, but from the programmer's point of view, categories are written as though just part of the class, so most people think of them as being a compile-time choice. It is very rare to decide at runtime whether to add a category method or not.
As a beginner, you should be aware of categories, but slow to create new ones. Creating categories is a somewhat intermediate-level skill. It's not something to avoid, but not something you'll use every day. It's very easy to overuse them. See Justin's link for more information.
On the other hand, "runtime functions" really do add new functionality to existing classes or even specific objects at runtime, and are completely under the control of code. You can, at runtime, modify a class such that it responds to a method it didn't previously respond to. You can even generate entirely new classes at runtime that did not exist when the program was compiled, and you can change the class of existing objects. (This is exactly how Key-Value Observation is implemented.)
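The three capabilities just listed (teaching an existing class a new method, generating a new class at runtime, and changing the class of a live object) can be illustrated with a small Python analogue. This is not the Objective-C runtime API; Account and ObservedAccount are hypothetical names, and the re-classing step only mirrors the general shape of what KVO is described as doing above.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


account = Account(100)
responds_before = hasattr(account, "describe")  # the class has no such method yet


# 1. Add a method to an existing class at runtime; live instances see it at once.
def describe(self):
    return f"balance={self.balance}"


Account.describe = describe
plain_description = account.describe()

# 2. Generate a brand-new subclass that did not exist when the program was written.
Observed = type(
    "ObservedAccount",
    (Account,),
    {"describe": lambda self: "observed " + Account.describe(self)},
)

# 3. Change the class of the existing object to the generated subclass.
account.__class__ = Observed
observed_description = account.describe()
```

The object keeps its state throughout; only the code it dispatches to changes.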
Modifying classes and objects using the runtime is an advanced skill. You should not even consider using these techniques in production code until you have significant experience. And when you have that experience, it will tell you that you very seldom want to do this anyway. You will know the runtime functions because they are C-based, with names like method_exchangeImplementations. You won't mistake them for normal ObjC (and you generally have to import objc/runtime.h to get to them.)
There is a middle ground that bleeds into runtime manipulation, called message forwarding and dynamic message resolution. This is often used for proxy objects, and is implemented with -forwardingTargetForSelector:, +resolveInstanceMethod:, and some similar methods. These are tools that allow classes to modify themselves at runtime, and they are much less dangerous than modifying other classes (i.e. "swizzling").
It's also important to consider how all of this translates to Swift. In general, Swift has discouraged and restricted the use of runtime class manipulation, but it embraces (and improves) category-like extensions. By the time you're experienced enough to dig into the runtime, you will likely find it an even more obscure skill than it is today. But you will use extensions (Swift's version of categories) in every program.
A category allows you to add functionality to an existing class that you do not have access to source code for (System frameworks, 3rd party APIs etc). This functionality is possible by adding methods to a class at runtime.
For example, let's say I wanted to add a method called -swappedString to NSString that swaps uppercase and lowercase letters. In static languages (such as C++), extending classes like this is more difficult. I would have to create a subclass of NSString (or a helper function). While my own code could take advantage of my subclass, any instance created in a library would not use my subclass and would not have my method.
Using categories I can extend any class, such as NSString, with a -swappedString method and use it transparently on any instance of the class: [anyString swappedString];.
You can learn more details from Apple's Docs

Inconsistencies in smalltalk

I'm a newcomer to Smalltalk, and learned it in Squeak. But I find many things confusing in Smalltalk. In Squeak, Metaclass and Metaclass class are each other's class. If I want to create the object Metaclass, I should send the message new to its class, which is Metaclass class. But that class must already exist as an object in order to accept the message. So I must create the object Metaclass class first, which can only be done by sending the message new to the object Metaclass, which has not been created yet. So it is a chicken-or-the-egg problem.
Of course I can create objects in Squeak now, because the Metaclass and Metaclass class objects have already been created auto-magically when Squeak is opened. But I don't know how. Maybe they are created somehow other than by sending messages. But then that contradicts Smalltalk's spirit: everything happens by sending messages except at a few points (variable declarations, assignments, returns and primitives).
Is there something wrong with the above reasoning? Thanks in advance.
Your question is twofold; let's answer the parts separately.
How do mutually dependent classes get created?
You are right, Metaclass and Metaclass class are a singularity in the parallel hierarchy of the Smalltalk classes and metaclasses. How are they created?
That depends on the Smalltalk you are using. For GNU Smalltalk I am unsure, but for the descendants of the original Smalltalk-80 (VisualWorks, VA aka VisualAge, Squeak, Pharo) they are created in a bootstrap process that creates an initial image.
However, at least for Squeak, this bootstrap happened at least 15 years ago, if not more. Metaclass and its class may even be as old as 30 years.
Long story short, both classes are created outside the typical image processing and linked together manually.
But if the objects are years old, that leads to the question
What happens at Smalltalk’s startup?
Contrary to languages like Ruby or Python, which are object-oriented, too, Smalltalk does not need to create a basic object environment with things like Object on every startup. Why?
When Smalltalk saves and shuts down, it basically takes a snapshot of all its objects and saves those live objects to a file. When it starts up again, it just has to read the objects from the snapshot and "revive" them.
Hence, for Metaclass and Metaclass class, both objects are read from the snapshot and revived, and from this point on, they are fully functional; they don’t need to be manually created anymore.
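A toy illustration of the "revive from a snapshot" idea, using Python's pickle module on two dictionaries that point at each other. This is only an analogy for an image file, not how any Smalltalk VM stores its image; the dictionary keys are made up. The point is that the mutual reference comes back from the snapshot already wired up, so nobody has to create one object "first".

```python
import pickle

# Two stand-in objects that refer to each other, like Metaclass and Metaclass class.
metaclass = {"name": "Metaclass"}
metaclass_class = {"name": "Metaclass class"}
metaclass["class"] = metaclass_class
metaclass_class["class"] = metaclass

# "Save the image": serialize the live object graph, cycle included.
snapshot = pickle.dumps(metaclass)

# "Start up again": read the objects back; the loop is restored as-is.
revived = pickle.loads(snapshot)
revived_class = revived["class"]
loop_is_intact = revived_class["class"] is revived
```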
The 'automagically created' process actually is called bootstrapping. This is how the chicken-and-egg problem gets solved. Once the system is bootstrapped, all the rest can be expressed in terms of the system itself. So, there is no contradiction with Smalltalk's philosophy that everything happens by sending messages because it only becomes a Smalltalk system once it's bootstrapped.
Metaclass class class = Metaclass is a classical academic example of strange loop. But if you inquire a bit, you could find many others in Smalltalk.
Object superclass is nil, which is an instance of UndefinedObject, which is a subclass of Object (the chain is longer via ProtoObject in Squeak: Object superclass superclass class superclass = Object).
The methods of MethodDictionary are stored in an instance of MethodDictionary (MethodDictionary methodDictionary class = MethodDictionary).
The name of Symbol is an instance of Symbol (works with ByteSymbol in Squeak ByteSymbol name class = ByteSymbol).
The subclasses of ArrayedCollection are stored in an instance of Array which is a subclass of ArrayedCollection (Array superclass subclasses class = Array).
Smalltalk is a SystemDictionary which points to Smalltalk via the #Smalltalk key (This is less direct in Squeak, (Smalltalk globals at: #Smalltalk) = Smalltalk).
I let you continue the list by yourself.
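Smalltalk is not alone here. As a point of comparison, Python's type and object form the same kind of strange loop, and it can be checked from any interpreter. The dictionary below is just a convenient way to name each check.

```python
checks = {
    # type is its own class, much like the Metaclass / Metaclass class loop
    "type is an instance of itself": type(type) is type,
    # object is created by type ...
    "object is an instance of type": isinstance(object, type),
    # ... yet type inherits from object
    "type is a subclass of object": issubclass(type, object),
    "type is an instance of object": isinstance(type, object),
    # the root of the hierarchy has no superclass
    "object has no base class": object.__bases__ == (),
}

for description, holds in checks.items():
    print(description, "->", holds)
```

Here too, the loop is set up by the interpreter's own bootstrap code rather than by ordinary class definitions.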
Whatever the implementation, the ultimate question is whether or not you can devise a self-describing system like Smalltalk without these strange loops, and you may glimpse a not so positive answer if you follow the sublinks of http://en.wikipedia.org/wiki/Kurt_G%C3%B6del#The_Incompleteness_Theorem
Related to the bootstrap problem encountered with such systems, an efficient approach is to clone oneself in order to change oneself, and this is particularly true in a Smalltalk image when you want to change the base classes that you are using for changing or describing classes.
Hence my previous and concise answer which was deleted by applying the letter of the rules (https://stackoverflow.com/help/deleted-answers) more than the spirit in my opinion:
And here is how it was resolved: http://en.wikipedia.org/wiki/Drawing_Hands
Last point, I would have preferred to read, Incredible Consistency of Smalltalk, but I'm definitely biased.

can overriding of a method be prevented by downcasting to a superclass?

I'm trying to understand whether the answer to the following question is the same in all major OOP languages; and if not, then how do those languages differ.
Suppose I have class A that defines methods act and jump; method act calls method jump. A's subclass B overrides method jump (i.e., the appropriate syntax is used to ensure that whenever jump is called, the implementation in class B is used).
I have object b of class B. I want it to behave exactly as if it was of class A. In other words, I want the jump to be performed using the implementation in A. What are my options in different languages?
For example, can I achieve this with some form of downcasting? Or perhaps by creating a proxy object that knows which methods to call?
I would want to avoid creating a brand new object of class A and carefully setting up the sharing of internal state between a and b because that's obviously not future-proof, and complicated. I would also want to avoid copying the state of b into a brand new object of class A because there might be a lot of data to copy.
UPDATE
I asked this question specifically about Python, and it seems this is impossible to achieve cleanly in Python, although technically it can be done... kinda.
It appears that apart from technical feasibility, there's a strong argument against doing this from a design perspective. I'm asking about that in a separate question.
The comments reiterated: Prefer composition over inheritance.
Inheritance works well when your subclasses have well defined behavioural differences from their superclass, but you'll frequently hit a point where that model gets awkward or stops making sense. At that point, you need to reconsider your design.
Composition is usually the better solution. Delegating your object's varying behaviour to a different object (or objects) may reduce or eliminate your need for subclassing.
In your case, the behavioural differences between class A and class B could be encapsulated in the Strategy pattern. You could then change the behaviour of class A (and class B, if still required) at the instance level, simply by assigning a new strategy.
The Strategy pattern may require more code in the short run, but it's clean and maintainable. Method swizzling, monkey patching, and all those cool things that allow us to poke around in our specific language implementation are fun, but the potential for unexpected side effects is high and the code tends to be difficult to maintain.
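A minimal sketch of that suggestion in Python, reusing the act and jump names from the question. The classes HighJump, LongJump and Actor are invented for illustration; the point is that the varying behavior lives in a separate object that can be swapped per instance, so no subclass has to be "undone".

```python
class HighJump:
    def jump(self):
        return "high jump"


class LongJump:
    def jump(self):
        return "long jump"


class Actor:
    def __init__(self, jump_strategy):
        # The varying behavior is delegated instead of being overridden.
        self.jump_strategy = jump_strategy

    def act(self):
        return "acting, then " + self.jump_strategy.jump()


actor = Actor(HighJump())
first = actor.act()

# Changing behavior at the instance level is a plain assignment.
actor.jump_strategy = LongJump()
second = actor.act()
```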
What you are asking is completely unrelated to, and unsupported by, OOP.
If you subclass A with class B and override its methods, then when a concrete instance of B is created, all the overridden/new implementations of the base methods are associated with it (whether we are talking about Java, or C++ with virtual tables, etc.).
You have instantiated object B.
Why would you expect that you could/would/should be able to call the method of the superclass if you have overriden that method?
You could call it explicitly of course, e.g. by calling super inside the method, but you cannot do it automatically, and casting will not help you do that either.
I can't imagine why you would want to do that.
If you need to use class A then use class A.
If you need to override its functionality then use its subclass B.
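Since the question was originally about Python, here is a small sketch of what "explicitly" buys you there, using the A, B, act and jump names from the question. Calling the superclass function directly works for that one call, but any self.jump() inside act is still dispatched on the object's real class.

```python
class A:
    def act(self):
        return "act -> " + self.jump()

    def jump(self):
        return "A.jump"


class B(A):
    def jump(self):
        return "B.jump"


b = B()

dispatched = b.act()      # normal dynamic dispatch reaches B's override
explicit = A.jump(b)      # naming A's function directly bypasses the override once
still_dynamic = A.act(b)  # A.act runs, but its inner self.jump() is B's again
```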
Most programming languages go to some trouble to support dynamic dispatch of virtual functions (the case of calling the overridden method jump in a subclass instead of the parent class's implementation) -- to the degree that working around it or avoiding it is difficult. In general, specialization/polymorphism is a desirable feature -- arguably a goal of OOP in the first place.
Take a look at the Wikipedia article on Virtual Functions, which gives a useful overview of the support for virtual functions in many programming languages. It will give you a place to start when considering a specific language, as well as the trade-offs to weigh when looking at a language where the programmer can control how dispatch behaves (see the section on C++, for example).
So loosely, the answer to your question is, "No, the behavior is not the same in all programming languages." Furthermore, there is no language independent solution. C++ may be your best bet if you need the behavior.
You can actually do this with Python (sort of), with some awful hacks. It requires that you implement something like the wrappers we were discussing in your first Python-specific question, but as a subclass of B. You then need to implement write-proxying as well (the wrapper object shouldn't contain any of the state normally associated with the class hierarchy; it should redirect all attribute access to the underlying instance of B).
But rather than redirecting method lookup to A and then calling the method with the wrapped instance, you'd call the method passing the wrapper object as self. This is legal because the wrapper class is a subclass of B, so the wrapper instance is an instance of the classes whose methods you're calling.
This would be very strange code, requiring you to dynamically generate classes using both IS-A and HAS-A relationships at the same time. It would probably also end up fairly fragile and have bizarre results in a lot of corner cases (you generally can't write 100% perfect wrapper classes in Python exactly because this sort of strange thing is possible).
I'm completely leaving aside whether this is a good idea or not.
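For the curious, here is one rough way the hack described above might look. It is a sketch under my own assumptions, not the code from the earlier question: the helper as_a and the class View are invented names, only the jump method is restored to A's version, and corner cases (special methods, slots, copying) are ignored.

```python
class A:
    def act(self):
        return "act -> " + self.jump()

    def jump(self):
        return "A.jump"


class B(A):
    def jump(self):
        return "B.jump"


def as_a(target):
    """Return a view of `target` (a B) whose jump is A's implementation."""

    class View(B):              # IS-A: methods of A and B accept it as self
        jump = A.jump           # put A's implementation back in front

        def __init__(self, wrapped):
            # HAS-A: store the wrapped object without triggering __setattr__ below.
            object.__setattr__(self, "_target", wrapped)

        def __getattr__(self, name):
            # Only called for names not found normally, so state reads are proxied.
            return getattr(self._target, name)

        def __setattr__(self, name, value):
            # Write-proxying: all state lives on the wrapped instance.
            setattr(self._target, name, value)

    return View(target)


b = B()
b.height = 1
view = as_a(b)
view.height = 2  # lands on b, not on the wrapper
```

With this in place, view.act() goes through A's jump while b itself is unchanged, and the single shared state sits on b.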

Is it good convention for a class to perform functions on itself?

I've always been taught that if you are doing something to an object, that should be an external thing, so one would Save(Class) rather than having the object save itself: Class.Save().
I've noticed that in the .Net libraries, it is common to have a class modify itself as with String.Format() or sort itself as with List.Sort().
My question is, in strict OOP is it appropriate to have a class which performs functions on itself when called to do so, or should such functions be external and called on an object of the class' type?
Great question. I have just recently reflected on a very similar issue and was eventually going to ask much the same thing here on SO.
In OOP textbooks, you sometimes see examples such as Dog.Bark(), or Person.SayHello(). I have come to the conclusion that those are bad examples. When you call those methods, you make a dog bark, or a person say hello. However, in the real world, you couldn't do this; a dog decides for itself when it's going to bark, and a person decides for themselves when to say hello to someone. Therefore, these methods would more appropriately be modelled as events (where supported by the programming language).
You would e.g. have a function Attack(Dog), PlayWith(Dog), or Greet(Person) which would trigger the appropriate events.
Attack(dog) // triggers the Dog.Bark event
Greet(johnDoe) // triggers the Person.SaysHello event
As soon as you have more than one parameter, it won't be so easy deciding how to best write the code. Let's say I want to store a new item, say an integer, into a collection. There's many ways to formulate this; for example:
StoreInto(1, collection) // the "classic" procedural approach
1.StoreInto(collection) // possible in .NET with extension methods
Store(1).Into(collection) // possible by using state-keeping temporary objects
According to the thinking laid out above, the last variant would be the preferred one, because it doesn't force an object (the 1) to do something to itself. However, if you follow that programming style, it will soon become clear that this fluent interface-like code is quite verbose, and while it's easy to read, it can be tiring to write or even hard to remember the exact syntax.
P.S.: Concerning global functions: In the case of .NET (which you mentioned in your question), you don't have much choice, since the .NET languages do not provide for global functions. I think these would be technically possible with the CLI, but the languages disallow that feature. F# has global functions, but they can only be used from C# or VB.NET when they are packed into a module. I believe Java also doesn't have global functions.
I have come across scenarios where this lack is a pity (e.g. with fluent interface implementations). But generally, we're probably better off without global functions, as some developers might always fall back into old habits, and leave a procedural codebase for an OOP developer to maintain. Yikes.
Btw., in VB.NET, however, you can mimic global functions by using modules. Example:
Globals.vb:
Module Globals
    Public Sub Save(ByVal obj As SomeClass)
        ...
    End Sub
End Module
Demo.vb:
Imports Globals
...
Dim obj As SomeClass = ...
Save(obj)
I guess the answer is "It Depends"... for Persistence of an object I would side with having that behavior defined within a separate repository object. So with your Save() example I might have this:
repository.Save(class)
However with an Airplane object you may want the class to know how to fly with a method like so:
airplane.Fly()
This is one of the examples I've seen from Fowler about an anemic domain model. I don't think in this case you would want to have a separate service like this:
new airplaneService().Fly(airplane)
With static methods and extension methods it makes a ton of sense, like in your List.Sort() example. So it depends on your usage patterns. You wouldn't want to have to new up an instance of a ListSorter class just to be able to sort a list like this:
new listSorter().Sort(list)
In strict OOP (Smalltalk or Ruby), all methods belong to an instance object or a class object. In "real" OOP (like C++ or C#), you will have static methods that essentially stand completely on their own.
Going back to strict OOP, I'm more familiar with Ruby, and Ruby has several "pairs" of methods that either return a modified copy or modify the object in place -- a method ending with a ! indicates that the message modifies its receiver. For instance:
>> s = 'hello'
=> "hello"
>> s.reverse
=> "olleh"
>> s
=> "hello"
>> s.reverse!
=> "olleh"
>> s
=> "olleh"
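The same split between "give me a changed copy" and "change yourself" exists in other languages too. As a quick illustration in Python (chosen only for comparison), the free function sorted returns a new list while the method list.sort modifies its receiver:

```python
words = ["pear", "apple", "fig"]

# Free function: returns a sorted copy and leaves the receiver alone.
sorted_copy = sorted(words)
after_sorted = list(words)

# Method on the object: sorts in place and returns None to signal the mutation.
return_value = words.sort()
after_sort = list(words)
```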
The key is to find some middle ground between pure OOP and pure procedural that works for what you need to do. A Class should do only one thing (and do it well). Most of the time, that won't include saving itself to disk, but that doesn't mean Class shouldn't know how to serialize itself to a stream, for instance.
I'm not sure what distinction you seem to be drawing when you say "doing something to an object". In many if not most cases, the class itself is the best place to define its operations, as under "strict OOP" it is the only code that has access to internal state on which those operations depend (information hiding, encapsulation, ...).
That said, if you have an operation which applies to several otherwise unrelated types, then it might make sense for each type to expose an interface which lets the operation do most of the work in a more or less standard way. To tie it in to your example, several classes might implement an interface ISaveable which exposes a Save method on each. Individual Save methods take advantage of their access to internal class state, but given a collection of ISaveable instances, some external code could define an operation for saving them to a custom store of some kind without having to know the messy details.
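A rough sketch of that arrangement, written in Python with an abstract base class standing in for the ISaveable interface. The classes User and Order, the save_all function and the dictionary "store" are all invented for illustration.

```python
from abc import ABC, abstractmethod


class Saveable(ABC):
    """Stand-in for the ISaveable interface."""

    @abstractmethod
    def save(self, store: dict) -> None:
        ...


class User(Saveable):
    def __init__(self, name):
        self._name = name  # internal state only the class itself touches

    def save(self, store):
        store["user"] = self._name


class Order(Saveable):
    def __init__(self, total):
        self._total = total

    def save(self, store):
        store["order_total"] = self._total


def save_all(items, store):
    # External code drives the operation without knowing any class internals.
    for item in items:
        item.save(store)
    return store


saved = save_all([User("ada"), Order(42)], {})
```

Each class keeps its internals private, and the external operation only depends on the shared save method.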
It depends on what information is needed to do the work. If the work is unrelated to the class (mostly equivalently, can be made to work on virtually any class with a common interface), for example, std::sort, then make it a free function. If it must know the internals, make it a member function.
Edit: Another important consideration is performance. In-place sorting, for example, can be miles faster than returning a new, sorted copy. This is why quicksort is faster than merge sort in the vast majority of cases, even though merge sort is theoretically faster: quicksort can be performed in-place, whereas I've never heard of an in-place merge sort. Just because it's technically possible to perform an operation within the class's public interface doesn't mean that you actually should.

Discover subclasses of a given class in Obj-C

Is there any way to discover at runtime which subclasses exist of a given class?
Edit: From the answers so far I think I need to clarify a bit more what I am trying to do. I am aware that this is not a common practice in Cocoa, and that it may come with some caveats.
I am writing a parser using the dynamic creation pattern. (See the book Cocoa Design Patterns by Buck and Yacktman, chapter 5.) Basically, the parser instance processes a stack, and instantiates objects that know how to perform certain calculations.
If I can get all the subclasses of the MYCommand class, I can, for example, provide the user with a list of available commands. Also, in the example from chapter 5, the parser has a substitution dictionary so operators like +, -, * and / can be used. (They are mapped to MYAddCommand, etc.) To me it seemed this information belonged in the MYCommand subclass, not the parser instance, as that kinda defeats the idea of dynamic creation.
Not directly, no. You can however get a list of all classes registered with the runtime as well as query those classes for their direct superclass. Keep in mind that this doesn't allow you to find all ancestors for the class up the inheritance tree, just the immediate superclass.
You can use objc_getClassList() to get the list of Class objects registered with the runtime. Then you can loop over that array and call [NSObject superclass] on those Class objects to get their superclass' Class object. If for some reason your classes do not use NSObject as their root class, you can use class_getSuperclass() instead.
I should mention as well that you might be thinking about your application's design incorrectly if you feel it is necessary to do this kind of discovery. Most likely there is another, more conventional way to do what you are trying to accomplish that doesn't involve introspecting on the Objective-C runtime.
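The loop described above (take every registered class, then follow superclass links upward) can be sketched in Python to show the shape of the algorithm. This is an analogue, not Objective-C runtime code: the registered list stands in for what objc_getClassList() would return, __base__ stands in for the superclass query, and the command classes are made up.

```python
def is_descendant(cls, ancestor):
    # Each class only knows its immediate superclass, so walk the chain upward.
    current = cls.__base__
    while current is not None:
        if current is ancestor:
            return True
        current = current.__base__
    return False


def subclasses_of(ancestor, registered):
    return [cls for cls in registered if is_descendant(cls, ancestor)]


class Command: pass
class AddCommand(Command): pass
class FastAddCommand(AddCommand): pass
class Unrelated: pass


# Stand-in for the full list of classes known to the runtime.
registered = [Command, AddCommand, FastAddCommand, Unrelated]
found = subclasses_of(Command, registered)
```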
Rather than try to automatically register all the subclasses of MYCommand, why not split the problem in two?
First, provide API for registering a class, something like +[MYCommand registerClass:].
Then, create code in MYCommand that means any subclasses will automatically register themselves. Something like:
@implementation MYCommand
+ (void)load
{
    [MYCommand registerClass:self];
}
@end
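The self-registration idea carries over to other languages. As an illustration only, here is how it might look in Python, where the __init_subclass__ hook runs whenever a subclass is defined. The names Command, register_class and registry are assumptions that mirror the MYCommand example rather than any real API.

```python
class Command:
    registry = {}

    @classmethod
    def register_class(cls, command_class):
        # Counterpart of +[MYCommand registerClass:] in the snippet above.
        Command.registry[command_class.__name__] = command_class

    def __init_subclass__(cls, **kwargs):
        # Called automatically for every subclass at definition time.
        super().__init_subclass__(**kwargs)
        Command.register_class(cls)


class AddCommand(Command):
    pass


class SubtractCommand(Command):
    pass


available = sorted(Command.registry)
```

The parser can then build its list of available commands from the registry instead of scanning every class in the runtime.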
Marc and bbum hit it on the money. This is usually not a good idea.
However, we have code on our CocoaHeads wiki that does this: http://cocoaheads.byu.edu/wiki/getting-all-subclasses
Another approach was just published by Matt Gallagher on his blog.
There's code in my runtime browser project here that includes a -subclassNamesForClass: method. See the RuntimeReporter.[hm] files.