What is the idiomatic way to define an enum type in Smalltalk?

Like in Java, C# etc. How would I define something like enum Direction { input, output } in Smalltalk?

Trivial Approach
The most trivial approach is to have class-side methods returning symbols or other basic objects (such as integers).
So you can write your example as follows:
Direction class>>input
^ #input
Direction class>>output
^ #output
And the usage:
Direction input
The main downsides are:
any other "enum" that happens to return the same value will be equal to this one, even though the enums are different (you could return e.g. ^ self name , '::input')
during debugging, you see the actual value of the object, which is especially ugly for number-based enums (Uh... what does this 7 mean?)
Object Approach
A better way is to create your own enum object and return instances of it.
Such an object should:
override = and hash, so you can compare them by your value and use the enum as key in hashed collections (dictionaries)
store the actual unique value representation
have custom printOn: method to ease debugging
It could look something like this
Object subclass: #DirectionEnum
slots: { #value }
classVariables: { }
category: 'DirectionEnum'
"enum accessors"
DirectionEnum class>>input
^ self new value: #input
DirectionEnum class>>output
^ self new value: #output
DirectionEnum>>value
^ value
DirectionEnum>>value: aValue
value := aValue
DirectionEnum>>= anEnum
^ self class = anEnum class and: [ self value = anEnum value ]
DirectionEnum>>hash
^ self class hash bitXor: self value hash
DirectionEnum>>printOn: aStream
super printOn: aStream.
aStream << '(' << self value asString << ')'
The usage is still the same:
DirectionEnum input.
DirectionEnum output asString. "'a DirectionEnum(output)'"
and the problems mentioned in the trivial approach are resolved.
Obviously this is much more work, but the result is better. If you have many enums, then it could make sense to move the basic behavior to a new superclass Enum, and then DirectionEnum would just need to contain the class-side methods.

The closest Smalltalk feature to an enum type is the SharedPool (a.k.a. PoolDictionary). Therefore, if you are porting some enum from, say, Java to Smalltalk, you might want to use a SharedPool. Here is how to do so:
For every member of your enum you define an association in the pool, with the member name as the key and the member value as the value.
In some dialects PoolDictionaries are dictionaries, in Pharo they are subclasses of SharedPool. In Pharo, therefore, you declare all the type names as class variables. Then you associate values to keys in an initialization method (class side).
For example, you could define a subclass of SharedPool named ColorConstants with the class variables 'Red', 'Green', 'Blue', 'Black', 'White', etc., like this:
SharedPool subclass: #ColorConstants
instanceVariableNames: ''
classVariableNames: 'Red Green Blue Black White'
poolDictionaries: ''
package: 'MyPackage'
To associate names with values, add a class side initialization method on the lines of:
ColorConstants class >> initialize
Red := Color r: 1 g: 0 b: 0.
Green := Color r: 0 g: 1 b: 0.
Blue := Color r: 0 g: 0 b: 1.
Black := Color r: 0 g: 0 b: 0.
White := Color r: 1 g: 1 b: 1.
"and so on..."
Once you evaluate ColorConstants initialize you will be able to use the pool in your class
Object subclass: #MyClass
instanceVariableNames: 'blah'
classVariableNames: ''
poolDictionaries: 'ColorConstants'
package: 'MyPackage'
In MyClass (and its subclasses) you can refer to the colors by name:
MyClass >> displayError: aString
self display: aString foreground: Red background: White
MyClass >> displayOk: aString
self display: aString foreground: Green background: Black

For simple cases, just use symbols as Peter suggested - you could also just store the possible values in an IdentityDictionary.
If you mean the more powerful kind of enums that are available in Java (where they can be more than just a type of named constant; they can have behaviour, complex attributes, etc.), then I'd take it a step further than Peter and just create a subclass for each enum type. Even if you're talking a large number of enums (your use case seems to need only two), don't be afraid of having many subclasses that only have one or two methods in them that are just used to differentiate them from each other, with most of the work done in the common superclass.

Since Smalltalk is dynamically typed, you cannot restrict the value of a variable to a subset of objects anyway, so there is no difference between enum members and global constants, except for the namespacing through the enum name.
Edit: For options on how to define your enum constants, see Peter's answer. Just let me mention that you can also use symbols directly if that is sufficient for your needs: direction := #input. direction := #output


Mixing Private and Public Attributes and Accessors in Raku

#Private attribute example
class C {
    has $!w; #private attribute
    multi method w { $!w } #getter method
    multi method w ( $_ ) { #setter method
        warn “Don’t go changing my w!”; #some side action
        $!w = $_
    }
}
my $c = C.new;
$c.w( 42 );
say $c.w; #prints 42
$c.w: 43;
say $c.w; #prints 43
#but not
$c.w = 44;
Cannot modify an immutable Int (43)
so far, so reasonable, and then
#Public attribute example
class C {
    has $.v is rw #public attribute with automatic accessors
}
my $c = C.new;
$c.v = 42;
say $c.v; #prints 42
#but not
$c.v( 43 ); #or $c.v: 43
Too many positionals passed; expected 1 argument but got 2
I like the immediacy of the '=' assignment, but I need the ease of bunging in side actions that multi methods provide. I understand that these are two different worlds, and that they do not mix.
BUT - I do not understand why I can’t just go
$c.v( 43 )
To set a public attribute
I feel that Raku is guiding me not to mix these two modes (some attributes private and some public), and that the pressure is towards the method approach (with some sugar from the colon). Is this the intent of Raku's design?
Am I missing something?
is this the intent of Raku's design?
It's fair to say that Raku isn't entirely unopinionated in this area. Your question touches on two themes in Raku's design, which are both worth a little discussion.
Raku has first-class l-values
Raku makes plentiful use of l-values being a first-class thing. When we write:
has $.x is rw;
The method that is generated is:
method x() is rw { $!x }
The is rw here indicates that the method is returning an l-value - that is, something that can be assigned to. Thus when we write:
$obj.x = 42;
This is not syntactic sugar: it really is a method call, and then the assignment operator being applied to the result of it. This works out, because the method call returns the Scalar container of the attribute, which can then be assigned into. One can use binding to split this into two steps, to see it's not a trivial syntactic transform. For example, this:
my $target := $obj.x;
$target = 42;
Would be assigning to the object attribute. This same mechanism is behind numerous other features, including list assignment. For example, this:
($x, $y) = "foo", "bar";
Works by constructing a List containing the containers $x and $y, and then the assignment operator in this case iterates each side pairwise to do the assignment. This means we can use rw object accessors there:
($obj.x, $obj.y) = "foo", "bar";
And it all just naturally works. This is also the mechanism behind assigning to slices of arrays and hashes.
One can also use Proxy in order to create an l-value container where the behavior of reading and writing it are under your control. Thus, you could put the side-actions into STORE. However...
Raku encourages semantic methods over "setters"
When we describe OO, terms like "encapsulation" and "data hiding" often come up. The key idea here is that the state model inside the object - that is, the way it chooses to represent the data it needs in order to implement its behaviors (the methods) - is free to evolve, for example to handle new requirements. The more complex the object, the more liberating this becomes.
However, getters and setters are methods that have an implicit connection with the state. While we might claim we're achieving data hiding because we're calling a method, not accessing state directly, my experience is that we quickly end up at a place where outside code is making sequences of setter calls to achieve an operation - which is a form of the feature envy anti-pattern. And if we're doing that, it's pretty certain we'll end up with logic outside of the object that does a mix of getter and setter operations to achieve an operation. Really, these operations should have been exposed as methods with names that describe what is being achieved. This becomes even more important if we're in a concurrent setting; a well-designed object is often fairly easy to protect at the method boundary.
That said, many uses of class are really record/product types: they exist to simply group together a bunch of data items. It's no accident that the . sigil doesn't just generate an accessor, but also:
Opts the attribute into being set by the default object initialization logic (that is, a class Point { has $.x; has $.y; } can be instantiated as Point.new(x => 1, y => 2)), and also renders that in the .raku dumping method.
Opts the attribute into the default .Capture object, meaning we can use it in destructuring (e.g. sub translated(Point (:$x, :$y)) { ... }).
Which are the things you'd want if you were writing in a more procedural or functional style and using class as a means to define a record type.
The Raku design is not optimized for doing clever things in setters, because that is considered a poor thing to optimize for. It's beyond what's needed for a record type; in some languages we could argue we want to do validation of what's being assigned, but in Raku we can turn to subset types for that. At the same time, if we're really doing an OO design, then we want an API of meaningful behaviors that hides the state model, rather than to be thinking in terms of getters/setters, which tend to lead to a failure to colocate data and behavior, which is much of the point of doing OO anyway.
BUT - I do not understand why I can’t just go $c.v( 43 ) To set a public attribute
Well, that's really up to the architect. But seriously, no, that's simply not the standard way Raku works.
Now, it would be entirely possible to create an Attribute trait in module space, something like is settable, that would create an alternate accessor method that would accept a single value to set the value. The problem with doing this in core is that I think there are basically two camps in the world about the return value of such a mutator: would it return the new value, or the old value?
Please contact me if you're interested in implementing such a trait in module space.
I currently suspect you just got confused.[1] Before I touch on that, let's start over with what you're not confused about:
I like the immediacy of the = assignment, but I need the ease of bunging in side actions that multi methods provide. ... I do not understand why I can’t just go $c.v( 43 ) To set a public attribute
You can do all of these things. That is to say, you can use = assignment, multi methods, and "just go $c.v( 43 )", all at the same time if you want to:
class C {
has $!v;
multi method v is rw { $!v }
multi method v ( :$trace! ) is rw { say 'trace'; $!v }
multi method v ( $new-value ) { say 'new-value'; $!v = $new-value }
}
my $c = C.new;
$c.v = 41;
say $c.v; # 41
$c.v(:trace) = 42; # trace
say $c.v; # 42
$c.v(43); # new-value
say $c.v; # 43
A possible source of confusion[1]
Behind the scenes, has $.foo is rw generates an attribute and a single method along the lines of:
has $!foo;
method foo () is rw { $!foo }
The above isn't quite right though. Given the behavior we're seeing, the compiler's autogenerated foo method is somehow being declared in such a way that any new method of the same name silently shadows it.[2]
So if you want one or more custom methods with the same name as an attribute you must manually replicate the automatically generated method if you wish to retain the behavior it would normally be responsible for.
[1] See jnthn's answer for a clear, thorough, authoritative accounting of Raku's opinion about private vs public getters/setters and what it does behind the scenes when you declare public getters/setters (i.e. write has $.foo).
[2] If an autogenerated accessor method for an attribute was declared only, then Raku would, I presume, throw an exception if a method with the same name was declared. If it were declared multi, then it should not be shadowed if the new method was also declared multi, and should throw an exception if not. So the autogenerated accessor is being declared with neither only nor multi but instead in some way that allows silent shadowing.

Declaring a variable belonging to a user-defined class in Perl 6

When I declare a variable, whose value belongs to a built-in class, I simply write
my Int $a;
But when I want to use a user-defined class, I have to use Classname.new.
my class House {
has $.area is rw;
}
my $house1 = House.new;
$house1.area = 200;
say $house1.area;
So, my naïve question is, what's the reason of that difference? Why can't we simply write my House $house1?
My ultimate goal is to use an array whose values are instances of a user-defined class. How can I do the following correctly?
my @houses ...;
@houses[10].area = 222;
my House $a does the same as my Int $a. It puts a restriction on the values that you can put in it. If you look at the content of the variable, you will get the type object of that restriction.
There is a trick that you can use though, so you don't have to repeat the House bit: my House $a .= new, which is the equivalent of my House $a = House.new.
To get back to your question: yes, you can do that with some trouble:
class House {
    has $.area;
    multi method area(House:U \SELF:) is raw {
        (SELF = House.new).area
    }
    multi method area(House:D:) is raw {
        $!area
    }
}
my House @houses;
@houses[2].area = 42;
say @houses # [(House) (House) House.new(area => 42)]
We create two candidates for the accessor method: one taking an undefined type object, and the other an instantiated object. The first one modifies its invocant (assuming it to be a container that can be set), then calls the instantiated version of the method. I'm leaving this as an exercise to the reader to turn this into an Attribute trait.
When you write my Int $a; you will have a variable of type Int, but without value, or even container. The concrete value of $a will be (Int).
The same with my House $house; - you will get (House) value.
In your case you have to initialize array's elements by some House value. For example:
my @houses = House.new() xx 11;
@houses[10].area = 222;
I think you're missing the part that the compiler is doing some of the work for you. When you have a literal number, the parser recognizes it and constructs the right numeric object for it. There's a virtual and unseen Int.new() that has already happened for you in rakudo/src/Perl6/Actions.nqp. It's at the NQP level but it's the same idea.

Smalltalk initialization variables

In languages like Java and C++ we give parameters to constructors.
How do you do this in Pharo Smalltalk?
I want something like
aColor = Color new 'red'.
Or is this bad practice and should I always do
aColor = Color new.
aColor name:= red.
The short answer is that you can do pretty much the same in Smalltalk. From the calling code it would look like:
aColor := Color named: 'Red'.
The long answer is that in Smalltalk you don't have constructors, at least not in the sense that you have a special message named after the class. What you do in Smalltalk is define class-side messages (i.e. messages understood by the class, not the instance[*]) where you can instantiate and configure your instances. Assuming that your Color class has a name instance variable and a setter for it, the #named: method would be implemented like:
(class) Color>>named: aName
| color |
color := self new.
color name: aName.
^ color
Some things to note:
We are using the #new message sent to the class to create a new instance. You can think of the #new message as the primitive way for creating objects (hint: you can browse the implementors of the #new message to see how it is implemented).
We can define as many class methods as we want to create new 'configured' instances (e.g. Color fromHexa:) or return pre-created ones (e.g. Color blue).
You can still create an uninitialized instance by doing Color new. If you want to forbid that behavior then you must override the #new class message.
There are many good books that you can read about Smalltalk basics at Stef's Free Online Smalltalk Books
[*] This is quite natural due to the orthogonal nature of Smalltalk, since everything (including classes) is an object. If you are interested check Chapter 13 of Pharo by Example or any other reference to classes and metaclasses in Smalltalk.
In Smalltalk all member fields are strictly private and to assign to them you'll have to define assigning methods.
Color >> name: aString
name := aString
Then you could create your object like this:
aColor := (Color new)
name: 'red';
yourself.
Commonly, a factory method is used to reduce verbosity:
Color class >> withName: aName
^ (self new)
name: aName;
yourself
With this you could create new objects like this:
aColor := Color withName: 'red'.

How to call appropriate subclass constructor inside base class constructor in MATLAB

I'm trying to use single inheritance in Matlab, and to write a base class constructor that allows the creation of arrays of objects, including empty arrays, and which is inherited by subclasses. I can't work out how to do it without using some incredibly clunky code. There must be a better way.
In this toy example, my base class is called MyBaseClass, and my subclass is called MySubClass. Each can be constructed with a single numeric argument, or no arguments (in which case NaN is assumed). In the toy example my SubClass is trivial and doesn't extend the behavior of MyBaseClass in any way, but obviously in practice it would do more stuff.
I want to be able to call the constructor of each as follows:
obj = MyBaseClass; % default constructor of 'NaN-like' object
obj = MyBaseClass([]); % create an empty 0x0 array of type MyBaseClass
obj = MyBaseClass(1); % create a 1x1 array of MyBaseClass with value 1
obj = MyBaseClass([1 2; 3 4]) % create a 2x2 array of MyBaseClass with values 1, 2, 3, 4.
And the same four calls for MySubClass.
The solution I have found needs to call eval(class(obj)) in order to recover the subclass name and construct code in strings to call while in the base class constructor. This seems clunky and bad. (And it's somewhat surprising to me that it's possible, but it is.) I guess I could duplicate more logic between the MyBaseClass and MySubClass constructors, but that also seems clunky and bad, and misses the point of inheritance. Is there a better way?
% MyBaseClass.m
classdef MyBaseClass
    properties
        data = NaN
    end
    methods
        % constructor
        function obj = MyBaseClass(varargin)
            if nargin == 0
                % Handle the no-argument case
                return
            end
            arg = varargin{1};
            % assume arg is a numeric array
            if isempty(arg)
                % Handle the case ClassName([])
                % Can't write this, because of subclasses:
                % obj = MyBaseClass.empty(size(arg));
                obj = eval([class(obj) '.empty(size(arg))']);
                return
            end
            % arg is an array
            % Make obj an array of the correct size by allocating the nth
            % element. Need to recurse for the no-argument case of the
            % relevant class constructor, which might not be this one.
            % Can't write this, because of subclasses
            % obj(numel(arg)) = MyBaseClass;
            obj(numel(arg)) = eval(class(obj));
            % Rest of the constructor - obviously in this toy example,
            % could be simplified.
            wh = ~isnan(arg);
            for i = find(wh(:))'
                obj(i).data = arg(i);
            end
            % And reshape to the size of the original
            obj = reshape(obj, size(arg));
        end
    end
end
% end of MyBaseClass.m
% MySubClass.m
classdef MySubClass < MyBaseClass
    methods
        function obj = MySubClass(varargin)
            obj = obj@MyBaseClass(varargin{:});
        end
    end
end
% end of MySubClass.m
Your solution is functional and embraces some loose MATLAB typing to achieve what you want. However, getting clean and structured OOP is probably going to require losing some of the functionality you want. At the same time, the best option for avoiding code duplication is templated/generic container classes but these are not supported in MATLAB at this time.
Your code mirrors the MATLAB documentation on Building Arrays in the Constructor and relies on MATLAB being a loosely typed language that enables you to convert an object into an array of objects without problem. Exploiting this powerful and flexible feature of MATLAB does introduce some organizational issues and may undermine your efforts at clean, object-oriented code.
Problems begin because the MyBaseClass constructor is not a true constructor for MyBaseClass.
Wikipedia says:
"In object-oriented programming, a constructor (sometimes shortened to ctor) in a class is a special type of subroutine called at the creation of an object. It prepares the new object for use, often accepting parameters which the constructor uses to set any member variables required when the object is first created. It is called a constructor because it constructs the values of data members of the class."
Notice that the MyBaseClass constructor is not constructing values for the object members. Instead, it is a function that sets the object equal to an array of objects of type MyBaseClass and tries to set their data members to some value. You can see where obj is destroyed and set to an array here:
obj(numel(arg)) = eval(class(obj));
This behavior is especially unhelpful when you derive MySubClass from MyBaseClass, because MyBaseClass isn't supposed to assign a new object to the variable obj. MySubClass has already created the new object in obj and is simply asking MyBaseClass to construct the portion of the existing object in obj that MyBaseClass knows the details for.
Some clarity might be gained by noting that when you enter the constructor for both MyBaseClass and MySubClass, the variable obj is already populated with a perfectly good instance of the class. Good OOP practice would have you keep this original instance, use it in the base class constructor, and only act to populate its members in the constructor, not overwrite the object entirely with something new.
My conclusion would be to not assign obj to be an array inside of MyBaseClass. Instead, I would recommend creating a class MyBaseClassArray that creates an array of MyBaseClass objects.
Unfortunately, you would also need to create a duplicate class MySubClassArray that creates an array of MySubClass objects. Languages like C++ and Java get around this code duplication issue with templates and generics, respectively, but MATLAB does not currently support any form of templates (http://www.mathworks.com/help/techdoc/matlab_oop/brqzfut-1.html). Without templates there is no good way to avoid code duplication.
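To illustrate what templates buy you here, the following is a minimal C++ sketch, not a MATLAB solution. The C++ versions of MyBaseClass and MySubClass and the makeArray helper are hypothetical stand-ins invented for this example; the assumption is only that each element type can be constructed from a single double.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical C++ stand-ins for the MATLAB classes in the question.
struct MyBaseClass {
    double data = std::nan("");              // default is the "NaN-like" object
    MyBaseClass() = default;
    explicit MyBaseClass(double v) : data(v) {}
};

struct MySubClass : MyBaseClass {
    using MyBaseClass::MyBaseClass;          // inherit the base constructors
};

// One generic factory serves every element type T that can be constructed
// from a double, so no per-class array class has to be written.
template <typename T>
std::vector<T> makeArray(const std::vector<double>& values) {
    std::vector<T> result;
    result.reserve(values.size());
    for (double v : values) {
        result.emplace_back(v);              // calls T(double)
    }
    return result;
}

// Small self-check of the sketch.
void demoMakeArray() {
    std::vector<MySubClass> subs = makeArray<MySubClass>({1.0, 2.0, 3.0, 4.0});
    assert(subs.size() == 4);
    assert(subs[3].data == 4.0);
    assert(makeArray<MyBaseClass>({}).empty());  // the "ClassName([])" case
}
```

The element classes stay ordinary single-object classes, and the array-building logic lives in one place outside their constructors, which is the separation recommended above.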
You could try to avoid some duplication by creating a generic CreateClassArray function that takes the string name of a class to create and the constructor arguments to use for each object, but now we are coming back to code that looks like your original. The only difference is that now we have a clear division between the array class and the individual objects. The truth is that although MATLAB does not support templates, its flexible classes and typing system allow you to use eval(), as you have, to change code and overwrite obj at will and create code that acts generically across classes. The cost? Readability, speed, and the uncomfortable feeling you got when you saw your base class constructing the subclass.
In short, you used MATLAB's flexibility to overwrite the obj in the constructor with an array to avoid creating a separate container class for MyBaseClass. You then used eval to make up for not having a template feature in MATLAB that would allow you to reuse your array creation code across all types. In the end, your solution is functional and reduces code duplication, but it does require some unnatural behavior from your classes. It's just a trade you have to make.

Object slicing: is it an advantage?

Object slicing is when an object loses some of its attributes or functions because a child class instance is assigned to a base class variable.
Something like:
class A {
}
class B extends A {
}
class SomeClass {
    void demo() {
        A a = new A();
        B b = new B();
        // Somewhere it might happen like this
        a = b; // (Object slicing happens)
    }
}
Is object slicing beneficial in any way?
If yes, can anyone please tell me how object slicing can be helpful in development, and where?
In C++, you should think of an object slice as a conversion from the derived type to the base type[*]. A brand new object is created, which is "inspired by a true story".
Sometimes this is something that you would want to do, but the result is not in any sense the same object as the original. Object slicing goes wrong when people aren't paying attention, and think it is the same object or a copy of it.
It's normally not beneficial. In fact it's normally done accidentally when someone passes by value when they meant to pass by reference.
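As a rough illustration of the accidental case, here is a small sketch; Animal, Dog and both functions are hypothetical names made up for this example. The only difference between the two functions is the parameter type, and that is enough to change which speak() runs.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string speak() const { return "..."; }
};

struct Dog : Animal {
    std::string speak() const override { return "Woof"; }
};

// Pass by value: a Dog argument is copied into a plain Animal (sliced).
std::string speakByValue(Animal a) { return a.speak(); }

// Pass by reference: the parameter refers to the original Dog.
std::string speakByReference(const Animal& a) { return a.speak(); }

// Small self-check of the sketch.
void demoAccidentalSlice() {
    Dog d;
    assert(speakByValue(d) == "...");       // the Dog part was sliced away
    assert(speakByReference(d) == "Woof");  // virtual dispatch still reaches Dog
}
```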
It's quite hard to come up with an example of when slicing is definitively the right thing to do, because it's quite hard (especially in C++) to come up with an example where a non-abstract base class is definitively the right thing to do. This is an important design point, and not one to pass over lightly - if you find yourself slicing an object, either deliberately or accidentally, quite likely your object hierarchy is wrong to start with. Either the base class shouldn't be used as a base class, or else it should have at least one pure virtual function and hence not be sliceable or passable by value.
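A minimal sketch of the "at least one pure virtual function" option follows, using hypothetical Shape and Circle types. Because Shape is abstract, no Shape object can exist on its own, so there is nothing to slice into.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;   // pure virtual: Shape is abstract
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// An abstract class cannot be instantiated, so "Shape s = circle;" and
// passing a Shape by value are rejected by the compiler.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");
static_assert(!std::is_constructible<Shape, const Circle&>::value,
              "a Circle cannot be converted to a Shape object");

// References (and pointers) to the abstract base still work as usual.
std::string describe(const Shape& s) { return s.name(); }

// Small self-check of the sketch.
void demoAbstractBase() {
    Circle c;
    assert(describe(c) == "circle");
}
```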
So, any example I gave where an object is converted to an object of its base class, would rightly provoke the objection, "hang on a minute, what are you doing inheriting from a concrete class in the first place?". If slicing is accidental then it's probably a bug, and if it's deliberate then it's probably "code smell".
But the answer might be "yes, OK, this shouldn't really be how things are structured, but given that they are structured that way, I need to convert from the derived class to the base class, and that by definition is a slice". In that spirit, here's an example:
#include <string>
using std::string;
struct Soldier {
    string name;
    string rank;
    string serialNumber;
};
struct ActiveSoldier : Soldier {
    string currentUnit;
    ActiveSoldier *commandingOfficer; // the design errors multiply!
    int yearsService;
};
template <typename InputIterator>
void takePrisoners(InputIterator first, InputIterator last) {
    while (first != last) {
        Soldier s(*first); // converts (slices) anything convertible to Soldier
        // do some stuff with name, rank and serialNumber
        ++first;
    }
}
Now, the requirement of the takePrisoners function template is that its parameter be an iterator for a type convertible to Soldier. It doesn't have to be a derived class, and we don't directly access the members "name", etc, so takePrisoners has tried to offer the easiest possible interface to implement given the restrictions (a) should work with Soldier, and (b) should be possible to write other types that it also works with.
ActiveSoldier is one such other type. For reasons best known only to the author of that class, it has opted to publicly inherit from Soldier rather than providing an overloaded conversion operator. We can argue whether that's ever a good idea, but let's suppose we're stuck with it. Because it's a derived class, it is convertible to Soldier. That conversion is called a slice. Hence, if we call takePrisoners passing in the begin() and end() iterators for a vector of ActiveSoldiers, then we will slice them.
You could probably come up with similar examples for an OutputIterator, where the recipient only cares about the base class part of the objects being delivered, and so allows them to be sliced as they're written to the iterator.
The reason it's "code smell" is that we should consider (a) rewriting ActiveSoldier, and (b) changing Soldier so that it can be accessed using functions instead of member access, so that we can abstract that set of functions as an interface that other types can implement independently, so that takePrisoners doesn't have to convert to Soldier. Either of those would remove the need for a slice, and would have potential benefits for the ease with which our code can be extended in future.
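Option (b) could look roughly like the sketch below. This is one possible shape under stated assumptions, not the design the original classes use: the accessor names name(), rank() and serialNumber(), the Pilot type, and the returned record strings are all invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical accessor-based Soldier: state is reached through functions.
class Soldier {
public:
    Soldier(std::string name, std::string rank, std::string serial)
        : name_(std::move(name)), rank_(std::move(rank)), serial_(std::move(serial)) {}
    const std::string& name() const { return name_; }
    const std::string& rank() const { return rank_; }
    const std::string& serialNumber() const { return serial_; }
private:
    std::string name_, rank_, serial_;
};

// An unrelated hypothetical type that offers the same three functions
// without inheriting from Soldier.
struct Pilot {
    std::string callSign;
    std::string name() const { return callSign; }
    std::string rank() const { return "Flight Lieutenant"; }
    std::string serialNumber() const { return "unknown"; }
};

// Works with any element type providing name(), rank() and serialNumber().
// Elements are bound by reference, so nothing is converted or sliced.
template <typename InputIterator>
std::vector<std::string> takePrisoners(InputIterator first, InputIterator last) {
    std::vector<std::string> records;
    for (; first != last; ++first) {
        const auto& prisoner = *first;
        records.push_back(prisoner.rank() + " " + prisoner.name() +
                          " (" + prisoner.serialNumber() + ")");
    }
    return records;
}
```

With this shape, takePrisoners accepts a range of Soldier, a range of Pilot, or a range of any later type with the same three functions, and none of them needs to be convertible to Soldier.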
[*] because it is one. The last two lines below are doing the same thing:
struct A {
    int value;
    A(int v) : value(v) {}
};
struct B : A {
    int quantity;
    B(int v, int q) : A(v), quantity(q) {}
};
int main() {
    int i = 12; // an integer
    B b(12, 3); // an instance of B
    A a1 = b; // (1) convert B to A, also known as "slicing"
    A a2 = i; // (2) convert int to A, not known as "slicing"
}
The only difference is that (1) calls A's copy constructor (that the compiler provides even though the code doesn't), whereas (2) calls A's int constructor.
As someone else said, Java doesn't do object slicing. If the code you provide were turned into Java, then no kind of object slicing would happen. Java variables are references, not objects, so the postcondition of a = b is just that the variable "a" refers to the same object as the variable "b" - changes via one reference can be seen via the other reference, and so on. They just refer to it by a different type, which is part of polymorphism. A typical analogy for this is that I might think of a person as "my brother"[**], and someone else might think of the same person as "my vicar". Same object, different interface.
You can get the Java-like effect in C++ using pointers or references:
B b(24,7);
A *a3 = &b; // No slicing - a3 is a pointer to the object b
A &a4 = b; // No slicing - a4 is a reference to (pseudonym for) the object b
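The difference between the sliced copy and the reference can be checked directly. The sketch below reuses the A and B definitions from the footnote; compareSliceAndReference is a hypothetical helper written only for this demonstration.

```cpp
#include <cassert>

struct A {
    int value;
    A(int v) : value(v) {}
};

struct B : A {
    int quantity;
    B(int v, int q) : A(v), quantity(q) {}
};

// Contrasts a sliced copy with a reference to the same object.
// Returns true if both behave as described in the text.
bool compareSliceAndReference() {
    B b(24, 7);
    A copy = b;   // slicing: a new, separate A object
    A& ref = b;   // no slicing: another name for the A part of b

    copy.value = 1;                          // changes only the copy
    bool copyIsSeparate = (b.value == 24);

    ref.value = 2;                           // changes b itself
    bool refIsAlias = (b.value == 2) && (&ref == static_cast<A*>(&b));

    return copyIsSeparate && refIsAlias;
}
```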
[**] In point of fact, my brother is not a vicar.