Smalltalk: How primitives are implemented? - oop

I know that everything is an object and you send messages to objects in Smalltalk to do almost everything.
Now how can we implement an object (memory representation and basic operations) to represent a primitive data type? For example how + for integers is implemented?
I looked at the source code for Smalltalk and found this in Smallint.st. Can someone explain this piece of code?
+ arg [
"Sum the receiver and arg and answer another Number"
<category: 'built ins'>
<primitive: VMpr_SmallInteger_plus>
^self generality == arg generality
ifFalse: [self retrySumCoercing: arg]
ifTrue: [(LargeInteger fromInteger: self) + (LargeInteger fromInteger: arg)]
]
Here is the link of above code: https://github.com/gnu-smalltalk/smalltalk/blob/62dab58e5231909c7286f1e61e26c9f503b2b3df/kernel/SmallInt.st

Conceptually speaking primitive methods are pieces of behavior (routines) implemented by the Virtual Machine (VM), not by regular Smalltalk code.
When the Smalltalk compiler finds the statement <primitive: ...> it interprets this as an special type of method whose argument (in your case VMpr_SmallInteger_plus) indicates the integer index of the target routine within the VM.
In this sense a primitive is a global routine not bound to the MethodDictionary of any particular class. The primitive logic is intended for a receiver and arguments of certain classes and that's why it must check that the receiver and the arguments (if any) conform its requirements. If not, the primitive fails and in that case the control flows to the Smalltalk code that follows the <primitive: ...> statement. Otherwise the primitive succeeds and the Smalltalk code below is not executed. Note also that the compiler will not allow for any Smalltalk code other than temporary declaration occurring above the <primitive:...> sentence.
In your example, if the argument arg is not of the expected class (presumably a SmallInteger) the routine gives up trying to sum it to the receiver and delegates the resolution of the operation to the Smalltalk code.
If the argument happens to be a SmallInteger, the primitive will compute the result (using the routine held in the VM) and answer with it.
I haven't seen the code of this primitive but it could also happen that the primitive fails if the result of the sum does not fit in a SmallInteger, in which case both the receiver and the argument would be cast to LargeIntegers and the addition would take place in the #+ method of the appropriate class (LargePositiveInteger or LargeNegativeInteger).
The other branch of the Smalltalk code allows for the implementation of a polymorphic sum between a SmallInteger and any other type of object. For instance this part of the Smalltalk code would take place if you evaluate 3 + 4.0 because in this case the argument is a Float. Something similar happens if you evaluate 3 + (4 / 3), etc.

Related

Fortran Functions with a pointer result in a normal assignment

After some discussion on the question found here Correct execution of Final routine in Fortran
I thought it will be useful to know when a function with a pointer result is appropriate to use with a normal or a pointer assignment. For example, given this simple function
function pointer_result(this)
implicit none
type(test_type),intent(in) pointer :: this
type(test_type), pointer :: pointer_result
allocate(pointer_result)
end function
I would normally do test=>pointer_result(test), where test has been declared with the pointer attribute. While the normal assignment test=pointer_result(test) is legal it means something different.
What does the normal assignment imply compared to the pointer assignment?
When does it make sense to use one or the other assignment?
A normal assignment
test = pointer_result()
means that the value of the current target of test will be overwritten by the value pointed to by the resulting pointer. If test points to some invalid address (is undefined or null) the program will crash or produce undefined results. The anonymous target allocated by the function will have no pointer to it any more and the memory will be leaked.
There is hardly any legitimate use for this, but it is likely to happen when one makes a typo and writes = instead of =>. It is a very easy one to make and several style guides recommend to never use pointer functions.

The use of ">>" in Pharo/Smalltalk

I am implementing futures in Pharo. I came across this website http://onsmalltalk.com/smalltalk-concurrency-playing-with-futures. I am following this example and trying to replicate it on Pharo. However, I get to this point the last step and I have no idea what ">>" means: This symbol is not also included as part of Smalltalk syntax in http://rigaux.org/language-study/syntax-across-languages-per-language/Smalltalk.html.
BlockClosure>>future
^ SFuture new value: self fixTemps
I can see future is not a variable or a method implemented by BlockClosure. What should I do with this part of the code to make the promises/futures work as indicated at http://onsmalltalk.com/smalltalk-concurrency-playing-with-futures? I cannot add it on the Playground or as a method to my Promise class as it is, or am I missing something?
After adding the future method to BlockClosure, this is the code I try on the PlayGround.
value1 := [200 timesRepeat:[Transcript show: '.']. 6] future.
value2 := [200 timesRepeat:[Transcript show: '+']. 6] future.
Transcript show: 'other work'.
Transcript show: (value1 + value2).
Date today
The transcript displays the below error instead of the expected value of 12.
UndefinedObject>>DoIt (value1 is Undeclared)
UndefinedObject>>DoIt (value2 is Undeclared)
For some reason that it would be nice to learn, there is a traditional notation in Smalltalk to refer to the method with selector, say, m in class C which is C>>m. For example, BlockClosure>>future denotes the method of BlockClosure with selector #future. Interestingly enough, the expression is not an evaluable Smalltalk one, meaning, it is not a Smalltalk expression. It is just a succinct way of saying, "what comes below is the source code of method m in class C". Just that.
In Smalltalk, however, methods are objects too. In fact, they are instances of CompiledMethod. This means that they can be retrieved by sending a message. In this case, the message is methodAt:. The receiver of the message is the class which implements the method and the argument is the selector (respectively, C and #m, or BlockClosure and #future in your example).
Most dialects, therefore, implement a synonym of methodAt: named >>. This is easily done in this way:
>> aSymbol
^self methodAt: aSymbol
This puts the Smalltalk syntax much closer to the traditional notation because now BlockClosure>>future looks like the expression that would send the message >> to BlockClosure with argument future. However, future is not a Symbol unless we prepend it with #, namely #future. So, if we prefix the selector with the # sign, we get the literal Symbol #future, which is a valid Smalltalk object. Now the expression
BlockClosure >> #future
becomes a message, and its result after evaluating it, the CompiledMethod with selector #future in the class BlockClosure.
In sum, BlockClosure>>future is a notation, not a valid Smalltalk expression. However, by tweaking it to be BlockClosure >> #future, it becomes an evaluable expression of the language that returns the method the notation referred to.

Does fortran permit inline operations on the return value of a function?

I am trying to design a data structure composed of objects which contain, as instance variables, objects of another type.
I'd like to be able to do something like this:
CALL type1_object%get_nested_type2_object()%some_type2_method()
Notice I am trying to immediately use the getter, get_nested_type2_object() and then act on its return value to call a method in the returned type2 object.
As it stands, gfortran v4.8.2 does not accept this syntax and thinks get_nested_type2_object() is an array reference, not a function call. Is there any syntax that I can use to clarify this or does the standard not allow this?
To give a more concrete example, here is some code illustrating this:
furniture_class.F95:
MODULE furniture_class
IMPLICIT NONE
TYPE furniture_object
INTEGER :: length
INTEGER :: width
INTEGER :: height
CONTAINS
PROCEDURE :: get_length
END TYPE furniture_object
CONTAINS
FUNCTION get_length(self)
IMPLICIT NONE
CLASS(furniture_object) :: self
INTEGER :: get_length
get_length = self%length
END FUNCTION
END MODULE furniture_class
Now a room object may contain one or more furniture objects.
room_class.F95:
MODULE room_class
USE furniture_class
IMPLICIT NONE
TYPE :: room_object
CLASS(furniture_object), POINTER :: furniture
CONTAINS
PROCEDURE :: get_furniture
END TYPE room_object
CONTAINS
FUNCTION get_furniture(self)
USE furniture_class
IMPLICIT NONE
CLASS(room_object) :: self
CLASS(furniture_object), POINTER :: get_furniture
get_furniture => self%furniture
END FUNCTION get_furniture
END MODULE room_class
Finally, here is a program where I attempt to access the furniture object inside the room (but the compiler won't let me):
room_test.F95
PROGRAM room_test
USE room_class
USE furniture_class
IMPLICIT NONE
CLASS(room_object), POINTER :: room_pointer
CLASS(furniture_object), POINTER :: furniture_pointer
ALLOCATE(room_pointer)
ALLOCATE(furniture_pointer)
room_pointer%furniture => furniture_pointer
furniture_pointer%length = 10
! WRITE(*,*) 'The length of furniture in the room is', room_pointer%furniture%get_length() - This works.
WRITE(*,*) 'The length of furniture in the room is', room_pointer%get_furniture()%get_length() ! This line fails to compile
END PROGRAM room_test
I can of course directly access the furniture object if I don't use a getter to return the nested object, but this ruins the encapsulation and can become problematic in production code that is much more complex than what I show here.
Is what I am trying to do not supported by the Fortran standard or do I just need a more compliant compiler?
What you want to do is not supported by the syntax of the standard language.
(Variations on the general syntax (not necessarily this specific case) that might apply for "dereferencing" a function result could be ambiguous - consider things like substrings, whole array references, array sections, etc.)
Typically you [pointer] assign the result of the first function call to a [pointer] variable of the appropriate type, and then apply the binding for the second function to that variable.
Alternatively, if you want to apply an operation to a primary in an expression (such as a function reference) to give another value, then you could use an operator.
Some, perhaps rather subjective, comments:
Your room object doesn't really contain a furniture object - it holds a reference to a furniture object. Perhaps you use that reference in a manner that implies the parent object "containing" it, but that's not what the component definition naturally suggests.
(Use of a pointer component suggests that you want the room to point at (i.e. reference) some furniture. In terms of the language, the object referenced by a pointer component is not usually considered part of the value of the parent object of the component - consider how intrinsic assignment works, restrictions around modifying INTENT(IN) arguments, etc.
A non-pointer component suggests to me that the furniture is part of the room. In a Fortran language sense an object that is a non-pointer component it is always part of the value of the parent object of the component.
To highlight - pointer components in different rooms could potentially point at the same piece of furniture; a non-pointer furniture object is only ever directly part of one room.)
You need to be very careful using functions with pointer results. In the general case, is it:
p = some_ptr_function(args)
(and perhaps I accidentally leak memory) or
p => some_ptr_function(args)
Only one little character difference, both valid syntax, quite different semantics. If the second case is what is intended, then why not just pass the pointer back via a subroutine argument? An inconsequential difference in typing and it is much safer.
A general reminder applicable to some of the above - in the context of an expression, evaluation of a function reference yields a value. Values are not variables and hence you are not permitted to vary [modify] them.

Smalltalk type system

I'm very new to Smalltalk and would like to understand a few things and confirm others (in order to see if I'm getting the idea or not):
1) In Smalltalk variables are untyped?
2) The only "type check" in Smalltalk occurs when a message is sent and the inheritance hierarchy is climbed up in order to bind the message to a method? And in case the class Object is reached it throws a run time error because the method doesn't exist?
3) There are no coercions because there are no types...?
4) Is it possible to overload methods or operators?
5) Is there some kind of Genericity? I mean, parametric polymorphism?
6) Is there some kind of compatibility/equivalence check for arguments when a message is sent? or when a variable is assigned?
Most questions probably have very short answers (If I'm in the right direction).
1) Variables have no declared types. They are all implicitly references to objects. The objects know what kind they are.
2) There is no implicit type check but you can do explicit checks if you like. Check out the methods isMemberOf: and isKindOf:.
3) Correct. There is no concept of coercion.
4) Operators are just messages. Any object can implement any method so, yes it has overloading.
5) Smalltalk is the ultimate in generic. Variables and collections can contain any object. Languages that have "generics" make the variables and collections more specific. Go figure. Polymorphism is based on the class of the receiver. To do multiple polymorphism use double dispatching.
6) There are no implicit checks. You can add your own explicit checks as needed.
Answer 3) you can change the type of an object using messages like #changeClassTo:, #changeClassToThatOf:, and #adoptInstance:. There are, of course caveats on what can be converted to what. See the method comments.
For the sake of completion, an example from the Squeak image:
Integer>>+ aNumber
"Refer to the comment in Number + "
aNumber isInteger ifTrue:
[self negative == aNumber negative
ifTrue: [^ (self digitAdd: aNumber) normalize]
ifFalse: [^ self digitSubtract: aNumber]].
aNumber isFraction ifTrue:
[^Fraction numerator: self * aNumber denominator + aNumber numerator denominator: aNumber denominator].
^ aNumber adaptToInteger: self andSend: #+
This shows:
that classes work as some kind of 'practical typing', effectively differentiating things that can be summed (see below).
a case of explicitly checking for Type/Class. Of course, if the parameter is not an Integer or Fraction, and does_not_understand #adaptToInteger:andSend:, it will raise a DNU (doesNotUnderstand see below).
some kind of 'coercion' going on, but not implicitly. The last line:
^aNumber adaptToInteger: self andSend: #+
asks the argument to the method to do the appropriate thing to add himself to an integer. This can involve asking the original receiver to return, say, a version of himself as a Float.
(doesn't really show, but insinuates) that #+ is defined in more than one class. Operators are defined as regular methods, they're called binary methods. The difference is some Smalltalk dialects limit them up to two chars length, and their precedence.
an example of dispatching on the type of the receiver and the argument. It uses double dispatch (see 3).
an explicit check where it's needed. Object can be seen as having types (classes), but variables are not. They just hold references to any object, as Smalltalk is dynamically typed.
This also shows that much of Smalltalk is implemented in Smalltalk itself, so the image is always a good place to look for this kind of things.
About DNU errors, they are actually a bit more involved:
When the search reaches the top class in the inheritance chain (presumably ProtoObject) and the method is not found, a #doesNotUndertand: message is sent to the object, with the message not understood as parameter) in case it wants to handle the miss. If #doesNotUnderstand: is not implemented, the lookup once again climbs up to Object, where its implementation is to throw an error.
Note: I'm not sure about the equivalence between Classes and Types, so I tried to be careful about that point.

Smalltalk exotic features

Smalltalk syntax (and features) can be found pretty exotic (and even disturbing) when you come from a more C-like syntax world. I found myself losing time with some
I would be interested in learning knowing what you found really exotic compared to more classic/mainstream languages and that you think helps to understand the language.
For example, evaluation with logic operators :
(object1 = object2) & (object3 = object4) : this will evaluate the whole expression, even if the left part is false, the rest will be evaluated.
(object1 = object2) and: [object3 = object4] : this will evaluate the left part, and only will evaluate the right part if the first is true.
Everything is an object, and everything above the VM's available for inspection and modification. (Primitives are part of the VM, conceptually at least.) Even your call stack's available (thisContext) - Seaside implemented continuations back in the day by simply swizzling down the call stack into a stream, and restoring it (returning to the continuation) by simply reading out activation frames from that stream!
You can construct a selector from a string and turn it into a Symbol and send it as a message: self perform: 'this', 'That' will do the same thing as self thisThat. (But don't do this, for the same reasons you should avoid eval in both Lisps and PHP: very hard to debug!)
Message passing: it's not method invocation!
#become: is probably a bit of a shock to anyone who hasn't seen it before. (tl;dr a wholesale swapping of two object pointers - all references to B now point to A, and all references to A now point to B)
Primitves
someMethod
<primitive 14122 wtf>
"fail and execute the following"
[self] inlineCopyInject: [:t1 | self].
My first wrestling session with Smalltalk was the metaclass implementation.
Consider this:
What is the class of 'This is a string'? Well, something like String.
What is the class of String? String class. Note: this is a class, but it has no name, it just prints itself as 'String class'.
What is the class of String class? Metaclass. Note: this is a named class.
What is the class of Metaclass? As you might expect (or not) this is Metaclass class. Of which, again as you might expect, the class is Metaclass again.
This is the first circularity. Another one which I found rather esoteric at first (of course, now I eat metaclasses for breakfast) is the next one:
What is the superclass of String? Object (eventually, different implementations of Smalltalk have different class hierarchies of these basic classes).
What is the superclass of Object? nil. Now this is an interesting answer in Smalltalk, because it actually is an object! nil class answers UndefinedObject. Of which the superclass is ... Object.
Navigating through the superclass and instance of relations was a real rollercoster ride for me in those days...
How about selective breakpointing (which I actually use at times):
foo
thisContext sender selector == #bar ifTrue:[ self halt ].
...
will debug itself, but only if called from bar. Useful, if foo is called for from zillion other places and a regular breakpoint hits too often.
I've always been fond of the Smalltalk quine:
quine
^thisContext method getSource
(Pharo version.)