Understanding GNU Smalltalk Closure - smalltalk

The following piece of code gives the error: did not understand '#generality'
pqueue := SortedCollection new.
freqtable keysAndValuesDo: [:key :value |
    (value notNil and: [value > 0]) ifTrue: [
        |newvalue|
        newvalue := Leaf new: key count: value.
        pqueue add: newvalue.
    ]
].
[pqueue size > 1] whileTrue:[
    |first second new_internal newcount|
    first := pqueue removeFirst.
    second := pqueue removeFirst.
    first_count := first count.
    second_count := second count.
    newcount := first_count + second_count.
    new_internal := Tree new: nl count: newcount left: first right: second.
    pqueue add: new_internal.
].
The inconsistency is in the line pqueue add: new_internal. When I remove this line, the program compiles. I think the problem is related to the iteration block [pqueue size > 1] whileTrue: and pqueue add: new_internal.
Note: This is the algorithm to build the decoding tree based on Huffman coding.
error-message expanded
Object: $<10> error: did not understand #generality
MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
SmallInteger>><= (SmallInt.st:215)
Leaf>><= (hzip.st:30)
optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
SortedCollection>>insertionIndexFor:upTo: (SortCollect.st:702)
[] in SortedCollection>>merge (SortCollect.st:531)
SortedCollection(SequenceableCollection)>>reverseDo: (SeqCollect.st:958)
SortedCollection>>merge (SortCollect.st:528)
SortedCollection>>beConsistent (SortCollect.st:204)
SortedCollection(OrderedCollection)>>removeFirst (OrderColl.st:295)
optimized [] in UndefinedObject>>executeStatements (hzip.st:156)
BlockClosure>>whileTrue: (BlkClosure.st:328)
UndefinedObject>>executeStatements (hzip.st:154)
Object: $<10> error: did not understand #generality
MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
SmallInteger>><= (SmallInt.st:215)
Leaf>><= (hzip.st:30)
optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
SortedCollection>>insertionIndexFor:upTo: (SortCollect.st:702)
[] in SortedCollection>>merge (SortCollect.st:531)
SortedCollection(SequenceableCollection)>>reverseDo: (SeqCollect.st:958)
SortedCollection>>merge (SortCollect.st:528)
SortedCollection>>beConsistent (SortCollect.st:204)
SortedCollection(OrderedCollection)>>do: (OrderColl.st:64)
UndefinedObject>>executeStatements (hzip.st:164)

One lesson we can take from this question is to acquire the habit of reading the stack trace and trying to make sense of it. Let's focus on the last few messages:
1. Object: $<10> error: did not understand #generality
2. MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
3. Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
4. SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
5. SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
6. SmallInteger>><= (SmallInt.st:215)
7. Leaf>><= (hzip.st:30)
8. optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
Each of these lines represents the activation of a method. Every line represents a message, and the sequence of messages goes upwards (as it happens in any stack). The full detail of every activation can be seen in the debugger. Here, however, we are only presented with the class >> #selector pair. There are several interesting facts we can identify from this summarized information:
In line 1 we get the actual error. In this case we got a MessageNotUnderstood exception. The receiver of the message was the Character $<10>, i.e., the linefeed character.
Lines 2 and 3 confirm that the not understood message was #generality.
Lines 4, 5 and 6 show the progression of messages that ended up sending #generality to the wrong object (linefeed). While 4 and 5 might look obscure to the inexperienced Smalltalker, line 6 has the key information: some SmallInteger received the <= message. This message failed because the argument wasn't an appropriate one. From the information we already have, we know that the argument was the linefeed character.
Line 7 shows that SmallInteger >> #<= came from the way the same selector #<= is implemented in Leaf. It tells us that a Leaf delegates #<= to some Integer known to it.
Line 8 says why we are dealing with the comparison selector #<=. The reason is that we are sorting some collection.
So, we are trying to sort a collection of Leaf objects which rely on some integers for their comparison and somehow one of those "integers" wasn't a Number but the Character linefeed.
If we take a look at the Smalltalk code with this information in mind we see:
The SortedCollection is pqueue and the Leaf objects are the items being added to it.
The invariant property of a SortedCollection is that it always has its elements ordered by a given criterion. Consequently, every time we add: an element to it, the element will be inserted in the correct position. Hence the comparison message #<=.
Now let's look for #add: in the code. Besides the one above, there is another below:
new_internal := Tree new: nl count: newcount left: first right: second.
pqueue add: new_internal.
This one is interesting because it is where the error happens. Note, however, that we are not adding a Leaf here but a Tree. But wait, it might be that a Tree and a Leaf belong to the same hierarchy. In fact, both Tree and Leaf represent nodes in an acyclic graph. Moreover, the code confirms this idea when it reads:
Leaf new: key count: value.
...
Tree new: nl count: newcount left: first right: second.
See? Both Leaf and Tree have some key (the argument of new:) and some count. In addition, Trees have left and right branches, which Leafs do not (of course!)
So, in principle, it would be ok to add instances of Tree to our pqueue collection. This cannot be what causes the error.
Now, if we look more closely at the way the Tree is created, we can see a suspicious argument: nl. This is interesting for two reasons: (i) the variable nl is not defined in the part of the code we were given and (ii) the variable nl is the key that will be used by the Tree to respond to the #<= message. Therefore, nl must be the linefeed character $<10>, which makes a lot of sense because nl is an abbreviation of newline and in the Linux world newlines are linefeeds.
Conclusion: The problem seems to be caused by the wrong argument nl used for the Tree's key.
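The diagnosis can be reproduced outside Smalltalk. What follows is a minimal Python sketch, not the asker's code: Node, its fields and build_tree are hypothetical names, and heapq stands in for SortedCollection. Nodes are ordered by count, so a node that carries a newline character where a number is expected fails on its first comparison, much as Leaf>><= does in the trace.

```python
import heapq
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """Hypothetical stand-in for Leaf/Tree: a key, a count, optional children."""
    key: Any
    count: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def __lt__(self, other: "Node") -> bool:
        # The queue orders nodes by count, like Leaf>><= delegating to an integer.
        return self.count < other.count


def build_tree(freqs: dict) -> Node:
    """Repeatedly merge the two least frequent nodes (Huffman construction)."""
    queue = [Node(k, c) for k, c in freqs.items() if c is not None and c > 0]
    heapq.heapify(queue)
    while len(queue) > 1:
        first = heapq.heappop(queue)
        second = heapq.heappop(queue)
        # Internal nodes get a placeholder key and the summed count.
        merged = Node(None, first.count + second.count, first, second)
        heapq.heappush(queue, merged)
    return queue[0]


root = build_tree({"a": 5, "b": 2, "c": 1})
print(root.count)  # 8

# Putting a character where the number belongs breaks the next comparison.
try:
    heapq.heappush([Node("a", 5)], Node("x", "\n"))
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```

The failure surfaces inside the queue's ordering logic rather than at the line that built the bad node, which is why reading the stack trace from the comparison backwards is the quickest route to the cause.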

Related

CBMC Toy Example

I'm new to CBMC and experimenting with it. In this link here, there is a toy example for checking the function binsearch with CBMC. I decided to run the following command that they provided, just changing up the number of times the loop was unwound:
cbmc binsearch.c --function binsearch --unwind 4 --bounds-check --unwinding-assertions
It returned the following:
** Results:
[binsearch.unwind.0] unwinding assertion loop 0: FAILURE
prog.c function binsearch
[binsearch.array_bounds.1] line 7 array `a' lower bound in a[(signed long int)middle]: SUCCESS
[binsearch.array_bounds.2] line 7 array `a' upper bound in a[(signed long int)middle]: SUCCESS
[binsearch.array_bounds.3] line 9 array `a' lower bound in a[(signed long int)middle]: SUCCESS
[binsearch.array_bounds.4] line 9 array `a' upper bound in a[(signed long int)middle]: SUCCESS
Is the fact that the unwinding assertion failed because there weren't enough iterations a bad thing? From my point-of-view, it seems like the example is bug-free because the code didn't access portions of memory that it's not supposed to, but I'm not sure based on that one unwinding assertions failure. Anyone have any ideas about the safety? Does that failure matter?
Based on the --unwinding-assertions option, which checks the following:
Checks whether --unwind is large enough to cover all program paths. If the argument is too small, CBMC will detect that not enough unwinding is done and reports that an unwinding assertion has failed.
I'd say that it alerts you to the possibility that there aren't enough loop iterations to make sure that the function won't access the array outside of its bounds. This means that while the function didn't violate any properties with a bound of 4, we need to check all paths before we can say for certain that it is safe.
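To see why a fixed bound can fall short, it helps to count how many iterations a binary search can need. The Python sketch below is an illustration only: it does not model CBMC, and the array size of 16 and the half-open search interval are assumptions, since the original binsearch.c is not reproduced here.

```python
def worst_case_iterations(size: int) -> int:
    """Largest number of loop iterations a half-open binary search can take."""
    if size <= 0:
        return 0  # the condition low < high is false, so the body never runs
    middle = size // 2
    left = middle                # search continues in [low, middle)
    right = size - middle - 1    # search continues in [middle + 1, high)
    return 1 + max(worst_case_iterations(left), worst_case_iterations(right))


for n in (1, 8, 16, 1000):
    print(n, worst_case_iterations(n))
```

Under these assumptions a 16-element array can need 5 iterations, so a bound that allows fewer than that leaves some paths unexplored, which is the situation a failed unwinding assertion reports.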

How to use hyperoperators with Scalars that aren't really scalar?

I want to make a hash of sets. Well, SetHashes, since they need to be mutable.
In fact, I would like to initialize my Hash with multiple identical copies of the same SetHash.
I have an array containing the keys for the new hash: @keys
And I have my SetHash already initialized in a scalar variable: $set
I'm looking for a clean way to initialize the hash.
This works:
my %hash = ({ $_ => $set.clone } for @keys);
(The parens are needed for precedence; without them, the assignment to %hash is part of the body of the for loop. I could change it to a non-postfix for loop or make any of several other minor changes to get the same result in a slightly different way, but that's not what I'm interested in here.)
Instead, I was kind of hoping I could use one of Raku's nifty hyper-operators, maybe like this:
my %hash = @keys »=>» $set;
That expression works a treat when $set is a simple string or number, but a SetHash?
Array >>=>>> SetHash can never work reliably: order of keys in SetHash is indeterminate
Good to know, but I don't want it to hyper over the RHS, in any order. That's why I used the right-pointing version of the hyperop: so it would instead replicate the RHS as needed to match it up to the LHS. In this sort of expression, is there any way to say "Yo, Raku, treat this as a scalar. No, really."?
I tried an explicit Scalar wrapper (which would make the values harder to get at, but it was an experiment):
my %map = @keys »=>» $($set,)
And that got me this message:
Lists on either side of non-dwimmy hyperop of infix:«=>» are not of the same length while recursing
left: 1 elements, right: 4 elements
So it has apparently recursed into the list on the left and found a single key and is trying to map it to a set on the right which has 4 elements. Which is what I want - the key mapped to the set. But instead it's mapping it to the elements of the set, and the hyperoperator is pointing the wrong way for that combination of sizes.
So why is it recursing on the right at all? I thought a Scalar container would prevent that. The documentation says it prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
The error message says the version of the hyperoperator I'm using is "non-dwimmy", which may explain why it's not in fact doing what I mean, but is there maybe an even-less-dwimmy version that lets me be even more explicit? I still haven't gotten my brain aligned well enough with the way Raku works for it to be able to tell WIM reliably.
I'm looking for a clean way to initialize the hash.
One idiomatic option:
my %hash = @keys X=> $set;
See X metaoperator.
The documentation says ... a Scalar container ... prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
A cat is an animal, but an animal is not necessarily a cat. Flattening may act recursively, but some operations that act recursively don't flatten. Recursive flattening stops if it sees a Scalar. But hyperoperation isn't flattening. I get where you're coming from, but this is not the real problem, or at least not a solution.
I had thought that hyperoperation had two tests controlling recursing:
Is it hyperoperating a nodal operation (eg .elems)? If so, just apply it like a parallel shallow map (so don't recurse). (The current doc quite strongly implies that nodal can only be usefully applied to a method, and only a List one (or augmentation thereof) rather than any routine that might get hyperoperated. That is much more restrictive than I was expecting, and I'm sceptical of its truth.)
Otherwise, is a value Iterable? If so, then recurse into that value. In general the value of a Scalar automatically behaves as the value it contains, and that applies here. So Scalars won't help.
A SetHash doesn't do the Iterable role. So I think this refusal to hyperoperate with it is something else.
I just searched the source and that yields two matches in the current Rakudo source, both in the Hyper module, with this one being the specific one we're dealing with:
multi method infix(List:D \left, Associative:D \right) {
die "{left.^name} $.name {right.^name} can never work reliably..."
}
For some reason hyperoperation explicitly rejects use of Associatives on either the right or left when coupled with the other side being a List value.
Having pursued the "blame" (tracking who made what changes) I arrived at the commit "Die on Associative <<op>> Iterable" which says:
This can never work due to the random order of keys in the Associative.
This used to die before, but with a very LTA error about a Pair.new()
not finding a suitable candidate.
Perhaps this behaviour could be refined so that the determining factor is, first, whether an operand does the Iterable role, and then if it does, and is Associative, it dies, but if it isn't, it's accepted as a single item?
A search for "can never work reliably" in GH/rakudo/rakudo issues yields zero matches.
Maybe file an issue? (Update: I filed "RFC: Allow use of hyperoperators with an Associative that does not do Iterable role instead of dying with "can never work reliably".)
For now we need to find some other technique to stop a non-Iterable Associative being rejected. Here I use a Capture literal:
my %hash = @keys »=>» \($set);
This yields: {a => \(SetHash.new("b","a","c")), b => \(SetHash.new("b","a","c")), ....
Adding a custom op unwraps en passant:
sub infix:« my=> » ($lhs, $rhs) { $lhs => $rhs[0] }
my %hash = @keys »my=>» \($set);
This yields the desired outcome: {a => SetHash(a b c), b => SetHash(a b c), ....
my %hash = ({ $_ => $set.clone } for @keys);
(The parens seem to be needed so it can tell that the curlies are a block instead of a Hash literal...)
No. That particular code in curlies is a Block regardless of whether it's in parens or not.
More generally, Raku code of the form {...} in term position is almost always a Block.
For an explanation of when a {...} sequence is a Hash, and how to force it to be one, see my answer to the Raku SO Is that a Hash or a Block?.
Without the parens you've written this:
my %hash = { block of code } for @keys
which attempts to iterate @keys, running the code my %hash = { block of code } for each iteration. The code fails because you can't assign a block of code to a hash.
Putting parens around the ({ block of code } for @keys) part completely alters the meaning of the code.
Now it runs the block of code for each iteration. And it concatenates the result of each run into a list of results, each of which is a Pair generated by the code $_ => $set.clone. Then, when the for iteration has completed, that resulting list of pairs is assigned, once, to my %hash.

Can one create a standalone method/function (without any class)

I am trying to understand smalltalk. Is it possible to have a standalone method/function, which is not part of any particular class, and which can be called later:
amethod ['amethod called' printNl].
amethod.
The above code gives the following error:
simpleclass.st:1: expected Eval, Namespace or class definition
How can I use Eval or Namespace as suggested by the error message?
I tried the following, but none of them work:
Eval amethod [...
amethod Eval [...
Eval amethod Eval[... "!"
Eval [... works but I want to give a name to the block so that I can call it later.
The following also works, but it is executed immediately and does not execute when called later.
Namespace current: amethod ['amethod called' printNl].
Thanks for your insight.
In Smalltalk the equivalent to a standalone method is a Block (a.k.a. BlockClosure). You create them by enclosing Smalltalk expressions between square brackets. For example
[3 + 4]
To evaluate a block, you send it the message value:
[3 + 4] value
which will answer with 7.
Blocks may also have arguments:
[:s | 3 + s]
you evaluate them with value:
[:s | 3 + s] value: 4 "answers with 7"
If the block has several statements, you separate them with a dot, as you would do in the body of a method.
Addendum
Blocks in Smalltalk are first-class objects. In particular, one can reference them with variables, just as one does with any other object:
three := 3.
threePlus := [:s | three + s].
for later use
threePlus value: 4 "7"
Blocks can be nested:
random := Random new.
compare := [:p :u | u <= p].
bernoulli60 := [compare value: 0.6 value: random next].
Then the sequence:
bernoulli60 value. "true"
bernoulli60 value. "false"
...
bernoulli60 value. "true"
will answer with true about 60% of the time.
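For readers coming from other languages, the same ideas map onto closures. The sketch below is a rough Python analogue, not Smalltalk: calling a function plays the role of sending value or value:, and the seed 42 is an arbitrary choice made here so that the run is repeatable.

```python
import random

three = 3


def three_plus(s):
    """Like [:s | three + s]: captures the outer variable three."""
    return three + s


print(three_plus(4))  # like threePlus value: 4

rng = random.Random(42)  # seeded only so the run is repeatable


def compare(p, u):
    """Like [:p :u | u <= p]."""
    return u <= p


def bernoulli60():
    """Like [compare value: 0.6 value: random next]: uses another closure."""
    return compare(0.6, rng.random())


trials = 10_000
hits = sum(bernoulli60() for _ in range(trials))
print(hits / trials)  # close to 0.6; the exact figure depends on the seed
```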
Leandro's answer, although correct and showing a deep understanding of Smalltalk, answers what you asked for. I think, though I'm not 100% sure, that you are actually asking how to "play" around with code without the need to create a class.
In my eyes, what you want is called a Workspace (Smalltalk/X and Dolphin); it can have different names, like Playground in Pharo Smalltalk.
If you want to play around you need to create a local variable.
| result |
result := 0. "Init otherwise nil"
"Adding results of a simple integer factorial"
1 to: 10 do: [ :integer |
result := result + integer factorial
].
Transcript show: result.
Explanation:
I'm using a to:do: loop to iterate from 1 to 10 (integer is the block's argument). Next, I'm showing the result on the Transcript.

For loop for array in Pharo Smalltalk

I'm trying to make an array with random numbers (just 0 or 1), but when I run it, it just prints this: End of statement list encountered ->
This is my code:
GenList
| lista |
lista := Array new: 31.
1 to: 30 do: [ :i | lista at: i put: 2 atRandom - 1]
^lista
What can I do?
Some interesting things to consider:
1. The method selector doesn't start with a lowercase letter
It is a tradition for selectors to start with a lowercase letter. In this sense, genLista would be more correct than GenLista.
2. The method selector includes the abbreviated word 'gen'
For instance, genLista could be renamed to genereLista or listaAlAzar (if you decide to use Spanish)
3. The Array named lista has 31 elements, not 30
The result of Array new: 31 is an array of 31 elements. However, the code below it only fills 30 of them, leaving the last one uninitialized (i.e., nil). Possible solution: lista := Array new: 30.
4. A dot is missing causing a compilation error
The code
1 to: 30 do: [ :i | lista at: i put: 2 atRandom - 1]
^lista
does not compile because there is no dot indicating the separation between the two statements. Note that the error happens at compilation time (i.e., when you save the method) because the return token ^ must start a statement (i.e., it cannot be inlined inside a statement).
There are other cases where a missing dot will not prevent the code from compiling. Instead, an error will happen at runtime. Here is a (typical) example:
1 to: 10 do: [:i | self somethingWith: i] "<- missing dot here"
self somethingElse
The missing dot will generate the runtime error self not understood by block.
5. There is a more expressive way of generating 0s and 1s at random
The calculation 2 atRandom - 1 is ok. However, it forces the reader to mentally do the math. A better way to reveal your intention would have been
#(0 1) atRandom
6. When playing with random numbers don't forget to save the seed
While it is ok to use atRandom, such a practice should only be used with "toy" code. If you are developing a system or a library, the recommended practice is to save the seed somewhere before generating any random data. This will allow you to reproduce the generation of random quantities later on for the sake of debugging or confirmation. (Note however, that this will not suffice for making your program deterministically reproducible because unordered (e.g. hashed) collections could form differently in successive executions.)
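Point 6 can be illustrated with a short sketch. It is written in Python rather than Pharo, so the names (gen_list, the seed value 2024) are hypothetical. The idea that carries over is recording the seed and rebuilding the generator from it.

```python
import random


def gen_list(seed: int, size: int = 30) -> list:
    """Return `size` random bits drawn from a generator built from `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice((0, 1)) for _ in range(size)]  # like #(0 1) atRandom


seed = 2024  # record this value somewhere, e.g. in a log
first_run = gen_list(seed)
second_run = gen_list(seed)
print(first_run == second_run)  # True: same seed, same sequence
```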

Why ifTrue and ifFalse are not separated by ; in Smalltalk?

a > b
ifTrue:[ 'greater' ]
ifFalse:[ 'less or equal' ]
My understanding is that the Boolean a > b receives the message ifTrue:[ 'greater' ], and then ifFalse:[ 'less or equal' ], conforming to the generalization:
objectInstance selector; selector2
But there a semicolon is needed to specify that the receiver of selector2 is not (objectInstance selector) but objectInstance. Isn't it the same with the conditional above?
The selector of the method is Boolean>>ifTrue:ifFalse:, which means it is one method with two parameters, not two methods with one parameter.
Ergo, to invoke the method, you send it the message ifTrue:ifFalse: with two block arguments.
Note that for convenience reasons, there are also methods Boolean>>ifFalse:ifTrue:, Boolean>>ifTrue: and Boolean>>ifFalse:.
Everything relevant has already been said, but just for your amusement:
As already mentioned,
rcvr ifTrue:[...] ifFalse:[...]
is the one and single message #'ifTrue:ifFalse:' with 2 args sent to rcvr. The value of that expression is the one from that message send.
In contrast:
rcvr ifTrue:[...]; ifFalse:[...]
is a cascade of 2 sequential messages (#'ifTrue:' and #'ifFalse:'), each with 1 arg sent to rcvr. The value of the expression is the one returned from the last send.
Now the funny thing is that booleans do understand ifTrue: / ifFalse: (each with 1 arg),
so your code works for the side effect (evaluating those blocks), but not for its value.
This means that:
a > b ifTrue:[Transcript showCR:'gt'] ; ifFalse:[Transcript showCR:'le']
generates the same output as:
a > b ifTrue:[Transcript showCR:'gt'] ifFalse:[Transcript showCR:'le']
but:
msg := a > b ifTrue:['gt'] ; ifFalse:['le']
will generate different values in msg than:
msg := a > b ifTrue:['gt'] ifFalse:['le']
depending on the values of a and b. Try (a b)=(1 2) vs. (a b)=(2 1)...
The problem of many Smalltalk beginners is that they think of ifXXX: as syntax, where it is actually a message send which produces a value. Also, the semicolon is not a statement separator as in many previously learned languages, but a message-sequencing construct.
A bad trap for beginners, because the code seems to work for some particular value combinations, whereas it generates funny results for others.
Let's hope your unit tests cover these ;-)
edit: to see where the bad value comes from, take a look at what is returned by the Boolean >> ifFalse: method for a true receiver...
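The difference can be modelled in a few lines. The sketch below is a toy Python model, not how any Smalltalk implements Boolean: Bool, its methods and the cascade helper are names invented here, and None plays the role of nil.

```python
class Bool:
    """Toy model of a Smalltalk Boolean receiver."""

    def __init__(self, value: bool):
        self.value = value

    def if_true(self, block):
        # One-argument message: answers nil (None) when the receiver is false.
        return block() if self.value else None

    def if_false(self, block):
        # One-argument message: answers nil (None) when the receiver is true.
        return None if self.value else block()

    def if_true_if_false(self, true_block, false_block):
        # Single two-argument message: always answers one block's value.
        return true_block() if self.value else false_block()


def cascade(receiver, *sends):
    """Send each message to the same receiver; answer the last result."""
    result = None
    for send in sends:
        result = send(receiver)
    return result


for a, b in ((1, 2), (2, 1)):
    rcvr = Bool(a > b)
    keyword = rcvr.if_true_if_false(lambda: "gt", lambda: "le")
    cascaded = cascade(
        rcvr,
        lambda r: r.if_true(lambda: "gt"),
        lambda r: r.if_false(lambda: "le"),
    )
    print((a, b), keyword, cascaded)
```

With (a b)=(2 1) the cascade answers None where the single two-argument message answers 'gt', which is the trap described above. With (a b)=(1 2) both forms happen to agree.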