What's the logic behind some WriteStream messages? - smalltalk

Consider the following sequence of messages:
1. s := '' writeStream.
2. s nextPutAll: '123'.
3. s skip: -3.
4. s position "=> 0".
5. s size "=> 3".
6. s isEmpty "=> false".
7. s contents isEmpty "=> true (!)"
Aren't 6 and 7 contradictory or, at least, confusing? What's the logic behind this behavior? (Dolphin has a similar functioning.)
UPDATE
As #MartinW observed (see the comment below) ReadStream and ReadWriteStream behave differently (we could say, as expected.)
From a practical viewpoint the compatibility that worries me more is with FileStream where the contents message is not limited to the current position. Such a difference invalidates an otherwise nice "law" by which any code that works with a memory stream (string or byte array) also works with a file stream, and conversely. This kind of equivalence is very useful for testing and also for pedagogical reasons. The apparent loss of functionality would be easily recovered by adding the method #truncate to WriteStream which would explicitly shorten the size to the current position [cf., the discussion below with Derek Williams]

In many Smalltalks (like VAST), the method comments explain it well:
WriteStream>>contents
"Answer a Collection which is a copy collection that the receiver is
streaming over, truncated to the current position reference."
PositionableStream>>isEmpty
"Answer a Boolean which is true if the receiver can access any
objects and false otherwise."
Note "truncated to the current position reference."
In your example, contents is an empty string because you set the position back to 0 with skip: -3.
This behavior is not only correct, it's quite useful. For example, consider a method that builds a comma-separated list from a collection:
commaSeparatedListFor: aCollection
ws := '' writeStream.
aCollection do: [ :ea |
ws print: ea; nextPutAll: ', ' ].
ws isEmpty
ifFalse: [ ws position: ws position - 2 ].
^ws contents
In this case, the method may have written the final trailing ", ", but we want contents to exclude it.

Related

How to add same method with 2 different names in GNU Smalltalk?

How can I have a class expose the same method with 2 different names?
E.g. that the asDescripton function does the same thing / re-exports the asString function without simply copy-pasting the code.
Object subclass: Element [
| width height |
Element class >> new [
^super new init.
]
init [
width := 0.
height := 0.
]
asString [
^ 'Element with width ', width, ' and height ', height.
]
asDescription [ "???" ]
]
In Smalltalk you usually implement #printOn: and get #asString from the inherited version of it which goes on the lines of
Object >> asString
| stream |
stream := '' writeStream.
self printOn: stream.
^stream contents
The actual implementation of this method may be slightly different in your environment, the idea remains the same.
As this is given, it is usually a good idea to implement #printOn: rather than #asString. In your case you would have it implemented as
Element >> printOn: aStream
aStream
nextPutAll: 'Element with width ';
nextPutAll: width asString;
nextPutAll: ' and height ';
nextPutAll: height asString
and then, as JayK and luker indicated,
Element >> asDescription
^self asString
In other words, you (usually) don't want to implement #asString but #printOn:. This approach is better because it takes advantage of the inheritance and ensures consistency between #printOn: and #asString, which is usually expected. In addition, it will give you the opportunity to start becoming familiar with Streams, which play a central role in Smalltalk.
Note by the way that in my implementation I've used width asString and heigh asString. Your code attempts to concatenate (twice) a String with a Number:
'Element with width ', width, ' and height ', height.
which won't work as you can only concatenate instances of String with #,.
In most of the dialects, however, you can avoid sending #asString by using #print: instead of #nextPutAll:, something like:
Element >> printOn: aStream
aStream
nextPutAll: 'Element with width ';
print: width;
nextPutAll: ' and height ';
print: height
which is a little bit less verbose and therefore preferred.
One last thing. I would recommend changing the first line above with this one:
nextPutAll: self class name;
nextPutAll: ' with width ';
instead of hardcoding the class name. This would prove to be useful if in the future you subclass Element because you will have no need to tweak #printOn: and any of its derivatives (e.g., #asDescription).
Final thought: I would rename the selector #asDescription to be #description. The preposition as is intended to convert an object to another of a different class (this is why #asString is ok). But this doesn't seem to be the case here.
Addendum: Why?
There is a reason why #asString is implemented in terms of #printOn:, and not the other way around: generality. While the effort (code complexity) is the same, #printOn: is clearly a winner because it will work with any character Stream. In particular, it will work with no modification whatsoever with
Files (instances of FileStream)
Sockets (instances of SocketStream)
The Transcript
In other words, by implementing #printOn: one gets #asString for free (inheritance) and --at the same time-- the ability to dump a representation of the object on files and sockets. The Transcript is particularly interesting because it supports the Stream protocol for writing, and thus can be used for testing purposes before sending any bytes to external devices.
Remember!
In Smalltalk, the goal is to have objects whose behavior is simple and general at once, not just simple!
As lurker wrote in the comments, send the asString message in asDescription.
asDescription
^ self asString
This is usually done to expose additional interfaces/protocols from a class, for compatibility or as a built-in adapter. If you create something new that does not have to fit in anywhere else, consider to stick to just one name for each operation.
Edit: if you really are after the re-export semantics and do not want the additional message sends involved in the delegation above, there might be a way to put the CompiledMethod of asString in the method dictionary of the class a second time under the other name. But neither am I sure that this would work, nor do I know the protocol in GNU Smalltalk how to manipulate the method dictionary. Have a look at the documentation of the Behavior class. Also, I would not consider this as programming Smalltalk, but tinkering with the system.

Understanding GNU Smalltalk Closure

The following piece of code is giving the error error: did not understand '#generality'
pqueue := SortedCollection new.
freqtable keysAndValuesDo: [:key :value |
(value notNil and: [value > 0]) ifTrue: [
|newvalue|
newvalue := Leaf new: key count: value.
pqueue add: newvalue.
]
].
[pqueue size > 1] whileTrue:[
|first second new_internal newcount|
first := pqueue removeFirst.
second := pqueue removeFirst.
first_count := first count.
second_count := second count.
newcount := first_count + second_count.
new_internal := Tree new: nl count: newcount left: first right: second.
pqueue add: new_internal.
].
The inconsistency is in the line pqueue add: new_internal. When I remove this line, the program compiles. I think the problem is related to the iteration block [pqueue size > 1] whileTrue: and pqueue add: new_internal.
Note: This is the algorithm to build the decoding tree based on huffman code.
error-message expanded
Object: $<10> error: did not understand #generality
MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
SmallInteger>><= (SmallInt.st:215)
Leaf>><= (hzip.st:30)
optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
SortedCollection>>insertionIndexFor:upTo: (SortCollect.st:702)
[] in SortedCollection>>merge (SortCollect.st:531)
SortedCollection(SequenceableCollection)>>reverseDo: (SeqCollect.st:958)
SortedCollection>>merge (SortCollect.st:528)
SortedCollection>>beConsistent (SortCollect.st:204)
SortedCollection(OrderedCollection)>>removeFirst (OrderColl.st:295)
optimized [] in UndefinedObject>>executeStatements (hzip.st:156)
BlockClosure>>whileTrue: (BlkClosure.st:328)
UndefinedObject>>executeStatements (hzip.st:154)
Object: $<10> error: did not understand #generality
MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
SmallInteger>><= (SmallInt.st:215)
Leaf>><= (hzip.st:30)
optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
SortedCollection>>insertionIndexFor:upTo: (SortCollect.st:702)
[] in SortedCollection>>merge (SortCollect.st:531)
SortedCollection(SequenceableCollection)>>reverseDo: (SeqCollect.st:958)
SortedCollection>>merge (SortCollect.st:528)
SortedCollection>>beConsistent (SortCollect.st:204)
SortedCollection(OrderedCollection)>>do: (OrderColl.st:64)
UndefinedObject>>executeStatements (hzip.st:164)
One learning we can take from this question is to acquire the habit of reading the stack trace trying to make sense of it. Let's focus in the last few messages:
1. Object: $<10> error: did not understand #generality
2. MessageNotUnderstood(Exception)>>signal (ExcHandling.st:254)
3. Character(Object)>>doesNotUnderstand: #generality (SysExcept.st:1448)
4. SmallInteger(Number)>>retryDifferenceCoercing: (Number.st:357)
5. SmallInteger(Number)>>retryRelationalOp:coercing: (Number.st:295)
6. SmallInteger>><= (SmallInt.st:215)
7. Leaf>><= (hzip.st:30)
8. optimized [] in SortedCollection class>>defaultSortBlock (SortCollect.st:7)
Each of these lines represents the activation of a method. Every line represents a message and the sequence of messages goes upwards (as it happens in any Stack.) The full detail of every activation can be seen in the debugger. Here, however, we only are presented with the class >> #selector pair. There are several interesting facts we can identify from this summarized information:
In line 1 we get the actual error. In this case we got a MessageNotUnderstood exception. The receiver of the message was the Character $<10>, i.e., the linefeed character.
Lines 2 and 3 confirm that the not understood message was #generality.
Lines 4, 5 and 6 show the progression of messages that ended up sending #generality to the wrong object (linefeed). While 4 and 5 might look obscure for the non-experienced Smalltalker, line 6 has the key information: some SmallInteger received the <= message. This message would fail because the argument wasn't the appropriate one. From the information we already got we know that the argument was the linefeed character.
Line 7 shows that SmallInteger >> #<= came from the way the same selector #<= is implemented in Leaf. It tells us that a Leaf delegates #<= to some Integer known to it.
Line 8 says why we are dealing with the comparison selector #<=. The reason is that we are sorting some collection.
So, we are trying to sort a collection of Leaf objects which rely on some integers for their comparison and somehow one of those "integers" wasn't a Number but the Character linefeed.
If we take a look at the Smalltalk code with this information in mind we see:
The SortedCollection is pqueue and the Leaf objects are the items being added to it.
The invariant property of a SortedCollection is that it always has its elements ordered by a given criterion. Consequently, every time we add: an element to it, the element will be inserted in the correct position. Hence the comparison message #<=.
Now let's look for #add: in the code. Besides of the one above, there is another below:
new_internal := Tree new: nl count: newcount left: first right: second.
pqueue add: new_internal.
This one is interesting because is where the error happens. Note however that we are not adding a Leaf here but a Tree. But wait, it might be that a Tree and a Leaf belong to the same hierarchy. In fact, both Tree and Leaf represent nodes in an acyclic graph. Moreover, the code confirms this idea when it reads:
Leaf new: key count: value.
...
Tree new: nl count: newcount left: first right: second.
See? Both Leaf and Tree have some key (the argument of new:) and some count. In addition, Trees have left and right branches, which Leafs not (of course!)
So, in principle, it would be ok to add instances of Tree to our pqueue collection. This cannot be what causes the error.
Now, if we look closer to the way the Tree is created we can see a suspicious argument nl. This is interesting because of two reasons: (i) the variable nl is not defined in the part of the code we were given and (ii) the variable nl is the key that will be used by the Tree to respond to the #<= message. Therefore, nl must be the linefeed character $<10>. Which makes a lot of sense because nl is an abbreviation of newline and in the Linux world newlines are linefeeds.
Conclusion: The problem seems to be caused by the wrong argument nl used for the Tree's key.

Rebol: Dynamic binding of block words

In Rebol, there are words like foreach that allow "block parametrization" over a given word and a series, e.g., foreach w [1 2 3] [print w]. Since I find that syntax very convenient (as opposed to passing func blocks), I'd like to use it for my own words that operate on lazy lists, e.g map/stream x s [... x ... ].
How is that syntax idiom called? How is it properly implemented?
I was searching the docs, but I could not find a straight answer, so I tried to implement foreach on my own. Basically, my implementation comes in two parts. The first part is a function that binds a specific word in a block to a given value and yields a new block with the bound words.
bind-var: funct [block word value] [
qw: load rejoin ["'" word]
do compose [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
]
Using that, I implemented foreach as follows:
my-foreach: func ['word s block] [
if empty? block [return none]
until [
do bind-var block word first s
s: next s
tail? s
]
]
I find that approach quite clumsy (and it probably is), so I was wondering how the problem can be solved more elegantly. Regardless, after coming up with my contraption, I am left with two questions:
In bind-var, I had to do some wrapping in bind [(block)] (:qw) because (block) would "dissolve". Why?
Because (?) of 2, the bind operation is performed on a new block (created by the [(block)] expression), not the original one passed to my-foreach, with seperate bindings, so I have to operate on that. By mistake, I added [(block)] and it still works. But why?
Great question. :-) Writing your own custom loop constructs in Rebol2 and R3-Alpha (and now, history repeating with Red) has many unanswered problems. These kinds of problems were known to the Rebol3 developers and considered blocking bugs.
(The reason that Ren-C was started was to address such concerns. Progress has been made in several areas, though at time of writing many outstanding design problems remain. I'll try to just answer your questions under the historical assumptions, however.)
In bind-var, I had to do some wrapping in bind [(block)] (:qw) because (block) would "dissolve". Why?
That's how COMPOSE works by default...and it's often the preferred behavior. If you don't want that, use COMPOSE/ONLY and blocks will not be spliced, but inserted as-is.
qw: load rejoin ["'" word]
You can convert WORD! to LIT-WORD! via to lit-word! word. You can also shift the quoting responsibility into your boilerplate, e.g. set quote (word) value, and avoid qw altogether.
Avoiding LOAD is also usually preferable, because it always brings things into the user context by default--so it loses the binding of the original word. Doing a TO conversion will preserve the binding of the original WORD! in the generated LIT-WORD!.
do compose [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
Presumably you meant COMPOSE/DEEP here, otherwise this won't work at all... with regular COMPOSE the embedded PAREN!s cough, GROUP!s for [(block)] will not be substituted.
By mistake, I added [(block)] and it still works. But why?
If you do a test like my-foreach x [1] [print x probe bind? 'x] the output of the bind? will show you that it is bound into the "global" user context.
Fundamentally, you don't have any MAKE OBJECT! or USE to create a new context to bind the body into. Hence all you could potentially be doing here would be stripping off any existing bindings in the code for x and making sure they are into the user context.
But originally you did have a USE, that you edited to remove. That was more on the right track:
bind-var: func [block word value /local qw] [
qw: load rejoin ["'" word]
do compose/deep [
use [(qw)] [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
]
]
You're right to suspect something is askew with how you're binding. But the reason this works is because your BIND is only redoing the work that USE itself does. USE already deep walks to make sure any of the word bindings are adjusted. So you could omit the bind entirely:
do compose/deep [
use [(qw)] [
set (:qw) value
[(block)]
]
]
the bind operation is performed on a new block (created by the [(block)] expression), not the original one passed to my-foreach, with separate bindings
Let's adjust your code by taking out the deep-walking USE to demonstrate the problem you thought you had. We'll use a simple MAKE OBJECT! instead:
bind-var: func [block word value /local obj qw] [
do compose/deep [
obj: make object! [(to-set-word word) none]
qw: bind (to-lit-word word) obj
set :qw value
bind [(block)] :qw
[(block)] ; This shouldn't work? see Question 2
]
]
Now if you try my-foreach x [1 2 3] [print x]you'll get what you suspected... "x has no value" (assuming you don't have some latent global definition of x it picks up, which would just print that same latent value 3 times).
But to make you sufficiently sorry you asked :-), I'll mention that my-foreach x [1 2 3] [loop 1 [print x]] actually works. That's because while you were right to say a bind in the past shouldn't affect a new block, this COMPOSE only creates one new BLOCK!. The topmost level is new, any "deeper" embedded blocks referenced in the source material will be aliases of the original material:
>> original: [outer [inner]]
== [outer [inner]]
>> composed: compose [<a> (original) <b>]
== [<a> outer [inner] <b>]
>> append original/2 "mutation"
== [inner "mutation"]
>> composed
== [<a> outer [inner "mutation"] <b>]
Hence if you do a mutating BIND on the composed result, it can deeply affect some of your source.
until [
do bind-var block word first s
s: next s
tail? s
]
On a general note of efficiency, you're running COMPOSE and BIND operations on each iteration of your loop. No matter how creative new solutions to these kinds of problems get (there's a LOT of new tech in Ren-C affecting your kind of problem), you're still probably going to want to do it only once and reuse it on the iterations.

Is it possible to create LOCAL variable dynamically in rebol / red?

It is easy to create GLOBAL variables dynamically in rebol / red with set like
i: 1
myvarname: rejoin ["var" i]
set to-word myvarname 10
var1
but then var1 is global. What if I want to create var1 dynamically inside a function and make it LOCAL so as to avoid collision with some global variables of same name ?
In javascript it is possible:
How to declare a dynamic local variable in Javascript
Not sure it is possible in rebol/red ?
In Red you have function, in Rebol2 you have funct. Both create local variable words automatically. Here an example for Rebol2
>> for num 1 100 1 [
[ set to-word rejoin ["f" num] funct [] compose/deep [
[ print [ "n =" n: (num) ]
[ ]
[ ]
>> f1
n = 1
>> f2
n = 2
>> n
** Script Error: n has no value
** Near: n
How it is done, you can see with source funct
In Rebol, there is USE:
x: 10
word: use [x] [
x: 20
print ["Inside the use, x is" x]
'x ;-- leak the word with binding to the USE as evaluative result
]
print ["Outside the use, plain x is" x]
print ["The leaked x from the use is" get word]
That will give you:
Inside the use, x is 20
Outside the use, x is 10
The leaked x from the use is 20
One should be forewarned that the way this works is it effectively does a creation like make object! [x: none]. Then it does a deep walk of the body of the USE, looking for ANY-WORD! that are named x (or X, case doesn't matter)...and binds them to that OBJECT!.
This has several annoying properties:
The enumeration and update of bindings takes time. If you're in a loop it will take this time each visit through the loop.
The creation of the OBJECT! makes two series nodes, one for tracking keys (x) and one for tracking vars (20). Again if you are in a loop, the two series nodes will be created each time through that loop. As the GET outside the loop shows, these nodes will linger until the garbage collector decides they're not needed anymore.
You might want to say use [x] code and not disrupt the bindings in code, hence the body would need to be deep copied before changing it.
The undesirable properties of deep binding led Red to change the language semantics of constructs like FOR-EACH. It currently has no USE construct either, perhaps considered best to avoid for some of the same reasoning.
(Note: New approaches are being investigated on the Rebol side for making the performance "acceptable cost", which might be good enough to use in the future. It would be an evolution of the technique used for specific binding).

Is it possible to implement DO + function in pure REBOL?

When DO is followed by a function, that function is executed and the remaining values are consumed as arguments according to the arity of the given function, e.g.,
do :multiply 3 4
multiply 3 4
These two statements are identical in their effects. But I think DO + function receives special treatment by the REBOL interpreter, because I don't believe it's possible to implement your own DO (with the exact same syntax) in pure REBOL, e.g.,
perform: func [f [any-function!]] [
; What goes here?
]
Is this correct?
Clarification
I am not asking about the DO dialect. This is not a "beginner" question. I understand REBOL's general syntax very, very well: Bindology (an old blog post I did on it), the implications of its homoiconicity, the various flavors of words, and all the rest. (For example, here is my implementation of Logo's cascade in REBOL. While I'm at it, why not plug my Vim syntax plug-in for REBOL.)
I'm asking something more subtle. I'm not sure how I can phrase it more clearly than I already have, so I'll ask you to read my original question more carefully. I want to achieve a function that, like DO, has the following capability:
do :multiply 3 4
double: func [n] [n * 2]
do :double 5
Notice how the syntax do :double or do :multiply consumes the appropriate number of REBOL values after it. This is the key to understanding what I'm asking. As far as I can tell, it is not possible to write your own REBOL function that can DO this.
You'll have answered this question when you can write your own function in pure REBOL that can be substituted for DO in the examples above—without dialects, blocks, or any other modifications—or explain why it can't be done.
The cause of the behavior you are seeing is specifically this line of code for the Rebol native DO.
/***********************************************************************
**
*/ REBNATIVE(do)
/*
***********************************************************************/
{
REBVAL *value = D_ARG(1);
switch (VAL_TYPE(value)) {
/* ... */
case REB_NATIVE:
case REB_ACTION:
case REB_COMMAND:
case REB_REBCODE:
case REB_OP:
case REB_CLOSURE:
case REB_FUNCTION:
VAL_SET_OPT(value, OPTS_REVAL); /* <-- that */
return R_ARG1;
This OPTS_REVAL can be found in sys-value.h, where you'll find some other special control bits...like the hidden "line break" flag:
// Value option flags:
enum {
OPTS_LINE = 0, // Line break occurs before this value
OPTS_LOCK, // Lock word from modification
OPTS_REVAL, // Reevaluate result value
OPTS_UNWORD, // Not a normal word
OPTS_TEMP, // Temporary flag - variety of uses
OPTS_HIDE, // Hide the word
};
So the way the DO native handles a function is to return a kind of "activated" function value. But you cannot make your own values with this flag set in user code. The only place in the entire codebase that sets the flag is this snippet in the DO native.
It looks like something that could be given the axe, as APPLY does this more cleanly and within the definitions of the system.
Yes, in Rebol 3:
>> perform: func [f [any-function!]] [return/redo :f]
>> perform :multiply 3 4
== 12
>> double: func [n] [n * 2]
>> perform :double 5
== 10
You might find it interesting to read: Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?
This is a good question, and I will try to explain it to the best of my understanding.
The two statements above are identical in effect, but it is worth diving deeper into what is happening.
The :word syntax is known as a get-word! and is equivalent to writing get 'word. So another way of writing this would be
do get 'multiply 3 4
multiply is just another word! to Rebol.
The do dialect is the default dialect used by the Rebol interpreter.
If you want to implement your own version of do you need to be evaluating your code/data yourself, not using do. Here is a trivial example:
perform: func [ code [block!]] [ if equal? code [ 1 ] [ print "Hello" ] ]
This defines perform as a function which takes a block of code. The "language" or dialect it is expecting is trivial in that the syntax is just perform an action (print "hello") if the code passed is the integer 1.
If this was called as
perform [ multiply 3 4 ]
nothing would happen as code is not equal to 1.
The only way it would do something is if it was passed a block! containing 1.
>> perform [ 1 ]
Hello
Expanding on this slightly:
perform: func [ code [block!]] [ if equal? code [ multiply 3 4 ] [ 42 ] ]
would give us a perform which behaves very differently.
>> perform [ multiply 3 4 ]
== 42
You can easily write your own do to evaluate your dialect, but if you run it directly then you are already running within the do dialect so you need to call a function of some kind to bootstrap your own dialect.
This jumping between dialects is a normal way to write Rebol code, a good example of this being the parse dialect
parse [ 1 2.4 3 ] [ some number! ]
which has it's own syntax and even reuses existing do dialect words such as skip but with a different meaning.