Rebol: Dynamic binding of block words - rebol

In Rebol, there are words like foreach that allow "block parametrization" over a given word and a series, e.g., foreach w [1 2 3] [print w]. Since I find that syntax very convenient (as opposed to passing func blocks), I'd like to use it for my own words that operate on lazy lists, e.g map/stream x s [... x ... ].
What is that syntax idiom called? How is it properly implemented?
I was searching the docs, but I could not find a straight answer, so I tried to implement foreach on my own. Basically, my implementation comes in two parts. The first part is a function that binds a specific word in a block to a given value and yields a new block with the bound words.
bind-var: funct [block word value] [
qw: load rejoin ["'" word]
do compose [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
]
Using that, I implemented foreach as follows:
my-foreach: func ['word s block] [
if empty? block [return none]
until [
do bind-var block word first s
s: next s
tail? s
]
]
I find that approach quite clumsy (and it probably is), so I was wondering how the problem can be solved more elegantly. Regardless, after coming up with my contraption, I am left with two questions:
In bind-var, I had to do some wrapping in bind [(block)] (:qw) because (block) would "dissolve". Why?
Because (?) of 2, the bind operation is performed on a new block (created by the [(block)] expression), not the original one passed to my-foreach, with separate bindings, so I have to operate on that. By mistake, I added [(block)] and it still works. But why?

Great question. :-) Writing your own custom loop constructs in Rebol2 and R3-Alpha (and now, history repeating with Red) has many unanswered problems. These kinds of problems were known to the Rebol3 developers and considered blocking bugs.
(The reason that Ren-C was started was to address such concerns. Progress has been made in several areas, though at time of writing many outstanding design problems remain. I'll try to just answer your questions under the historical assumptions, however.)
In bind-var, I had to do some wrapping in bind [(block)] (:qw) because (block) would "dissolve". Why?
That's how COMPOSE works by default...and it's often the preferred behavior. If you don't want that, use COMPOSE/ONLY and blocks will not be spliced, but inserted as-is.
qw: load rejoin ["'" word]
You can convert WORD! to LIT-WORD! via to lit-word! word. You can also shift the quoting responsibility into your boilerplate, e.g. set quote (word) value, and avoid qw altogether.
Avoiding LOAD is also usually preferable, because it always brings things into the user context by default--so it loses the binding of the original word. Doing a TO conversion will preserve the binding of the original WORD! in the generated LIT-WORD!.
do compose [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
Presumably you meant COMPOSE/DEEP here, otherwise this won't work at all... with regular COMPOSE the embedded PAREN!s (cough, GROUP!s) for [(block)] will not be substituted.
By mistake, I added [(block)] and it still works. But why?
If you do a test like my-foreach x [1] [print x probe bind? 'x] the output of the bind? will show you that it is bound into the "global" user context.
Fundamentally, you don't have any MAKE OBJECT! or USE to create a new context to bind the body into. Hence all you could potentially be doing here would be stripping off any existing bindings in the code for x and making sure they are bound into the user context.
But originally you did have a USE, that you edited to remove. That was more on the right track:
bind-var: func [block word value /local qw] [
qw: load rejoin ["'" word]
do compose/deep [
use [(qw)] [
set (:qw) value
bind [(block)] (:qw)
[(block)] ; This shouldn't work? see Question 2
]
]
]
You're right to suspect something is askew with how you're binding. But the reason this works is because your BIND is only redoing the work that USE itself does. USE already deep walks to make sure any of the word bindings are adjusted. So you could omit the bind entirely:
do compose/deep [
use [(qw)] [
set (:qw) value
[(block)]
]
]
the bind operation is performed on a new block (created by the [(block)] expression), not the original one passed to my-foreach, with separate bindings
Let's adjust your code by taking out the deep-walking USE to demonstrate the problem you thought you had. We'll use a simple MAKE OBJECT! instead:
bind-var: func [block word value /local obj qw] [
do compose/deep [
obj: make object! [(to-set-word word) none]
qw: bind (to-lit-word word) obj
set :qw value
bind [(block)] :qw
[(block)] ; This shouldn't work? see Question 2
]
]
Now if you try my-foreach x [1 2 3] [print x] you'll get what you suspected... "x has no value" (assuming you don't have some latent global definition of x it picks up, which would just print that same latent value 3 times).
But to make you sufficiently sorry you asked :-), I'll mention that my-foreach x [1 2 3] [loop 1 [print x]] actually works. That's because while you were right to say a bind in the past shouldn't affect a new block, this COMPOSE only creates one new BLOCK!. The topmost level is new, any "deeper" embedded blocks referenced in the source material will be aliases of the original material:
>> original: [outer [inner]]
== [outer [inner]]
>> composed: compose [<a> (original) <b>]
== [<a> outer [inner] <b>]
>> append original/2 "mutation"
== [inner "mutation"]
>> composed
== [<a> outer [inner "mutation"] <b>]
Hence if you do a mutating BIND on the composed result, it can deeply affect some of your source.
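The aliasing shown in that console session is not specific to Rebol. As a rough analogy only (this is Python, not Rebol, and the variable names are made up for illustration), splicing a list into a new list also creates just one new top-level container while the nested lists stay shared:

```python
import copy

original = ["outer", ["inner"]]

# Like COMPOSE splicing a block: a new top-level list, but the same nested list objects.
composed = ["<a>", *original, "<b>"]

# Mutating the nested list through the original is visible through the composed list.
original[1].append("mutation")
print(composed)  # ['<a>', 'outer', ['inner', 'mutation'], '<b>']

# A deep copy shares nothing, so later mutations of the source do not reach it.
isolated = copy.deepcopy(original)
original[1].append("again")
print(isolated)  # ['outer', ['inner', 'mutation']]
```

In the same way, an in-place operation on the nested parts of the composed block (such as a mutating BIND) reaches the source material, while a deep copy taken first would not be affected.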
until [
do bind-var block word first s
s: next s
tail? s
]
On a general note of efficiency, you're running COMPOSE and BIND operations on each iteration of your loop. No matter how creative new solutions to these kinds of problems get (there's a LOT of new tech in Ren-C affecting your kind of problem), you're still probably going to want to do it only once and reuse it on the iterations.
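The "set up once, reuse on every iteration" idea can be sketched outside Rebol as well. The following is a loose Python analogy, not a Rebol implementation: `my_foreach`, its parameters, and the use of a dictionary as the "context" are all assumptions made for illustration. The body is compiled once and given one private context, and only the loop variable is updated per iteration:

```python
def my_foreach(word, series, body_source, env=None):
    """Run body_source once per item of series, with the name `word` set to the item."""
    if not series:
        return None
    code = compile(body_source, "<body>", "exec")  # prepared once, not per iteration
    context = dict(env or {})                      # one private context for the whole loop
    for item in series:
        context[word] = item                       # only the variable changes each time
        exec(code, context)
    return context


collected = []
my_foreach("x", [1, 2, 3], "collected.append(x * 10)", {"collected": collected})
print(collected)  # [10, 20, 30]
```

The loop variable lives only in the private context, so it does not leak into the caller's namespace.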

Related

How to use hyperoperators with Scalars that aren't really scalar?

I want to make a hash of sets. Well, SetHashes, since they need to be mutable.
In fact, I would like to initialize my Hash with multiple identical copies of the same SetHash.
I have an array containing the keys for the new hash: @keys
And I have my SetHash already initialized in a scalar variable: $set
I'm looking for a clean way to initialize the hash.
This works:
my %hash = ({ $_ => $set.clone } for @keys);
(The parens are needed for precedence; without them, the assignment to %hash is part of the body of the for loop. I could change it to a non-postfix for loop or make any of several other minor changes to get the same result in a slightly different way, but that's not what I'm interested in here.)
Instead, I was kind of hoping I could use one of Raku's nifty hyper-operators, maybe like this:
my %hash = @keys »=>» $set;
That expression works a treat when $set is a simple string or number, but a SetHash?
Array >>=>>> SetHash can never work reliably: order of keys in SetHash is indeterminate
Good to know, but I don't want it to hyper over the RHS, in any order. That's why I used the right-pointing version of the hyperop: so it would instead replicate the RHS as needed to match it up to the LHS. In this sort of expression, is there any way to say "Yo, Raku, treat this as a scalar. No, really."?
I tried an explicit Scalar wrapper (which would make the values harder to get at, but it was an experiment):
my %map = @keys »=>» $($set,)
And that got me this message:
Lists on either side of non-dwimmy hyperop of infix:«=>» are not of the same length while recursing
left: 1 elements, right: 4 elements
So it has apparently recursed into the list on the left and found a single key and is trying to map it to a set on the right which has 4 elements. Which is what I want - the key mapped to the set. But instead it's mapping it to the elements of the set, and the hyperoperator is pointing the wrong way for that combination of sizes.
So why is it recursing on the right at all? I thought a Scalar container would prevent that. The documentation says it prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
The error message says the version of the hyperoperator I'm using is "non-dwimmy", which may explain why it's not in fact doing what I mean, but is there maybe an even-less-dwimmy version that lets me be even more explicit? I still haven't gotten my brain aligned well enough with the way Raku works for it to be able to tell WIM reliably.
I'm looking for a clean way to initialize the hash.
One idiomatic option:
my %hash = @keys X=> $set;
See X metaoperator.
The documentation says ... a Scalar container ... prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
A cat is an animal, but an animal is not necessarily a cat. Flattening may act recursively, but some operations that act recursively don't flatten. Recursive flattening stops if it sees a Scalar. But hyperoperation isn't flattening. I get where you're coming from, but this is not the real problem, or at least not a solution.
I had thought that hyperoperation had two tests controlling recursing:
Is it hyperoperating a nodal operation (eg .elems)? If so, just apply it like a parallel shallow map (so don't recurse). (The current doc quite strongly implies that nodal can only be usefully applied to a method, and only a List one (or augmentation thereof) rather than any routine that might get hyperoperated. That is much more restrictive than I was expecting, and I'm sceptical of its truth.)
Otherwise, is a value Iterable? If so, then recurse into that value. In general the value of a Scalar automatically behaves as the value it contains, and that applies here. So Scalars won't help.
A SetHash doesn't do the Iterable role. So I think this refusal to hyperoperate with it is something else.
I just searched the source and that yields two matches in the current Rakudo source, both in the Hyper module, with this one being the specific one we're dealing with:
multi method infix(List:D \left, Associative:D \right) {
die "{left.^name} $.name {right.^name} can never work reliably..."
}
For some reason hyperoperation explicitly rejects use of Associatives on either the right or left when coupled with the other side being a List value.
Having pursued the "blame" (tracking who made what changes) I arrived at the commit "Die on Associative <<op>> Iterable" which says:
This can never work due to the random order of keys in the Associative.
This used to die before, but with a very LTA error about a Pair.new()
not finding a suitable candidate.
Perhaps this behaviour could be refined so that the determining factor is, first, whether an operand does the Iterable role, and then if it does, and is Associative, it dies, but if it isn't, it's accepted as a single item?
A search for "can never work reliably" in GH/rakudo/rakudo issues yields zero matches.
Maybe file an issue? (Update: I filed "RFC: Allow use of hyperoperators with an Associative that does not do Iterable role instead of dying with "can never work reliably".)
For now we need to find some other technique to stop a non-Iterable Associative being rejected. Here I use a Capture literal:
my %hash = @keys »=>» \($set);
This yields: {a => \(SetHash.new("b","a","c")), b => \(SetHash.new("b","a","c")), ....
Adding a custom op unwraps en passant:
sub infix:« my=> » ($lhs, $rhs) { $lhs => $rhs[0] }
my %hash = @keys »my=>» \($set);
This yields the desired outcome: {a => SetHash(a b c), b => SetHash(a b c), ....
my %hash = ({ $_ => $set.clone } for @keys);
(The parens seem to be needed so it can tell that the curlies are a block instead of a Hash literal...)
No. That particular code in curlies is a Block regardless of whether it's in parens or not.
More generally, Raku code of the form {...} in term position is almost always a Block.
For an explanation of when a {...} sequence is a Hash, and how to force it to be one, see my answer to the Raku SO Is that a Hash or a Block?.
Without the parens you've written this:
my %hash = { block of code } for @keys
which attempts to iterate @keys, running the code my %hash = { block of code } for each iteration. The code fails because you can't assign a block of code to a hash.
Putting parens around the ({ block of code } for @keys) part completely alters the meaning of the code.
Now it runs the block of code for each iteration. And it concatenates the result of each run into a list of results, each of which is a Pair generated by the code $_ => $set.clone. Then, when the for iteration has completed, that resulting list of pairs is assigned, once, to my %hash.
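The "collect one pair per key, then assign once" shape described above can be sketched in Python as a rough analogy (Python rather than Raku; `keys`, `template`, `table`, and `shared` are hypothetical names). It also shows why a per-key copy, like the `.clone` in the original code, matters when the values are mutable:

```python
keys = ["a", "b", "c"]
template = {"x", "y"}

# One pair per key is collected first; the mapping is then built in a single step.
pairs = [(key, set(template)) for key in keys]  # set(...) makes a fresh copy per key
table = dict(pairs)

table["a"].add("z")
print(table["b"] == {"x", "y"})  # True: the other copies are unaffected

# Without the copy, every key refers to the same set object (Python semantics).
shared = {key: template for key in keys}
print(shared["a"] is shared["b"])  # True
```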

How to add same method with 2 different names in GNU Smalltalk?

How can I have a class expose the same method with 2 different names?
E.g. that the asDescription function does the same thing / re-exports the asString function without simply copy-pasting the code.
Object subclass: Element [
| width height |
Element class >> new [
^super new init.
]
init [
width := 0.
height := 0.
]
asString [
^ 'Element with width ', width, ' and height ', height.
]
asDescription [ "???" ]
]
In Smalltalk you usually implement #printOn: and get #asString from the inherited version of it which goes on the lines of
Object >> asString
| stream |
stream := '' writeStream.
self printOn: stream.
^stream contents
The actual implementation of this method may be slightly different in your environment, the idea remains the same.
As this is given, it is usually a good idea to implement #printOn: rather than #asString. In your case you would have it implemented as
Element >> printOn: aStream
aStream
nextPutAll: 'Element with width ';
nextPutAll: width asString;
nextPutAll: ' and height ';
nextPutAll: height asString
and then, as JayK and lurker indicated,
Element >> asDescription
^self asString
In other words, you (usually) don't want to implement #asString but #printOn:. This approach is better because it takes advantage of the inheritance and ensures consistency between #printOn: and #asString, which is usually expected. In addition, it will give you the opportunity to start becoming familiar with Streams, which play a central role in Smalltalk.
Note by the way that in my implementation I've used width asString and height asString. Your code attempts to concatenate (twice) a String with a Number:
'Element with width ', width, ' and height ', height.
which won't work as you can only concatenate instances of String with #,.
In most of the dialects, however, you can avoid sending #asString by using #print: instead of #nextPutAll:, something like:
Element >> printOn: aStream
aStream
nextPutAll: 'Element with width ';
print: width;
nextPutAll: ' and height ';
print: height
which is a little bit less verbose and therefore preferred.
One last thing. I would recommend replacing the first line above with these two:
nextPutAll: self class name;
nextPutAll: ' with width ';
instead of hardcoding the class name. This would prove to be useful if in the future you subclass Element because you will have no need to tweak #printOn: and any of its derivatives (e.g., #asDescription).
Final thought: I would rename the selector #asDescription to be #description. The preposition as is intended to convert an object to another of a different class (this is why #asString is ok). But this doesn't seem to be the case here.
Addendum: Why?
There is a reason why #asString is implemented in terms of #printOn:, and not the other way around: generality. While the effort (code complexity) is the same, #printOn: is clearly a winner because it will work with any character Stream. In particular, it will work with no modification whatsoever with
Files (instances of FileStream)
Sockets (instances of SocketStream)
The Transcript
In other words, by implementing #printOn: one gets #asString for free (inheritance) and --at the same time-- the ability to dump a representation of the object on files and sockets. The Transcript is particularly interesting because it supports the Stream protocol for writing, and thus can be used for testing purposes before sending any bytes to external devices.
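The same layering can be sketched in Python as an analogy (this is not Smalltalk; `print_on`, `as_string`, `description`, and the `Box` subclass are hypothetical names chosen to mirror the selectors discussed above). The stream-based method is the primitive, the string conversion is derived from it, and the second name simply delegates:

```python
import io
import sys


class Element:
    def __init__(self):
        self.width = 0
        self.height = 0

    def print_on(self, stream):
        # Counterpart of #printOn:, writing to any text stream; the class name is not hardcoded.
        stream.write(f"{type(self).__name__} with width {self.width} and height {self.height}")

    def as_string(self):
        # Counterpart of the inherited #asString: print onto an in-memory stream.
        stream = io.StringIO()
        self.print_on(stream)
        return stream.getvalue()

    def description(self):
        # Second name for the same behavior: delegate instead of copy-pasting.
        return self.as_string()


class Box(Element):
    pass


Element().print_on(sys.stdout)  # works with any writable text stream
```

Because only `print_on` knows the format, `Box` picks up the right text in `as_string` and `description` without overriding anything.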
Remember!
In Smalltalk, the goal is to have objects whose behavior is simple and general at once, not just simple!
As lurker wrote in the comments, send the asString message in asDescription.
asDescription
^ self asString
This is usually done to expose additional interfaces/protocols from a class, for compatibility or as a built-in adapter. If you create something new that does not have to fit in anywhere else, consider sticking to just one name for each operation.
Edit: if you really are after the re-export semantics and do not want the additional message sends involved in the delegation above, there might be a way to put the CompiledMethod of asString in the method dictionary of the class a second time under the other name. But I am not sure that this would work, nor do I know the GNU Smalltalk protocol for manipulating the method dictionary. Have a look at the documentation of the Behavior class. Also, I would not consider this programming in Smalltalk, but tinkering with the system.

Hash with Array values in Perl 6

What's going on here?
Why are %a{3} and %a{3}.Array different if %a has Array values and %a{3} is an Array?
> my Array %a
{}
> %a{3}.push("foo")
[foo]
> %a{3}.push("bar")
[foo bar]
> %a{3}.push("baz")
[foo bar baz]
> .say for %a{3}
[foo bar baz]
> %a{3}.WHAT
(Array)
> .say for %a{3}.Array
foo
bar
baz
The difference being observed here is the same as with:
my $a = [1,2,3];
.say for $a; # [1 2 3]
.say for $a.Array; # 1\n2\n3\n
The $ sigil can be thought of as meaning "a single item". Thus, when given to for, it will see that and say "aha, a single item" and run the loop once. This behavior is consistent across for and operators and routines. For example, here's the zip operator given arrays and then itemized arrays:
say [1, 2, 3] Z [4, 5, 6]; # ((1 4) (2 5) (3 6))
say $[1, 2, 3] Z $[4, 5, 6]; # (([1 2 3] [4 5 6]))
By contrast, method calls and indexing operations will always be called on what is inside of the Scalar container. The call to .Array is actually a no-op since it's being called on an Array already, and its interesting work is actually in the act of the method call itself, which is unwrapping the Scalar container. The .WHAT is like a method call, and is telling you about what's inside of any Scalar container.
The values of an array and a hash are - by default - Scalar containers which in turn hold the value. However, the .WHAT used to look at the value was hiding that, since it is about what's inside the Scalar. By contrast, .perl [1] makes it clear that there's a single item:
my Array %a;
%a{3}.push("foo");
%a{3}.push("bar");
say %a{3}.perl; # $["foo", "bar"]
There are various ways to remove the itemization:
%a{3}.Array # Identity minus the container
%a{3}.list # Also identity minus the container for Array
@(%a{3}) # Short for %a{3}.cache, which is same as .list for Array
%a{3}<> # The most explicit solution, using the de-itemize op
|%a{3} # Short for `%a{3}.Slip`; actually makes a Slip
I'd probably use for %a{3}<> { } in this case; it's both shorter than the method calls and makes clear that we're doing this purely to remove the itemization rather than a coercion.
While for |%a{3} { } also works fine and is visually nice, it is the only one that doesn't optimize down to simply removing something from its Scalar container, and instead makes an intermediate Slip object, which is liable to slow the iteration down a bit (though depending on how much work is being done by the loop, that could well be noise).
[1] Based on what I wrote, one may wonder why .perl can recover the fact that something was itemized. A method call $foo.bar is really doing something like $foo<>.^find_method('bar')($foo). Then, in a method bar() { self }, the self is bound to the thing the method was invoked on, removed from its container. However, it's possible to write method bar(\raw-self:) { } to recover it exactly as it was provided.
The issue is Scalar containers do DWIM indirection.
%a{3} is bound to a Scalar container.
By default, if you refer to the value or type of a Scalar container, you actually access the value, or type of the value, contained in the container.
In contrast, when you refer to an Array container as a single entity, you do indeed access that Array container, no sleight of hand.
To see what you're really dealing with, use .VAR which shows what a variable (or element of a composite variable) is bound to rather than allowing any container it's bound to to pretend it's not there.
say %a{3}.VAR ; # $["foo", "bar", "baz"]
say %a{3}.Array.VAR ; # [foo bar baz]
This is a hurried explanation. I'm actually working on a post specifically focusing on containers.

Is it possible to create LOCAL variable dynamically in rebol / red?

It is easy to create GLOBAL variables dynamically in rebol / red with set like
i: 1
myvarname: rejoin ["var" i]
set to-word myvarname 10
var1
but then var1 is global. What if I want to create var1 dynamically inside a function and make it LOCAL so as to avoid collision with some global variables of same name ?
In javascript it is possible:
How to declare a dynamic local variable in Javascript
Not sure it is possible in rebol/red ?
In Red you have function, in Rebol2 you have funct. Both create local variable words automatically. Here is an example for Rebol2:
>> for num 1 100 1 [
[ set to-word rejoin ["f" num] funct [] compose/deep [
[ print [ "n =" n: (num) ]
[ ]
[ ]
>> f1
n = 1
>> f2
n = 2
>> n
** Script Error: n has no value
** Near: n
You can see how it is done with source funct.
In Rebol, there is USE:
x: 10
word: use [x] [
x: 20
print ["Inside the use, x is" x]
'x ;-- leak the word with binding to the USE as evaluative result
]
print ["Outside the use, plain x is" x]
print ["The leaked x from the use is" get word]
That will give you:
Inside the use, x is 20
Outside the use, plain x is 10
The leaked x from the use is 20
One should be forewarned that the way this works is it effectively does a creation like make object! [x: none]. Then it does a deep walk of the body of the USE, looking for ANY-WORD! that are named x (or X, case doesn't matter)...and binds them to that OBJECT!.
This has several annoying properties:
The enumeration and update of bindings takes time. If you're in a loop it will take this time each visit through the loop.
The creation of the OBJECT! makes two series nodes, one for tracking keys (x) and one for tracking vars (20). Again if you are in a loop, the two series nodes will be created each time through that loop. As the GET outside the loop shows, these nodes will linger until the garbage collector decides they're not needed anymore.
You might want to say use [x] code and not disrupt the bindings in code, hence the body would need to be deep copied before changing it.
The undesirable properties of deep binding led Red to change the language semantics of constructs like FOR-EACH. It currently has no USE construct either, perhaps considered best to avoid for some of the same reasoning.
(Note: New approaches are being investigated on the Rebol side for making the performance "acceptable cost", which might be good enough to use in the future. It would be an evolution of the technique used for specific binding).

Is it possible to implement DO + function in pure REBOL?

When DO is followed by a function, that function is executed and the remaining values are consumed as arguments according to the arity of the given function, e.g.,
do :multiply 3 4
multiply 3 4
These two statements are identical in their effects. But I think DO + function receives special treatment by the REBOL interpreter, because I don't believe it's possible to implement your own DO (with the exact same syntax) in pure REBOL, e.g.,
perform: func [f [any-function!]] [
; What goes here?
]
Is this correct?
Clarification
I am not asking about the DO dialect. This is not a "beginner" question. I understand REBOL's general syntax very, very well: Bindology (an old blog post I did on it), the implications of its homoiconicity, the various flavors of words, and all the rest. (For example, here is my implementation of Logo's cascade in REBOL. While I'm at it, why not plug my Vim syntax plug-in for REBOL.)
I'm asking something more subtle. I'm not sure how I can phrase it more clearly than I already have, so I'll ask you to read my original question more carefully. I want to achieve a function that, like DO, has the following capability:
do :multiply 3 4
double: func [n] [n * 2]
do :double 5
Notice how the syntax do :double or do :multiply consumes the appropriate number of REBOL values after it. This is the key to understanding what I'm asking. As far as I can tell, it is not possible to write your own REBOL function that can DO this.
You'll have answered this question when you can write your own function in pure REBOL that can be substituted for DO in the examples above—without dialects, blocks, or any other modifications—or explain why it can't be done.
The cause of the behavior you are seeing is specifically this line of code for the Rebol native DO.
/***********************************************************************
**
*/ REBNATIVE(do)
/*
***********************************************************************/
{
REBVAL *value = D_ARG(1);
switch (VAL_TYPE(value)) {
/* ... */
case REB_NATIVE:
case REB_ACTION:
case REB_COMMAND:
case REB_REBCODE:
case REB_OP:
case REB_CLOSURE:
case REB_FUNCTION:
VAL_SET_OPT(value, OPTS_REVAL); /* <-- that */
return R_ARG1;
This OPTS_REVAL can be found in sys-value.h, where you'll find some other special control bits...like the hidden "line break" flag:
// Value option flags:
enum {
OPTS_LINE = 0, // Line break occurs before this value
OPTS_LOCK, // Lock word from modification
OPTS_REVAL, // Reevaluate result value
OPTS_UNWORD, // Not a normal word
OPTS_TEMP, // Temporary flag - variety of uses
OPTS_HIDE, // Hide the word
};
So the way the DO native handles a function is to return a kind of "activated" function value. But you cannot make your own values with this flag set in user code. The only place in the entire codebase that sets the flag is this snippet in the DO native.
It looks like something that could be given the axe, as APPLY does this more cleanly and within the definitions of the system.
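To make the role of the re-evaluate flag more concrete, here is a toy model in Python. It is only a sketch of the idea and not how Rebol's evaluator is actually written: `GetWord`, `Reval`, `arity`, and `evaluate` are hypothetical names, and `perform` stands in for an ordinary user function that cannot set the flag.

```python
import inspect


class GetWord:
    """Models :name, which fetches a function value without calling it."""
    def __init__(self, fn):
        self.fn = fn


class Reval:
    """Models a function value carrying the re-evaluate flag."""
    def __init__(self, fn):
        self.fn = fn


def do(value):
    # In this toy model only the "native" DO can produce a flagged value.
    return Reval(value) if callable(value) else value


def perform(value):
    # An ordinary user function: it can only hand the value back unflagged.
    return value


def arity(fn):
    return len(inspect.signature(fn).parameters)


def evaluate(values):
    """Evaluate a flat sequence; each call consumes as many following values as its arity."""
    pos = 0

    def call(fn):
        args = [step() for _ in range(arity(fn))]
        result = fn(*args)
        if isinstance(result, Reval):  # flagged result: evaluate it again as a call
            return call(result.fn)
        return result

    def step():
        nonlocal pos
        item = values[pos]
        pos += 1
        if isinstance(item, GetWord):
            return item.fn
        if callable(item):
            return call(item)
        return item

    result = None
    while pos < len(values):
        result = step()
    return result


def multiply(a, b):
    return a * b


print(evaluate([do, GetWord(multiply), 3, 4]))       # 12
print(evaluate([perform, GetWord(multiply), 3, 4]))  # 4
```

In the second call `perform` returns the function as a plain value, so the evaluator moves on to `3` and `4` as separate expressions and the last one becomes the result.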
Yes, in Rebol 3:
>> perform: func [f [any-function!]] [return/redo :f]
>> perform :multiply 3 4
== 12
>> double: func [n] [n * 2]
>> perform :double 5
== 10
You might find it interesting to read: Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?
This is a good question, and I will try to explain it to the best of my understanding.
The two statements above are identical in effect, but it is worth diving deeper into what is happening.
The :word syntax is known as a get-word! and is equivalent to writing get 'word. So another way of writing this would be
do get 'multiply 3 4
multiply is just another word! to Rebol.
The do dialect is the default dialect used by the Rebol interpreter.
If you want to implement your own version of do you need to be evaluating your code/data yourself, not using do. Here is a trivial example:
perform: func [ code [block!]] [ if equal? code [ 1 ] [ print "Hello" ] ]
This defines perform as a function which takes a block of code. The "language" or dialect it is expecting is trivial in that the syntax is just perform an action (print "hello") if the code passed is the integer 1.
If this was called as
perform [ multiply 3 4 ]
nothing would happen as code is not equal to 1.
The only way it would do something is if it was passed a block! containing 1.
>> perform [ 1 ]
Hello
Expanding on this slightly:
perform: func [ code [block!]] [ if equal? code [ multiply 3 4 ] [ 42 ] ]
would give us a perform which behaves very differently.
>> perform [ multiply 3 4 ]
== 42
You can easily write your own do to evaluate your dialect, but if you run it directly then you are already running within the do dialect so you need to call a function of some kind to bootstrap your own dialect.
This jumping between dialects is a normal way to write Rebol code, a good example of this being the parse dialect
parse [ 1 2.4 3 ] [ some number! ]
which has its own syntax and even reuses existing do dialect words such as skip but with a different meaning.