Variable substitution within braces in Tcl - variables

Correct me wherever I am wrong.
When we use the variables inside braces, the value won't be replaced during evaluation and simply passed on as an argument to the procedure/command. (Yes, some exception are there like expr {$x+$y}).
Consider the following scenarios,
Scenario 1
% set a 10
10
% if {$a==10} {puts "value is $a"}
value is 10
% if "$a==10" "puts \"value is $a\""
value is 10
Scenario 2
% proc x {} {
set c 10
uplevel {set val $c}
}
%
% proc y {} {
set c 10
uplevel "set val $c"
}
% x
can't read "c": no such variable
% y
10
% set val
10
%
In both of the scenarios, we can see that the variable substitution is performed on the body of the if loop (i.e. {puts "value is $a"}), whereas in the uplevel, it is not (i.e. {set val $c}), based on the current context.
I can see it as if like they might have access it via upvar kind of stuffs may be. But, why it has to be different among places ? Behind the scene, why it has to be designed in such this way ? Or is it just a conventional way how Tcl works?

Tcl always works exactly the same way with exactly one level of interpretation, though there are some cases where there is a second level because a command specifically requests it. The way it works is that stuff inside braces is never interpolated or checked for word boundaries (provided those braces start at the start of a “word”), stuff in double quotes is interpolated but not parsed for word boundaries (provided they start a word), and otherwise both interpolation and word boundary scanning are done (with the results of interpolation not scanned).
But some commands send the resulting word through again. For example:
eval {
puts "this is an example with your path: $env(PATH)"
}
The rule applies to the outer eval, but that concatenates its arguments and then sends the results into Tcl again. if does something similar with its body script except there's no concatenation, and instead there's conditional execution. proc also does the same, except it delays running the code until you call the procedure. The expr command is like eval, except that sends the script into the expression evaluation engine, which is really a separate little language. The if command also uses the expression engine (as do while and for). The expression language understands $var (and […]) as well.
So what happens if you do this?
set x [expr $x + $y]
Well, first we parse the first word out, set, then x, then with the third word we start a command substitution, which recursively enters the parser until the matching ] is found. With the inner expr, we first parse expr, then $x (reading the x variable), then +, then $y. Now the expr command is invoked with three arguments; it concatenates the values with spaces between them and sends the result of the concatenation into the expression engine. If you had x previously containing $ab and y containing [kaboom], the expression to evaluate will be actually:
$ab + [kaboom]
which will probably give you an error about a non-existing variable or command. On the other hand, if you did expr {$x + $y} with the braces, you'll get an addition applied to the contents of the two variables (still an error in this case, because neither looks like a number).
You're recommended to brace your expressions because then the expression that you write is the expression that will be evaluated. Otherwise, you can get all sorts of “unexpected” behaviours. Here's a mild example:
set x {12 + 34}
puts [expr $x]
set y {56 + 78}
puts [expr $y]
puts [expr $x * $y]
Remember, Tcl always works the same way. No special cases. Anything that looks like a special cases is just a command that implements a little language (often by calling recursively back into Tcl or the expression engine).

In addition to Donal Fellows's answer:
In scenario 2, in x the command uplevel {set val $c} is invoked, and fails because there is no such variable at the caller's level.
In y, the equivalent of uplevel {set val 10} is invoked (because the value of c is substituted when the command is interpreted). This script can be evaluated at the caller's level since it doesn't depend on any variables there. Instead, it creates the variable val at that level.
It has been designed this way because it gives the programmer more choices. If we want to avoid evaluation when a command is prepared for execution (knowing that the command we invoke may still evaluate our variables as it executes), we brace our arguments. If we want evaluation to happend during command preparation, we use double quotes (or no form of quoting).
Now try this:
% set c 30
30
% x
30
% y
10
If there is such a variable at the caller's level, x is a useful command for setting the variable val to the value of c, while y is a useful command for setting the variable val to the value encapsulated inside y.

Related

Raku zip operator & space

I found this one liner which joins same lines from multiple files.
How to add a space between two lines?
If line 1 from file A is blue and line 1 from file B is sky, a get bluesky,
but need blue sky.
say $_ for [Z~] #*ARGS.map: *.IO.lines;
This is using the side-effect of .Str on a List to add spaces between the elements:
say .Str for [Z] #*ARGS.map: *.IO.lines
The Z will create 2 element List objects, which the .Str will then stringify.
Or even shorter:
.put for [Z] #*ARGS.map: *.IO.lines
where the .put will call the .Str for you and output that.
If you want anything else inbetween, then you could probably use .join:
say .join(",") for [Z] #*ARGS.map: *.IO.lines
would put comma's between the words.
Note: definitely don't do this in anything approaching real code. Use (one of) the readable ways in Liz's answer.
If you really want to use the same structure as [Z~] – that is, an operator modified by the Zip meta-operator, all inside the Reduce meta-operator – you can. But it's not pretty:
say $_ for [Z[&(*~"\x20"~*)]] #*ARGS.map: *.IO.lines
Here's how that works: Z can take an operator, so we need to give it an operator that concatenates two strings with a space in between. But there's no operator like that built in. No problem – we can turn any function into an infix operator by surrounding it with [ ] (the infix form).
So all we need is a function that joins two strings with a space between them. That also doesn't exist, but we can create one: * ~ ' ' ~ *. So, we should be able to shove that into our infix form and pass the whole thing to the Zip operator Z[* ~ ' ' ~ *].
Except that doesn't work. Because Zip isn't really expecting an infix form, we need to give it a hint that we're passing in a function … that is, we need to put our function into a callable context with &( ), which gets us to Z[&(* ~ ' ' ~ *)].
That Zip expression does what we want when used in infix position – but it still doesn't work once we put it back into the Reduce/[ ] operator that we want to use. This time, the problem is due to something that may or may not be a bug – even after discussing it with jnthn on github, I'm still not sure whether this behavior is intended/correct.
Specifically, the issue is that the Reduction meta-operator doesn't allow whitespace – even in strings. Thus, we need to replace * ~ ' ' ~ * with *~"\c[space]"~* or *~"\x20"~* (where \x20 is the hex value of in Unicode/ASCII). Since we've come this far into obfuscated code, I figure we might as well go all the way. And that gets us back to
say $_ for [Z[&(*~"\x20"~*)]] #*ARGS.map: *.IO.lines
Again, I'm not recommending that you do this. (And, if you do, you could at least make it slightly more readable by saving the * ~ ' ' ~ * function as a named variable in the previous line, which at least gets you whitespace. But, really, just use one of Liz's suggestions).
I just thought this gives a useful window into some of the darker and more interesting corners of Raku's strangely consistent behavior.

String interpolation in Perl6

I have difficulty figuring out why the statement
say "\c500";
produces the character 'Ǵ' on my screen as expected, while the following statements give me an error message at compile time ("Unrecognized \c character"):
my $i = 500;
say "\c$i";
even though
say "$i"; # or 'say $i.Str;' for that matter
produces "500" (with "$i".WHAT indicating type Str).
You'll have to use $i.chr, which is documented here. \c is handled specially within strings, and does not seem to admit anything that is not a literal.
The string literal parser in Perl 6 is a type of domain specific language.
Basically what you write gets compiled similarly to the rest of the language.
"abc$_"
&infix:«~»('abc',$_.Str)
In the case of \c500, you could view it as a compile-time constant.
"\c500"
(BEGIN 500.chr)
Actually it is more like:
(BEGIN 500.HOW.find_method_qualified(Int,500,'chr').(500))
Except that the compiler for string literals actually tries to compile it to an abstract syntax tree, but is unable to because there hasn't been code added to handle this case of \c.
Even if there was, \c is effectively compiled to run at BEGIN time, which is before $_ has a value.
Also \c is used for more than .chr
"\c9" eq "\c[TAB]" eq "\cI" eq "\t"
(Note that \cI represents the character you would get by typing Cntrl+Alt+i on a posix platform)
So which of these should \c$_ compile to?
$_.chr
$_.parse-names
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.index($_).succ.chr
If you want .chr you can write it as one of the following. (spaces added where they are allowed)
"abc$_.chr( )def"
"abc{ $_.chr }def"
"abc{ .chr }def"
'abc' ~ $_.chr ~ 'def'

How to pass functions as parameters in REBOL

I have attempted to pass a function as a parameter in the REBOL programming language, but I haven't figured out the correct syntax yet:
doSomething: func [a b] [
a b
a b
]
doSomething print "hello" {This should pass print as the first argument and "hello" as the second argument.}
This produces an error, since the print function is being called instead of being passed:
hello
*** ERROR
** Script error: doSomething does not allow unset! for its a argument
** Where: try do either either either -apply-
** Near: try load/all join %/users/try-REBOL/data/ system/script/args...
Is it possible to pass the print function as a parameter instead of calling the print function?
I've found the solution: I only need to add : before the name of the function that is being passed as a parameter.
Here, the :print function is being passed as a parameter instead of being invoked with "hello" as its argument:
doSomething: func [a b] [
a b
a b
]
doSomething :print "hello" {This should pass print as the first argument and "hello" as the second argument.}
You have discovered that by the nature of the system, when the interpreter comes across a WORD! symbol type which has been bound to a function, it will invoke the function by default. The default interpreter seeing a GET-WORD! symbol type, on the other hand, suppresses invocation and just returns the value the word is bound to.
The evaluator logic is actually rather straightforward for how it reacts when it sees a certain symbol type. Another way of suppressing invocation is the single quote, which will give you a LIT-WORD! symbol... but these become evaluated as the corresponding WORD! when it sees them:
>> some-word: 'print
>> type? some-word
== word!
In fact, the behavior of a GET-WORD! when the evaluator sees it is equivalent to using the GET function with a WORD!
doSomething: func [a b] [
a b
a b
]
doSomething get 'print "hello" {Message}
The interpreter sees the LIT-WORD! 'print and evaluates that into the WORD! for print, which is then passed to GET, which gives you a FUNCTION! back.
Simplicity of the interpreter logic is why you get things like:
>> a: b: c: 10 print [a b c]
10 10 10
Due to the nature of how it handles a SET-WORD! symbol followed by complete expressions. That yields also the following code printing out 20:
if 10 < a: 20 [
print a
]
Other languages achieve such features with specialized constructs (like multiple initialization, etc.) But Rebol's logic is simpler.
Just wanted to elaborate a bit to help explain what you were looking at. My answer to this other question might provide some more insight into the edge cases, historical and future: "When I use error? and try, err need a value"

PostScript mark token

In PostScript if you have
[4 5 6]
you have the following tokens:
mark integer integer integer mark
The stack goes like this:
| mark |
| mark | integer |
| mark | integer | integer |
| mark | integer | integer | integer |
| array |
Now my question:
Is the ]-mark operator a literal object or an executable object?
Am I correct that the [-mark is a literal object (just data) and that the ]-mark is an executable object (because you always need to create an array when you see this ]-mark operator) ?
PostScript Language Reference Manual section 3.3.2 gives me:
The [ and ] operators, when executed, produce a literal array object with the en-closed objects as elements. Likewise, << and >> (LanguageLevel 2) produce a
literal dictionary object.
That is not clear for me if both [ ] operators are executable or only the ] operator.
Summary.
All of these special tokens, [, ], <<, >>, come out of the scanner as executable names. [ and << are defined to yield a marktype object (so they are not operators per se, but they are executable names defined in systemdict where all the operators live). ] and >> are defined as procedures or operators which are executed just like any other procedure or operator. These use the counttomark operator to find the opening bracket. But all of these tokens are treated specially by the scanner, which recognizes them without surrounding whitespace since they are part of its delimiter set.
Details.
It all depends on when you look at it. Let's trace through what the interpreter does with these tokens. I'm going to illustrate this with a string, but it works just the same with a file.
So if you have an input string
([4 5 6]) cvx exec
cvx makes a literal object executable. The program stream is a file object also labeled executable. exec pushes an object on the Execution Stack, where it is encountered by the interpreter on the next iteration of the inner interpreter processing loop. When executing the program stream, the executable file object is topmost on the Execution Stack.
The interpreter uses token to call the scanner. The scanner skips initial whitespace, then reads all non-whitespace characters up to the next delimiter, then attempts to interpret the string as a number, and failing that it becomes an executable name. The brackets are part of the set of delimiters, and so are termed 'self-delimiting'. So the scanner reads the one bracket character, stops reading because it's a delimiter, discovers it cannot be a number, so it yields an executable name.
Top of Exec Stack | Operand Stack
(4 5 6]) [ |
Next, the interpreter loop executes anything executable (unless it's an array). Executing a token means loading it from the dictionary, and then executing the definition if it's executable. [ is defined as a -mark- object, same as the name mark is defined. It's not technically an operator or a procedure, it's just a definition. Automatic loading happens because the name comes out of the scanner with the executable flag set.
(4 5 6]) | -mark-
The scanner then yields 4, 5, and 6 which are numbers and get pushed straight to the operand stack. 6 is delimited by the ] which is pushed back on the stream.
(]) | -mark- 4 5 6
The interpreter doesn't execute the numbers since they are not executable, but it would be just the same if it did. The action for executing a number is simply to push it on the stack.
Then, finally the scanner encounters the right bracket ]. And that's where the magic happens. Self-delimited, it doesn't need to be followed by any whitespace. The scanner yields the executable name ] and the interpreter executes it by loading and it finds ...
{ counttomark array astore exch pop }
Or maybe an actual operator that does this. But, yeah. counttomark yields the number of elements. array creates an array of that size. astore fills an array with elements from the stack. And exch pop to discard that pesky mark once and for all.
For dictionaries, << is exactly the same as [. It drops a mark. Then you line up some key-value pairs, and >> is procedure that does something to effect of ...
{ counttomark dup dict begin 2 idiv { def } repeat pop currentdict end }
Make a dictionary. Define all the pairs. Pop the mark. Yield the dictionary. This version of the procedure tries to create a fast dictionary by making it double-sized. Move the 2 idiv to before dup to make a small dictionary.
So, to get philosophical, counttomark is the operator you're using. And it requires a special object-type that isn't used for anything else, the marktype object, -mark-. The rest is just syntactical sugar to let you access this stack-counting ability to create linear data-structures.
Appendix
Here's a procedure that models the interpreter loop reading from currentfile.
{currentfile token not {exit} if dup type /arraytype ne {exec} if }loop
exec is responsible for loading (and further executing) any executable names. You can see from this that token really is the name of the scanner; and that procedures (arrays) directly encountered by the interpreter loop are not executed (type /arraytype ne {exec} if).
Using token on strings lets you do really cool stuff, however. For example, you can dynamically construct procedure bodies with substituted names. This is very much like a lisp macro.
/makeadder { % n . { n add }
1 dict begin
/n exch def
({//n add}) token % () {n add} true
pop exch pop % {n add}
end
} def
token reads the entire procedure from the string, substituting the immediately-evaluated name //n with its currently defined value. Notice here that the scanner reads an executable array all at once, effectively executing [ ... ] cvx internally before returning (In certain interpreters, like my own xpost, this allows you to bypass the stack-size limits to build an array, because the array is built in separate memory. But Level 2 garbage collection makes this largely irrelevant).
There is also the bind operator which modifies a procedure by replacing operator names with the operator objects themselves. These tricks help you to factor-out name lookups in speed-critical procedures (like inner loops).
Both [ and ] are executable tokens. [ produces a mark object, ] creates an array of objects to the last mark

get in Object 'Func with Refinement in Rebol

Let's say I have
o: context [
f: func[message /refine message2][
print [message]
if refine [print message 2]
]
]
I can call it like this
do get in o 'f "hello"
But how can I do for the refinement ? something like this that would work
>> do get in o 'f/refine "hello" "world"
** Script Error: in expected word argument of type: any-word
** Near: do get in o 'f/refine
>>
I don't know if there's a way to directly tell the interpreter to use a refinement in invoking a function value. That would require some parameterization of do when its argument is a function! Nothing like that seems to exist...but maybe it's hidden somewhere else.
The only way I know to use a refinement is with a path. To make it clear, I'll first use a temporary word:
>> fword: get in o 'f
>> do compose [(to-path [fword refine]) "hello" "world"]
hello
world
What that second statement evaluates to after the compose is:
do [fword/refine "hello" "world"]
You can actually put function values into paths too. It gets rid of the need for the intermediary:
>> do compose [(to-path compose [(get in o 'f) refine]) "hello" "world"]
hello
world
P.S. you have an extra space between message and 2 above, where it should just be message2
Do this:
o/('f)/refine "hello" "world"
Parens in a path expression are evaluated if they correspond to object field or series pick/poke index references. That makes the above code equivalent to this:
apply get in o 'f ["hello" true "world"]
Note that apply arguments are positional, so you need to know the order the arguments were declared in. You can't do that trick with the function refinements themselves, so you have to use apply or create path expressions to evaluate if you want to parameterize the refinements of the function call.
Use the simple path o/f/refine