How on Raku do we fill out multi variables individually from corresponding regex captures - raku

How is the simplest Raku to fill out so many individual variables from a result regex captures in corresponding order
( just like Perl my ($a, $b $c, $d, $e)= 'hello' =~ m{ ^(h) (e) (l) (l) (o) }x
) ?
try such:
> my ($a, $b, $c, $arr);
[<$a $b $c>]= 'hello world' ~~ m{ (h)(e)(l)lo };
say "\n=$a\n=$b\n=$c"
Use of uninitialized value element of type Any in string context.
Methods .^name, .raku, .gist, or .say can be used to stringify it to something meaningful.
in block <unit> at <unknown file> line 1
...
fails.
Please show the correct and nearest similar way to Perl.

#my ($a, $b, $c, $d, $e) = 'hello' =~ m{ ^(h) (e) (l) (l) (o) }x; # Perl
my ($a, $b, $c, $d, $e) = 'hello' ~~ m{ ^(h) (e) (l) (l) (o) } .list; # Raku
Explaining the differences:
{ ... }x In Perl, white space in a regex is part of the pattern unless you append an x adverb/option. Not so in Raku.
.list. In Perl, matching operations generally return either one string, or a list of strings. It's the same in Raku, except:
The values returned are Match objects, not strings. (They conveniently automatically coerce to strings if treated as strings, but the difference sometimes matters, and greatly expands what Raku can make possible/easy with the returned matches.)
Many operations return a single Match object when Perl folk will expect a list. Your case is an example. But what exactly is a single object? You are a single person. But you're also more than that.
Some objects in Raku are a "parent" of "child" objects. (Match objects are a case in point.) Children are typically accessible by applying appropriate operations to a parent. Your case is an example. .list will often work as an idiomatic way to get at numbered children, .hash for named children. Another approach among many that will work if a given class/object supports it is subscripting:
say ('foo' ~~ m{ (.) (.) })[*]; # (「f」 「o」)
say ('foo' ~~ m{ (.) (.) $<I'm-named>=(.) })<I'm-named>; # 「o」
Why the additional complexity? For Match objects it's a trade off to make the tremendous additional power and utility of general parsing automatically and easily available. See Extracting from .bib file with Raku for more discussion of this.

You'll never do wrong following Raiph's answers. But I'm going to try and explain a bit why that works, and propose alternatives. A Match is a Capture, a beast that's essentially a list and an associative array at the same time. What you want is to have the result as a list, same as it happens in Perl. How do you extract the "list" part (or coerce it to a list)? Easy: use explicit coercion (as Raiph suggests) or try and put it in a list context. What creates a list context? Flattening does:
my ($a, $b, $c, $d, $e) = |('hello' ~~ m{ ^(h) (e) (l) (l) (o) });
List binding does too:
my ($a, $b, $c, $d, $e) := 'hello' ~~ m{ ^(h) (e) (l) (l) (o) };
The thing is that every one of the variables will still contain a Match; what you probably want is the self same thing as in Perl, a list of strings. We can get that with this combo:
my #array is Array[Str(Match)] = | ('hello' ~~ m{ ^(h) (e) (l) (l) (o) });
Where we get the list context in the right hand side, and by explicitly declaring an array with coercing elements, we get [h e l l o] as a result.

Related

Assignment destructuring and operator precedence

The documentation says that the comma operator has higher precedence than the assignment = operator, and this is specifically different than in Perl, so that we are allowed to remove parentheses in some contexts.
This allows us to do things like this:
my #array = 1, 2, 3;
What I don't understand is why when do something like this:
sub test() { return 1, 2 }
my ($a, $b);
$a, $b = test();
$b get assigned [1 2] while $a gets no value.
While I would assume that the following would be equivalent, because the comma operator is tighter than the assignment.
($a, $b) = test();
The semantics of Raku have a lot of subtlety and I guess I am thinking too much in terms of Perl.
Like raiph said in the comments, my original assumption that the comma operator has higher precedence than the assignment operator was false. And it was due to a problem in the rendering of the operator precedence table, which didn't presented operators in their precedence order.
This explains the actual behavior of Raku for my examples.
The = operator itself is always item assignment level, which is tighter than comma. However, it may apply a "sub-precedence", which is how it is compared to any further infixes in the expression that follow it. The term that was parsed prior to the = is considered, and:
In the case that the assignment is to a Scalar variable, then the = operator works just like any other item assignment precedence operator, which is tighter than the precedence of ,
In any other case, its precedence relative to following infixes is list prefix, which is looser than the precedence of ,
To consider some cases (first, where it doesn't impact anything):
$foo = 42; # $ sigil, so item assignment precedence after
#foo = 1; # # sigil, so list assignment precedence after
#foo[0] = 1; # postcircumfix (indexing operator) means list assignment after...
$foo[0] = 1; # ...always, even in this case
If we have a single variable on the left and a list on the right, then:
#foo = 1, 2; # List assignment into #foo two values
$foo = 1, 2; # Assignment of 1 into $foo, 2 is dead code
These apply with the = initializer (following a my $var declaration). This means that:
loop (my $i = 0, my $j = $end; $i < $end; $i++, $j--) {
...
}
Will result in $i being assigned 0 and $j being assigned $end.
Effectively, the rule means we get to have parentheses-free initialization of array and hash variables, while still having lists of scalar initializations work out as in the loop case.
Turning to the examples in the question. First, this:
($a, $b) = test();
Parses a single term, then encounters the =. The precedence when comparing any following infixes would be list prefix (looser than ,). However, there are no more infixes here, so it doesn't really matter.
In this case:
sub test() { return 1, 2 }
my ($a, $b);
$a, $b = test();
The precedence parser sees the , infix, followed by the = infix. The = infix in itself is tighter than comma (item assignment precedence); the sub-precedence is only visible to infixes parsed after the = (and there are none here).
Note that were it not this way, and the precedence shift applied to the expression as a whole, then:
loop (my $i = 0, my #lagged = Nil, |#values; $i < #values; $i++) {
...
}
Would end up grouped not as (my $i = 0), (my #lagged = Nil, |#values), but rather (my $i = 0, my #lagged) = Nil, |#values, which is rather less useful.

IIFE alternatives in Raku

In Ruby I can group together some lines of code like so with a begin block:
x = begin
puts "Hi!"
a = 2
b = 3
a + b
end
puts x # 5
it's immediately evaluated and its value is the last value of the block (a + b here) (Javascripters do a similar thing with IIFEs)
What are the ways to do this in Raku? Is there anything smoother than:
my $x = ({
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
})();
say $x; # 5
Insert a do in front of the block. This tells Raku to:
Immediately do whatever follows the do on its right hand side;
Return the value to the do's left hand side:
my $x = do {
put "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
That said, one rarely needs to use do.
Instead, there are many other IIFE forms in Raku that just work naturally without fuss. I'll mention just two because they're used extensively in Raku code:
with whatever { .foo } else { .bar }
You might think I'm being silly, but those are two IIFEs. They form lexical scopes, have parameter lists, bind from arguments, the works. Loads of Raku constructs work like that.
In the above case where I haven't written an explicit parameter list, this isn't obvious. The fact that .foo is called on whatever if whatever is defined, and .bar is called on it if it isn't, is both implicit and due to the particular IIFE calling behavior of with.
See also if, while, given, and many, many more.
What's going on becomes more obvious if you introduce an explicit parameter list with ->:
for whatever -> $a, $b { say $a + $b }
That iterates whatever, binding two consecutive elements from it to $a and $b, until whatever is empty. If it has an odd number of elements, one might write:
for whatever -> $a, $b? { say $a + $b }
And so on.
Bottom line: a huge number of occurrences of {...} in Raku are IIFEs, even if they don't look like it. But if they're immediately after an =, Raku defaults to assuming you want to assign the lambda rather than immediately executing it, so you need to insert a do in that particular case.
Welcome to Raku!
my $x = BEGIN {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
I guess the common ancestry of Raku and Ruby shows :-)
Also note that to create a constant, you can also use constant:
my constant $x = do {
say "Hi!";
my $a = 2;
my $b = 3;
$a + $b;
}
If you can have a single statement, you can leave off the braces:
my $x = BEGIN 2 + 3;
or:
my constant $x = 2 + 3;
Regarding blocks: if they are in sink context (similar to "void" context in some languages), then they will execute just like that:
{
say "Executing block";
}
No need to explicitely call it: it will be called for you :-)

Raku Ambiguous call to infix(Hyper: Dan::Series, Int)

I am writing a model Series class (kinda like the one in pandas) - and it should be both Positional and Associative.
class Series does Positional does Iterable does Associative {
has Array $.data is required;
has Array $.index;
### Construction ###
method TWEAK {
# sort out data-index dependencies
$!index = gather {
my $i = 0;
for |$!data -> $d {
take ( $!index[$i++] => $d )
}
}.Array
}
### Output ###
method Str {
$!index
}
### Role Support ###
# Positional role support
# viz. https://docs.raku.org/type/Positional
method of {
Mu
}
method elems {
$!data.elems
}
method AT-POS( $p ) {
$!data[$p]
}
method EXISTS-POS( $p ) {
0 <= $p < $!data.elems ?? True !! False
}
# Iterable role support
# viz. https://docs.raku.org/type/Iterable
method iterator {
$!data.iterator
}
method flat {
$!data.flat
}
method lazy {
$!data.lazy
}
method hyper {
$!data.hyper
}
# Associative role support
# viz. https://docs.raku.org/type/Associative
method keyof {
Str(Any)
}
method AT-KEY( $k ) {
for |$!index -> $p {
return $p.value if $p.key ~~ $k
}
}
method EXISTS-KEY( $k ) {
for |$!index -> $p {
return True if $p.key ~~ $k
}
}
#`[ solution attempt #1 does NOT get called
multi method infix(Hyper: Series, Int) is default {
die "I was called"
}
#]
}
my $s = Series.new(data => [rand xx 5], index => [<a b c d e>]);
say ~$s;
say $s[2];
say $s<b>;
So far pretty darn cool.
I can go dd $s.hyper and get this
HyperSeq.new(configuration => HyperConfiguration.new(batch => 64, degree => 1))
BUT (there had to be a but coming), I want to be able to do hyper math on my Series' elements, something like:
say $s >>+>> 2;
But that yields:
Ambiguous call to 'infix(Hyper: Dan::Series, Int)'; these signatures all match:
(Hyper: Associative:D \left, \right, *%_)
(Hyper: Positional:D \left, \right, *%_)
in block <unit> at ./synopsis-dan.raku line 63
How can I tell my class Series not to offer the Associative hyper candidate...?
Note: edited example to be a runnable MRE per #raiph's comment ... I have thus left in the minimum requirements for the 3 roles in play per docs.raku.org
After some experimentation (and new directions to consider from the very helpful comments to this SO along the way), I think I have found a solution:
drop the does Associative role from the class declaration like this:
class Series does Positional does Iterable {...}
BUT
leave the Associative role support methods in the body of the class:
# Associative role support
# viz. https://docs.raku.org/type/Associative
method keyof {
Str(Any)
}
method AT-KEY( $k ) {
for |$!index -> $p {
return $p.value if $p.key ~~ $k
}
}
method EXISTS-KEY( $k ) {
for |$!index -> $p {
return True if $p.key ~~ $k
}
}
This gives me the Positional and Associative accessors, and functional hyper math operators:
my $s = Series.new(data => [rand xx 5], index => [<a b c d e>]);
say ~$s; #([a => 0.6137271559776396 b => 0.7942959887386045 c => 0.5768018697817604 d => 0.8964323560788711 e => 0.025740150933493577] , dtype: Num)
say $s[2]; #0.7942959887386045
say $s<b>; #0.5768018697817604
say $s >>+>> 2; #(2.6137271559776396 2.7942959887386047 2.5768018697817605 2.896432356078871 2.0257401509334936)
While this feels a bit thin (and probably lacks the full set of Associative functions) I am fairly confident that the basic methods will give me slimmed down access like a hash from a key capability that I seek. And it no longer creates the ambiguous call.
This solution may be cheating a bit in that I know the level of compromise that I will accept ;-).
Take #1
First, an MRE with an emphasis on the M1:
class foo does Positional does Associative { method of {} }
sub infix:<baz> (\l,\r) { say 'baz' }
foo.new >>baz>> 42;
yields:
Ambiguous call to 'infix(Hyper: foo, Int)'; these signatures all match:
(Hyper: Associative:D \left, \right, *%_)
(Hyper: Positional:D \left, \right, *%_)
in block <unit> at ./synopsis-dan.raku line 63
The error message shows it's A) a call to a method named infix with an invocant matching Hyper, and B) there are two methods that potentially match that call.
Given that there's no class Hyper in your MRE, these methods and the Hyper class must be either built-ins or internal details that are leaking out.
A search of the doc finds no such class. So Hyper is undocumented Given that the doc has fairly broad coverage these days, this suggests Hyper is an internal detail. But regardless, it looks like you can't solve your problem using official/documented features.
Hopefully this bad news is still better than none.2
Take #2
Where's the fun in letting little details like "not an official feature" stop us doing what we want to do?
There's a core.c module named Hyper.pm6 in the Rakudo source repo.
A few seconds browsing that, and clicks on its History and Blame, and I can instantly see it really is time for me to conclude this SO answer, with a recommendation for your next move.
To wit, I suggest you start another SO, using this answer as its heart (but reversing my presentation order, ie starting by mentioning Hyper, and that it's not doc'd), and namechecking Liz (per Hyper's History/Blame), with a link back to your Q here as its background. I'm pretty sure that will get you a good answer, or at least an authoritative one.
Footnotes
1 I also tried this:
class foo does Positional does Associative { method of {} }
sub postfix:<bar>(\arg) { say 'bar' }
foo.new>>bar;
but that worked (displayed bar).
2 If you didn't get to my Take #1 conclusion yourself, perhaps that was was because your MRE wasn't very M? If you did arrive at the same point (cf "solution attempt #1 does NOT get called" in your MRE) then please read and, for future SOs, take to heart, the wisdom of "Explain ... any difficulties that have prevented you from solving it yourself".

Object Hash Lookup with `eqv`

Is there a way to use eqv to lookup a hash value without looping over the key-value pairs when using object keys?
It is possible to use object keys in a hash by specifying the key's type at declaration:
class Foo { has $.bar };
my Foo $a .= new(:bar(1));
my %h{Foo} = $a => 'A', Foo.new(:bar(2)) => 'B';
However, key lookup uses the identity operator === which will return the value only if it is the same object and not an equivalent one:
my Foo $a-prime .= new(:bar(1));
say $a eqv $a-prime; # True
say $a === $a-prime; # False
say %h{$a}; # A
say %h{$a-prime}; # (Any)
Looking at the documentation for "===", the last line reveals that the operator is based on .WHICH and that "... all value types must override method WHICH." This is why if you create two separate items with the same string value, "===" returns True.
my $a = "Hello World";
my $b = join " ", "Hello", "World";
say $a === $b; # True even though different items - because same value
say $a.WHICH ; # "Str|Hello World"
say $b.WHICH ; # (same as above) which is why "===" returns True
So, instead of creating your own container type or using some of the hooks for subscripts, you could instead copy the way "value types" do it - i.e. butcher (some what) the idea of identity. The .WHICH method for strings shown above simply returns the type name and contents joined with a '|'. Why not do the same thing;
class Foo {
has $.bar;
multi method WHICH(Foo:D:) { "Foo|" ~ $!bar.Str }
}
my Foo $a .= new(:bar(1));
my %h{Foo} = $a => 'A', Foo.new(:bar(2)) => 'B';
my Foo $a-prime .= new(:bar(1));
say $a eqv $a-prime; # True
say $a === $a-prime; # True
say %h{$a}; # A
say %h{$a-prime}; # A
There is a small cost, of course - the concept of identity for the instances of this class is, well - let's say interesting. What are the repercussions? The only thing that comes immediately to mind, is if you plan to use some sort of object persistence framework, its going to squish different instances that now look the same into one (perhaps).
Different objects with the same attribute data are going to be indistinguishable - which is why I hesitated before posting this answer. OTOH, its your class, its your app so, depending on the size/importance etc, it may be a fine way to do it.
If you don't like the buildin postcircumfix, provide your own.
class Foo { has $.bar };
my Foo $a .= new(:bar(1));
my %h{Foo} = $a => 'A', Foo.new(:bar(2)) => 'B';
multi sub postcircumfix:<{ }>(\SELF, WhateverCode $c) is raw {
gather for SELF.keys -> $k {
take $k if $c($k)
}
}
dd %h;
dd %h{* eqv $a};
OUTPUT
Hash[Any,Foo] %h = (my Any %{Foo} = (Foo.new(bar => 1)) => "A", (Foo.new(bar => 2)) => "B")
(Foo.new(bar => 1),).Seq

Determine type of a variable in Tcl

I'm looking for a way to find the type of a variable in Tcl. For example if I have the variable $a and I want to know whether it is an integer.
I have been using the following so far:
if {[string is boolean $a]} {
#do something
}
and this seems to work great for the following types:
alnum, alpha, ascii, boolean, control, digit, double, false, graph, integer, lower, print, punct, space, true, upper, wordchar, xdigit
However it is not capable to tell me if my variable might be an array, a list or a dictionary. Does anyone know of a way to tell if a variable is either of those three?
Tcl's variables don't have types (except for whether or not they're really an associative array of variables — i.e., using the $foo(bar) syntax — for which you use array exists) but Tcl's values do. Well, somewhat. Tcl can mutate values between different types as it sees fit and does not expose this information[*]; all you can really do is check whether a value conforms to a particular type.
Such conformance checks are done with string is (where you need the -strict option, for ugly historical reasons):
if {[string is integer -strict $foo]} {
puts "$foo is an integer!"
}
if {[string is list $foo]} { # Only [string is] where -strict has no effect
puts "$foo is a list! (length: [llength $foo])"
if {[llength $foo]&1 == 0} {
# All dictionaries conform to lists with even length
puts "$foo is a dictionary! (entries: [dict size $foo])"
}
}
Note that all values conform to the type of strings; Tcl's values are always serializable.
[EDIT from comments]: For JSON serialization, it's possible to use dirty hacks to produce a “correct” serialization (strictly, putting everything in a string would be correct from Tcl's perspective but that's not precisely helpful to other languages) with Tcl 8.6. The code to do this, originally posted on Rosetta Code is:
package require Tcl 8.6
proc tcl2json value {
# Guess the type of the value; deep *UNSUPPORTED* magic!
regexp {^value is a (.*?) with a refcount} \
[::tcl::unsupported::representation $value] -> type
switch $type {
string {
# Skip to the mapping code at the bottom
}
dict {
set result "{"
set pfx ""
dict for {k v} $value {
append result $pfx [tcl2json $k] ": " [tcl2json $v]
set pfx ", "
}
return [append result "}"]
}
list {
set result "\["
set pfx ""
foreach v $value {
append result $pfx [tcl2json $v]
set pfx ", "
}
return [append result "\]"]
}
int - double {
return [expr {$value}]
}
booleanString {
return [expr {$value ? "true" : "false"}]
}
default {
# Some other type; do some guessing...
if {$value eq "null"} {
# Tcl has *no* null value at all; empty strings are semantically
# different and absent variables aren't values. So cheat!
return $value
} elseif {[string is integer -strict $value]} {
return [expr {$value}]
} elseif {[string is double -strict $value]} {
return [expr {$value}]
} elseif {[string is boolean -strict $value]} {
return [expr {$value ? "true" : "false"}]
}
}
}
# For simplicity, all "bad" characters are mapped to \u... substitutions
set mapped [subst -novariables [regsub -all {[][\u0000-\u001f\\""]} \
$value {[format "\\\\u%04x" [scan {& } %c]]}]]
return "\"$mapped\""
}
Warning: The above code is not supported. It depends on dirty hacks. It's liable to break without warning. (But it does work. Porting to Tcl 8.5 would require a tiny C extension to read out the type annotations.)
[*] Strictly, it does provide an unsupported interface for discovering the current type annotation of a value in 8.6 — as part of ::tcl::unsupported::representation — but that information is in a deliberately human-readable form and subject to change without announcement. It's for debugging, not code. Also, Tcl uses rather a lot of different types internally (e.g., cached command and variable names) that you won't want to probe for under normal circumstances; things are rather complex under the hood…
The other answers all provide very useful information, but it's worth noting something that a lot of people don't seem to grok at first.
In Tcl, values don't have a type... they question is whether they can be used as a given type. You can think about it this way
string is integer $a
You're not asking
Is the value in $a an integer
What you are asking is
Can I use the value in $a as an integer
Its useful to consider the difference between the two questions when you're thinking along the lines of "is this an integer". Every integer is also a valid list (of one element)... so it can be used as either and both string is commands will return true (as will several others for an integer).
If you want to deal with JSON then I highly suggest you read the JSON page on the Tcl wiki: http://wiki.tcl.tk/json.
On that page I posted a simple function that compiles Tcl values to JSON string given a formatting descriptor. I also find the discussion on that page very informative.
For arrays you want array exists
for dicts you want dict exists
for a list I don't think there is a built in way prior to 8.5?, there is this from http://wiki.tcl.tk/440
proc isalist {string} {
return [expr {0 == [catch {llength $string}]}]
}
To determine if a variable is an array:
proc is_array {var} {
upvar 1 $var value
if {[catch {array names $value} errmsg]} { return 1 }
return 0
}
# How to use it
array set ar {}
set x {1 2 3}
puts "ar is array? [is_array ar]"; # ar is array? 1
puts "x is array? [is_array x]"; # x is array? 0
For the specific case of telling if a value is usable as a dictionary, tcllib's dicttool package has a dict is_dict <value> command that returns a true value if <value> can act as one.