Why does a single number fail to match a Range object in an array? - raku

> my @numbers = 1, 3, 5;
> 1 ~~ /@numbers/;
「1」
is the same as:
> 1 ~~ /1 | 3 | 5/
「1」
but when the element is a Range object, it fails to match:
> my @ranges = 1..3.item, 4..6.item;
[1..3 4..6]
> 1 ~~ /@ranges/
Nil
> 1 ~~ /|@ranges/
Nil
> 1 ~~ /||@ranges/

When the regex engine sees /@numbers/ it treats that like an alternation of the array elements, so your first two examples are equivalent.
I believe there is simply no such automatic treatment for Ranges.
Edit: Never mind below, I totally misread the question at first.
> my @ranges = 1..3, 4..6;
[1..3 4..6]
> 1 ~~ @ranges[0];
True
> 2 ~~ @ranges[1];
False
> 4 ~~ @ranges[1];
True
> @ranges.first( 5 ~~ * )
4..6
See? @ranges is an array of, well, ranges (your call to item does nothing here). Theoretically this would hold true if the smartmatch operator were smarter.
> 1..3 ~~ @ranges;
False
Flattening also doesn't help, because a flat list of ranges is still a list of ranges.
Flattening the ranges themselves is possible, but that simply turns them into a single Array of numbers:
> my @ranges2 = |(1..3), |(4..6)
[1 2 3 4 5 6]

Why does a single number fail to match a Range object in an array?
Per the doc:
The interpolation rules for individual elements [of an array] are the same as for scalars
And per the same doc section the rule for a scalar (that is not a regex) is:
interpolate the stringified value
A range object such as 1..3 stringifies to 1 2 3:
my $range = 1..3;
put $range; # 1 2 3
put so '1' ~~ / $range /; # False
put so '1 2 3' ~~ / $range /; # True
So, as Holli suggests, perhaps instead:
my @ranges = flat 1..3, 4..6;
say @ranges; # [1 2 3 4 5 6]
say 1 ~~ /@ranges/; # 「1」
Or is there some reason you don't want that? (See also Scimon's comment on Holli's answer.)
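If the goal is only to test whether a number falls inside any of several ranges, a regex may not be needed at all. A minimal sketch, assuming plain smartmatching is acceptable for your use case (the variable names are illustrative):

```raku
my @ranges = 1..3, 4..6;

# Smartmatch against a junction of the ranges:
# true if at least one range accepts the number.
say so 5 ~~ any(@ranges);         # True
say so 7 ~~ any(@ranges);         # False

# Find which range, if any, contains the number.
say @ranges.first({ 5 ~~ $_ });   # 4..6
```

This keeps the ranges as Range objects instead of expanding them into every value they contain, which may matter for large or non-integer ranges.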

Related

Finding the contiguous sequences of equal elements in a list Raku

I'd like to find the contiguous sequences of equal elements (e.g. of length 2) in a list
my @s = <1 1 0 2 0 2 1 2 2 2 4 4 3 3>;
say grep {$^a eq $^b}, @s;
# ==> ((1 1) (2 2) (4 4) (3 3))
This code looks OK, but when one more 2 is added after the sequence 2 2 2, or when one 2 is removed from it, it fails with "Too few positionals passed; expected 2 arguments but got 1". How can I fix it? Please note that I'm trying to find them without using a for loop, i.e. with code that is as functional as possible.
Optional: In the bold printed section:
<1 1 0 2 0 2 1 2 2 2 4 4 3 3>
multiple sequences of 2 2 are seen. How can I print each of them as many times as it occurs? Like:
((1 1) (2 2) (2 2) (4 4) (3 3))
There are an even number of elements in your input:
say elems <1 1 0 2 0 2 1 2 2 2 4 4 3 3>; # 14
Your grep block consumes two elements each time:
{$^a eq $^b}
So if you add or remove an element you'll get the error you're getting when the block is run on the single element left over at the end.
There are many ways to solve your problem.
But you also asked about the option of allowing for overlapping so, for example, you get two (2 2) sub-lists when the sequence 2 2 2 is encountered. And, in a similar vein, you presumably want to see two matches, not zero, with input like:
<1 2 2 3 3 4>
So I'll focus on solutions that deal with those issues too.
Despite the narrowing of solution space to deal with the extra issues, there are still many ways to express solutions functionally.
One way that just appends a bit more code to the end of yours:
my @s = <1 1 0 2 0 2 1 2 2 2 4 4 3 3>;
say grep {$^a eq $^b}, @s .rotor( 2 => -1 ) .flat
The .rotor method converts a list into a list of sub-lists, each of the same length. For example, say <1 2 3 4> .rotor: 2 displays ((1 2) (3 4)). If the length argument is a pair, then the key is the length and the value is an offset for starting the next pair. If the offset is negative you get sub-list overlap. Thus say <1 2 3 4> .rotor: 2 => -1 displays ((1 2) (2 3) (3 4)).
The .flat method "flattens" its invocant. For example, say ((1,2),(2,3),(3,4)) .flat displays (1 2 2 3 3 4).
A perhaps more readable way to write the above solution would be to omit the flat and use .[0] and .[1] to index into the sub-lists returned by rotor:
say @s .rotor( 2 => -1 ) .grep: { .[0] eq .[1] }
See also Elizabeth Mattijsen's comment for another variation that generalizes for any sub-list size.
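That comment is not reproduced here, but one way such a generalization might look is sketched below. The sub name `runs-of` is hypothetical, and the approach (a window of `$n` elements that advances one element at a time, filtered with an `[eq]` reduction) is an assumption about what "any sub-list size" means:

```raku
# Hypothetical helper: every overlapping window of $n adjacent, equal elements.
sub runs-of(@list, Int $n) {
    # The pair $n => -($n - 1) asks rotor for windows of $n elements
    # that step forward by a single element each time.
    @list.rotor($n => -($n - 1)).grep({ [eq] $_.list })
}

my @s = <1 1 0 2 0 2 1 2 2 2 4 4 3 3>;
say runs-of(@s, 2);   # ((1 1) (2 2) (2 2) (4 4) (3 3))
say runs-of(@s, 3);   # ((2 2 2))
```

Because rotor drops an incomplete final window by default, this does not hit the "Too few positionals" error when the list length is not a multiple of the window size.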
If you needed a more general coding pattern you might write something like:
say @s .pairs .map: { .value xx 2 if .key < @s - 1 and [eq] @s[.key,.key+1] }
The .pairs method on a list returns a list of pairs, each pair corresponding to each of the elements in its invocant list. The .key of each pair is the index of the element in the invocant list; the .value is the value of the element.
.value xx 2 could have been written .value, .value. (See xx.)
@s - 1 is the number of elements in @s minus 1.
The [eq] in [eq] list is a reduction.
If you need text pattern matching to decide what constitutes contiguous equal elements, you might convert the input list into a string, match against that using one of the match adverbs that generate a list of matches, then map from the resulting list of matches to your desired result. To match with overlaps (e.g. 2 2 2 results in ((2 2) (2 2))), use :ov:
say @s .Str .match( / (.) ' ' $0 /, :ov ) .map: { .[0].Str xx 2 }
TIMTOWTDI!
Here's an iterative approach using gather/take.
say gather for <1 1 0 2 0 2 1 2 2 2 4 4 3 3> {
state $last = '';
take ($last, $_) if $last == $_;
$last = $_;
};
# ((1 1) (2 2) (2 2) (4 4) (3 3))

Parallel loop with one list literal

I'm trying to loop over two lists, one being a premade list and one being a list literal. Is something like this possible?
Pseudocode example:
list(list1 APPEND 0 1 2 3 4)
foreach(item IN LISTS ${list1} 5 6 7 8 9)
message(${item} ${#other variable})
endforeach(item)
# prints out
0 5
1 6
... etc
To iterate over several lists at the same time, you can use a foreach loop over their indices. Then, in the loop body, access the lists' elements by that index:
# Setup content of the lists somehow
set(list1 0 1 2 3 4)
set(list2 5 6 7 8 9)
list(LENGTH list1 n_elems) # Total number of elements in each list
math(EXPR last_index "${n_elems}-1") # The last index in each list
# Now iterate over the indices
foreach(i RANGE ${last_index})
list(GET list1 ${i} elem1) # Element in the first list
list(GET list2 ${i} elem2) # Corresponding element in the second list
# Do something with elements
message("${elem1} ${elem2}")
endforeach()

Binding a scalar to a sigilless variable (Perl 6)

Let me start by saying that I understand that what I'm asking about in the title is dubious practice (as explained here), but my lack of understanding concerns the syntax involved.
When I first tried to bind a scalar to a sigilless symbol, I did this:
my \a = $(3);
thinking that $(...) would package the Int 3 in a Scalar (as seemingly suggested in the documentation), which would then be bound to symbol a. This doesn't seem to work though: the Scalar is nowhere to be found (a.VAR.WHAT returns (Int), not (Scalar)).
In the above-referenced post, raiph mentions that the desired binding can be performed using a different syntax:
my \a = $ = 3;
which works. Given the result, I suspect that the statement can be phrased equivalently, though less concisely, as: my \a = (my $ = 3), which I could then understand.
That leaves the question: why does the attempt with $(...) not work, and what does it do instead?
What $(…) does is turn a value into an item.
(A value in a scalar variable ($a) also gets marked as being an item)
say flat (1,2, (3,4) );
# (1 2 3 4)
say flat (1,2, $((3,4)) );
# (1 2 (3 4))
say flat (1,2, item((3,4)) );
# (1 2 (3 4))
Basically it is there to prevent a value from flattening. The reason for its existence is that Perl 6 does not flatten lists as much as most other languages, and sometimes you need a little more control over flattening.
The following only sort-of does what you want it to do
my \a = $ = 3;
A bare $ is an anonymous state variable.
my \a = (state $) = 3;
The problem shows up when you run that same bit of code more than once.
sub foo ( $init ) {
my \a = $ = $init; # my \a = (state $) = $init;
(^10).map: {
sleep 0.1;
++a
}
}
.say for await (start foo(0)), (start foo(42));
# (43 44 45 46 47 48 49 50 51 52)
# (53 54 55 56 57 58 59 60 61 62)
# If foo(42) beat out foo(0) instead it would result in:
# (1 2 3 4 5 6 7 8 9 10)
# (11 12 13 14 15 16 17 18 19 20)
Note that the variable is shared between calls.
The first Promise halts at the sleep call, and then the second sets the state variable before the first runs ++a.
If you use my $ instead, it now works properly.
sub foo ( $init ) {
my \a = my $ = $init;
(^10).map: {
sleep 0.1;
++a
}
}
.say for await (start foo(0)), (start foo(42));
# (1 2 3 4 5 6 7 8 9 10)
# (43 44 45 46 47 48 49 50 51 52)
The thing is that sigilless “variables” aren't really variables (they don't vary), they are more akin to lexically scoped (non)constants.
constant \foo = (1..10).pick; # only pick one value and never change it
say foo;
for ^5 {
my \foo = (1..10).pick; # pick a new one each time through
say foo;
}
Basically the whole point of them is to be as close as possible to referring to the value you assign to it. (Static Single Assignment)
# these work basically the same
-> \a {…}
-> \a is raw {…}
-> $a is raw {…}
# as do these
my \a = $i;
my \a := $i;
my $a := $i;
Note that above I wrote the following:
my \a = (state $) = 3;
Normally in the declaration of a state var, the assignment only happens the first time the code gets run. Bare $ doesn't have a declaration as such, so I had to prevent that behaviour by putting the declaration in parens.
# bare $
for (5 ... 1) {
my \a = $ = $_; # set each time through the loop
say a *= 3; # 15 12 9 6 3
}
# state in parens
for (5 ... 1) {
my \a = (state $) = $_; # set each time through the loop
say a *= 3; # 15 12 9 6 3
}
# normal state declaration
for (5 ... 1) {
my \a = state $ = $_; # set it only on the first time through the loop
say a *= 3; # 15 45 135 405 1215
}
Sigilless variables are not actually variables, they are more of an alias, that is, they are not containers but bind to the values they get on the right hand side.
my \a = $(3);
say a.WHAT; # OUTPUT: «(Int)␤»
say a.VAR.WHAT; # OUTPUT: «(Int)␤»
Here, by doing $(3) you are actually putting in scalar context what is already in scalar context:
my \a = 3; say a.WHAT; say a.VAR.WHAT; # OUTPUT: «(Int)␤(Int)␤»
However, the second form in your question does something different. You're binding to an anonymous variable, which is a container:
my \a = $ = 3;
say a.WHAT; # OUTPUT: «(Int)␤»
say a.VAR.WHAT;# OUTPUT: «(Scalar)␤»
In the first case, a was an alias for 3 (or $(3), which is the same); in the second, a is an alias for $, which is a container, whose value is 3. This last case is equivalent to:
my $anon = 3; say $anon.WHAT; say $anon.VAR.WHAT; # OUTPUT: «(Int)␤(Scalar)␤»
(If you have some suggestion on how to improve the documentation, I'd be happy to follow up on it)
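The practical difference between the two bindings can be seen by trying to assign through the sigilless name. A small sketch, using the `my $` form shown earlier in this thread (the names `direct` and `boxed` are illustrative only):

```raku
my \direct = 3;           # bound straight to the Int value
my \boxed  = my $ = 3;    # bound to an anonymous Scalar that holds 3

say direct.VAR.^name;     # Int
say boxed.VAR.^name;      # Scalar

boxed = 10;               # assignment goes through the container
say boxed;                # 10
```

Assigning to `direct` in the same way would fail, since there is no container behind it to store a new value in.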

Getting a positional slice using a Range variable as a subscript

my @numbers = <4 8 15 16 23 42>;
this works:
.say for @numbers[0..2]
# 4
# 8
# 15
but this doesn't:
my $range = 0..2;
.say for @numbers[$range];
# 16
The subscript seems to be interpreting $range as the number of elements in the range (3). What gives?
Working as intended. Flatten the range object into a list with @numbers[|$range] or use binding on Range objects to hand them around. https://docs.perl6.org will be updated shortly.
On Fri Jul 22 15:34:02 2016, gfldex wrote:
> my @numbers = <4 8 15 16 23 42>; my $range = 0..2; .say for
> @numbers[$range];
> # OUTPUT«16␤»
> # expected:
> # OUTPUT«4␤8␤15␤»
>
This is correct, and part of the "Scalar container implies item" rule.
Changing it would break things like the second evaluation here:
> my @x = 1..10; my @y := 1..3; @x[@y]
(2 3 4)
> @x[item @y]
4
Noting that since a range can bind to @y in a signature, then Range being a
special case would make an expression like @x[$(@arr-param)]
unpredictable in its semantics.
> # also binding to $range provides the expected result
> my @numbers = <4 8 15 16 23 42>; my $range := 0..2; .say for
> @numbers[$range];
> # OUTPUT«4␤8␤15␤»
> y
This is also expected, since with binding there is no Scalar container to
enforce treatment as an item.
So, all here is working as designed.
A symbol bound to a Scalar container yields one thing
Options for getting what you want include:
Prefix with @ to get a plural view of the single thing: numbers[@$range]; OR
declare the range variable differently so it works directly
For the latter option, consider the following:
# Bind the symbol `numbers` to an Array holding 0 thru 10:
my \numbers = [0,1,2,3,4,5,6,7,8,9,10];
# Bind the symbol `rangeA` to the value 1..10:
my \rangeA := 1..10;
# Bind the symbol `rangeB` to the value 1..10:
my \rangeB = 1..10;
# Bind the symbol `$rangeC` to the value 1..10:
my $rangeC := 1..10;
# Bind the symbol `$rangeD` to a Scalar container
# and then store the value 1..10 in it:
my $rangeD = 1..10;
# Bind the symbol `@rangeE` to the value 1..10:
my @rangeE := 1..10;
# Bind the symbol `@rangeF` to an Array container and then
# store 1 thru 10 in the Scalar containers 1 thru 10 inside the Array
my @rangeF = 1..10;
say numbers[rangeA]; # (1 2 3 4 5 6 7 8 9 10)
say numbers[rangeB]; # (1 2 3 4 5 6 7 8 9 10)
say numbers[$rangeC]; # (1 2 3 4 5 6 7 8 9 10)
say numbers[$rangeD]; # 10
say numbers[@rangeE]; # (1 2 3 4 5 6 7 8 9 10)
say numbers[@rangeF]; # (1 2 3 4 5 6 7 8 9 10)
A symbol that's bound to a Scalar container ($rangeD) always yields a single value. In a [...] subscript that single value must be a number. And a range, treated as a single number, yields the length of that range.
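Applied to the array from the question, the behaviour described above can be sketched as follows (a small illustration only, reusing the question's variable names):

```raku
my @numbers = <4 8 15 16 23 42>;
my $range   = 0..2;

say +$range;              # 3, the number of elements in the range
say @numbers[$range];     # 16, the same element as @numbers[3]
say @numbers[@$range];    # (4 8 15)
say @numbers[|$range];    # (4 8 15)
```

Both `@$range` and `|$range` remove the item treatment, so the subscript sees a list of indices again rather than a single number.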

What's the R equivalent of SQL's LIKE 'description%' statement?

Not sure how else to ask this but, I want to search for a term within several string elements. Here's what my code looks like (but wrong):
inplay = vector(length=nrow(des))
for (ii in 1:nrow(des)) {
if (des[ii] = 'In play%')
inplay[ii] = 1
else inplay[ii] = 0
}
des is a vector that stores strings such as "Swinging Strike", "In play (run(s))", "In play (out(s) recorded)", etc. What I want inplay to store is a vector of 1s and 0s corresponding to the des vector, with a 1 wherever the des value starts with "In play" and a 0 otherwise.
I believe the 3rd line is incorrect, because all this does is return a vector of 0s with a 1 in the last element.
Thanks in advance!
The data.table package has syntax that is often similar to SQL. The package includes %like%, which is a "convenience function for calling regexpr". Here is an example taken from its help file:
## Load the package and create the data.table:
library(data.table)
DT = data.table(Name=c("Mary","George","Martha"), Salary=c(2,3,4))
## Subset the DT table where the Name column is like "Mar%":
DT[Name %like% "^Mar"]
## Name Salary
## 1: Mary 2
## 2: Martha 4
The R analog to SQL's LIKE is just R's ordinary indexing syntax.
The 'LIKE' operator selects data rows from a table by matching string values in a specified column against a user-supplied pattern
> # create a data frame having a character column
> clrs = c("blue", "black", "brown", "beige", "berry", "bronze", "blue-green", "blueberry")
> dfx = data.frame(Velocity=sample(100, 8), Colors=clrs)
> dfx
Velocity Colors
1 90 blue
2 94 black
3 71 brown
4 36 beige
5 75 berry
6 2 bronze
7 89 blue-green
8 93 blueberry
> # create a pattern to use (the same as you would do when using the LIKE operator)
> ptn = '^be.*?' # gets beige and berry but not blueberry
> # execute a pattern-matching function on your data to create an index vector
> ndx = grep(ptn, dfx$Colors, perl=T)
> # use this index vector to extract the rows you want from the data frame:
> selected_rows = dfx[ndx,]
> selected_rows
Velocity Colors
4 36 beige
5 75 berry
In SQL, that would be:
SELECT * FROM dfx WHERE Colors LIKE 'be%'
Something like regexpr?
> d <- c("Swinging Strike", "In play (run(s))", "In play (out(s) recorded)")
> regexpr('In play', d)
[1] -1 1 1
attr(,"match.length")
[1] -1 7 7
>
or grep
> grep('In play', d)
[1] 2 3
>
Since stringr 1.5.0, you can use str_like, which follows the structure of SQL's LIKE:
library(stringr)
fruit <- c("apple", "banana", "pear", "pineapple")
str_like(fruit, "app%")
#[1] TRUE FALSE FALSE FALSE
It supports more than just % (see ?str_like):
Must match the entire string
_ matches a single character (like .)
% matches any number of characters (like .*)
\% and \_ match literal % and _
The match is case insensitive by default