Is there a way to safely redeclare a symbol? - raku

I often find myself experimenting in the REPL and I will say something like:
subset Bar of Int where * %% 57;
Then I play around with checks on the Bar-ness for things for a bit.
Everything is happy, until I realize that I want to change the definition of Bar.
If I just redefine Bar, I get a Redeclaration of symbol exception.
I tried using MONKEY-TYPING and augment like this:
use MONKEY-TYPING;
augment subset Bar of Int where * %% 37;
But that netted me the same error.
Why do I want this? So I can iterate on my subset (or class, or other symbol) definitions, while reusing the tests I've already typed that are in my history.

The REPL has its shortcomings. It is an elaborate construction of EVAL statements that try to work together. Sometimes that doesn't work out.
I guess the best we could do, is to introduce a REPL command that would make it forget everything it has done before. Patches welcome! :-)

I think the REPL does part of its magic by EVAL-ing each new input in a new nested lexical scope. So, if you declare things with my then you can shadow them with declarations entered later:
my subset Bar of Int where * %% 57;
sub take-Bar(Bar $n) { say "$n is Bar" }
take-Bar 57;
my subset Bar of Int where * %% 42;
sub take-Bar(Bar $n) { say "$n is Bar" }
take-Bar 42;
If you omit my, then for subset and class declarations, our will be used, and since our is actually my + adding the symbol to the enclosing package...; turns out if you delete the symbol from the package, you can then shadow it again later:
subset Bar of Int where * %% 57;
GLOBAL::<Bar>:delete;
subset Bar of Int where * %% 42;
42 ~~ Bar;
NOTE: These results are just from my experiments in the REPL. I'm not sure if there are other unknown side effects.

Related

How to assign the .lines Seq to a variable and iterate over it?

Assigning an iterator to variable changes apparently how the Seq behaves. E.g.
use v6;
my $i = '/etc/lsb-release'.IO.lines;
say $i.WHAT;
say '/etc/lsb-release'.IO.lines.WHAT;
.say for $i;
.say for '/etc/lsb-release'.IO.lines;
results in:
(Seq)
(Seq)
(DISTRIB_ID=Ubuntu DISTRIB_RELEASE=18.04 DISTRIB_CODENAME=bionic DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS")
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"
So once assigned I get only the string representation of the sequence. I know I can use .say for $i.lines to get the same output but I do not understand the difference between the assigned and unassigned iterator/Seq.
Assignment in Perl 6 is always about putting something into something else.
Assignment into a Scalar ($ sigil) stores the thing being assigned into a Scalar container object, meaning it will be treated as a single item; this is why for $item { } will not do an iteration. There are various ways to overcome this; the most conceptually simple way is to use the <> postfix operator, which strips away any Scalar container:
my $i = '/etc/lsb-release'.IO.lines;
.say for $i<>;
There's also the slip operator ("flatten into"), which will achieve the same:
my $i = '/etc/lsb-release'.IO.lines;
.say for |$i;
Assignment into an Array will - unless the right-hand side is marked lazy - iterate it and store each element into the Array. Thus:
my #i = '/etc/lsb-release'.IO.lines;
.say for #i;
Will work, but it will eagerly read all the lines into #i before the loop starts. This is OK for a small file, but less ideal for a large file, where we might prefer to work lazily (that is, only pulling a bit of the file into memory at a time). One might try:
my #i = lazy '/etc/lsb-release'.IO.lines;
.say for #i;
But that won't help with the retention problem; it just means the array will be populated lazily from the file as the iteration takes place. Of course, sometimes we might want to go through the lines multiple times, in which case assignment into an Array would be the best choice.
By contrast, declaring a symbol and binding it to that:
my \i = '/etc/lsb-release'.IO.lines;
.say for i;
Is not a "put into" operation at all; it just makes the symbol i refer to exactly what lines returns. This is rather clearer than putting it into a Scalar container only to take it out again. It's also a little easier on the reader, since a my \foo = ... can never be rebound, and so the reader doesn't need to be on the lookup for any potential changes later on in the code.
As a final note, it's worth knowing that the my \foo = ... form is actually a binding, rather than an assignment. Perl 6 allows us to write it with the = operator rather than forcing :=, even if in this case the semantics are := semantics. This is just one of a number of cases where a declaration with an initializer differs a bit from a normal assignment, e.g. has $!foo = rand actually runs the assignment on every object instantiation, while state $foo = rand only runs it only if we're on the first entry to the current closure clone.
If you want to be able to iterate over the sequence you need to either assign it to a positional :
my #i = '/etc/lsb-release'.IO.lines;
.say for #i;
Or you can tell the iterator that you want to treat the given thing as iterable :
.say for #$i
Or you can Slip it into a list for the iterator :
.say for |$i

Does Perl 6 have an infinite Int?

I had a task where I wanted to find the closest string to a target (so, edit distance) without generating them all at the same time. I figured I'd use the high water mark technique (low, I guess) while initializing the closest edit distance to Inf so that any edit distance is closer:
use Text::Levenshtein;
my #strings = < Amelia Fred Barney Gilligan >;
for #strings {
put "$_ is closest so far: { longest( 'Camelia', $_ ) }";
}
sub longest ( Str:D $target, Str:D $string ) {
state Int $closest-so-far = Inf;
state Str:D $closest-string = '';
if distance( $target, $string ) < $closest-so-far {
$closest-so-far = $string.chars;
$closest-string = $string;
return True;
}
return False;
}
However, Inf is a Num so I can't do that:
Type check failed in assignment to $closest-so-far; expected Int but got Num (Inf)
I could make the constraint a Num and coerce to that:
state Num $closest-so-far = Inf;
...
$closest-so-far = $string.chars.Num;
However, this seems quite unnatural. And, since Num and Int aren't related, I can't have a constraint like Int(Num). I only really care about this for the first value. It's easy to set that to something sufficiently high (such as the length of the longest string), but I wanted something more pure.
Is there something I'm missing? I would have thought that any numbery thing could have a special value that was greater (or less than) all the other values. Polymorphism and all that.
{new intro that's hopefully better than the unhelpful/misleading original one}
#CarlMäsak, in a comment he wrote below this answer after my first version of it:
Last time I talked to Larry about this {in 2014}, his rationale seemed to be that ... Inf should work for all of Int, Num and Str
(The first version of my answer began with a "recollection" that I've concluded was at least unhelpful and plausibly an entirely false memory.)
In my research in response to Carl's comment, I did find one related gem in #perl6-dev in 2016 when Larry wrote:
then our policy could be, if you want an Int that supports ±Inf and NaN, use Rat instead
in other words, don't make Rat consistent with Int, make it consistent with Num
Larry wrote this post 6.c. I don't recall seeing anything like it discussed for 6.d.
{and now back to the rest of my first answer}
Num in P6 implements the IEEE 754 floating point number type. Per the IEEE spec this type must support several concrete values that are reserved to stand in for abstract concepts, including the concept of positive infinity. P6 binds the corresponding concrete value to the term Inf.
Given that this concrete value denoting infinity already existed, it became a language wide general purpose concrete value denoting infinity for cases that don't involve floating point numbers such as conveying infinity in string and list functions.
The solution to your problem that I propose below is to use a where clause via a subset.
A where clause allows one to specify run-time assignment/binding "typechecks". I quote "typecheck" because it's the most powerful form of check possible -- it's computationally universal and literally checks the actual run-time value (rather than a statically typed view of what that value can be). This means they're slower and run-time, not compile-time, but it also makes them way more powerful (not to mention way easier to express) than even dependent types which are a relatively cutting edge feature that those who are into advanced statically type-checked languages tend to claim as only available in their own world1 and which are intended to "prevent bugs by allowing extremely expressive types" (but good luck with figuring out how to express them... ;)).
A subset declaration can include a where clause. This allows you to name the check and use it as a named type constraint.
So, you can use these two features to get what you want:
subset Int-or-Inf where Int:D | Inf;
Now just use that subset as a type:
my Int-or-Inf $foo; # ($foo contains `Int-or-Inf` type object)
$foo = 99999999999; # works
$foo = Inf; # works
$foo = Int-or-Inf; # works
$foo = Int; # typecheck failure
$foo = 'a'; # typecheck failure
1. See Does Perl 6 support dependent types? and it seems the rough consensus is no.

How do I access an integer array within a struct/class from in-line assembly (blackfin dialect) using gcc?

Not very familiar with in-line assembly to begin with, and much less with that of the blackfin processor. I am in the process of migrating a legacy C application over to C++, and ran into a problem this morning regarding the following routine:
//
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
:: "a" ( buffer ), "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
I have a class that contains an array of shorts that is used for audio processing:
class AudProc
{
enum { buffer_size = 512 };
short M_samples[ buffer_size * 2 ];
// remaining part of class omitted for brevity
};
Within the AudProc class I have a method that calls clear_buffer, passing it the samples array:
clear_buffer ( M_samples, sizeof ( M_samples ) / 2 );
This generates a "Bus Error" and aborts the application.
I have tried making the array public, and that produces the same result. I have also tried making it static; that allows the call to go through without error, but no longer allows for multiple instances of my class as each needs its own buffer to work with. Now, my first thought is, it has something to do with where the buffer is in memory, or from where it is being accessed. Does something need to be changed in the in-line assembly to make this work, or in the way it is being called?
Thought that this was similar to what I was trying to accomplish, but it is using a different dialect of asm, and I can't figure out if it is the same problem I am experiencing or not:
GCC extended asm, struct element offset encoding
Anyone know why this is occurring and how to correct it?
Does anyone know where there is helpful documentation regarding the blackfin asm instruction set? I've tried looking on the ADSP site, but to no avail.
I would suspect that you could define your clear_buffer as
inline void clear_buffer (short * buffer, int len) {
memset (buffer, 0, sizeof(short)*len);
}
and probably GCC is able to optimize (when invoked with -O2 or -O3) that cleverly (because GCC knows about memset).
To understand assembly code, I suggest running gcc -S -O -fverbose-asm on some small C file, then to look inside the produced .s file.
I would have take a guess, because I don't know Blackfin assembler:
That LC0 sounds like "loop counter", LSETUP looks like a macro/insn, which, well, setups a loop between two labels and with a certain loop counter.
The "%0" operands is apparently the address to write to and we can safely guess it's incremented in the loop, in other words it's both an input and output operand and should be described as such.
Thus, I suggest describing it as in input-output operand, using "+" constraint modifier, as follows:
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
: "+a" ( buffer )
: "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
This is, of course, just a hypothesis, but you could disassemble the code and check if by any chance GCC allocated the same register for "%0" and "%2".
PS. Actually, only "+a" should be enough, early-clobber is irrelevant.
For anyone else who runs into a similar circumstance, the problem here was not with the in-line assembly, nor with the way it was being called: it was with the classes / structs in the program. The class that I believed to be the offender was not the problem - there was another class that held an instance of it, and due to other members of that outer class, the inner one was not aligned on a word boundary. This was causing the "Bus Error" that I was experiencing. I had not come across this before because the classes were not declared with __attribute__((packed)) in other code, but they are in my implementation.
Giving Type Attributes - Using the GNU Compiler Collection (GCC) a read was what actually sparked the answer for me. Two particular attributes that affect memory alignment (and, thus, in-line assembly such as I am using) are packed and aligned.
As taken from the aforementioned link:
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for variables of the specified type. For example, the declarations:
struct S { short f[3]; } __attribute__ ((aligned (8)));
typedef int more_aligned_int __attribute__ ((aligned (8)));
force the compiler to ensure (as far as it can) that each variable whose type is struct S or more_aligned_int is allocated and aligned at least on a 8-byte boundary. On a SPARC, having all variables of type struct S aligned to 8-byte boundaries allows the compiler to use the ldd and std (doubleword load and store) instructions when copying one variable of type struct S to another, thus improving run-time efficiency.
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question. This means that you can effectively adjust the alignment of a struct or union type by attaching an aligned attribute to any one of the members of such a type, but the notation illustrated in the example above is a more obvious, intuitive, and readable way to request the compiler to adjust the alignment of an entire struct or union type.
As in the preceding example, you can explicitly specify the alignment (in bytes) that you wish the compiler to use for a given struct or union type. Alternatively, you can leave out the alignment factor and just ask the compiler to align a type to the maximum useful alignment for the target machine you are compiling for. For example, you could write:
struct S { short f[3]; } __attribute__ ((aligned));
Whenever you leave out the alignment factor in an aligned attribute specification, the compiler automatically sets the alignment for the type to the largest alignment that is ever used for any data type on the target machine you are compiling for. Doing this can often make copy operations more efficient, because the compiler can use whatever instructions copy the biggest chunks of memory when performing copies to or from the variables that have types that you have aligned this way.
In the example above, if the size of each short is 2 bytes, then the size of the entire struct S type is 6 bytes. The smallest power of two that is greater than or equal to that is 8, so the compiler sets the alignment for the entire struct S type to 8 bytes.
Note that although you can ask the compiler to select a time-efficient alignment for a given type and then declare only individual stand-alone objects of that type, the compiler's ability to select a time-efficient alignment is primarily useful only when you plan to create arrays of variables having the relevant (efficiently aligned) type. If you declare or use arrays of variables of an efficiently-aligned type, then it is likely that your program also does pointer arithmetic (or subscripting, which amounts to the same thing) on pointers to the relevant type, and the code that the compiler generates for these pointer arithmetic operations is often more efficient for efficiently-aligned types than for other types.
The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well. See below.
Note that the effectiveness of aligned attributes may be limited by inherent limitations in your linker. On many systems, the linker is only able to arrange for variables to be aligned up to a certain maximum alignment. (For some linkers, the maximum supported alignment may be very very small.) If your linker is only able to align variables up to a maximum of 8-byte alignment, then specifying aligned(16) in an __attribute__ still only provides you with 8-byte alignment. See your linker documentation for further information.
.
packed
This attribute, attached to struct or union type definition, specifies that each member (other than zero-width bit-fields) of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.
Specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members. Specifying the -fshort-enums flag on the line is equivalent to specifying the packed attribute on all enum definitions.
In the following example struct my_packed_struct's members are packed closely together, but the internal layout of its s member is not packed—to do that, struct my_unpacked_struct needs to be packed too.
struct my_unpacked_struct
{
char c;
int i;
};
struct __attribute__ ((__packed__)) my_packed_struct
{
char c;
int i;
struct my_unpacked_struct s;
};
You may only specify this attribute on the definition of an enum, struct or union, not on a typedef that does not also define the enumerated type, structure or union.
The problem which I was experiencing was specifically due to the use of packed. I attempted to simply add the aligned attribute to the structs and classes, but the error persisted. Only removing the packed attribute resolved the problem. For now, I am leaving the aligned attribute on them and testing to see if I find any improvements in the efficiency of the code as mentioned above, simply due to their being aligned on word boundaries. The application makes use of arrays of these structures, so perhaps there will be better performance, but only profiling the code will say for certain.

Optional named arguments without wrapping them all in "OptionValue"

Suppose I have a function with optional named arguments but I insist on referring to the arguments by their unadorned names.
Consider this function that adds its two named arguments, a and b:
Options[f] = {a->0, b->0}; (* The default values. *)
f[OptionsPattern[]] :=
OptionValue[a] + OptionValue[b]
How can I write a version of that function where that last line is replaced with simply a+b?
(Imagine that that a+b is a whole slew of code.)
The answers to the following question show how to abbreviate OptionValue (easier said than done) but not how to get rid of it altogether: Optional named arguments in Mathematica
Philosophical Addendum: It seems like if Mathematica is going to have this magic with OptionsPattern and OptionValue it might as well go all the way and have a language construct for doing named arguments properly where you can just refer to them by, you know, their names. Like every other language with named arguments does. (And in the meantime, I'm curious what workarounds are possible...)
Why not just use something like:
Options[f] = {a->0, b->0};
f[args___] := (a+b) /. Flatten[{args, Options[f]}]
For more complicated code I'd probably use something like:
Options[f] = {a->0, b->0};
f[OptionsPattern[]] := Block[{a,b}, {a,b} = OptionValue[{a,b}]; a+b]
and use a single call to OptionValue to get all the values at once. (Main reason is that this cuts down on messages if there are unknown options present.)
Update, to programmatically generate the variables from the options list:
Options[f] = {a -> 0, b -> 0};
f[OptionsPattern[]] :=
With[{names = Options[f][[All, 1]]},
Block[names, names = OptionValue[names]; a + b]]
Here is the final version of my answer, containing the contributions from the answer by Brett Champion.
ClearAll[def];
SetAttributes[def, HoldAll];
def[lhs : f_[args___] :> rhs_] /; !FreeQ[Unevaluated[lhs], OptionsPattern] :=
With[{optionNames = Options[f][[All, 1]]},
lhs := Block[optionNames, optionNames = OptionValue[optionNames]; rhs]];
def[lhs : f_[args___] :> rhs_] := lhs := rhs;
The reason why the definition is given as a delayed rule in the argument is that this way we can
benefit from the syntax highlighting. Block trick is used because it fits the problem: it does not interfere with possible nested lexical scoping constructs inside your function, and therefore there is no danger of inadvertent variable capture. We check for presence of OptionsPattern since this code wil not be correct for definitions without it, and we want def to also work in that case.
Example of use:
Clear[f, a, b, c, d];
Options[f] = {a -> c, b -> d};
(*The default values.*)
def[f[n_, OptionsPattern[]] :> (a + b)^n]
You can look now at the definition:
Global`f
f[n$_,OptionsPattern[]]:=Block[{a,b},{a,b}=OptionValue[{a,b}];(a+b)^n$]
f[n_,m_]:=m+n
Options[f]={a->c,b->d}
We can test it now:
In[10]:= f[2]
Out[10]= (c+d)^2
In[11]:= f[2,a->e,b->q]
Out[11]= (e+q)^2
The modifications are done at "compile - time" and are pretty transparent. While this solution saves
some typing w.r.t. Brett's, it determines the set of option names at "compile-time", while Brett's - at "run-time". Therefore, it is a bit more fragile than Brett's: if you add some new option to the function after it has been defined with def, you must Clear it and rerun def. In practice, however, it is customary to start with ClearAll and put all definitions in one piece (cell), so this does not seem to be a real problem. Also, it can not work with string option names, but your original concept also assumes they are Symbols. Also, they should not have global values, at least not at the time when def executes.
Here's a kind of horrific solution:
Options[f] = {a->0, b->0};
f[OptionsPattern[]] := Module[{vars, tmp, ret},
vars = Options[f][[All,1]];
tmp = cat[vars];
each[{var_, val_}, Transpose[{vars, OptionValue[Automatic,#]& /# vars}],
var = val];
ret =
a + b; (* finally! *)
eval["ClearAll[", StringTake[tmp, {2,-2}], "]"];
ret]
It uses the following convenience functions:
cat = StringJoin##(ToString/#{##})&; (* Like sprintf/strout in C/C++. *)
eval = ToExpression[cat[##]]&; (* Like eval in every other lang. *)
SetAttributes[each, HoldAll]; (* each[pattern, list, body] *)
each[pat_, lst_, bod_] := ReleaseHold[ (* converts pattern to body for *)
Hold[Cases[Evaluate#lst, pat:>bod];]]; (* each element of list. *)
Note that this doesn't work if a or b has a global value when the function is called. But that was always the case for named arguments in Mathematica anyway.

Recycling variable name within single function

I have a function that contains two for loops, and I'm using a variable called count as the counter. I've chosen to recycle the name as the the first loop will finish it's execution completely before the second one begins, so there is no chance of the counters interfering with each other. The G++ compiler has taken exception to this via the following warning:
error: name lookup of ‘count’ changed for ISO ‘for’ scoping
note: (if you use ‘-fpermissive’ G++ will accept your code)
Is variable recycling considered bad practice in professional software development, or is it a situational concern, and what other implications have I missed here?
Are you doing this?
for(int count = 0; ...)
{
...
}
for(count = 0; ...)
{
...
}
I doubt gcc would like that, as the second count isn't in scope. I think it only applies to the first for loop, but gcc has options to accept poor code. If you either make the second int count or move the first to the outer scope, gcc should be happy.
This depends on the circumstances, but I generally don't reuse variables. The name of the variable should reflect its purpose, and switching part way through a function can be confusing. Declare what you need, let the compiler take care of the optimizations.
Steve McConnell recommends not reusing local variables in functions in Code Complete.
He's not the definitive voice of practice in professional software development, but he's about as close as you're going to get to a definitive voice.
The argument is that it makes it harder to read the code.
What are you counting? Name the variables after that.
It sounds like you're defining the variable in the for? i.e. "for (int count=0; count++; count < x)"? If so, that could be problematic, as well as unclear. If you're going to use it in a second for loop define it outside both loops.
If you're using a loop counter variables like this, then it usually doesn't matter.
for (int i ...; ... ; ...) {
...
}
for (int i ...; ... ; ...) {
...
}
however, if you're intending to shadow another variable:
int i ...;
for (int i ...; ... ; ...) {
...
}
that's a red flag.