Perl6: Sub: restrict to static hash-return type - raku

I want to restrict the return type of some of my functions in Perl6.
I know, how to deduce the right return type of a function, returning a scalar or an array in Perl6, but I don't know, how I can do this, if I use a hash of a particular type as return value?
Example: The Array approach can be seen in test_arr() and the Hash approach can be seen in test_hash(). So I want to specify the return value of test_hash(), to return a hash of class A.
#!/usr/bin/env perl6
use v6.c;
use fatal;
class A {
has Str $.member;
}
sub test_arr() returns Array[A] {
my A #ret;
#ret.push(A.new(member=>'aaa'));
return #ret;
}
sub test_hash() { #TODO: add `returns FANCY_TYPE`
my A %ret;
%ret.append("elem",A.new(member=>'aaa'));
%ret.append("b",A.new(member=>'d'));
return %ret;
}
sub MAIN() returns UInt:D {
say test_arr().perl;
say test_hash().perl;
return 0;
}

It's really the same as with arrays:
sub test_hash() returns Hash[A] {
my A %ret;
%ret.append("elem",A.new(member=>'aaa'));
%ret.append("b",A.new(member=>'d'));
return %ret;
}
Note that you can also write %ret<elem> = A.new(...).
Update: For an hash of array of A, you need to do basically the same thing, you just need to be explicit about the types at every step:
sub test_hash() returns Hash[Array[A]] {
my Array[A] %ret;
%ret<elem> = Array[A].new(A.new(member => 'aaa'));
return %ret;
}
But don't exaggerate it; Perl 6 isn't as strongly typed as Haskell, and trying to act as if it were won't make for a good coding experience.

Related

how to create and export dynamic operators

I have some classes (and will need quite a few more) that look like this:
use Unit;
class Unit::Units::Ampere is Unit
{
method TWEAK { with self {
.si = True;
# m· kg· s· A ·K· mol· cd
.si-signature = [ 0, 0, 0, 1, 0, 0, 0 ];
.singular-name = "ampere";
.plural-name = "ampere";
.symbol = "A";
}}
sub postfix:<A> ($value) returns Unit::Units::Ampere is looser(&prefix:<->) is export(:short) {
return Unit::Units::Ampere.new( :$value );
};
sub postfix:<ampere> ($value) returns Unit::Units::Ampere is looser(&prefix:<->) is export(:long) {
$value\A;
};
}
I would like to be able to construct and export the custom operators dynamically at runtime. I know how to work with EXPORT, but how do I create a postfix operator on the fly?
I ended up basically doing this:
sub EXPORT
{
return %(
"postfix:<A>" => sub is looser(&prefix:<->) {
#do something
}
);
}
which is disturbingly simple.
For the first question, you can create dynamic subs by returning a sub from another. To accept only an Ampere parameter (where "Ampere" is chosen programmatically), use a type capture in the function signature:
sub make-combiner(Any:U ::Type $, &combine-logic) {
return sub (Type $a, Type $b) {
return combine-logic($a, $b);
}
}
my &int-adder = make-combiner Int, {$^a + $^b};
say int-adder(1, 2);
my &list-adder = make-combiner List, {(|$^a, |$^b)};
say list-adder(<a b>, <c d>);
say list-adder(1, <c d>); # Constraint type check fails
Note that when I defined the inner sub, I had to put a space after the sub keyword, lest the compiler think I'm calling a function named "sub". (See the end of my answer for another way to do this.)
Now, on to the hard part: how to export one of these generated functions? The documentation for what is export really does is here: https://docs.perl6.org/language/modules.html#is_export
Half way down the page, they have an example of adding a function to the symbol table without being able to write is export at compile time. To get the above working, it needs to be in a separate file. To see an example of a programmatically determined name and programmatically determined logic, create the following MyModule.pm6:
unit module MyModule;
sub make-combiner(Any:U ::Type $, &combine-logic) {
anon sub combiner(Type $a, Type $b) {
return combine-logic($a, $b);
}
}
my Str $name = 'int';
my $type = Int;
my package EXPORT::DEFAULT {
OUR::{"&{$name}-eater"} := make-combiner $type, {$^a + $^b};
}
Invoke Perl 6:
perl6 -I. -MMyModule -e "say int-eater(4, 3);"
As hoped, the output is 7. Note that in this version, I used anon sub, which lets you name the "anonymous" generated function. I understand this is mainly useful for generating better stack traces.
All that said, I'm having trouble dynamically setting a postfix operator's precedence. I think you need to modify the Precedence role of the operator, or create it yourself instead of letting the compiler create it for you. This isn't documented.

Adding user mode types for Perl 6 NativeCall structs

The Perl 6 docs list a bunch of types. Some of them, such as Str, have more complicated box/unbox behaviors.
Is it possible to define my own type, specifying my own routines for the box/unboxing? For a particular project, I have a bunch of types I'm reusing, and basically cut/pasting my accessor functions over and over.
For example, the C Struct uses a time_t, and I plug in accessor methods to go to/from a DateTime. Another example is a comma-separated list, I'd like to go to/from an Array and take care of the split/join automagically.
Is there a better way to do this?
Edit: Add Example:
constant time_t = uint64;
constant FooType_t = uint16;
enum FooType <A B C>;
class Foo is repr('CStruct') is rw
{
has uint32 $.id;
has Str $.name;
has FooType_t $.type;
has time_t $.time;
method name(Str $n?) {
$!name := $n with $n;
$!name;
}
method type(FooType $t?) {
$!type = $t with $t;
FooType($!type);
}
method time(DateTime $d?) {
$!time = .Instant.to-posix[0].Int with $d;
DateTime.new($!time)
}
}
my $f = Foo.new;
$f.id = 12;
$f.name('myname');
$f.type(B);
$f.time(DateTime.new('2000-01-01T12:34:56Z'));
say "$f.id() $f.name() $f.type() $f.time()";
# 12 myname B 2000-01-01T12:34:56Z
This works, I can set the various fields of the CStruct in Perl-ish ways (no lvalue, but I can pass them in as parameters).
Now I want to use time_t, FooType_t, etc. for many fields in a lot of structs and have them act the same way. Is there a better way other than to just copy those methods over and over?
Maybe macros could help here? I haven't mastered them yet.
You could write a trait that handles automatic attribute conversion on fetching or storing the attribute. The following should get you started:
multi sub trait_mod:<is>(Attribute:D $attr, :$autoconv!) {
use nqp;
my $name := $attr.name;
$attr.package.^add_method: $name.substr(2), do given $attr.type {
when .REPR eq 'P6int' {
method () is rw {
my $self := self;
Proxy.new:
FETCH => method () {
$autoconv.out(nqp::getattr_i($self, $self.WHAT, $name));
},
STORE => method ($_) {
nqp::bindattr_i($self, $self.WHAT, $name,
nqp::decont($autoconv.in($_)));
}
}
}
default {
die "FIXME: no idea how to handle {.^name}";
}
}
}
For example, take your use case of time_t:
constant time_t = uint64;
class CTimeConversion {
multi method in(Int $_ --> time_t) { $_ }
multi method in(DateTime $_ --> time_t) { .posix }
method out(time_t $_ --> DateTime) { DateTime.new($_) }
}
class CTimeSpan is repr<CStruct> {
has time_t $.start is autoconv(CTimeConversion);
has time_t $.end is autoconv(CTimeConversion);
}
Finally, some example code to show it works:
my $span = CTimeSpan.new;
say $span;
say $span.end;
$span.end = DateTime.now;
say $span;
say $span.end;

How does one write custom accessor methods in Perl6?

How does one write custom accessor methods in Perl6?
If I have this class:
class Wizard {
has Int $.mana is rw;
}
I can do this:
my Wizard $gandalf .= new;
$gandalf.mana = 150;
Let's say I want to add a little check to a setter in my Perl6 class without giving up the $gandalf.mana = 150; notation (in other words, I don't want to write this: $gandalf.setMana(150);). The program should die, if it tries to set a negative mana. How do I do this? The Perl6 documentation just mentions it is possible to write custom accessors, but does not say how.
With more recent versions of Rakudo there is a subset named UInt that restricts it to positive values.
class Wizard {
has UInt $.mana is rw;
}
So that you're not stuck in a lurch if you need to something like this; here is how that is defined:
( you can leave off the my, but I wanted to show you the actual line from the Rakudo source )
my subset UInt of Int where * >= 0;
You could also do this:
class Wizard {
has Int $.mana is rw where * >= 0;
}
I would like to point out that the * >= 0 in the where constraint is just a short way to create a Callable.
You could have any of the following as a where constraint:
... where &subroutine # a subroutine that returns a true value for positive values
... where { $_ >= 0 }
... where -> $a { $a >= 0 }
... where { $^a >= 0 }
... where $_ >= 0 # statements also work ( 「$_」 is set to the value it's testing )
( If you wanted it to just not be zero you could also use ... where &prefix:<?> which is probably better spelled as ... where ?* or ... where * !== 0 )
If you feel like being annoying to people using your code you could also do this.
class Wizard {
has UInt $.mana is rw where Bool.pick; # accepts changes randomly
}
If you want to make sure the value "makes sense" when looking at all of the values in the class in aggregate, you will have to go to a lot more work.
( It may require a lot more knowledge of the implementation as well )
class Wizard {
has Int $.mana; # use . instead of ! for better `.perl` representation
# overwrite the method the attribute declaration added
method mana () is rw {
Proxy.new(
FETCH => -> $ { $!mana },
STORE => -> $, Int $new {
die 'invalid mana' unless $new >= 0; # placeholder for a better error
$!mana = $new
}
)
}
}
You can get the same accessor interface that saying $.mana provides by declaring a method is rw. Then you can wrap a proxy around the underlying attribute like so:
#!/usr/bin/env perl6
use v6;
use Test;
plan 2;
class Wizard {
has Int $!mana;
method mana() is rw {
return Proxy.new:
FETCH => sub ($) { return $!mana },
STORE => sub ($, $mana) {
die "It's over 9000!" if ($mana // 0) > 9000;
$!mana = $mana;
}
}
}
my Wizard $gandalf .= new;
$gandalf.mana = 150;
ok $gandalf.mana == 150, 'Updating mana works';
throws_like sub {
$gandalf.mana = 9001;
}, X::AdHoc, 'Too much mana is too much';
Proxy is basically a way to intercept read and write calls to storage and do something other than the default behavior. As their capitalization suggests, FETCH and STORE are called automatically by Perl to resolve expressions like $gandalf.mana = $gandalf.mana + 5.
There's a fuller discussion, including whether you should even attempt this, at PerlMonks. I would recommend against the above -- and public rw attributes in general. It's more a display of what it is possible to express in the language than a useful tool.

How to test a function's output (stdout/stderr) in unit tests

I have a simple function I want to test:
func (t *Thing) print(min_verbosity int, message string) {
if t.verbosity >= minv {
fmt.Print(message)
}
}
But how can I test what the function actually sends to standard output? Test::Output does what I want in Perl. I know I could write all my own boilerplate to do the same in Go (as described here):
orig = os.Stdout
r,w,_ = os.Pipe()
thing.print("Some message")
var buf bytes.Buffer
io.Copy(&buf, r)
w.Close()
os.Stdout = orig
if(buf.String() != "Some message") {
t.Error("Failure!")
}
But that's a lot of extra work for every single test. I'm hoping there's a more standard way, or perhaps an abstraction library to handle this.
One thing to also remember, there's nothing stopping you from writing functions to avoid the boilerplate.
For example I have a command line app that uses log and I wrote this function:
func captureOutput(f func()) string {
var buf bytes.Buffer
log.SetOutput(&buf)
f()
log.SetOutput(os.Stderr)
return buf.String()
}
Then used it like this:
output := captureOutput(func() {
client.RemoveCertificate("www.example.com")
})
assert.Equal(t, "removed certificate www.example.com\n", output)
Using this assert library: http://godoc.org/github.com/stretchr/testify/assert.
You can do one of three things. The first is to use Examples.
The package also runs and verifies example code. Example functions may include a concluding line comment that begins with "Output:" and is compared with the standard output of the function when the tests are run. (The comparison ignores leading and trailing space.) These are examples of an example:
func ExampleHello() {
fmt.Println("hello")
// Output: hello
}
The second (and more appropriate, IMO) is to use fake functions for your IO. In your code you do:
var myPrint = fmt.Print
func (t *Thing) print(min_verbosity int, message string) {
if t.verbosity >= minv {
myPrint(message) // N.B.
}
}
And in your tests:
func init() {
myPrint = fakePrint // fakePrint records everything it's supposed to print.
}
func Test...
The third is to use fmt.Fprintf with an io.Writer that is os.Stdout in production code, but bytes.Buffer in tests.
You could consider adding a return statement to your function to return the string that is actually printed out.
func (t *Thing) print(min_verbosity int, message string) string {
if t.verbosity >= minv {
fmt.Print(message)
return message
}
return ""
}
Now, your test could just check the returned string against an expected string (rather than the print out). Maybe a bit more in-line with Test Driven Development (TDD).
And, in your production code, nothing would need to change, since you don't have to assign the return value of a function if you don't need it.

How do I write a generic memoize function?

I'm writing a function to find triangle numbers and the natural way to write it is recursively:
function triangle (x)
if x == 0 then return 0 end
return x+triangle(x-1)
end
But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]
That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html
I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!
You're also asking the wrong question for your original problem ;)
This is a better way for that case:
triangle(n) = n * (n - 1) / 2
Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.
I bet something like this should work with variable argument lists in Lua:
local function varg_tostring(...)
local s = select(1, ...)
for n = 2, select('#', ...) do
s = s..","..select(n,...)
end
return s
end
local function memoize(f)
local cache = {}
return function (...)
local al = varg_tostring(...)
if cache[al] then
return cache[al]
else
local y = f(...)
cache[al] = y
return y
end
end
end
You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.
In C# 3.0 - for recursive functions, you can do something like:
public static class Helpers
{
public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>, R> f)
{
var map = new Dictionary<A, R>();
Func<A, R> self = null;
self = (a) =>
{
R value;
if (map.TryGetValue(a, out value))
return value;
value = f(a, self);
map.Add(a, value);
return value;
};
return self;
}
}
Then you can create a memoized Fibonacci function like this:
var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
In Scala (untested):
def memoize[A, B](f: (A)=>B) = {
var cache = Map[A, B]()
{ x: A =>
if (cache contains x) cache(x) else {
val back = f(x)
cache += (x -> back)
back
}
}
}
Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:
val correctMem = memoize(memoize _)
I don't think that this will compile, but it does illustrate the idea.
Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.
I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.
Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.
If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.
function memoize (f)
local cache = {}
return function (x)
if cache[x] then
return cache[x]
else
local y = f(x)
cache[x] = y
return y
end
end
end
triangle = memoize(triangle);
Note that to avoid a stack overflow, triangle would still need to be seeded.
Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.
local function m(f)
local t = { }
local function mf(x, ...) -- memoized f
assert(x ~= nil, 'nil passed to memoized function')
if select('#', ...) > 0 then
t[x] = t[x] or m(function(...) return f(x, ...) end)
return t[x](...)
else
t[x] = t[x] or f(x)
assert(t[x] ~= nil, 'memoized function returns nil')
return t[x]
end
end
return mf
end
I've been inspired by this question to implement (yet another) flexible memoize function in Lua.
https://github.com/kikito/memoize.lua
Main advantages:
Accepts a variable number of arguments
Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
Works just fine with functions that return multiple values.
Pasting the code here as reference:
local globalCache = {}
local function getFromCache(cache, args)
local node = cache
for i=1, #args do
if not node.children then return {} end
node = node.children[args[i]]
if not node then return {} end
end
return node.results
end
local function insertInCache(cache, args, results)
local arg
local node = cache
for i=1, #args do
arg = args[i]
node.children = node.children or {}
node.children[arg] = node.children[arg] or {}
node = node.children[arg]
end
node.results = results
end
-- public function
local function memoize(f)
globalCache[f] = { results = {} }
return function (...)
local results = getFromCache( globalCache[f], {...} )
if #results == 0 then
results = { f(...) }
insertInCache(globalCache[f], {...}, results)
end
return unpack(results)
end
end
return memoize
Here is a generic C# 3.0 implementation, if it could help :
public static class Memoization
{
public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
var cache = new Dictionary<T, TResult>();
var nullCache = default(TResult);
var isNullCacheSet = false;
return parameter =>
{
TResult value;
if (parameter == null && isNullCacheSet)
{
return nullCache;
}
if (parameter == null)
{
nullCache = function(parameter);
isNullCacheSet = true;
return nullCache;
}
if (cache.TryGetValue(parameter, out value))
{
return value;
}
value = function(parameter);
cache.Add(parameter, value);
return value;
};
}
}
(Quoted from a french blog article)
In the vein of posting memoization in different languages, i'd like to respond to #onebyone.livejournal.com with a non-language-changing C++ example.
First, a memoizer for single arg functions:
template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
template <class F>
const Result& operator()(F f, const Arg& a){
typename ResultStore::const_iterator it = memo_.find(a);
if(it == memo_.end()) {
it = memo_.insert(make_pair(a, f(a))).first;
}
return it->second;
}
private:
ResultStore memo_;
};
Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).
Next, a driver functon, and an implementation. only the driver function need be public
int fib(int); // driver
int fib_(int); // implementation
Implemented:
int fib_(int n){
++total_ops;
if(n == 0 || n == 1)
return 1;
else
return fib(n-1) + fib(n-2);
}
And the driver, to memoize
int fib(int n) {
static memoizer1<int,int> memo;
return memo(fib_, n);
}
Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)
This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.
In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.
The example from it's manpage:
# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments); # Is faster than it was before
You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.
Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!
Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html
Extending the idea, it's also possible to memoize functions with two input parameters:
function memoize2 (f)
local cache = {}
return function (x, y)
if cache[x..','..y] then
return cache[x..','..y]
else
local z = f(x,y)
cache[x..','..y] = z
return z
end
end
end
Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.
But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.
function gcd (a, b)
if b == 0 then return a end
return gcd(b, a%b)
end
As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.
Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...
public int triangle(final int n){
return n * (n - 1) / 2;
}
Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.
int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
sum+=i;
memo[i] = sum;
}
return memo[n];