I'm trying to make a Hash with non-string keys, in my case arrays or lists.
> my %sum := :{(1, 3, 5) => 9, (2, 4, 6) => 12}
{(1 3 5) => 9, (2 4 6) => 12}
Now, I don't understand the following.
How to retrieve an existing element?
> %sum{(1, 3, 5)}
((Any) (Any) (Any))
> %sum{1, 3, 5}
((Any) (Any) (Any))
How to add a new element?
> %sum{2, 4} = 6
(6 (Any))
Several things are going on here: first of all, if you use (1,2,3) as a key, Rakudo Perl 6 will consider this to be a slice of 3 keys: 1, 2 and 3. Since neither of these exist in the object hash, you get ((Any) (Any) (Any)).
So you need to indicate that you want the list to be seen as a single key whose value you want. You can do this with $(), so %sum{$(1,3,5)}. This, however, does not give you the intended result. The reason behind that is the following:
> say (1,2,3).WHICH eq (1,2,3).WHICH
False
Object hashes internally key the object to its .WHICH value. At the moment, Lists are not considered value types, so each List has a different .WHICH. Which makes them unfit to be used as keys in object hashes, or in other cases where they are used by default (e.g. .unique and Sets, Bags and Mixes).
I'm actually working on making the above eq return True before long: this should make it into the 2018.01 compiler release, on which a Rakudo Star release will also be based.
BTW, any time you're using object hashes and integer values, you will probably be better off using Bags. Alas, not yet in this case either, for the above reason.
You could actually make this work by using augment class List and adding a .WHICH method on that, but I would recommend against that as it will interfere with any future fixes.
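In the meantime, a minimal workaround sketch is to skip object hashes entirely and key an ordinary Hash on a canonical string form of the list (assuming the elements stringify unambiguously):
my %sum;
%sum{ (1, 3, 5).join(',') } = 9;
%sum{ (2, 4, 6).join(',') } = 12;
say %sum{ (1, 3, 5).join(',') };   # 9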
Elizabeth's answer is solid, but until that feature is created, I don't see why you can't create a Key class to use as the hash key, one with an explicit hash function based on its values rather than its location in memory. This hash function, used both for placement in the hash and for equality testing, is .WHICH. It must return an ObjAt object, which is basically just a string.
class Key does Positional {
    has Int @.list handles <elems AT-POS EXISTS-POS ASSIGN-POS BIND-POS push>;
    method new(*@list) { self.bless(:@list); }
    method WHICH() { ObjAt.new(@!list.join('|')); }
}
my %hsh{Key};
%hsh{Key.new(1, 3)} = 'result';
say %hsh{Key.new(1, 3)}; # output: result
Note that I only allowed the key to contain Int. This is an easy way of being fairly confident that no element's string value contains the '|' character, which could make two keys look the same despite having different elements. However, this is not hardened against naughty users: 4 but role :: { method Str() { '|' } } is an Int that stringifies to the illegal value. You can make the code stronger if you use .WHICH recursively, but I'll leave that as an exercise.
This Key class is also a little fancier than you strictly need. It would be enough to have a @.list member and define .WHICH. I defined AT-POS and friends so the Key can be indexed, pushed to, and otherwise treated as an Array.
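For example, a small sketch of that Array-like behaviour, assuming the Key class above:
my $k = Key.new(2, 4);
say $k[0];        # 2  (AT-POS delegated to @!list)
$k.push(6);       # push delegated to @!list
say $k.elems;     # 3
say $k.WHICH;     # 2|4|6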
I have data in the following format
ArrayList<Map.Entry<String,ByteString>>
[
{"a":[a-bytestring]},
{"b":[b-bytestring]},
{"a:model":[amodel-bytestring]},
{"b:model":[bmodel-bytestring]},
]
I am looking for a clean way to transform this data into the format (List<Map.Entry<ByteString,ByteString>>) where the key is the value of a and the value is the value of a:model.
Desired output
List<Map.Entry<ByteString,ByteString>>
[
{[a-bytestring]:[amodel-bytestring]},
{[b-bytestring]:[bmodel-bytestring]}
]
I assume this will involve the use of filters or other map operations, but I am not familiar enough with Kotlin yet to know how to do this.
It's not possible to give an exact, tested answer without access to the ByteString class — but I don't think that's needed for an outline, as we don't need to manipulate byte strings, just pass them around. So here I'm going to substitute Int; it should be clear and avoid any dependencies, but still work in the same way.
I'm also going to use a more obvious input structure, which is simply a map:
val input = mapOf("a" to 1,
"b" to 2,
"a:model" to 11,
"b:model" to 12)
As I understand it, what we want is to link each key without :model with the corresponding one with :model, and return a map of their corresponding values.
That can be done like this:
val output = input.filterKeys{ !it.endsWith(":model") }
                  .map{ it.value to input["${it.key}:model"] }.toMap()
println(output) // Prints {1=11, 2=12}
The first line filters out all the entries whose keys end with :model, leaving only those without. Then the second creates a map from their values to the input values for the corresponding :model keys. (Unfortunately, there's no good general way to create one map directly from another; here map() creates a list of pairs, and then toMap() creates a map from that.)
I think if you replace Int with ByteString (or indeed any other type!), it should do what you ask.
The only thing to be aware of is that the output is a Map<Int, Int?> — i.e. the values are nullable. That's because there's no guarantee that each input key has a corresponding :model key; if it doesn't, the result will have a null value. If you want to omit those, you could call filterValues{ it != null } on the result.
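A small sketch of that last step. Note that filterValues alone does not change the declared value type from Int? to Int; a further mapping or cast would be needed if a Map<Int, Int> is required.
// Drop entries whose ":model" counterpart was missing (value is null).
val withoutNulls = output.filterValues { it != null }
println(withoutNulls)   // {1=11, 2=12}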
However, if there's an ‘orphan’ :model key in the input, it will be ignored.
I want to write a function returning an array all of whose subarrays have a length of two.
For example, the return value will be [[1, 2], [3, 4]].
I define:
(1) subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
and
sub fcn(Int $n) of TArray is export(:fcn) {
    [[1, 2], [3, 4]];
}
I find (1) over-complicated. Is there something simpler?
Stepping back first
subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
Is there something simpler?
Before we go there, let's step back. Even ignoring your code's "overly-complicated" nature based on just looking at it, it's also potentially problematic and complicated for various reasons that may not be so obvious. I'll highlight three:
This subset will accept an Array containing Arrays, with each of those arrays containing two Ints. But it doesn't mandate an Array[Array[Int]]. The outer Array's type may be just a generic Array, rather than being an Array[Array], let alone an Array[Array[Int]]. Indeed it will be unless you deliberately introduce strongly typed values. I will cover strong typing in the last section of this answer.
What about an empty Array? Your subset will accept that. Is that your intent? If not, what about requiring at least one pair of Ints?
The outer where clause uses a common Raku idiom of the form .all ~~ ..., with a junction on the left hand side of the ~~ smart match operator. Astonishingly, per an issue I just filed, this may be a problem. What alternatives are there?
Starting simple
Raku does a decent job of keeping simple things simple. If we put aside any artificial desire for strong typing, and focus on simple tools for tightening code up, a simple subset I would have suggested in the past would be:
subset TArray where .all == 2; # BAD despite being idiomatic???
This has all of the problems your original code has, and in addition it accepts data that has non-integers where integers belong.
But it does have the redeeming qualities that it does a useful check (that the inner arrays each have two elements) and it's significantly simpler than your code.
Now that I've reminded myself that I need to view .all on the left hand side of ~~ as possibly a problem, I'll instead write it as:
subset TArray where 2 == .all; # Potentially the new idiomatic.
This version reads more poorly, but, while readability is important, basic correctness is more important.
Still fairly simple, with fewer problems
Here are two variants I came up with:
subset TArray where all .map: * ~~ (Int,Int);
subset TArray where .elems == .grep: (Int,Int);
These both avoid the junction/smartmatch problem. (The first where expression does have a junction to the left of a smart match, but it's not an example of the problem.)
The second version isn't so obviously correct (think of it as checking that the count of subarrays is the same as the count of subarrays that match (Int,Int)) but it nicely lends itself to fixing the problem of matching if there are zero subarrays, if that were to need fixing:
subset TArray where 0 < .elems == .grep: (Int,Int);
Strong typing solutions
The solutions thus far don't deal with strong typing. Perhaps that's desirable. Perhaps not.
To understand what I mean by this, let's first look at literals:
say WHAT 1; # (Int)
say WHAT [1,2]; # (Array)
say WHAT [[1,2],[3,4]]; # (Array)
These values have types determined by their literal constructors.
The last two are just Arrays, generic over their elements.
(The second is not an Array[Int], which might be expected. Similarly the last one is not an Array[Array[Int]].)
Current built in Raku literal forms for composite types (arrays and hashes) all construct generic Arrays which do not constrain the types of their elements.
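A small sketch of that genericity, using .of to report the element type constraint:
say [[1,2],[3,4]].of;          # (Mu)  -- the literal's element type is unconstrained
say Array[Int].new(1,2).of;    # (Int)
say [1,2] ~~ Array[Int];       # False -- the literal is only a plain Array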
See the PR Introduce [1,2,3]:Int syntax #4406 for a proposal/PR regarding element typed composite literals and a related issue I just posted in response to your Q here about an alternative and/or complementary approach to that PR. (There have been discussions over the years about this aspect of the type system but it seems like it's time for Rakoons to look at addressing it.)
What if you wanted to build a strongly typed data structure as the value to return from your routine, and to have the return type check that?
Here's one way one might build such a strongly typed value:
my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
Super verbose! But now you could write the following for your sub's return type check and it'll work:
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
sub fcn(Int $n) of TArray is export(:fcn) {
    my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
}
Another way to build a strongly typed value is to specify not only the strong typing in a variable's type constraint, but also coercion typing to bridge from a loosely typed value to a strongly typed target.
We keep the exact same subset (that establishes the strongly typed target data structure and adds "refinement typing" checks):
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
But instead of using a verbose correct-by-construction initialization value, using full type names and news, we introduce additional coercion typing and then just use ordinary literal syntax:
constant TArrayInitialization = TArray(Array[Array[Int]()]());
sub fcn(Int $n) of TArray is export(:fcn) {
    my TArrayInitialization $result = [[1,2],[3,4]];
}
(I could have written the TArrayInitialization declaration as another subset, but it would be a slight overkill to have done so. A constant does the job with less fuss.)
I gather that the aim is to restrict the type of the inner Array to [Int,Int] ... the closest I can get to this is to declare two subsets, one based on the other...
subset IArray where * ~~ [Int, Int];
subset TArray where .all ~~ IArray;
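A quick sketch of those two subsets in action (expected results, assuming the declarations above):
say [[1,2],[3,4]] ~~ TArray;   # True
say [[1,2],[3]]   ~~ TArray;   # False: [3] does not match [Int, Int]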
Otherwise, the anonymous subset form you use seems to be the briefest, although as @raiph points out you can drop the 'of Array' piece.
If you wanted to impose this sort of constraint on a function's parameter (rather than its return type) you could do so with something like:
sub fcn(@a where {all .map: * ~~ [Int, Int]}) {...}
As the other answers have mentioned, there currently isn't great syntax for similarly constraining the return type, but there's a proposal to add support for similar syntax for return types. In fact, as mentioned in that issue, someone has volunteered to work on an implementation but hasn't yet made any progress as far as I know. (And I guess I should know, since I was that volunteer… oops)
So, for now, a subset is the best option – but hopefully the future will have even better ways to write that.
It seems like there'd be some use in knowing a good pattern for making an n-step composition or pipeline from a binary function. Maybe it's obvious or common knowledge.
What I was trying to do was R.either(predicate1, predicate2, predicate3, ...), but R.either is one of these binary functions. I thought R.composeWith might be part of a good solution but didn't get it to work right. Then I thought R.o might be at the heart of it, or perhaps R.chain somehow.
Maybe there's a totally different way to make an n-ary either that could be better than a "compose-with"(R.either)... I'm interested if so, but I'm trying to ask a more general question than that.
One common way of converting a binary function into one that takes many arguments is by using R.reduce. This requires the binary function's arguments and its return value to all be of the same type.
For your example with R.either, it would look like:
const eithers = R.reduce(R.either, R.F)
const fooOr42 = eithers([ R.equals("foo"), R.equals(42) ])
This accepts a list of predicate functions that will each be given as arguments to R.either.
The fooOr42 example above is equivalent to:
const fooOr42 = R.either(R.either(R.F, R.equals("foo")), R.equals(42))
You can also make use of R.unapply if you want to convert the function from accepting a list of arguments, to a variable number of arguments.
const eithers = R.unapply(R.reduce(R.either, R.F))
const fooOr42 = eithers(R.equals("foo"), R.equals(42))
The approach above can be used for any type that can be combined to produce a value of the same type, where the type has some "monoid" instance. This just means that we have a binary function that combines the two types together and some "empty" value, which satisfy some simple laws:
Associativity: combine(a, combine(b, c)) == combine(combine(a, b), c)
Left identity: combine(empty, a) == a
Right identity: combine(a, empty) == a
Some examples of common types with a monoid instance include:
arrays, where the empty list is the empty value and concat is the binary function.
numbers, where 1 is the empty value and multiply is the binary function
numbers, where 0 is the empty value and add is the binary function
In the case of your example, we have predicates (a function returning a boolean value), where the empty value is R.F (a.k.a (_) => false) and the binary function is R.either. You can also combine predicates using R.both with an empty value of R.T (a.k.a (_) => true), which will ensure the resulting predicate satisfies all of the combined predicates.
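For instance, a small hedged sketch of that R.both / R.T combination, using the same reduce-based pattern as above:
const alls = R.unapply(R.reduce(R.both, R.T))
const smallEven = alls(x => x % 2 === 0, x => x < 100)
console.log(smallEven(42))    // true: even and below 100
console.log(smallEven(101))   // false: fails the first predicate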
It is probably also worth mentioning that you could alternatively just use R.anyPass :)
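For comparison, a minimal sketch of the R.anyPass form of the same predicate:
const fooOr42 = R.anyPass([R.equals("foo"), R.equals(42)])
console.log(fooOr42(42))      // true
console.log(fooOr42("bar"))   // false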
When calling List.indexOf(...), what are the advantages of returning -1 rather than null if the value isn't present?
For example:
val list = listOf("a", "b", "c")
val index = list.indexOf("d")
print(index) // Prints -1
Wouldn't it be a cleaner result if index were null instead? If it had an optional return type, then it would be compatible with the Elvis operator ?: as well as with constructs such as index?.let { ... }.
What are the advantages of returning -1 instead of null when there are no matches?
Just speculation, but I can think of two reasons:
The first reason is to be compatible with Java and its List.indexOf
As the documentation states:
Returns:
the index of the first occurrence of the specified element in this list, or -1 if this list does not contain the element
The second reason is to have the same datatype as Kotlin's binarySearch, whose documentation states:
Return the index of the element, if it is contained in the list within the specified range; otherwise, the inverted insertion point (-insertion point - 1). The insertion point is defined as the index at which the element should be inserted, so that the list (or the specified subrange of list) still remains sorted.
Here the negative values actually hold additional information about where to insert the element if it is absent. But since the normal indexOf method works on unsorted collections, you cannot infer the insertion position.
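A small sketch of decoding that return value with List.binarySearch on a sorted list:
val sorted = listOf(1, 3, 5, 7)
val r = sorted.binarySearch(4)                 // 4 is absent, so r is negative
if (r >= 0) println("found at index $r")
else println("insertion point: ${-r - 1}")     // prints "insertion point: 2"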
To add to the definitive answer of @Burdui, another reason for such behavior is that the -1 return value can be expressed with the same primitive Int type as the other possible results of the indexOf function.
If indexOf returned null, it would require making its return type nullable, Int?, and that would cause a primitive return value being boxed into an object. indexOf is often used in a tight loop, for example, when searching for all occurrences of a substring in a string, and having boxing on that hot path could make the cost of using indexOf prohibitive.
On the other hand, there definitely can be situations where performance doesn't matter that much, and returning null from indexOf would make the code more expressive. There's a request, KT-8133, to introduce an indexOfOrNull extension for such situations.
Meanwhile, a workaround of calling .takeIf { it >= 0 } on the result of indexOf allows you to achieve the same thing.
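For example, a minimal sketch of that workaround:
val list = listOf("a", "b", "c")
val index: Int? = list.indexOf("d").takeIf { it >= 0 }
println(index)                           // null
index?.let { println("found at $it") }   // not executed, since index is null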
Could someone explain why this accesses the last element in Perl 6
@array[*-1]
and why we need the asterisk *?
Isn't it more logical to do something like this:
@array[-1]
The user documentation explains that *-1 is just a code object, which could also be written as
-> $n { $n - 1 }
When passed to [ ], it will be invoked with the array size as argument to compute the index.
So instead of just being able to start counting backwards from the end of the array, you could use it to, e.g., count forwards from its center via
@array[* div 2]       #=> middlemost element
@array[* div 2 + 1]   #=> next element after the middlemost one
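In fact any callable works in the subscript, since it is simply invoked with the number of elements; a small sketch:
my @array = <a b c d e>;
say @array[*-1];                 # e
say @array[-> $n { $n - 1 }];    # e -- the same index computation, written out as a block
say @array[* div 2];             # c -- the middlemost of five elements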
According to the design documents, the reason for outlawing negative indices (which could have been accepted even with the above generalization in place) is this:
The Perl 6 semantics avoids indexing discontinuities (a source of subtle runtime errors), and provides ordinal access in both directions at both ends of the array.
If you don't like the whatever-star, you can also do:
my $last-elem = @array.tail;
or even
my ($second-last, $last) = @array.tail(2);
Edit: Of course, there's also a head method:
my ($first, $second) = @array.head(2);
The other two answers are excellent. My only reason for answering was to add a little more explanation about the Whatever Star * array indexing syntax.
The equivalent of Perl 6's @array[*-1] syntax in Perl 5 would be $array[ scalar @array - 1]. In Perl 5, in scalar context an array returns the number of items it contains, so scalar @array gives you the length of the array. Subtracting one from this gives you the last index of the array.
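For completeness, a tiny Perl 5 sketch ($#array being the conventional shorthand for the last index):
my @array = ('a', 'b', 'c');
print $array[ scalar @array - 1 ], "\n";   # c
print $array[ $#array ], "\n";             # c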
Since in Perl 6 indices can be restricted to never be negative, if they are negative then they are definitely out of range. But in Perl 5, a negative index may or may not be "out of range". If it is out of range, then it only gives you an undefined value which isn't easy to distinguish from simply having an undefined value in an element.
For example, the Perl 5 code:
use v5.10;
use strict;
use warnings;
my @array = ('a', undef, 'c');
say $array[-1]; # 'c'
say $array[-2]; # undefined
say $array[-3]; # 'a'
say $array[-4]; # out of range
say "======= FINISHED =======";
results in two nearly identical warnings, but still finishes running:
c
Use of uninitialized value $array[-2] in say at array.pl line 7.
a
Use of uninitialized value in say at array.pl line 9.
======= FINISHED =======
But the Perl 6 code
use v6;
my @array = 'a', Any, 'c';
put @array[*-1]; # evaluated as @array[2] or 'c'
put @array[*-2]; # evaluated as @array[1] or Any (i.e. undefined)
put @array[*-3]; # evaluated as @array[0] or 'a'
put @array[*-4]; # evaluated as @array[-1], which throws an out-of-range error
put "======= FINISHED =======";
will likewise warn about the undefined value being used, but it fails upon the use of an index that comes out less than 0:
c
Use of uninitialized value @array of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
in block <unit> at array.p6 line 5
a
Effective index out of range. Is: -1, should be in 0..Inf
in block <unit> at array.p6 line 7
Actually thrown at:
in block <unit> at array.p6 line 7
Thus your Perl 6 code can be more robust by not allowing negative indices, but you can still index from the end using the Whatever Star syntax.
Last word of advice
If you just need the last few elements of an array, I'd recommend using the tail method mentioned in mscha's answer. @array.tail(3) is much more self-explanatory than @array[*-3 .. *-1].
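Both forms return the same elements; a quick sketch:
my @array = 1..10;
say @array.tail(3);       # (8 9 10)
say @array[*-3 .. *-1];   # (8 9 10)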