Performant math operations on large Perl6 CArrays? - raku

I have some large CArrays returned by a native sub that I need to perform basic element-wise math operations on. The CArrays are usually on the order of 10^6 elements. I have found that calling .list on them to treat them as normal Perl6 types is very expensive. Is there a way to do performant element-wise operations on them while keeping them as CArrays?
Short test script to time some methods I've tried:
#!/usr/bin/env perl6
use v6.c;
use NativeCall;
use Terminal::Spinners;
my $list;
my $carray;
my $spinner = Spinner.new;
########## create data structures ##########
print "Creating 10e6 element List and CArray ";
my $create = Promise.start: {
$list = 42e0 xx 10e6;
$carray = CArray[num32].new($list);
}
$spinner.await: $create;
########## time List subtractions ##########
my $time = now;
print "Substracting two 10e6 element Lists w/ hyper ";
$spinner.await( Promise.start: {$list >>-<< $list} );
say "List hyper subtraction took: {now - $time} seconds";
$time = now;
print "Substracting two 10e6 element Lists w/ for loop ";
my $diff = Promise.start: {
for ^$list.elems {
$list[$_] - $list[$_];
}
}
$spinner.await: $diff;
say "List for loop subtraction took: {now - $time} seconds";
########## time CArray subtractions ##########
$time = now;
print "Substracting two 10e6 element CArrays w/ .list and hyper ";
$spinner.await( Promise.start: {$carray.list >>-<< $carray.list} );
say "CArray .list and hyper subtraction took: {now - $time} seconds";
$time = now;
print "Substracting two 10e6 element CArrays w/ for loop ";
$diff = Promise.start: {
for ^$carray.elems {
$carray[$_] - $carray[$_];
}
}
$spinner.await: $diff;
say "CArray for loop subtraction took: {now - $time} seconds";
Output:
Creating 10e6 element List and CArray |
Substracting two 10e6 element Lists w/ hyper -
List hyper subtraction took: 26.1877042 seconds
Substracting two 10e6 element Lists w/ for loop -
List for loop subtraction took: 20.6394032 seconds
Substracting two 10e6 element CArrays w/ .list and hyper /
CArray .list and hyper subtraction took: 164.4888844 seconds
Substracting two 10e6 element CArrays w/ for loop |
CArray for loop subtraction took: 133.00560218 seconds
The for loop method seems fastest, but a CArray still took 6x longer to process than a List.
Any ideas?

As long as you can work with a different native data type (Matrix and Vector), you can use the (also native) port of the GNU Scientific Library by Fernando Santagata. It has a Vector.sub method you can use.
#!/usr/bin/env perl6
use v6.c;
use NativeCall;
use Terminal::Spinners;
use Math::Libgsl::Vector;
my $list;
my $carray;
my $spinner = Spinner.new;
########## create data structures ##########
print "Creating two 10e6 element GSL Vectors ";
my $list1 = Math::Libgsl::Vector.new(size => 10e6.Int);
$list1.setall(42);
my $list2 = Math::Libgsl::Vector.new(size => 10e6.Int);
$list2.setall(33);
########## time Vector subtraction ##########
my $time = now;
print "Subtracting two 10e6 element GSL Vectors ";
$spinner.await( Promise.start: { $list1.sub( $list2)} );
say "GSL Vector subtraction took: {now - $time} seconds";
This takes:
GSL Vector subtraction took: 0.08243 seconds
Is that fast enough for ya? :-)

Related

Why does itertools.product run through all elements at initialization?

I assumed that itertools.product generates elements one at the time. I am now noticing that it is not true.
Simple proof of concept:
class A:
    def __init__(self, n):
        self.source = iter(range(n))
    def __iter__(self):
        return self
    def __next__(self):
        val = next(self.source)
        print("I am at:", val)
        return val
Now if I do:
from itertools import product
l = product(A(3), A(3))
print("Here")
next(l)
I expect to have as output:
>'Here'
>'I am at 0'
>'I am at 0'
But I have
>'I am at 0'
>'I am at 1'
>'I am at 2'
>'I am at 0'
>'I am at 1'
>'I am at 2'
>'Here'
Am I missing something?
To answer your question we need to look at the implementation of itertools.product:
def product(*args, repeat=1):
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
The real implementation is written in C, but to answer this question the Python equivalent is enough (see the EXTRA paragraph at the bottom).
Focus on this line of code:
pools = [tuple(pool) for pool in args] * repeat
This converts every input iterable into a tuple, so all the elements of the two iterators are produced at that point. In the Python equivalent this would happen on the first call to next(); in the real C implementation it happens as soon as product(...) is called.
Returning to your code: when product(A(3), A(3)) is called, both iterators are exhausted in order to build the pools. In your example, pools ends up holding the following elements:
# pools: [(0, 1, 2), (0, 1, 2)]
which is why you got those outputs.
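As for the print("Here"), to understand why it is printed last you need to understand how this differs from a generator:
The Python equivalent above is a generator function, and a generator does not execute any of its body until it is stimulated by the first next(). If itertools.product() really were that generator, "Here" would be printed first. The real itertools.product() is a C type whose constructor builds the pools immediately, so both iterators are consumed before "Here" is printed. Only the result tuples are produced lazily: each call to next() computes exactly one of them.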
Here you will find excellent resources to better understand how python generators work.
Why did 'itertools' choose to keep the list of tuples in memory?
Because the Cartesian product must visit the same element several times, and an iterator can be consumed only once.
EXTRA
In C, the tuple of pools is built in the same way as in the Python version and, as you can see from this code, it is built eagerly. Each iterable argument is first converted to a tuple:
pools = PyTuple_New(npools);
if (pools == NULL)
goto error;
for (i=0; i < nargs ; ++i) {
PyObject *item = PyTuple_GET_ITEM(args, i);
PyObject *pool = PySequence_Tuple(item);
if (pool == NULL)
goto error;
PyTuple_SET_ITEM(pools, i, pool);
indices[i] = 0;
}
for ( ; i < npools; ++i) {
PyObject *pool = PyTuple_GET_ITEM(pools, i - nargs);
Py_INCREF(pool);
PyTuple_SET_ITEM(pools, i, pool);
indices[i] = 0;
}
I'd like to point out that while for both instances of class A the __next__ method gets called exhaustively (until StopIteration is encountered), the itertools.product iterator is still lazy evaluated with the subsequent calls to next. Notice that:
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
> 'Here'
is just a result of calling exhaustively next first for the first passed instance, and then for the second. This is more readily seen when calling product(A(2), A(3)), which results in:
> 'I am at 0'
> 'I am at 1'
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
The same behavior is observed for combinations and permutations. In fact, searching for "Does itertools.product evaluate its arguments lazily?" brought me to this SO question, which also answers yours. The arguments are not evaluated lazily:
since product sometimes needs to go over an iterable more than once, which is not possible if the arguments were left as iterators that can only be consumed once.
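To make the timing visible without relying on print order, here is a minimal sketch. The Counting class and the log list are illustrative names that are not part of the original question. It records every value pulled from the inputs and snapshots the record before and after the first next(). The behaviour shown is that of CPython's itertools.product.

```python
from itertools import product


class Counting:
    """Iterator over range(n) that records every value it hands out."""

    def __init__(self, n, log):
        self.source = iter(range(n))
        self.log = log

    def __iter__(self):
        return self

    def __next__(self):
        val = next(self.source)  # raises StopIteration when exhausted
        self.log.append(val)
        return val


log = []
pairs = product(Counting(2, log), Counting(3, log))

# Snapshot taken before any next() call on the product object.
consumed_at_construction = list(log)

# The result tuples themselves are still produced one at a time.
first = next(pairs)

# Snapshot taken after the first next(): nothing more was pulled from the inputs.
consumed_after_first_next = list(log)

print(consumed_at_construction)
print(first)
print(consumed_after_first_next)
```

Both snapshots hold the same values, which suggests the inputs are drained when product(...) is called and that later next() calls only read from the stored pools.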

Kotlin - How do I concatenate a String to an Int value?

A very basic question: what is the right approach to concatenate a String to an Int?
I'm new to Kotlin and want to print an Integer value followed by a String, but I'm getting the following error message.
for (i in 15 downTo 10){
print(i + " "); //error: None of the following function can be called with the argument supplied:
print(i); //It's Working but I need some space after the integer value.
}
Expected Outcome
15 14 13 12 11 10
You've got several options:
1. String templates. I think this is the best option. It works exactly like the second solution, but looks better and lets you add any extra characters you need.
print("$i")
and if you want to add something
print("$i ")
print("$i - is good")
To embed an expression, wrap it in curly braces
print("${i + 1} - is better")
2. toString method which can be used for any object in kotlin.
print(i.toString())
3. Java-like solution with concatenating
print("" + i)
$ dollar – Dollar symbol is used in String templates that we’ll be seeing next
for (i in 15 downTo 10){
print("$i ")
}
Output : 15 14 13 12 11 10
You can use kotlin string template for that:
for (i in 15 downTo 10){
print("$i ");
}
https://kotlinlang.org/docs/reference/basic-types.html#string-templates
The Int::toString method does what you're looking for. Instead of explicit loops, consider functional approaches like map:
(15 downTo 10).map(Int::toString).joinToString(" ")
Note that the map part is even redundant since joinToString can handle the conversion internally.
The error you get is because the + you're using is the integer one (it is decided by the left operand). The integer + expects 2 integers. In order to actually use the + of String for concatenation, you would need the string on the left, like "" + i + " ".
That being said, it is more idiomatic in Kotlin to print formatted strings using string templates: "$i "
However, if all you want is to print integers with spaces in between, you can use the stdlib function joinToString():
val output = (15 downTo 10).joinToString(" ")
print(output) // or println() if you want to start a new line after your integers
Just convert to String:
for (i in 15 downTo 10){
print(i.toString() + " ");
}
You should use the $. You can also use +, but it could get confusing in your case because + is also the operator that invokes the plus() method, which is used to sum Integers.
for (i in 15 downTo 10){
print("$i ");
}

Is there a 'clamp' method/sub for ranges/Num etc in Raku (i.e. Perl6)?

Is there a 'clamp' or equivalent method or sub in Perl6?
eg
my $range= (1.0 .. 9.9);
my $val=15.3;
my $clamped=$range.clamp($val);
# $clamped would be 9.9
$val= -1.3;
$clamped=$range.clamp($val);
# $clamped would be 1.0
Another tack you might like to explore is using a Proxy, which allows you to define "hooks" that run when fetching or storing a value from a container
sub limited-num(Range $range) is rw {
my ($min, $max) = $range.minmax;
my Numeric $store = $min;
Proxy.new(
FETCH => method () { $store },
STORE => method ($new) {
$store = max($min, min($max, $new));
}
)
}
# Note the use of binding operator `:=`
my $ln := limited-num(1.0 .. 9.9);
say $ln; # OUTPUT: 1
$ln += 4.2;
say $ln; # OUTPUT: 5.2
$ln += 100;
say $ln; # OUTPUT: 9.9
$ln -= 50;
say $ln; # OUTPUT: 1
$ln = 0;
say $ln; # OUTPUT: 1
This particular limited-num will initialise with its min value, but you can also set it at declaration
my $ln1 := limited-num(1.0 .. 9.9) = 5.5;
say $ln1; # OUTPUT 5.5;
my $ln2 := limited-num(1.0 .. 9.9) = 1000;
say $ln2; # OUTPUT 9.9
I don't think so. So, perhaps:
multi clamp ($range, $value) {
given $range {
return .max when (($value cmp .max) === More);
return .min when (($value cmp .min) === Less);
}
return $value
}
my $range = (1.0 .. 9.9);
say $range.&clamp: 15.3; # 9.9
say $range.&clamp: -1.3; # 1
my $range = 'b'..'y';
say $range.&clamp: 'a'; # b
say $range.&clamp: 'z'; # y
The MOP allows direct exploration of the objects available in your P6 system. A particularly handy metamethod is .^methods which works on most built in objects:
say Range.^methods; # (new excludes-min excludes-max infinite is-int ...
By default this includes just the methods defined in the Range class, not the methods it inherits. (To get them all you could use say Range.^methods: :all. That'll net you a much bigger list.)
When I just tried it I found it also included a lot of methods unhelpfully named Method+{is-nodal}.new. So maybe use this instead:
say Range.^methods.grep: * !~~ / 'is-nodal' /;
This netted:
(new excludes-min excludes-max infinite is-int elems iterator
flat reverse first bounds int-bounds fmt ASSIGN-POS roll pick
Capture push append unshift prepend shift pop sum rand in-range
hyper lazy-if lazy item race of is-lazy WHICH Str ACCEPTS perl
Numeric min max BUILDALL)
That's what I used to lead me to my solution above; I sort of know the methods but use .^methods to remind me.
Another way to explore what's available is doc, eg the official doc's Range page. That netted me:
ACCEPTS min excludes-min max excludes-max bounds
infinite is-int int-bounds minmax elems list flat
pick roll sum reverse Capture rand
Comparing these two lists, sorted and bagged, out of curiosity:
say
<ACCEPTS ASSIGN-POS BUILDALL Capture Numeric Str WHICH append
bounds elems excludes-max excludes-min first flat fmt hyper
in-range infinite int-bounds is-int is-lazy item iterator
lazy lazy-if max min new of perl pick pop prepend push
race rand reverse roll shift sum unshift>.Bag
∩
<ACCEPTS Capture bounds elems excludes-max excludes-min flat
infinite int-bounds is-int list max min minmax pick
rand reverse roll sum>.Bag
displays:
Bag(ACCEPTS, Capture, bounds, elems, excludes-max, excludes-min,
flat, infinite, int-bounds, is-int, max, min, pick,
rand, reverse, roll, sum)
So for some reason, list and minmax are documented as Range methods but are not listed by my .^methods call. Presumably they're called Method+{is-nodal}.new. Hmm.
say Range.^lookup('minmax'); # Method+{is-nodal}.new
say Range.^lookup('minmax').name; # minmax
Yep. Hmm. So I could have written:
say Range.^methods>>.name.sort;
(ACCEPTS ASSIGN-POS AT-POS BUILDALL Bag BagHash Capture EXISTS-POS
Mix MixHash Numeric Set SetHash Str WHICH append bounds elems
excludes-max excludes-min first flat fmt hyper in-range infinite
int-bounds is-int is-lazy item iterator lazy lazy-if list max min
minmax new of perl pick pop prepend push race rand reverse roll
shift sum unshift)
Anyhow, hope that's helpful.
Strange that no one has suggested using augment. Admittedly, it creates global changes, but that might not be an issue.
augment class Range {
method clamp ($value) { ... }
}
You will need to use the pragma use MONKEY-TYPING in the same scope before the augment in order to use it, though. But this way, you can simply say $range.clamp(5), for instance. It saves you one character over raiph's answer, but at the (not insignificant) cost of breaking precompilation.

How is it possible that O(1) constant time code is slower than O(n) linear time code?

"...It is very possible for O(N) code to run faster than O(1) code for specific inputs. Big O just describes the rate of increase."
According to my understanding:
O(N) - Time taken for an algorithm to run based on the varying values of input N.
O(1) - Constant time taken for the algorithm to execute irrespective of the size of the input e.g. int val = arr[10000];
Can someone help me understand based on the author's statement?
O(N) code run faster than O(1)?
What are the specific inputs the author is alluding to?
Rate of increase of what?
O(n) linear-time code can absolutely be faster than O(1) constant-time code. The reason is that constant factors are ignored entirely in Big O, which measures how fast an algorithm's cost grows as the input size n increases, and nothing else. It's a measure of growth rate, not running time.
Here's an example:
int constant(int[] arr) {
int total = 0;
for (int i = 0; i < 10000; i++) {
total += arr[0];
}
return total;
}
int linear(int[] arr) {
int total = 0;
for (int i = 0; i < arr.length; i++) {
total += arr[i];
}
return total;
}
In this case, constant does a lot of work, but it's fixed work that will always be the same regardless of how large arr is. linear, on the other hand, appears to have few operations, but those operations are dependent on the size of arr.
In other words, as arr increases in length, constant's performance stays the same, but linear's running time increases linearly in proportion to its argument array's size.
Call the two functions with a single-item array like
constant(new int[] {1});
linear(new int[] {1});
and it's clear that constant runs slower than linear.
But call them like:
int[] arr = new int[10000000];
constant(arr);
linear(arr);
Which runs slower?
After you've thought about it, run the code given various inputs of n and compare the results.
Just to show that this phenomenon of run time != Big O isn't just for constant-time functions, consider:
void exponential(int n) throws InterruptedException {
for (int i = 0; i < Math.pow(2, n); i++) {
Thread.sleep(1);
}
}
void linear(int n) throws InterruptedException {
for (int i = 0; i < n; i++) {
Thread.sleep(10);
}
}
Exercise (using pen and paper): up to which n does exponential run faster than linear?
Consider the following scenario:
Op1) Given an array of length n where n>=10, print the first ten elements twice on the console. --> This is a constant time (O(1)) operation, because for any array of size>=10, it will execute 20 steps.
Op2) Given an array of length n where n>=10, find the largest element in the array. --> This is a linear time (O(N)) operation, because for any array of size N, it will execute N steps.
Now if the array size is between 10 and 20 (exclusive), Op1 will be slower than Op2. But if we take an array of size > 20 (e.g. size = 1000), Op1 will still take 20 steps to complete, while Op2 will take 1000 steps.
That's why big-O notation is about the growth (rate of increase) of an algorithm's cost, not its running time on any one input.
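The same point can be checked by counting loop iterations rather than timing anything. The following is a rough sketch in Python that mirrors the constant and linear functions from the Java example above. The step counters and the chosen sizes are assumptions made for illustration.

```python
FIXED_WORK = 10_000  # same fixed loop bound as the Java `constant` example


def constant(arr):
    """O(1): always performs FIXED_WORK iterations, whatever len(arr) is."""
    steps = 0
    for _ in range(FIXED_WORK):
        steps += 1
    return steps


def linear(arr):
    """O(n): performs one iteration per element of arr."""
    steps = 0
    for _ in arr:
        steps += 1
    return steps


# Compare step counts for a tiny, a break-even, and a large input.
rows = []
for n in (1, 10_000, 100_000):
    arr = [0] * n
    rows.append((n, constant(arr), linear(arr)))

for n, c, l in rows:
    print(f"n={n:>7}  constant={c:>7}  linear={l:>7}")
```

For small n the O(1) function does far more work than the O(n) one. The two break even when n equals the fixed loop bound, and only beyond that does the linear function become the slower of the two.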

inserting elements into List during Iteration in TCL/TK Scripting

I'm trying to add each input integer into a list and later sort it however I am having trouble adding each integer into the list during iteration.
code:
set l1 {1 2 3 4 5}
for {set i 0} {$i<[llength $l1]} {incr i} {
set data [gets stdin]
scan $data "%d" myint
if $myint<=0 {break} # stop if non positive number is found
set l1 {$myint} # supposed to add an input element into the list during iteration
}
puts $l1
Adding an element to the end of a list is easy; just use lappend instead of set:
lappend l1 $myint
When you come to sorting the list later, use lsort -integer, for example here with the puts:
puts [lsort -integer $l1]
(The lsort command works on values, not variables like lappend.)
However, it appears you're trying to actually input up to five values and sort those. If that's so, you'd be better off writing your code like this:
set l1 {}
for {set i 0} {$i < 5} {incr i} {
set data [gets stdin]
if {[eof stdin] || [scan $data "%d" myint] != 1 || $myint <= 0} {
break
}
lappend l1 $myint
}
puts [lsort -integer $l1]
The differences here? I'm using an empty initial list. I'm testing for End-Of-File. I'm checking the result of scan (in case someone supplies a non-integer). I'm using a compound expression. It's all little things, but they help the code be more robust.