Why does itertools.product run through all elements at initialization? - iteration

I assumed that itertools.product generates elements one at a time. I am now noticing that this is not true.
Simple proof of concept:
class A:
    def __init__(self, n):
        self.source = iter(range(n))
    def __iter__(self):
        return self
    def __next__(self):
        val = next(self.source)
        print("I am at:", val)
        return val
Now if I do:
from itertools import product
l = product(A(3), A(3))
print("Here")
next(l)
I expect to have as output:
>'Here'
>'I am at 0'
>'I am at 0'
But I have
>'I am at 0'
>'I am at 1'
>'I am at 2'
>'I am at 0'
>'I am at 1'
>'I am at 2'
>'Here'
Am I missing something?

To answer your question we need to look at the implementation of itertools.product:
def product(*args, repeat=1):
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
This is the pure-Python equivalent given in the documentation. The real implementation is in C, but to answer this question it is enough to refer to the Python version (see the EXTRA paragraph at the bottom).
Focus on this line of code:
pools = [tuple(pool) for pool in args] * repeat
Here every iterable passed in is converted to a tuple, and that conversion is the moment at which all of its elements are actually consumed.
Returning to your code, all elements of both iterators are consumed by this step. In your example the pools list is created with the following elements:
# pools: [(0, 1, 2), (0, 1, 2)]
which is why you got those outputs.
As for print("Here"), to understand where it ends up in the output you need to understand how generators work:
The pure-Python version above is a generator function: its body does not run until the first next(), so with it "Here" would be printed first and the pools would be built on the first next(). The real itertools.product is a C type that builds the pools in its constructor, so your iterators are exhausted as soon as product(A(3), A(3)) is called, and "Here" is printed last, exactly as in your output. In both versions each subsequent call to next() only produces the next tuple from the pools that are already stored.
Here you will find excellent resources to better understand how python generators work.
Why did 'itertools' choose to keep the list of tuples in memory?
Because the Cartesian product must visit the same element several times, whereas an iterator can be consumed only once.
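A minimal sketch of that constraint, using only built-ins (the variable names are made up for illustration): a plain iterator yields nothing on a second pass, while a tuple built from it can be traversed as often as needed.

```python
# A one-shot iterator is empty after the first full pass.
it = iter(range(3))
first_pass = list(it)
second_pass = list(it)

# Materializing into a tuple (what product does with each argument)
# makes repeated traversal possible.
pool = tuple(range(3))
passes = [list(pool), list(pool)]

print(first_pass)   # [0, 1, 2]
print(second_pass)  # []
print(passes)       # [[0, 1, 2], [0, 1, 2]]
```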
EXTRA
In C the pools tuple is built in the same way as in the Python version. As you can see from this code, the arguments are evaluated eagerly: each iterable argument is first converted to a tuple:
pools = PyTuple_New(npools);
if (pools == NULL)
    goto error;
for (i=0; i < nargs ; ++i) {
    PyObject *item = PyTuple_GET_ITEM(args, i);
    PyObject *pool = PySequence_Tuple(item);
    if (pool == NULL)
        goto error;
    PyTuple_SET_ITEM(pools, i, pool);
    indices[i] = 0;
}
for ( ; i < npools; ++i) {
    PyObject *pool = PyTuple_GET_ITEM(pools, i - nargs);
    Py_INCREF(pool);
    PyTuple_SET_ITEM(pools, i, pool);
    indices[i] = 0;
}

I'd like to point out that while the __next__ method of both instances of class A gets called exhaustively (until StopIteration is encountered), the itertools.product iterator itself is still lazily evaluated by the subsequent calls to next. Notice that:
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
> 'Here'
is just a result of calling exhaustively next first for the first passed instance, and then for the second. This is more readily seen when calling product(A(2), A(3)), which results in:
> 'I am at 0'
> 'I am at 1'
> 'I am at 0'
> 'I am at 1'
> 'I am at 2'
The same behavior is observed for combinations and permutations. In fact, searching for "Does itertools.product evaluate its arguments lazily?" brought me to this SO question, which also answers yours. The arguments are not evaluated lazily:
since product sometimes needs to go over an iterable more than once, which is not possible if the arguments were left as iterators that can only be consumed once.
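Both halves of this (eager inputs, lazy outputs) can be checked with a small sketch. The Tracked class and the log list are hypothetical helpers written for this example; they record values instead of printing them so that the order of events can be inspected.

```python
from itertools import product

log = []  # every value handed out by any Tracked instance, in order

class Tracked:
    """Iterator over range(n) that records each value it yields."""
    def __init__(self, n):
        self.source = iter(range(n))
    def __iter__(self):
        return self
    def __next__(self):
        val = next(self.source)
        log.append(val)
        return val

p = product(Tracked(2), Tracked(3))
consumed_at_construction = list(log)  # snapshot before any next(p)

first = next(p)                       # output tuples are produced one at a time
consumed_after_next = list(log)

print(consumed_at_construction)  # [0, 1, 0, 1, 2]
print(first)                     # (0, 0)
print(consumed_after_next == consumed_at_construction)  # True
```

The first argument is drained completely, then the second, all before the first call to next; after that, next only reads from the stored tuples.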

How to setup for each loop in Kotlin to avoid out of bounds exception

In Java I have this construct:
for (int i = 0; i < x.length - 1; i++)
Here we use x.length - 1 to avoid an out-of-bounds exception, but how do I do the same thing in Kotlin? I have this code so far:
x.forEachIndexed { index, _ ->
    output.add((x[index+1]-x[index])*10)
}
It crashes on the last element when we call x[index+1], so I need to handle the last element somehow.
Input list:
var x = doubleArrayOf(0.0, 0.23, 0.46, 0.69, 0.92, 1.15, 1.38, 1.61)
For a classic Java for loop you have two options in Kotlin.
One would be something like this:
val x = listOf(1, 2, 3, 4)
for (i in 0..x.lastIndex) {
    // ...
}
Using .. you go from 0 up to and including the second bound, in this case the last index of the list (so 0 <= i <= x.lastIndex).
The second option is using until:
val x = listOf(1, 2, 3, 4)
for (i in 0 until x.size) {
    // ...
}
This is similar to the previous approach, except that until excludes its upper bound (so 0 <= i < x.size).
What you probably need is something like this:
val x = listOf(1, 2, 3, 4)
for (i in 0..x.lastIndex - 1) {
    // ...
}
or alternatively, using until, like this:
val x = listOf(1, 2, 3, 4)
for (i in 0 until x.size - 1) {
    // ...
}
This avoids the index-out-of-bounds error, since you only go up to the second-to-last index.
Feel free to ask more if something is not clear.
This is also a great read if you want to learn more about ranges. https://kotlinlang.org/docs/ranges.html#progression
You already have an answer, but here is another option. If you used a normal list, you would have access to zipWithNext(), so you would not need to worry about indices at all:
list.zipWithNext { current, next ->
    output.add((next - current) * 10)
}
As mentioned by k314159, we can also call asList() on the array to get direct access to zipWithNext and other list methods, without many drawbacks:
array.asList().zipWithNext { current, next ->
    output.add((next - current) * 10)
}
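For readers who think in Python rather than Kotlin, the same pairing idea can be sketched with zip over the sequence and its own tail. This is only an analogy to zipWithNext, not Kotlin code; the data is the input list from the question.

```python
x = [0.0, 0.23, 0.46, 0.69, 0.92, 1.15, 1.38, 1.61]

# zip stops at the shorter input, so the last element is never
# paired with a missing neighbour and no index can run out of bounds.
output = [(nxt - cur) * 10 for cur, nxt in zip(x, x[1:])]

print(len(output))  # 7
```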

My take on Migratory Bird is failing one case

Update: I completely overlooked the complexity added by the arr.sort() call. In Kotlin, for an array of Int, it compiles to use java.util.DualPivotQuicksort (see this), which in turn has a worst-case complexity of O(n^2) (see this). Other than that, this is also a valid approach.
I know it can be solved by keeping multiple arrays or using collections (which is what I ended up submitting), but I want to know what I missed in the following approach:
fun migratoryBirds(arr: Array<Int>): Int {
    var maxCount = 0
    var maxType = 0
    var count = 0
    var type = 0
    arr.sort()
    println(arr.joinToString(" "))
    for (value in arr) {
        if (type != value) {
            if (count > maxCount) {
                maxCount = count
                maxType = type
            }
            // new count values
            type = value
            count = 1
        } else {
            count++
        }
    }
    return maxType
}
This code passes every scenario except Test case 2, which has an array of 73966 items. On my local machine that array of 73k+ elements caused a timeout, but I tested arrays of up to 20k+ randomly generated values in 1..5 and it succeeded every time. I couldn't manage to pass Test case 2 with this approach. So even though I ended up submitting an answer using the collection stream approach, I would really like to know what I could be missing in the logic above.
I loop over the array only once, so the complexity of the loop should be O(n) and that should not be the reason for failing. I pre-sort the array in ascending order and check for > rather than >=, so if two types end up with the same count it will still return the lower of the two types. This approach works correctly even for arrays of 20k+ elements (I get a timeout for anything above 25k elements).
The reason it is failing is this line:
arr.sort()
Sorting an array takes O(n log n) time. However, using something like a hash map, this can be solved in O(n) time.
Here is a quick Python solution I made to give you the general idea:
# Complete the migratoryBirds function below.
def migratoryBirds(arr):
    ans = -1
    count = -1
    dic = {}
    for x in arr:
        if x in dic:
            dic[x] += 1
        else:
            dic[x] = 1
        if dic[x] > count or (dic[x] == count and x < ans):
            ans = x
            count = dic[x]
    return ans
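Separately from the running time, one thing worth checking in the sort-and-scan loop from the question is that the comparison against maxCount only happens when the type changes, so the final run of equal values is never compared. If the most frequent type is also the largest one, that run is lost. A minimal Python sketch of the same approach with a final check added after the loop (the function name is made up for illustration, and this is a port of the idea rather than the asker's exact code):

```python
def migratory_birds_sorted(arr):
    """Sort-and-scan: return the smallest type with the highest count."""
    best_type, best_count = 0, 0
    current_type, current_count = None, 0
    for value in sorted(arr):
        if value != current_type:
            # A run just ended: compare it before starting the next one.
            if current_count > best_count:
                best_type, best_count = current_type, current_count
            current_type, current_count = value, 1
        else:
            current_count += 1
    # The last run never hits the "type changed" branch, so compare it here.
    if current_count > best_count:
        best_type, best_count = current_type, current_count
    return best_type

print(migratory_birds_sorted([1, 2, 2, 5, 5, 5]))  # 5
```

Because the scan goes in ascending order and uses a strict >, ties still resolve to the lower type.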

Variable getting overwritten in for loop

In a for loop, a different variable is assigned a value on each iteration. The variable that has already been assigned a value then gets the value from the next iteration. At the end, both variables have the same value.
The code is for validating data in a file. When I print the values, it prints the correct value for the first iteration, but in the next iteration the value assigned in the first iteration is changed.
When I print the values of $value3 and $value4 in the for loop, it shows null for $value4 and some value for $value3, but in the next iteration the value of $value3 is overwritten by the value of $value4.
I have tried this on Rakudo Perl 6.c:
my $fh= $!FileName.IO.open;
my $fileObject = FileValidation.new( file => $fh );
for (3,4).list {
    put "Iteration: ", $_;
    if ($_ == 4) {
        $value4 := $fileObject.FileValidationFunction(%.ValidationRules{4}<ValidationFunction>, %.ValidationRules{4}<Arguments>);
    }
    if ($_ == 3) {
        $value3 := $fileObject.FileValidationFunction(%.ValidationRules{3}<ValidationFunction>, %.ValidationRules{3}<Arguments>);
    }
    $fh.seek: SeekFromBeginning;
}
TL;DR It's not possible to confidently answer your question as it stands. This is a nanswer -- an answer in the sense I'm writing it as one but also quite possibly not an answer in the sense of helping you fix your problem.
Is it is rw? A first look.
The is rw trait on a routine or class attribute means it returns a container that contains a value rather than just returning a value.
If you then alias that container then you can get the behavior you've described.
For example:
my $foo;
sub bar is rw { $foo = rand }
my ($value3, $value4);
$value3 := bar;
.say for $value3, $value4;
$value4 := bar;
.say for $value3, $value4;
displays:
0.14168492246366005
(Any)
0.31843665763839857
0.31843665763839857
This isn't a bug in the language or compiler. It's just P6 code doing what it's supposed to do.
A longer version of the same thing
Perhaps the above is so far from your code it's disorienting. So here's the same thing wrapped in something like the code you provided.
spurt 'junk', 'junk';
class FileValidation {
    has $.file;
    has $!foo;
    method FileValidationFunction ($,$) is rw { $!foo = rand }
}
class bar {
    has $!FileName = 'junk';
    has %.ValidationRules =
        { 3 => { ValidationFunction => {;}, Arguments => () },
          4 => { ValidationFunction => {;}, Arguments => () } }
    my ($value3, $value4);
    method baz {
        my $fh= $!FileName.IO.open;
        my $fileObject = FileValidation.new( file => $fh );
        my ($value3, $value4);
        for (3,4).list {
            put "Iteration: ", $_;
            if ($_ == 4) {
                $value4 := $fileObject.FileValidationFunction(
                    %.ValidationRules{4}<ValidationFunction>, %.ValidationRules{4}<Arguments>);
            }
            if ($_ == 3) {
                $value3 := $fileObject.FileValidationFunction(
                    %.ValidationRules{3}<ValidationFunction>, %.ValidationRules{3}<Arguments>);
            }
            $fh.seek: SeekFromBeginning;
            .say for $value3, $value4
        }
    }
}
bar.new.baz
This outputs:
Iteration: 3
0.5779679442816953
(Any)
Iteration: 4
0.8650280000277686
0.8650280000277686
Is it is rw? A second look.
Brad and I came up with essentially the same answer (at the same time; I was a minute ahead of Brad but who's counting? I mean besides me? :)) but Brad nicely nails the fix:
One way to avoid aliasing a container is to just use =.
(This is no doubt also why @ElizabethMattijsen++ asked about trying = instead of :=.)
You've commented that changing from := to = made no difference.
But presumably you didn't change from := to = throughout your entire codebase but rather just (the equivalent of) the two in the code you've shared.
So perhaps the problem can still be fixed by switching from := to =, but in some of your code elsewhere. (That said, don't just globally replace := with =. Instead, make sure you understand their difference and then change them as appropriate. You've got a test suite, right? ;))
How to move forward if you're still stuck
Right now your question has received several upvotes and no downvotes and you've got two answers (that point to the same problem).
But maybe our answers aren't good enough.
If so...
The addition of the reddit comment, and trying = instead of :=, and trying the latest compiler, and commenting on those things, leaves me glad I didn't downvote your question, but I haven't upvoted it yet and there's a reason for that. It's because your question is still missing a Minimal Reproducible Example.
You responded to my suggestion about producing an MRE with:
The problem is that I am not able to replicate this in a simpler environment
I presumed that's your situation, but as you can imagine, that means we can't confidently replicate it at all. That may be the way you prefer to go for reasons but it goes against SO guidance (in the link above) and if the current answers aren't adequate then the sensible way forward is for you to do what it takes to share code that reproduces your problem.
If it's large, please don't just paste it into your question but instead link to it. Perhaps you can set it up on glot.io using the + button to use multiple files (up to 6 I think, plus there's a standard input too). If not, perhaps gist it via, say, gist.github.com, and if I can I'll set it up on glot.io for you.
What is probably happening is that you are returning a container rather than a value, then aliasing the container to a variable.
class Foo {
    has $.a is rw;
}
my $o = Foo.new( a => 1 );
my $old := $o.a;
say $old; # 1
$o.a = 2;
say $old; # 2
One way to avoid aliasing a container is to just use =.
my $old = $o.a;
say $old; # 1
$o.a = 2;
say $old; # 1
You could also decontainerize the value using either .self or .<>
my $old := $o.a.<>;
say $old; # 1
$o.a = 2;
say $old; # 1
(Note that .<> above could be .self or just <>.)

Last element of a block thrown in sink context

This program
my @bitfields;
for ^3 -> $i {
    @bitfields[$i] = Bool.pick xx 3;
}
my @total = 0 xx 3;
for @bitfields -> @row {
    @total Z+= @row;
}
say @total;
says [0 0 0]. If we add something to the loop, whatever:
my @bitfields;
for ^3 -> $i {
    @bitfields[$i] = Bool.pick xx 3;
}
my @total = 0 xx 3;
for @bitfields -> @row {
    @total Z+= @row;
    say "foo";
}
say @total;
It will work correctly. Apparently, the last element of the block is thrown into sink context which in this case means it's simply ignored; this trap is related to that. However, that code above looks perfectly fine; and this
{@total Z+= @^þ} for @bitfields;
apparently works, although I don't see the real difference. Any other idea?
It looks like a bug to me.
This looks very closely related to "Which context confuses this Perl 6 zip operator?", which became the Rakudo repo issue "Failure to sink for when Z+= used as last statement", which was closed with the roast tests "Test sunk for sinks last statement sub calls".
The mystery is why there's a new bug. My suspicion is that someone needs to clean the kitchen sink, i.e. pick up where Zoffix left off with his Flaws in implied sinkage / &unwanted helper issue.
Here's my best golf shot so far for narrowing down the new problem or regression:
my $foo = 'a';
ok: for 1 { $foo X= 'b' }
notok: for 1 -> $_ { $foo X= 'c' }
say $foo; # b
halfok: 'd' ~ do for 1 -> $_ { $foo X= 'e' } # Useless use of "~"
say $foo; # e
The ok: line works because it omits the -> argument.
The notok: line is my golf of your problem.
The error message for the halfok: line is because the result of it is thrown away. But the do has forced the compiler to evaluate the $foo X= 'e' expression in the block, as it should, and as it had failed to in the notok: line.
{@total Z+= @^þ} for @bitfields;
Perhaps that's because that's the non-modifier version. And/or because it doesn't use the -> syntax (which is part of the regression or new bug per my golf above).
Or perhaps just luck. I think most of the sink handling code in Rakudo is Larry's work from long ago when he was trying to get things mostly working right.

Specman/e list of lists (multidimensional array)

How can I create a fixed multidimensional array in Specman/e using variables?
And then access individual elements or whole rows?
For example in SystemVerilog I would have:
module top;
  function automatic my_func();
    bit [7:0] arr [4][8]; // matrix: 4 rows, 8 columns of bytes
    bit [7:0] row [8];    // array : 8 elements of bytes
    row = '{1, 2, 3, 4, 5, 6, 7, 8};
    $display("Array:");
    foreach (arr[i]) begin
      arr[i] = row;
      $display("row[%0d] = %p", i, row);
    end
    $display("\narr[2][3] = %0d", arr[2][3]);
  endfunction : my_func
  initial begin
    my_func();
  end
endmodule : top
This will produce this output:
Array:
row[0] = '{'h1, 'h2, 'h3, 'h4, 'h5, 'h6, 'h7, 'h8}
row[1] = '{'h1, 'h2, 'h3, 'h4, 'h5, 'h6, 'h7, 'h8}
row[2] = '{'h1, 'h2, 'h3, 'h4, 'h5, 'h6, 'h7, 'h8}
row[3] = '{'h1, 'h2, 'h3, 'h4, 'h5, 'h6, 'h7, 'h8}
arr[2][3] = 4
Can someone rewrite my_func() in Specman/e?
There are no fixed arrays in e. But you can define a variable of a list type, including a multi-dimensional list, such as:
var my_md_list: list of list of my_type;
It is not the same as a multi-dimensional array in other languages, in the sense that in general each inner list (being an element of the outer list) may be of a different size. But you still can achieve your purpose using it. For example, your code might be rewritten in e more or less like this:
var arr: list of list of byte;
var row: list of byte = {1;2;3;4;5;6;7;8};
for i from 0 to 3 do {
    arr.add(row.copy());
    print arr[i];
};
print arr[2][3];
Notice the usage of row.copy() - it ensures that each outer list element will be a copy of the original list.
If we don't use copy(), we will get a list of many pointers to the same list. This may also be legitimate, depending on the purpose of your code.
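The difference between copying and sharing can be illustrated with a loose Python analogy. This is not e code, and Python lists only approximate e list semantics; it is meant to show why a later change to the source row shows up in one structure and not in the other.

```python
row = [1, 2, 3, 4, 5, 6, 7, 8]

# Four references to the very same inner list (no copy).
shared = [row for _ in range(4)]
# Four independent copies of the inner list.
copied = [row.copy() for _ in range(4)]

row[3] = 99  # modify the original row afterwards

print(shared[2][3])  # 99
print(copied[2][3])  # 4
```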
In case of a field (as opposed to a local variable), it is also possible to declare it with a given size. This size is, again, not "fixed" and can be modified at run time (by adding or removing items), but it determines the original size of the list upon creation, for example:
struct foo {
    my_list[4][8]: list of list of int;
};