Reason to use 'with' for np.nditer in numpy

Please explain whether there is a reason to use with for nditer, as shown in the NumPy documentation for numpy.nditer.
I want to understand whether we really have to use 'with' when using nditer. In my understanding, 'with' is there to make sure a resource (e.g. an open file descriptor) gets released, and I am not sure what resource would need to be released in the code below.
def iter_add_py(x, y, out=None):
    addop = np.add
    it = np.nditer([x, y, out], [],
                   [['readonly'], ['readonly'], ['writeonly', 'allocate']])
    with it:  # <-----
        for (a, b, c) in it:
            addop(a, b, out=c)
        return it.operands[2]
Update
As pointed out, the Iterating Over Arrays - Modifying Array Values section says:
because the nditer must copy this buffer data back to the original array once iteration is finished, you must signal when the iteration is ended, by one of two methods. You may either:
use the nditer as a context manager using the with statement, and the temporary data will be written back when the context is exited, or
call the iterator’s close method once finished iterating, which will trigger the write-back.
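As a minimal sketch adapted from the NumPy iteration docs (not the question's code), the recommended pattern looks like this; when buffering or a write-back copy is involved, the updated values are only guaranteed to be in the original array after the with block exits (or it.close() is called):
import numpy as np

a = np.arange(6).reshape(2, 3)
# The iterator may hand out temporary buffers for the elements;
# leaving the 'with' block triggers the write-back into 'a'.
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 2 * x
print(a)  # the doubled values are guaranteed to be visible here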

Related

Does Pandas DataFrame constructor (and helper construction functions) release the GIL while copying the source array?

Preamble
From another question I understood that the pandas construction routines avoid making a copy of the provided numpy array only if the data type is the same for all the entries. If the constructor is fed a structured numpy array with different types on the columns, it makes a copy.
Implementation reference
df_dict = {}
for i in range(5):
    obj = Object(1000000)
    arr = obj.getNpArr()
    print(arr[:10])
    df_dict[i] = pandas.DataFrame.from_records(arr)

print("The DataFrames are :")
for i in range(5):
    print(df_dict[i].head(10))
In this case, Object(N) constructs an instance of Object which internally allocates and initializes a 2D array of shape (N,3), with dtypes 'f8','i4','i4' on each row. Object manages the lifetime of these data, deallocating them on destruction. The function Object.getNpArr() returns an np.recarray pointing to the internal data, with the above-mentioned dtype. Importantly, the returned array does not own the data; it is just a view.
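As a minimal illustration of such a non-owning view (not the actual Object implementation), a structured array viewed as an np.recarray reports that it does not own its memory:
import numpy as np

base = np.zeros(10, dtype=[('a', 'f8'), ('b', 'i4'), ('c', 'i4')])  # hypothetical backing buffer
view = base.view(np.recarray)   # recarray view over the same memory
print(view.flags.owndata)       # False: the view borrows base's data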
Problem
The DataFrames printed at the end show corrupted data (with respect to the arrays printed inside the first loop). I am not expecting such behaviour, since the array fed to the pandas construction function is copied (I checked this behaviour separately).
I do not have many ideas about the cause, or about how to avoid the data corruption. The only guess I can make is:
the constructor starts allocating the memory for its own data, which takes long because of the big size, and then copies the source;
before/during the allocation/copy, the GIL is released and control is handed back to the for loop;
the for loop proceeds before the copy of the array is completed, going on to the next iteration;
at the next iteration the obj name is rebound to a new Object and the old memory is deallocated, which corrupts the copy still in progress for the DataFrame of the previous iteration.
If this is really the cause of the issue, how can I work around it? Is there a way to release the GIL only once the copy of the array is actually done?
Or, if my guess is wrong, what is the cause of the data corruption?
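One defensive workaround I am considering (only a sketch, assuming Object and getNpArr behave as described above) is to take an explicit owning copy of the view before handing it to pandas, so that destroying the Object can no longer affect the DataFrame's source data:
import pandas

df_dict = {}
for i in range(5):
    obj = Object(1000000)         # class from the question
    arr = obj.getNpArr().copy()   # owning copy, safe even after obj is destroyed
    df_dict[i] = pandas.DataFrame.from_records(arr)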

Why is numpy save not immediate, and can it be forced to save immediately?

I thought this question would have been asked already, but I can't find it, so here goes: I've noticed that numpy.save commands only take effect, i.e. the file to be created is actually created, after the entire script has finished running. This is bad when the code takes days or weeks to run, and I want to pin down exactly which function, and which arguments to that function, are causing the bottleneck.
There is a similar issue with the print() command; it doesn't write to the output file immediately but rather waits until the entire code is finished before writing. I can force it to write immediately with this code:
def printnow(*messages):
    w = open("output.log", "a")
    for message in messages:
        w.write(str(message))
        w.write(" ")
    w.write("\n")
    w.close()
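An even simpler variant for print itself (plain standard Python, not specific to my setup) is to pass its flush argument, or to flush stdout by hand:
import sys

print("some message", flush=True)  # written out immediately
sys.stdout.flush()                 # or flush explicitly after normal prints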
I was wondering whether it's possible to do an analogous thing, i.e. force an immediate save, for numpy arrays. No need for appending; overwriting with the current value of the numpy array is fine.
If it makes a difference, I'm not running the code on my personal computer but a group server, which I issue commands to and check on using Putty and WinSCP.
Thanks
Edit: I tried another package, shelve, and it encounters the same problem. I create a global variable called function_calls and initialize it to 0. Then, at the start of the function that I suspect is causing the bottleneck, I put in the following code:
global function_calls
file = 'function_inputs' + str(function_calls)
function_shelf = shelve.open(file, 'n')
for key in dir():
    function_shelf[key] = locals()[key]
function_calls += 1
This code is intended to create a new file that saves the function inputs, each time the function is called. Unfortunately, 9 hours into starting the run, no files have been created. So I suspect Python is just waiting until the whole run is finished before creating the files I asked it to.
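One workaround I could try for the numpy side (a sketch, assuming the delay comes from Python/OS buffering rather than from numpy.save itself) is to write through an explicit file handle and flush it straight away; numpy.save accepts an already-open binary file object:
import os
import numpy as np

def save_now(path, array):
    with open(path, "wb") as f:
        np.save(f, array)      # write the array to the open handle
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to commit the bytes to disk

save_now("checkpoint.npy", np.arange(10))  # trivial usage example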

How to use hyperoperators with Scalars that aren't really scalar?

I want to make a hash of sets. Well, SetHashes, since they need to be mutable.
In fact, I would like to initialize my Hash with multiple identical copies of the same SetHash.
I have an array containing the keys for the new hash: @keys
And I have my SetHash already initialized in a scalar variable: $set
I'm looking for a clean way to initialize the hash.
This works:
my %hash = ({ $_ => $set.clone } for @keys);
(The parens are needed for precedence; without them, the assignment to %hash is part of the body of the for loop. I could change it to a non-postfix for loop or make any of several other minor changes to get the same result in a slightly different way, but that's not what I'm interested in here.)
Instead, I was kind of hoping I could use one of Raku's nifty hyper-operators, maybe like this:
my %hash = @keys »=>» $set;
That expression works a treat when $set is a simple string or number, but a SetHash?
Array >>=>>> SetHash can never work reliably: order of keys in SetHash is indeterminate
Good to know, but I don't want it to hyper over the RHS, in any order. That's why I used the right-pointing version of the hyperop: so it would instead replicate the RHS as needed to match it up to the LHS. In this sort of expression, is there any way to say "Yo, Raku, treat this as a scalar. No, really."?
I tried an explicit Scalar wrapper (which would make the values harder to get at, but it was an experiment):
my %map = @keys »=>» $($set,)
And that got me this message:
Lists on either side of non-dwimmy hyperop of infix:«=>» are not of the same length while recursing
left: 1 elements, right: 4 elements
So it has apparently recursed into the list on the left and found a single key and is trying to map it to a set on the right which has 4 elements. Which is what I want - the key mapped to the set. But instead it's mapping it to the elements of the set, and the hyperoperator is pointing the wrong way for that combination of sizes.
So why is it recursing on the right at all? I thought a Scalar container would prevent that. The documentation says it prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
The error message says the version of the hyperoperator I'm using is "non-dwimmy", which may explain why it's not in fact doing what I mean, but is there maybe an even-less-dwimmy version that lets me be even more explicit? I still haven't gotten my brain aligned well enough with the way Raku works for it to be able to tell WIM reliably.
I'm looking for a clean way to initialize the hash.
One idiomatic option:
my %hash = @keys X=> $set;
See X metaoperator.
The documentation says ... a Scalar container ... prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
A cat is an animal, but an animal is not necessarily a cat. Flattening may act recursively, but some operations that act recursively don't flatten. Recursive flattening stops if it sees a Scalar. But hyperoperation isn't flattening. I get where you're coming from, but this is not the real problem, or at least not a solution.
I had thought that hyperoperation had two tests controlling recursing:
Is it hyperoperating a nodal operation (eg .elems)? If so, just apply it like a parallel shallow map (so don't recurse). (The current doc quite strongly implies that nodal can only be usefully applied to a method, and only a List one (or augmentation thereof) rather than any routine that might get hyperoperated. That is much more restrictive than I was expecting, and I'm sceptical of its truth.)
Otherwise, is a value Iterable? If so, then recurse into that value. In general the value of a Scalar automatically behaves as the value it contains, and that applies here. So Scalars won't help.
A SetHash doesn't do the Iterable role. So I think this refusal to hyperoperate with it is something else.
I just searched the source and that yields two matches in the current Rakudo source, both in the Hyper module, with this one being the specific one we're dealing with:
multi method infix(List:D \left, Associative:D \right) {
    die "{left.^name} $.name {right.^name} can never work reliably..."
}
For some reason hyperoperation explicitly rejects use of Associatives on either the right or left when coupled with the other side being a List value.
Having pursued the "blame" (tracking who made what changes) I arrived at the commit "Die on Associative <<op>> Iterable" which says:
This can never work due to the random order of keys in the Associative.
This used to die before, but with a very LTA error about a Pair.new()
not finding a suitable candidate.
Perhaps this behaviour could be refined so that the determining factor is, first, whether an operand does the Iterable role, and then if it does, and is Associative, it dies, but if it isn't, it's accepted as a single item?
A search for "can never work reliably" in GH/rakudo/rakudo issues yields zero matches.
Maybe file an issue? (Update I filed "RFC: Allow use of hyperoperators with an Associative that does not do Iterable role instead of dying with "can never work reliably".)
For now we need to find some other technique to stop a non-Iterable Associative being rejected. Here I use a Capture literal:
my %hash = @keys »=>» \($set);
This yields: {a => \(SetHash.new("b","a","c")), b => \(SetHash.new("b","a","c")), ....
Adding a custom op unwraps en passant:
sub infix:« my=> » ($lhs, $rhs) { $lhs => $rhs[0] }
my %hash = @keys »my=>» \($set);
This yields the desired outcome: {a => SetHash(a b c), b => SetHash(a b c), ....
my %hash = ({ $_ => $set.clone } for @keys);
(The parens seem to be needed so it can tell that the curlies are a block instead of a Hash literal...)
No. That particular code in curlies is a Block regardless of whether it's in parens or not.
More generally, Raku code of the form {...} in term position is almost always a Block.
For an explanation of when a {...} sequence is a Hash, and how to force it to be one, see my answer to the Raku SO question Is that a Hash or a Block?.
Without the parens you've written this:
my %hash = { block of code } for @keys
which attempts to iterate #keys, running the code my %hash = { block of code } for each iteration. The code fails because you can't assign a block of code to a hash.
Putting parens around the ({ block of code } for @keys) part completely alters the meaning of the code.
Now it runs the block of code for each iteration. And it concatenates the result of each run into a list of results, each of which is a Pair generated by the code $_ => $set.clone. Then, when the for iteration has completed, that resulting list of pairs is assigned, once, to my %hash.

How to find out what arguments DM functions should take?

Through trial and error, I have found that the GetPixel function takes two arguments, one for X and one for Y, even if used on a 1D image. On a 1D image, the second index must be set to zero.
image list := [3]: {1,2,3}
list.GetPixel(0,0) // Gets 1
GetPixel(list, 0, 0) // Equivalent
How am I supposed to know this? I can't see anything clearly specifying this in the documentation.
This is best done by calling the script function with an incorrect parameter list, running the script, and observing the resulting error output.

writing a vector using "readTrajectory" function in Dymola

I write a vector in Dymola mos script in a simple manner like this:
x_axis = cell.spatialSummary.x_cell;
output: x_axis={1,2,3,4,5} // row vector
I want to do the same thing in a function. 'x_cell' has 5 values which I want to store in a row vector. I use the DymolaCommands.Trajectories.readTrajectory function to read the x_cell values one by one in a for loop (I use a for loop because readTrajectory throws an error when I try to read the entire x_cell).
Real x_axis[:], axis_value[:,:];
Integer len = 5;
for i in 1:len loop
  axis_value := readTrajectory(result, {"cell.spatialSummary.x_cell[" + String(i) + "]"}, 1); // this intermediate variable returns a [1,1] matrix
  x_axis[i] := scalar(axis_value);
end for;
I get an error:
Assignment failed x_axis[i] = scalar(axis_value);
What's wrong here? All I want to do is read all the values of x_cell and write them into a vector. How can I do this in a Dymola function?
Thank you!
Solution: Initialize the vector with a certain value. In this case,
x_axis := fill(0, len);
This solved the above problem for me.
Pre-filling as in the other solution works, and is generally the best solution. However, in some cases you might have to append to the vector as follows:
x_axis := fill(0.0, 0);
for i in 1:len loop
  axis_value := readTrajectory(result, {"cell.spatialSummary.x_cell[" + String(i) + "]"}, 1); // this intermediate variable returns a [1,1] matrix
  x_axis := cat(1, x_axis, {scalar(axis_value)});
end for;
(This takes x_axis and concatenates a new element at the end. It is generally slower.)