Does Pandas DataFrame constructor (and helper construction functions) release the GIL while copying the source array? - pandas

Preamble
On another question I understood that python constructors routines does not make a copy of the provided numpy array only if the data-type of the array is the same for all the entries. In case the constructor is fed with a structured numpy array with different types on columns, it makes a copy.
Implementation reference
df_dict = {}
for i in range(5):
obj = Object(1000000)
arr = obj.getNpArr()
print(arr[:10])
df_dict[i] = pandas.DataFrame.from_records(arr)
print("The DataFrames are :")
for i in range(5):
print(df_dict[i].head(10))
In this case Object(N) constructs an instance of Object which internally allocates and initializes a 2D array of shape (N,3), with dtypes 'f8','i4','i4' on each row. Object manages the life of these data, deallocating it on destruction. The function Object.getNpArr() returns a np.recarray pointing to the internal data and it has the above mentioned dtype. Importantly, the returned array does not own the data, it is just a view.
Problem
The DataFrame printed at the end show corrupted data (with respect to the printed array inside the first loop). I am not expecting such behaviour, since the array fed to the pandas construction function is copied (I separately checked this behaviour).
I have not many ideas about the cause and solutions to avoid data corruption. The only guess I can make is:
the constructor starts allocating the memory for its own data, which takes long because of the big size, and then copies
before/during the allocation/copy, the GIL is released and it is taken back to the for loop
for loop proceed before the copy of the array is completed, going to the next iteration
at the next iteration the obj name is moved to the new Object and the memory is deallocated, which causes data corruption in the copy of the DataFrame at the previous iteration, which is probably still running.
If this is really the cause of the issue, how can I find a workaround? Is there a way to let the GIL go through only when the copy of the array is effectively done?
Or, if my guess is wrong, what is the cause of the data corruption?

Related

Fortran arrays argument [duplicate]

I'm going through a Fortran code, and one bit has me a little puzzled.
There is a subroutine, say
SUBROUTINE SSUB(X,...)
REAL*8 X(0:N1,1:N2,0:N3-1),...
...
RETURN
END
Which is called in another subroutine by:
CALL SSUB(W(0,1,0,1),...)
where W is a 'working array'. It appears that a specific value from W is passed to the X, however, X is dimensioned as an array. What's going on?
This is non-uncommon idiom for getting the subroutine to work on a (rectangular in N-dimensions) subset of the original array.
All parameters in Fortran (at least before Fortran 90) are passed by reference, so the actual array argument is resolved as a location in memory. Choose a location inside the space allocated for the whole array, and the subroutine manipulates only part of the array.
Biggest issue: you have to be aware of how the array is laid out in memory and how Fortran's array indexing scheme works. Fortran uses column major array ordering which is the opposite convention from c. Consider an array that is 5x5 in size (and index both directions from 0 to make the comparison with c easier). In both languages 0,0 is the first element in memory. In c the next element in memory is [0][1] but in Fortran it is (1,0). This affects which indexes you drop when choosing a subspace: if the original array is A(i,j,k,l), and the subroutine works on a three dimensional subspace (as in your example), in c it works on Aprime[i=constant][j][k][l], but in Fortran in works on Aprime(i,j,k,l=constant).
The other risk is wrap around. The dimensions of the (sub)array in the subroutine have to match those in the calling routine, or strange, strange things will happen (think about it). So if A is declared of size (0:4,0:5,0:6,0:7), and we call with element A(0,1,0,1), the receiving routine is free to start the index of each dimension where ever it likes, but must make the sizes (4,5,6) or else; but that means that the last element in the j direction actually wraps around! The thing to do about this is not use the last element. Making sure that that happens is the programmers job, and is a pain in the butt. Take care. Lots of care.
in fortran variables are passed by address.
So W(0,1,0,1) is value and address. so basically you pass subarray starting at W(0,1,0,1).
This is called "sequence association". In this case, what appears to be a scaler, an element of an array (actual argument in caller) is associated with an array (implicitly the first element), the dummy argument in the subroutine . Thereafter the elements of the arrays are associated by storage order, known as "sequence". This was done in Fortran 77 and earlier for various reasons, here apparently for a workspace array -- perhaps the programmer was doing their own memory management. This is retained in Fortran >=90 for backwards compatibility, but IMO, doesn't belong in new code.

How to copy SPIR-V Private storage-qualified matrix array into Function storage-qualified matrix array?

I am fighting with driver bug, and long story short i have created OpVariables with proper data in invocation-level global memory, aka Private storage-qualified OpVariable of type float4x3[6].
Now, i need this data converted to Function storage qualifier, as OpVariables in OpFunction scope. But i am kind of lost about when do i apply what copying operations, especially with matrices and arrays, and I have both at once. Do i just OpLoad and OpStore? Or do i need to OpLoad, OpCompositeExtract each matrix from indices, OpCompositeConstruct from them and only afterwards OpStore? The SPIR-V specification on the subject is rather dense, and i cannot seem to find one place where copying operations are described. The rules are probably scattered all over the spec.
After some experimentation it seems that all that is required to convert variables from Private to Function is simple OpLoad(Private) and OpStore to Function variable.

numpy - reason to use 'with' for np.nditer

Please explain if tbere is a reason of using with for the nditer as shown in the NumPy document numpy.nditer.
It is to understand if there is a necessity that we have to use 'with' when using nditer. In my understanding, 'with' is to make sure a resource (e.g. open file descriptor) gets released. I am not sure what resource is to be released in the code.
def iter_add_py(x, y, out=None):
addop = np.add
it = np.nditer([x, y, out], [],
[['readonly'], ['readonly'], ['writeonly','allocate']])
with it: # <-----
for (a, b, c) in it:
addop(a, b, out=c)
return it.operands[2]
Update
As pointed out, the Iterating Over Arrays - Modifying Array Values says:
because the nditer must copy this buffer data back to the original
array once iteration is finished, you must signal when the iteration
is ended, by one of two methods. You may either:
use the nditer as a context manager using the with statement, and the
temporary data will be written back when the context is exited.
call the iterator’s close method once finished iterating, which will
trigger the write-back.

Difference between memcpy_htod and to_gpu in Pycuda?

I am learning PyCUDA, and while going through the documentation on pycuda.gpuarray, I am puzzled by the difference between pycuda.driver.memcpy_htod (also _dtoh) and pycuda.gpuarray.to_gpu (also get) functions. According to gpuarray documentation, .get().
For example,transfer the contents of self into array or a newly allocated numpy.ndarray. If array is given, it must have the right size (not necessarily shape) and dtype. If it is not given, a pagelocked specifies whether the new array is allocated page-locked.
Is this saying that .get() is implemented exactly the same way as pycuda.driver.memcpy_dtoh? Somehow, I think I am mis-interpreting it.
pycuda.gpuarray.GPUArray.get() stores the GPUArray as a numpy.ndarray.
pycuda.driver.memcpy_dtoh() and friends copy plain buffers between CPU and GPU memory without any processing of the data in the buffers.

Object Array to Byte Array - Marshal.AllocHGlobal Fragmentation query

I didn't think it fair to post a comment on Fredrik Mörk's answer in this 2 year old post, so I thought I'd just ask it as a new question instead...
NB: This is not a critiscm of the answer in any way, I'm simply trying to understand this all before delving into memory management / the marshal class.
In that answer, the function GetByteArray allocates memory to each object within the given array, within a loop.
Would the GetByteArray function on the aforementioned post have benefited at all from allocating memory for the total size of the provided array:
Dim arrayBufferPtr = Marshal.AllocHGlobal(Marshal.SizeOf(<arrayElement>) * <array>.Count)
I just wonder if allocating the memory, as shown in the answer, causes any kind of fragmentation? Assuming there may be fragmentation, would there be much of an impact to be concerned with? Would allocating the memory in the way I've shown force you to call IntPtr.ToInt## to obtain pointer offsets from the overall allocation pointer, and therefore force you to check the underlying architecture to ensure the correct method is used*1 or is there a better way? (ToInt32/ToInt64 depending on x86/64?)
*1 I read elsewhere that calling the wrong IntPtr.ToInt## will cause overflow exceptions. What I mean by that statement is would I use:
Dim anOffsetPtr As New IntPtr(arrayBufferPtr.ToInt## + (loopIndex * <arrayElementSize>))
I've read through a few articles on the VB.Net Marshal class and memory allocation; listed below, but if you know fo any other good articles I'm all ears!
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.marshal.aspx
http://www.dotnetbips.com/articles/44bad06d-3662-41d3-b712-b45546cd8fa8.aspx
My favourite so far:
http://www.codeproject.com/KB/vb/Marshal.aspx
It is possible to allocate unmanaged memory for the whole array, and then copy every array element with SizeOf(arrayElement)*loopIndex offset. It is better to use appropriate ToInt32/ToInt64 method, according to the current platform, like:
Dim anOffsetPtr
if arrayBufferPtr.Size = 4 then
anOffsetPtr = New IntPtr(arrayBufferPtr.ToInt32() + (loopIndex * arrayElementSize))
else
anOffsetPtr = New IntPtr(arrayBufferPtr.ToInt64() + (loopIndex * arrayElementSize))
endif