Problems while learning cython - numpy

I am learning Cython to speed up NumPy code. I wrote a small function to see how to optimize a NumPy array calculation.
The Python code is:
from numpy import *
def set_onsite(n):
    a=linspace(0,n,n+1)
    onsite=zeros([n+1,n+1],float)
    for i in range(0,n+1):
        onsite[i,i]=a[i]*a[i]
    return onsite
Then, I tried to cythonize this code:
import numpy as np
cimport numpy as np
cimport cython
import cython
#cython.boundscheck(False)
#cython.wraparound(False)
#cython.nonecheck(False)
def set_onsite(np.int_t n):
    cdef np.ndarray[double,ndim=1,mode='c'] a=np.linspace(0,n,n+1)
    cdef np.ndarray[double,ndim=2,mode='c'] onsite=np.empty(n+1,n+1)
    cdef np.int_t i
    for i in range(0,n+1):
        onsite[i,i]=a[i]*a[i]
    return onsite
After running setup.py, I got the .so file. I ran %timeit myfile.set_onsite(10000), but IPython showed
TypeError: data type not understood
So could anyone tell me what is going on here?
I checked my code many times but I did not figure out where the problem arises.

The problem has nothing to do with cython; it's just that np.empty expects the first argument to be the shape given as an int or tuple of ints. The second argument is interpreted as the dtype:
In [19]: np.empty(5,5)
TypeError: data type not understood
while np.empty((5,5)) returns an empty array of shape (5,5).
So instead use
cdef np.ndarray[double,ndim=2,mode='c'] onsite=np.empty((n+1,n+1))
Note the double set of parentheses around n+1, n+1. Or, use np.zeros instead of np.empty to make the Cython function match the Python function.
PS: When debugging Python, it is helpful to note not only the error message, but the line that raises the exception:
  File "comp.pyx", line 13, in comp.set_onsite (comp.c:1290)
    cdef np.ndarray[double,ndim=2,mode='c'] onsite=np.empty(n+1,n+1)
TypeError: data type not understood
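The shape-versus-dtype mix-up described in this answer can be reproduced in plain NumPy, without Cython. The following is a minimal sketch; the helper try_empty is a hypothetical name introduced only for illustration, and the set_onsite shown here is a pure-NumPy stand-in for the question's function, not the compiled version.

```python
import numpy as np

def try_empty(*args):
    """Return the array shape on success, or the exception type name on failure.

    Hypothetical helper, used only to show both call styles side by side.
    """
    try:
        return np.empty(*args).shape
    except TypeError as exc:
        return type(exc).__name__

wrong = try_empty(5, 5)    # second positional argument is taken as the dtype
right = try_empty((5, 5))  # shape passed as a single tuple

def set_onsite(n):
    """Pure-NumPy reference: squares of 0..n on the diagonal, zeros elsewhere."""
    a = np.linspace(0, n, n + 1)
    # np.zeros (not np.empty) so the off-diagonal entries are defined.
    onsite = np.zeros((n + 1, n + 1), dtype=float)
    idx = np.arange(n + 1)
    onsite[idx, idx] = a * a
    return onsite

print(wrong, right)
print(set_onsite(3))
```

Using np.zeros here matters for correctness, not only for matching the Python version: np.empty leaves the off-diagonal entries uninitialized.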

Related

How can I combine multiple numpy arrays into a single memoryview for cython?

I have a list of varying size that contains numpy arrays with the same data type and shape. I would like to process this data using a function written in Cython without copying the data. Both memoryviews and the Python buffer protocol seem to support this kind of data using indirect for the first dimension. So I was hoping that something like this could work:
%%cython
from cython.view cimport indirect
def test(list a):
    cdef double[::indirect, :] x
    x = a
    x[0, 0] = 42
Unfortunately it doesn't.
Is there a way to convert this list of numpy arrays into such a memoryview?

Only size 1 arrays can be converted to python scalars

I created a 3-dimensional array using the numpy.random module:
import numpy as np
b = np.random.randn(4,4,3)
Why can't we cast b to float? Calling float(b) raises a TypeError saying that only size 1 arrays can be converted to Python scalars.
You can't call float(b) because b isn't a single number; it's a multidimensional array, and float() only accepts something that holds exactly one value. The elements are already floating point: np.random.randn returns float64 values, which have the same precision as a Python float. If you need a floating-point array from some other dtype, use b.astype(float). If you really want native Python floats for whatever reason, b.tolist() returns a nested list of Python floats. A NumPy array can only hold native Python objects with dtype=object, and that gives up most of what makes NumPy fast.
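The points in this answer can be checked with a short sketch. The variable names below are illustrative only; the calls themselves (float on one element, astype, tolist) are standard NumPy.

```python
import numpy as np

b = np.random.randn(4, 4, 3)

# float() on the whole array fails because b holds 48 values, not one.
try:
    float(b)
    converted = True
except TypeError:
    converted = False

# Converting a single element works.
first = float(b[0, 0, 0])

# The array is already float64; astype(float) is how other dtypes get there.
ints = np.arange(6).reshape(2, 3)
as_float = ints.astype(float)

# tolist() gives nested Python lists whose leaves are native Python floats.
nested = b.tolist()

print(converted, b.dtype, as_float.dtype, type(nested[0][0][0]))
```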

torch.pow does not work

I'm trying to create a custom loss function using PyTorch, and am running into a simple error.
When I try to use torch.pow to take the exponent of a PyTorch Variable, I get the following error message:
AttributeError: 'torch.LongTensor' object has no attribute 'pow'
In the python terminal, I created a simple Variable, and attempted to do the same, and received the same error. Here's a snippet that should recreate the problem:
import torch
from torch.autograd import Variable
import numpy as np
v = Variable(torch.from_numpy(np.array([1, 2, 3, 4])))
torch.pow(v, 2)
I can't find any information on this issue, and nothing is showing up in search results. Help?
EDIT: this problem also occurs when I try to use torch.sqrt()
EDIT: same problem also happens if I try to do
v.pow(2)
pow is definitely a method of v, and the docs clearly state that pow is a method that exists and takes a tensor as its argument. I really don't see how this is happening, and it seems to me that the docs are just flat out wrong and these methods don't actually work.
You need to create the tensor with a float dtype. np.array([1, 2, 3, 4]) has an integer dtype, so torch.from_numpy gives a LongTensor here, and as the error message shows, this version of PyTorch does not define pow for LongTensor.
import torch
from torch.autograd import Variable
import numpy as np
v = Variable(torch.from_numpy(np.array([1, 2, 3, 4], dtype="float32")))
torch.pow(v, 2)
You can cast the result back to integers afterwards:
torch.pow(v, 2).type(torch.LongTensor)
yields
Variable containing:
1
4
9
16
[torch.LongTensor of size 4]

The corresponding ctypes type of a numpy.dtype?

If I have a numpy ndarray with a certain dtype, how do I know what is the corresponding ctypes type?
For example, if I have a ndarray, I can do the following to convert it to a shared array:
import multiprocessing as mp
import numpy as np
import ctypes
x_np = np.random.rand(10, 10)
x_mp = mp.Array(ctypes.c_double, x_np.ravel())  # flatten: mp.Array expects a 1-D sequence of scalars
However, I have to specify c_double here. It works if I don't specify the exact same type, but I would like to keep the type the same. How should I find out the ctypes type of the ndarray x_np automatically, at least for some common elementary data types?
This is now supported by numpy.ctypeslib.as_ctypes_type(dtype):
import numpy as np
x_np = np.random.rand(10, 10)
np.ctypeslib.as_ctypes_type(x_np.dtype)
Gives ctypes.c_double, as expected.
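Tying this back to the shared-array use case in the question, here is a minimal sketch that picks the ctypes type from the array's dtype instead of hard-coding c_double. The function name to_shared is hypothetical, and the sketch assumes a NumPy version that provides np.ctypeslib.as_ctypes_type (1.16 or later) and an array with a native-byte-order simple dtype. It uses multiprocessing.RawArray, the lock-free variant, to keep the example small; mp.Array takes the same first two arguments.

```python
import ctypes
import multiprocessing as mp
import numpy as np

def to_shared(arr):
    """Copy arr into a shared buffer whose element type matches arr.dtype.

    Hypothetical helper: returns the shared ctypes array and a NumPy view
    onto the same memory, reshaped like the input.
    """
    ctype = np.ctypeslib.as_ctypes_type(arr.dtype)
    shared = mp.RawArray(ctype, arr.size)  # flat, zero-initialized buffer
    view = np.frombuffer(shared, dtype=arr.dtype).reshape(arr.shape)
    view[...] = arr                        # copy the data in once
    return shared, view

x_np = np.arange(6, dtype=np.float32).reshape(2, 3)
x_shared, x_view = to_shared(x_np)
print(type(x_shared), x_view.dtype)
```

Because x_view is built with np.frombuffer, writes through it land in the shared buffer rather than in a private copy.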
There is also a way to do this with a helper that's built into NumPy:
x_np = np.random.rand(10, 10)
typecodes = np.ctypeslib._get_typecodes()
typecodes[x_np.__array_interface__['typestr']]
Output:
ctypes.c_double
The caveat is that the np.ctypeslib._get_typecodes function is marked as private (i.e. its name starts with _). Private functions can change or disappear between NumPy releases, so check that it exists in the version you are using before relying on it.
Alternatively, the implementation of _get_typecodes is pretty short, so you could just also copy the whole function over to your own code:
import ctypes
import numpy as np
def get_typecodes():
    ct = ctypes
    simple_types = [
        ct.c_byte, ct.c_short, ct.c_int, ct.c_long, ct.c_longlong,
        ct.c_ubyte, ct.c_ushort, ct.c_uint, ct.c_ulong, ct.c_ulonglong,
        ct.c_float, ct.c_double,
    ]
    return {np.dtype(ctype).str: ctype for ctype in simple_types}

Dtype work in FROM but not IMPORT

I swear I read almost all the "FROM vs IMPORT" questions before asking this.
While going through the NumPy tutorial I was using:
import numpy as np
but ran into trouble when declaring dtype of a matrix like:
a = np.ones((2,3),dtype=int32)
I kept getting "NameError: name 'int32' is not defined." I am using Python v3.2, and am following the tentative tutorial that goes along with it. I used:
from numpy import *
a = ones((2,3),dtype=int32)
That works. Any insight into why would be much appreciated.
Thank you in advance!
import numpy as np
# This works because int32 is looked up inside the numpy module.
a = np.ones((2,3), dtype=np.int32)
# This also works: NumPy resolves the string 'int32' to a dtype.
b = np.ones((2,3), dtype='int32')
# This raises NameError: "import numpy as np" only binds the name np,
# so the bare name int32 is not defined in your namespace.
c = np.ones((2,3), dtype=int32)
Back to your example:
from numpy import *
# This now works because "from numpy import *" copies numpy's public names,
# including int32, into your namespace.
d = np.ones((2,3), dtype=int32)
I tend to define the type using strings, as in array b.
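The difference comes down to which names each import statement binds. A minimal sketch of that, assuming a fresh interpreter where only "import numpy as np" has been executed (the flag bare_name_worked is an illustrative name):

```python
import numpy as np

# Qualified name and string form both resolve to the same dtype.
a = np.ones((2, 3), dtype=np.int32)
b = np.ones((2, 3), dtype="int32")

# The bare name int32 was never bound by "import numpy as np".
try:
    np.ones((2, 3), dtype=int32)
    bare_name_worked = True
except NameError:
    bare_name_worked = False

print(a.dtype, b.dtype, bare_name_worked)
print("int32" in globals(), hasattr(np, "int32"))
```

The last line shows the same thing from the namespace side: int32 is an attribute of the module object np, not a name in the current module.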