Why doesn't Numba work when I call a function over and over inside another one? - numpy

I have code that calculates skewness over 100 or more matrices. Each matrix is actually a directed percolation. I defined two functions. The first, doPercolationStep(), defines how this random matrix should be filled. The second, manual(hl), produces this matrix over and over, which means it calls doPercolationStep() repeatedly and then calculates the skewness of these random matrices. When I run the code with Numba I get this error:
No implementation of function Function(<built-in function setitem>) found for signature:
>>> setitem(array(undefined, 1d, C), int64, array(float64, 1d, C))
There are 16 candidate implementations:
- Of which 16 did not match due to:
Overload of function 'setitem': File: <numerous>: Line N/A.
With argument(s): '(array(undefined, 1d, C), int64, array(float64, 1d, C))':
No match.
During: typing of setitem at <timed exec> (55)
File "<timed exec>", line 55:
<source missing, REPL/exec in use?>
My first function is:
%%time
import numpy as np
import random as rand
from numba import jit, njit, prange
from pylab import *
import matplotlib.pyplot as plt
from numpy import linalg as la
import statistics as stt

@njit(parallel=True)
def doPercolationStep(vector, PROP, time):
    even = time % 2
    vector_copy = np.copy(vector)
    WIDTH = len(vector)
    for i in range(even, WIDTH, 2):
        if vector[i] == 1:
            pro1 = np.random.rand()
            pro2 = np.random.rand()
            if pro1 < PROP:
                vector_copy[(i + WIDTH - 1) % WIDTH] = 1
            if pro2 < PROP:
                vector_copy[(i + 1) % WIDTH] = 1
            vector_copy[i] = 0
    return vector_copy
And my main function is:
li = 700

@njit(parallel=True)
def manual(hl):
    WIDTH = hl
    HEIGHT = hl
    #PROP = 0.644
    L = hl
    p = linspace(0.1, 0.9, 15)
    nx = len(p)
    N = 100000
    sk = []
    ku = []
    for ip in range(nx):
        w0 = []
        for i in range(N):
            vector = np.zeros(WIDTH)
            vector[WIDTH//2] = 1
            PROP = p[ip]
            result = []
            #result.append(vector)
            for i in range(HEIGHT):
                vector = doPercolationStep(vector, PROP, i)
                result.append(vector)
            #np.savetxt('result.dat', result, fmt='%d')
            ss = np.array(result)
            ss = ss.astype(np.int64)
            ##ss = np.int(result)
            ###ss = result
            ss = np.where(ss==0, -1, ss)
            ww = (ss + (ss.T))/2
            re_size = ww/(np.sqrt(L))
            w, v = la.eigh(re_size)
            w = w.real
            w = max(w)
            w0.append(w)
        w1 = np.array(w0)
        w1_mean = np.mean(w1)
        w1_std = np.std(w1)
        w1_std_3 = (w1_std)**3
        w1_num = N
        w1_3 = 0
        for ai in w1:
            w1_3 += (ai - w1_mean)**3
        w1_skew = (w1_3)/((w1_num)*(w1_std_3))
        sk.append(w1_skew)
        #kyu = kurtosis(w0)
        #ku.append(kyu)
    return sk

manual(li)
And finally, I get this error:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function setitem>) found for signature:
>>> setitem(array(undefined, 1d, C), int64, array(float64, 1d, C))
There are 16 candidate implementations:
- Of which 16 did not match due to:
Overload of function 'setitem': File: <numerous>: Line N/A.
With argument(s): '(array(undefined, 1d, C), int64, array(float64, 1d, C))':
No match.
During: typing of setitem at <timed exec> (55)
File "<timed exec>", line 55:
<source missing, REPL/exec in use?>

You are not showing the relevant part of the error:
No implementation of function Function(<built-in function setitem>) found for signature:
>>> setitem(array(undefined, 1d, C), int64, array(float64, 1d, C))
There are 16 candidate implementations:
- Of which 16 did not match due to:
Overload of function 'setitem': File: <numerous>: Line N/A.
With argument(s): '(array(undefined, 1d, C), int64, array(float64, 1d, C))':
No match.
During: typing of setitem at /home/jotaele/Devel/codetest/tests/test_numba.py (2452)
File "test_numba.py", line 2452:
def manual(hl):
<source elided>
vector = doPercolationStep(vector, PROP, i)
result.append(vector)
^
This tells you exactly what's happening.
Python's setitem(a,b,c) sets the value of a at index b with the value in c.
In your case, result receives the value, but it is defined as [], so Numba doesn't know its type.
You need to initialize it as an array whose size and type you know in advance. Your code is now shorter and runs with Numba:
    ...
    PROP = p[ip]
    ss = np.empty((HEIGHT, WIDTH), dtype=np.int64)
    for i in range(HEIGHT):
        ss[i] = doPercolationStep(vector, PROP, i)
    ss = np.where(ss == 0, -1, ss)
    ...
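To see the typing issue in isolation, here is a minimal, self-contained sketch (my own toy example, not the percolation code): rows are written into a pre-allocated, typed 2-d array instead of being appended to an untyped Python list, which is what lets Numba infer the container's type.

import numpy as np
from numba import njit

@njit
def collect_rows(n, width):
    # Pre-allocate with a known shape and dtype instead of appending arrays
    # to an empty Python list, so the container's type is known to Numba.
    rows = np.empty((n, width), dtype=np.float64)
    for i in range(n):
        rows[i] = np.full(width, float(i))  # assign a whole row at once
    return rows

collect_rows(4, 3)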

Related

Numpy's hstack troubles with Numba

I am having trouble compiling a simple function in nopython mode with Numba:
import numpy as np
from numba import njit

@njit
def fun(x, y):
    points = np.hstack((x, y))
    return points

a = 5
b = 2
res = fun(a, b)
While this very simple script works without the @njit decorator, running it with the decorator throws the error:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function hstack at 0x7f491475fe50>) found for signature:
>>> hstack(UniTuple(int64 x 2))
There are 4 candidate implementations:
- Of which 4 did not match due to:
Overload in function '_OverloadWrapper._build.<locals>.ol_generated': File: numba/core/overload_glue.py: Line 129.
With argument(s): '(UniTuple(int64 x 2))':
Rejected as the implementation raised a specific error:
TypeError: np.hstack(): expecting a non-empty tuple of arrays, got UniTuple(int64 x 2)
raised from /usr/local/lib/python3.8/dist-packages/numba/core/typing/npydecl.py:748
During: resolving callee type: Function(<function hstack at 0x7f491475fe50>)
During: typing of call at <ipython-input-41-7a0a3bcd4b1a> (28)
File "<ipython-input-41-7a0a3bcd4b1a>", line 28:
def fun(x, y):
points = np.hstack((x, y))
^
If I try to stack one scalar and one array (which might be the case in the original function from which this problem arose), the behavior doesn't change:
b = np.ones(3)
res = fun(a,b)
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function hstack at 0x7f491475fe50>) found for signature:
>>> hstack(Tuple(int64, array(float64, 1d, C)))
There are 4 candidate implementations:
- Of which 4 did not match due to:
Overload in function '_OverloadWrapper._build.<locals>.ol_generated': File: numba/core/overload_glue.py: Line 129.
With argument(s): '(Tuple(int64, array(float64, 1d, C)))':
Rejected as the implementation raised a specific error:
TypeError: np.hstack(): expecting a non-empty tuple of arrays, got Tuple(int64, array(float64, 1d, C))
raised from /usr/local/lib/python3.8/dist-packages/numba/core/typing/npydecl.py:748
During: resolving callee type: Function(<function hstack at 0x7f491475fe50>)
During: typing of call at <ipython-input-42-39bffd13df71> (28)
File "<ipython-input-42-39bffd13df71>", line 28:
def fun(x, y):
points = np.hstack((x, y))
This is very puzzling to me. I am using Numba 0.56.4, which should be the latest stable release.
A very similar behavior happens with np.concatenate, too.
Any suggestion would be much appreciated.
Thank you!
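No answer is quoted for this one, but as one possible workaround (my assumption, not a confirmed fix): Numba's np.hstack overload wants a non-empty tuple of arrays, so for the scalar case the values can simply be packed into a pre-allocated 1-d array by hand.

import numpy as np
from numba import njit

@njit
def fun(x, y):
    # Hypothetical workaround: pack the two scalars into an array manually
    # instead of calling np.hstack on a tuple of scalars.
    points = np.empty(2, dtype=np.float64)
    points[0] = x
    points[1] = y
    return points

res = fun(5, 2)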

Numpy function round throws error using numba jitclass

I want to use numpy.round_ in a method of a class.
I want to accelerate any calculation done by methods of this class by using numba.
In general, it works fine. But somehow I do not get numpy.round_ running.
When using numpy.round_, numba throws an error.
Here is the code of a reduced example:
from numba import types
from numba.experimental import jitclass
import numpy as np

spec = [
    ('arr', types.Array(types.uint8, 1, 'C')),
    ('quot', types.Array(types.float64, 1, 'C')),
]

@jitclass(spec)
class test:
    def __init__(self):
        self.arr = np.array((130, 190, 130), dtype=np.uint8)

    def rnd_(self):
        quot = np.zeros(3, dtype=np.float64)
        val = self.arr
        quot = np.round(val/3.0)
        return quot

t = test()
a = t.rnd_()
It throws the following error:
TypingError: - Resolution failure for literal arguments:
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function round_ at 0x0000021B3500D870>) found for signature:
round_(array(float64, 1d, C))
There are 4 candidate implementations:
- Of which 4 did not match due to:
Overload in function '_OverloadWrapper._build.<locals>.ol_generated': File: numba\core\overload_glue.py: Line 131.
With argument(s): '(array(float64, 1d, C))':
Rejected as the implementation raised a specific error:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<intrinsic stub>) found for signature:
stub(array(float64, 1d, C))
There are 2 candidate implementations:
- Of which 2 did not match due to:
Intrinsic of function 'stub': File: numba\core\overload_glue.py: Line 35.
With argument(s): '(array(float64, 1d, C))':
No match.
During: resolving callee type: Function(<intrinsic stub>)
During: typing of call at <string> (3)
File "<string>", line 3:
<source missing, REPL/exec in use?>
raised from C:\ProgramData\Anaconda3\envs\mybase_conda\lib\site-packages\numba\core\typeinfer.py:1086
During: resolving callee type: Function(<function round_ at 0x0000021B3500D870>)
During: typing of call at .......\python\playground\tmp.py (27)
File "tmp.py", line 27:
def rnd_(self):
<source elided>
val = self.arr
quot = np.round(val/3.0)
^
- Resolution failure for non-literal arguments:
None
During: resolving callee type: BoundFunction((<class 'numba.core.types.misc.ClassInstanceType'>, 'rnd_') for instance.jitclass.test#21b3c031930<arr:array(uint8, 1d, C),quot:array(float64, 1d, C)>)
During: typing of call at <string> (3)
What am I doing wrong?
Seems like you need to pass round's optional arguments as well. I could reproduce the error with an even smaller example:
import numpy as np
import numba as nb

@nb.jit(nopython=True)
def foo(x):
    return np.round(x)
The fix to this is something like:
@nb.jit(nopython=True)
def foo(x):
    out = np.empty_like(x)
    np.round(x, 0, out)
    return out
So for your case, it should be:
def rnd_(self):
    quot = np.zeros(3, dtype=np.float64)
    np.round(self.arr / 3.0, 0, quot)
    return quot
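For completeness, here is how I would expect the reduced example from the question to look with that fix dropped in (a sketch assembled by me, not tested code from the answer; same spec and class, only rnd_ changed):

import numpy as np
from numba import types
from numba.experimental import jitclass

spec = [
    ('arr', types.Array(types.uint8, 1, 'C')),
    ('quot', types.Array(types.float64, 1, 'C')),
]

@jitclass(spec)
class test:
    def __init__(self):
        self.arr = np.array((130, 190, 130), dtype=np.uint8)

    def rnd_(self):
        # Pre-allocate the output and pass it as the explicit out argument,
        # since this Numba version's array overload of np.round appears to
        # need all three arguments (the point of the fix above).
        quot = np.zeros(3, dtype=np.float64)
        np.round(self.arr / 3.0, 0, quot)
        return quot

t = test()
a = t.rnd_()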

How do I get scipy.stats.truncnorm.rvs to use numpy.random.default_rng()?

I am having trouble with random_state in scipy.stats.truncnorm. Here is my code:
from scipy.stats import truncnorm
from numpy.random import default_rng
rg = default_rng( 12345 )
truncnorm.rvs(0.0,1.0,size=10, random_state=rg)
I get the following error:
File "test2.py", line 4, in <module>
truncnorm.rvs(0.0,1.0,size=10, random_state=rg)
File "/opt/anaconda3/envs/newbase/lib/python3.8/site-packages/scipy/stats/_distn_infrastructure.py", line 1004, in rvs
vals = self._rvs(*args, size=size, random_state=random_state)
File "/opt/anaconda3/envs/newbase/lib/python3.8/site-packages/scipy/stats/_continuous_distns.py", line 7641, in _rvs
out = self._rvs_scalar(a.item(), b.item(), size, random_state=random_state)
File "/opt/anaconda3/envs/newbase/lib/python3.8/site-packages/scipy/stats/_continuous_distns.py", line 7697, in _rvs_scalar
U = random_state.random_sample(N)
AttributeError: 'numpy.random._generator.Generator' object has no attribute 'random_sample'
I am using numpy 1.19.1 and scipy 1.5.0. The problem does not occur with scipy.stats.norm.rvs.
In scipy 1.7.1, the problem line has been changed to:
def _rvs_scalar(self, a, b, numsamples=None, random_state=None):
    if not numsamples:
        numsamples = 1

    # prepare sampling of rvs
    size1d = tuple(np.atleast_1d(numsamples))
    N = np.prod(size1d)  # number of rvs needed, reshape upon return

    # Calculate some rvs
    U = random_state.uniform(low=0, high=1, size=N)
    x = self._ppf(U, a, b)
    rvs = np.reshape(x, size1d)
    return rvs
Both have uniform, but rg does not have random_sample:
In [221]: rg.uniform
Out[221]: <function Generator.uniform>
In [222]: np.random.uniform
Out[222]: <function RandomState.uniform>
np.random.random_sample has this note:
.. note::
New code should use the ``random`` method of a ``default_rng()``
instance instead; please see the :ref:`random-quick-start`.
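Given that, one workaround on scipy 1.5.0 (my suggestion, not something stated above) is to upgrade scipy, or to pass the legacy RandomState generator, which still exposes random_sample:

from scipy.stats import truncnorm
import numpy as np

# Assumed workaround for SciPy versions before 1.7.1: the legacy RandomState
# object still provides .random_sample, which the old truncnorm sampler calls.
rg = np.random.RandomState(12345)
samples = truncnorm.rvs(0.0, 1.0, size=10, random_state=rg)
print(samples)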

How to subset a 1-d array using a boolean 1-d array in numba decorated function?

I gotta say, numba seems to be usable only in extremely simplistic use cases carefully designed to be presented in talks
I can run the following code just fine:
import numpy as np

def rt(hi):
    for i in hi:
        hi_ = i == hi
        t = hi[hi_]
    return None

rt(np.array(['a', 'b', 'c', 'd'], dtype='U'))
But when I decorate the above code with njit:
from numba import njit

@njit
def rt(hi):
    for i in hi:
        hi_ = i == hi
        t = hi[hi_]
    return None

rt(np.array(['a', 'b', 'c', 'd'], dtype='U'))
I get the following error:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-34-eadef1d0ecee> in <module>
5 t = hi[hi_]
6 return None
----> 7 rt(np.array(['a','b','c','d'],dtype='U'))
~/miniconda/envs/IndusInd_credit_cards_collections_scorecard_/lib/python3.8/site-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
418 e.patch_message(msg)
419
--> 420 error_rewrite(e, 'typing')
421 except errors.UnsupportedError as e:
422 # Something unsupported is present in the user code, add help info
~/miniconda/envs/IndusInd_credit_cards_collections_scorecard_/lib/python3.8/site-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
359 raise e
360 else:
--> 361 raise e.with_traceback(None)
362
363 argtypes = []
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function getitem>) found for signature:
>>> getitem(array([unichr x 1], 1d, C), Literal[bool](False))
There are 22 candidate implementations:
- Of which 20 did not match due to:
Overload of function 'getitem': File: <numerous>: Line N/A.
With argument(s): '(array([unichr x 1], 1d, C), bool)':
No match.
- Of which 1 did not match due to:
Overload in function 'GetItemBuffer.generic': File: numba/core/typing/arraydecl.py: Line 162.
With argument(s): '(array([unichr x 1], 1d, C), bool)':
Rejected as the implementation raised a specific error:
TypeError: unsupported array index type bool in [bool]
raised from /home/sarthak/miniconda/envs/IndusInd_credit_cards_collections_scorecard_/lib/python3.8/site-packages/numba/core/typing/arraydecl.py:68
- Of which 1 did not match due to:
Overload in function 'GetItemBuffer.generic': File: numba/core/typing/arraydecl.py: Line 162.
With argument(s): '(array([unichr x 1], 1d, C), Literal[bool](False))':
Rejected as the implementation raised a specific error:
TypeError: unsupported array index type Literal[bool](False) in [Literal[bool](False)]
raised from /home/sarthak/miniconda/envs/IndusInd_credit_cards_collections_scorecard_/lib/python3.8/site-packages/numba/core/typing/arraydecl.py:68
During: typing of intrinsic-call at <ipython-input-34-eadef1d0ecee> (5)
File "<ipython-input-34-eadef1d0ecee>", line 5:
def rt(hi):
<source elided>
hi_ = i == hi
t = hi[hi_]
^
How to subset a 1-d array using a boolean 1-d array in numba decorated function?
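No answer is included above, but as a sketch of my reading of the traceback: with a numeric dtype the comparison i == hi produces a boolean array and boolean indexing compiles fine, so the failure appears to be specific to the unicode array, where the comparison collapses to a single bool that is then used as an (unsupported) index.

import numpy as np
from numba import njit

@njit
def rt(hi):
    # Boolean masking of a 1-d numeric array in nopython mode: the elementwise
    # comparison yields a boolean array, which is a valid index type.
    t = hi
    for i in hi:
        hi_ = hi == i
        t = hi[hi_]
    return t

rt(np.array([1.0, 2.0, 3.0, 4.0]))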

TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

Strange error from numpy via matplotlib when trying to get a histogram of a tiny toy dataset. I'm just not sure how to interpret the error, which makes it hard to see what to do next.
Didn't find much related, though this nltk question and this gdsCAD question are superficially similar.
I intend the debugging info at bottom to be more helpful than the driver code, but if I've missed something, please ask. This is reproducible as part of an existing test suite.
if n > 1:
return diff(a[slice1]-a[slice2], n-1, axis=axis)
else:
> return a[slice1]-a[slice2]
E TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')
../py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py:1567: TypeError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) bt
[...]
py2.7.11-venv/lib/python2.7/site-packages/matplotlib/axes/_axes.py(5678)hist()
-> m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(606)histogram()
-> if (np.diff(bins) < 0).any():
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) p numpy.__version__
'1.11.0'
(Pdb) p matplotlib.__version__
'1.4.3'
(Pdb) a
a = [u'A' u'B' u'C' u'D' u'E']
n = 1
axis = -1
(Pdb) p slice1
(slice(1, None, None),)
(Pdb) p slice2
(slice(None, -1, None),)
(Pdb)
I got the same error, but in my case I was subtracting a dict key from a dict value. I fixed this by subtracting the dict value for the corresponding key from the other dict value.
cosine_sim = cosine_similarity(e_b-e_a, w-e_c)
Here I got the error because e_b, e_a and e_c are embedding vectors for the words a, b and c respectively. I didn't realize that 'w' was a string; once I found that out, I fixed it with the following line:
cosine_sim = cosine_similarity(e_b-e_a, word_to_vec_map[w]-e_c)
Instead of subtracting the dict key, I now subtract the corresponding value for that key.
I had a similar issue where an integer in a row of a DataFrame I was iterating over was of type numpy.int64. I got the
TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')
error when trying to subtract a float from it.
The easiest fix for me was to convert the row using pd.to_numeric(row).
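A toy illustration of that fix (my own made-up data, not the answerer's DataFrame): values that come through as strings cannot take part in numeric subtraction, and pd.to_numeric converts the row before doing the arithmetic.

import pandas as pd

row = pd.Series(['1', '2', '3'])   # values arrived as strings
numeric_row = pd.to_numeric(row)   # now an integer Series
print(numeric_row - 0.5)           # subtraction works after conversion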
Why is it applying diff to an array of strings?
I get an error at the same point, though with a different message:
In [23]: a=np.array([u'A' u'B' u'C' u'D' u'E'])
In [24]: np.diff(a)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-24-9d5a62fc3ff0> in <module>()
----> 1 np.diff(a)
C:\Users\paul\AppData\Local\Enthought\Canopy\User\lib\site-packages\numpy\lib\function_base.pyc in diff(a, n, axis)
1112 return diff(a[slice1]-a[slice2], n-1, axis=axis)
1113 else:
-> 1114 return a[slice1]-a[slice2]
1115
1116
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'
Is this array the bins parameter? What do the docs say bins should be?
I am fairly new to this myself, but I had a similar error and found that it is due to a type-casting issue. I was trying to concatenate rather than take the difference, but I think the principle is the same here. I provided a similar answer on another question, so I hope that is OK.
In essence, you need a different data type cast; in my case I needed str, not float, and I suspect yours is the same, so my suggested solution is below. I am sorry I cannot test it before suggesting it, but I am unclear from your example what you were doing.
return diff(str(a[slice1])-str(a[slice2]), n-1, axis=axis)
Please see my example code below for the fix to my code; the change occurs on the third-to-last line. The code produces a basic random forest model.
import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Use the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay = npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain) # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name, 'a') as fpred:
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue:
        fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
        for i in range(0, lenpredictions):
            fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
    else:
        print "ERROR - names, prediction and true value array size mismatch."
This leads to an error of:
Traceback (most recent call last):
File "min_example.py", line 40, in <module>
fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')
The solution is to make each variable a str() type on the third-to-last line and then write to the file. No other changes to the code have been made from the above.
import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Use the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay = npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain) # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name, 'a') as fpred:
    lenpredictions = len(RFpreds)
    lentrue = yTest.shape[0]
    if lenpredictions == lentrue:
        fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
        for i in range(0, lenpredictions):
            fpred.write(str(RFpreds[i])+",,"+str(yTest[i])+",\n")
    else:
        print "ERROR - names, prediction and true value array size mismatch."
These examples are from a larger piece of code, so I hope they are clear enough.
I think @James is right. I got stuck on the same error while working on polyval(). And yeah, the solution is to use the same type of variables. You can use a typecast to cast all variables to the same type.
Below is an example:
import numpy
P = numpy.array(input().split(), float)
x = float(input())
print(numpy.polyval(P,x))
Here I used float as the output type, so even if the user inputs an int value (a whole number), the final answer will be typecast to float.
I ran into the same issue, but in my case a plain Python list was being used instead of a NumPy array. Using two NumPy arrays solved the issue for me.
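A small sketch of that situation (my own toy example): plain Python lists do not support elementwise subtraction, while NumPy arrays do.

import numpy as np

a = [1.0, 2.0, 3.0]
b = [0.5, 0.5, 0.5]
# a - b                            # TypeError with plain Python lists
print(np.array(a) - np.array(b))   # elementwise result: [0.5 1.5 2.5]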