The title says it all. I want to convert a PyTorch autograd.Variable to its equivalent numpy array. The official documentation advocates using a.numpy() to get the equivalent numpy array (for a PyTorch Tensor). But this gives me the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bishwajit/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 63, in __getattr__
    raise AttributeError(name)
AttributeError: numpy
Is there any way I can circumvent this?
There are two possible cases:
Using GPU: If you try to convert a CUDA float tensor directly to numpy, as shown below, it will throw an error.
x.data.numpy()
RuntimeError: numpy conversion for FloatTensor is not supported
So you can't convert a CUDA float tensor directly to numpy; you have to convert it to a CPU float tensor first, and then convert that to numpy, as shown below.
x.data.cpu().numpy()
Using CPU: Converting a CPU tensor is straightforward.
x.data.numpy()
I have found the way. I can first extract the Tensor data from the autograd.Variable using a.data. Then the rest is really simple: I just use a.data.numpy() to get the equivalent numpy array. Here are the steps:
a = a.data # a is now torch.Tensor
a = a.numpy() # a is now numpy array
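Putting the two answers together, a minimal sketch (the variable names here are illustrative; the CUDA branch runs only if a GPU is available):

import torch
from torch.autograd import Variable

a = Variable(torch.randn(3, 4))     # a CPU Variable
arr = a.data.numpy()                # extract the Tensor, then convert

if torch.cuda.is_available():
    b = Variable(torch.randn(3, 4).cuda())  # a CUDA Variable
    arr_gpu = b.data.cpu().numpy()          # copy to host memory first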
I am doing a project in which I need to use TensorFlow and pandas together.
The primary issue I am having is with the numpy version. If I have the latest version of numpy, then TensorFlow throws an error:
"Cannot convert a symbolic Tensor (lstm/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported"
If I downgrade numpy to 1.19.5, as the internet advises, then importing pandas throws this error:
"numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject"
Can someone please help?
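Both error messages point at a mismatch between the installed numpy and the numpy that each package's compiled extensions were built against. As a first diagnostic step, here is a sketch (assuming Python 3.8+) that reads the installed versions from package metadata, which works even when importing pandas itself fails:

from importlib.metadata import version

# read versions from metadata without importing the (possibly broken) packages
for pkg in ('numpy', 'pandas', 'tensorflow'):
    print(pkg, version(pkg))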
I am following the code here:
https://www.kaggle.com/tanlikesmath/diabetic-retinopathy-with-resnet50-oversampling
However, during the metrics calculation, I am getting the following error:
File "main.py", line 50, in <module>
learn.fit_one_cycle(4,max_lr = 2e-3)
...
File "main.py", line 39, in quadratic_kappa
return torch.tensor(cohen_kappa_score(torch.argmax(y_hat,1), y, weights='quadratic'),device='cuda:0')
...
File "/pfs/work7/workspace/scratch/ul_dco32-conda-0/conda/envs/resnet50/lib/python3.8/site-packages/torch/tensor.py", line 486, in __array__
return self.numpy()
TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Here are the metrics and the model:
def quadratic_kappa(y_hat, y):
    return torch.tensor(cohen_kappa_score(torch.argmax(y_hat, 1), y, weights='quadratic'), device='cuda:0')

learn = cnn_learner(data, models.resnet50, metrics=[accuracy, quadratic_kappa])
learn.fit_one_cycle(4, max_lr=2e-3)
As discussed in https://discuss.pytorch.org/t/typeerror-can-t-convert-cuda-tensor-to-numpy-use-tensor-cpu-to-copy-the-tensor-to-host-memory-first/32850/6, I have to bring the data back to the CPU, but I am slightly lost as to how to do it.
I tried adding .cpu() all over the metrics but have not been able to solve it so far.
I'm assuming that both y and y_hat are CUDA tensors; that means you need to bring them both to the CPU for cohen_kappa_score, not just one.
def quadratic_kappa(y_hat, y):
    # note the two added .cpu() calls
    return torch.tensor(cohen_kappa_score(torch.argmax(y_hat.cpu(), 1), y.cpu(), weights='quadratic'), device='cuda:0')
Calling .cpu() on a tensor that is already on the CPU has no effect, so it's safe to use in any case.
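For reference, a self-contained sketch of the same pattern with made-up data (the shapes and class count are just for illustration):

import torch
from sklearn.metrics import cohen_kappa_score

def quadratic_kappa(y_hat, y):
    # .cpu() is a no-op on CPU tensors, so this works on either device
    preds = torch.argmax(y_hat, dim=1).cpu()
    return torch.tensor(cohen_kappa_score(preds, y.cpu(), weights='quadratic'),
                        device=y_hat.device)

y_hat = torch.randn(8, 5)        # fake logits for 5 classes
y = torch.randint(0, 5, (8,))    # fake labels
print(quadratic_kappa(y_hat, y))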
I went from a CPU to a GPU version and received this error. It was due to passing metrics=[mean_absolute_error, mean_squared_error] to the Learner object (in my case tabular_learner).
Removing the metrics parameter temporarily solved the issue for me.
I am having issues replicating a pymc2 code using pymc3.
I believe this is because pymc3 uses Theano-type variables, which are not compatible with the numpy operations I am using. So I am using the @theano.compile.ops.as_op decorator:
I have this function:
with pymc3.Model() as model:
    z_stars = pymc3.Uniform('z_star', self.z_min_ssp_limit, self.z_max_ssp_limit)
    Av_stars = pymc3.Uniform('Av_star', 0.0, 5.00)
    sigma_stars = pymc3.Uniform('sigma_star', 0.0, 5.0)

    # Fit observational wavelength
    ssp_fit_output = self.ssp_fit_theano(z_stars, Av_stars, sigma_stars,
                                         self.obj_data['obs_wave_resam'],
                                         self.obj_data['obs_flux_norm_masked'],
                                         self.obj_data['basesWave_resam'],
                                         self.obj_data['bases_flux_norm'],
                                         self.obj_data['int_mask'],
                                         self.obj_data['normFlux_obs'])

    # Define likelihood
    like = pymc3.Normal('ChiSq', mu=ssp_fit_output,
                        sd=self.obj_data['obs_fluxEr_norm'],
                        observed=self.obj_data['obs_fluxEr_norm'])

    # Run the sampler
    trace = pymc3.sample(iterations, step=step, start=start_conditions, trace=db)
where:
@theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar, t.dvector,
                                  t.dvector, t.dvector, t.dvector, t.dvector, t.dscalar],
                          otypes=[t.dvector])
def ssp_fit_theano(self, input_z, input_sigma, input_Av, obs_wave, obs_flux_masked,
                   rest_wave, bases_flux, int_mask, obsFlux_mean):
    ...
The first three variables are scalars (from the pymc3 uniform distributions). The remaining variables are numpy arrays, and the last one is a float. However, I am getting this "'numpy.ndarray' object has no attribute 'type'" error:
File "/home/user/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 615, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/user/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 963, in make_node
if not all(inp.type == it for inp, it in zip(inputs, self.itypes)):
File "/home/user/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 963, in <genexpr>
if not all(inp.type == it for inp, it in zip(inputs, self.itypes)):
AttributeError: 'numpy.ndarray' object has no attribute 'type'
Any advice in the right direction will be most welcome.
I had a bunch of time-wasting stops when I went from pymc2 to pymc3. The problem, I think, is that the documentation is quite bad. I suspect they neglect the docs while the code is still evolving. Three comments/pieces of advice:
I hope you can find some help using '@theano.compile.ops.as_op' here: failure to adapt pymc2 into pymc3, or here: how to fit a method belonging to an instance with pymc3?
The drawback of '@theano.compile.ops.as_op' is that you implicitly exclude any analysis related to the gradient of your function. To have access to the gradient, I think you need to define your function in the more complex way presented here: how to fit a method belonging to an instance with pymc3?
Warning: for the moment, using Theano seems to be a source of problems if you want to distribute your code under Windows. See: build a .exe for Windows from a python 3 script importing theano with pyinstaller. I am not sure whether it is just personal clumsiness or really a problem. Personally, I had to give up Theano to be able to distribute my code...
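As an illustration of what the AttributeError is complaining about: each argument to an @as_op-decorated function must already be a Theano variable carrying a .type, so plain NumPy arrays (like the self.obj_data[...] entries above) need to be wrapped, e.g. with theano.shared, before the call. A minimal sketch with a made-up function:

import numpy as np
import theano
import theano.tensor as t

@theano.compile.ops.as_op(itypes=[t.dscalar, t.dvector], otypes=[t.dvector])
def scale(a, x):
    return a * x

a = t.dscalar('a')
x = theano.shared(np.arange(5.0))   # wrapping gives the array a .type
y = scale(a, x)                     # scale(a, np.arange(5.0)) would raise
                                    # the AttributeError from the question
f = theano.function([a], y)
print(f(2.0))                       # [0. 2. 4. 6. 8.]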
I want to save the result of TfidfVectorizer from sklearn.feature_extraction.text to a text file for future use. As I found, it is a SciPy sparse matrix. However, when I try to save it using the following code
np.savetxt('Feature_TfIdf.txt', X_Tfidf, fmt='%2.6f')
I get an error like this
IndexError: tuple index out of range
Use joblib.dump (or sklearn.externals.joblib.dump) for this. NumPy doesn't understand SciPy sparse matrices.
Simple example:
joblib.dump(tfidf, 'TfIdf.pkl')
I managed to solve the problem by converting the sparse matrix to a dense matrix and then saving the result. This approach, however, is not useful for large arrays, so it is better to save the matrix in .pkl format.
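For completeness, a sketch of both approaches, assuming X_Tfidf is the sparse matrix returned by TfidfVectorizer.fit_transform:

import numpy as np
from joblib import dump, load

# option 1: densify first; only viable when the matrix is small
np.savetxt('Feature_TfIdf.txt', X_Tfidf.toarray(), fmt='%2.6f')

# option 2: pickle the sparse matrix as-is; preferred for large data
dump(X_Tfidf, 'TfIdf.pkl')
X_restored = load('TfIdf.pkl')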
Can anyone tell me why this NumPy record is having trouble with Python's new-style string formatting? All floats in the record choke on "{:f}".format(record).
Thanks for your help!
In [334]: type(tmp)
Out[334]: numpy.core.records.record
In [335]: tmp
Out[335]: ('XYZZ', 2001123, -23.823917388916016)
In [336]: tmp.dtype
Out[336]: dtype([('sta', '|S6'), ('ondate', '<i8'), ('lat', '<f4')])
# Some formatting works fine
In [337]: '{0.sta:6.6s} {0.ondate:8d}'.format(tmp)
Out[337]: 'XYZZ    2001123'
# Any float has trouble
In [338]: '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/Users/jkmacc/python/pisces/<ipython-input-338-e5f6bcc4f60f> in <module>()
----> 1 '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp)
ValueError: Unknown format code 'f' for object of type 'str'
This question was answered on the NumPy user mailing list under "floats coerced to string with "{:f}".format() ?":
It seems that np.int64/np.int32 and np.str inherit their respective native Python __format__() methods, but np.float32/np.float64 doesn't get __builtin__.float.__format__(). That's not intuitive, but I see now why this works:
In [8]: '{:6.6s} {:8d} {:11.6f}'.format(tmp.sta, tmp.ondate, float(tmp.lat))
Out[8]: 'XYZZ    2001123  -23.823917'
Thanks!
-Jon
EDIT:
np.float32/np.int32 inherit from the native Python types on a 32-bit system, and np.float64/np.int64 do on a 64-bit system. A mismatch will produce the same problem as in the original post.
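A self-contained Python 3 sketch of the workaround, rebuilding a record like the one above (on Python 3 the 'S6' field comes back as bytes, hence the .decode()):

import numpy as np

rec = np.rec.array([('XYZZ', 2001123, -23.823917)],
                   dtype=[('sta', 'S6'), ('ondate', '<i8'), ('lat', '<f4')])[0]

# cast the NumPy float to a Python float before using the 'f' format code
print('{:6.6s} {:8d} {:11.6f}'.format(
    rec.sta.decode(), int(rec.ondate), float(rec.lat)))
# prints: XYZZ    2001123  -23.823917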