Issue converting a 3D numpy array to an image in PIL - numpy

Consider the following statement
PIL_att = Image.fromarray(np.uint8(one_map))
It causes the error TypeError: Cannot handle this data type: (1, 1, 48), |u1
print(one_map.shape) gives (272, 272, 48), print(one_map.ndim) gives 3, and print(one_map.dtype) gives float64. Each value in np.uint8(one_map) is a whole number from 0 to 255.
What is the issue and how to resolve it?
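PIL's Image.fromarray can only map a uint8 array with 1, 3, or 4 channels to an image mode (L, RGB, RGBA), so a (272, 272, 48) array has no matching mode; that is exactly what the (1, 1, 48), |u1 in the message refers to. A minimal sketch of one workaround, assuming each of the 48 channels is meant to be a separate grayscale image (the per-channel treatment is an assumption, not from the original post):

import numpy as np
from PIL import Image

# stand-in for the real data: (272, 272, 48) float64 values in [0, 255]
one_map = np.random.rand(272, 272, 48) * 255

arr = np.uint8(one_map)

# a 48-channel array has no PIL mode; save each channel as grayscale instead
for k in range(arr.shape[2]):
    Image.fromarray(arr[:, :, k], mode="L").save(f"one_map_{k:02d}.png")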

Related

Cross antimeridian 180° in Matplotlib

I'm trying to plot some meteorological fields from a GRIB2 file and I have a problem: how can I cross the antimeridian (180°) in Matplotlib? When I set the first lon = 160 and the second lon = -40, Python raises an error:
TypeError: Input z must be at least a (2, 2) shaped array, but has shape (101, 0)
Does anyone know how to solve that problem? Thanks.
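A hedged sketch of one common workaround, assuming the GRIB longitudes come in [-180, 180): remap them to [0, 360) so a window that crosses the antimeridian becomes one contiguous, monotonic range instead of an empty slice:

import numpy as np

# stand-in grid: longitudes stored in [-180, 180) at 0.5 degree spacing
lon = np.arange(-180.0, 180.0, 0.5)

# shift to [0, 360): 160E .. -40 becomes the contiguous range 160 .. 320
lon360 = np.where(lon < 0, lon + 360, lon)
order = np.argsort(lon360)   # keep the grid monotonic for contour plotting
mask = (lon360[order] >= 160) & (lon360[order] <= 320)
print(lon360[order][mask].min(), lon360[order][mask].max())   # 160.0 320.0

# note: the field values must be reordered along the longitude axis
# with the same `order` before slicing with `mask`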

numpy.VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences

Here's an example of behavior I cannot understand; maybe someone can share some insight into the logic behind it:
ccn = np.ones(1)
bbb = 7
bbn = np.array(bbb)
bbn * ccn # this is OK
array([7.])
np.prod((bbn,ccn)) # but this is NOT
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.2.2\plugins\python-ce\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<input>", line 1, in <module>
File "<__array_function__ internals>", line 5, in prod
File "C:\Users\...\venv\lib\site-packages\numpy\core\fromnumeric.py", line 2999, in prod
return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
File "C:\Users\...\venv\lib\site-packages\numpy\core\fromnumeric.py", line 87, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
numpy.VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
Why? Why would a simple multiplication of two numbers be a problem? As far as formal algebra goes there are no dimensional problems and no datatype problems. The result is invariably also a single number; there's no chance it would "suddenly" turn into a vector or an object or anything alike. prod(a,b) for a and b being scalars or 1-by-1 "matrices" is something MATLAB or Octave would eat no problem.
I know I can turn this error off and such, but why is it even an error?
In [346]: ccn = np.ones(1)
...: bbb = 7
...: bbn = np.array(bbb)
In [347]: ccn.shape
Out[347]: (1,)
In [348]: bbn.shape
Out[348]: ()
In [349]: np.array((bbn,ccn))
<ipython-input-349-997419ba7a2f>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
np.array((bbn,ccn))
Out[349]: array([array(7), array([1.])], dtype=object)
You have arrays with different dimensions, which can't be combined into one numeric array.
That np.prod expression is actually
np.multiply.reduce(np.array([bbn,ccn]))
as can be deduced from your traceback.
In Octave both objects have shape (1,1), 2d
>> ccn = ones(1)
ccn = 1
>> ccn = ones(1);
>> size(ccn)
ans =
1 1
>> bbn = 7;
>> size(bbn)
ans =
1 1
>> [bbn,ccn]
ans =
7 1
It doesn't have true scalars; everything is 2d (even 3d is a fudge on the last dimension).
And with 'raw' Python inputs:
In [350]: np.array([1,[1]])
<ipython-input-350-f17372e1b22d>:1: VisibleDeprecationWarning: ...
np.array([1,[1]])
Out[350]: array([1, list([1])], dtype=object)
The object dtype array preserves the type of the inputs.
edit
prod isn't a simple multiplication. It's a reduction operation, like the big Pi in math. Even in Octave it isn't:
>> prod([[2,3],[3;4]])
error: horizontal dimensions mismatch (1x2 vs 2x1)
>> [2,3]*[3;4]
ans = 18
>> [2,3].*[3;4]
ans =
6 9
8 12
The numpy equivalent:
In [97]: np.prod((np.array([2,3]),np.array([[3],[4]])))
/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py:87: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences...
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: could not broadcast input array from shape (2,1) into shape (2,)
In [98]: np.array([2,3])@np.array([[3],[4]])
Out[98]: array([18])
In [99]: np.array([2,3])*np.array([[3],[4]])
Out[99]:
array([[ 6, 9],
[ 8, 12]])
The warning, and here the error, is produced by trying to make ONE array from (np.array([2,3]),np.array([[3],[4]])).
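Just as an illustrative sketch (not from the original answer), here are a few ways to get the intended scalar product for the original bbn/ccn pair without asking np.prod to build one array out of mismatched shapes:

import numpy as np

ccn = np.ones(1)      # shape (1,)
bbn = np.array(7)     # shape ()

# plain multiplication broadcasts () with (1,) just fine
print(bbn * ccn)      # [7.]

# np.prod is happy once the inputs are combined into one well-shaped array
print(np.prod(np.concatenate((bbn.reshape(1), ccn))))   # 7.0

# or reduce each operand to a Python scalar first
print(np.prod([bbn.item(), ccn.item()]))                # 7.0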

How to interpolate a 5-dimensional array?

I have an array of shape [41, 101, 6, 4, 280]. I want to interpolate it so that if I give it a value from the 41 temperature values and the 101 density values, it spits out an array of shape [6, 4, 280]. Is there a NumPy function that can deal with this?
Let's start step by step:
Q : Is there a NumPy function that can deal with this?
Yes, there is.
The first step is to generate an instance of a 5D numpy.ndarray that will contain your known data points (do not mind the dtype; it was used just as a reminder that we can go literally from bits up to complex128 values here, if needed later):
>>> import numpy as np
>>>
>>> a5Dtensor = np.ndarray( (41, 101, 6, 4, 280 ), dtype = np.uint8 )
Now, let's validate its .shape:
>>> a5Dtensor.shape
(41, 101, 6, 4, 280)
The core trick is the built-in smart numpy slicing:
>>> a5Dtensor[0,0,:,:,:].shape
(6, 4, 280)
This indeed returns the requested 3D cube of data points.
The slicing trick is also very smart in not producing any new memory allocations (which becomes important once the sizes grow beyond the L1/L2/L3 CPU-cache horizons, and the more so once you get beyond a few GB of data):
>>> a5Dtensor[0,0,:,:,:].flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False <------ may enjoy FORTRAN efficient data layout, where needed
OWNDATA : False <------ 3D-cube data not "copied", rather "viewed" inside 5D
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
Last but not least: if aTemperatureVALUE and aDensityVALUE are not indices into the 5D array but data values among which you seek the interpolation of the 3D cube's data-point values, numpy can serve with a piecewise linear interpolation (with some constraints). Yet producing each value in the resulting 3D cube of interpolated values requires a 2D interpolation, based on the values held for the nearest-{lower, upper} temperature and density values present in the original 5D data points.
There are other smart tools in numpy for this (the nD .meshgrid() method, .argwhere() and others), yet pre-sorting and indirect indexing may be needed in case the original 5D data points do not exhibit some helpful properties, such as having been pre-sorted along the first two dimensions for easier processing by the sought-for 2D (temperature, density) interpolator (be it tailor-made for dtype=uint8, float64, complex128 or object).
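As a hedged sketch of the 2D (temperature, density) interpolation described above, assuming the 41 temperature and 101 density grid values are available as sorted 1-D arrays (the names temps and dens are invented here), a manual bilinear blend of the four nearest 3D cubes might look like:

import numpy as np

def interp_cube(data, temps, dens, t, d):
    """Bilinear interpolation over the first two axes of a 5-D array.
    data  : ndarray of shape (41, 101, 6, 4, 280)
    temps : sorted 1-D array of the 41 temperature grid values
    dens  : sorted 1-D array of the 101 density grid values
    t, d  : the query temperature and density
    Returns the interpolated (6, 4, 280) cube."""
    # locate the grid cell bracketing the query point
    i = np.clip(np.searchsorted(temps, t) - 1, 0, len(temps) - 2)
    j = np.clip(np.searchsorted(dens, d) - 1, 0, len(dens) - 2)
    # fractional position of the query inside that cell
    wt = (t - temps[i]) / (temps[i + 1] - temps[i])
    wd = (d - dens[j]) / (dens[j + 1] - dens[j])
    # blend the four neighbouring 3-D cubes; broadcasting handles the rest
    return ((1 - wt) * (1 - wd) * data[i, j]
            + wt * (1 - wd) * data[i + 1, j]
            + (1 - wt) * wd * data[i, j + 1]
            + wt * wd * data[i + 1, j + 1])

# toy usage with made-up grids
data = np.random.rand(41, 101, 6, 4, 280)
temps = np.linspace(250.0, 330.0, 41)
dens = np.linspace(0.5, 1.5, 101)
cube = interp_cube(data, temps, dens, t=291.3, d=1.02)
print(cube.shape)   # (6, 4, 280)

scipy.interpolate.RegularGridInterpolator may also handle this directly, as recent versions accept trailing output dimensions, but the manual blend above keeps the dependency to numpy alone.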

Getting Error while performing Undersampling for Sklearn

I am trying to build a random forest classifier for binary classification. My data is imbalanced, hence I am performing undersampling.
train = data.drop(['Co_Name','Cust_ID','Phone','Shpr_ID','Resi_Cnt','Buz_Cnt','Nearby_Cnt','parseNumber','removeString','Qty','bins','Adj_Addr','Resi','Weight','Resi_Area','Lat','Lng'], axis=1)
Y = data['Resi']
from sklearn import metrics
from imblearn.under_sampling import RandomUnderSampler
rus = RandomUnderSampler(random_state=42)
X_train_res, y_train_res = rus.fit_sample(train, Y)
I am getting the below error
446 # make sure we actually converted to numeric:
447 if dtype_numeric and array.dtype.kind == "O":
--> 448 array = array.astype(np.float64)
449 if not allow_nd and array.ndim >= 3:
450 raise ValueError("Found array with dim %d. %s expected <= 2."
ValueError: setting an array element with a sequence.
How do I fix this?
Can you share the dataframe, or a sample of it?
This error can be a lot of things, for example:
If you try:
np.asarray(
    [
        [1, 2],
        [2, 3, 4]
    ],
    dtype=float)
You will get:
ValueError: setting an array element with a sequence.
This is because the nested lists have mismatched column lengths: you can't create a rectangular array from lists when the second list has a different number of columns.
But your error is probably related to the shape of train vs Y, or to the column types inside train. The undersampler's fit function performs a conversion internally that throws this error. Confirm that train has appropriate (numeric) column types before running the RandomUnderSampler.
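As an illustrative check (the frame below is hypothetical, not from the original post), one way to spot non-numeric columns before calling RandomUnderSampler:

import numpy as np
import pandas as pd

# hypothetical frame: one clean numeric column, one column holding lists
train = pd.DataFrame({"a": [1.0, 2.0], "b": [[1, 2], [3, 4, 5]]})

# object-dtype columns are the usual culprits behind
# "ValueError: setting an array element with a sequence"
suspects = train.dtypes[train.dtypes == object].index.tolist()
print(suspects)   # ['b']

# sklearn/imblearn attempt roughly this conversion internally, which fails:
# np.asarray(train).astype(np.float64)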

Sklearn and Sparse Matrices ValueError

I'm aware similar questions have been asked before, and I've tried everything suggested in them, but I'm still stumped. I have a dataset with 2 columns: the first with vectors representing words, each stored as a 1x10000 sparse csr matrix (so a matrix in each cell), and the second containing integer ratings which I will use for classification. When I run the following code
for index, row in data.iterrows():
    print(row)
    print(row[0].shape)
I get the correct output for all the rows:
Vector    (0, 0)\t1.0\n  (0, 1)\t1.0\n  (0, 2)\t1.0\n ...
Rating    5
Name: 0, dtype: object
(1, 10000)
Now when I try passing my data in any SKlearn classifier like so:
uniform_random_classifier = DummyClassifier(strategy='uniform')
uniform_random_classifier.fit(data["Vectors"], data["Ratings"])
I get the following error:
array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.
What am I doing wrong? I've made sure all my sparse matrices are the same size and I've tried reshaping my data in various ways, but with no luck, and the Sklearn classifiers are supposed to be able to deal with csr matrices.
Update: Converting the entire "Vectors" column into one large 2-D matrix did the trick, but for completeness' sake the following is the code I used to generate my dataframe, in case anyone is curious and wants to try solving the original issue. Assume data is a pandas dataframe with rows that look like
"560 420 222" 5.0
"2345 2344 2344 5" 3.0
def vectorize(feature, size):
    """Given a numeric string generated from a vocabulary table return a binary
    vector representation of each feature"""
    vector = sparse.lil_matrix((1, size))
    for number in feature.split(' '):
        try:
            vector[0, int(number) - 1] = 1
        except ValueError:
            pass
    return vector

def vectorize_dataset(data, vectorize, size):
    """Given a dataset in the appropriate "num num num..." format, a specific
    vectorization format, and a vector size, returns the dataset in vectorized form"""
    result_data = pd.DataFrame(index=range(data.shape[0]), columns=["Vector", "Rating"])
    for index, row in data.iterrows():
        # All the mixing up of decodings and encoding has made it so that Pandas incorrectly parses EOF chars
        if type(row[0]) == type('str'):
            result_data.iat[index, 0] = vectorize(row[0], size).tocsr()
            result_data.iat[index, 1] = data.loc[index][1]
    return result_data
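For completeness, a minimal sketch of the conversion the update describes, assuming the result_data frame built by the code above: scipy.sparse.vstack stacks the per-row 1x10000 csr matrices into a single 2-D sparse matrix that sklearn estimators accept directly.

from scipy import sparse
from sklearn.dummy import DummyClassifier

# stack the per-row 1x10000 csr matrices into one (n_rows, 10000) sparse matrix
X = sparse.vstack(result_data["Vector"].tolist())
y = result_data["Rating"].astype(int)

uniform_random_classifier = DummyClassifier(strategy="uniform")
uniform_random_classifier.fit(X, y)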