Why does line 3 raise "ValueError: matrix must be 2-dimensional"?
import numpy as np
np.mat([[[1],[2]],[[10],[1,3]]])
np.mat([[[1],[2]],[[10],[1]]])
This code raises an error because NumPy tries to determine the dimensionality of your input from its nesting levels (nesting levels map to dimensions).
If, at some level, the elements do not all have the same length (i.e. they are incompatible), it creates the array using the deepest consistent nesting it can, treating the objects below that level as the elements of the array.
For this reason:
np.mat([[[1],[2]],[[10],[1,3]]])
will give you a matrix of objects (lists), while:
np.mat([[[1],[2]],[[10],[1]]])
would result in a 3D array of numbers which np.mat() does not want to squeeze into a matrix.
Also, please avoid using np.mat() in your code as it is deprecated.
Use np.array() instead.
Incidentally, np.array() would work in both cases, giving you a (2, 2, 1)-shaped array of int, which you could np.squeeze() into a matrix if you like.
However, it would be better to start from a nesting level of 2 if all you want is a matrix:
np.array([[1, 2], [10, 1]])
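For illustration, a minimal sketch of the np.squeeze() route mentioned above:
import numpy as np

a = np.array([[[1], [2]], [[10], [1]]])  # shape (2, 2, 1)
m = np.squeeze(a)                        # drops the trailing size-1 axis -> shape (2, 2)
print(m)
# [[ 1  2]
#  [10  1]]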
Related
I have a regression model that I fit using scikit-learn's LinearRegression module.
To extract the coefficients, I used this code:
coefficients = model.coef_
It produced the following array with a shape of (1, 10):
[-4.72307152e-05 1.29731143e-04 8.75483702e-05 -6.28749019e-04
1.75096740e-04 -3.30209379e-06 1.35937650e-03 3.89048429e-11
8.48406857e-03 -1.36499030e-05]
Now, I would like to save the array to a pd.Series. I am taking the following approach:
features = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10"]
model_coefs = pd.Series(coefficients, index=features)
And the system gives me the following error:
ValueError: Length of passed values is 1, index implies 10.
What I have tried:
Transposing the underlying array, coefficients, to give it a length of 10.
Reshaping the array to give it a shape of (10,1).
But nothing seems to work. I am not sure where I am going wrong.
For your case, you want to flatten the array, so .ravel() should do the trick. For example:
pd.Series(np.zeros((1, 10)).ravel(), index=features)
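Applied to the question's variables (a sketch, assuming model and features as defined above):
coefficients = model.coef_.ravel()  # flatten (1, 10) -> (10,)
model_coefs = pd.Series(coefficients, index=features)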
It's strange that the coefficients come out with shape (1, 10); scikit-learn returns a 2-D coef_ of shape (n_targets, n_features) when the model is fit against a 2-D y, so this suggests your y had shape (n, 1). When I run the base sklearn example (with multiple features), my coefficients are 1-D:
In [27]: regr.coef_
Out[27]:
array([ 3.03499549e-01, -2.37639315e+02, 5.10530605e+02, 3.27736980e+02,
-8.14131709e+02, 4.92814588e+02, 1.02848452e+02, 1.84606489e+02,
7.43519617e+02, 7.60951722e+01])
In [28]: regr.coef_.shape
Out[28]: (10,)
So first off, I think what I'm trying to achieve is some sort of Cartesian product but elementwise, across the columns only.
What I'm trying to do: given multiple 2D arrays of shapes [(N, D1), (N, D2), (N, D3), ..., (N, Dn)],
the result should be a combinatorial product across axis=1, such that the final result has shape (N, D) where D = D1*D2*D3*...*Dn.
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20,30],
[5,6,7]])
cartesian_product( [A,B], axis=1 )
>> np.array([[ 1*10, 1*20, 1*30, 2*10, 2*20, 2*30 ]
[ 3*5, 3*6, 3*7, 4*5, 4*6, 4*7 ]])
and extendable to cartesian_product([A,B,C,D...], axis=1)
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20],
[5,6]])
C = np.array([[50, 0],
[60, 8]])
cartesian_product( [A,B,C], axis=1 )
>> np.array([[ 1*10*50, 1*10*0, 1*20*50, 1*20*0, 2*10*50, 2*10*0, 2*20*50, 2*20*0]
[ 3*5*60, 3*5*8, 3*6*60, 3*6*8, 4*5*60, 4*5*8, 4*6*60, 4*6*8]])
I have a working solution that essentially creates an empty (N, D) matrix and then broadcasts a column-wise vector product for each column, inside nested for loops over each matrix in the provided list. Clearly this becomes horrible once the arrays get larger!
Is there an existing solution within numpy or tensorflow for this? Ideally one that is efficiently parallelizable (a tensorflow solution would be wonderful, but numpy is OK, and as long as the vector logic is clear it shouldn't be hard to make a tf equivalent).
I'm not sure if I need to use einsum, tensordot, meshgrid, or some combination thereof to achieve this. I have a solution, but only for single-dimension vectors, from https://stackoverflow.com/a/11146645/2123721, even though that solution claims to work for arrays of arbitrary dimension (which appears to mean vectors). With that one I can do a .prod(axis=1), but again this is only valid for vectors.
thanks!
Here's one approach that does this iteratively, in an accumulating manner, using broadcasting after extending dimensions for each pair from the list of arrays for elementwise multiplication -
L = [A, B, C]  # list of arrays
n = L[0].shape[0]
# seed the accumulator: (N, 1, D2) * (N, D1, 1) broadcasts to (N, D1, D2),
# which is then flattened to (N, D1*D2)
out = (L[1][:, None] * L[0][:, :, None]).reshape(n, -1)
for i in L[2:]:
    # fold each remaining array into the accumulator the same way
    out = (i[:, None] * out[:, :, None]).reshape(n, -1)
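Wrapped into a small helper (a sketch; the function name is mine, and it seeds the accumulator with the first array alone rather than the first pair, which gives the same result), this reproduces the expected output from the first example:
import numpy as np

def columnwise_cartesian_product(arrays):
    # accumulate a broadcasted elementwise product across axis=1
    n = arrays[0].shape[0]
    out = arrays[0]
    for a in arrays[1:]:
        out = (a[:, None] * out[:, :, None]).reshape(n, -1)
    return out

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20, 30], [5, 6, 7]])
print(columnwise_cartesian_product([A, B]))
# [[10 20 30 20 40 60]
#  [15 18 21 20 24 28]]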
Can you intuitively explain, or give more examples of, tf.gather_nd for indexing and slicing into high-dimensional tensors in TensorFlow?
I have read the API docs, but they are so concise that I find it hard to follow the function's concept.
Ok, so think about it like this:
You are providing a list of index values to index the provided tensor and get those slices back. Each entry along the first dimension of the indices is one lookup you will perform. Let's pretend that the tensor is just a list of lists.
[[0]] means you want to get one specific slice(list) at index 0 in the provided tensor. Just like this:
[tensor[0]]
[[0], [1]] means you want to get two specific slices, at indices 0 and 1, like this:
[tensor[0], tensor[1]]
Now what if the tensor has more than one dimension? We do the same thing:
[[0, 0]] means you want to get one slice, at index 0 of the 0-th list. Like this:
[tensor[0][0]]
[[0, 1], [2, 3]] means you want to return two slices, at the indices and dimensions provided. Like this:
[tensor[0][1], tensor[2][3]]
I hope that makes sense. I tried using Python indexing to help explain how it would look in Python to do this to a list of lists.
You provide a tensor and indices representing locations in that tensor. It returns the elements of the tensor corresponding to the indices you provide.
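As a runnable plain-Python consolidation of the bracket examples above (a sketch; the 4x4 list of lists is just an illustration):
tensor = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]

print([tensor[0], tensor[1]])        # indices [[0], [1]] -> two row slices
print([tensor[0][1], tensor[2][3]])  # indices [[0, 1], [2, 3]] -> two elements: [1, 11]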
EDIT: An example
import tensorflow as tf
sess = tf.Session()
x = [[1,2,3],[4,5,6]]
y = tf.gather_nd(x, [[1,1],[1,2]])
print(sess.run(y))
[5 6]
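The example above uses the TensorFlow 1.x session API; here is a minimal sketch of the same lookup under TensorFlow 2.x, where eager execution removes the need for a session:
import tensorflow as tf

x = [[1, 2, 3], [4, 5, 6]]
# each inner index pair picks one element: x[1][1] and x[1][2]
y = tf.gather_nd(x, [[1, 1], [1, 2]])
print(y.numpy())  # [5 6]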
I have initialized this empty 2-D np.array:
inputs = np.empty((300, 2), int)
And I am attempting to append a 2-D row to it like so:
inputs = np.append(inputs, np.array([1,2]), axis=0)
But I'm getting:
ValueError: all the input arrays must have same number of dimensions
And NumPy thinks my row is a 1-D, 2-element object rather than a 2-D row:
np.array([1, 2]).shape
(2,)
Where have I gone wrong?
To add a row to a (300,2) shape array, you need a (1,2) shape array. Note the matching 2nd dimension.
np.array([[1,2]]) works. So does np.array([1,2])[None, :] and np.atleast_2d([1,2]).
I encourage the use of np.concatenate. It forces you to think more carefully about the dimensions.
Do you really want to start with np.empty? Look at its values. They are random, and probably large.
@Divakar suggests np.row_stack. That puzzled me a bit, until I checked and found that it is just another name for np.vstack. That function passes all inputs through np.atleast_2d before doing np.concatenate. So it is ultimately the same solution - turn the (2,) array into a (1, 2) one.
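A minimal sketch of the np.concatenate route, starting from a (0, 2) array instead of np.empty((300, 2)) so there are no leftover garbage values (my variation on the question's setup):
import numpy as np

inputs = np.empty((0, 2), int)             # zero rows, matching 2nd dimension
row = np.array([[1, 2]])                   # note the (1, 2) shape
inputs = np.concatenate([inputs, row], axis=0)
print(inputs, inputs.shape)                # [[1 2]] (1, 2)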
NumPy requires double brackets to declare a 2-D row literal, so
np.array([1,2])
needs to be
np.array([[1,2]])
If you intend to append that as the last row into inputs, you can simply use np.row_stack -
np.row_stack((inputs,np.array([1,2])))
Please note that np.array([1,2]) here is a 1D array.
You can even pass it a 2D row version for the same result -
np.row_stack((inputs,np.array([[1,2]])))
I have an ndarray A that stores objects of the same type, in particular various LinearNDInterpolator objects. For example's sake, assume there are just two:
>>> A
array([ <scipy.interpolate.interpnd.LinearNDInterpolator object at 0x7fe122adc750>,
<scipy.interpolate.interpnd.LinearNDInterpolator object at 0x7fe11daee590>], dtype=object)
I want to be able to do two things. First, I'd like to evaluate all objects in A at a certain point and get back an ndarray of A.shape with all the values in it. Something like
>> A[[0,1]](1,1) =
array([ 1, 2])
However, I get
TypeError: 'numpy.ndarray' object is not callable
Is it possible to do that?
Second, I would like to change the interpolation values without constructing new LinearNDInterpolator objects (since the nodes stay the same). I.e., something like
A[[0,1]].values = B
where B is an ndarray containing the new values for every element of A.
Thank you for your suggestions.
The same issue, but with simpler functions:
In [221]: A = np.array([add, multiply])   # np.add and np.multiply
In [222]: A[0](1,2) # individual elements can be called
Out[222]: 3
In [223]: A(1,2) # but not the array as a whole
---------------------------------------------------------------------------
TypeError: 'numpy.ndarray' object is not callable
We can iterate over a list of functions, or over that array, calling each element on the parameters. Done right, we can even zip a list of functions with a list of parameters.
In [224]: ll=[add,multiply]
In [225]: [x(1,2) for x in ll]
Out[225]: [3, 2]
In [226]: [x(1,2) for x in A]
Out[226]: [3, 2]
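For instance, a quick sketch of the zip idea, pairing each function with its own parameters:
import numpy as np

funcs = [np.add, np.multiply]
params = [(1, 2), (3, 4)]
print([f(*p) for f, p in zip(funcs, params)])  # [3, 12]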
Another test, with the callable() function:
In [229]: callable(A)
Out[229]: False
In [230]: callable(A[0])
Out[230]: True
Can you change the interpolation values for individual Interpolators? If so, just iterate through the list and do that.
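A hypothetical sketch of that iteration, reusing the question's A and B and assuming, as the question does, that each interpolator's values can be assigned through its .values attribute (I have not verified this against scipy's internals):
import numpy as np

# evaluate every interpolator in the object array A at the same point
results = np.array([f(1, 1) for f in A])

# update interpolation values element by element
for f, new_vals in zip(A, B):
    f.values = new_vals  # assumption: .values is writable and shapes match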
In general, object-dtype arrays function like lists. They contain the same kind of object pointers, and most operations require the same sort of iteration. Unless you need to organize the elements in multiple dimensions, object-dtype arrays have few, if any, advantages over lists.
Another thought - the normal array dtypes are numeric or fixed-length strings. Those elements are not callable, so there's no need to implement a .__call__ method on such arrays. Something like that could be written for object-dtype arrays, but the core action would still be a Python call, so it would just hide the kind of iteration outlined above.
In another recent question I showed how to use np.char.upper to apply a string method to every element of an S-dtype array. But my time tests showed that this did not speed anything up.