This is an excerpt from some documentation:
lambda ind, r: 1.0 + any(np.array(points_2d)[ind][:,0] == 0.0)
But I don't understand np.array(points_2d)[ind][:,0].
It seems equivalent to myarray[0][:,0], which doesn't make sense to me.
Can anyone help to explain?
With points_2d from earlier in the doc:
In [38]: points_2d = [(0., 0.), (0., 1.), (1., 1.), (1., 0.),
...: (0.5, 0.25), (0.5, 0.75), (0.25, 0.5), (0.75, 0.5)]
In [39]: np.array(points_2d)
Out[39]:
array([[0. , 0. ],
[0. , 1. ],
[1. , 1. ],
[1. , 0. ],
[0.5 , 0.25],
[0.5 , 0.75],
[0.25, 0.5 ],
[0.75, 0.5 ]])
Indexing with a scalar gives a 1d array, which can't be further indexed with [:,0].
In [40]: np.array(points_2d)[0]
Out[40]: array([0., 0.])
But with a list or slice:
In [41]: np.array(points_2d)[[0,1,2]]
Out[41]:
array([[0., 0.],
[0., 1.],
[1., 1.]])
In [42]: np.array(points_2d)[[0,1,2]][:,0]
Out[42]: array([0., 0., 1.])
So this selects the first column of a subset of rows.
In [43]: np.array(points_2d)[[0,1,2]][:,0]==0.0
Out[43]: array([ True, True, False])
In [44]: any(np.array(points_2d)[[0,1,2]][:,0]==0.0)
Out[44]: True
I think they could have used:
In [45]: np.array(points_2d)[[0,1,2],0]
Out[45]: array([0., 0., 1.])
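To check that the chained and the combined indexing agree, here is a minimal sketch. The index list ind is a made-up example value; in the documentation it is passed in from elsewhere.

```python
import numpy as np

points_2d = [(0., 0.), (0., 1.), (1., 1.), (1., 0.),
             (0.5, 0.25), (0.5, 0.75), (0.25, 0.5), (0.75, 0.5)]
pts = np.array(points_2d)

ind = [0, 1, 2]              # hypothetical list of row indices

chained = pts[ind][:, 0]     # select the rows first, then column 0
combined = pts[ind, 0]       # one indexing step, same values

# True if any selected point has x == 0.0
has_zero_x = bool(np.any(combined == 0.0))
weight = 1.0 + has_zero_x    # the value the lambda would return for this ind
```

Adding a bool to a float treats True as 1, so the lambda returns 2.0 when any selected point lies on x == 0 and 1.0 otherwise.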
Here is what I have so far:
arr = np.round(np.random.uniform(0,1,size = (10,10)),decimals = 0)
print(arr)
arr2 = np.cumsum(arr,axis=0)
print(arr2)
mask = np.where((arr == 1)&(arr2<=3),1,0)
print(mask)
population = np.round(np.random.uniform(0,5,size=(10,10)),decimals=0)
print(population)
maskedPop = population[mask==1]
print(maskedPop)
This outputs a flattened array. Is there a way I can keep the 10 columns, so that the output would be 3x10?
Your code, reduced in scale:
In [153]: arr = np.round(np.random.uniform(0,1,size = (5,5)),decimals = 0)
...: print(arr)
...: arr2 = np.cumsum(arr,axis=0)
...: print(arr2)
...: mask = np.where((arr == 1)&(arr2<=3),1,0)
...: print(mask)
...: population = np.round(np.random.uniform(0,5,size=(5,5)),decimals=0)
...: print(population)
...: print(mask==1)
...: maskedPop = population[mask==1]
...: print(maskedPop)
The print results - I added the mask==1 line, since that's what's doing the indexing:
[[0. 1. 1. 0. 1.]
[1. 0. 1. 1. 1.]
[1. 0. 0. 1. 1.]
[1. 1. 0. 0. 1.]
[0. 0. 0. 0. 0.]]
[[0. 1. 1. 0. 1.]
[1. 1. 2. 1. 2.]
[2. 1. 2. 2. 3.]
[3. 2. 2. 2. 4.]
[3. 2. 2. 2. 4.]]
[[0 1 1 0 1]
[1 0 1 1 1]
[1 0 0 1 1]
[1 1 0 0 0]
[0 0 0 0 0]]
[[0. 5. 2. 2. 2.]
[1. 4. 2. 4. 0.]
[2. 3. 3. 2. 2.]
[4. 4. 3. 1. 3.]
[4. 2. 2. 1. 5.]]
[[False True True False True]
[ True False True True True]
[ True False False True True]
[ True True False False False]
[False False False False False]]
[5. 2. 2. 1. 2. 4. 0. 2. 2. 2. 4. 4.]
Count the number of True per row or column. Tell us how this could retain some sort of 2d result!
===
I see you already display mask, so mask==1 is the same as
In [158]: mask.astype(bool)
Out[158]:
array([[False, True, True, False, True],
[ True, False, True, True, True],
[ True, False, False, True, True],
[ True, True, False, False, False],
[False, False, False, False, False]])
There is a MaskedArray class that lets you work with an array with certain values 'masked-out':
In [161]: np.ma.masked_array(population, mask!=1)
Out[161]:
masked_array(
data=[[--, 5.0, 2.0, --, 2.0],
[1.0, --, 2.0, 4.0, 0.0],
[2.0, --, --, 2.0, 2.0],
[4.0, 4.0, --, --, --],
[--, --, --, --, --]],
mask=[[ True, False, False, True, False],
[False, True, False, False, False],
[False, True, True, False, False],
[False, False, True, True, True],
[ True, True, True, True, True]],
fill_value=1e+20)
===
Another way to keep the 2d shape is to overwrite the unwanted values with a placeholder such as nan:
In [162]: mpop = population.copy()
In [163]: mpop[mask!=1] = np.nan
In [164]: mpop
Out[164]:
array([[nan, 5., 2., nan, 2.],
[ 1., nan, 2., 4., 0.],
[ 2., nan, nan, 2., 2.],
[ 4., 4., nan, nan, nan],
[nan, nan, nan, nan, nan]])
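Either representation can then be reduced per column without losing the 2d layout. The following is a small sketch using the population and mask values printed above; the variable names col_sums_ma and col_sums_nan are only for illustration.

```python
import numpy as np

population = np.array([[0., 5., 2., 2., 2.],
                       [1., 4., 2., 4., 0.],
                       [2., 3., 3., 2., 2.],
                       [4., 4., 3., 1., 3.],
                       [4., 2., 2., 1., 5.]])
mask = np.array([[0, 1, 1, 0, 1],
                 [1, 0, 1, 1, 1],
                 [1, 0, 0, 1, 1],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 0, 0]])

# Masked-array route: True in the second argument means "hide this value".
masked = np.ma.masked_array(population, mask != 1)
col_sums_ma = masked.sum(axis=0)

# NaN route: overwrite the unwanted values, then use a NaN-aware reduction.
mpop = population.copy()
mpop[mask != 1] = np.nan
col_sums_nan = np.nansum(mpop, axis=0)
```

Both routes ignore the hidden entries, so the two sets of column sums agree.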
It looks like the mask selects the same number of rows in every column. So you could probably mask (using the boolean array directly) and reshape:
population[(arr == 1)&(arr2<=3)].reshape(3,-1)
array([[3., 2., 5., 0., 4., 2., 0., 4., 5., 1.],
[4., 3., 5., 3., 4., 1., 1., 4., 5., 4.],
[3., 3., 4., 3., 4., 2., 4., 4., 1., 5.]])
Note that the output is flattened, since numpy doesn't know that the result is expected to be a 2d homogeneous array. If mask.sum(0) resulted in different values per column, you wouldn't be able to reconstruct as an ndarray, so numpy just doesn't do that guess for you.
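One caveat with the reshape above: boolean indexing returns the selected values in row-major order, so reshape(3, -1) gives the right shape but does not necessarily keep each value in its original column. If that matters, indexing the transpose is one option. The sketch below uses a small deterministic stand-in for the random data in the question and assumes every column selects the same number of rows.

```python
import numpy as np

# Small deterministic stand-in for the random arrays in the question.
population = np.arange(20, dtype=float).reshape(5, 4)
arr = np.array([[1, 0, 1, 1],
                [0, 1, 1, 0],
                [1, 1, 0, 1],
                [1, 0, 1, 1],
                [1, 1, 1, 0]])
mask = (arr == 1) & (np.cumsum(arr, axis=0) <= 3)

counts = mask.sum(axis=0)
# The reshape only makes sense if every column selects the same number of rows.
assert (counts == counts[0]).all(), "columns select different numbers of rows"
k = int(counts[0])

# Plain reshape: correct shape, but values can land in the wrong column.
naive = population[mask].reshape(k, -1)

# Index the transpose so values come out column by column, reshape to
# (n_columns, k), then transpose back to (k, n_columns).
per_column = population.T[mask.T].reshape(-1, k).T
```

Here per_column[:, j] holds exactly the values selected from population[:, j], in row order, while naive mixes values from different columns.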
I am creating a neural network in tensorflow and I have created the placeholders like this:
input_tensor = tf.placeholder(tf.float32, shape = (None,n_input), name = "input_tensor")
output_tensor = tf.placeholder(tf.float32, shape = (None,n_classes), name = "output_tensor")
During the training process, I was getting the following error:
Traceback (most recent call last):
File "try.py", line 150, in <module>
sess.run(optimizer, feed_dict={X: x_train[i: i + 1], Y: y_train[i: i + 1]})
TypeError: unhashable type: 'numpy.ndarray'
I figured out that this is because the datatypes of my x_train and y_train differ from the datatypes of the placeholders.
My x_train looks somewhat like this:
array([[array([[ 1., 0., 0.],
[ 0., 1., 0.]])],
[array([[ 0., 1., 0.],
[ 1., 0., 0.]])],
[array([[ 0., 0., 1.],
[ 0., 1., 0.]])]], dtype=object)
It was initially a dataframe like this:
0 [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
1 [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
2 [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
I did x_train = train_x.values to get the numpy array
And y_train looks like this:
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
x_train has dtype object and y_train has dtype float64.
What I want to know is how I can change the datatypes of my training data so that it works with the tensorflow placeholders. Or please tell me if I am missing something.
It is a little hard to guess what shape you want your data to have, but here are two combinations you might be looking for. I will also try to simulate your data in a Pandas dataframe.
df = pd.DataFrame([[[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]],
[[[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]],
[[[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]], columns = ['Mydata'])
print(df)
x = df.Mydata.values
print(x.shape)
print(x)
print(x.dtype)
Output:
Mydata
0 [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
1 [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
2 [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
(3,)
[list([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
list([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
list([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])]
object
Combination 1
y = [item for sub_list in x for item in sub_list]
y = np.array(y, dtype = np.float32)
print(y.dtype, y.shape)
print(y)
Output:
float32 (6, 3)
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 1. 0.]
[ 1. 0. 0.]
[ 0. 0. 1.]
[ 0. 1. 0.]]
Combination 2
y = [sub_list for sub_list in x]
y = np.array(y, dtype = np.float32)
print(y.dtype, y.shape)
print(y)
Output:
float32 (3, 2, 3)
[[[ 1. 0. 0.]
[ 0. 1. 0.]]
[[ 0. 1. 0.]
[ 1. 0. 0.]]
[[ 0. 0. 1.]
[ 0. 1. 0.]]]
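Both combinations can also be reached without a list comprehension. The sketch below builds the object array by hand instead of going through pandas, which is an assumption made only to keep the example self-contained; the names stacked, flat_rows and per_sample are illustrative.

```python
import numpy as np

# Object array of nested lists, mimicking df.Mydata.values from above.
x = np.empty(3, dtype=object)
x[0] = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
x[1] = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
x[2] = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

# tolist() turns the object array into plain nested lists, which np.array
# can convert into one dense float array.
stacked = np.array(x.tolist(), dtype=np.float32)     # Combination 2: (3, 2, 3)
flat_rows = stacked.reshape(-1, stacked.shape[-1])   # Combination 1: (6, 3)
per_sample = stacked.reshape(len(stacked), -1)       # one flat row per sample: (3, 6)
```

A placeholder declared with shape (None, n_input) expects a 2d array, so per_sample (with n_input equal to 6) may be the layout that fits if each dataframe row is one training example.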
Your x_train is a nested object containing arrays, so you have to unpack it and reshape it. Here's a general purpose hack:
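def unpack(a, aggregate=None):
    # Recursively collect every scalar in a nested list/array structure.
    if aggregate is None:  # avoid sharing a mutable default between calls
        aggregate = []
    for x in a:
        if np.isscalar(x):  # also matches NumPy scalars such as np.float64
            aggregate.append(x)
        else:
            unpack(x, aggregate=aggregate)
    return np.array(aggregate)
# x_train is already the array from train_x.values, so no further .values is needed
x_train = unpack(x_train).reshape(x_train.shape[0], -1)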
Once you've got a dense array (y_train is already dense), you can use a function like the following:
def cast(placeholder, array):
    dtype = placeholder.dtype.as_numpy_dtype
    return array.astype(dtype)
x_train, y_train = cast(X,x_train), cast(Y,y_train)
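One more thing that may be worth checking: the message in the traceback is what Python raises whenever a NumPy array is used as a dictionary key. The question does not show how X and Y are defined, so this is only an assumption, but if they were bound to arrays rather than to input_tensor and output_tensor, the feed_dict would fail in this way regardless of dtype. The effect can be reproduced with a plain dict:

```python
import numpy as np

key = np.zeros(2)   # stand-in for a name accidentally bound to an array

try:
    feed = {key: [1.0, 2.0]}   # arrays are mutable, hence not hashable
    message = None
except TypeError as exc:
    message = str(exc)
```

If that turns out to be the cause, using the placeholders themselves as keys (feed_dict={input_tensor: ..., output_tensor: ...}) would be the thing to try.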
I have a numpy zero matrix A of the shape (2, 5).
A = [[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]]
I have another array seq of size 2. This is the same as the size of the first axis of A.
seq = [2, 3]
I want to create another matrix B which looks like this:
B = [[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]]
B is constructed by setting the first seq[i] elements in the ith row of A to 1.
This is a toy example. A and seq can be large so efficiency is required. I would be extra thankful if someone knows how to do this in tensorflow.
You can do this in TensorFlow (and with some analogous code in NumPy) as follows:
seq = [2, 3]
b = tf.expand_dims(tf.range(5), 0) # A 1 x 5 matrix.
seq_matrix = tf.expand_dims(seq, 1) # A 2 x 1 matrix.
b_bool = tf.greater(seq_matrix, b) # A 2 x 5 bool matrix.
B = tf.to_int32(b_bool) # A 2 x 5 int matrix.
Example output:
In [7]: b = tf.expand_dims(tf.range(5), 0)
[[0 1 2 3 4]]
In [21]: b_bool = tf.greater(seq_matrix, b)
In [22]: op = sess.run(b_bool)
In [23]: print(op)
[[ True True False False False]
[ True True True False False]]
In [24]: bint = tf.to_int32(b_bool)
In [25]: op = sess.run(bint)
In [26]: print(op)
[[1 1 0 0 0]
[1 1 1 0 0]]
This is @mrry's solution, expressed a little differently:
In [667]: [[2],[3]]>np.arange(5)
Out[667]:
array([[ True, True, False, False, False],
[ True, True, True, False, False]], dtype=bool)
In [668]: ([[2],[3]]>np.arange(5)).astype(int)
Out[668]:
array([[1, 1, 0, 0, 0],
[1, 1, 1, 0, 0]])
The idea is to compare [2,3] with [0,1,2,3,4] in an 'outer' broadcasting sense. The result is boolean which can be easily changed to 0/1 integers.
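Written out with seq as an array rather than a literal nested list, the same comparison might look like the sketch below; n_cols is an illustrative name for the number of columns of A.

```python
import numpy as np

seq = np.array([2, 3])
n_cols = 5

# seq[:, None] has shape (2, 1) and np.arange(n_cols) has shape (5,),
# so the comparison broadcasts to a (2, 5) boolean array.
B = (np.arange(n_cols) < seq[:, None]).astype(float)
```

No Python-level loop over the rows is involved, so this should scale to large A and seq.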
Another approach would be to use cumsum (or another ufunc.accumulate function):
In [669]: A=np.zeros((2,5))
In [670]: A[range(2),[2,3]]=1
In [671]: A
Out[671]:
array([[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 1., 0.]])
In [672]: A.cumsum(axis=1)
Out[672]:
array([[ 0., 0., 1., 1., 1.],
[ 0., 0., 0., 1., 1.]])
In [673]: 1-A.cumsum(axis=1)
Out[673]:
array([[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]])
Or a variation starting with 1's:
In [681]: A=np.ones((2,5))
In [682]: A[range(2),[2,3]]=0
In [683]: A
Out[683]:
array([[ 1., 1., 0., 1., 1.],
[ 1., 1., 1., 0., 1.]])
In [684]: np.minimum.accumulate(A,axis=1)
Out[684]:
array([[ 1., 1., 0., 0., 0.],
[ 1., 1., 1., 0., 0.]])
There is a minimal example of an RNN in the Skflow documentation. The input data is a matrix with shape (4,5). Why is the data split according to the following function for input?:
def input_fn(X):
    return tf.split(1, 5, X)
This function returns a list of 5 arrays, each with shape (4, 1):
[array([[ 2.],
[ 2.],
[ 3.],
[ 2.]], dtype=float32), array([[ 1.],
[ 2.],
[ 3.],
[ 4.]], dtype=float32), array([[ 2.],
[ 3.],
[ 1.],
[ 5.]], dtype=float32), array([[ 2.],
[ 4.],
[ 2.],
[ 4.]], dtype=float32), array([[ 3.],
[ 5.],
[ 1.],
[ 1.]], dtype=f
And what is the difference in impact on the RNN between the above function and defining the function like this? Both input functions run.
def input_fn(X):
    return tf.split(1, 1, X)
Which returns the following:
[[[ 1., 3., 3., 2., 1.],
[ 2., 3., 4., 5., 6.]]
Presented here:
def testRNN(self):
    random.seed(42)
    import numpy as np
    data = np.array(list([[2, 1, 2, 2, 3],
                          [2, 2, 3, 4, 5],
                          [3, 3, 1, 2, 1],
                          [2, 4, 5, 4, 1]]), dtype=np.float32)
    # labels for classification
    labels = np.array(list([1, 0, 1, 0]), dtype=np.float32)
    # targets for regression
    targets = np.array(list([10, 16, 10, 16]), dtype=np.float32)
    test_data = np.array(list([[1, 3, 3, 2, 1], [2, 3, 4, 5, 6]]))
    def input_fn(X):
        return tf.split(1, 5, X)
    # Classification
    classifier = skflow.TensorFlowRNNClassifier(
        rnn_size=2, cell_type='lstm', n_classes=2, input_op_fn=input_fn)
    classifier.fit(data, labels)
    classifier.weights_
    classifier.bias_
    predictions = classifier.predict(test_data)
    self.assertAllClose(predictions, np.array([1, 0]))
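The shapes produced by the two input functions can be inspected with a NumPy analogue. This sketch assumes that tf.split(1, 5, X), in the old argument order (axis, number of pieces, tensor), behaves like np.split(X, 5, axis=1); it does not run any TensorFlow code.

```python
import numpy as np

data = np.array([[2, 1, 2, 2, 3],
                 [2, 2, 3, 4, 5],
                 [3, 3, 1, 2, 1],
                 [2, 4, 5, 4, 1]], dtype=np.float32)

# Analogue of tf.split(1, 5, X): five pieces along axis 1.
steps = np.split(data, 5, axis=1)   # list of 5 arrays, each (4, 1)

# Analogue of tf.split(1, 1, X): a single piece along axis 1.
whole = np.split(data, 1, axis=1)   # list of 1 array, shape (4, 5)
```

Under that reading, an RNN that consumes a list with one entry per time step would presumably see 5 time steps with 1 feature each in the first case, and a single time step with 5 features in the second, which would make the recurrence trivial.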
I have a sparse matrix like this:
>>>import numpy as np
>>>from scipy.sparse import *
>>>A = csr_matrix((np.identity(3)))
>>>print A
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
For better understanding A is something like this:
>>>print A.todense()
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]]
And I would like to have an operator (let us call it op1(n) ) doing this:
>>>A.op1(1)
[[ 0. 1. 0.]
[ 0. 0. 1.]
[ 1. 0. 0.]]
=> makes the last n columns the first n ones,
so
>>>A == A.op1(3)
true
Is there a built-in solution (EDIT:) that returns a sparse matrix again?
The solution with roll:
X = np.roll(X.todense(),-tau, axis = 0)
print X.__class__
returns
<class 'numpy.matrixlib.defmatrix.matrix'>
scipy.sparse doesn't have roll, but you can simulate it with hstack:
from scipy.sparse import *
A = eye(3, 3, format='csr')
hstack((A[:, 1:], A[:, :1]), format='csr') # roll left
hstack((A[:, -1:], A[:, :-1]), format='csr') # roll right
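The two hstack calls can be wrapped into a helper that accepts any shift. The function name roll_columns is hypothetical, not part of scipy.sparse, and the sketch assumes a format that supports column slicing, such as CSR.

```python
import numpy as np
from scipy import sparse

def roll_columns(m, n):
    """Roll the columns of sparse matrix m to the right by n, like np.roll(..., axis=1)."""
    n = n % m.shape[1]          # normalise negative and oversized shifts
    if n == 0:
        return m.copy()
    # The last n columns move to the front; the result stays sparse.
    return sparse.hstack((m[:, -n:], m[:, :-n]), format="csr")

dense = np.arange(12, dtype=float).reshape(3, 4)
A = sparse.csr_matrix(dense)

right = roll_columns(A, 1)      # same layout as np.roll(dense, 1, axis=1)
left = roll_columns(A, -1)      # same layout as np.roll(dense, -1, axis=1)
```

Rolling by the full number of columns returns a copy of the original matrix, which matches the A == A.op1(3) requirement in the question.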
>>> a = np.identity(3)
>>> a
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
>>> np.roll(a, -1, axis=0)
array([[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.]])
>>> a == np.roll(a, 3, axis=0)
array([[ True, True, True],
[ True, True, True],
[ True, True, True]], dtype=bool)