Let's look at this simple made-up TensorFlow operation:
data = np.random.rand(1,2,3)
x = tf.placeholder(tf.float32, shape=[None, None, None], name='x_pl')
out = x
print('shape:', tf.shape(out))
sess = tf.Session()
sess.run(out, feed_dict={x: data})
and the print is:
shape: Tensor("Shape_13:0", shape=(3,), dtype=int32)
I read that you should use tf.shape() to get the 'dynamic' shape of a tensor, which seems to be what I need, but why is the shape shape=(3,)?
Why is it not (1, 2, 3)? Shouldn't that be determined when the session is run?
Suppose this is part of a neural network where I need to know the last dimension of x, for example to pass x into a Dense layer, for which the last dimension of x needs to be known.
How do I do that?
This is because tf.shape() is itself an op, and you have to run it within a session to get its value.
data = np.random.rand(1, 2, 3)
x = tf.placeholder(tf.float32, shape=[None, None, None], name='x_pl')
out = x
print('shape:', tf.shape(out))
z = tf.shape(out)
sess = tf.Session()
out_, z_ = sess.run([out, z], feed_dict={x: data})
print(f"shape of out: {z_}")
will return
shape: Tensor("Shape:0", shape=(3,), dtype=int32)
shape of out: [1 2 3]
Even if you look at the example from the docs (https://www.tensorflow.org/api_docs/python/tf/shape):
t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
tf.shape(t)
If you run it just like that, it will return something like
<tf.Tensor 'Shape_4:0' shape=(3,) dtype=int32>
but if you run it within a session, then you will get the expected result:
t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
print(sess.run(tf.shape(t)))
[2 2 3]
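As for the Dense-layer part of the question: a Dense layer needs the static last dimension, not the dynamic one, so that dimension must not be None in the placeholder. A minimal sketch of my own (not from the original answer), assuming a hypothetical feature size of 128:
# Declare the last dimension statically so a Dense layer can be built on top of it.
x = tf.placeholder(tf.float32, shape=[None, None, 128], name='x_pl')
last_dim = x.get_shape().as_list()[-1]  # static shape: 128
dense = tf.layers.dense(x, units=64)    # works because the last dimension is known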
I have a dataset that contains many snapshot observations in time, with a 1 or 0 as the label for each observation. Let's say each observation contains 3 features. I want to train an LSTM that takes a sequence of n observations and attempts to classify the nth observation as a 1 or 0.
So if we have a dataset that looks like this:
# X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
# y = [1, 0, 1, 0]
# so X[0] corresponds to y[0], X[1] to y[1],
# and I would like to input X[0] + X[1] to classify X[1] as y[1]
# How would I need to structure this below?
X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
y = [1, 0, 1, 0]
def create_model():
    model = Sequential()
    # input_shape[0] is equal to 2 timesteps?
    # input_shape[1] is equal to the 3 features per row?
    model.add(LSTM(20, input_shape=(2, 3)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model

m = create_model()
m.fit(X, y)
So I want X[0] and X[1] to be the input for one iteration of training and should be classified as y[1].
My question is this: how do I structure the model in order to take this input properly? I am very confused by input_shape, features, input_length, batches, etc.
The code snippet below might help clarify things:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
# Number of samples = 4, sequence length = 3, features = 2
X = np.array([[[0, 1], [1, 0], [1, 1]],
              [[1, 1], [1, 1], [1, 0]],
              [[0, 1], [1, 0], [0, 0]],
              [[1, 1], [1, 1], [1, 1]]])
y = np.array([[1], [0], [1], [0]])
print(X)
print(X.shape)
print(y.shape)
model = Sequential()
model.add(LSTM(20, input_shape=(3, 2)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X, y)
Also, on the Keras documentation page: https://keras.io/getting-started/sequential-model-guide/ look at the example for "Stacked LSTM for sequence classification" near the bottom. It might help.
In general, in Keras the batch/sample dimension is not specified in layers; it is automatically inferred from the input data.
I hope this helps.
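To map this back to the data in the question, here is a sketch of my own (the names X_flat, y_flat, X_seq, y_seq are mine) that turns the flat (4, 3) observations into overlapping windows of 2 timesteps, so that X[i-1] and X[i] together are labeled with y[i]:
import numpy as np

X_flat = np.array([[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]])
y_flat = np.array([1, 0, 1, 0])

# Each sample is the pair (X_flat[i-1], X_flat[i]), labeled with y_flat[i].
X_seq = np.array([X_flat[i - 1:i + 1] for i in range(1, len(X_flat))])  # shape (3, 2, 3)
y_seq = y_flat[1:]                                                      # shape (3,)
This matches input_shape=(2, 3) in the question's model: 2 timesteps, 3 features per timestep.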
You have the input shape correct.
I would reshape the input data to be (batch_size, timesteps, features):
X = np.array(X).reshape((-1, 2, 3))  # -1 lets numpy infer the batch size
m = create_model()
m.fit(X, y)
Common batch sizes are 4, 8, 16, and 32, but for a small dataset the impact of the batch size is less important.
And when you want to predict, use batch_size = 1.
I have a tensor with shape NxM.
I'd like to create another tensor with the same shape, filled with ones up to a certain column (which might be different for each row) and with another value for the rest (let's say 10 for the example).
How do I do that?
Something like this can help you:
input = tf.Variable([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]], dtype=tf.float32)
indices = tf.constant([1, 4, 2])
X = tf.ones_like(input)
Y = tf.constant(10, dtype=tf.float32, shape=input.shape)
result = tf.where(tf.sequence_mask(indices, tf.shape(input)[1]), X, Y)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(input))
    print(sess.run(indices))
    print(sess.run(result))
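Here tf.sequence_mask(indices, tf.shape(input)[1]) builds a boolean mask that is True for the first 1, 4, and 2 positions of the three rows respectively, and tf.where picks X where the mask is True and Y elsewhere, so the printed result is:
[[ 1. 10. 10. 10. 10.]
 [ 1.  1.  1.  1. 10.]
 [ 1.  1. 10. 10. 10.]]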
In numpy, this can easily be done as
>>> img
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], dtype=int32)
>>> img[img>5] = [1,2,3,4]
>>> img
array([[1, 2, 3],
[4, 5, 1],
[2, 3, 4]], dtype=int32)
However, no similar operation seems to exist in TensorFlow.
You can never assign a value to a tensor in TensorFlow, since a change to a tensor's value would not be traceable by backpropagation, but you can still derive a new tensor from the original one. Here is a solution:
import tensorflow as tf

tf.enable_eager_execution()

img = tf.constant(list(range(1, 10)), shape=[3, 3])
replace_mask = img > 5                   # positions to overwrite
keep_mask = tf.logical_not(replace_mask)
keep = tf.boolean_mask(img, keep_mask)   # values to keep ...
keep_index = tf.where(keep_mask)         # ... and their positions
replace_index = tf.where(replace_mask)
replace = tf.random_uniform((tf.shape(replace_index)[0],), 0, 10, tf.int32)
updates = tf.concat([keep, replace], axis=0)
indices = tf.concat([keep_index, replace_index], axis=0)
result = tf.scatter_nd(tf.cast(indices, tf.int32), updates, shape=tf.shape(img))
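The trick is that tf.scatter_nd rebuilds a tensor of shape tf.shape(img) from (index, value) pairs: the kept values are written back to their original positions, while the positions where img > 5 receive freshly drawn random values.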
Actually there is another way to achieve this. Very similar to @Jie.Zhou's answer, you can replace tf.constant with tf.Variable, and then replace tf.scatter_nd with tf.scatter_nd_update.
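A minimal sketch of that variant (my own code, following the description above; graph mode, so the update runs inside a session):
import tensorflow as tf

# img is now a Variable, so it can be updated in place.
img = tf.Variable([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=tf.int32)
replace_index = tf.where(img > 5)  # positions of the elements to overwrite
replace = tf.random_uniform((tf.shape(replace_index)[0],), 0, 10, tf.int32)
update_op = tf.scatter_nd_update(img, replace_index, replace)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(update_op))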
I'm trying to build a neural network where the labels, and the number of labels, change with the input. For example, I could have a final layer of 10 units, each representing the logit of its class, but sometimes I will only need units [1, 3, 4] to compute the cross entropy, sometimes units [3, 4, 5, 7], etc.
I tried different combinations of map_fn, gather, py_func, and while_loop, but none of them seems to fit my case. Another way might be to list all the label combinations (I call them network heads) and find some conditional construct that lets me choose one based on the value of a placeholder, but I'm not sure how to implement that.
For example:
x = tf.placeholder(dtype=tf.float32, shape=[None, 3])
y = tf.placeholder(dtype=tf.int32, shape=[None, 3])
... to_do ...
with tf.Session() as sess:
    sess.run(to_do, feed_dict={x: [[1, 3, 4], [3, 7, 8]], y: [[1, 0, 0], [0, 1, 1]]})
Here I need something that returns [[1], [7, 8]].
Oh, never mind. There was a very easy way to get the probabilities I needed for the cross entropy.
x = tf.placeholder(dtype=tf.float32, shape=[None, 3])
y = tf.placeholder(dtype=tf.int32, shape=[None, 3])

probabilities = tf.where(tf.equal(y, 1), tf.exp(x), tf.zeros_like(x))
normalizing_sum = tf.reduce_sum(probabilities, 1, keep_dims=True)
probabilities /= normalizing_sum

with tf.Session() as sess:
    res = sess.run(probabilities, feed_dict={x: [[1, 3, 4], [3, 7, 8]], y: [[1, 0, 0], [0, 1, 1]]})
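This is effectively a softmax restricted to the units selected by y: for the first sample only unit 0 participates, so its probability is 1, while for the second sample units 1 and 2 are normalized against each other, giving roughly [0, 0.27, 0.73].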
In the API of tf.contrib.rnn.DropoutWrapper, I am trying to set variational_recurrent=True, in which case input_size is mandatory. As explained in the docs, input_size is "TensorShape objects containing the depth(s) of the input tensors".
depth(s) is confusing. What is it, exactly? Is it just the shape of the tensor, as we can get with tf.shape()? Or the number of channels, as in the special case of images? My input tensor is not an image.
I also don't understand why dtype is required when variational_recurrent=True.
Thanks!
input_size for a tensor of shape tf.TensorShape([200, None, 300]) is just 300: the depth is the size of the last dimension, i.e. the number of features per timestep.
Play with this example.
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see TF issue #152
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import tensorflow as tf
import numpy as np

n_steps = 2
n_inputs = 3
n_neurons = 5
keep_prob = 0.5
learning_rate = 0.001

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))
basic_cell = tf.contrib.rnn.BasicLSTMCell(num_units=n_neurons)
basic_cell_drop = tf.contrib.rnn.DropoutWrapper(
    basic_cell,
    input_keep_prob=keep_prob,
    variational_recurrent=True,
    dtype=tf.float32,
    input_size=n_inputs)
output_seqs, states = tf.contrib.rnn.static_rnn(
    basic_cell_drop,
    X_seqs,
    dtype=tf.float32)
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])

init = tf.global_variables_initializer()

X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 1
    [[3, 4, 5], [0, 0, 0]],  # instance 2
    [[6, 7, 8], [6, 5, 4]],  # instance 3
    [[9, 0, 1], [3, 2, 1]],  # instance 4
])

with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})
    print(outputs_val)
See this for more details: https://github.com/tensorflow/tensorflow/issues/7927