How do I structure 3D Input properly for Keras LSTM

I have a dataset which contains many snapshot observations in time and a 1 or 0 as a label for each observation. Let's say each observation contains 3 features. I want to train an LSTM which will take a sequence of n observations and attempt to classify the nth observation as a 1 or 0.
So if we have a dataset that looks like this:
# X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
# y = [1, 0, 1, 0]
# so X[0] is labelled y[0], X[1] is labelled y[1]
# and I would like to input X[0] + X[1] to classify X[1] as y[1].
# How would I need to structure this below?
X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
y = [1, 0, 1, 0]
from keras.models import Sequential
from keras.layers import LSTM, Dense

def create_model():
    model = Sequential()
    # input_shape[0] is equal to 2 timesteps?
    # input_shape[1] is equal to the 3 features per row?
    model.add(LSTM(20, input_shape=(2, 3)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model  # without this, create_model() returns None
m = create_model()
m.fit(X, y)
So I want X[0] and X[1] to be the input for one iteration of training and should be classified as y[1].
My question is this. How do I structure the model in order to take this input properly? I am very confused by input_shape, features, input_length, batches etc ...

The below code snippet might help clarify:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
# Number of samples = 4, sequence length = 3, features = 2
X = np.array( [ [ [0, 1], [1, 0,], [1, 1] ],
[ [1, 1], [1, 1,], [1, 0] ],
[ [0, 1], [1, 0,], [0, 0] ],
[ [1, 1], [1, 1,], [1, 1] ]] )
y = np.array([[1], [0], [1], [0]])
print(X)
print(X.shape)
print(y.shape)
model = Sequential()
model.add(LSTM(20, input_shape=(3, 2)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X, y)
Also, see the example "Stacked LSTM for sequence classification" near the bottom of the Keras documentation page https://keras.io/getting-started/sequential-model-guide/; it might help.
In Keras, the batch (sample) dimension is generally not specified in layers; it is inferred automatically from the input data.
I hope this helps.
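To make the point about the batch dimension concrete, here is a minimal NumPy-only sketch. It uses a zero-filled array with the same layout as the example above; the names n_samples, timesteps, n_features and input_shape are illustrative, not part of Keras.

```python
import numpy as np

# Same layout as the example above: 4 samples, 3 timesteps, 2 features.
X = np.zeros((4, 3, 2))

n_samples, timesteps, n_features = X.shape

# The per-sample shape is what goes into input_shape; the sample (batch)
# dimension is left out and inferred from the data passed to fit().
input_shape = X.shape[1:]
print(input_shape)  # (3, 2)
```

The resulting tuple is what would be passed as input_shape=(3, 2) to the first layer.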

You have the input shape correct.
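I would reshape the input data to be (batch_size, timesteps, features):
m = create_model()
X = np.asarray(X).reshape((batch_size, 2, 3))  # reshape returns a new array, so assign the result
m.fit(X, y)
Note that y then needs one label per sequence (batch_size entries), not one per original row.
Common batch sizes are 4, 8, 16, and 32, but for a small dataset the batch size matters less.
When you want to predict on a single sequence, use batch_size = 1.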
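If the goal is the one described in the question (feed X[0] and X[1] together and predict y[1], then X[1] and X[2] to predict y[2], and so on), a plain reshape will not produce overlapping windows. The following is a hedged sketch of one way to build them with NumPy; X_seq and y_seq are illustrative names, and the windowing scheme is an assumption about what the asker wants.

```python
import numpy as np

X = np.array([[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]])
y = np.array([1, 0, 1, 0])
timesteps = 2

# Each window holds `timesteps` consecutive rows of X.
n_windows = len(X) - timesteps + 1
X_seq = np.stack([X[i:i + timesteps] for i in range(n_windows)])

# Each window is labelled with the label of its last row.
y_seq = y[timesteps - 1:]

print(X_seq.shape)  # (3, 2, 3) -> (samples, timesteps, features)
print(y_seq.shape)  # (3,)
```

With this layout, X_seq and y_seq could be passed to m.fit() for a model built with input_shape=(2, 3).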

Related

Tensorflow conv2d on RGB image

From the accepted answer in this question, given the following input and kernel matrices, the output of tf.nn.conv2d is
[[14 6]
[6 12]]
which makes sense. However, when I make the input and kernel matrices have 3-channels each (by repeating each original matrix), and run the same code:
# the previous input
i_grey = np.array([
[4, 3, 1, 0],
[2, 1, 0, 1],
[1, 2, 4, 1],
[3, 1, 0, 2]
])
# copy to 3-dimensions
i_rgb = np.repeat( np.expand_dims(i_grey, axis=0), 3, axis=0 )
# convert to tensor
i_rgb = tf.constant(i_rgb, dtype=tf.float32)
# make kernel depth match input; same process as input
k = np.array([
[1, 0, 1],
[2, 1, 0],
[0, 0, 1]
])
k_rgb = np.repeat( np.expand_dims(k, axis=0), 3, axis=0 )
# convert to tensor
k_rgb = tf.constant(k_rgb, dtype=tf.float32)
here's what my input and kernel matrices look like at this point
# reshape input to format: [batch, in_height, in_width, in_channels]
image_rgb = tf.reshape(i_rgb, [1, 4, 4, 3])
# reshape kernel to format: [filter_height, filter_width, in_channels, out_channels]
kernel_rgb = tf.reshape(k_rgb, [3, 3, 3, 1])
conv_rgb = tf.squeeze( tf.nn.conv2d(image_rgb, kernel_rgb, [1,1,1,1], "VALID") )
with tf.Session() as sess:
    conv_result = sess.run(conv_rgb)
    print(conv_result)
I get the final output:
[[35. 15.]
[35. 26.]]
But I was expecting the original output*3:
[[42. 18.]
[18. 36.]]
because from my understanding, each channel of the kernel is convolved with each channel of the input, and the resultant matrices are summed to get the final output.
Am I missing something from this process or the tensorflow implementation?
Reshape is a tricky function. It will give you the shape you want, but it can easily scramble the order of the values. In cases like yours, it is best to avoid reshape altogether.
Here it is better to duplicate the arrays along a new axis instead. With the [batch, in_height, in_width, in_channels] format, channels is the last dimension, so that is the axis to pass to repeat(). The following code better reflects the logic:
i_grey = np.expand_dims(i_grey, axis=0) # add batch dim
i_grey = np.expand_dims(i_grey, axis=3) # add channel dim
i_rgb = np.repeat(i_grey, 3, axis=3 ) # duplicate along channels dim
And likewise with filters:
k = np.expand_dims(k, axis=2) # input channels dim
k = np.expand_dims(k, axis=3) # output channels dim
k_rgb = np.repeat(k, 3, axis=2) # duplicate along the input channels dim
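As a sanity check that does not depend on TensorFlow, the sketch below repeats the arrays along the channel axis as described and applies a small hand-written "VALID" convolution. The helper conv2d_valid is a hypothetical stand-in written for this illustration, not the TensorFlow implementation; it assumes a batch of one and stride 1.

```python
import numpy as np

i_grey = np.array([[4, 3, 1, 0],
                   [2, 1, 0, 1],
                   [1, 2, 4, 1],
                   [3, 1, 0, 2]], dtype=np.float32)
k = np.array([[1, 0, 1],
              [2, 1, 0],
              [0, 0, 1]], dtype=np.float32)

# Repeat along the channel axis: image is (1, 4, 4, 3), kernel is (3, 3, 3, 1).
image = np.repeat(i_grey[np.newaxis, :, :, np.newaxis], 3, axis=3)
kernel = np.repeat(k[:, :, np.newaxis, np.newaxis], 3, axis=2)

# What the question did: stack channels first, then reshape to channels-last.
scrambled = np.repeat(i_grey[np.newaxis], 3, axis=0).reshape(1, 4, 4, 3)

def conv2d_valid(image, kernel):
    """Hypothetical helper: stride-1, no-padding convolution for a batch of one."""
    _, h, w, _ = image.shape
    kh, kw, _, out_c = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1, out_c), dtype=image.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[0, r:r + kh, c:c + kw, :]  # (kh, kw, in_channels)
            # Sum over height, width and input channels for each output channel.
            out[r, c] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))
    return out

result = conv2d_valid(image, kernel)[..., 0]
print(result)                              # [[42. 18.] [18. 36.]]
print(np.array_equal(scrambled, image))    # False: reshape reordered the values
```

The result is three times the single-channel output, which is what the question expected.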

use tf.shape() on tensorflow placeholder

Let's look at this simple made-up tf operation:
data = np.random.rand(1,2,3)
x = tf.placeholder(tf.float32, shape=[None, None, None], name='x_pl')
out = x
print ('shape:', tf.shape(out))
sess = tf.Session()
sess.run(out, feed_dict={x: data})
and the print is:
shape: Tensor("Shape_13:0", shape=(3,), dtype=int32)
I read that you should use tf.shape() to get the 'dynamic' shape of the tensor, which seems to be what I need, but why is the shape shape=(3,)?
Why is it not (1,2,3), as it should be determined when the session is run?
Suppose this is part of a neural network where I need to know the last dimension of x, for example to pass x into a Dense layer, which needs to know the last dimension of x.
How do I do it then?
That is because tf.shape() is an op, and you have to run it within a session to get its value. What you printed is the symbolic tensor: shape=(3,) says that the result is a vector with three entries, one per dimension of x.
data = np.random.rand(1,2,3)
x = tf.placeholder(tf.float32, shape=[None, None, None], name='x_pl')
out = x
print ('shape:', tf.shape(out))
z = tf.shape(out)
sess = tf.Session()
out_, z_ =sess.run([out,z], feed_dict={x: data})
print(f"shape of out: {z_}")
will return
shape: Tensor("Shape:0", shape=(3,), dtype=int32)
shape of out: [1 2 3]
Even if you look at the example from the docs (https://www.tensorflow.org/api_docs/python/tf/shape):
t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
tf.shape(t)
If you run it just like that it will return something like
<tf.Tensor 'Shape_4:0' shape=(3,) dtype=int32>
but if you run it within a session then you will get the expected result
t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
print(sess.run(tf.shape(t)))
[2 2 3]

use variational_recurrent in tf.contrib.rnn.DropoutWrapper

In the api of tf.contrib.rnn.DropoutWrapper, I am trying to set variational_recurrent=True, in which case, input_size is mandatory. As explained, input_size is TensorShape objects containing the depth(s) of the input tensors.
depth(s) is confusing, what is it please? Is it just the shape of the tensor as we can get by tf.shape()? Or the number of channels for the special case of images? But my input tensor is not an image.
And I don't understand why dtype is demanded when variational_recurrent=True.
Thanks!
input_size for an input of shape tf.TensorShape([200, None, 300]) is just 300, i.e. the size of the last dimension.
Play with this example:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see TF issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
import tensorflow as tf
import numpy as np
n_steps = 2
n_inputs = 3
n_neurons = 5
keep_prob = 0.5
learning_rate = 0.001
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))
basic_cell = tf.contrib.rnn.BasicLSTMCell(num_units=n_neurons)
basic_cell_drop = tf.contrib.rnn.DropoutWrapper(
basic_cell,
input_keep_prob=keep_prob,
variational_recurrent=True,
dtype=tf.float32,
input_size=n_inputs)
output_seqs, states = tf.contrib.rnn.static_rnn(
basic_cell_drop,
X_seqs,
dtype=tf.float32)
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])
init = tf.global_variables_initializer()
X_batch = np.array([
# t = 0 t = 1
[[0, 1, 2], [9, 8, 7]], # instance 1
[[3, 4, 5], [0, 0, 0]], # instance 2
[[6, 7, 8], [6, 5, 4]], # instance 3
[[9, 0, 1], [3, 2, 1]], # instance 4
])
with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})
    print(outputs_val)
See this for more details: https://github.com/tensorflow/tensorflow/issues/7927

How to get tensorflow to do a convolution on a 2 x 2 matrix with a 1 x 2 kernel?

I have the following matrix:
[[0, 1],
 [2, 3]]
and the following kernel:
[[1, 2]]
If I do a convolution with no padding and slide by 1 row, I should get the following answer:
[[2],
 [8]]
Because:
0*1 + 1*2 = 2
2*1 + 3*2 = 8
Based on the documentation of tf.nn.conv2d, I thought this code expresses what I just described above:
import tensorflow as tf
input_batch = tf.constant([
[
[[.0], [1.0]],
[[2.], [3.]]
]
])
kernel = tf.constant([
[
[[1.0, 2.0]]
]
])
conv2d = tf.nn.conv2d(input_batch, kernel, strides=[1, 1, 1, 1], padding='VALID')
sess = tf.Session()
print(sess.run(conv2d))
But it produces this output:
[[[[ 0. 0.]
[ 1. 2.]]
[[ 2. 4.]
[ 3. 6.]]]]
And I have no clue how that is computed. I've tried experimenting with different values for the strides and padding parameters, but I am still not able to produce the result I expected.
You have misread my explanation in the answer you linked. Adapting that code to no padding and strides of 1 gives the following:
import tensorflow as tf
k = tf.constant([
[1, 2],
], dtype=tf.float32, name='k')
i = tf.constant([
[0, 1],
[2, 3],
], dtype=tf.float32, name='i')
kernel = tf.reshape(k, [1, 2, 1, 1], name='kernel')
image = tf.reshape(i, [1, 2, 2, 1], name='image')
res = tf.squeeze(tf.nn.conv2d(image, kernel, [1, 1, 1, 1], "VALID"))
# VALID means no padding
with tf.Session() as sess:
    print(sess.run(res))
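This gives the result you expected: [2., 8.]. It is a vector rather than a column because of the squeeze operator.
One problem I see with your code (there might be others) is that your kernel has shape (1, 1, 1, 2), but it is supposed to be (1, 2, 1, 1). A (1, 1, 1, 2) kernel is a 1x1 filter with two output channels, so each input value is simply multiplied by 1 and by 2, which is the output you saw.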
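The difference between the two kernel shapes can be reproduced with plain NumPy. This is only an illustrative sketch of the arithmetic, not how TensorFlow computes it; the names expected and per_pixel are made up for the example.

```python
import numpy as np

image = np.array([[0, 1],
                  [2, 3]], dtype=np.float32)
kernel = np.array([[1, 2]], dtype=np.float32)  # 1 row, 2 columns

# Kernel laid out as (1, 2, 1, 1): slide a 1x2 window over the image, no padding.
kh, kw = kernel.shape
out_h = image.shape[0] - kh + 1
out_w = image.shape[1] - kw + 1
expected = np.array([[np.sum(image[r:r + kh, c:c + kw] * kernel)
                      for c in range(out_w)]
                     for r in range(out_h)])
print(expected)  # [[2.] [8.]]

# Kernel laid out as (1, 1, 1, 2): a 1x1 filter with two output channels,
# so every pixel is multiplied by 1 and by 2.
per_pixel = image[..., np.newaxis] * np.array([1.0, 2.0], dtype=np.float32)
print(per_pixel)  # [[[0. 0.] [1. 2.]] [[2. 4.] [3. 6.]]]
```

The second result matches the unexpected output in the question, apart from the leading batch dimension.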

reduce_sum by certain dimension

I have two embedding tensors, A and B, which look like
[
[1,1,1],
[1,1,1]
]
and
[
[0,0,0],
[1,1,1]
]
What I want to do is calculate the L2 distance d(A,B) element-wise.
First I did a tf.square(tf.sub(lhs, rhs)) to get
[
[1,1,1],
[0,0,0]
]
and then I want to do an element-wise reduce which returns
[
3,
0
]
but tf.reduce_sum does not allow me to reduce by row. Any input would be appreciated. Thanks.
Add the reduction_indices argument with a value of 1, e.g.:
tf.reduce_sum( tf.square( tf.sub( lhs, rhs) ), 1 )
That should produce the result you're looking for. Here is the documentation on reduce_sum().
According to the TensorFlow documentation, reduce_sum has the following signature:
tf.reduce_sum(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None)
However, reduction_indices has been deprecated; use axis instead. If axis is not set, all dimensions are reduced.
As an example, taken from the documentation:
# 'x' is [[1, 1, 1]
# [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
The requirement above can be written like this:
import numpy as np
import tensorflow as tf
a = np.array([[1,7,1],[1,1,1]])
b = np.array([[0,0,0],[1,1,1]])
xtr = tf.placeholder("float", [None, 3])
xte = tf.placeholder("float", [None, 3])
pred = tf.reduce_sum(tf.square(tf.subtract(xtr, xte)),1)
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    nn_index = sess.run(pred, feed_dict={xtr: a, xte: b})
    print(nn_index)
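For comparison, the same row-wise reduction can be checked in NumPy, whose axis argument behaves the same way here. This sketch uses the tensors A and B from the question; row_sums and row_sums_kept are illustrative names.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 1, 1]])
B = np.array([[0, 0, 0],
              [1, 1, 1]])

squared_diff = np.square(A - B)   # [[1 1 1] [0 0 0]]

# axis=1 sums across the columns of each row, giving one value per row.
row_sums = np.sum(squared_diff, axis=1)
print(row_sums)  # [3 0]

# keepdims=True keeps the reduced axis as a column of length-1 rows.
row_sums_kept = np.sum(squared_diff, axis=1, keepdims=True)
print(row_sums_kept.shape)  # (2, 1)
```

These are the squared distances per row; taking np.sqrt(row_sums) would give the L2 distance itself.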