I am trying to port some PyTorch code to TensorFlow. I learned that torch.nn.functional.conv1d() corresponds to tf.nn.conv1d(), but I suspect there are still some discrepancies between the two. Specifically, I cannot find the groups parameter in tf.nn.conv1d. For example, the following two snippets output different results:
PyTorch:
import torch
import torch.nn.functional as F
inputs = torch.Tensor([[[1, 1, 1, 1],[2, 2, 2, 2],[3, 3, 3, 3]]]) #batch_size x seq_length x embed_dim
inputs = inputs.transpose(2,1) #batch_size x embed_dim x seq_length
batch_size, embed_dim, seq_length = inputs.size()
kernel_size = 3
in_channels = 2
out_channels = in_channels
weight = torch.ones(out_channels, 1, kernel_size)
inputs = inputs.contiguous().view(-1, in_channels, seq_length) #batch_size*embed_dim/in_channels x in_channels x seq_length
inputs = F.pad(inputs, (kernel_size-1,0), 'constant', 0)
output = F.conv1d(inputs, weight, padding=0, groups=in_channels)
output = output.contiguous().view(batch_size, embed_dim, seq_length).transpose(2,1)
Output:
tensor([[[1., 1., 1., 1.],
[3., 3., 3., 3.],
[6., 6., 6., 6.]]])
TensorFlow:
import tensorflow as tf
inputs = tf.constant([[[1, 1, 1, 1],[2, 2, 2, 2],[3, 3, 3, 3]]], dtype=tf.float32) #batch_size x seq_length x embed_dim
inputs = tf.transpose(inputs, perm=[0,2,1])
batch_size, embed_dim, seq_length = inputs.get_shape()
print(batch_size, seq_length, embed_dim)
kernel_size = 3
in_channels = 2
out_channels = in_channels
weight = tf.ones([kernel_size, in_channels, out_channels])
inputs = tf.reshape(inputs, [(batch_size*embed_dim)//in_channels, in_channels, seq_length], name='inputs')
inputs = tf.transpose(inputs, perm=[0, 2, 1])
padding = [[0, 0], [(kernel_size - 1), 0], [0, 0]]
padded = tf.pad(inputs, padding)
res = tf.nn.conv1d(padded, weight, 1, 'VALID')
res = tf.transpose(res, perm=[0, 2, 1])
res = tf.reshape(res, [batch_size, embed_dim, seq_length])
res = tf.transpose(res, perm=[0, 2, 1])
print(res)
Output:
tf.Tensor(
[[[ 2.  2.  2.  2.]
  [ 6.  6.  6.  6.]
  [12. 12. 12. 12.]]], shape=(1, 3, 4), dtype=float32)
Different results
There is no discrepancy between the two libraries; you are just setting up different things. To get exactly the same results as in TensorFlow, change the line that defines the weights to:
weight = torch.ones(out_channels, 2, kernel_size)
and drop groups=in_channels from the F.conv1d call (a weight with 2 input channels per filter is only valid for this input with the default groups=1). Your input has two input channels, as you correctly declared in TF:
weight = tf.ones([kernel_size, in_channels, out_channels])
Groups parameter
You have misunderstood what the groups parameter does in PyTorch. It restricts the number of input channels each filter sees (here only one, since 2 input channels divided by 2 groups gives 1 channel per filter).
See here for a more intuitive explanation for 2D convolution.
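To make the effect of groups concrete, here is a minimal pure-Python sketch of a grouped 1-D convolution. The helper conv1d below is a hypothetical illustration written for this answer, not PyTorch's or TensorFlow's implementation; it assumes stride 1, no extra padding, and the PyTorch weight layout (out_channels, in_channels // groups, kernel_size). The input mirrors one sample from the question after left-padding: two channels, each holding 1, 2, 3.

```python
def conv1d(x, weight, groups=1):
    """Naive 1-D cross-correlation (stride 1, no padding) with PyTorch-style groups.

    x:      nested list of shape [in_channels][length]
    weight: nested list of shape [out_channels][in_channels // groups][kernel_size]
    """
    in_channels, length = len(x), len(x[0])
    out_channels, per_group_in, k = len(weight), len(weight[0]), len(weight[0][0])
    assert in_channels % groups == 0 and out_channels % groups == 0
    assert per_group_in == in_channels // groups, "weight does not match groups"
    per_group_out = out_channels // groups
    out = []
    for o in range(out_channels):
        # First input channel visible to this filter.
        first = (o // per_group_out) * per_group_in
        row = []
        for t in range(length - k + 1):
            row.append(sum(weight[o][c][j] * x[first + c][t + j]
                           for c in range(per_group_in) for j in range(k)))
        out.append(row)
    return out


k = 3
# Two channels, each 1, 2, 3, left-padded with kernel_size - 1 zeros.
x = [[0.0, 0.0, 1.0, 2.0, 3.0],
     [0.0, 0.0, 1.0, 2.0, 3.0]]

# groups=2: weight shape (2, 1, 3), each filter sees a single channel.
depthwise = conv1d(x, [[[1.0] * k]] * 2, groups=2)
# groups=1: weight shape (2, 2, 3), each filter sums over both channels.
full = conv1d(x, [[[1.0] * k] * 2] * 2, groups=1)

print(depthwise)  # [[1.0, 3.0, 6.0], [1.0, 3.0, 6.0]]
print(full)       # [[2.0, 6.0, 12.0], [2.0, 6.0, 12.0]]
```

The grouped result reproduces the 1, 3, 6 from the PyTorch snippet, and the ungrouped result reproduces the 2, 6, 12 from the TensorFlow snippet.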
Related
I have a dataset that contains many snapshot observations in time, with a 1 or 0 as the label for each observation. Let's say each observation contains 3 features. I want to train an LSTM that takes a sequence of n observations and attempts to classify the nth observation as a 1 or 0.
So if we have a dataset that looks like this:
# X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
# y = [1, 0, 1, 0]
# so X[0] = y[0], X[1] = y[1]
# . and I would like to input X[0] + X[1] to classify X[1] as y[1]
# . How would I need to structure this below?
X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
y = [1, 0, 1, 0]
def create_model():
    model = Sequential()
    # input_shape[0] is equal to 2 timesteps?
    # input_shape[1] is equal to the 3 features per row?
    model.add(LSTM(20, input_shape=(2, 3)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model  # without this, create_model() returns None
m = create_model()
m.fit(X, y)
So I want X[0] and X[1] to be the input for one iteration of training and should be classified as y[1].
My question is this. How do I structure the model in order to take this input properly? I am very confused by input_shape, features, input_length, batches etc ...
The below code snippet might help clarify:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
# Number of samples = 4, sequence length = 3, features = 2
X = np.array([[[0, 1], [1, 0], [1, 1]],
              [[1, 1], [1, 1], [1, 0]],
              [[0, 1], [1, 0], [0, 0]],
              [[1, 1], [1, 1], [1, 1]]])
y = np.array([[1], [0], [1], [0]])
print(X)
print(X.shape)
print(y.shape)
model = Sequential()
model.add(LSTM(20, input_shape=(3, 2)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X, y)
Also, on the Keras documentation page: https://keras.io/getting-started/sequential-model-guide/ look at the example for "Stacked LSTM for sequence classification" near the bottom. It might help.
In general using Keras, the batch dimension/sample dimension is not specified in layers - it is automatically inferred from the input data.
I hope this helps.
You have the input shape correct.
I would reshape the input data to (batch_size, timesteps, features). Note that reshape returns a new array rather than modifying X in place, so assign the result, and y must then hold one label per sequence:
m = create_model()
X = np.array(X).reshape((batch_size, 2, 3))
m.fit(X, y)
Common batch sizes are 4, 8, 16, and 32, but for a small dataset the batch size matters less.
When you want to predict, use batch_size = 1.
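A plain reshape of the question's four rows gives only two non-overlapping sequences, which does not match the stated goal of using X[0] and X[1] to predict y[1]. One possible way to build overlapping sequences is a sliding window. The sketch below is a minimal illustration; make_windows is a hypothetical helper introduced here, and it assumes the label of a window is the label of its last row.

```python
def make_windows(X, y, timesteps):
    """Pair each run of `timesteps` consecutive rows with the label of its last row."""
    windows, labels = [], []
    for end in range(timesteps, len(X) + 1):
        windows.append(X[end - timesteps:end])
        labels.append(y[end - 1])
    return windows, labels


X = [[0, 1, 1], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
y = [1, 0, 1, 0]

X_seq, y_seq = make_windows(X, y, timesteps=2)

# Shape is (samples, timesteps, features).
print(len(X_seq), len(X_seq[0]), len(X_seq[0][0]))  # 3 2 3
print(X_seq[0])  # [[0, 1, 1], [1, 0, 0]]
print(y_seq)     # [0, 1, 0]
```

Wrapping X_seq and y_seq in np.array(...) would then give arrays of shape (3, 2, 3) and (3,), which matches input_shape=(2, 3).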
According to the TF documentation, the sample_weight argument can have shape [batch_size]. The relevant documentation is quoted below:
sample_weight: Optional Tensor whose rank is either 0, or the same rank as y_true, or is broadcastable to y_true. sample_weight acts as a coefficient for the loss. If a scalar is provided, then the loss is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], then the total loss for each sample of the batch is rescaled by the corresponding element in the sample_weight vector. If the shape of sample_weight matches the shape of y_pred, then the loss of each measurable element of y_pred is scaled by the corresponding value of sample_weight.
However, I cannot understand why the following code does not work.
import tensorflow as tf
gt = tf.convert_to_tensor([1, 1, 1, 1, 1])
pred = tf.convert_to_tensor([1., 0., 1., 1., 0.])
sample_weights = tf.convert_to_tensor([0, 1, 0, 0, 0])
loss = tf.keras.losses.BinaryCrossentropy()(gt, pred, sample_weight=sample_weights)
print(loss)
The code throws this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Can not squeeze dim[0], expected a dimension of 1, got 5 [Op:Squeeze]
If I expand the dimensions of gt, pred, and sample_weights, then it works properly and outputs the expected loss value of 3.0849898.
import tensorflow as tf
gt = tf.convert_to_tensor([1, 1, 1, 1, 1])
pred = tf.convert_to_tensor([1., 0., 1., 1., 0.])
sample_weights = tf.convert_to_tensor([0, 1, 0, 0, 0])
# expand dims
gt = tf.expand_dims(gt, 1)
pred = tf.expand_dims(pred, 1)
sample_weights = tf.expand_dims(sample_weights, 1)
loss = tf.keras.losses.BinaryCrossentropy()(gt, pred, sample_weight=sample_weights)
print(loss) # loss is 3.0849898
The problem is not the shape of sample_weight. It is the shapes of pred and gt, which should be [batch_size, n_labels]:
import tensorflow as tf
gt = tf.convert_to_tensor([1, 1, 1, 1, 1])
pred = tf.convert_to_tensor([1., 0., 1., 1., 0.])
sample_weights = tf.convert_to_tensor([0, 1, 0, 0, 0])
# expand dims
gt = tf.expand_dims(gt, 1)
pred = tf.expand_dims(pred, 1)
print(gt.shape, pred.shape) #(5, 1) (5, 1)
loss = tf.keras.losses.BinaryCrossentropy()(gt, pred, sample_weight=sample_weights)
print(loss) # loss is 3.0849898
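To see where 3.0849898 comes from, the weighted loss can be reproduced by hand. The sketch below is a rough pure-Python approximation, not the Keras source: it assumes predictions are clipped to [EPS, 1 - EPS] with EPS = 1e-7, that EPS is added again inside the logarithm, and that the weighted per-sample losses are summed and divided by the batch size. weighted_bce is a hypothetical helper name.

```python
import math

EPS = 1e-7  # assumed clipping constant; an assumption about Keras' default


def weighted_bce(y_true, y_pred, sample_weight):
    """Binary cross-entropy per sample, scaled by its weight, averaged over the batch."""
    weighted = []
    for t, p, w in zip(y_true, y_pred, sample_weight):
        p = min(max(p, EPS), 1.0 - EPS)  # keep log() away from 0
        loss = -(t * math.log(p + EPS) + (1 - t) * math.log(1.0 - p + EPS))
        weighted.append(w * loss)
    # Divide by the number of samples, not by the sum of the weights.
    return sum(weighted) / len(weighted)


gt = [1, 1, 1, 1, 1]
pred = [1.0, 0.0, 1.0, 1.0, 0.0]
sample_weights = [0, 1, 0, 0, 0]

loss = weighted_bce(gt, pred, sample_weights)
print(loss)
```

Only the second sample has a non-zero weight, so under these assumptions the result is -log(2e-7) / 5, which agrees with the value reported above to several decimal places.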
I have a small 4*4 matrix that I want to filter with two different filters in TensorFlow (1.8.0). I have an example with one filter (my_filter).
I want to change the filter to something like
my_filter = tf.constant([0.2,0.5], shape=[2, 2, 3, 1])
One filter would be 2*2 with all values 0.25, the other 2*2 with all values 0.5. But how do I set the values?
This is my code:
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
x_shape = [1, 4, 4, 1]
x_val = np.ones(shape=x_shape)
x_val[0,1,1,0]=5
print(x_val)
x_data = tf.placeholder(tf.float32, shape=x_shape)
my_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
my_strides = [1, 2, 2, 1]
mov_avg_layer = tf.nn.conv2d(x_data, my_filter, my_strides,
                             padding='SAME', name='Moving_Avg_Window')
# Execute the operations
with tf.Session() as sess:
    #print(x_data.eval())
    result = sess.run(mov_avg_layer, feed_dict={x_data: x_val})
    print("Filter: ", result)
    print("Filter: ", result.shape)
    # no sess.close() needed: the with block closes the session
First option
The filter can also be defined as a placeholder and fed a different value on each run:
filter = tf.placeholder(filter_type, filter_shape)
...
with tf.Session() as sess:
    for i in range(number_filters):
        result = sess.run(mov_avg_layer, feed_dict={x_data: x_val, filter: filter_val})
Second option
Define a second filter in the graph:
my_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
my_filter2 = tf.constant(0.5, shape=[2, 2, 1, 1])
mov_avg_layer1 = tf.nn.conv2d(x_data, my_filter, my_strides,
                              padding='SAME', name='Moving_Avg_Window')
mov_avg_layer2 = tf.nn.conv2d(x_data, my_filter2, my_strides,
                              padding='SAME', name='Moving_Avg_Window')
...
with tf.Session() as sess:
    result1, result2 = sess.run([mov_avg_layer1, mov_avg_layer2], feed_dict={x_data: x_val})
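A third possibility, not covered by the two options above, is a single filter with two output channels: tf.nn.conv2d expects the filter as [height, width, in_channels, out_channels], so the two 2*2 filters can sit side by side in the last dimension of a [2, 2, 1, 2] tensor. The pure-Python sketch below illustrates the idea without TensorFlow. conv2d_stride is a hypothetical helper, and it drops the batch and input-channel dimensions (both are 1 in the question) for brevity.

```python
def conv2d_stride(image, filt, stride):
    """Naive strided cross-correlation without padding.

    image: nested list [H][W]
    filt:  nested list [kh][kw][out_channels]
    returns a nested list [H'][W'][out_channels]
    """
    kh, kw, n_out = len(filt), len(filt[0]), len(filt[0][0])
    out = []
    for r in range(0, len(image) - kh + 1, stride):
        row = []
        for c in range(0, len(image[0]) - kw + 1, stride):
            row.append([sum(image[r + i][c + j] * filt[i][j][o]
                            for i in range(kh) for j in range(kw))
                        for o in range(n_out)])
        out.append(row)
    return out


# Same input as in the question: a 4x4 matrix of ones with a 5 at [1][1].
image = [[1.0] * 4 for _ in range(4)]
image[1][1] = 5.0

# Output channel 0 is the all-0.25 filter, output channel 1 the all-0.5 filter.
filt = [[[0.25, 0.5] for _ in range(2)] for _ in range(2)]

result = conv2d_stride(image, filt, stride=2)
print(result)
# [[[2.0, 4.0], [1.0, 2.0]], [[1.0, 2.0], [1.0, 2.0]]]
```

For a 4*4 input with a 2*2 kernel and stride 2, 'SAME' padding adds nothing, so the unpadded sketch covers the same windows as the question's code.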
In the API of tf.contrib.rnn.DropoutWrapper, I am trying to set variational_recurrent=True, in which case input_size is mandatory. According to the documentation, input_size is a (possibly nested) TensorShape object containing the depth(s) of the input tensors.
"Depth(s)" is confusing. What is it? Is it just the shape of the tensor as returned by tf.shape()? Or the number of channels in the special case of images? My input tensor is not an image.
I also don't understand why dtype is required when variational_recurrent=True.
Thanks!
The input_size for an input of shape tf.TensorShape([200, None, 300]) is just 300, i.e. the size of the last (feature) dimension.
Play with this example.
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see TF issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
import tensorflow as tf
import numpy as np
n_steps = 2
n_inputs = 3
n_neurons = 5
keep_prob = 0.5
learning_rate = 0.001
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
X_seqs = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))
basic_cell = tf.contrib.rnn.BasicLSTMCell(num_units=n_neurons)
basic_cell_drop = tf.contrib.rnn.DropoutWrapper(
    basic_cell,
    input_keep_prob=keep_prob,
    variational_recurrent=True,
    dtype=tf.float32,
    input_size=n_inputs)
output_seqs, states = tf.contrib.rnn.static_rnn(
    basic_cell_drop,
    X_seqs,
    dtype=tf.float32)
outputs = tf.transpose(tf.stack(output_seqs), perm=[1, 0, 2])
init = tf.global_variables_initializer()
X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 1
    [[3, 4, 5], [0, 0, 0]],  # instance 2
    [[6, 7, 8], [6, 5, 4]],  # instance 3
    [[9, 0, 1], [3, 2, 1]],  # instance 4
])
with tf.Session() as sess:
    init.run()
    outputs_val = outputs.eval(feed_dict={X: X_batch})
    print(outputs_val)
See this for more details: https://github.com/tensorflow/tensorflow/issues/7927
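As a rough intuition for why the depth has to be known up front: with variational_recurrent=True the same dropout pattern is applied at every time step, so the mask has to be created once, before the inputs are seen, and its length is the feature depth. The sketch below is a conceptual pure-Python illustration only, not the DropoutWrapper implementation; make_mask and variational_dropout are hypothetical names.

```python
import random


def make_mask(depth, keep_prob, rng):
    """Sample one inverted-dropout mask with `depth` entries (0 or 1 / keep_prob)."""
    return [(1.0 / keep_prob) if rng.random() < keep_prob else 0.0
            for _ in range(depth)]


def variational_dropout(sequence, keep_prob, rng):
    """Apply a single mask, sampled once, to every time step of the sequence."""
    depth = len(sequence[0])                 # the "depth": features per step
    mask = make_mask(depth, keep_prob, rng)  # created once, reused below
    dropped = [[v * m for v, m in zip(step, mask)] for step in sequence]
    return dropped, mask


rng = random.Random(0)
sequence = [[1.0, 2.0, 3.0],   # t = 0
            [9.0, 8.0, 7.0]]   # t = 1

dropped, mask = variational_dropout(sequence, keep_prob=0.5, rng=rng)
print(mask)
print(dropped)
```

Whichever features are zeroed at t = 0 are zeroed at t = 1 as well, which is what distinguishes this from sampling a fresh mask at each step.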
I have the following matrix:
[[0, 1],
 [2, 3]]
and the following kernel:
[1, 2]
If I do a convolution with no padding and slide by 1 row, I should get the following answer:
[2, 8]
Because:
0*1 + 1*2 = 2
2*1 + 3*2 = 8
Based on the documentation of tf.nn.conv2d, I thought this code expresses what I just described above:
import tensorflow as tf
input_batch = tf.constant([
    [
        [[.0], [1.0]],
        [[2.], [3.]]
    ]
])
kernel = tf.constant([
    [
        [[1.0, 2.0]]
    ]
])
conv2d = tf.nn.conv2d(input_batch, kernel, strides=[1, 1, 1, 1], padding='VALID')
sess = tf.Session()
print(sess.run(conv2d))
But it produces this output:
[[[[ 0. 0.]
[ 1. 2.]]
[[ 2. 4.]
[ 3. 6.]]]]
And I have no clue how that is computed. I've tried experimenting with different values for the strides and padding parameters, but I am still not able to produce the result I expected.
You have not read my explanation in the tutorial you linked correctly. After a straightforward modification (no padding, strides=1), you are supposed to get the following code.
import tensorflow as tf
k = tf.constant([
    [1, 2],
], dtype=tf.float32, name='k')
i = tf.constant([
    [0, 1],
    [2, 3],
], dtype=tf.float32, name='i')
kernel = tf.reshape(k, [1, 2, 1, 1], name='kernel')
image = tf.reshape(i, [1, 2, 2, 1], name='image')
res = tf.squeeze(tf.nn.conv2d(image, kernel, [1, 1, 1, 1], "VALID"))
# VALID means no padding
with tf.Session() as sess:
    print(sess.run(res))
This gives you the result you expected: [2., 8.]. I got a vector instead of a column because of the squeeze operator.
One problem I see with your code (there might be others) is that your kernel has shape (1, 1, 1, 2), but it is supposed to be (1, 2, 1, 1).
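The difference between the two kernel shapes can be checked without TensorFlow. The sketch below is a minimal pure-Python illustration; conv2d_valid is a hypothetical helper that assumes stride 1, no padding, a single input channel, and a kernel laid out as [height][width][out_channels] (the TensorFlow layout with the in_channels dimension of size 1 dropped).

```python
def conv2d_valid(image, kernel):
    """Naive cross-correlation, stride 1, no padding.

    image:  nested list [H][W]
    kernel: nested list [kh][kw][out_channels]
    returns a nested list [H'][W'][out_channels]
    """
    kh, kw, n_out = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append([sum(image[r + i][c + j] * kernel[i][j][o]
                            for i in range(kh) for j in range(kw))
                        for o in range(n_out)])
        out.append(row)
    return out


image = [[0.0, 1.0],
         [2.0, 3.0]]

# Shape (1, 2, 1, 1): one filter that is 1 high and 2 wide.
wide_kernel = [[[1.0], [2.0]]]
# Shape (1, 1, 1, 2): two 1x1 filters, one multiplying by 1 and one by 2.
two_filters = [[[1.0, 2.0]]]

expected = conv2d_valid(image, wide_kernel)
observed = conv2d_valid(image, two_filters)
print(expected)  # [[[2.0]], [[8.0]]]
print(observed)  # [[[0.0, 0.0], [1.0, 2.0]], [[2.0, 4.0], [3.0, 6.0]]]
```

The second result has the same values as the output in the question: with a (1, 1, 1, 2) kernel every pixel is simply multiplied by 1 and by 2, producing two output channels.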