According to the documentation, the second argument of static_rnn should be set to "a length T list of inputs, each a Tensor of shape [batch_size, input_size], or a nested tuple of such elements."
I passed a list of columns to static_rnn, but I get ValueError: linear is expecting 2D arguments. So it seems input_size can't be 1. What exactly do input_size and T refer to? Why can't input_size be 1?
It occurs to me that static_rnn might expect a list whose matrices contain one-hot vectors. In this case, input size would be the vocabulary length. But if static_rnn requires one-hot vectors, the documentation would say so, right?
input_size denotes the number of features, and it can be 1, e.g., in ordinary time series prediction. You most probably got this error because your tensors have shape [batch_size], not [batch_size, 1].
So you don't have to one-hot encode your features (though you can); just give the input tensors the right rank.
Sample code:
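import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.nn.static_rnn)
n_inputs = 1   # input_size: one feature per time step
n_neurons = 5
# One placeholder per time step (T = 2), each of shape [batch_size, n_inputs]
X0 = tf.placeholder(dtype=tf.float32, shape=[None, n_inputs])
X1 = tf.placeholder(dtype=tf.float32, shape=[None, n_inputs])
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
output_seqs, states = tf.nn.static_rnn(basic_cell, [X0, X1], dtype=tf.float32)
Y0, Y1 = output_seqs  # one output of shape [batch_size, n_neurons] per time step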
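As a rough illustration of the shape requirement, here is a NumPy-only sketch (the array name `series` and the sizes are made up for the example) that turns a [batch_size, T] array into the length-T list of [batch_size, 1] arrays the documentation describes. Taking a column directly gives a rank-1 array, which is presumably what triggered the error.

```python
import numpy as np

batch_size, T = 4, 3  # illustrative sizes, not taken from the question
series = np.arange(batch_size * T, dtype=np.float32).reshape(batch_size, T)

# Indexing a single column drops the feature axis: shape (batch_size,)
column = series[:, 0]

# Splitting along axis 1 keeps it: T arrays of shape (batch_size, 1)
steps = np.split(series, T, axis=1)

print(column.shape)              # rank 1
print([s.shape for s in steps])  # rank 2, i.e. input_size = 1
```

With TensorFlow tensors, tf.split(series, T, axis=1) should give the analogous list.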
My code is as follows:
v = tf.Variable(initial_value=v, trainable=True)
v.shape is (1, 768)
In the model:
inputs_sents = keras.Input(shape=(50,3))
inputs_events = keras.Input(shape=(50,768))
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
But I got an error,
ValueError: Dimensions must be equal, but are 768 and 50 for
'{{node BatchMatMulV2_3}} =
BatchMatMulV2[T=DT_FLOAT,
adj_x=false,
adj_y=false](BatchMatMulV2_3/ReadVariableOp,
Transpose_3)' with input shapes: [1,768], [768,50,?]
I think it takes the batch dimension into account? How should I deal with this?
v is a trainable vector (or rather a 2-D array whose first dimension is 1); I want it to be updated during training.
PS: This is the result I got using the code provided by the first answer. I think it is incorrect because Keras already accounts for the leading batch dimension.
Plus, from the keras documentation,
shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; 'None' elements represent dimensions where the shape is not known.
https://keras.io/api/layers/core_layers/input/
Should I rewrite my code without Keras?
The shape of a batch is denoted by None:
import numpy as np
import tensorflow as tf
from tensorflow import keras
inputs_sents = keras.Input(shape=(None,1,3))
inputs_events = keras.Input(shape=(None,1,768))
v = np.ones(shape=(1,768), dtype=np.float32)
v = tf.Variable(initial_value=v, trainable=True)
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
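For comparison, the shapes in the question can be checked with plain NumPy, whose matmul broadcasts over leading batch dimensions. This is only a sketch: the batch size of 2 is arbitrary, and it assumes the intended result is v times the transpose of each (50, 768) event matrix, times the matching (50, 3) sentence matrix. A bare transpose reverses all axes and moves the batch axis to the end, which matches the [768,50,?] shape in the error message; swapping only the last two axes keeps the batch axis first.

```python
import numpy as np

batch = 2  # arbitrary batch size for the illustration
v = np.ones((1, 768), dtype=np.float32)
events = np.ones((batch, 50, 768), dtype=np.float32)
sents = np.ones((batch, 50, 3), dtype=np.float32)

# A bare transpose reverses every axis, including the batch axis.
reversed_shape = np.transpose(events).shape
print(reversed_shape)

# Swap only the last two axes so each batch item becomes (768, 50).
events_t = np.transpose(events, (0, 2, 1))
x_1 = np.matmul(v, events_t)  # (1, 768) broadcast against (batch, 768, 50)
x_2 = np.matmul(x_1, sents)   # (batch, 1, 50) @ (batch, 50, 3)
print(x_1.shape, x_2.shape)
```

In TensorFlow the corresponding transpose would be tf.transpose(inputs_events, perm=[0, 2, 1]). Whether the rank-2 v then broadcasts against the batch in tf.matmul depends on the TensorFlow version, so treat that part as an assumption to verify.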
I have an idea for a tensor operation that would not be difficult to implement via iteration, with batch size one. However I would like to parallelize it as much as possible.
I have two tensors with shape (n, 5) called X and Y. X is actually supposed to represent 5 one-dimensional tensors with shape (n, 1): (x_1, ..., x_5). Ditto for Y.
I would like to compute a tensor with shape (n, 25) where each column represents the output of the tensor operation f(x_i, y_j), where f is fixed for all 1 <= i, j <= 5. The operation f has output shape (n, 1), just like x_i and y_j.
I feel it is important to clarify that f is essentially a fully-connected layer from the concatenated [...x_i, ...y_i] tensor with shape (1, 10), to an output layer with shape (1,5).
Again, it is easy to see how to do this manually with iteration and slicing. However this is probably very slow. Performing this operation in batches, where the tensors X, Y now have shape (n, 5, batch_size) is also desirable, particularly for mini-batch gradient descent.
It is difficult to really articulate here why I desire to create this network; I feel it is suited for my domain of 'itemized tabular data' and cuts down significantly on the number of weights per operation, compared to a fully connected network.
Is this possible using tensorflow? Certainly not using just keras.
Below is an example in numpy per AloneTogether's request
import numpy as np
features = 16
batch_size = 256
X_batch = np.random.random((features, 5, batch_size))
Y_batch = np.random.random((features, 5, batch_size))
# one tensor operation to reduce weights in this custom 'layer'
f = np.random.random((features, 2 * features))
for b in range(batch_size):
    X = X_batch[:, :, b]
    Y = Y_batch[:, :, b]
    for i in range(5):
        x_i = X[:, i:i+1]
        for j in range(5):
            y_j = Y[:, j:j+1]
            x_i_y_j = np.concatenate([x_i, y_j], axis=0)
            # f(x_i, y_j)
            # implemented by a fully-connected layer
            f_i_j = np.matmul(f, x_i_y_j)
All operations you need (concatenation and matrix multiplication) can be batched.
The difficult part is that you want to concatenate the features of every item in X with the features of every item in Y (all combinations).
My recommended solution is to expand the dimensions of X to [batch, features, 5, 1] and the dimensions of Y to [batch, features, 1, 5].
Then tf.repeat() both tensors so their shapes become [batch, features, 5, 5].
Now you can concatenate X and Y along the feature axis. You will have a tensor of shape [batch, 2*features, 5, 5]. Observe that all combinations are built this way.
The next step is matrix multiplication. tf.matmul() can also do batch matrix multiplication, but here I use tf.einsum() because I want more control over which dimensions are treated as batch dimensions.
Full code:
import tensorflow as tf
import numpy as np
batch_size=3
features=6
items=5
x = np.random.uniform(size=[batch_size,features,items])
y = np.random.uniform(size=[batch_size,features,items])
f = np.random.uniform(size=[2*features,features])
x_reps= tf.repeat(x[:,:,:,tf.newaxis], items, axis=3)
y_reps= tf.repeat(y[:,:,tf.newaxis,:], items, axis=2)
xy_conc = tf.concat([x_reps,y_reps], axis=1)
f_i_j = tf.einsum("bfij, fg->bgij", xy_conc,f)
f_i_j = tf.reshape(f_i_j , [batch_size,features,items*items])
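One way to gain confidence in the batched version is to compare it against the explicit loops on small random data. The following is a NumPy sketch of that check, not part of the answer's code: the weight matrix is called `w` here, the random seed is arbitrary, and np.broadcast_to stands in for tf.repeat.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, features, items = 3, 6, 5
x = rng.uniform(size=(batch_size, features, items))
y = rng.uniform(size=(batch_size, features, items))
w = rng.uniform(size=(2 * features, features))  # hypothetical weights of f

# Build all (i, j) combinations by broadcasting to [batch, features, items, items].
full = (batch_size, features, items, items)
x_reps = np.broadcast_to(x[:, :, :, np.newaxis], full)
y_reps = np.broadcast_to(y[:, :, np.newaxis, :], full)
xy = np.concatenate([x_reps, y_reps], axis=1)   # [batch, 2*features, i, j]
vectorized = np.einsum("bfij,fg->bgij", xy, w)  # [batch, features, i, j]

# Reference implementation with explicit loops over batch and item pairs.
looped = np.empty_like(vectorized)
for b in range(batch_size):
    for i in range(items):
        for j in range(items):
            pair = np.concatenate([x[b, :, i], y[b, :, j]])  # (2*features,)
            looped[b, :, i, j] = pair @ w

print(np.allclose(vectorized, looped))
```

If the two results agree, the final reshape to [batch_size, features, items*items] only flattens the (i, j) grid and does not change any values.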
I want to feed a fixed-length sequence into a 1-D CNN and have it make a prediction (regression), but I want to have a variable batch size during training. The tutorials are not really helpful.
In my input layer I have something like this:
input = tf.placeholder(tf.float32, [None, sequence_length], name="input")
y = tf.placeholder(tf.float32, [None, 1], name="y")
I assume the None dimension allows a variable batch size, so the input has shape batch_size * sequence_length and I am supposed to feed the network a 2-D NumPy array of shape any * sequence_length.
tf.nn.conv1d expects a 3-D input. Since my input is a single channel, that is, one NumPy array of sequence_length observations, I believe the input I need to feed to the CNN should be 1 * batch_size * sequence_length. If, on the other hand, I had 2 different sequences that I combine to predict a single value, it would be 2 * batch_size * sequence_length, and I would also need to concatenate the 2 channels. So in my case I need
input = tf.expand_dims(input, -1)
and then the filter follows the same layout:
filter_size = 5
channel_size = 1
num_filters = 10
filter_shape = [filter_size, channel_size, num_filters]
filters = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="filters")
tf.nn.conv1d(value=input, filters=filters, stride=1)
After that I add a fully connected layer, but the network isn't able to learn anything, not even a basic function such as sin(x). Does the code above look correct?
Also how can I do a maxpooling?
When taking the one dimensional convolution of a one dimensional array, I receive an error which suggests my second dimension is not big enough.
Here is the overview of the relevant code:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=1, strides=1, padding='valid')
I am hoping to increase the kernel size to 3 such that neighbouring points also influence the output of each input node, however I get the following error:
ValueError: Negative dimension size caused by subtracting 3 from 1 for
'conv1d_4/convolution/Conv2D' (op: 'Conv2D') with input shapes:
[?,1,1,45], [1,3,45,64].
My guess is that TensorFlow expects me to reshape my input into two dimensions so that some depth can be used for the kernel multiplication. My question is why this is the case, and what layer behaviour to expect depending on the input dimensions.
You need to add a channel dimension as the last dimension, even if you only have one channel.
So this code works:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=-1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=3, strides=1, padding='valid')
So basically the error occurred because your tensor looked as if it had a width of 1 with 45 channels. TensorFlow was trying to convolve with a kernel of size 3 along a dimension of size 1.
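To make the role of the axes concrete, here is a small NumPy sketch of a 'valid' 1-D convolution over a [batch, width, channels] input. The helper `conv1d_valid` is hypothetical and written only for this illustration; it is not how tf.layers.conv1d is implemented. It shows that a kernel of size 3 fits a width of 45 but not a width of 1.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """x: [batch, width, in_channels]; kernel: [kernel_size, in_channels, filters]."""
    batch, width, in_ch = x.shape
    k, k_in, filters = kernel.shape
    if k_in != in_ch:
        raise ValueError("kernel in_channels does not match input channels")
    out_width = width - k + 1  # 'valid' padding: no positions outside the input
    if out_width <= 0:
        raise ValueError(f"kernel size {k} does not fit into width {width}")
    out = np.zeros((batch, out_width, filters))
    for t in range(out_width):
        window = x[:, t:t + k, :]  # [batch, k, in_channels]
        out[:, t, :] = np.einsum("bkc,kcf->bf", window, kernel)
    return out

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 45))  # batch of 2 sequences of length 45

# Channel axis last: width 45, 1 channel, so a kernel of size 3 fits.
good = conv1d_valid(inputs[:, :, np.newaxis], rng.normal(size=(3, 1, 64)))
print(good.shape)

# Axis inserted at position 1: width 1, 45 channels, so the kernel does not fit.
try:
    conv1d_valid(inputs[:, np.newaxis, :], rng.normal(size=(3, 45, 64)))
    failed = False
except ValueError as err:
    failed = True
    print(err)
```

The output width of 45 - 3 + 1 = 43 in the first case is the same arithmetic that produces the negative dimension in the second case (1 - 3 + 1).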
I have a problem with which I've been struggling. It is related to tf.matmul() and its absence of broadcasting.
I am aware of a similar issue on https://github.com/tensorflow/tensorflow/issues/216, but tf.batch_matmul() doesn't look like a solution for my case.
I need to encode my input data as a 4D tensor:
X = tf.placeholder(tf.float32, shape=(None, None, None, 100))
The first dimension is the size of a batch, the second the number of entries in the batch.
You can imagine each entry as a composition of a number of objects (third dimension). Finally, each object is described by a vector of 100 float values.
Note that I used None for the second and third dimensions because the actual sizes may change in each batch. However, for simplicity, let's shape the tensor with actual numbers:
X = tf.placeholder(tf.float32, shape=(5, 10, 4, 100))
These are the steps of my computation:
1. Compute a function of each vector of 100 float values (e.g., a linear function):
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
Y = tf.matmul(X, W)
Problem: no broadcasting for tf.matmul() and no success using tf.batch_matmul().
Expected shape of Y: (5, 10, 4, 50)
2. Apply average pooling to each entry of the batch (over the objects of each entry):
Y_avg = tf.reduce_mean(Y, 2)
Expected shape of Y_avg: (5, 10, 50)
I expected that tf.matmul() would support broadcasting. Then I found tf.batch_matmul(), but it still looks like it doesn't apply to my case (e.g., W needs to have at least 3 dimensions, and it is not clear to me why).
BTW, above I used a simple linear function (the weights of which are stored in W). But in my model I have a deep network instead. So, the more general problem I have is automatically computing a function for each slice of a tensor. This is why I expected that tf.matmul() would have had a broadcasting behavior (if so, maybe tf.batch_matmul() wouldn't even be necessary).
Look forward to learning from you!
Alessio
You could achieve that by reshaping X to shape [n, d], where d is the dimensionality of one single "instance" of computation (100 in your example) and n is the number of those instances in your multi-dimensional object (5*10*4=200 in your example). After reshaping, you can use tf.matmul and then reshape back to the desired shape. The fact that the first three dimensions can vary makes that a little tricky, but you can use tf.shape to determine the actual shapes at run time. Finally, you can perform the second step of your computation, which should be a simple tf.reduce_mean over the respective dimension. All in all, it would look like this:
import tensorflow as tf
X = tf.placeholder(tf.float32, shape=(None, None, None, 100))
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
X_ = tf.reshape(X, [-1, 100])
Y_ = tf.matmul(X_, W)
X_shape = tf.gather(tf.shape(X), [0,1,2]) # Extract the first three dimensions
target_shape = tf.concat([X_shape, [50]], axis=0) # values first, then axis (TensorFlow >= 1.0)
Y = tf.reshape(Y_, target_shape)
Y_avg = tf.reduce_mean(Y, 2)
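The same reshape, multiply, reshape pattern can be tried out in NumPy, where the shapes are concrete. This is only a sketch using the example sizes from the question and random data; the spot check on one slice is an addition for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10, 4, 100))
W = rng.normal(size=(100, 50))

X_flat = X.reshape(-1, X.shape[-1])                # (200, 100): one row per object
Y_flat = X_flat @ W                                # (200, 50)
Y = Y_flat.reshape(X.shape[:-1] + (W.shape[1],))   # back to (5, 10, 4, 50)
Y_avg = Y.mean(axis=2)                             # average over objects: (5, 10, 50)

# Spot check: every slice equals the plain matrix product of that slice with W.
print(np.allclose(Y[1, 2, 3], X[1, 2, 3] @ W))
print(Y.shape, Y_avg.shape)
```

Reshaping is safe here because it only regroups the leading axes and leaves each 100-dimensional vector intact.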
As the renamed title of the GitHub issue you linked suggests, you should use tf.tensordot(). It contracts pairs of axes between two tensors, in line with NumPy's tensordot(). For your case:
X = tf.placeholder(tf.float32, shape=(5, 10, 4, 100))
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
Y = tf.tensordot(X, W, [[3], [0]]) # gives shape=[5, 10, 4, 50]
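Since the answer points to NumPy's tensordot as the model, the contraction can be checked there directly. The sketch below uses random data with the question's example sizes and compares tensordot with an equivalent einsum expression and with NumPy's broadcasting matmul; the variable names are chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10, 4, 100))
W = rng.normal(size=(100, 50))

# Contract axis 3 of X with axis 0 of W.
Y_tensordot = np.tensordot(X, W, axes=[[3], [0]])

# The same contraction written with explicit index labels.
Y_einsum = np.einsum("abcd,de->abce", X, W)

# NumPy's matmul broadcasts the 2-D W over the leading axes of X.
Y_matmul = np.matmul(X, W)

print(Y_tensordot.shape)
print(np.allclose(Y_tensordot, Y_einsum), np.allclose(Y_tensordot, Y_matmul))
```

The average pooling step from the question would then be a mean over axis 2 of any of these results.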