tensorflow conv1d kernel size dimensionality error - tensorflow

When taking the one dimensional convolution of a one dimensional array, I receive an error which suggests my second dimension is not big enough.
Here is the overview of the relevant code:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=1, strides=1, padding='valid')
I am hoping to increase the kernel size to 3 such that neighbouring points also influence the output of each input node, however I get the following error:
ValueError: Negative dimension size caused by subtracting 3 from 1 for
'conv1d_4/convolution/Conv2D' (op: 'Conv2D') with input shapes:
[?,1,1,45], [1,3,45,64].
My guess is that tensorflow is expecting me to reshape my input into two dimensions so that some depth can be used to do the kernel multiplication. Question is why is this the case and what to expect for the layer behaviour based on the input dimensions

You need to add a Channel dimension as last dimension even if you only have one channel.
So this code works:
inputs_ = tf.placeholder(tf.float32 ,(None, 45), name='inputs')
x1 = tf.expand_dims(inputs_, axis=-1)
x1 = tf.layers.conv1d(x1, filters=64, kernel_size=3, strides=1, padding='valid')
So basically the error was caused because your tensor looked like having a width of 1, with 45 channels. TensorFlow was trying to convolve with a kernel size 3 along a size 1 dimension.

Related

TensorFlow 2d Pooling but only in 1 axis

I'm creating a SSL neural network and my input tensor is a NxM tensor where N is the length of the sound wave and M is the number of microphones. The actual size is roughly 14000x4
I need to pool, but I only want to pool the rows for each column (not the columns together). For example:
Pool(2)(tensor) --> tensor of size (N/2)xM
Is this possible without splitting the tensor into 4 tensors, preforming 4 separate Pool1D, then concatenating?
Pool1D gives dimensionality error
Pool2D reduces the number of rows and columns
Set the stride to 1 for the columns,
tf.keras.layers.MaxPooling2D(pool_size=(2, 2),strides=(2, 1), padding='same')
Example,
inputs = tf.random.normal(shape=(14000,4))
inputs = inputs[None,...,None]
max_pool_2d = tf.keras.layers.MaxPooling2D(pool_size=(2, 2),strides=(2, 1), padding='same')
max_pool_2d(inputs).shape
#[1, 7000, 4, 1]

ValueError: Dimensions must be equal in Tensorflow/Keras

My codes are as follow:
v = tf.Variable(initial_value=v, trainable=True)
v.shape is (1, 768)
In the model:
inputs_sents = keras.Input(shape=(50,3))
inputs_events = keras.Input(shape=(50,768))
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
But I got an error,
ValueError: Dimensions must be equal, but are 768 and 50 for
'{{node BatchMatMulV2_3}} =
BatchMatMulV2[T=DT_FLOAT,
adj_x=false,
adj_y=false](BatchMatMulV2_3/ReadVariableOp,
Transpose_3)' with input shapes: [1,768], [768,50,?]
I think it takes consideration of the batch? But how shall I deal with this?
v is a trainable vector (or 2d array with first dimension being 1), I want it to be trained in the training process.
PS: This is the result I got using the codes provided by the first answer, I think it is incorrect cause keras already takes consideration of the first batch dimension.
Plus, from the keras documentation,
shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; 'None' elements represent dimensions where the shape is not known.
https://keras.io/api/layers/core_layers/input/
Should I rewrite my codes without keras?
The shape of a batch is denoted by None:
import numpy as np
inputs_sents = keras.Input(shape=(None,1,3))
inputs_events = keras.Input(shape=(None,1,768))
v = np.ones(shape=(1,768), dtype=np.float32)
v = tf.Variable(initial_value=v, trainable=True)
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)

How does tf.keras.layers.Conv2DTranspose behave with stride and padding?

While a convolution layer in TensorFlow has a complete description https://www.tensorflow.org/api_guides/python/nn#Convolution, transposed convolution does not have one.
Although tf.keras.layers.Conv2DTranspose has a reference to https://arxiv.org/pdf/1603.07285.pdf, it is not complete.
Is there any documentation that describes how tf.keras.layers.Conv2DTranspose behaves?
Conv2DTranspose is often used as upsampling for an image/feature map. The code below use 1X1 filter kernel to show how the input is padded with zero. the code is for tensorflow 2.0, add enable_eager_execution() with tensorflow1.x
data = tf.ones([2,2],tf.float32,"input_data")
input_layer = tf.reshape(data, [-1, 2, 2, 1])
transpose2d = layers.Conv2DTranspose(1, (1, 1), kernel_initializer='ones', strides=(2, 2), padding='valid', use_bias=False)
x = transpose2d(input_layer)
print(x)
The input is
1,1
1,1
The x is
1,0,1,0
0,0,0,0
1,0,1,0
0,0,0,0
you can change the stride value to see the diffrence

How to calculate input_dim for a keras sequential model?

Keras Dense layer needs an input_dim or input_shape to be specified. What value do I put in there?
My input is a matrix of 1,000,000 rows and only 3 columns. My output is 1,600 classes.
What do I put there?
dimensionality of the inputs (1000000, 1600)
2 because it's a 2D matrix
input_dim is the number of dimensions of the features, in your case that is just 3. The equivalent notation for input_shape, which is an actual dimensional shape, is (3,)
In your case
lets assume x and y=target variable and are look like as follows after feature engineering
x.shape
(1000000, 3)
y.shape
((1000000, 1600)
# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=x.shape[1])) # Input layer
# now the model will take as input arrays of shape (*, 3)
# and output arrays of shape (*, 32)
...
...
model.add(Dense(y.shape[1],activation='softmax')) # Output layer
y.shape[1]= 1600, the number of output which is the number of classes you have, since you are dealing with Classification.
X = dataset.iloc[:, 3:13]
meaning the X parameter having all the rows and 3rd column till 12th column inclusive and 13th column exclusive.
We will also have a X0 parameter to be given to the neural network, so total
input layers becomes 10+1 = 11.
Dense(input_dim = 11, activation = 'relu', kernel_initializer = 'he_uniform')

No broadcasting for tf.matmul in TensorFlow

I have a problem with which I've been struggling. It is related to tf.matmul() and its absence of broadcasting.
I am aware of a similar issue on https://github.com/tensorflow/tensorflow/issues/216, but tf.batch_matmul() doesn't look like a solution for my case.
I need to encode my input data as a 4D tensor:
X = tf.placeholder(tf.float32, shape=(None, None, None, 100))
The first dimension is the size of a batch, the second the number of entries in the batch.
You can imagine each entry as a composition of a number of objects (third dimension). Finally, each object is described by a vector of 100 float values.
Note that I used None for the second and third dimensions because the actual sizes may change in each batch. However, for simplicity, let's shape the tensor with actual numbers:
X = tf.placeholder(tf.float32, shape=(5, 10, 4, 100))
These are the steps of my computation:
compute a function of each vector of 100 float values (e.g., linear function)
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
Y = tf.matmul(X, W)
problem: no broadcasting for tf.matmul() and no success using tf.batch_matmul()
expected shape of Y: (5, 10, 4, 50)
applying average pooling for each entry of the batch (over the objects of each entry):
Y_avg = tf.reduce_mean(Y, 2)
expected shape of Y_avg: (5, 10, 50)
I expected that tf.matmul() would have supported broadcasting. Then I found tf.batch_matmul(), but still it looks like doesn't apply to my case (e.g., W needs to have 3 dimensions at least, not clear why).
BTW, above I used a simple linear function (the weights of which are stored in W). But in my model I have a deep network instead. So, the more general problem I have is automatically computing a function for each slice of a tensor. This is why I expected that tf.matmul() would have had a broadcasting behavior (if so, maybe tf.batch_matmul() wouldn't even be necessary).
Look forward to learning from you!
Alessio
You could achieve that by reshaping X to shape [n, d], where d is the dimensionality of one single "instance" of computation (100 in your example) and n is the number of those instances in your multi-dimensional object (5*10*4=200 in your example). After reshaping, you can use tf.matmul and then reshape back to the desired shape. The fact that the first three dimensions can vary makes that little tricky, but you can use tf.shape to determine the actual shapes during run time. Finally, you can perform the second step of your computation, which should be a simple tf.reduce_mean over the respective dimension. All in all, it would look like this:
X = tf.placeholder(tf.float32, shape=(None, None, None, 100))
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
X_ = tf.reshape(X, [-1, 100])
Y_ = tf.matmul(X_, W)
X_shape = tf.gather(tf.shape(X), [0,1,2]) # Extract the first three dimensions
target_shape = tf.concat(0, [X_shape, [50]])
Y = tf.reshape(Y_, target_shape)
Y_avg = tf.reduce_mean(Y, 2)
As the renamed title of the GitHub issue you linked suggests, you should use tf.tensordot(). It enables contraction of axes pairs between two tensors, in line with Numpy's tensordot(). For your case:
X = tf.placeholder(tf.float32, shape=(5, 10, 4, 100))
W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1))
Y = tf.tensordot(X, W, [[3], [0]]) # gives shape=[5, 10, 4, 50]