ValueError: Dimensions must be equal in Tensorflow/Keras - tensorflow

My code is as follows:
v = tf.Variable(initial_value=v, trainable=True)
v.shape is (1, 768)
In the model:
inputs_sents = keras.Input(shape=(50,3))
inputs_events = keras.Input(shape=(50,768))
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
But I got an error:
ValueError: Dimensions must be equal, but are 768 and 50 for
'{{node BatchMatMulV2_3}} =
BatchMatMulV2[T=DT_FLOAT,
adj_x=false,
adj_y=false](BatchMatMulV2_3/ReadVariableOp,
Transpose_3)' with input shapes: [1,768], [768,50,?]
I think this happens because of the batch dimension? But how shall I deal with this?
v is a trainable vector (or a 2-D array whose first dimension is 1); I want it to be trained during the training process.
PS: This is the result I got using the code provided by the first answer; I think it is incorrect because Keras already takes the batch dimension into account.
Also, from the Keras documentation:
shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; 'None' elements represent dimensions where the shape is not known.
https://keras.io/api/layers/core_layers/input/
Should I rewrite my code without Keras?

The shape of a batch is denoted by None:
import numpy as np
inputs_sents = keras.Input(shape=(None,1,3))
inputs_events = keras.Input(shape=(None,1,768))
v = np.ones(shape=(1,768), dtype=np.float32)
v = tf.Variable(initial_value=v, trainable=True)
x_1 = tf.matmul(v,tf.transpose(inputs_events))
x_2 = tf.matmul(x_1,inputs_sents)
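For reference, since keras.Input already implies a leading batch dimension, one way to keep that dimension intact is to transpose only the last two axes of inputs_events. This is a minimal sketch based on the shapes in the question (v is initialized with ones here just for illustration), not a verified fix:
import tensorflow as tf
from tensorflow import keras

inputs_sents = keras.Input(shape=(50, 3))     # (batch, 50, 3)
inputs_events = keras.Input(shape=(50, 768))  # (batch, 50, 768)
v = tf.Variable(initial_value=tf.ones((1, 768)), trainable=True)

# transpose only the time/feature axes; the batch axis stays in front
events_t = tf.transpose(inputs_events, perm=[0, 2, 1])  # (batch, 768, 50)
x_1 = tf.matmul(v, events_t)                            # v broadcasts -> (batch, 1, 50)
x_2 = tf.matmul(x_1, inputs_sents)                      # (batch, 1, 3)
Note that for Keras to actually track v as a trainable weight of the model, it usually has to be created inside a custom layer (e.g. via add_weight) rather than as a free-standing tf.Variable.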

Related

Two input layers for LSTM Neural Network?

I am now building a neural network, and I am facing the task of adding another input layer (since until now I just needed one).
In particular, this was the code previously:
###...
if(self.net_embedding==0):
    l_input = Input(shape=self.win_size, dtype='int32', name='input_act')
    emb_input = Embedding(output_dim=params["output_dim_embedding"], input_dim=unique_events + 1, input_length=self.win_size)(l_input)
    toBePassed=emb_input
elif(self.net_embedding==1):
    self.getWord2VecEmbeddings(params['word2vec_size'])
    X_train=self.encodePrefixes(params['word2vec_size'],X_train)
    l_input = Input(shape = (self.win_size, params['word2vec_size']), name = 'input_act')
    toBePassed=l_input

l1 = LSTM(params["shared_lstm_size"],return_sequences=True, kernel_initializer='glorot_uniform',dropout=params['dropout'])(toBePassed)
l1 = BatchNormalization()(l1)
#and so on with the rest of the layers...
The input of the model (X_train) was just an array of arrays (with size = self.win_size) of integers (e.g. [[0 1 2 3] [1 2 3 4]...] if self.win_size = 4), where the integers represent categorical elements.
As you can see, I also have two types of embeddings for this input:
Embedding layer
Word2Vec encoding
Now, I need to add another input to the net, which is also an array of arrays (with size = self.win_size again) of integers (e.g. [[0 123 334 2212][123 334 2212 4888]...]), but this time I don't need to apply any embedding (I think) because the elements here are not categorical (they represent elapsed time in seconds).
I tried by simply changing the net to:
#...
if(self.net_embedding==0):
    l_input = Input(shape=self.win_size, dtype='int32', name='input_act')
    emb_input = Embedding(output_dim=params["output_dim_embedding"], input_dim=unique_events + 1, input_length=self.win_size)(l_input)
    toBePassed=emb_input
elif(self.net_embedding==1):
    self.getWord2VecEmbeddings(params['word2vec_size'])
    X_train=self.encodePrefixes(params['word2vec_size'],X_train)
    l_input = Input(shape = (self.win_size, params['word2vec_size']), name = 'input_act')
    toBePassed=l_input

elapsed_time_input = Input(shape=self.win_size, name='input_time')
input_concat = Concatenate(axis=1)([toBePassed, elapsed_time_input])

l1 = LSTM(params["shared_lstm_size"],return_sequences=True, kernel_initializer='glorot_uniform',dropout=params['dropout'])(input_concat)
l1 = BatchNormalization()(l1)
#and so on with other layers...
but I get the error:
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 4, 12), (None, 4)]
Do you have any solution for this? Any kind of help would be really appreciated, since I have a deadline in a few days and I've been banging my head against this for so long now! Thanks :)
There are two problems with your approach.
First, inputs to LSTM should have a shape of (batch_size, num_steps, num_feats), yet your elapsed_time_input has shape (None, 4). You need to expand its dimension to get the proper shape (None, 4, 1).
elapsed_time_input = tf.keras.layers.Reshape((-1, 1))(elapsed_time_input)
or
elapsed_time_input = tf.expand_dims(elapsed_time_input, axis=-1)
With this, "elapsed time in seconds" will be seen as just another feature of a timestep.
Secondly, you'll want to concatenate the two inputs in the feature dimension (not the timestep dimension).
input_concat = Concatenate(axis=-1)([toBePassed, elapsed_time_input])
or
input_concat = Concatenate(axis=2)([toBePassed, elapsed_time_input])
After this, you'll get a Keras tensor with a shape of (None, 4, 13). It represents a batch of time series, each having 4 timesteps and 13 features per step (12 original features + elapsed time in seconds for each step).
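Putting both changes together, the second branch of the model from the question might look roughly like this. It is only a sketch that reuses the names from the question (params, self.win_size, toBePassed, etc. are assumed to be defined as above):
from tensorflow.keras.layers import Input, Reshape, Concatenate, LSTM, BatchNormalization

# second input: elapsed time in seconds, one value per timestep
elapsed_time_input = Input(shape=(self.win_size,), name='input_time')

# give it an explicit feature dimension: (None, win_size) -> (None, win_size, 1)
elapsed_time_3d = Reshape((-1, 1))(elapsed_time_input)

# concatenate along the feature axis, not the timestep axis
input_concat = Concatenate(axis=-1)([toBePassed, elapsed_time_3d])  # (None, win_size, 13)

l1 = LSTM(params["shared_lstm_size"], return_sequences=True,
          kernel_initializer='glorot_uniform', dropout=params['dropout'])(input_concat)
l1 = BatchNormalization()(l1)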

How to correctly ignore padded or missing timesteps at decoding time in multi-feature sequences with LSTM autoencoder

I am trying to learn a latent representation for a text sequence (multiple features (3)) by doing reconstruction using an autoencoder. As some of the sequences are shorter than the maximum pad length, i.e. the number of time steps I am considering (seq_length=15), I am not sure whether the reconstruction will learn to ignore the padded timesteps when calculating loss or accuracies.
I followed suggestions from this answer to crop the outputs, but my losses are NaN and so are several of the accuracies.
input1 = keras.Input(shape=(seq_length,),name='input_1')
input2 = keras.Input(shape=(seq_length,),name='input_2')
input3 = keras.Input(shape=(seq_length,),name='input_3')
input1_emb = layers.Embedding(70,32,input_length=seq_length,mask_zero=True)(input1)
input2_emb = layers.Embedding(462,192,input_length=seq_length,mask_zero=True)(input2)
input3_emb = layers.Embedding(84,36,input_length=seq_length,mask_zero=True)(input3)
merged = layers.Concatenate()([input1_emb, input2_emb,input3_emb])
activ_func = 'tanh'
encoded = layers.LSTM(120,activation=activ_func,input_shape=(seq_length,),return_sequences=True)(merged) #
encoded = layers.LSTM(60,activation=activ_func,return_sequences=True)(encoded)
encoded = layers.LSTM(15,activation=activ_func)(encoded)
# Decoder reconstruct inputs
decoded1 = layers.RepeatVector(seq_length)(encoded)
decoded1 = layers.LSTM(60, activation= activ_func , return_sequences=True)(decoded1)
decoded1 = layers.LSTM(120, activation= activ_func , return_sequences=True,name='decoder1_last')(decoded1)
Decoder one has an output shape of (None, 15, 120).
input_copy_1 = layers.TimeDistributed(layers.Dense(70, activation='softmax'))(decoded1)
input_copy_2 = layers.TimeDistributed(layers.Dense(462, activation='softmax'))(decoded1)
input_copy_3 = layers.TimeDistributed(layers.Dense(84, activation='softmax'))(decoded1)
For each output, I am trying to crop the 0-padded timesteps as suggested by this answer. padding has 0 where the actual input was missing (was zero due to padding) and 1 otherwise.
#tf.function
def cropOutputs(x):
    # x[0] is softmax of respective feature (time distributed) on top of decoder
    # x[1] is the actual input feature
    padding = tf.cast(tf.not_equal(x[1][1], 0), dtype=tf.keras.backend.floatx())
    print(padding)
    return x[0] * tf.tile(tf.expand_dims(padding, axis=-1), tf.constant([1, x[0].shape[2]], tf.int32))
Applying the crop function to all three outputs:
input_copy_1 = layers.Lambda(cropOutputs, name='input_copy_1', output_shape=(None, 15, 70))([input_copy_1,input1])
input_copy_2 = layers.Lambda(cropOutputs, name='input_copy_2', output_shape=(None, 15, 462))([input_copy_2,input2])
input_copy_3 = layers.Lambda(cropOutputs, name='input_copy_3', output_shape=(None, 15, 84))([input_copy_3,input3])
My logic is to crop the timesteps of each feature (all 3 features of a sequence have the same length, meaning they miss timesteps together). But each timestep has a softmax applied over its feature size (70, 462, 84), so I have to zero out a timestep by building a multi-dimensional mask of zeros or ones matching that feature size from the padding mask, and multiplying the respective softmax representation by this mask.
I am not sure whether I am doing this right, as I get NaN losses for these inputs, as well as for the other accuracies I am learning jointly with this task (it only happens with this cropping).
If it helps someone, I ended up masking the padded entries out of the loss directly (taking some Keras code pointers from these answers).
#tf.function
def masked_cc_loss(y_true, y_pred):
    mask = tf.keras.backend.all(tf.equal(y_true, masked_val_hotencoded), axis=-1)
    mask = 1 - tf.cast(mask, tf.keras.backend.floatx())
    # per-timestep losses (reduction disabled) so the mask can be applied element-wise
    cce = tf.keras.losses.CategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
    loss = cce(y_true, y_pred) * mask
    return tf.keras.backend.sum(loss) / tf.keras.backend.sum(mask)  # averaging by the number of unmasked entries
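For completeness, a sketch of how such a masked loss could be wired in when compiling; it assumes the model is built from the three inputs and the three TimeDistributed softmax outputs defined above, with masked_val_hotencoded defined elsewhere:
from tensorflow import keras

model = keras.Model(inputs=[input1, input2, input3],
                    outputs=[input_copy_1, input_copy_2, input_copy_3])

# the same masked loss is applied to each reconstruction output
model.compile(optimizer='adam', loss=masked_cc_loss, metrics=['accuracy'])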

How to create Keras ZeroTensor of specific shape

I am a total beginner with tensorflow.keras and I am wondering how I could create a constant zero tensor of a specific shape.
For example with this:
zeros = tf.keras.backend.zeros((someTensor.shape[0], someTensor.shape[1], someTensor.shape[2], channels))
concat = tf.keras.backend.concatenate([someTensor, zeros], axis=3)
The operation tf.keras.backend.zeros fails with:
ValueError: Cannot convert a partially known TensorShape to a Tensor
I guess that's because the batch size is unknown during graph building. How can I create a zero tensor, or any other constant tensor, when I don't know the batch size at that moment? Or is there some kind of unknown(?) value that I can specify?
The problem is that you are building the shape from a tuple that mixes tensors and integers.
You should:
from tensorflow.keras import backend as K

shape = K.shape(someTensor)
ch = K.constant([channels], dtype='int32')  # keep the same integer dtype as K.shape
newShape = K.concatenate([shape[:3], ch])
zeros = K.zeros(newShape)
Now, if this doesn't work because of unknown shapes, a dirty workaround would be:
#if someTensor is 3D
zeros = K.zeros_like(someTensor)
zeros = K.stack([zeros] * channels, axis=-1)
#if someTensor is 4D
zeros = K.zeros_like(someTensor[:,:,:,0])
zeros = K.stack([zeros]*channels, axis=-1)
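In TF 2.x the same idea can also be expressed with dynamic shapes directly. This is just a sketch (the helper name is made up), assuming someTensor is 4-D with an unknown batch dimension:
import tensorflow as tf

def zeros_with_channels(some_tensor, channels):
    # build [dim0, dim1, dim2, channels] as a dynamic shape tensor
    dyn_shape = tf.concat([tf.shape(some_tensor)[:3], [channels]], axis=0)
    return tf.zeros(dyn_shape, dtype=some_tensor.dtype)

x = tf.random.normal([2, 8, 8, 3])
zeros = zeros_with_channels(x, channels=5)  # shape (2, 8, 8, 5)
concat = tf.concat([x, zeros], axis=3)      # shape (2, 8, 8, 8)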

Variable batch size in tensorflow and CNN

I want to feed a 1-D CNN a sequence of fixed length and have it make a prediction (regression), but I want a variable batch size during training. The tutorials are not really helpful.
In my input layer I have something like this:
input = tf.placeholder(tf.float32, [None, sequence_length], name="input")
y = tf.placeholder(tf.float32, [None, 1], name="y")
so I assume the None dimension can be a variable batch size of any number, the current input shape is batch_size x sequence_length, and I am supposed to feed the network a 2-D np array with dimensions any x sequence_length.
tf.nn.conv1d expects 3-D input. Since my input is a single channel, that is 1 np array of sequence_length observations, the input I need to feed the CNN should be batch_size x sequence_length x 1. If, on the other hand, I had 2 different sequences that I combine to predict a single value in the end, it would have been batch_size x sequence_length x 2, and I would also need to concatenate the 2 different channels. So in my case I need
input = tf.expand_dims(input, -1)
and then the filter follows the same pattern:
filter_size = 5
channel_size = 1
num_filters = 10
filter_shape = [filter_size, channel_size, num_filters]
filters = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="filters")
tf.nn.conv1d(value=input, filters=filters, stride=1)
After that I add an FC layer, but the network isn't able to learn anything, not even a basic function such as sin(x). Does the code above look correct?
Also, how can I do max pooling?
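As for max pooling: in the same TF1-style graph, one option is tf.layers.max_pooling1d on the conv output. This is only a sketch (padding="SAME" is an assumption, and it is not a verified fix for the learning problem):
# assuming input was already expanded to [batch, sequence_length, 1]
# and filters is the [filter_size, channel_size, num_filters] variable above
conv = tf.nn.conv1d(value=input, filters=filters, stride=1, padding="SAME")
conv = tf.nn.relu(conv)  # [batch, sequence_length, num_filters]

# 1-D max pooling over the time axis; pool_size=2, strides=2 halves the sequence length
pooled = tf.layers.max_pooling1d(conv, pool_size=2, strides=2, padding="same")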

Tensorflow ValueError: Dimension 0 in both shapes must be equal, but are 344 and 2. From merging shape 0 with other shapes

I am currently working on my bachelor thesis. The goal is to train an LSTM to separate instruments in a given piece of music. To do this, I input a signal which gets processed by a Short-Time Fourier Transform, which outputs an array of windows with complex values for every frequency. This array is then put into a placeholder. I am currently unsure what the input dimension should be in [Batch Size, Sequence Length, Input Dimension]; I just put in 1. The following is an excerpt of my code.
num_hidden, numFreqs = 1024, 1024
numWindows=344
input_data = tf.placeholder(tf.float32,[numWindows,numFreqs,1])
target_data = tf.placeholder(tf.float32,[numWindows,numFreqs,1])
cell = tf.nn.rnn_cell.LSTMCell(num_hidden, forget_bias=1.0)
val = tf.nn.dynamic_rnn(cell, input_data, dtype=tf.float32)
weight = tf.Variable(tf.truncated_normal([numFreqs,num_hidden]))
bias = tf.Variable(tf.truncated_normal([num_hidden]))
prediction = tf.add(tf.mul(val,weight),bias)
cost = tf.reduce_mean(tf.square(prediction - target_data))
optimizer = tf.train.AdamOptimizer(learning_rate)
minimize = optimizer.minimize(cost)
mistakes = tf.not_equal(tf.argmax(target_data, 1), tf.argmax(prediction, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))
I get this error in the prediction line when I run it:
ValueError: Dimension 0 in both shapes must be equal, but are 344 and 2
From merging shape 0 with other shapes.
numWindows has the value 344, so the input_data placeholder is affected. But I cannot find the other shape that is causing the problem. The weight and bias variables have shapes [1024,1024] and [1024], so I was thinking they cannot be the problem. If you need more information, I am happy to provide it.
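One observation that may be relevant (not a confirmed fix): tf.nn.dynamic_rnn returns a pair (outputs, state), and the LSTM state is itself a 2-element tuple, which could be where the 2 in the error message comes from. A sketch of unpacking the return value, reusing the names from the excerpt:
# dynamic_rnn returns (outputs, final_state); only outputs should go into the projection
outputs, state = tf.nn.dynamic_rnn(cell, input_data, dtype=tf.float32)

# outputs has shape [numWindows, numFreqs, num_hidden] with the placeholder layout above
prediction = tf.add(tf.multiply(outputs, weight), bias)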