I have a SimpleRNN model built with the code below:
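from keras.layers import Input, Embedding, Concatenate, SimpleRNN, Dense
from keras.models import Model

# Imports, function name, and signature are assumed; the original showed only the body.
def build_model(window_size, nb_points, tm_length, emb_size1, emb_size2, rnn_size):
    # Two integer sequences of length window_size
    s_input = Input((window_size, ), dtype='int32', name='S')
    t_input = Input((window_size, ), dtype='int32', name='T')
    emb1 = Embedding(nb_points + 1, emb_size1)
    emb2 = Embedding(tm_length + 1, emb_size2)
    xe = emb1(s_input)
    he = emb2(t_input)
    # Join both embeddings along the feature axis, then run the RNN
    x = Concatenate()([xe, he])
    x = SimpleRNN(rnn_size)(x)
    y = Dense(nb_points, activation='softmax')(x)
    model = Model([s_input, t_input], y)
    model.compile('adadelta', 'categorical_crossentropy', metrics=['accuracy'])
    return model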
When I build the model and print its summary, I get:
Layer (type) Output Shape Param # Connected to
==================================================================================================
S (InputLayer) (None, 2) 0
__________________________________________________________________________________________________
T (InputLayer) (None, 2) 0
__________________________________________________________________________________________________
embedding_32 (Embedding) (None, 2, 100) 500 S[0][0]
__________________________________________________________________________________________________
embedding_33 (Embedding) (None, 2, 6) 150 T[0][0]
__________________________________________________________________________________________________
concatenate_15 (Concatenate) (None, 2, 106) 0 embedding_32[0][0]
embedding_33[0][0]
__________________________________________________________________________________________________
simple_rnn_10 (SimpleRNN) (None, 20) 2540 concatenate_15[0][0]
__________________________________________________________________________________________________
dense_4 (Dense) (None, 4) 84 simple_rnn_10[0][0]
==================================================================================================
Total params: 3,274
Trainable params: 3,274
Non-trainable params: 0
_________________________________________________________________________________________________
However, no loss or accuracy is reported for each epoch. The only output is something like this:
Train on 40 samples, validate on 11 samples
Epoch 1/100
Processing user 1.
Can anyone help me with this? The per-epoch results are not printed.
I am trying to use the following custom accuracy function in my model:
def acc_fn(pred, gt):
    pred_occupy = pred[..., 1] >= config.IOU_THRESHOLD
    I1 = tf.reduce_sum(tf.cast(tf.math.logical_and(pred_occupy, tf.cast(gt, tf.bool)), tf.float32))
    I2 = tf.reduce_sum(tf.cast(tf.math.logical_or(pred_occupy, tf.cast(gt, tf.bool)), tf.float32))
    IoU = tf.math.divide(I1, I2, name = "IoU")
    tf.summary.scalar("IoU", IoU)
    return IoU
It is passed to Keras like this:
model.compile(loss=loss_fn, #categorical crossentropy
optimizer=keras.optimizers.Adam(learning_rate=config.LR),
metrics=[acc_fn])
I am getting an incompatible shapes error while fitting the model:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,128,128,128,2] vs. [1,128,128,128]
My model outputs a one-hot encoded tensor of shape [1,128,128,128,2], and the ground truth array is one-hot encoded as well and has the same shape!
The last few layers of my model are:
add_10 (Add) (None, 128, 128, 128 0 conv3d_11[0][0]
conv3d_13[0][0]
__________________________________________________________________________________________________
conv3d_14 (Conv3D) (None, 128, 128, 128 433 add_10[0][0]
__________________________________________________________________________________________________
lambda_2 (Lambda) (None, 128, 128, 128 0 conv3d_14[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 128, 128, 128 0 conv3d_14[0][0]
lambda_2[0][0]
__________________________________________________________________________________________________
softmax_1 (Softmax) (None, 128, 128, 128 0 concatenate_1[0][0]
==================================================================================================
Total params: 176,458,081
Trainable params: 176,451,105
Non-trainable params: 6,976
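One possible reading of the error, offered as a hedged sketch and not a confirmed diagnosis: `pred[..., 1]` drops the last axis and has shape [1,128,128,128], while `gt` keeps its one-hot axis and has shape [1,128,128,128,2], so the logical operations cannot combine them. Keras also calls metric functions as `fn(y_true, y_pred)`, so with the signature `acc_fn(pred, gt)` the two arguments arrive swapped. The NumPy function below illustrates slicing both arrays the same way. The name `iou_metric` and the 0.5 threshold are assumptions standing in for `acc_fn` and `config.IOU_THRESHOLD`; the same `[..., 1]` slice works on TensorFlow tensors.

```python
import numpy as np

def iou_metric(y_true, y_pred, threshold=0.5):
    """IoU of the 'occupied' class (channel 1) for one-hot arrays of shape (..., 2)."""
    pred_occupy = y_pred[..., 1] >= threshold     # last axis dropped
    true_occupy = y_true[..., 1].astype(bool)     # slice the target the same way
    intersection = np.logical_and(pred_occupy, true_occupy).sum()
    union = np.logical_or(pred_occupy, true_occupy).sum()
    # Guard against an empty union (no occupied voxel in either array).
    return float(intersection) / float(union) if union > 0 else 0.0

# Tiny stand-in for the [1,128,128,128,2] volumes: 4 voxels, 2 classes.
y_true = np.array([[[0, 1], [0, 1], [1, 0], [1, 0]]], dtype=np.float32)
y_pred = np.array([[[0.2, 0.8], [0.7, 0.3], [0.4, 0.6], [0.9, 0.1]]], dtype=np.float32)
print(iou_metric(y_true, y_pred))  # 1 shared voxel out of 3 in the union
```

Returning 0.0 for an empty union is a design choice made here to avoid a division by zero; the original function does not handle that case.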
I'm trying to apply average pooling at each time step of the LSTM output. My architecture is below:
X_input = tf.keras.layers.Input(shape=(64,35))
X= tf.keras.layers.LSTM(512,activation="tanh",return_sequences=True,kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X_input)
X= tf.keras.layers.LSTM(256,activation="tanh",return_sequences=True,kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.GlobalAvgPool1D()(X)
X = tf.keras.layers.Dense(128,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.Dense(64,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
X = tf.keras.layers.Dense(32,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
# X = tf.keras.layers.Dense(16,activation="relu",kernel_initializer=tf.keras.initializers.he_uniform(seed=45),kernel_regularizer=tf.keras.regularizers.l2(0.1))(X)
output_layer = tf.keras.layers.Dense(10,activation='softmax', kernel_initializer=tf.keras.initializers.he_uniform(seed=45))(X)
model2 = tf.keras.Model(inputs = X_input,outputs = output_layer)
I want to take the average at each time step (across the units), not across time for each unit.
For example, I currently get the shape (None, 256), but I want to get the shape (None, 64) from the global average pooling layer. What do I need to do for that?
I am not sure this is the most efficient way, but you can try this:
X = tf.keras.layers.Reshape(target_shape=(64,256,1))(X)
X = tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalAveragePooling1D())(X)
X = tf.keras.layers.Reshape(target_shape=(64,))(X)
instead of:
X = tf.keras.layers.GlobalAvgPool1D()(X)
The summary is now:
Model: "functional_13"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_14 (InputLayer) [(None, 64, 35)] 0
_________________________________________________________________
lstm_26 (LSTM) (None, 64, 512) 1122304
_________________________________________________________________
lstm_27 (LSTM) (None, 64, 256) 787456
_________________________________________________________________
reshape_2 (Reshape) (None, 64, 256, 1) 0
_________________________________________________________________
time_distributed_8 (TimeDist (None, 64, 1) 0
_________________________________________________________________
reshape_3 (Reshape) (None, 64) 0
_________________________________________________________________
dense_61 (Dense) (None, 128) 8320
_________________________________________________________________
dense_62 (Dense) (None, 64) 8256
_________________________________________________________________
dense_63 (Dense) (None, 32) 2080
_________________________________________________________________
dense_64 (Dense) (None, 10) 330
=================================================================
Total params: 1,928,746
Trainable params: 1,928,746
Non-trainable params: 0
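As a possible alternative to the reshape approach, averaging over the last (feature) axis directly yields one value per time step. The NumPy sketch below only illustrates the axis semantics on a random array with the same (batch, 64, 256) layout. In Keras the same reduction could presumably be wrapped in a `Lambda` layer around `tf.reduce_mean(x, axis=-1)`; that is an assumption and not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)
lstm_out = rng.normal(size=(8, 64, 256))   # (batch, time steps, units)

# What GlobalAveragePooling1D computes: mean over the time axis.
per_unit = lstm_out.mean(axis=1)           # shape (8, 256)

# What the question asks for: mean over the units at each time step.
per_step = lstm_out.mean(axis=-1)          # shape (8, 64)

# The reshape / TimeDistributed route from the answer, written in NumPy.
via_reshape = lstm_out.reshape(8, 64, 256, 1).mean(axis=2).reshape(8, 64)

print(per_unit.shape, per_step.shape, np.allclose(per_step, via_reshape))
```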
I am trying to train a neural network on a Semantic Role Labeling task (a text classification task). The dataset consists of sentences, and the network has to be trained to predict a class for each word. Apart from the embedding matrix, I am also using other features (meta_data_features). The number of classes in Y_train is 61. The dataset contains 3306 sentences, and MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
300,
embeddings_initializer=Constant(embedding_matrix),
input_length=MAX_LEN,
trainable=False)
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['sparse_categorical_accuracy'])
print(model.summary())
The model summary is:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
__________________________________________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
__________________________________________________________________________________________________
meta_input (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
meta_input[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
==================================================================================================
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
__________________________________________________________________________________________________
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train consists of [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get the error, "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody could help me resolve this error. Thanks!
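The error message suggests a mismatch between one prediction per sentence and one label per word: `SimpleRNN(256)` without `return_sequences=True` keeps only the final hidden state, so the model output is (None, 61) while `padded_verbs` is (3306, 67, 1). The NumPy sketch below is a hand-written toy recurrence that shows how the two settings differ in output shape. The function `simple_rnn` and its small sizes are illustrative assumptions, not Keras code. Getting per-word predictions in the actual model would likely also require joining the meta features per time step, which the question does not specify.

```python
import numpy as np

def simple_rnn(x, w_in, w_rec, bias, return_sequences=False):
    """Toy tanh recurrence over x of shape (batch, time, features)."""
    batch, steps, _ = x.shape
    h = np.zeros((batch, w_rec.shape[0]))
    states = []
    for t in range(steps):
        h = np.tanh(x[:, t, :] @ w_in + h @ w_rec + bias)
        states.append(h)
    # Either every hidden state (one per word) or only the last one.
    return np.stack(states, axis=1) if return_sequences else h

rng = np.random.default_rng(0)
batch, max_len, emb_dim, units = 4, 67, 8, 16   # small stand-ins for 3306 / 300 / 256
x = rng.normal(size=(batch, max_len, emb_dim))
w_in = rng.normal(size=(emb_dim, units)) * 0.1
w_rec = rng.normal(size=(units, units)) * 0.1
bias = np.zeros(units)

last_only = simple_rnn(x, w_in, w_rec, bias)                        # (4, 16)
per_word = simple_rnn(x, w_in, w_rec, bias, return_sequences=True)  # (4, 67, 16)
print(last_only.shape, per_word.shape)
```

With one hidden state per word, a Dense layer applied at each time step can produce an output of shape (batch, 67, 61), which matches targets of shape (batch, 67, 1) under sparse categorical cross-entropy.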
I want to feed only the RNN outputs at odd positions to the next RNN layer. How can I achieve that in TensorFlow?
I basically want to build the top layer in the following diagram, which halves the sequence length. The bottom layer is just a simple RNN.
Is this what you need?
import tensorflow as tf
from tensorflow.keras import layers, models
inp = layers.Input(shape=(10, 5))
out = layers.LSTM(50, return_sequences=True)(inp)
# Keep every other time step; the lambda works on its own argument x
out = layers.Lambda(lambda x: tf.stack(tf.unstack(x, axis=1)[::2], axis=1))(out)
out = layers.LSTM(50)(out)
out = layers.Dense(20)(out)
m = models.Model(inputs=inp, outputs=out)
m.summary()
You get the following model. You can see that the second LSTM only receives 5 of the 10 time steps (i.e. every other output of the previous layer).
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 10, 5)] 0
_________________________________________________________________
lstm_2 (LSTM) (None, 10, 50) 11200
_________________________________________________________________
lambda_1 (Lambda) (None, 5, 50) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 50) 20200
_________________________________________________________________
dense_1 (Dense) (None, 20) 1020
=================================================================
Total params: 32,420
Trainable params: 32,420
Non-trainable params: 0
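The selection of every other time step can also be written as a strided slice. The NumPy sketch below (array sizes taken from the example above, values random) checks that slicing with `[:, ::2]` gives the same result as unstacking along the time axis and restacking every second element. TensorFlow tensors support the same slice syntax.

```python
import numpy as np

rng = np.random.default_rng(0)
rnn_out = rng.normal(size=(3, 10, 50))   # (batch, time steps, units)

# Unstack along time, keep steps 0, 2, 4, ..., then stack again.
steps = [rnn_out[:, t, :] for t in range(rnn_out.shape[1])]
restacked = np.stack(steps[::2], axis=1)

# Equivalent strided slice on the time axis.
sliced = rnn_out[:, ::2, :]

print(sliced.shape, np.array_equal(restacked, sliced))
```

Using `[:, 1::2, :]` instead would start from the second time step, in case "odd positions" is meant in zero-based indexing.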
I have the following sequential model that works with variable length inputs:
m = Sequential()
m.add(Embedding(len(chars), 4, name="embedding"))
m.add(Bidirectional(LSTM(16, unit_forget_bias=True, name="lstm")))
m.add(Dense(len(chars),name="dense"))
m.add(Activation("softmax"))
m.summary()
This gives the following summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, None, 4) 204
_________________________________________________________________
bidirectional_2 (Bidirection (None, 32) 2688
_________________________________________________________________
dense (Dense) (None, 51) 1683
_________________________________________________________________
activation_2 (Activation) (None, 51) 0
=================================================================
Total params: 4,575
Trainable params: 4,575
Non-trainable params: 0
However, when I try to implement the same model with the functional API, whatever I try as the Input layer shape does not match the sequential model. Here is one of my attempts:
charinput = Input(shape=(4,),name="input",dtype='int32')
embedding = Embedding(len(chars), 4, name="embedding")(charinput)
lstm = Bidirectional(LSTM(16, unit_forget_bias=True, name="lstm"))(embedding)
dense = Dense(len(chars),name="dense")(lstm)
output = Activation("softmax")(dense)
And here is the summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) (None, 4) 0
_________________________________________________________________
embedding (Embedding) (None, 4, 4) 204
_________________________________________________________________
bidirectional_1 (Bidirection (None, 32) 2688
_________________________________________________________________
dense (Dense) (None, 51) 1683
_________________________________________________________________
activation_1 (Activation) (None, 51) 0
=================================================================
Total params: 4,575
Trainable params: 4,575
Non-trainable params: 0
Use shape=(None,) in the input layer. In your case:
charinput = Input(shape=(None,),name="input",dtype='int32')
Try adding the argument input_length=None to the Embedding layer.
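Both summaries report 4,575 parameters because none of the layer weights depend on the sequence length, which is why the time dimension can be left as `None`. The arithmetic below reproduces the counts from the summaries using the usual parameter formulas for Embedding, LSTM, and Dense layers. The vocabulary size 51 is inferred from the Dense output shape (None, 51), and the helper function names are illustrative.

```python
def embedding_params(vocab_size, emb_dim):
    return vocab_size * emb_dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    return input_dim * units + units

vocab_size = 51   # len(chars), inferred from the (None, 51) output shape
emb = embedding_params(vocab_size, 4)
bilstm = 2 * lstm_params(4, 16)            # forward and backward LSTM
dense = dense_params(2 * 16, vocab_size)   # Bidirectional concatenates both directions
total = emb + bilstm + dense
print(emb, bilstm, dense, total)  # 204 2688 1683 4575
```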