This is my first question on Stack Overflow, so I hope I'm explaining it well enough.
I'm trying to run a GCN followed by an LSTM in TensorFlow, as follows:
# features_sample is an array of shape (5000, 10)
def create_model(features_sample, G, network_size=(10,128,256,512), each_batch_size=400):
    feature_shape = features_sample.shape[1]
    whole_size = features_sample.shape[0]
    seq_len = int(whole_size/each_batch_size)
    X_in = Input(shape=(features_sample.shape[1],), name="input_feature_layer")
    H = X_in
    H = GraphConvolution(network_size[0], support, activation='relu', kernel_regularizer=l2(0.05))([H]+G)
    H = GraphConvolution(network_size[1], support, activation='relu', kernel_regularizer=l2(0.05))([H]+G)
    H = GraphConvolution(network_size[2], support, activation='relu', kernel_regularizer=l2(0.05))([H]+G)
    H = Lambda(lambda H: tf.reshape(H, shape=(-1, H.shape[-1]), name="reshape_layer"))(H)
    seq_list = []
    for i in range(seq_len):
        seq_list.append(tf.gather(params=H, indices=range(i*each_batch_size, (i+1)*each_batch_size)))
    sequence = tf.stack(seq_list, 0)
    L = LSTM(units=10)(sequence)
    Y = Dense(10)(L)
    Y = Dense(1)(Y)
    # Compile model
    model = Model(inputs=[X_in]+G, outputs=Y)
    return model
model = create_model(features_sample=X, G = G)
model.compile(loss=tf.keras.losses.MeanSquaredError(), optimizer=Adam(lr=0.01))
Everything worked before I added the LSTM.
Here is the model summary:
Model: "functional_19"
__________________________________________________________________________________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==========================================================================================================================================================================
input_feature_layer (InputLayer) [(None, 10)] 0
__________________________________________________________________________________________________________________________________________________________________________
input_8 (InputLayer) [(None, None, None)] 0
__________________________________________________________________________________________________________________________________________________________________________
graph_convolution_32 (GraphConvolution) (None, None, 10) 110 input_feature_layer[0][0]
input_8[0][0]
__________________________________________________________________________________________________________________________________________________________________________
graph_convolution_33 (GraphConvolution) (None, None, None, 128) 1408 graph_convolution_32[0][0]
input_8[0][0]
__________________________________________________________________________________________________________________________________________________________________________
graph_convolution_34 (GraphConvolution) (None, None, None, None, 256) 33024 graph_convolution_33[0][0]
input_8[0][0]
__________________________________________________________________________________________________________________________________________________________________________
lambda_10 (Lambda) (None, 256) 0 graph_convolution_34[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_96 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_97 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_98 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_99 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_100 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_101 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_102 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_103 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_104 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_105 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_106 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_GatherV2_107 (TensorFlowOpLayer) [(400, 256)] 0 lambda_10[0][0]
__________________________________________________________________________________________________________________________________________________________________________
tf_op_layer_stack_8 (TensorFlowOpLayer) [(12, 400, 256)] 0 tf_op_layer_GatherV2_96[0][0]
tf_op_layer_GatherV2_97[0][0]
tf_op_layer_GatherV2_98[0][0]
tf_op_layer_GatherV2_99[0][0]
tf_op_layer_GatherV2_100[0][0]
tf_op_layer_GatherV2_101[0][0]
tf_op_layer_GatherV2_102[0][0]
tf_op_layer_GatherV2_103[0][0]
tf_op_layer_GatherV2_104[0][0]
tf_op_layer_GatherV2_105[0][0]
tf_op_layer_GatherV2_106[0][0]
tf_op_layer_GatherV2_107[0][0]
__________________________________________________________________________________________________________________________________________________________________________
lstm_8 (LSTM) (12, 10) 10680 tf_op_layer_stack_8[0][0]
__________________________________________________________________________________________________________________________________________________________________________
dense_20 (Dense) (12, 10) 110 lstm_8[0][0]
__________________________________________________________________________________________________________________________________________________________________________
dense_21 (Dense) (12, 1) 11 dense_20[0][0]
==========================================================================================================================================================================
Total params: 45,343
Trainable params: 45,343
Non-trainable params: 0
__________________________________________________________________________________________________________________________________________________________________________
Then I fit the model:
model.fit(graph, y_train, sample_weight=train_mask,
          batch_size=A.shape[0], epochs=1, shuffle=False, verbose=0)
Here, graph is a 5000x5000 sparse matrix, y_train has shape (5000,), and A.shape[0] = 5000.
But I get the error:
InvalidArgumentError: Incompatible shapes: [5000] vs. [12]
[[node gradient_tape/mean_squared_error/weighted_loss/Mul (defined at \AppData\Local\Temp/ipykernel_21548/1117556248.py:14) ]] [Op:__inference_train_function_3097229]
Function call stack:
train_function
Changing to batch_size=12 results in another error:
InvalidArgumentError: Matrix size-incompatible: In[0]: [12,5000], In[1]: [12,10]
[[node functional_19/graph_convolution_32/MatMul (defined at \AppData\Local\Temp/ipykernel_21548/2292291052.py:63) ]] [Op:__inference_train_function_3097229]
Errors may have originated from an input operation.
Input Source operations connected to node functional_19/graph_convolution_32/MatMul:
IteratorGetNext (defined at \AppData\Local\Temp/ipykernel_21548/1117556248.py:14)
Function call stack:
train_function
But I need batch_size to be A.shape[0], because the GCN has to run on the whole graph.
Does anyone know how to handle this? Thank you in advance!
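One way to see where the [5000] vs. [12] mismatch comes from is to replay the slicing and stacking with plain NumPy arrays. This is only a shape sketch under the sizes stated in the question (5000 nodes, 256 GCN output features, each_batch_size=400). The zero array is a stand-in for the real GCN output, and the LSTM and Dense shapes are written out by hand from how Keras reads a (batch, timesteps, features) input rather than computed by a real layer.

```python
import numpy as np

num_nodes, gcn_features, each_batch_size = 5000, 256, 400

# Stand-in for the reshaped GCN output H, one row per graph node.
H = np.zeros((num_nodes, gcn_features))

# Same arithmetic as in create_model: int(5000 / 400) == 12,
# so the last 200 nodes are never gathered.
seq_len = int(num_nodes / each_batch_size)
chunks = [H[i * each_batch_size:(i + 1) * each_batch_size] for i in range(seq_len)]
sequence = np.stack(chunks, 0)
print(sequence.shape)  # (12, 400, 256)

# A Keras LSTM reads its input as (batch, timesteps, features), so this
# tensor is treated as 12 samples of 400 timesteps each. With units=10 and
# return_sequences left at its default, there is one output row per sample.
lstm_output_shape = (sequence.shape[0], 10)
dense_output_shape = (sequence.shape[0], 1)

targets_shape = (num_nodes,)
print(dense_output_shape[0], "predictions vs.", targets_shape[0], "targets")
```

Under these assumptions the model produces 12 predictions while y_train and sample_weight hold 5000 entries, which is consistent with the first error. Setting batch_size=12 then hands only 12 rows to a graph convolution that expects all 5000 nodes, which is consistent with the second.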
I am trying to use the following custom accuracy function in my model:
def acc_fn(pred, gt):
    pred_occupy = pred[..., 1] >= config.IOU_THRESHOLD
    I1 = tf.reduce_sum(tf.cast(tf.math.logical_and(pred_occupy, tf.cast(gt, tf.bool)), tf.float32))
    I2 = tf.reduce_sum(tf.cast(tf.math.logical_or(pred_occupy, tf.cast(gt, tf.bool)), tf.float32))
    IoU = tf.math.divide(I1, I2, name = "IoU")
    tf.summary.scalar("IoU", IoU)
    return IoU
It is invoked in Keras like this:
model.compile(loss=loss_fn, #categorical crossentropy
              optimizer=keras.optimizers.Adam(learning_rate=config.LR),
              metrics=[acc_fn])
I get an incompatible shapes error while fitting the model:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,128,128,128,2] vs. [1,128,128,128]
My model outputs a one-hot encoded tensor of shape [1,128,128,128,2], and the ground truth array is one-hot encoded as well and has the same shape.
The last couple of layers in my model:
add_10 (Add) (None, 128, 128, 128 0 conv3d_11[0][0]
conv3d_13[0][0]
__________________________________________________________________________________________________
conv3d_14 (Conv3D) (None, 128, 128, 128 433 add_10[0][0]
__________________________________________________________________________________________________
lambda_2 (Lambda) (None, 128, 128, 128 0 conv3d_14[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 128, 128, 128 0 conv3d_14[0][0]
lambda_2[0][0]
__________________________________________________________________________________________________
softmax_1 (Softmax) (None, 128, 128, 128 0 concatenate_1[0][0]
==================================================================================================
Total params: 176,458,081
Trainable params: 176,451,105
Non-trainable params: 6,976
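Two things in acc_fn seem worth checking, sketched here in NumPy rather than TensorFlow so the shapes are easy to inspect. First, Keras calls a metric as fn(y_true, y_pred), so the first parameter receives the ground truth, not the prediction. Second, slicing only one argument with [..., 1] leaves one tensor of shape [1,128,128,128] and one of shape [1,128,128,128,2], which cannot be broadcast together. The name iou_metric, the default threshold of 0.5, and the tiny arrays are hypothetical stand-ins, not taken from the question.

```python
import numpy as np

def iou_metric(y_true, y_pred, threshold=0.5):
    """IoU of the foreground channel (index 1) for one-hot encoded volumes."""
    # Slice the same channel from BOTH tensors so the shapes agree.
    pred_occupy = y_pred[..., 1] >= threshold
    true_occupy = y_true[..., 1].astype(bool)
    intersection = np.logical_and(pred_occupy, true_occupy).sum()
    union = np.logical_or(pred_occupy, true_occupy).sum()
    return intersection / union

# Tiny stand-in for a [1, 128, 128, 128, 2] volume: 4 voxels, 2 classes.
y_true = np.array([[[0, 1], [0, 1], [1, 0], [1, 0]]], dtype=np.float32)
y_pred = np.array([[[0.2, 0.8], [0.7, 0.3], [0.4, 0.6], [0.9, 0.1]]],
                  dtype=np.float32)

iou = iou_metric(y_true, y_pred)
print(iou)
```

Here one voxel is foreground in both arrays and three voxels are foreground in at least one of them, so the result is 1/3. The sketch does not guard against an empty union; a real metric would need to decide what to return in that case.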
I have the following model written with the Sequential API:
config = {
    'learning_rate': 0.001,
    'lstm_neurons':32,
    'lstm_activation':'tanh',
    'dropout_rate': 0.08,
    'batch_size': 128,
    'dense_layers':[
        {'neurons': 32, 'activation': 'relu'},
        {'neurons': 32, 'activation': 'relu'},
    ]
}
def get_model(num_features, output_size):
    opt = Adam(learning_rate=0.001)
    model = Sequential()
    model.add(Input(shape=[None,num_features], dtype=tf.float32, ragged=True))
    model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation']))
    model.add(BatchNormalization())
    if 'dropout_rate' in config:
        model.add(Dropout(config['dropout_rate']))
    for layer in config['dense_layers']:
        model.add(Dense(layer['neurons'], activation=layer['activation']))
        model.add(BatchNormalization())
        if 'dropout_rate' in layer:
            model.add(Dropout(layer['dropout_rate']))
    model.add(Dense(output_size, activation='sigmoid'))
    model.compile(loss='mse', optimizer=opt, metrics=['mse'])
    print(model.summary())
    return model
I'm using a distributed training framework that requires model subclassing, so I need to convert this model to the subclassing API.
I've looked at the docs but couldn't figure out how to do it.
Here is one equivalent subclassed implementation, though I haven't tested it.
import tensorflow as tf
# your config
config = {
    'learning_rate': 0.001,
    'lstm_neurons':32,
    'lstm_activation':'tanh',
    'dropout_rate': 0.08,
    'batch_size': 128,
    'dense_layers':[
        {'neurons': 32, 'activation': 'relu'},
        {'neurons': 32, 'activation': 'relu'},
    ]
}
# Subclassed API Model
class MySubClassed(tf.keras.Model):
    def __init__(self, output_size):
        super(MySubClassed, self).__init__()
        self.lstm = tf.keras.layers.LSTM(config['lstm_neurons'],
                                         activation=config['lstm_activation'])
        self.bn = tf.keras.layers.BatchNormalization()
        if 'dropout_rate' in config:
            self.dp1 = tf.keras.layers.Dropout(config['dropout_rate'])
            self.dp2 = tf.keras.layers.Dropout(config['dropout_rate'])
            self.dp3 = tf.keras.layers.Dropout(config['dropout_rate'])
        for layer in config['dense_layers']:
            self.dense1 = tf.keras.layers.Dense(layer['neurons'],
                                                activation=layer['activation'])
            self.bn1 = tf.keras.layers.BatchNormalization()
            self.dense2 = tf.keras.layers.Dense(layer['neurons'],
                                                activation=layer['activation'])
            self.bn2 = tf.keras.layers.BatchNormalization()
        self.out = tf.keras.layers.Dense(output_size,
                                         activation='sigmoid')
    def call(self, inputs, training=True, **kwargs):
        x = self.lstm(inputs)
        x = self.bn(x)
        if 'dropout_rate' in config:
            x = self.dp1(x)
        x = self.dense1(x)
        x = self.bn1(x)
        if 'dropout_rate' in config:
            x = self.dp2(x)
        x = self.dense2(x)
        x = self.bn2(x)
        if 'dropout_rate' in config:
            x = self.dp3(x)
        return self.out(x)
    # A convenient way to get model summary
    # and plot in subclassed api
    def build_graph(self, raw_shape):
        x = tf.keras.layers.Input(shape=(None, raw_shape),
                                  ragged=True)
        return tf.keras.Model(inputs=[x],
                              outputs=self.call(x))
Build and compile the model.
s = MySubClassed(output_size=1)
s.compile(
    loss = 'mse',
    metrics = ['mse'],
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
Pass some tensor to create weights (check).
raw_input = (16, 16, 16)
y = s(tf.ones(shape=(raw_input)))
print("weights:", len(s.weights))
print("trainable weights:", len(s.trainable_weights))
weights: 21
trainable weights: 15
Summary and Plot
Summarize and visualize the model graph.
s.build_graph(16).summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, 16)] 0
_________________________________________________________________
lstm (LSTM) (None, 32) 6272
_________________________________________________________________
batch_normalization (BatchNo (None, 32) 128
_________________________________________________________________
dropout (Dropout) (None, 32) 0
_________________________________________________________________
dense_2 (Dense) (None, 32) 1056
_________________________________________________________________
batch_normalization_3 (Batch (None, 32) 128
_________________________________________________________________
dropout_1 (Dropout) (None, 32) 0
_________________________________________________________________
dense_3 (Dense) (None, 32) 1056
_________________________________________________________________
batch_normalization_4 (Batch (None, 32) 128
_________________________________________________________________
dropout_2 (Dropout) (None, 32) 0
_________________________________________________________________
dense_4 (Dense) (None, 1) 33
=================================================================
Total params: 8,801
Trainable params: 8,609
Non-trainable params: 192
tf.keras.utils.plot_model(
    s.build_graph(16),
    show_shapes=True,
    show_dtype=True,
    show_layer_names=True,
    rankdir="TB",
)
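The parameter counts in the summary above can be cross-checked by hand, which is a quick way to confirm that a subclassed model has the same layers as the Sequential one. The sketch below uses the standard formulas for LSTM, Dense, and BatchNormalization layers with 16 input features and 32 units, as in the summary; the helper function names are made up for this illustration.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input/unit pair plus one bias per unit.
    return input_dim * units + units

def batchnorm_params(features):
    # gamma and beta are trainable; moving mean and variance are not.
    return {"trainable": 2 * features, "non_trainable": 2 * features}

lstm = lstm_params(16, 32)       # 6272
hidden = dense_params(32, 32)    # 1056
out = dense_params(32, 1)        # 33
bn = batchnorm_params(32)        # 64 trainable + 64 non-trainable

# One LSTM, two hidden Dense layers, one output Dense, three BatchNorm layers.
trainable = lstm + 2 * hidden + out + 3 * bn["trainable"]
non_trainable = 3 * bn["non_trainable"]
print(trainable + non_trainable, trainable, non_trainable)  # 8801 8609 192
```

These totals match the "Total params", "Trainable params", and "Non-trainable params" lines printed by s.build_graph(16).summary().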
I am trying to train a neural network on a Semantic Role Labeling task (a text classification task). The dataset consists of sentences, and the network has to be trained to predict a class for each word. Apart from the embedding matrix, I am also using other features (meta_data_features). Y_train has 61 classes. The dataset contains 3306 sentences, and MAX_LEN = 67. The code for the architecture is:
embedding_layer = Embedding(67,
                            300,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_LEN,
                            trainable=False)
sentence_input = Input(shape=(67,), dtype='int32')
meta_input = Input(shape=(67,), name='meta_input')
embedded_sequences = embedding_layer(sentence_input)
x_1 = (SimpleRNN(256))(embedded_sequences)
x = concatenate([x_1, meta_input], axis=1)
x = Dropout(0.3)(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(61, activation='softmax')(x)
model = Model([sentence_input,meta_input], predictions)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['sparse_categorical_accuracy'])
print(model.summary())
The snapshot of model summary is:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 67, 300) 1176000 input_1[0][0]
__________________________________________________________________________________________________
simple_rnn_1 (SimpleRNN) (None, 256) 142592 embedding_1[0][0]
__________________________________________________________________________________________________
meta_input (InputLayer) (None, 67) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 323) 0 simple_rnn_1[0][0]
meta_input[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 323) 0 concatenate_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 32) 10368 dropout_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 61) 2013 dense_1[0][0]
==================================================================================================
Total params: 1,330,973
Trainable params: 154,973
Non-trainable params: 1,176,000
__________________________________________________________________________________________________
The function call is:
simple_RNN_model_trainable.fit([padded_sentences, meta_data_features], padded_verbs,batch_size=32,epochs=1)
X_train constitutes [padded_sentences, meta_data_features] and Y_train is padded_verbs. Their shapes are:
padded_sentences - (3306, 67)
meta_data_features - (3306, 67)
padded_verbs - (3306, 67, 1)
When I try to fit the model, I get this error: "ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (3306, 67, 1)"
It would be great if somebody could help me resolve this error. Thanks!
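The error says that the last Dense layer emits one 61-way prediction per sentence, shape (batch, 61), while padded_verbs holds one label per word, shape (3306, 67, 1). The NumPy sketch below only illustrates which prediction shape a per-word sparse categorical cross-entropy expects; the random arrays and the function are stand-ins, not the Keras implementation. In the Keras model this would probably mean keeping the RNN's per-timestep outputs (return_sequences=True) and giving meta_input a trailing feature axis so it can be concatenated per word, but that is an untested assumption.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Per-word loss.

    labels: integer class ids, shape (batch, time, 1)
    probs:  class probabilities, shape (batch, time, classes)
    """
    picked = np.take_along_axis(probs, labels, axis=-1)  # (batch, time, 1)
    return -np.log(picked[..., 0])                       # (batch, time)

rng = np.random.default_rng(0)
batch, max_len, num_classes = 4, 67, 61

# Stand-in for a softmax output with one distribution per word.
logits = rng.normal(size=(batch, max_len, num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Stand-in for a slice of padded_verbs: one class id per word.
labels = rng.integers(0, num_classes, size=(batch, max_len, 1))

loss = sparse_categorical_crossentropy(labels, probs)
print(loss.shape)  # (4, 67)
```

With predictions of shape (batch, 67, 61), labels of shape (batch, 67, 1) line up one to one with the words. With predictions of shape (batch, 61), as in the model summary, there is no time axis for the labels to match.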
I have a SimpleRNN model with the code below:
s_input = Input((window_size, ), dtype='int32', name='S')
t_input = Input((window_size, ), dtype='int32', name='T')
emb1 = Embedding(nb_points + 1, emb_size1)
emb2 = Embedding(tm_length + 1, emb_size2)
xe = emb1(s_input)
he = emb2(t_input)
x = Concatenate()([xe, he])
x = SimpleRNN(rnn_size)(x)
y = Dense(nb_points, activation='softmax')(x)
model = Model([s_input, t_input], y)
model.compile('adadelta', 'categorical_crossentropy', metrics=['accuracy'])
return model
When I build the model and call it, I get this model summary:
Layer (type) Output Shape Param # Connected to
==================================================================================================
S (InputLayer) (None, 2) 0
__________________________________________________________________________________________________
T (InputLayer) (None, 2) 0
__________________________________________________________________________________________________
embedding_32 (Embedding) (None, 2, 100) 500 S[0][0]
__________________________________________________________________________________________________
embedding_33 (Embedding) (None, 2, 6) 150 T[0][0]
__________________________________________________________________________________________________
concatenate_15 (Concatenate) (None, 2, 106) 0 embedding_32[0][0]
embedding_33[0][0]
__________________________________________________________________________________________________
simple_rnn_10 (SimpleRNN) (None, 20) 2540 concatenate_15[0][0]
__________________________________________________________________________________________________
dense_4 (Dense) (None, 4) 84 simple_rnn_10[0][0]
==================================================================================================
Total params: 3,274
Trainable params: 3,274
Non-trainable params: 0
_________________________________________________________________________________________________
But it does not report any accuracy or loss for each epoch. It only prints something like this:
Train on 40 samples, validate on 11 samples
Epoch 1/100
Processing user 1.
Can anyone help me with this? The result for each epoch is not printed.
I want to use an upsampling 2D layer in keras so that I can increase the image size by a decimal factor (in this case from [213,213] to [640,640]). The layer is compiled as expected, but when I want to train or predict on real images, they are upsampled only by the closest integer to the input factor. Any idea? Details below:
Network:
mp_size = (3,3)
inputs = Input(input_data.shape[1:])
lay1 = Conv2D(32, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='glorot_normal')(inputs)
lay2 = MaxPooling2D(pool_size=mp_size)(lay1)
lay3 = Conv2D(32, (3,3), strides=(1,1), activation='relu', padding='same', kernel_initializer='glorot_normal')(lay2)
size1=lay3.get_shape()[1:3]
size2=lay1.get_shape()[1:3]
us_size = size2[0].value/size1[0].value, size2[1].value/size1[1].value
lay4 = Concatenate(axis=-1)([UpSampling2D(size=us_size)(lay3),lay1])
lay5 = Conv2D(1, (1, 1), strides=(1,1), activation='sigmoid')(lay4)
model = Model(inputs=inputs, outputs=lay5)
Network summary when I use model.summary() :
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_4 (InputLayer) (None, 640, 640, 2) 0
____________________________________________________________________________________________________
conv2d_58 (Conv2D) (None, 640, 640, 32) 608 input_4[0][0]
____________________________________________________________________________________________________
max_pooling2d_14 (MaxPooling2D) (None, 213, 213, 32) 0 conv2d_58[0][0]
____________________________________________________________________________________________________
conv2d_59 (Conv2D) (None, 213, 213, 32) 9248 max_pooling2d_14[0][0]
____________________________________________________________________________________________________
up_sampling2d_14 (UpSampling2D) (None, 640.0, 640.0, 0 conv2d_59[0][0]
____________________________________________________________________________________________________
concatenate_14 (Concatenate) (None, 640.0, 640.0, 0 up_sampling2d_14[0][0]
conv2d_58[0][0]
____________________________________________________________________________________________________
conv2d_60 (Conv2D) (None, 640.0, 640.0, 65 concatenate_14[0][0]
====================================================================================================
Total params: 9,921
Trainable params: 9,921
Non-trainable params: 0
Error when training the network:
InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,639,639,32] vs. shape[1] = [1,640,640,32]
[[Node: concatenate_14/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](up_sampling2d_14/ResizeNearestNeighbor, conv2d_58/Relu, concatenate_14/concat/axis)]]
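The 639 in the error message is what an integer factor of 3 would give (213 × 3), which suggests that the fractional part of 640/213 is dropped when the layer actually runs. The arithmetic can be replayed in plain Python. The helper names below are made up for this sketch, and padding the input to a multiple of the pool size is only one possible workaround, not something proposed in the question.

```python
def pooled_size(size, pool):
    # MaxPooling2D with its default 'valid' padding and strides == pool_size.
    return (size - pool) // pool + 1

def padding_for_exact_upsampling(size, pool):
    # Extra pixels needed so that size becomes a multiple of the pool size.
    return (-size) % pool

size, pool = 640, 3

small = pooled_size(size, pool)
factor = size / small
upsampled = small * int(factor)
print(small)      # 213
print(factor)     # slightly above 3, so not an integer factor
print(upsampled)  # 639, one pixel short of 640

extra = padding_for_exact_upsampling(size, pool)
padded = size + extra
print(extra, padded, pooled_size(padded, pool) * pool)  # 2 642 642
```

With two extra pixels per spatial dimension, pooling by 3 and upsampling by 3 round-trip exactly (642 to 214 and back to 642), so both inputs of the Concatenate layer would have the same size.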
This can be resolved with the code below:
from keras.layers import UpSampling2D
from keras.utils.generic_utils import transpose_shape
class UpSamplingUnet(UpSampling2D):
    def compute_output_shape(self, input_shape):
        size_all_dims = (1,) + self.size + (1,)
        spatial_axes = list(range(1, 1 + self.rank))
        size_all_dims = transpose_shape(size_all_dims,
                                        self.data_format,
                                        spatial_axes)
        output_shape = list(input_shape)
        for dim in range(len(output_shape)):
            if output_shape[dim] is not None:
                output_shape[dim] *= size_all_dims[dim]
                output_shape[dim]=int(output_shape[dim])
        return tuple(output_shape)
Then replace UpSampling2D(size=us_size) with UpSamplingUnet(size=us_size).