I am using Tensorflow 2.4.0 and Ubuntu 16.04
Given this model
base_model = tf.keras.applications.EfficientNetB0(weights="imagenet", include_top=False)
base_model.trainable = base_model_trainable
inputs = tf.keras.Input(shape=(IMG_SIZE,IMG_SIZE,3), name="input")
x = tf.keras.applications.efficientnet.preprocess_input(inputs)
# more details - https://www.tensorflow.org/tutorials/images/transfer_learning#important_note_about_batchnormalization_layers
x = base_model(x,
training=False) # training=training is needed only if there are layers with different behavior during training versus inference (e.g. Dropout)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(len(class_names), name="output")(x)
model = tf.keras.Model(inputs, outputs)
My objective is given the input to get the value of model.get_layer("efficientnetb0").output,
and model.output
if I do (a) (it does not give model.get_layer("efficientnetb0").output)
layer_name="efficientnetb0"
grad_model = tf.keras.models.Model([model.inputs], [model.output])
or (b) (it requires an additional input after preprocessing)
grad_model = tf.keras.models.Model([model.inputs, model.get_layer(layer_name).input], [model.get_layer(layer_name).output, model.output])
both works
but if I do
grad_model = tf.keras.models.Model([model.inputs], [model.get_layer(layer_name).output, model.output])
It gives exception:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'") at layer "rescaling". The following previous layers were accessed without issue: []
What is the reason, and is there a way for me to pass in just an input and get two outputs?
Related
I am building a Siamese network using Keras(TensorFlow) where the target is a binary column, i.e., match or mismatch(1 or 0). But the model fit method throws an error saying that the y_pred is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:
Here is the code I am using:
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(),X_train_entity_2.todense()],np.array(y_train),
epochs=2,
batch_size=32,
verbose=2,
shuffle=True)
My Input data shapes are as follows:
Inputs:
X_train_entity_1.shape is (700,2822)
X_train_entity_2.shape is (700,2822)
Target:
y_train.shape is (700,1)
In the error it throws, y_pred is the variable which was created internally. What is y_pred dimension is 2822 when I am having a binary target. And 2822 dimension actually matches the input size, but how do I understand this?
Here is the model I created:
in_layers = []
out_layers = []
for i in range(2):
input_layer = Input(shape=(1,))
embedding_layer = Embedding(embed_input_size+1, embed_output_size)(input_layer)
lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True,recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True,recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
in_layers.append(input_layer)
out_layers.append(lstm_layer_2)
merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
Since my data is very sparse, I used todense. And there the type is as follows:
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
Summary of last few layers as follows:
Mismatched shape in the Input layer. The input shape needs to match the shape of a single element passed as x, or dataset.shape[1:]. So since your dataset size is (700,2822), that is 700 samples of size 2822. So your input shape should be 2822.
Change:
input_layer = Input(shape=(1,))
To:
input_layer = Input(shape=(2822,))
You need to set return_sequences in the lstm_layer_2 to False:
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
Otherwise, you will still have the timesteps of your input. That is why you have the shape (None, 2822, 1). You can also add a Flatten layer prior to your output layer, but I would recommend setting return_sequences=False.
Note that a Dense layer computes the dot product between the inputs and the kernel along the last axis of the inputs.
I am creating a language model with a bidirecitonal LSTM, seq2seq model.
I have created the model and trained it successfully:
lstm_units = 100
# Set up embedding layer using pretrained weights
embedding_layer = Embedding(total_words+1, emb_dimension, input_length=max_input_len, weights=[embedding_matrix], name="Embedding")
# Encoder
encoder_input_x = Input(shape=(None,), name="Enc_x_Input")
encoder_embedding_x = embedding_layer(encoder_input_x)
encoder_lstm_x, enc_state_h_fwd, enc_state_c_fwd, enc_state_h_bwd, enc_state_c_bwd = Bidirectional(LSTM(lstm_units, dropout=0.5, return_state=True, name="Enc_LSTM1"), name="Enc_Bi1")(encoder_embedding_x) # pass hidden activation and memory cell states forward
encoder_state_h = Concatenate()([enc_state_h_fwd, enc_state_h_bwd])
encoder_state_c = Concatenate()([enc_state_c_fwd, enc_state_c_bwd])
encoder_states = [encoder_state_h, encoder_state_c] # package states to pass to decoder
# Decoder
decoder_input_x = Input(shape=(None,), name="Dec_x_Input")
decoder_embedding_x = embedding_layer(decoder_input_x)
decoder_lstm_layer = LSTM(lstm_units*2, return_state=True, return_sequences=True, dropout=0.5, name="Dec_LSTM1") # We define an LSTM layer without passing anything in here, as we will need to use this LSTM later.
decoder_lstm_x, _, _ = decoder_lstm_layer(decoder_embedding_x, initial_state=encoder_states) # we pass in encoder states
decoder_dense_layer = TimeDistributed(Dense(total_words+1, activation="softmax", name="Dec_Softmax")) # we set this dense to a variable so we can use it later, as above with the LSTM
decoder_output_x = decoder_dense_layer(decoder_lstm_x)
model = Model(inputs=[encoder_input_x, decoder_input_x], outputs=decoder_output_x)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
I then set up the inference model:
# Inference Encoder
inf_encoder_model = Model(encoder_input_x, encoder_states) # Here we are creating a model using layers from the model we built earlier.
# The encoder model outputs the encoder_states, ie. the concatenated h and c values from the BLSTM
# Inference Decoder
# Create new inputs for decoder state
inf_dec_state_h_input = Input(shape=(2*lstm_units,), name="Dec_h_state_input") # The must be sized to fit both FWD and BWD h values from the BLSTM
inf_dec_state_c_input = Input(shape=(2*lstm_units,), name="Dec_c_state_input")
inf_dec_state_input = [inf_dec_state_h_input, inf_dec_state_c_input] # package states to pass to decoder
# Decoder LSTM + Dense
inf_decoder_lstm_x, inf_dec_state_h, inf_dec_state_c = decoder_lstm_layer(decoder_embedding_x, initial_state=inf_dec_state_input) # reuse embedding layer from training. We pass in encoder states
inf_decoder_states = [inf_dec_state_h, inf_dec_state_c] # I think we we loop inference, we'll pass these states back in to the input instead of the encoder states
inf_decoder_output = decoder_dense_layer(inf_decoder_lstm_x)
decoder_model = Model([decoder_input_x] + inf_dec_state_input, [inf_decoder_output] + inf_decoder_states) # we reuse the decoder_input_x from the training model
The decoder model for inference is set up to take the decoder inputs + the c and h states which are output from the encoder.
When running the inference loop using this code:
states = inf_encoder_model.predict(x_inputs[700])
# Generate empty target sequence of length 1.
target_seq = np.zeros((max_output_len, 1), dtype=int)
# Populate the first character of target sequence with the start character.
target_seq[0, 0] = 4 # 4 is the start of sequence token used during training
# Get prediction
prediction, h, c = decoder_model.predict([target_seq] + states)
it gives me a long error that ends with:
ValueError: Layer Dec_LSTM1 expects 3 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'model_16/Embedding/embedding_lookup/Identity_1:0' shape=(None, 1, 100) dtype=float32>]
The encoder states seem to be fine; a list containing 2 arrays, the h and c values, each with shape (60, 200). The target_seq is an array of shape (1, 60). x_inputs[700] is training data, also of shape (1, 60).
Why is the model.predict line suggesting I am giving it 1 input tensor when I am giving it a list containing 3 arrays?
I want to construct a variational autoencoder in Keras (2.2.4, with TensorFlow backend), here is my code:
dims = [1000, 256, 64, 32]
x_inputs = Input(shape=(dims[0],), name='inputs')
h = x_inputs
# internal layers in encoder
for i in range(n_stacks-1):
h = Dense(dims[i + 1], activation='relu', kernel_initializer='glorot_uniform', name='encoder_%d' % i)(h)
# hidden layer
z_mean = Dense(dims[-1], kernel_initializer='glorot_uniform', name='z_mean')(h)
z_log_var = Dense(dims[-1], kernel_initializer='glorot_uniform', name='z_log_var')(h)
z = Lambda(sampling, output_shape=(dims[-1],), name='z')([z_mean, z_log_var])
encoder = Model(inputs=x_inputs, outputs=z, name='encoder')
encoder_z_mean = Model(inputs=x_inputs, outputs=z_mean, name='encoder_z_mean')
# internal layers in decoder
latent_inputs = Input(shape=(dims[-1],), name='latent_inputs')
h = latent_inputs
for i in range(n_stacks-1, 0, -1):
h = Dense(dims[i], activation='relu', kernel_initializer='glorot_uniform', name='decoder_%d' % i)(h)
# output
outputs = Dense(dims[0], activation='relu', kernel_initializer='glorot_uniform' name='mean')
decoder = Model(inputs=latent_inputs, outputs=outputs, name='decoder')
ae_output = decoder(encoder_z_mean(x_inputs))
ae = Model(inputs=x_inputs, outputs=ae_output, name='ae')
ae.summary()
vae_output = decoder(encoder(x_inputs))
vae = Model(inputs=x_inputs, outputs=vae_output, name='vae')
vae.summary()
The problem is I can print the summary of the "ae" and "vae" models, but when I train the ae model, it says
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'latent_inputs' with dtype float and shape [?,32]
In the model "decoder" is supposed to connect to the output of "encoder_z_mean" layer in the ae model. But when I print the summary of the "ae" model, "decoder" is actually connected to "encoder_z_mean[1][0]". Should it be "encoder_z_mean[0][0]"?
A few corrections:
x_inputs is already the input of the encoders, don't call it again with encoder_z_mean(x_inputs) or with encoder(x_inputs)
Besides creating a second node (the 1 that you are worried with, and that is not a problem), it may be the source of the error because it's not an extra input, but the same input
A healthy usage of this would need the creation of a new Input(...) tensor to be called
The last Dense layer is not being called on a tensor. You probably want (h) there.
Do it this way:
# output - called h in the last layer
outputs = Dense(dims[0], activation='relu', kernel_initializer='glorot_uniform' name='mean')(h)
#unchanged
decoder = Model(inputs=latent_inputs, outputs=outputs, name='decoder')
#adjusted inputs
ae_output = decoder(encoder_z_mean.output)
ae = Model(encoder_z_mean.input, ae_output, name='ae')
ae.summary()
vae_output = decoder(encoder.output)
vae = Model(encoder.input, vae_output, name='vae')
vae.summary()
It's possible that the [1][0] still occurs with the decoder, but this is not a problem at all. It means that the decoder itself has its own input node (number 0), and you created an extra input node (number 1) when you called it with the output of another model. This is harmless. The node 1 will be used while node 0 will be ignored.
from keras.models import load_model
import h5py
# sq_model.save_weights('sq_model_weights.h5')
# res_model.save_weights('res_model_weights.h5')
# model.save('my_model.h5')
# dense_model.save_weights('dense_model_v3_weights.h5')
sq_model.load_weights('sq_model_weights.h5')
res_model.load_weights('res_model_weights.h5')
dense_model.load_weights('dense_model_v2_weights.h5')
models = [sq_model, res_model, dense_model]
model_input = Input((3,32,32))
def ensemble(models, model_input):
outputs = [model.outputs[0] for model in models]
y = Average()(outputs)
model = Model(inputs = model_input, outputs = y, name='ensemble')
return model
ensemble_model = ensemble(models,model_input)
I am getting the following error when I run the above code:
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("input_2:0", shape=(?, 3, 32, 32), dtype=float32) at layer "input_2". The following previous layers were accessed without issue: []
You have three models, each of them with a separate input. In your call to
model = Model(inputs = model_input, outputs = y, name='ensemble')
you specify a new Model. Its input should be your model_input, and the outputs should be your averaged outputs.
But you forgot to actually connect your three models to your input. So you have a disconnected model containing the loose input layer model_input and the ensemble, with each of the three models contained in the ensemble waiting for an input on its own input layer (so 4 input layers in total).
Changing
outputs = [model.outputs[0] for model in models]
to
outputs = [model(model_inputs) for model in models]
should do the trick. It calls each of the models on model_input and gives the corresponding outputs.
Changing
outputs = [model.outputs[0] for model in models]
to
outputs = [model(model_input) for model in models]
worked for me
Environment info
Operating System:
Rocks OS (Centos 6.5)
I installed from sources, and here is my version:
https://github.com/shiyemin/tensorflow/
Nothing changed but to make it compile successfully on our server.
ERROR
I use caffe-tensorflow to convert caffe model to tensorflow and the GoogLeNet is selected to construct our network.
I add a LSTM layer to this code as follows:
#layer
def lstm(self, input, lstm_type, n_steps, initial_state, num_units, name):
# with tf.variable_scope(name) as scope:
input_shape = input.get_shape()
dim = 1
for d in input_shape[1:].as_list():
dim *= d
input = tf.reshape(input, [input_shape[0].value, -1])
# select LSTM type, Define a lstm cell with tensorflow
if lstm_type == 'basic':
lstm_cell = rnn_cell.BasicLSTMCell(num_units, input_size=dim)
elif lstm_type == 'lstm':
lstm_cell = rnn_cell.LSTMCell(num_units, input_size=dim)
elif lstm_type == 'GRU':
lstm_cell = rnn_cell.GRUCell(num_units, input_size=dim)
else:
raise ValueError("LSTM type %s error."%lstm_type)
# Split data because rnn cell needs a list of inputs for the RNN inner loop
input = tf.split(0, n_steps, input) # n_steps * (batch_size, n_hidden)
# Get lstm cell output
outputs, states = rnn.rnn(lstm_cell, input, initial_state=initial_state) # , scope=scope)
outputs = tf.concat(0, outputs)
return outputs #, states
When i add this LSTM layer to GoogLeNet, the following error occurs.
Failed precondition: Attempting to use uninitialized value RNN/GRUCell/Gates/Linear/Bias
But when i using the code from:
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3%20-%20Neural%20Networks/recurrent_network.py,
everything works well.
Anyone knows what happened? I don't know how to debug this error.