Serializing a Keras model with an embedding layer - tensorflow

I've trained a model with pre-trained word embeddings like this:
embedding_matrix = np.zeros((vocab_size, 100))
for word, i in text_tokenizer.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
embedding_layer = Embedding(vocab_size,
With the architecture looking like this:
sequence_input = Input(shape=(50,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
text_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(embedded_sequences)
text_lstm = LSTM(500, return_sequences=True)(embedded_sequences)
char_in = Input(shape=(50, 18, ))
char_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(char_in)
char_cnn = GaussianNoise(0.40)(char_cnn)
char_lstm = LSTM(500, return_sequences=True)(char_in)
merged = concatenate([char_lstm, text_lstm])
merged_d1 = Dense(800, activation='relu')(merged)
merged_d1 = Dropout(0.5)(merged_d1)
text_class = Dense(len(y_unique), activation='softmax')(merged_d1)
model = Model([sequence_input,char_in], text_class)
When I go to convert the model to json, I get this error:
ValueError: can only convert an array of size 1 to a Python scalar
Similarly, if I use the model.save() function, it seems to save correctly, but when I go to load it, I get Type Error: Expected Float32.
My question is: is there something I am missing when trying to serialize this model? Do I need some sort of Lambda layer or something of the sorts?
Any help would be greatly appreciated!
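For context, the ValueError text matches the message NumPy raises when .item() is called on an array that holds more than one element. One plausible cause, offered as an assumption rather than a diagnosis of this exact model, is that a full weight array (such as the embedding matrix) ends up inside a layer's config and the JSON encoder tries to turn it into a single Python scalar. A minimal sketch of the NumPy behaviour only:

```python
import numpy as np

single = np.array([3.5])          # one element: converts cleanly to a Python float
single_value = single.item()
print(single_value)

matrix = np.zeros((4, 100))       # toy stand-in for an embedding matrix
try:
    matrix.item()                 # more than one element: cannot become a scalar
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If that is what happens here, keeping large arrays out of the layer config (for example by setting weights after the layer is built) would avoid the conversion.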

You can use the weights argument of the Embedding layer to provide initial weights.
embedding_layer = Embedding(vocab_size,
The weights should remain non-trainable after saving and loading the model:
model.save('1.h5')
m = load_model('1.h5')
m.summary()
Layer (type) Output Shape Param # Connected to
input_3 (InputLayer) (None, 50) 0
input_4 (InputLayer) (None, 50, 18) 0
embedding_1 (Embedding) (None, 50, 100) 1000000 input_3[0][0]
lstm_4 (LSTM) (None, 50, 500) 1038000 input_4[0][0]
lstm_3 (LSTM) (None, 50, 500) 1202000 embedding_1[0][0]
concatenate_2 (Concatenate) (None, 50, 1000) 0 lstm_4[0][0]
dense_2 (Dense) (None, 50, 800) 800800 concatenate_2[0][0]
dropout_2 (Dropout) (None, 50, 800) 0 dense_2[0][0]
dense_3 (Dense) (None, 50, 15) 12015 dropout_2[0][0]
Total params: 4,052,815
Trainable params: 3,052,815
Non-trainable params: 1,000,000
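The matrix-building loop from the question can be sketched with toy data. The three-word vocabulary, the 4-dimensional vectors, and the contents of word_index and embeddings_index below are illustrative assumptions. Rows for words without a pre-trained vector stay at zero, as does row 0, which Keras tokenizers reserve and never assign to a word.

```python
import numpy as np

embedding_dim = 4
word_index = {"the": 1, "cat": 2, "zzz": 3}      # token -> row index (toy data)
embeddings_index = {                              # token -> pre-trained vector (toy data)
    "the": np.full(embedding_dim, 0.1),
    "cat": np.full(embedding_dim, 0.2),
}

vocab_size = len(word_index) + 1                  # +1 for the reserved index 0
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:              # unknown words keep an all-zero row
        embedding_matrix[i] = embedding_vector

print(embedding_matrix.shape)
```

The resulting array is what gets handed to the Embedding layer as its initial, frozen weights.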

I hope you are saving the model after compiling, like:
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
To save the model, you can do:
from keras.models import load_model
model.save('model.h5')
model = load_model('model.h5')
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
To load the model:
from keras.models import model_from_json
json_file = open('model.json', 'r')
model_json = json_file.read()
json_file.close()
model = model_from_json(model_json)

I tried multiple methods. The problem is that when the model contains an embedding layer, pickle doesn't work and is not able to save the model.
So what you can do, when you have layers like these:
## Creating model
is save the model to a file with the .h5 extension and then convert that to JSON (the reloaded model is called model2 here):
from tensorflow.keras.models import load_model
model.save('model.h5')
model = load_model('model.h5')
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
and this to load it:
from tensorflow.keras.models import model_from_json
json_file = open('model.json', 'r')
model_json = json_file.read()
json_file.close()
model2 = model_from_json(model_json)
Note that model.json stores only the architecture, not the trained weights.


Error on custom dataset dimensions feeding transfer model in TensorFlow

Can someone explain this TensorFlow error to me? I'm having trouble understanding what I am doing wrong.
I have a dataset in Tensorflow constructed with a generator. When I test the output of the generator, output dimensions look correct (224 x 224 x 1). But when I try to train the model, I get an error:
WARNING:tensorflow:Model was constructed with shape (None, 224, 224, 1) for input
KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 1), dtype=tf.float32,
name='input_2'), name='input_2', description="created by layer 'input_2'"),
but it was called on an input with incompatible shape (224, 224, 1, 1).
I'm unsure why the dimension of this output has an extra 1 at the end.
Here is the code to create the generator and model. df is a dataframe with file-paths to data and labels. The data are 2D matrices of variable dimensions. I'm using cv2.resize to make them 224x224 and then np.reshape to transform dimensions to (224x224x1). Then I yield the result.
def datagen_row():
    # ======================== #
    # Import data
    # ======================== #
    df = get_data()
    rowsize = 224
    colsize = 224
    for row in range(len(df)):
        data = get_data_from_filepath(df.iloc[row].file_path)
        data = cv2.resize(data, dsize=(rowsize, colsize), interpolation=cv2.INTER_CUBIC)
        labels = df.iloc[row].label
        data = data.reshape(224, 224, 1)
        yield data, labels
dataset = tf.data.Dataset.from_generator(
    datagen_row,
    output_signature=(
        tf.TensorSpec(shape = (int(os.getenv('rowsize')), int(os.getenv('colsize')), 1), dtype=tf.float32, name=None),
        tf.TensorSpec(shape=(), dtype=tf.int64, name=None)
    )
)
Testing the following I get what I expected:
iterator = iter(dataset.batch(8))
x = iterator.get_next()
x[0].shape # TensorShape([8, 224, 224, 1])
x[1].shape # TensorShape([8])
x[0] # <tf.Tensor: shape=(8, 224, 224, 1), dtype=float32, numpy=array(...
x[1] # <tf.Tensor: shape=(8,), dtype=int64, numpy=array([1, 1, 1, 1, 1, 1, 1, 1], dtype=int64)>
I'm trying to plug this into InceptionV3 model to do a classification
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.layers import Input
from tensorflow.keras import layers
origModel = InceptionV3(weights = 'imagenet', include_top = False)
inputs = layers.Input(shape = (224, 224, 1))
modified_inputs = layers.Conv2D(3, 3, padding = 'same', activation='relu')(inputs)
x = origModel(modified_inputs)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(1024, activation = 'relu')(x)
x = layers.Dense(512, activation = 'relu')(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dense(128, activation = 'relu')(x)
x = layers.Dense(64, activation = 'relu')(x)
x = layers.Dense(32, activation = 'relu')(x)
outputs = layers.Dense(2)(x)
model = tf.keras.Model(inputs, outputs)
model.summary() # 24.6 M trainable params
for layer in origModel.layers:
    layer.trainable = False
model.summary() # now shows 2.8 M trainable params
model.compile(
    optimizer = 'adam',
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits = True),
    metrics = ['accuracy']
)
model.fit(dataset, epochs = 1, verbose = True, batch_size = 32)
Here is the output of model.summary
Model: "model"
Layer (type) Output Shape Param #
input_2 (InputLayer) [(None, 224, 224, 1)] 0
conv2d_94 (Conv2D) (None, 224, 224, 3) 30
inception_v3 (Functional) (None, None, None, 2048) 21802784
global_average_pooling2d (G (None, 2048) 0
dense (Dense) (None, 1024) 2098176
dense_1 (Dense) (None, 512) 524800
dense_2 (Dense) (None, 256) 131328
dense_3 (Dense) (None, 128) 32896
dense_4 (Dense) (None, 64) 8256
dense_5 (Dense) (None, 32) 2080
dense_6 (Dense) (None, 2) 66
Total params: 24,600,416
Trainable params: 2,797,632
Non-trainable params: 21,802,784
This code worked after changing the first argument of the model.fit call from the unbatched dataset to a batched one (dataset.batch(...)), with epochs = 1, verbose = True, batch_size = 32 left as they were.
So... I will have to look into using dataset.batch versus batch_size in model.fit.
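The distinction that resolved this can be sketched without TensorFlow. According to the Keras documentation for fit, batch_size should not be specified when the data is a dataset or generator, because those are expected to produce batches themselves; dataset.batch is what does the grouping. The pure-Python sketch below only mimics that grouping conceptually. sample_generator and batched are hypothetical names, and each sample is represented by its shape tuple rather than by real image data.

```python
from itertools import islice

def sample_generator(n_samples):
    """Hypothetical stand-in for datagen_row: yields (sample_shape, label) pairs."""
    for i in range(n_samples):
        yield (224, 224, 1), i % 2

def batched(samples, batch_size):
    """Group consecutive samples into lists of at most batch_size items."""
    iterator = iter(samples)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

batches = list(batched(sample_generator(20), 8))
# Prepend the batch dimension to the per-sample shape, as batching a dataset does.
batch_shapes = [(len(batch),) + batch[0][0] for batch in batches]
print(batch_shapes)
```

Without the grouping step, each yielded item is a single (224, 224, 1) sample with no leading batch dimension, which is not the (None, 224, 224, 1) layout the model was built for.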

Stacking ensemble with two different inputs for image segmentation

I am stacking two models trained on different inputs from two data collections, as shown below, using TensorFlow Keras 2.6.2. The stacking is performed with a convolutional meta-learner to predict on a common hold-out test set. Given below are the code and the model architecture.
#load data
X_tr1 = np.load('data/X_tr1.npy') #shape (200, 224,224,3)
Y_tr1 = np.load('data/Y_tr1.npy') #shape (200, 224,224,1)
X_val1 = np.load('data/X_val1.npy') #shape (100, 224,224,3)
Y_val1 = np.load('data/Y_val1.npy') #shape (100, 224,224,1)
X_tr2 = np.load('data/X_tr2.npy') #shape (200, 224,224,3)
Y_tr2 = np.load('data/Y_tr2.npy') #shape (200, 224,224,1)
X_val2 = np.load('data/X_val2.npy') #shape (100, 224,224,3)
Y_val2 = np.load('data/Y_val2.npy') #shape (100, 224,224,1)
#common hold-out test set
X_ts = np.load('data/X_ts.npy') #shape (50, 224,224,3)
Y_ts = np.load('data/Y_ts.npy') #shape (50, 224,224,1)
#instantiate the models
img_width, img_height = 224,224
input_shape = (img_width, img_height, 3) #RGB inputs
model_input1 = Input(shape=input_shape) #input to model1
model_input2 = Input(shape=input_shape) #input to model2
n_classes=1 #grayscale mask output
batch_size = 8
n_epochs = 256
BACKBONE = 'vgg16'
# define model
model1 = sm.Unet(BACKBONE, encoder_weights='imagenet',
classes=n_classes, activation=activation)
model2 = sm.Unet(BACKBONE, encoder_weights='imagenet',
classes=n_classes, activation=activation)
# constructing a stacking ensemble of the two models
# A second-level fully-convolutional meta-learner is used to learn
# the features extracted from the penultimate layers of the models
n_models = 2
def load_all_models(n_models):
    all_models = list()
    model1.load_weights('weights/vgg16_1.hdf5') # path to model1
    model_loss1a = Model(inputs=model1.input,
                         outputs=model1.get_layer('decoder_stage4b_relu').output) #name of the penultimate layer
    x1 = model_loss1a.output
    model1a = Model(inputs=model1.input, outputs=x1, name='model1')
    all_models.append(model1a)
    model2.load_weights('weights/vgg16_2.hdf5') #path to model2
    model_loss2a = Model(inputs=model2.input,
                         outputs=model2.get_layer('decoder_stage4b_relu').output)
    x2 = model_loss2a.output
    model2a = Model(inputs=model2.input, outputs=x2, name='model2')
    all_models.append(model2a)
    return all_models
# load models
n_members = 2
members = load_all_models(n_members)
print('Loaded %d models' % len(members))
def define_stacked_model(members):
    # update all layers in all models to not be trainable
    for i in range(len(members)):
        model = members[i]
        for layer in model.layers[1:]:
            # make not trainable
            layer.trainable = False
            layer._name = 'ensemble_' + str(i+1) + '_' + layer.name
    ensemble_outputs = [model(model_input1, model_input2) for model in members]
    merge = Concatenate()(ensemble_outputs)
    # meta-learner, fully-convolutional
    x4 = Conv2D(128, (3,3), activation='relu',
                name = 'NewConv1', padding='same')(merge)
    x5 = Conv2D(1, (1,1), activation='sigmoid',
                name = 'NewConvfinal')(x4)
    model= Model(inputs=[model_input1,model_input2],
                 outputs=x5)
    return model
print("Creating Ensemble")
ensemble = define_stacked_model(members)
print("Ensemble architecture: ")
ensemble.summary()
Shown below is the architecture of the stacked model:
Model: "model_4"
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) [(None, 224, 224, 3) 0
input_2 (InputLayer) [(None, 224, 224, 3) 0
model1 (Functional) (None, None, None, 1 23752128 input_1[0][0]
model2 (Functional) (None, None, None, 1 23752128 input_1[0][0]
concatenate (Concatenate) (None, 224, 224, 32) 0 model1[0][0]
NewConv1 (Conv2D) (None, 224, 224, 128 36992 concatenate[0][0]
NewConv2 (Conv2D) (None, 224, 224, 64) 73792 NewConv1[0][0]
NewConv3 (Conv2D) (None, 224, 224, 32) 18464 NewConv2[0][0]
NewConvfinal (Conv2D) (None, 224, 224, 1) 33 NewConv3[0][0]
Total params: 47,633,537
Trainable params: 129,281
Non-trainable params: 47,504,256
I compile and train the model as shown below:
opt = keras.optimizers.Adam(lr=0.001)
results_ensemble = ensemble.fit((X_tr1, Y_tr1, X_tr2, Y_tr2),
                                validation_data=(X_val1, Y_val1, X_val2, Y_val2))
I get the following error:
Traceback (most recent call last):
File "/home/codes/", line 563, in <module>
validation_data=(X_val1, Y_val1, X_val2, Y_val2))
File "/home/anaconda3/envs/tf262/lib/python3.7/site-packages/keras/engine/", line 1125, in fit
File "/home/anaconda3/envs/tf262/lib/python3.7/site-packages/keras/engine/", line 1574, in unpack_x_y_sample_weight
raise ValueError(error_msg)
ValueError: Data is expected to be in format `x`, `(x,)`, `(x, y)`, or `(x, y, sample_weight)`, found: (array([[[[0.09803922, 0.09803922, 0.09803922],
[0.09803922, 0.09803922, 0.09803922],
[0.09803922, 0.09803922, 0.09803922],
[0.08627451, 0.08627451, 0.08627451],
[0.08627451, 0.08627451, 0.08627451],
[0.05098039, 0.05098039, 0.05098039]],...
Also how do I predict with a single X_ts provided the ensemble model now has two separate inputs?
New error after trying to implement the suggestions:
File "/home/codes/", line 595, in <module>
File "/home/anaconda3/envs/tf262/lib/python3.7/site-packages/keras/engine/", line 1184, in fit
tmp_logs = self.train_function(iterator)
ValueError: Layer model_4 expects 2 input(s), but it received 4 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 224, 224, 3) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 224, 224, 1) dtype=float32>, <tf.Tensor 'IteratorGetNext:2' shape=(None, 224, 224, 3) dtype=float32>, <tf.Tensor 'IteratorGetNext:3' shape=(None, 224, 224, 1) dtype=float32>]
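Answer based on a comment: for a multi-input model, the inputs need to be passed together as one list (one array per model input), with the targets given separately, not packed into a single tuple together with the targets.
Instead of:
results_ensemble = ensemble.fit((X_tr1, Y_tr1, X_tr2, Y_tr2),
                                validation_data=(X_val1, Y_val1, X_val2, Y_val2))
use:
inputs = [X_tr1, X_tr2] # inputs only; you can pass the list itself or the variable
results_ensemble = ensemble.fit(inputs, y_train, # y_train and y_val are the target masks (names assumed)
                                validation_data=([X_val1, X_val2], y_val))
For prediction, pass one array per input in the same way:
# test_inputs_diff = [x_test1, x_test2] # different input
# test_inputs_same = [x_test1, x_test1] # same input
# preds_diff = ensemble.predict(test_inputs_diff)
# preds_same = ensemble.predict(test_inputs_same)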

Sequential VGG16 model, graph disconnected error

I have a Sequential model with a VGG16 as its last layer:
def rescale(x):
    return x/65535.
base_model = tf.keras.applications.VGG16(
    include_top=True, weights=None, input_tensor=None, input_shape=(224,224,1),
    pooling=None, classes=102, classifier_activation='softmax')
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, None, 1)),
    tf.keras.layers.Lambda(rescale),
    tf.keras.layers.experimental.preprocessing.Resizing(224, 224),
    tf.keras.layers.experimental.preprocessing.RandomFlip(mode='horizontal_and_vertical', seed=42),
    base_model
])
Output model.summary():
Model: "sequential"
Layer (type) Output Shape Param #
lambda (Lambda) (None, None, None, 1) 0
resizing (Resizing) (None, 224, 224, 1) 0
random_flip (RandomFlip) (None, 224, 224, 1) 0
vgg16 (Functional) (None, 102) 134677286
Total params: 134,677,286
Trainable params: 134,677,286
Non-trainable params: 0
Now I want to create a new model with two outputs:
vgg_model = model.layers[3]
last_conv_layer = vgg_model.get_layer('block5_conv3')
new_model = tf.keras.models.Model(inputs=[model.inputs], outputs=[last_conv_layer.output, model.output])
But I get this error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1_6:0", shape=(None, 224, 224, 1), dtype=float32) at layer "block1_conv1". The following previous layers were accessed without issue: []
What am I missing here?
Given a fitted model in this form:
def rescale(x):
    return x/65535.
base_model = tf.keras.applications.VGG16(
    include_top=True, weights=None, input_tensor=None, input_shape=(224,224,1),
    pooling=None, classes=102, classifier_activation='softmax')
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, None, 1)),
    tf.keras.layers.Lambda(rescale),
    tf.keras.layers.experimental.preprocessing.Resizing(224, 224),
    tf.keras.layers.experimental.preprocessing.RandomFlip(mode='horizontal_and_vertical', seed=42),
    base_model
])
You can wrap your VGG in a Model that returns all the outputs you need, and then rebuild the preprocessing around it:
new_model = Model(inputs=model.layers[3].input,
                  outputs=[model.layers[3].output,
                           model.layers[3].get_layer('block5_conv3').output])
inp = tf.keras.Input(shape=(None, None, 1))
x = tf.keras.layers.Lambda(rescale)(inp)
x = tf.keras.layers.experimental.preprocessing.Resizing(224, 224)(x)
outputs = new_model(x)
new_model = Model(inp, outputs)
The summary of new_model:
Layer (type) Output Shape Param #
input_49 (InputLayer) [(None, None, None, 1)] 0
lambda_25 (Lambda) (None, None, None, 1) 0
resizing_25 (Resizing) (None, 224, 224, 1) 0
functional_47 (Functional) [(None, 102), (None, 14, 134677286
Total params: 134,677,286
Trainable params: 134,677,286
Non-trainable params: 0

How to use model subclassing in Keras?

Having the following model written in the sequential API:
config = {
'learning_rate': 0.001,
'dropout_rate': 0.08,
'batch_size': 128,
{'neurons': 32, 'activation': 'relu'},
{'neurons': 32, 'activation': 'relu'},
def get_model(num_features, output_size):
opt = Adam(learning_rate=0.001)
model = Sequential()
model.add(Input(shape=[None,num_features], dtype=tf.float32, ragged=True))
model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation']))
if 'dropout_rate' in config:
for layer in config['dense_layers']:
model.add(Dense(layer['neurons'], activation=layer['activation']))
if 'dropout_rate' in layer:
model.add(Dense(output_size, activation='sigmoid'))
model.compile(loss='mse', optimizer=opt, metrics=['mse'])
return model
When using a distributed training framework, I need to convert the syntax to use model subclassing instead.
I've looked at the docs but couldn't figure out how to do it.
Here is one equivalent subclassed implementation, though I didn't test it.
import tensorflow as tf
# your config
config = {
'learning_rate': 0.001,
'dropout_rate': 0.08,
'batch_size': 128,
{'neurons': 32, 'activation': 'relu'},
{'neurons': 32, 'activation': 'relu'},
# Subclassed API Model
class MySubClassed(tf.keras.Model):
    def __init__(self, output_size):
        super(MySubClassed, self).__init__()
        self.lstm = tf.keras.layers.LSTM(config['lstm_neurons'],
                                         activation=config['lstm_activation'])
        self.bn = tf.keras.layers.BatchNormalization()
        if 'dropout_rate' in config:
            self.dp1 = tf.keras.layers.Dropout(config['dropout_rate'])
            self.dp2 = tf.keras.layers.Dropout(config['dropout_rate'])
            self.dp3 = tf.keras.layers.Dropout(config['dropout_rate'])
        for layer in config['dense_layers']:
            self.dense1 = tf.keras.layers.Dense(layer['neurons'],
                                                activation=layer['activation'])
            self.bn1 = tf.keras.layers.BatchNormalization()
            self.dense2 = tf.keras.layers.Dense(layer['neurons'],
                                                activation=layer['activation'])
            self.bn2 = tf.keras.layers.BatchNormalization()
        self.out = tf.keras.layers.Dense(output_size,
                                         activation='sigmoid')
    def call(self, inputs, training=True, **kwargs):
        x = self.lstm(inputs)
        x = self.bn(x)
        if 'dropout_rate' in config:
            x = self.dp1(x)
        x = self.dense1(x)
        x = self.bn1(x)
        if 'dropout_rate' in config:
            x = self.dp2(x)
        x = self.dense2(x)
        x = self.bn2(x)
        if 'dropout_rate' in config:
            x = self.dp3(x)
        return self.out(x)
    # A convenient way to get model summary
    # and plot in subclassed api
    def build_graph(self, raw_shape):
        x = tf.keras.layers.Input(shape=(None, raw_shape))
        return tf.keras.Model(inputs=[x],
                              outputs=self.call(x))
Build and compile the model
s = MySubClassed(output_size=1)
s.compile(
    loss = 'mse',
    metrics = ['mse'],
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
Pass some tensor to create weights (check).
raw_input = (16, 16, 16)
y = s(tf.ones(shape=(raw_input)))
print("weights:", len(s.weights))
print("trainable weights:", len(s.trainable_weights))
weights: 21
trainable weights: 15
Summary and Plot
Summarize and visualize the model graph.
Model: "model"
Layer (type) Output Shape Param #
input_1 (InputLayer) [(None, None, 16)] 0
lstm (LSTM) (None, 32) 6272
batch_normalization (BatchNo (None, 32) 128
dropout (Dropout) (None, 32) 0
dense_2 (Dense) (None, 32) 1056
batch_normalization_3 (Batch (None, 32) 128
dropout_1 (Dropout) (None, 32) 0
dense_3 (Dense) (None, 32) 1056
batch_normalization_4 (Batch (None, 32) 128
dropout_2 (Dropout) (None, 32) 0
dense_4 (Dense) (None, 1) 33
Total params: 8,801
Trainable params: 8,609
Non-trainable params: 192
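The parameter counts in this summary can be checked by hand. The sketch below uses the standard formulas for an LSTM with bias, a Dense layer, and BatchNormalization, with the sizes read off the summary (16 input features, 32 units); the helper names are made up for this illustration.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # weight matrix plus bias vector
    return input_dim * units + units

def batchnorm_params(features):
    # gamma and beta are trainable; moving mean and variance are not
    return 2 * features, 2 * features

lstm = lstm_params(16, 32)
dense_hidden = dense_params(32, 32)
dense_out = dense_params(32, 1)
bn_trainable, bn_frozen = batchnorm_params(32)

# One LSTM, three BatchNormalization layers, two hidden Dense layers, one output layer
total = lstm + 3 * (bn_trainable + bn_frozen) + 2 * dense_hidden + dense_out
non_trainable = 3 * bn_frozen
trainable = total - non_trainable
print(lstm, dense_hidden, dense_out, total, trainable, non_trainable)
```

The non-trainable part comes entirely from the moving statistics of the three BatchNormalization layers.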

Optimization of Hyperparameters in a CNN

EDIT: I adjusted the model as suggested. That means I included lr and dropout as arguments in the ConvNet function.
I am new to neural networks and CNNs and am facing a problem with the optimization of hyperparameters. Here is my process so far:
With the help of various excellent blog posts I was able to build a CNN that works for my project. In my project I am trying to predict the VIX and S&P 500 with the help of the FOMC meeting statements. So basically I have text data on the one hand and financial data (returns) on the other. After preprocessing and applying Google's pre-trained Word2Vec word embeddings, I built the following convolutional network:
def ConvNet(embeddings, max_sequence_length, num_words, embedding_dim, trainable=False, extra_conv=True,
            lr=0.001, dropout=0.5):
    embedding_layer = Embedding(num_words,
    sequence_input = Input(shape=(max_sequence_length,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
    convs = []
    filter_sizes = [3, 4, 5]
    for filter_size in filter_sizes:
        l_conv = Conv1D(filters=128, kernel_size=filter_size, activation='relu')(embedded_sequences)
        l_pool = MaxPooling1D(pool_size=3)(l_conv)
        convs.append(l_pool)
    l_merge = concatenate([convs[0], convs[1], convs[2]], axis=1)
    # add a 1D convnet with global maxpooling, instead of Yoon Kim model
    conv = Conv1D(filters=128, kernel_size=3, activation='relu')(embedded_sequences)
    pool = MaxPooling1D(pool_size=3)(conv)
    if extra_conv == True:
        x = Dropout(dropout)(l_merge)
    else:
        # Original Yoon Kim model
        x = Dropout(dropout)(pool)
    x = Flatten()(x)
    x = Dense(128, activation='relu')(x)
    preds = Dense(1, activation='linear')(x)
    model = Model(sequence_input, preds)
    sgd = SGD(learning_rate = lr, momentum= 0.8)
    optimizer= sgd,
    return model
My model architecture looks like this:
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) (None, 1086) 0
embedding_1 (Embedding) (None, 1086, 300) 532500 input_1[0][0]
conv1d_1 (Conv1D) (None, 1084, 128) 115328 embedding_1[0][0]
conv1d_2 (Conv1D) (None, 1083, 128) 153728 embedding_1[0][0]
conv1d_3 (Conv1D) (None, 1082, 128) 192128 embedding_1[0][0]
max_pooling1d_1 (MaxPooling1D) (None, 361, 128) 0 conv1d_1[0][0]
max_pooling1d_2 (MaxPooling1D) (None, 361, 128) 0 conv1d_2[0][0]
max_pooling1d_3 (MaxPooling1D) (None, 360, 128) 0 conv1d_3[0][0]
concatenate_1 (Concatenate) (None, 1082, 128) 0 max_pooling1d_1[0][0]
dropout_2 (Dropout) (None, 1082, 128) 0 concatenate_1[0][0]
flatten_1 (Flatten) (None, 138496) 0 dropout_2[0][0]
dense_3 (Dense) (None, 128) 17727616 flatten_1[0][0]
dense_4 (Dense) (None, 1) 129 dense_3[0][0]
Total params: 18,721,429
Trainable params: 18,188,929
Non-trainable params: 532,500
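The shapes in this summary follow from simple length arithmetic, which also shows why the first Dense layer dominates the parameter count. The sketch below assumes 'valid' padding with stride 1 for the convolutions and non-overlapping pooling windows, which is consistent with the output shapes listed above; the helper names are made up for this illustration.

```python
def conv1d_valid_length(length, kernel_size):
    # 'valid' padding, stride 1
    return length - kernel_size + 1

def maxpool1d_length(length, pool_size):
    # non-overlapping windows (stride equal to pool_size), 'valid' padding
    return (length - pool_size) // pool_size + 1

sequence_length = 1086
pooled = []
for kernel_size in (3, 4, 5):
    conv_len = conv1d_valid_length(sequence_length, kernel_size)
    pooled.append(maxpool1d_length(conv_len, 3))

merged_length = sum(pooled)            # concatenate(..., axis=1) adds the lengths
flattened = merged_length * 128        # 128 filters per position
dense_weights = flattened * 128 + 128  # Dense(128): weight matrix plus bias
print(pooled, merged_length, flattened, dense_weights)
```

Because almost all trainable parameters sit in that first Dense layer, the pooling size and the number of filters are natural candidates to tune alongside lr and dropout.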
So now I am facing the next big problem, and I am really running out of ideas on how to solve it: optimization of hyperparameters.
My problem is that every code example I have found so far for the optimization of hyperparameters is applied to this kind of architecture:
model = Sequential()
embedding = model.add(layers.Embedding(MAX_VOCAB_SIZE, EMBEDDING_DIM, input_length=MAX_SEQUENCE_LENGTH))
model.add(layers.Conv1D(filters=128, kernel_size=5, activation='relu'))
So my specific question is how to perform the optimization of hyperparameters, because whenever I change something in my ConvNet I get errors and, as I said, all the tutorials I can find are applied to model = Sequential().
The new error message is:
0%| | 0/100 [00:00<?, ?trial/s, best loss=?]
job exception: 'Model' object is not subscriptable
Traceback (most recent call last):
File "/Users/lukaskoston/Desktop/MasterarbeitFOMCAnalysis/07_Regression/CNN regression", line 262, in <module>
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 482, in fmin
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 686, in fmin
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 509, in fmin
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 330, in exhaust - n_done, block_until_done=self.asynchronous)
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 286, in run
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 165, in serial_evaluate
result = self.domain.evaluate(spec, ctrl)
File "/Users/lukaskoston/.local/lib/python3.7/site-packages/hyperopt/", line 894, in evaluate
rval = self.fn(pyll_rval)
File "/Users/lukaskoston/Desktop/MasterarbeitFOMCAnalysis/07_Regression/CNN regression", line 248, in train_and_score
return hist['val_loss'][-1]
TypeError: 'Model' object is not subscriptable
Thanks in advance,
You should make your hyperparameters arguments to your method.
def ConvNet(embeddings, max_sequence_length, num_words, embedding_dim, trainable=False, extra_conv=True,
            lr=1.0, dropout=0.5):
    # ...
Then you can update your code to use those values instead of the fixed values or Keras defaults.
if extra_conv == True:
    x = Dropout(dropout)(l_merge)
else:
    # Original Yoon Kim model
    x = Dropout(dropout)(pool)
I would start with those two, and leave batch size and epoch count for later. Those can have a big effect on run time, which is hard to account for in the hyperparameter optimization.
Then you can optimize with a library like hyperopt.
import numpy as np
from hyperopt import fmin, hp, tpe, space_eval, Trials
def train_and_score(args):
    # Build the model from the fixed params plus the optimization args.
    model = ConvNet(embeddings, max_sequence_length, num_words,
                    embedding_dim, trainable=False, extra_conv=True,
                    lr=args['lr'], dropout=args['dropout'])
    # Train it; fit() returns the History object (ConvNet itself returns the Model).
    # x_train, y_train, x_val, y_val are placeholder names for your own data.
    hist = model.fit(x_train, y_train, validation_data=(x_val, y_val))
    # Unpack and return the last validation loss from the history.
    return hist.history['val_loss'][-1]
# Define the space to optimize over.
space = {
    'lr': hp.loguniform('lr', np.log(0.1), np.log(10.0)),
    'dropout': hp.uniform('dropout', 0, 1),
}
# Minimize the validation loss over the space.
trials = Trials()
best = fmin(train_and_score, space, trials=trials, algo=tpe.suggest, max_evals=100)
# Print details about the best results and hyperparameters.
print(space_eval(space, best))
There are also libraries that will help you directly integrate this with Keras. A popular choice is hyperas. In that case you would modify your function to use some templates instead of parameters, but it is otherwise very similar.
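To make the shape of such a search concrete without depending on any tuning library, here is a plain random search over the same two hyperparameters. It is a swapped-in, simpler technique (random sampling, not hyperopt's TPE), the sampling ranges are arbitrary choices for the illustration, and fake_validation_loss is a made-up stand-in for building the model, fitting it, and returning its last validation loss.

```python
import math
import random

def fake_validation_loss(lr, dropout):
    # Hypothetical objective with its minimum at lr=0.01 and dropout=0.3.
    return (math.log10(lr) + 2.0) ** 2 + (dropout - 0.3) ** 2

def sample_params(rng):
    return {
        # log-uniform: sample the exponent uniformly, then exponentiate
        "lr": 10 ** rng.uniform(-4, 0),
        "dropout": rng.uniform(0.0, 0.9),
    }

def random_search(objective, n_trials, seed=0):
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = random_search(fake_validation_loss, n_trials=200)
print(best_params, best_loss)
```

The structure is the same as with hyperopt: the objective takes a dictionary of sampled values, trains with them, and returns a single number to minimize.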