I'm gradually switching from using only TensorFlow to TensorFlow + Keras. For now I'm still training with a TensorFlow optimizer, but using Dense layers from Keras. For example:
model.add(Dense(hidden_width, kernel_regularizer=regularizers.l2(0.01)))
How can I retrieve all the L2 penalties from my Dense Keras layers so that I can add them to my overall loss function?
Before I was using Keras, I used to do
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
loss = recon_loss + sum(reg_losses)
But now that I'm using Keras for my Dense layers, tf.GraphKeys.REGULARIZATION_LOSSES is empty.
All regularizer arguments are named following the same template: {kernel|bias|activity}_regularizer.
You could try to retrieve them by filtering the losses of a model:
model = Model(...)
reg_losses = [l for l in model.losses
              if 'regularizer' in l.name]
loss = recon_loss + sum(reg_losses)
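As a sanity check on what ends up in that sum, here is a minimal plain-Python sketch of the arithmetic behind an l2(0.01) kernel regularizer: the penalty is the factor times the sum of squared weights. It is not Keras code; the helper l2_penalty, the weight values, and the recon_loss number are all made up for illustration.

```python
def l2_penalty(weights, factor=0.01):
    """Return factor * sum(w ** 2) over a nested list of weights."""
    return factor * sum(w * w for row in weights for w in row)

# Hypothetical kernels of two Dense layers (values chosen for illustration).
kernel_1 = [[1.0, -2.0], [0.5, 0.0]]
kernel_2 = [[3.0], [-1.0]]

# One penalty per regularized layer, analogous to the entries of model.losses.
reg_losses = [l2_penalty(kernel_1), l2_penalty(kernel_2)]

recon_loss = 0.25  # placeholder for the reconstruction loss
loss = recon_loss + sum(reg_losses)
```

If the values collected from the model differ from a hand computation like this, the filter is probably dropping or duplicating a term.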
I am interested in building reinforcement learning models with the simplicity of the Keras API. Unfortunately, I am unable to extract the gradient of the output (not error) with respect to the weights. I found the following code that performs a similar function (Saliency maps of neural networks (using Keras))
get_output = theano.function([model.layers[0].input],model.layers[-1].output,allow_input_downcast=True)
fx = theano.function([model.layers[0].input] ,T.jacobian(model.layers[-1].output.flatten(),model.layers[0].input), allow_input_downcast=True)
grad = fx([trainingData])
Any ideas on how to calculate the gradient of the model output with respect to the weights for each layer would be appreciated.
To get the gradients of model output with respect to weights using Keras you have to use the Keras backend module. I created this simple example to illustrate exactly what to do:
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as k
model = Sequential()
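# Keras 2 renamed the Keras 1 argument `init` to `kernel_initializer`.
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))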
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
To calculate the gradients we first need to find the output tensor. For the output of the model (what my initial question asked) we simply call model.output. We can also find the gradients of outputs for other layers by calling model.layers[index].output
outputTensor = model.output #Or model.layers[index].output
Then we need to choose the variables with respect to which the gradient is taken.
listOfVariableTensors = model.trainable_weights
#or variableTensors = model.trainable_weights[0]
We can now calculate the gradients. It is as easy as the following:
gradients = k.gradients(outputTensor, listOfVariableTensors)
To actually evaluate the gradients for a given input, we need to use a bit of TensorFlow.
trainingExample = np.random.random((1,8))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())  # replaces the deprecated tf.initialize_all_variables()
evaluated_gradients = sess.run(gradients,feed_dict={model.input:trainingExample})
And that's it!
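To check values like evaluated_gradients independently, one option is a finite-difference comparison. The following is a minimal plain-Python sketch for a single sigmoid unit, not Keras code; the function names, weights, and input are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def output(weights, bias, x):
    """Output of one sigmoid unit: sigmoid(w . x + b)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def output_gradient(weights, bias, x):
    """Analytic gradient of the output w.r.t. the weights: y * (1 - y) * x."""
    y = output(weights, bias, x)
    return [y * (1.0 - y) * xi for xi in x]

def numeric_gradient(weights, bias, x, eps=1e-6):
    """Central finite differences, perturbing one weight at a time."""
    grads = []
    for i in range(len(weights)):
        up = list(weights)
        up[i] += eps
        down = list(weights)
        down[i] -= eps
        grads.append((output(up, bias, x) - output(down, bias, x)) / (2 * eps))
    return grads

# Illustrative values only.
weights, bias, x = [0.2, -0.4, 0.1], 0.05, [1.0, 2.0, 3.0]
analytic = output_gradient(weights, bias, x)
numeric = numeric_gradient(weights, bias, x)
```

Note that this differentiates the output itself, not a loss, which is the distinction the question draws.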
The answer below uses the cross-entropy loss; feel free to change it to your own function.
outputTensor = model.output
listOfVariableTensors = model.trainable_weights
bce = keras.losses.BinaryCrossentropy()
loss = bce(labels, outputTensor)  # Keras losses take (y_true, y_pred) in that order
gradients = k.gradients(loss, listOfVariableTensors)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
evaluated_gradients = sess.run(gradients,feed_dict={model.input:training_data1})
print(evaluated_gradients)
I converted the Sports_1M Caffe model to Keras and am using it as a pretrained model in my new Keras model. I also loaded the pretrained weights.
I removed the top layer of the pretrained model and concatenated it with the new model. I don't want to train the loaded pretrained model again; I just want to use its embedding to train my new Keras model.
The code looks like this:
from keras.models import model_from_json
from keras import backend as K
K.set_image_dim_ordering('th')
model = model_from_json(open('/content/sports_1M/sports1M_model_new.json', 'r').read())
model.load_weights('/content/sports_1M/sports1M_weights.h5')
My questions are:
Should I compile the pretrained model then concatenate it?
model.compile(loss='mean_squared_error', optimizer='adam')
How do I know that the pretrained model is not being trained again (which I don't want)?
How do I train the whole (concatenated) architecture?
model2 = Model(model.get_input_at(0),model.get_layer(layer_name).output)
input_shape = (3, 16, 112, 112)
encoded_l = model2(left_input)
prediction = Dense(1,activation='sigmoid')(encoded_l)
Model([left_input,right_input] , prediction)
When we use built-in pretrained models like VGG, we generally use VGG(include_top=False, weights='imagenet').
I am thinking like this for my case
I found the answer: we can simply set trainable = False on each layer.
for layer in model.layers:
    layer.trainable = False
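One way to confirm that frozen weights really stay fixed is to snapshot them before a training step and compare afterwards. The following toy sketch in plain Python (not Keras code) shows the idea with a hand-written SGD step; sgd_step, the parameter values, and the gradients are hypothetical. In Keras the analogous check would compare layer.get_weights() before and after model.fit.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Apply one SGD update, skipping parameters marked as frozen."""
    return [p - lr * g if t else p
            for p, g, t in zip(params, grads, trainable)]

pretrained = [0.5, -1.2]   # stands in for the loaded pretrained weights
new_head = [0.0]           # stands in for the new Dense layer
params = pretrained + new_head
trainable = [False, False, True]

before = list(params)      # snapshot taken before training
params = sgd_step(params, grads=[0.3, -0.7, 0.9], trainable=trainable)

frozen_unchanged = params[:2] == before[:2]
head_changed = params[2] != before[2]
```

In Keras, changes to trainable take effect when the model is compiled, so the flags should be set before calling compile on the combined model.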
I am using a bidirectional RNN in Keras and need to use TensorFlow's LazyAdamOptimizer. I need to do gradient normalization. How can I implement gradient normalization with TensorFlow's LazyAdamOptimizer and then continue to use the functional Keras model?
I am training an unsupervised RNN to predict an input sequence of length 10. The problem is that I am using a Keras functional model. Because of the sparsity of the embedding layer, I need to use TensorFlow's LazyAdamOptimizer, which is not a default optimizer in Keras. With a default Keras optimizer I can do gradient normalization just by setting the argument clipnorm=1 in the optimizer. Because I am using LazyAdam, I need to do this in TensorFlow and then pass it back to my Keras model, but I can't get the code to work.
#model architecture
model_input = Input(shape=(seq_len, ))
embedding_a = Embedding(len(port_fwd_dict), 50, input_length=seq_len, mask_zero=True)(model_input)
lstm_a = Bidirectional(GRU(25, return_sequences=True, implementation=2, reset_after=True, recurrent_activation='sigmoid'), merge_mode="concat")(embedding_a)
dropout_a = Dropout(0.2)(lstm_a)
lstm_b = Bidirectional(GRU(25, return_sequences=False, activation="relu", implementation=2, reset_after=True, recurrent_activation='sigmoid'), merge_mode="concat")(dropout_a)
dropout_b = Dropout(0.2)(lstm_b)
dense_layer = Dense(100, activation="linear")(dropout_b)
dropout_c = Dropout(0.2)(dense_layer)
model_output = Dense(len(port_fwd_dict)-1, activation="softmax")(dropout_c)
# trying to implement gradient normalization
optimizer = tf.contrib.opt.LazyAdamOptimizer()
optimizer = tf.contrib.estimator.clip_gradients_by_norm(optimizer, 1)
loss = tf.reduce_mean(categorical_crossentropy(model_input, model_output))
train_op = optimizer.minimize(loss, tf.train.get_global_step())
model = Model(inputs=model_input, outputs=model_output)
model.compile(optimizer=train_op, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size, validation_split=validation_split, class_weight = 'auto')
I get the following error: NameError: name 'categorical_crossentropy' is not defined
But even if this error is solved, I do not know whether this code will work. I need to call the Keras function model.compile, which requires a loss to be specified, but when I define the loss in the TensorFlow part above, it does not work.
I need a way to do gradient normalization and still use my normal Keras functional model.
Maybe you can try my implementation of a lazy optimizer:
https://github.com/bojone/keras_lazyoptimizer
It is a pure Keras implementation that wraps an existing optimizer to make it a lazy version.
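For reference, the gradient normalization the question asks about can be written down independently of any optimizer. Below is a minimal plain-Python sketch of clipping by global norm, following the formula documented for tf.clip_by_global_norm (each gradient is scaled by clip_norm / max(global_norm, clip_norm)). The function here is a hand-written stand-in, and the gradient values are illustrative. Whether Keras' clipnorm argument clips per tensor or globally depends on the Keras version, so treat this as the TensorFlow-side behaviour only.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients jointly so their combined L2 norm is at most clip_norm.

    Returns the scaled gradients and the global norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], global_norm

# Two hypothetical gradient tensors, flattened to lists.
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for grad in clipped for g in grad))
```

Because every gradient is multiplied by the same scale, the update direction is preserved; only its length changes. Gradients whose global norm is already below clip_norm pass through unchanged.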
I came across this code that I want to convert to Keras:
l2 = lambda_loss_amount * sum(
    tf.nn.l2_loss(tf_var) for tf_var in tf.trainable_variables()
) # L2 loss prevents this overkill neural network to overfit the data
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=pred)) + l2 # Softmax loss
How would this be written as a Keras loss function?
See the Keras documentation for a description of regularizers. Here is a toy example:
from keras import regularizers
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(lambda_loss_amount),
                bias_regularizer=regularizers.l2(lambda_loss_amount)))
You can set activation and kernel_regularizer on a Keras layer as follows:
Dense(..., activation='softmax', kernel_regularizer=regularizers.l2(0))
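One detail to watch when porting the TensorFlow snippet: tf.nn.l2_loss computes sum(t ** 2) / 2, while a Keras l2(factor) regularizer computes factor * sum(w ** 2), without the division by two. The plain-Python sketch below mirrors both formulas with hand-written helpers; the helper names, the weights, and the value of lambda_loss_amount are illustrative assumptions.

```python
def tf_style_l2_loss(values):
    """Mirror of tf.nn.l2_loss: sum(v ** 2) / 2."""
    return sum(v * v for v in values) / 2.0

def keras_style_l2(values, factor):
    """Mirror of a Keras l2(factor) regularizer: factor * sum(v ** 2)."""
    return factor * sum(v * v for v in values)

lambda_loss_amount = 0.0015  # illustrative value
weights = [0.5, -1.0, 2.0]   # hypothetical flattened weights

tf_penalty = lambda_loss_amount * tf_style_l2_loss(weights)
keras_same_factor = keras_style_l2(weights, lambda_loss_amount)
keras_half_factor = keras_style_l2(weights, lambda_loss_amount / 2.0)
```

Under these formulas, reusing lambda_loss_amount directly in Keras doubles the penalty, and passing lambda_loss_amount / 2 reproduces the original value. The TensorFlow code also sums over every trainable variable, biases included, so matching it would mean setting bias_regularizer as well as kernel_regularizer on each layer.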
I'm building an image processing network in TensorFlow and I want to use a texture loss. A texture loss seems simple to implement if you have a pretrained model loaded.
I'm using TF to build the computational graph for my model and I want to incorporate the keras.applications.VGG19 model to get the output from layer 'block4_conv4'.
The problem is: I have two TF tensors, target and result, from my main model. How do I feed them into the Keras VGG19 in the same session, compute their difference, and use it in the main loss for my model?
It seems the following code does the trick:
with tf.variable_scope("") as scope:
    phi_func = VGG19(include_top=False, weights=None, input_shape=(128, 128, 3))
    text_1 = phi_func(predicted)
    scope.reuse_variables()
    text_2 = phi_func(x)
text_loss = tf.reduce_mean((text_1 - text_2)**2)
Right after the session is created, I call phi_func.load_weights(path) to initialize the weights.