Keras: How to access step in model - tensorflow

Is there a way to access the current Keras training step as a tensor in the tensorflow graph?
I am trying to build a model which has an 'epsilon' parameter which is decayed as a function of the current training step.
epsilon = some_fn_of(K.global_step) # <- Something like this?
self.q = K.Sequential([
    K.layers.InputLayer(input_shape),
    K.layers.Dense(n, name='q'),
    K.layers.Lambda(lambda x: tf.cond(tf.random.uniform((), 0, 1) < epsilon,
                                      lambda _: tf.constant(0.0),
                                      lambda ac: ac))
], name='q')
FYI: I'm using the Tensorflow bundled Keras.

I don't know if this will work for all purposes, but it looks like you can find the next training step number using model.optimizer.iterations. The variable name appears to have the format "<optimizer name>/iter:0". You can find the iterations property in the Optimizer documentation. Example value:
<tf.Variable 'Adam/iter:0' shape=() dtype=int64, numpy=5978>
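As a rough sketch of how that counter could drive the 'epsilon' from the question: the schedule below is a hypothetical linear decay (the name epsilon_from_step and its start, end and decay_steps parameters are illustrative assumptions, not part of Keras), and it assumes TF 2.x, where optimizer.iterations is available as soon as the optimizer is created.

```python
import tensorflow as tf

def epsilon_from_step(step, start=1.0, end=0.05, decay_steps=1000):
    """Linearly decay epsilon from `start` to `end` over `decay_steps` steps."""
    # Fraction of the schedule completed so far, clipped to at most 1.
    progress = tf.minimum(tf.cast(step, tf.float32) / decay_steps, 1.0)
    return start + (end - start) * progress

# The optimizer's step counter is a variable, so it can feed the schedule directly.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
epsilon = epsilon_from_step(optimizer.iterations)
```

In eager mode the value computed here is a snapshot of the current step. To have it follow training, the call would need to sit inside the layer or function that is traced into the training graph.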

I suspect that Keras does not have any such tensor in the graph and that the only way to access the step is through Callbacks (Keras Docs, Tensorflow Docs), especially since Keras is meant to be agnostic to the backend and so would likely maintain the step outside of TensorFlow.
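For the callback route, a minimal sketch might look like the following. EpsilonDecay is a hypothetical class written for this example, with an assumed exponential decay. It keeps its own global step counter because the batch argument restarts at 0 every epoch.

```python
import numpy as np
import tensorflow as tf

class EpsilonDecay(tf.keras.callbacks.Callback):
    """Hypothetical callback that decays a non-trainable epsilon every batch."""

    def __init__(self, start=1.0, end=0.05, decay=0.9):
        super().__init__()
        self.start, self.end, self.decay = start, end, decay
        self.step = 0  # global step, counted across epochs
        self.epsilon = tf.Variable(start, trainable=False, dtype=tf.float32)

    def on_train_batch_begin(self, batch, logs=None):
        # `batch` is the index within the current epoch, so use self.step instead.
        self.epsilon.assign(max(self.end, self.start * self.decay ** self.step))
        self.step += 1

# Tiny model and all-zero data, only to show the callback being driven by fit().
model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")
callback = EpsilonDecay()
x = np.zeros((8, 3), dtype="float32")
y = np.zeros((8, 1), dtype="float32")
model.fit(x, y, batch_size=2, epochs=2, verbose=0, callbacks=[callback])
```

With 8 samples, a batch size of 2 and 2 epochs, the callback sees 8 training steps in total.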

Related

Use TensorFlow loss Global Objectives (recall_at_precision_loss) with Keras (not metrics)

Background
I have a multi-label classification problem with 5 labels (e.g. [1 0 1 1 0]). Therefore, I want my model to improve at metrics such as fixed recall, precision-recall AUC or ROC AUC.
It doesn't make sense to use a loss function (e.g. binary_crossentropy) that is not directly related to the performance measurement I want to optimize. Therefore, I want to use TensorFlow's global_objectives.recall_at_precision_loss() or similar as loss function.
Relevant GitHub:
https://github.com/tensorflow/models/tree/master/research/global_objectives
Relevant paper (Scalable Learning of Non-Decomposable Objectives): https://arxiv.org/abs/1608.04802
Not metric
I'm not looking for implementing a tf.metrics. I already succeeded in that following: https://stackoverflow.com/a/50566908/3399066
Problem
I think my issue can be divided into 2 problems:
How to use global_objectives.recall_at_precision_loss() or similar?
How to use it in a Keras model with TF backend?
Problem 1
There is a file called loss_layers_example.py on the global objectives GitHub page (same as above). However, since I don't have much experience with TF, I don't really understand how to use it. Also, Googling for TensorFlow recall_at_precision_loss example or TensorFlow Global objectives example won't give me any clearer example.
How do I use global_objectives.recall_at_precision_loss() in a simple TF example?
Problem 2
Would something like (in Keras): model.compile(loss = ??.recall_at_precision_loss, ...) be enough?
My feeling tells me it is more complex than that, due to the use of global variables used in loss_layers_example.py.
How to use loss functions similar to global_objectives.recall_at_precision_loss() in Keras?
Similar to Martino's answer, but will infer shape from input (setting it to a fixed batch size did not work for me).
The outside function isn't strictly necessary, but it feels a bit more natural to pass params as you configure the loss function, especially when your wrapper is defined in an external module.
import keras.backend as K
from global_objectives.loss_layers import precision_at_recall_loss
def get_precision_at_recall_loss(target_recall):
    def precision_at_recall_loss_wrapper(y_true, y_pred):
        y_true = K.reshape(y_true, (-1, 1))
        y_pred = K.reshape(y_pred, (-1, 1))
        return precision_at_recall_loss(y_true, y_pred, target_recall)[0]
    return precision_at_recall_loss_wrapper
Then, when compiling the model:
TARGET_RECALL = 0.9
model.compile(optimizer='adam', loss=get_precision_at_recall_loss(TARGET_RECALL))
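The same wrapper pattern can be tried without the global_objectives package. The sketch below is an assumption-laden stand-in: it swaps in tf.nn.weighted_cross_entropy_with_logits for precision_at_recall_loss, and get_weighted_loss is a hypothetical name. It only illustrates the closure over a parameter and the (-1, 1) reshape that infers the batch length.

```python
import tensorflow as tf

def get_weighted_loss(pos_weight):
    """Return a Keras-compatible loss closed over `pos_weight` (stand-in example)."""
    def weighted_loss_wrapper(y_true, y_pred):
        # Flatten to one column; -1 lets TensorFlow infer the length from the batch.
        y_true = tf.reshape(tf.cast(y_true, tf.float32), (-1, 1))
        y_pred = tf.reshape(y_pred, (-1, 1))
        per_example = tf.nn.weighted_cross_entropy_with_logits(
            labels=y_true, logits=y_pred, pos_weight=pos_weight)
        return tf.reduce_mean(per_example)
    return weighted_loss_wrapper

loss_fn = get_weighted_loss(pos_weight=1.0)
value = loss_fn(tf.constant([[1.0, 0.0], [0.0, 1.0]]), tf.zeros((2, 2)))
```

Here y_pred is treated as logits. The returned function can be passed to model.compile(loss=...) in the same way as the wrapper above.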
I managed to make it work by:
Explicitly reshaping tensors to BATCH_SIZE length (see code below)
Cutting the dataset size to a multiple of BATCH_SIZE
def precision_recall_auc_loss(y_true, y_pred):
    y_true = keras.backend.reshape(y_true, (BATCH_SIZE, 1))
    y_pred = keras.backend.reshape(y_pred, (BATCH_SIZE, 1))
    util.get_num_labels = lambda labels : 1
    return loss_layers.precision_recall_auc_loss(y_true, y_pred)[0]

`None` in Keras loss function

I have a problem with TensorFlow and Keras, which can be described as follows:
We have a model (a convolutional neural network) whose output has shape [None, 7, 7, 6], and a function 'custom_loss' that takes y_true and y_pred parameters, which I treat as having shape [7, 7, 6]. When I compile the model, I get the error message: TypeError: must be real number, not Tensor. I suppose the mistake is in how I index y_pred[k][l][m] and y_true[k][l][m], but I don't know how to fix this to account for the None in [None, 7, 7, 6]. Please help.
Update: Here is the code
def custom_loss(y_true, y_pred):
    loss = 0
    for i in range(S*S):
        k, l = i%S, i//S
        first_part = 5* sum([(y_pred[k][l][m] - y_true[k][l][m])**2 for m in range(1,3)])
        second_part = 5 * sum([(math.sqrt(y_pred[k][l][m]) - math.sqrt(y_true[k][l][m])) ** 2 for m in range(3, 5)])
        third_part = 5* sum([(y_pred[k][l][m] - y_true[k][l][m])**2 for m in [0, 5]])
        if y_true[k][l][0] > 0.5:
            loss += first_part + second_part + third_part
        else:
            loss += 0.5 * (y_pred[k][l][0] - y_true[k][l][0])**2
    return loss
In keras (and TensorFlow without eager execution) you cannot access the content of a tensor. Therefore, lines such as
loss += 0.5 * (y_pred[k][l][0] - y_true[k][l][0])**2
will fail. You can try to use the eager execution mode of TensorFlow together with keras as explained here.
In general you should always try to express these things just with built-in functions of the keras backend or with TensorFlow operations. Just try to express your loss function using matrix/vector notation and then it is easier (maybe we can also help you) to express this in keras.
When you write a loss function in Keras (with the TensorFlow backend), it is used to build the execution graph, not executed directly.
You have to use TensorFlow or Keras backend functions to define your loss function. When you compile your model, Keras (with TensorFlow as backend) builds the execution graph and therefore sends tensors through your loss function. The math package does not support tensors. It is also not possible to use if in your loss function, since it is not differentiable. Instead you could use a sigmoid function, which is very close to a step function.
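The advice in these answers could be applied to the loss from the question roughly as follows. This is a sketch, not a drop-in fix: it assumes tensors of shape [batch, S, S, 6], uses tf.where in place of the Python if (a hard selection rather than the sigmoid suggested above), and adds a small eps clamp before the square root. The clamp is an assumption, there because the gradient of sqrt is unbounded at 0.

```python
import numpy as np
import tensorflow as tf

def custom_loss_vectorized(y_true, y_pred, eps=1e-7):
    """Vectorized version of the loop-based loss; returns one value per example."""
    sq_err = tf.square(y_pred - y_true)
    # Channels 1 and 2: plain squared error.
    first_part = 5.0 * tf.reduce_sum(sq_err[..., 1:3], axis=-1)
    # Channels 3 and 4: squared error of square roots (clamped, see lead-in).
    sqrt_diff = (tf.sqrt(tf.maximum(y_pred[..., 3:5], eps))
                 - tf.sqrt(tf.maximum(y_true[..., 3:5], eps)))
    second_part = 5.0 * tf.reduce_sum(tf.square(sqrt_diff), axis=-1)
    # Channels 0 and 5: plain squared error.
    third_part = 5.0 * (sq_err[..., 0] + sq_err[..., 5])
    # tf.where replaces the Python `if`: it selects a branch per grid cell.
    per_cell = tf.where(y_true[..., 0] > 0.5,
                        first_part + second_part + third_part,
                        0.5 * sq_err[..., 0])
    return tf.reduce_sum(per_cell, axis=[1, 2])
```

The leading ... in each index keeps the batch dimension (the None in [None, 7, 7, 6]) intact, so the function never needs to know the batch size.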

Get values of tensors in loss function

I would like to get the values of the y_pred and y_true tensors in this keras backend function. I need this to be able to perform some custom calculations and change the loss; these calculations are only possible with the real array values.
def mean_squared_error(y_true, y_pred):
    #some code here
    return K.mean(K.square(y_pred - y_true), axis=-1)
Is there a way to do this in keras? Or in any other ML framework (tf, pytorch, theano)?
No, in general you can't compute the loss that way, because Keras is based on frameworks that do automatic differentiation (like Theano, TensorFlow) and they need to know which operations you are doing in between in order to compute the gradients of the loss.
You need to implement your loss computations using keras.backend functions, else there is no way to compute gradients and optimization won't be possible.
Try including this within the loss function:
y_true = keras.backend.print_tensor(y_true, message='y_true')
Following is an excerpt from the Keras documentation (https://keras.io/backend/):
print_tensor
keras.backend.print_tensor(x, message='')
Prints message and the tensor value when evaluated.
Note that print_tensor returns a new tensor identical to x which should be used in the later parts of the code. Otherwise, the print operation is not taken into account during evaluation.
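In TF 2.x the same idea can be sketched with tf.print, which prints the values whenever the loss is evaluated. This assumes the TensorFlow-bundled Keras and keeps the loss itself built from tensor operations, so gradients still flow.

```python
import tensorflow as tf

def mean_squared_error(y_true, y_pred):
    # tf.print runs each time the loss is evaluated, in eager and graph mode.
    tf.print("y_true:", y_true, "y_pred:", y_pred)
    return tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)

loss = mean_squared_error(tf.constant([[1.0, 2.0]]), tf.constant([[1.0, 4.0]]))
```

For debugging only, compiling with model.compile(..., run_eagerly=True) should also let you call .numpy() on y_true and y_pred inside the loss. Anything computed from those arrays outside TensorFlow will not contribute gradients.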

How to get weights in tf.layers.dense?

I want to plot the weights of tf.layers.dense as a TensorBoard histogram, but they are not exposed as a parameter. How can I do that?
The weights are added as a variable named kernel, so you could use
import os
x = tf.layers.dense(...)
weights = tf.get_default_graph().get_tensor_by_name(
    os.path.split(x.name)[0] + '/kernel:0')
You can obviously replace tf.get_default_graph() by any other graph you are working in.
I came across this problem and just solved it. The name of the tf.layers.dense output tensor is not necessarily the same as the prefix of the kernel's name. My tensor is "dense_2/xxx" but its kernel is "dense_1/kernel:0". To ensure that tf.get_variable works, you'd better set name=xxx in the tf.layers.dense call so that both names share the same prefix. It works as in the demo below:
l=tf.layers.dense(input_tf_xxx,300,name='ip1')
with tf.variable_scope('ip1', reuse=True):
    w = tf.get_variable('kernel')
By the way, my tf version is 1.3.
The latest tensorflow layers api creates all the variables using the tf.get_variable call. This ensures that if you wish to use the variable again, you can just use the tf.get_variable function and provide the name of the variable that you wish to obtain.
In the case of a tf.layers.dense, the variable is created as: layer_name/kernel. So, you can obtain the variable by saying:
with tf.variable_scope("layer_name", reuse=True):
    weights = tf.get_variable("kernel") # do not specify
    # the shape here or it will confuse tensorflow into creating a new one.
[Edit]: Newer versions of TensorFlow have both functional and object-oriented interfaces to the layers API. If you need the layers only for computation, the functional API is a good choice; its function names start with a lowercase letter, for instance tf.layers.dense(...). Layer objects are created with a capitalized name, e.g. tf.layers.Dense(...). Once you have a handle to the layer object, you can use all of its functionality. To obtain the weights, just use obj.trainable_weights, which returns a list of all the trainable variables found in that layer's scope.
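A sketch of the object-oriented route in TF 2.x, assuming tf.keras.layers.Dense as the counterpart of tf.layers.Dense. It also writes the kernel as a TensorBoard histogram, which is what the question asked for. The layer name 'ip1', the input size 784 and the temporary log directory are illustrative choices.

```python
import os
import tempfile
import tensorflow as tf

# Keep a handle to the layer object so its variables stay reachable.
dense = tf.keras.layers.Dense(300, name="ip1")
outputs = dense(tf.zeros([1, 784]))  # the first call creates kernel and bias

# trainable_weights lists the kernel first, then the bias.
kernel, bias = dense.trainable_weights

# Write the kernel values as a histogram into a throwaway log directory.
log_dir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(log_dir)
with writer.as_default():
    tf.summary.histogram("ip1/kernel", kernel.numpy(), step=0)
writer.flush()
```

Pointing TensorBoard at log_dir should then show the histogram under the tag used above.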
I am going crazy with tensorflow.
I run this:
sess.run(x.kernel)
after training, and I get the weights.
Comes from the properties described here.
I am saying that I am going crazy because it seems that there are a million slightly different ways to do something in tf, and that fragments the tutorials around.
Is there anything wrong with
model.get_weights()
After I create a model, compile it and run fit, this function returns the weights as a list of numpy arrays for me.
In TF 2, if you're inside a @tf.function (graph mode):
weights = optimizer.weights
If you're in eager mode (the default in TF2, except in @tf.function-decorated functions):
weights = optimizer.get_weights()
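In TF2, a layer's weights attribute returns a list of length 2 (called weights_out here):
weights_out[0] = kernel weight
weights_out[1] = bias weight
For example, to get the weights of the second layer (layers[0] is the input layer, which has no weights) in a model with layer size 50 and input size 784:
from tensorflow import keras
from tensorflow.keras import layers
inputs = keras.Input(shape=(784,), name="digits")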
x = layers.Dense(50, activation="relu", name="dense_1")(inputs)
x = layers.Dense(50, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(...)
model.fit(...)
kernel_weight = model.layers[1].weights[0]
bias_weight = model.layers[1].weights[1]
all_weight = model.layers[1].weights
print(len(all_weight)) # 2
print(kernel_weight.shape) # (784,50)
print(bias_weight.shape) # (50,)
Try looping over the layers of your sequential network to get the weights of each one, printing the name of each layer first. You can see the layer names with:
model.summary()
Then you can get the weights of each layer by running this code:
for layer in model.layers:
    print(layer.name)
    print(layer.get_weights())

Tensorflow: How can I assign numpy pre-trained weights to subsections of graph?

This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network to my sess default session as "net" using:
images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1})
with tf.Session() as sess:
    net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is -- how do you get to assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b ?? (both are now tensors)
I suggest you have a detailed look at network.py from the https://github.com/ethereon/caffe-tensorflow, especially the function load(). It would help you understand what happened when you called net.load(weight_path, session).
FYI, a TensorFlow variable can be assigned the value of a numpy array with var.assign(np_array), which is executed in the session. Here is the solution to your question:
with tf.Session() as sess:
    W_conv1_b = weight_variable([3,3,3,64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to kindly remind you of the following points:
var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you continue to run your network, whether for inference or training.
By default, 'var' should be created with the same shape as 'data'. Therefore, if you can obtain 'data' before creating 'var', I suggest you create 'var' by the method var=tf.Variable(shape=data.shape). Otherwise, you need to create 'var' by the method var=tf.Variable(validate_shape=False), which means the variable's shape is flexible. Detailed explanations can be found in TensorFlow's API docs.
I extended the same caffe-tensorflow repo to support Theano, so that I can load the model converted from Caffe in Theano. Therefore, I am reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.
You can get variable values using the eval method of the tf.Variable objects from the first network and load those values into variables of the second network using the load method (also a method of tf.Variable).
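For readers on TF 2.x, where there is no session, the assign approach described above can be sketched as below. The random arrays are stand-ins for the values that would come from vgg16.npy, and tf.nn.conv2d with stride 1 and SAME padding is an assumption about what the question's conv2d helper does.

```python
import numpy as np
import tensorflow as tf

# Stand-ins for pre-trained conv1_1 weights and biases (random values here).
pretrained_w = np.random.rand(3, 3, 3, 64).astype("float32")
pretrained_b = np.random.rand(64).astype("float32")

# Create the variables with matching shapes, then copy the values in.
W_conv1_b = tf.Variable(tf.zeros([3, 3, 3, 64]))
b_conv1_b = tf.Variable(tf.zeros([64]))
W_conv1_b.assign(pretrained_w)
b_conv1_b.assign(pretrained_b)

# Use the initialized variables in the first layer of the new network.
im_batch = tf.zeros([1, 224, 224, 3])
h_conv1_b = tf.nn.relu(
    tf.nn.conv2d(im_batch, W_conv1_b, strides=1, padding="SAME") + b_conv1_b)
```

In eager mode assign takes effect immediately, so no sess.run call is needed.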