How to evaluate the value of a tensor from inside the model function of a custom tf.estimator - tensorflow

I am implementing an NLP model based on BERT, using tf.TPUEstimator(). I want to implement layer-wise training, where only one layer of the model is trained during each epoch. To do this I wanted to change my model_fn and get the value of current_epoch.
I know how to compute the value of current_epoch as a tensor using tf.train.get_or_create_global_step() inside the model_fn, BUT I need to evaluate this tensor in order to select which layer to train and return the correct train_op to the tf.estimator (the train_op pertaining to the single layer chosen according to the value of current_epoch).
I am unable to evaluate this tensor (current_epoch / global_step) from inside the model_fn. I tried the following, but the training hangs at the step my_sess.run(global_step.initializer):
global_step = tf.train.get_or_create_global_step()
graph = tf.get_default_graph()
my_sess = tf.Session(graph=graph)
current_epoch = (global_step * full_bs) // train_size
my_sess.run(global_step.initializer)
current_epoch = my_sess.run(current_epoch)
# My program hangs at the initialising step: my_sess.run(global_step.initializer)
Is there any way to evaluate a tensor using the tf.Estimator's default session? How do I get the default session/graph?
Most importantly, what is wrong in my code, and why does the training hang when using TPUs and TPUEstimator?

This is not a direct answer to the OP's second question; it answers the title.
I managed to print a variable's value with get_variable_value, but I am not sure it is the optimal way.
With
estimator = tf.contrib.tpu.TPUEstimator(
    # ...
)
out = estimator.get_variable_value('output_bias')
print(type(out))
print(out)
I got
<class 'numpy.ndarray'>
[-0.00107745 0.00107744]
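As for the layer-wise selection in the question itself, one graph-mode possibility is to keep the comparison symbolic instead of evaluating it with a Session (which cannot work inside model_fn, since the Estimator owns the session): gate each layer's gradients by comparing the symbolic epoch against the layer's index. A rough sketch, assuming TF 1.x; layer_index_of() is a hypothetical helper mapping each variable to its layer index, and on a TPU the optimizer would normally be wrapped in tf.contrib.tpu.CrossShardOptimizer first.
import tensorflow as tf

def build_layerwise_train_op(loss, optimizer, full_bs, train_size, num_layers):
    global_step = tf.train.get_or_create_global_step()
    current_epoch = (global_step * full_bs) // train_size
    grads_and_vars = optimizer.compute_gradients(loss)
    gated = []
    for grad, var in grads_and_vars:
        if grad is None:
            gated.append((grad, var))
            continue
        layer_id = layer_index_of(var)  # hypothetical: maps a variable to its layer index
        # gate is 1.0 only for the layer scheduled in the current epoch
        gate = tf.cast(tf.equal(current_epoch % num_layers, layer_id), grad.dtype)
        gated.append((grad * gate, var))
    return optimizer.apply_gradients(gated, global_step=global_step)
This trains exactly one layer per epoch without ever pulling the epoch value out of the graph.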

Related

Keras Model - Get input in custom loss function

I am having trouble with a Keras custom loss function. I want to be able to access truth as a numpy array.
Because it is a callback function, I think I am not in eager execution, which means I can't access it using the backend.get_value() function. I also tried different methods, but it always comes back to the fact that this 'Tensor' object doesn't exist.
Do I need to create a session inside the custom loss function?
I am using Tensorflow 2.2, which is up to date.
def custom_loss(y_true, y_pred):
    # y_true is a 4D array holding the label (channel 0)
    # and an input-dependent multiplier (channel 1)
    truth = backend.get_value(y_true)
    loss = backend.square((y_pred - truth[:,:,0]) * truth[:,:,1])
    loss = backend.mean(loss, axis=-1)
    return loss

model.compile(loss=custom_loss, optimizer='Adam')
model.fit(X, np.stack([labels, X[:, 0]], axis=3), batch_size=16)
I want to be able to access truth. It has two components (a label, and a multiplier that is different for each item). I saw a solution that is input dependent, but I am not sure how to access the value: Custom loss function in Keras based on the input data
I think you can do this by enabling run_eagerly=True in model.compile as shown below.
model.compile(loss=custom_loss(weight_building, weight_space),
              optimizer=keras.optimizers.Adam(), metrics=['accuracy'],
              run_eagerly=True)
I think you also need to update custom_loss as shown below.
def custom_loss(weight_building, weight_space):
    def loss(y_true, y_pred):
        truth = backend.get_value(y_true)  # works now, thanks to run_eagerly=True
        error = backend.square(y_pred - y_true)
        mse_error = backend.mean(error, axis=-1)
        return mse_error
    return loss
I am demonstrating the idea with simple MNIST data. Please take a look at the code here.
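An alternative that avoids get_value() entirely, and therefore works in both eager and graph mode, is to slice y_true symbolically. A minimal sketch, assuming (as in the question) that the label sits in channel 0 and the multiplier in channel 1 of the last axis:
from tensorflow.keras import backend

def custom_loss(y_true, y_pred):
    label = y_true[:, :, 0]       # channel 0: the target
    multiplier = y_true[:, :, 1]  # channel 1: the per-item weight
    return backend.mean(backend.square((y_pred - label) * multiplier), axis=-1)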

Change training and testing behavior of custom Layer Keras

I am trying to change my Keras model's behavior during training and testing.
To be more precise, I want to simply evaluate the predictions during training, and add another Lambda layer (as a kind of postprocessing) for testing.
I've found a solution here, where K.function receives the specified K.learning_phase() and returns an output. As I understand it, using K.in_test_phase() or K.in_train_phase() will return either the first or the second parameter (as described in the docs), based on the training parameter passed.
I'm running on TF 2.0 as backend, so eager execution is enabled by default. That being said, passing K.learning_phase() results in an error, which has been described here. Hence, I'm using tensorflow.python.keras.backend.symbolic_learning_phase(), which seems to work.
I'm currently able to get the output with K.function(), but my aim is to call model.fit() in order to train my model, and later call model.evaluate() (I'm using Keras' functional API).
What would be the proper way to train and test my model based on the learning flag?
Currently my MWE is:
def build_model(images, training=None):
    input_layer = Input(shape=(256, 256, 3), dtype="float32", batch_size=80)
    ...
    # performing some factor disentanglement here
    angle_pred = UpSampling2D(size=(8, 8), interpolation='bilinear')(angle)
    radius_pred = UpSampling2D(size=(8, 8), interpolation='bilinear')(radius)
    angle_radius_stack = tf.stack([angle_pred, radius_pred], 0)
    hough_voting = Lambda(hough_vote)((angle_radius_stack, images))
    train_test = K.in_test_phase(hough_voting, angle_radius_stack, training=training)
    angle_v, radius_v = tf.unstack(train_test)

    model = Model(inputs=input_layer, outputs=[angle_v, radius_v])
    adam = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(optimizer=adam, loss='mean_absolute_error', metrics=['accuracy'], run_eagerly=True)
    return model
And then, for training:
def train_model(model, patches, radii, angles):
    fun = K.function([model.layers[0].input, B.symbolic_learning_phase()],
                     [model.layers[-1].output])
    print(fun([patches[0:80, :, :, :], True]))
    model.fit(patches, [radii, angles], batch_size=80, epochs=1)
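One idiomatic route in TF 2.x Keras is to branch on the training argument of a custom layer's call() instead of K.in_test_phase(): model.fit() invokes layers with training=True, while evaluate() and predict() use training=False. A minimal sketch (the class name and postprocess_fn are illustrative, not from the question above):
import tensorflow as tf

class TestTimePostprocess(tf.keras.layers.Layer):
    """Applies an extra postprocessing step only outside of training."""

    def __init__(self, postprocess_fn, **kwargs):
        super().__init__(**kwargs)
        self.postprocess_fn = postprocess_fn

    def call(self, inputs, training=None):
        if training:
            return inputs                    # raw predictions while training
        return self.postprocess_fn(inputs)   # postprocess at evaluate/predict time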

tf.gradients(model.output, model.input) computes a different value each time I run it

I'm trying to compute the gradient of the output layer with respect to the input layer. My neural network is relatively small (an input layer composed of 9 activation units and an output layer of 1), and training went fine, as the test provided very good accuracy. I made the NN model using Keras.
In order to solve my problem, I need to compute the gradient of the output with respect to the input; that is, I need to obtain the Jacobian, which has dimension [1x9]. The gradients function in tensorflow should provide everything I need, but when I run the code below I obtain a different solution every time.
output_v = model.output
input_v = model.input
gradients = tf.gradients(output_v, input_v)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(sess.run(model.input, feed_dict={model.input: x_test_N[0:1,:]}))
evaluated_gradients = sess.run(gradients, feed_dict={model.input: x_test_N[0:1,:]})
print(evaluated_gradients)
sess.close()
The first print command shows this value every time I run it (just to make sure that the input values are not modified):
[[-1.4306372 -0.1272892 0.7145787 1.338818 -1.2957293 -0.5402862 -0.7771702 -0.5787912 -0.9157122]]
But the second print shows different ones:
[[ 0.00175761, -0.0490326 , -0.05413761, 0.09952173, 0.06112418, -0.04772799, 0.06557006, -0.02473242, 0.05542536]]
[[-0.00416433, 0.08235116, -0.00930298, 0.04440641, 0.03752216, 0.06378302, 0.03508484, -0.01903783, -0.0538374 ]]
Using finite differences, evaluated_gradients[0,0] = 0.03565103, which isn't close to any of the first values previously printed.
Thanks for your time!
Alberto
Solved by creating a specific session just before training my model:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
K.set_session(sess)

history = model.fit(x_train_N, y_train_N, epochs=n_epochs,
                    validation_split=split, verbose=1, batch_size=n_batch_size,
                    shuffle='true', callbacks=[early_stop, tensorboard])
And evaluating the gradient after training, while the tf.Session is still open:
evaluated_gradients = sess.run(K.gradients(model.output, model.input), feed_dict={model.input: x_test_N})
Presumably your network is set up to initialize its weights to random values. When you run sess.run(tf.initialize_all_variables()), you are re-initializing your variables to new random values. You therefore get different values for output_v on every run, and hence different gradients. If you want to use a model you trained before, you should replace the call to initialize_all_variables() with a restore command. I am not familiar with how this is done in Keras, since I usually work directly with tensorflow, but I would try this.
Also note that initialize_all_variables is deprecated; use global_variables_initializer instead.
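The same fix can also be written without creating a session by hand: reuse the session Keras is already training in, so the variables keep their trained values instead of being re-randomized. A minimal sketch, assuming the TF 1.x Keras backend and the model and x_test_N from the question:
import tensorflow as tf
from tensorflow.keras import backend as K

sess = K.get_session()  # the session in which Keras trained the model
gradients = tf.gradients(model.output, model.input)
evaluated_gradients = sess.run(gradients,
                               feed_dict={model.input: x_test_N[0:1, :]})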

'Reuse' a single tensorflow model in different graphs?

Currently, I am working with a pretrained VGG model in the TF-Slim library. My motivation is to generate adversarial examples for a given image for this network. A summary of the task:
x = tf.placeholder(shape=(None, 32, 32, 3), dtype=tf.float32)
for i in range(2):
    logits = vgg.vgg_16(x, is_training=False, spatial_squeeze=False, fc_conv_padding='SAME')
    x = x + learning_rate * cal_gradient_of_logits_wrt_x(logits)
However, as soon as we enter the second iteration and start running logits = vgg.vgg_16(...), we get the following error:
Variable vgg_16/conv1/conv1_1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?
It is clear that this error occurs due to the replication of the graph in the second iteration. As the tf-slim model doesn't use reuse=True in its variable scopes, it throws this error (because in the second iteration we again ask it to add the VGG layers, which already exist, to the graph).
Is it possible to somehow avoid this error? It should be possible to create the graph for the VGG model once and use it whenever we need to calculate logits.
The reason this should be possible is shown by keras. In keras we can simply define the model once with,
model = vgg.VGG16(*args, **kwargs)
Later on, we can calculate logits for different tensors with,
logits_1 = model(x1)
logits_2 = model(x2)
Now in both of these calculations the same model parameters are used, i.e., no such error appears. Is there a way to achieve the same functionality with a tensorflow model?
In tfslim/models/research/slim/nets/vgg.py:
add a reuse parameter to the vgg_16 (or vgg_19) definition:
def vgg_16(inputs,
           num_classes=1000,
           is_training=True,
           dropout_keep_prob=0.5,
           spatial_squeeze=True,
           scope='vgg_16',
           fc_conv_padding='VALID',
           global_pool=False,
           reuse=None):
    ...
then set the variable scope to honor the reuse argument:
with tf.variable_scope(scope, 'vgg_16', [inputs], reuse=reuse) as sc:
    ...
Then, when you invoke the function, pass reuse=tf.AUTO_REUSE:
vgg.vgg_16(images, num_classes=dataset.num_classes,
           is_training=True, reuse=tf.AUTO_REUSE)
You can use tf.make_template(name, func) to do this: the first call to the template creates the variables, and every subsequent call reuses them.
x = tf.placeholder(shape=(None, 32, 32, 3), dtype=tf.float32)
vgg_model = tf.make_template('vgg_16', vgg.vgg_16, is_training=False,
                             spatial_squeeze=False, fc_conv_padding='SAME')
for i in range(2):
    logits = vgg_model(x)
    x += learning_rate * cal_gradient_of_logits_wrt_x(logits)

How to get weights in tf.layers.dense?

I want to draw the weights of tf.layers.dense in a TensorBoard histogram, but they do not show up in the parameters. How can I do that?
The weights are added as a variable named kernel, so you could use
x = tf.layers.dense(...)
weights = tf.get_default_graph().get_tensor_by_name(
    os.path.split(x.name)[0] + '/kernel:0')
You can obviously replace tf.get_default_graph() by any other graph you are working in.
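To then get the histogram the question actually asks for, the recovered tensor can be passed to a summary op. A minimal sketch, assuming a TF 1.x summary pipeline where the merged summary is periodically run and written with a tf.summary.FileWriter:
tf.summary.histogram('dense_kernel', weights)  # shows up in the HISTOGRAMS tab
summary_op = tf.summary.merge_all()            # sess.run this and add the result
                                               # to a tf.summary.FileWriter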
I came across this problem and just solved it. tf.layers.dense's name does not have to match the prefix of its kernel's name. My tensor was "dense_2/xxx" but its kernel was "dense_1/kernel:0". To make sure that tf.get_variable works, you'd better set name=xxx in the tf.layers.dense call so that the two names share the same prefix. It works as in the demo below:
l = tf.layers.dense(input_tf_xxx, 300, name='ip1')
with tf.variable_scope('ip1', reuse=True):
    w = tf.get_variable('kernel')
By the way, my tf version is 1.3.
The latest tensorflow layers API creates all its variables using the tf.get_variable call. This ensures that if you wish to use a variable again, you can just use the tf.get_variable function and provide the name of the variable you wish to obtain.
In the case of tf.layers.dense, the variable is created as layer_name/kernel. So, you can obtain the variable by saying:
with tf.variable_scope("layer_name", reuse=True):
    weights = tf.get_variable("kernel")  # do not specify the shape here,
                                         # or it will confuse tensorflow
                                         # into creating a new variable
[Edit]: Newer versions of Tensorflow have both functional and object-oriented interfaces to the layers API. If you need the layers only for computational purposes, the functional API is a good choice; its function names start with lowercase letters, for instance tf.layers.dense(...). Layer objects are created with capitalized names, e.g. tf.layers.Dense(...). Once you have a handle to such a layer object, you can use all of its functionality. To obtain the weights, just use obj.trainable_weights; this returns a list of all the trainable variables found in that layer's scope.
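A short sketch of the object-oriented form just described (the layer name is illustrative; input_tf_xxx is reused from the demo above):
layer = tf.layers.Dense(300, name='ip2')  # note the capital D: a layer object
y = layer(input_tf_xxx)                   # calling it builds kernel and bias
print(layer.trainable_weights)            # [kernel, bias] for this layer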
I am going crazy with tensorflow.
I run this:
sess.run(x.kernel)
after training, and I get the weights.
This comes from the properties described here.
I am saying that I am going crazy because it seems there are a million slightly different ways to do something in tf, and that fragments the tutorials.
Is there anything wrong with
model.get_weights()
After I create a model, compile it, and run fit, this function returns a numpy array of the weights for me.
In TF 2, if you're inside a @tf.function (graph mode):
weights = optimizer.weights
If you're in eager mode (the default in TF2, except inside @tf.function-decorated functions):
weights = optimizer.get_weights()
In TF2, weights outputs a list of length 2:
weights_out[0] = kernel weights
weights_out[1] = bias weights
For example, for the second layer (layers[0] is the input layer, which has no weights) of a model with layer size 50 and input size 784:
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(50, activation="relu", name="dense_1")(inputs)
x = layers.Dense(50, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(...)
model.fit(...)
kernel_weight = model.layers[1].weights[0]
bias_weight = model.layers[1].weights[1]
all_weight = model.layers[1].weights
print(len(all_weight)) # 2
print(kernel_weight.shape) # (784,50)
print(bias_weight.shape) # (50,)
Try making a loop to get the weights of each layer in your sequential network, printing the name of the layer first, which you can get from:
model.summary()
Then you can get the weights of each layer by running this code:
for layer in model.layers:
    print(layer.name)
    print(layer.get_weights())