I have a simple frozen TensorFlow model (frozen in Keras) that I load and then try to use for prediction. I do this first in Python (code below), and then using C and libtensorflow (and get the same results). The examples I have found provide the logits (before activation) as the final output, rather than the class label after activation. Is there a way to obtain the label through the graph itself?
I understand I can apply the sigmoid / softmax operators to the logits, but that's not what I want to do. (I'm porting the code to use the libtensorflow C API, and would prefer to let the graph do the math.)
My understanding is that the session runs the graph up to the requested operation / tensor and stops there. Is there a way to get the operation after the activation?
Keras Model:
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(21,)))
model.add(Dense(1, activation='sigmoid'))
Tensorflow code to load frozen model and predict:
from tensorflow.python.platform import gfile
with tf.Session() as sess:
    with gfile.FastGFile('slopemodel/slopemodel.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        sess.graph.as_default()
        g_in = tf.import_graph_def(graph_def)
    tensor_output = sess.graph.get_tensor_by_name('import/dense_2/Sigmoid:0')
    tensor_input = sess.graph.get_tensor_by_name('import/dense_1_input:0')
    predictions = sess.run(tensor_output, {tensor_input: sample})
    print(predictions)
Truncated list of important nodes in the graph:
['import/dense_1_input',
'import/dense_1/kernel',
'import/dense_1/kernel/read',
'import/dense_1/bias',
'import/dense_1/bias/read',
'import/dense_1/MatMul',
'import/dense_1/BiasAdd',
'import/dense_1/Relu',
'import/dense_2/kernel',
'import/dense_2/kernel/read',
'import/dense_2/bias',
'import/dense_2/bias/read',
'import/dense_2/MatMul',
'import/dense_2/BiasAdd',
'import/dense_2/Sigmoid',
'import/Adam/iterations',
.
.
.]
Yes, you simply need to change tensor_output to the tensor you want to fetch. Note that you will not receive the labels themselves but the activated outputs (here a sigmoid probability; for a softmax layer, a probability vector), from which you'll need to derive the corresponding label yourself.
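If you want the graph itself to produce a hard class label, one option (a minimal sketch, assuming a single sigmoid output and a 0.5 decision threshold) is to append a rounding op to the imported graph and fetch that instead:
tensor_output = sess.graph.get_tensor_by_name('import/dense_2/Sigmoid:0')
label_output = tf.round(tensor_output)  # 0/1 label, computed inside the graph
predictions = sess.run(label_output, {tensor_input: sample})
Since tf.round is added to the same default graph, a single sess.run still does all the math in the graph, which also keeps the C-side logic simple.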
I have defined my Functional model like this:
base_model = VGG16(include_top=False, input_shape=(224,224,3), pooling='avg')
inputs = tf.keras.Input(shape=(224,224,3))
x = preprocess_input(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.Dropout(0.2)(x, training=True)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
The problem is that when I call .evaluate() or .predict() I get slightly different results every time on the exact same batch (with shuffle=False in my dataset, and all random seeds initialized).
I tried reconstructing the model without some of the layers, and I found the culprit to be the two layers constructed by the line x = preprocess_input(inputs), which introduce randomness into the results:
[model summary screenshot omitted]
Note: preprocess_input is the VGG16 preprocessing function at tf.keras.applications.vgg16.preprocess_input.
However, if I redefine my Functional model as Sequential:
new_model = tf.keras.Sequential()
new_model.add(model.layers[0]) #input layer
new_model.add(tf.keras.layers.Lambda(preprocess_input))
new_model.add(model.layers[3]) #vgg16
new_model.add(model.layers[4]) #dropout
new_model.add(model.layers[5]) #dense
The problem is gone and I get consistent results from .evaluate() or .predict().
What could potentially cause the Functional model to behave like this?
EDIT
As xdurch0 pointed out, the dropout layer was at fault for the differing results: the Functional model was built with tf.keras.layers.Dropout(0.2)(x, training=True), which hard-codes training=True and therefore applies dropout even during .predict() and .evaluate(). The Sequential rebuild re-added the layer without that argument, so dropout was correctly disabled at inference.
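A minimal sketch of the fix, simply dropping the hard-coded training=True so that Keras toggles dropout automatically (active in .fit(), disabled in .predict() and .evaluate()):
inputs = tf.keras.Input(shape=(224,224,3))
x = preprocess_input(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.Dropout(0.2)(x)  # no training=True here
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)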
I am interested in building reinforcement learning models with the simplicity of the Keras API. Unfortunately, I am unable to extract the gradient of the output (not error) with respect to the weights. I found the following code that performs a similar function (Saliency maps of neural networks (using Keras))
get_output = theano.function([model.layers[0].input], model.layers[-1].output, allow_input_downcast=True)
fx = theano.function([model.layers[0].input], T.jacobian(model.layers[-1].output.flatten(), model.layers[0].input), allow_input_downcast=True)
grad = fx([trainingData])
Any ideas on how to calculate the gradient of the model output with respect to the weights for each layer would be appreciated.
To get the gradients of the model output with respect to the weights using Keras, you have to use the Keras backend module. I created this simple example to illustrate exactly what to do:
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as k
import keras  # for keras.losses, used further below
import numpy as np
import tensorflow as tf
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
To calculate the gradients we first need to find the output tensor. For the output of the model (what my initial question asked about), we simply call model.output. We can also find the gradients of the outputs of other layers by calling model.layers[index].output.
outputTensor = model.output #Or model.layers[index].output
Then we need to choose the variables with respect to which the gradient is taken.
listOfVariableTensors = model.trainable_weights
#or variableTensors = model.trainable_weights[0]
We can now calculate the gradients. It is as easy as the following:
gradients = k.gradients(outputTensor, listOfVariableTensors)
To actually run the gradients given an input, we need to use a bit of TensorFlow:
trainingExample = np.random.random((1,8))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: trainingExample})
And that's it!
The answer below uses the binary cross-entropy loss; feel free to swap in your own loss function.
outputTensor = model.output
listOfVariableTensors = model.trainable_weights
bce = keras.losses.BinaryCrossentropy()
loss = bce(labels, outputTensor)  # y_true first, then y_pred
gradients = k.gradients(loss, listOfVariableTensors)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
evaluated_gradients = sess.run(gradients, feed_dict={model.input: training_data1})
print(evaluated_gradients)
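For reference, in TensorFlow 2 the same gradients can be computed eagerly with tf.GradientTape; a minimal sketch, assuming the model above was built with tf.keras:
import numpy as np
import tensorflow as tf
trainingExample = np.random.random((1, 8)).astype(np.float32)
with tf.GradientTape() as tape:
    output = model(trainingExample)
# gradients of the model output with respect to every trainable weight
evaluated_gradients = tape.gradient(output, model.trainable_weights)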
I'm learning TensorFlow 2 by working through the text classification with TF Hub tutorial. It uses an embedding module from TF Hub. I was wondering if I could modify the model to include an LSTM layer. Here's what I've tried:
train_data, validation_data, test_data = tfds.load(
    name="imdb_reviews",
    split=('train[:60%]', 'train[60%:]', 'test'),
    as_supervised=True)
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[],
                           dtype=tf.string, trainable=True)
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Embedding(10000, 50))
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_data.shuffle(10000).batch(512),
                    epochs=10,
                    validation_data=validation_data.batch(512),
                    verbose=1)
results = model.evaluate(test_data.batch(512), verbose=2)
for name, value in zip(model.metrics_names, results):
    print("%s: %.3f" % (name, value))
I don't know how to get the vocabulary size from the hub_layer, so I just put 10000 there. When I run it, it throws this exception:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[480,1] = -6 is not in [0, 10000)
[[node sequential/embedding/embedding_lookup (defined at .../learning/tensorflow/text_classify.py:36) ]] [Op:__inference_train_function_36284]
Errors may have originated from an input operation.
Input Source operations connected to node sequential/embedding/embedding_lookup:
sequential/embedding/embedding_lookup/34017 (defined at Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py:112)
Function call stack:
train_function
I'm stuck here. My questions are:
How should I use the embedding module from TF Hub to feed an LSTM layer? It looks like the embedding lookup has some issues with the setting.
How do I get the vocabulary size from the hub layer?
Thanks
Finally figured out how to link pre-trained embeddings to an LSTM or other layers. Just posting the steps here in case anyone finds them helpful.
The embedding layer has to be the first layer in the model (hub_layer plays the same role as an Embedding layer). The not-very-intuitive part is that any text input to the hub layer will be converted to a single vector of shape [embedding_dim]. You need to do sentence splitting and tokenization to make sure whatever you input to the model is a sequence in the form of an array of arrays, e.g. "Let us prepare the data." should be converted to [["let"],["us"],["prepare"],["the"],["data"]]. You will also need to pad the sequences if you are using batch mode.
In addition, you will need to convert your target tokens to ints if your training labels are strings. The input to the model is an array of strings with shape [batch, seq_length]; the hub embedding layer converts it to [batch, seq_length, embed_dim]. (If you add an LSTM or other RNN layer, the output of that layer is [batch, seq_length, rnn_units].) The output dense layer will output the index of a token instead of the actual text. The token indices are stored in the downloaded TF Hub directory as "tokens.txt"; you can load the file and convert text to the corresponding index. Otherwise you cannot compute the loss. A sketch of this wiring follows.
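As an illustration, here is a heavily hedged sketch of that wiring, assuming tokenized and padded string input of a fixed length (seq_len is hypothetical, as is the reshape approach for applying the sentence-level hub module per token):
import tensorflow as tf
import tensorflow_hub as hub
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, dtype=tf.string, trainable=True)
seq_len = 50  # hypothetical fixed length after padding
tokens = tf.keras.Input(shape=(seq_len,), dtype=tf.string)  # [batch, seq_len] token strings
flat = tf.keras.layers.Lambda(lambda t: tf.reshape(t, [-1]))(tokens)  # [batch * seq_len]
embedded = hub_layer(flat)  # [batch * seq_len, 20]; this module emits 20-dim vectors
seqs = tf.keras.layers.Lambda(lambda t: tf.reshape(t, [-1, seq_len, 20]))(embedded)  # [batch, seq_len, 20]
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64))(seqs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(tokens, outputs)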
It seems that setting model.trainable = False in tensorflow Keras does nothing except print a misleading model.summary(). Here is the code to reproduce the issue:
import tensorflow as tf
import numpy as np
IMG_SHAPE = (160, 160, 3)
# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
# for layer in base_model.layers:
#     layer.trainable = False
bc = []  # before compile
ac = []  # after compile
for layer in base_model.layers:
    bc.append(layer.trainable)
print(np.all(bc))  # True
print(base_model.summary())  # shows no trainable parameters, but that is wrong given the output of the previous np.all(bc)
base_model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
for layer in base_model.layers:
    ac.append(layer.trainable)
print(np.all(ac))  # True
print(base_model.summary())  # again shows no trainable parameters, but that is wrong given the output of the previous np.all(ac)
In light of this: what is the expected behavior and purpose of model.trainable = False in tensorflow Keras?
https://github.com/tensorflow/tensorflow/issues/29535
I think this issue could help.
If you are looking for a way to avoid updating some weights in your model, I would suggest using the var_list parameter of your optimizer's minimize function.
For some reason, when creating a model from Keras, TensorFlow switches all tf.Variables to trainable=True, and since they are all tensors we are not able to update the value to False.
What I do in my code is create scope names for all pretrained models, then loop over the trainable variables and collect all layers that are not from my pretrained model:
trainable_variables = []
variables_collection = tf.get_collection('learnable_variables')
for layer in tf.trainable_variables():
    if 'vgg_model' not in layer.name:
        trainable_variables.append(layer)
        tf.add_to_collection('learnable_variables', layer)
grad = tf.train.GradientDescentOptimizer(lr)
train_step = grad.minimize(tf.reduce_sum([loss]), var_list=trainable_variables)
Watch out for the global initializer as well, since it will overwrite your pretrained weights. You can solve that by using tf.variables_initializer and passing only the list of variables you want to initialize:
sess.run(tf.variables_initializer(variables_collection))
Sources I used when trying to solve this problem:
Is it possible to make a trainable variable not trainable?
TensorFlow: Using tf.global_variables_initializer() after partially loading pre-trained weights
I have a trained convolutional neural network A that outputs the probability that a given picture contains a square or a circle.
Another network B takes images of random noise. My idea is to have a bunch of convolutional layers so that the output is a newly generated square.
As an error function, I would like to feed the generated image into A and learn the filters of B from the softmax tensor of A. To my understanding this is a sort of generative adversarial network, except that A does not learn. While trying to implement this I have encountered two problems.
I have imported the layers of A that I want to use in B as follows:
with gfile.FastGFile("shape-classifier.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
image_input_layer, extern_softmax_tensor = tf.import_graph_def(
graph_def, name="", return_elements=["image_input", "Softmax"])
I would like to avoid calling sess.run() three times (generating the random image, getting the softmax values from A, adjusting the weights of B).
Is there a way to connect the tensors directly so that I only have one graph?
Calling:
logits = extern_softmax_tensor(my_generated_image_tensor)
throws:
TypeError: 'Operation' object is not callable
The "Graph-Connected" and the "Feed-Connected" approach confuse me a bit.
logits = extern_softmax_tensor(my_generated_image_tensor)  # however you would call it
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=label_input,
                                                        logits=logits)
cross_entropy_mean = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
learning_step = optimizer.minimize(cross_entropy_mean)
With that logic, the error will first be passed back through A.
Is there a way to use the softmax calculated by A to directly adjust Layers of B?
Leaving aside whether my idea actually works, is it actually possible to build this in TensorFlow? I hope I have made my problems clear.
Thank you very much
Yes, it is possible to connect two models like that.
# Generator network
my_generated_image_tensor = ...
# Read classifier
with gfile.FastGFile("shape-classifier.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# Map generator output to classifier input
extern_softmax_tensor, = tf.import_graph_def(
    graph_def, name="", input_map={"image_input": my_generated_image_tensor},
    return_elements=["Softmax:0"])
# Define loss and training step
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
    labels=label_input, logits=extern_softmax_tensor)
cross_entropy_mean = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
learning_step = optimizer.minimize(cross_entropy_mean)
As side notes: 1) tf.nn.softmax_cross_entropy_with_logits expects logits as input, that is, the values before they are passed through the softmax function; so if the Softmax tensor in the loaded model is the output of a softmax operation, you should probably change the code to use the input logits instead. 2) Note that tf.nn.softmax_cross_entropy_with_logits is now deprecated anyway; see tf.nn.softmax_cross_entropy_with_logits_v2.
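For example, a sketch of side note 1 (the node name dense_2/BiasAdd is hypothetical; inspect the loaded graph_def for the op that actually feeds the softmax):
# Fetch the pre-softmax logits instead of the softmax output.
logits_tensor, = tf.import_graph_def(
    graph_def, name="", input_map={"image_input": my_generated_image_tensor},
    return_elements=["dense_2/BiasAdd:0"])  # hypothetical logits node name
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
    labels=label_input, logits=logits_tensor)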