TensorFlow: How can I assign pre-trained numpy weights to subsections of a graph?

This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network to my sess default session as "net" using:
images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1})
with tf.Session() as sess:
    net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is: how do I assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)

I suggest you take a detailed look at network.py in https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).
FYI, a TensorFlow variable can be assigned a numpy array with var.assign(np_array), which has to be run in a session. Here is a solution to your question:
with tf.Session() as sess:
    W_conv1_b = weight_variable([3,3,3,64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to kindly remind you of the following points:
var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you go on to run your network, whether for inference or training.
By default, the 'var' should be created with the same shape as the 'data'. Therefore, if you can obtain the 'data' before creating the 'var', I suggest you create the 'var' with an initial value of that shape, e.g. var=tf.Variable(tf.zeros(data.shape)). Otherwise, you need to create the 'var' with validate_shape=False, e.g. var=tf.Variable(initial_value, validate_shape=False), which leaves the variable's shape flexible. Detailed explanations can be found in the TensorFlow API docs.
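A minimal sketch of the two points above, assuming TensorFlow is importable through tensorflow.compat.v1. The random arrays are stand-ins for the conv1_1 weights and biases that would normally come from vgg16.npy, and the variable names are only illustrative:

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Stand-ins for the pre-trained conv1_1 parameters (assumption: in practice
# these arrays would be read from vgg16.npy).
pretrained_w = np.random.rand(3, 3, 3, 64).astype(np.float32)
pretrained_b = np.random.rand(64).astype(np.float32)

graph = tf.Graph()
with graph.as_default():
    # Create each variable with the same shape as the array it will receive.
    w = tf.Variable(tf.zeros(pretrained_w.shape), name="W_conv1_b")
    b = tf.Variable(tf.zeros(pretrained_b.shape), name="b_conv1_b")
    assign_ops = [w.assign(pretrained_w), b.assign(pretrained_b)]
    init_op = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)     # initialize first ...
    sess.run(assign_ops)  # ... then overwrite with the pre-trained values
    # Any inference or training that should see these values runs in this session.
    w_value, b_value = sess.run([w, b])
```

The assign ops are run after the initializer on purpose: running the initializer afterwards would reset the variables to zeros.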
I extended the same caffe-tensorflow repo to support Theano, so that I can load a model converted from Caffe in Theano. I am therefore reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.

You can read the variable values of the first network with the eval method of tf.Variable and load those values into the variables of the second network with the load method (also a method of tf.Variable).
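A rough sketch of that idea with two separate graphs, assuming tensorflow.compat.v1 is available. build_network is a hypothetical stand-in for the two real networks, and the write side uses an assign op fed through a placeholder, which has the same effect as the load method mentioned above:

```python
import tensorflow.compat.v1 as tf

def build_network(initial_value):
    """Hypothetical stand-in network: one variable plus an assign op."""
    graph = tf.Graph()
    with graph.as_default():
        var = tf.Variable(tf.fill([2, 3], initial_value), name="weights")
        new_value = tf.placeholder(tf.float32, shape=[2, 3], name="new_value")
        assign_op = var.assign(new_value)
        init_op = tf.global_variables_initializer()
    return graph, var, new_value, assign_op, init_op

graph_a, var_a, _, _, init_a = build_network(1.5)
graph_b, var_b, new_value_b, assign_b, init_b = build_network(0.0)

# Read the values out of the first network as a numpy array.
with graph_a.as_default(), tf.Session() as sess_a:
    sess_a.run(init_a)
    values = var_a.eval(session=sess_a)

# Write them into the matching variable of the second network.
with graph_b.as_default(), tf.Session() as sess_b:
    sess_b.run(init_b)
    sess_b.run(assign_b, feed_dict={new_value_b: values})
    copied = var_b.eval(session=sess_b)
```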

Related

How to extract Tensorflow trained weights from graph.pbtxt to raw data

I have trained a custom neural network with the function:
tf.estimator.train_and_evaluate
After correct training, it contains the following files:
checkpoint
events.out.tfevents.1538489166.ti
model.ckpt-0.data-00000-of-00002
model.ckpt-0.index
model.ckpt-10.data-00000-of-00002
model.ckpt-10.index
eval
graph.pbtxt
model.ckpt-0.data-00001-of-00002
model.ckpt-0.meta
model.ckpt-10.data-00001-of-00002
model.ckpt-10.meta
Now I need to export the weights and biases of every layer into a raw data structure, e.g. a numpy array.
I have read multiple pages on TensorFlow and on other topics, but could not find an answer to this question. My first guess would be to combine the files into graph.pb with freeze.py, as suggested here:
Tensorflow: How to convert .meta, .data and .index model files into one graph.pb file
But that would still leave the main question unsolved.
If you wish to evaluate tensors alone, you can check out this question. But if you wish to e.g. deploy your network, you can take a look at TensorFlow serving, which is probably the most performant one right now. Or if you want to export this network to other frameworks and use them there, you can actually use ONNX for this purpose.
If saving weights and biases in a numpy array is your strict requirement, you can follow this example:
# In a TF shell, define all requirements and call the model function
y = model(x, is_training=False, reuse=tf.AUTO_REUSE) # For example
Once you call this function, you can see all the variables in the graph by running
tf.global_variables()
You need to restore all these variables from the latest checkpoint (say ckpt_dir) and then execute each of these variables to get the latest values.
checkpoint = tf.train.latest_checkpoint('./model_dir/')
fine_tune = tf.contrib.slim.assign_from_checkpoint_fn(checkpoint,
                                                      tf.global_variables(),
                                                      ignore_missing_vars=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
fine_tune(sess)  # apply the checkpoint values; without this call gv holds only the initial values
gv = sess.run(tf.global_variables())
Now gv is a list of the values of all your variables (weights and biases). You can access any individual component by indexing, e.g. gv[5], or you can convert the whole list into an array and save it with numpy.
np.save('my_weights', np.array(gv))
This will save all your weights and biases in your current working directory as a numpy array - 'my_weights.npy'.
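One caveat, offered as a sketch and not as part of the original answer: the entries of gv usually have different shapes, and recent NumPy versions may refuse to build a single array from them unless dtype=object is given. Saving each array under its own key with np.savez avoids that. Here names and gv are small stand-ins for [v.name for v in tf.global_variables()] and the values fetched above, and the key sanitizing is just one possible convention:

```python
import os
import tempfile
import numpy as np

# Stand-ins for the variable names and the values returned by sess.run(...).
names = ["dense/kernel:0", "dense/bias:0"]
gv = [np.ones((4, 3), dtype=np.float32), np.zeros(3, dtype=np.float32)]

# Build archive keys without '/' and ':' (an illustrative convention).
weights = {
    name.replace("/", "__").replace(":", "_"): value
    for name, value in zip(names, gv)
}

path = os.path.join(tempfile.mkdtemp(), "my_weights.npz")
np.savez(path, **weights)  # one entry per variable, shapes preserved

with np.load(path) as restored:
    kernel = restored["dense__kernel_0"]
    bias = restored["dense__bias_0"]
```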
Hope this helps.

'Reuse' a single tensorflow model in different graphs?

Currently, I am working with a pretrained VGG model from the TF-Slim library. My motivation is to generate adversarial examples for a given image for this network. A summary of the task is:
x= tf.placeholder(shape=(None, 32, 32,3), dtype=tf.float32)
for i in range(2):
    logits= vgg.vgg_16(x, is_training=False, spatial_squeeze=False, fc_conv_padding='SAME')
    x = x + learning_rate*cal_gradient_of_logits_wrt_x(logits)
However, as soon as we enter the second iteration and run logits= vgg.vgg_16(....) again, we get the following error:
Variable vgg_16/conv1/conv1_1/weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?
It is clear that this error occurred due to replication of graph in the second iteration. As the tf-slim model doesn't use reuse=True in the scopes, it throws this error (because in the second iteration we again ask it to add the vgg layers in the graph, which already exist).
Is it possible to somehow avoid this error? It should be possible to create the graph for VGG model once and use it whenever we need to calculate logits.
The reason this should be possible is the examples from Keras. In Keras we can simply define the model once with
model= vgg.VGG16(*args, **kwargs)
Later on, we can calculate logits for different tensors with
logits_1= model(x1)
logits_2= model(x2)
In both of these calculations the same model parameters are used, i.e. no such error appears. Is there a way to achieve the same functionality with a TensorFlow model?
In tfslim/models/research/slim/nets/vgg.py:
add a reuse parameter to the vgg_16 or vgg_19 definition:
def vgg_16(inputs,
           num_classes=1000,
           is_training=True,
           dropout_keep_prob=0.5,
           spatial_squeeze=True,
           scope='vgg_16',
           fc_conv_padding='VALID',
           global_pool=False,
           reuse=None):
    ...
Then pass reuse on to the variable scope, so that existing variables are reused when requested:
with tf.variable_scope(scope, 'vgg_16', [inputs], reuse=reuse) as sc:
    ...
Then, when you invoke the function, pass reuse=tf.AUTO_REUSE:
vgg.vgg_16(images, num_classes=dataset.num_classes,
           is_training=True, reuse=tf.AUTO_REUSE)
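To illustrate what the reuse argument changes, here is a small sketch, assuming tensorflow.compat.v1 is available. tiny_net is a hypothetical one-layer stand-in for vgg_16, not the slim implementation:

```python
import tensorflow.compat.v1 as tf

def tiny_net(inputs, scope="tiny_net", reuse=None):
    """Hypothetical stand-in for vgg_16: a single weight matrix."""
    with tf.variable_scope(scope, reuse=reuse):
        w = tf.get_variable("weights", shape=[3, 2],
                            initializer=tf.zeros_initializer())
        return tf.matmul(inputs, w)

graph = tf.Graph()
with graph.as_default():
    x1 = tf.placeholder(tf.float32, [None, 3])
    x2 = tf.placeholder(tf.float32, [None, 3])
    # With AUTO_REUSE the first call creates the variable and the second
    # call picks up the existing one instead of raising "already exists".
    logits_1 = tiny_net(x1, reuse=tf.AUTO_REUSE)
    logits_2 = tiny_net(x2, reuse=tf.AUTO_REUSE)
    variable_names = [v.name for v in tf.global_variables()]
```

After both calls the graph holds a single variable, tiny_net/weights:0, shared by logits_1 and logits_2.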
You can use tf.make_template(name, func) to do this.
x= tf.placeholder(shape=(None, 32, 32,3), dtype=tf.float32)
# The first argument is the template's scope name; 'vgg' here is only illustrative.
vgg_model = tf.make_template('vgg', vgg.vgg_16, is_training=False, spatial_squeeze=False, fc_conv_padding='SAME')
for i in range(2):
    logits = vgg_model(x)
    x += learning_rate*cal_gradient_of_logits_wrt_x(logits)
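A self-contained sketch of the same template mechanism, assuming tensorflow.compat.v1 is available. The scale function and the name shared_scale are hypothetical stand-ins for vgg.vgg_16 and its scope:

```python
import tensorflow.compat.v1 as tf

def scale(inputs, factor_init):
    """Hypothetical stand-in for a model function that creates a variable."""
    factor = tf.get_variable("factor", shape=[],
                             initializer=tf.constant_initializer(factor_init))
    return inputs * factor

graph = tf.Graph()
with graph.as_default():
    # The template creates its variables on the first call and reuses them later.
    shared_scale = tf.make_template("shared_scale", scale, factor_init=2.0)
    x = tf.placeholder(tf.float32, [None])
    once = shared_scale(x)
    twice = shared_scale(once)  # second call, same variable
    variable_names = [v.name for v in tf.global_variables()]
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(twice, feed_dict={x: [1.0, 3.0]})
```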

How do I reuse a trained model in DNN?

Hi everyone!
I have a question related to reusing a trained model (TensorFlow).
I have trained a model.
I want to predict on new data using the trained model.
I use DNNClassifier.
I have model.ckpt-200000.meta, model.ckpt-200000.index, checkpoint, and an eval folder,
but I don't know how to reuse this model.
Please help me.
First, you need to import your graph:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    new_saver = tf.train.import_meta_graph('model.ckpt-200000.meta')
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))
Then, still inside the with block, you can feed input to the graph and get the output:
    graph = tf.get_default_graph()
    input_tensor = graph.get_tensor_by_name("input:0")  # input tensor by name
    feed_dict = {input_tensor: input_data}  # input_data: your input batch (hypothetical name)
    # Now, access the output operation.
    op_to_restore = graph.get_tensor_by_name("y_:0")  # output tensor
    print(sess.run(op_to_restore, feed_dict))  # get output here
A few things to note:
You can replace the training part of your code with the code above
(i.e. you can get the output without training).
However, you still have to construct your graph as before and
only replace the training part.
The restore call above only loads the weights into an existing graph. Therefore, the graph has to be constructed first (here it is imported from the .meta file).
A good tutorial on this can be found here, http://cv-tricks.com/tensorflow-tutorial/save-restore-tensorflow-models-quick-complete-tutorial/
If you don't want to construct the graph again you can follow this tutorial, https://blog.metaflow.fr/tensorflow-how-to-freeze-a-model-and-serve-it-with-a-python-api-d4f3596b3adc
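For reference, a round trip of the whole procedure can be sketched as follows, assuming tensorflow.compat.v1 is available. The tiny graph, the temporary directory and the tensor names input and y_ are illustrative stand-ins for the real DNNClassifier checkpoint, whose tensor names will differ:

```python
import os
import tempfile
import numpy as np
import tensorflow.compat.v1 as tf

ckpt_dir = tempfile.mkdtemp()
ckpt_prefix = os.path.join(ckpt_dir, "model.ckpt")

# "Training" side: build a tiny graph with named tensors and save it.
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [None, 2], name="input")
    w = tf.Variable([[1.0], [2.0]], name="w")
    y = tf.identity(tf.matmul(x, w), name="y_")
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_prefix)  # writes .meta, .index, .data-* and checkpoint

# "Inference" side: rebuild the graph from the .meta file and restore the weights.
with tf.Graph().as_default() as graph:
    with tf.Session() as sess:
        new_saver = tf.train.import_meta_graph(ckpt_prefix + ".meta")
        new_saver.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
        input_tensor = graph.get_tensor_by_name("input:0")
        output_tensor = graph.get_tensor_by_name("y_:0")
        batch = np.array([[3.0, 4.0]], dtype=np.float32)
        prediction = sess.run(output_tensor, feed_dict={input_tensor: batch})
```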

How to get weights in tf.layers.dense?

I want to plot the weights of tf.layers.dense as a TensorBoard histogram, but they do not show up among the parameters. How can I do that?
The weights are added as a variable named kernel, so you could use
import os
x = tf.layers.dense(...)
weights = tf.get_default_graph().get_tensor_by_name(
    os.path.split(x.name)[0] + '/kernel:0')
You can of course replace tf.get_default_graph() with any other graph you are working in.
I came across this problem and just solved it. The name of the tf.layers.dense output does not necessarily share a prefix with the kernel's name. My tensor is "dense_2/xxx" but its kernel is "dense_1/kernel:0". To make sure that tf.get_variable works, you had better set name=xxx in the tf.layers.dense call so that both names share the same prefix. It works as in the demo below:
l=tf.layers.dense(input_tf_xxx,300,name='ip1')
with tf.variable_scope('ip1', reuse=True):
    w = tf.get_variable('kernel')
By the way, my tf version is 1.3.
The latest tensorflow layers api creates all the variables using the tf.get_variable call. This ensures that if you wish to use the variable again, you can just use the tf.get_variable function and provide the name of the variable that you wish to obtain.
In the case of a tf.layers.dense, the variable is created as: layer_name/kernel. So, you can obtain the variable by saying:
with tf.variable_scope("layer_name", reuse=True):
    weights = tf.get_variable("kernel")  # do not specify the shape here,
    # or it will confuse tensorflow into creating a new one.
[Edit]: The new version of TensorFlow now has both functional and object-oriented interfaces to the layers API. If you need the layers only for computational purposes, then the functional API is a good choice. Its function names start with a lowercase letter, for instance tf.layers.dense(...). Layer objects are created with a capitalized name, e.g. tf.layers.Dense(...). Once you have a handle to such a layer object, you can use all of its functionality. To obtain the weights, just use obj.trainable_weights; this returns a list of all the trainable variables found in that layer's scope.
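A short sketch of the object-oriented route, under the assumption that a TF 2 installation is used, where the layer class lives at tf.keras.layers.Dense instead of tf.layers.Dense. The layer name and sizes are illustrative:

```python
import tensorflow as tf

# Keep a handle to the layer object instead of only its output tensor.
layer = tf.keras.layers.Dense(300, name="ip1")
outputs = layer(tf.zeros((2, 8)))  # the first call builds kernel and bias

# For a Dense layer, trainable_weights lists the kernel first, then the bias.
kernel, bias = layer.trainable_weights
kernel_shape = tuple(kernel.shape)
bias_shape = tuple(bias.shape)
```

The variables only exist after the layer has been called (or built) once, so trainable_weights is empty before that.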
I am going crazy with tensorflow.
I run this:
sess.run(x.kernel)
after training, and I get the weights.
Comes from the properties described here.
I am saying that I am going crazy because it seems that there are a million slightly different ways to do something in tf, and that fragments the tutorials around.
Is there anything wrong with
model.get_weights()
After I create a model, compile it and run fit, this function returns a numpy array of the weights for me.
In TF 2, if you're inside a @tf.function (graph mode):
weights = optimizer.weights
If you're in eager mode (the default in TF 2, except in @tf.function-decorated functions):
weights = optimizer.get_weights()
In TF 2, the weights of a layer come back as a list of length 2:
weights_out[0] = kernel weight
weights_out[1] = bias weight
For example, the weights of the second layer (layer[0] is the input layer, which has no weights) in a model with layer size 50 and input size 784:
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(50, activation="relu", name="dense_1")(inputs)
x = layers.Dense(50, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(...)
model.fit(...)
kernel_weight = model.layers[1].weights[0]
bias_weight = model.layers[1].weights[1]
all_weight = model.layers[1].weights
print(len(all_weight)) # 2
print(kernel_weight.shape) # (784,50)
print(bias_weight.shape) # (50,)
Try looping over the layers of your sequential network to get the weights of each layer, printing the layer's name first. The layer names are also listed by:
model.summary()
Then you can get the weights of each layer by running this code:
for layer in model.layers:
    print(layer.name)
    print(layer.get_weights())
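The same loop can also collect the arrays instead of printing them. This is a sketch assuming a TF 2 installation with tf.keras; the small model mirrors the layer sizes used earlier and is left untrained, so the values are only the initial weights:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(784,), name="digits")
x = tf.keras.layers.Dense(50, activation="relu", name="dense_1")(inputs)
outputs = tf.keras.layers.Dense(10, activation="softmax", name="predictions")(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Map each layer name to its list of numpy arrays (empty for the input layer).
weights_by_layer = {layer.name: layer.get_weights() for layer in model.layers}
shapes = {name: [w.shape for w in arrays]
          for name, arrays in weights_by_layer.items()}
```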

How to get the output of a maxpool layer in a pre-trained model in TensorFlow?

I have a model that I trained. I wish to extract from the model the output of an intermediate maxpool layer.
I tried the following
saver = tf.train.import_meta_graph(BASE_DIR + LOG_DIR + '/model.ckpt.meta')
saver.restore(sess,tf.train.latest_checkpoint(BASE_DIR + LOG_DIR))
sess.run("maxpool/maxpool",feed_dict=feed_dict)
Here, feed_dict is a dictionary that contains the placeholders and their contents for this run.
I keep getting the following error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_1_1' with dtype float and shape...
What can be the cause of this? I generated all of the placeholders and put them in the feed dictionary.
I ran into a similar issue and it was frustrating. What got me around it was filling out the name field for every variable and operation that I wanted to call later. You may also need to add your maxpool/maxpool op to a collection with tf.add_to_collection('name_for_maxpool_op', maxpool_op_handle). You can then restore the ops and named tensors with:
# Restore from metagraph.
saver = tf.train.import_meta_graph(...)
sess = tf.Session()
saver.restore(sess, ...)
graph = sess.graph
# Restore your ops and tensors.
maxpool_op = tf.get_collection('name_for_maxpool_op')[0] # returns a list, you want the first element
a_tensor = graph.get_tensor_by_name('tensor_name:0') # need the :0 added to your name
Then you would build your feed_dict using your restored tensors. More information can be found here. Also, as you mentioned in your comment, you need to pass the op itself to sess.run, not its name:
sess.run(maxpool_op, feed_dict=feed_dict)
You can access your tensors and ops from a restored metagraph even if you did not name them (to avoid retraining the model with new fancy tensor names, for instance), but it can be a bit of a pain. The names given to the tensors automatically are not always the most transparent. You can list the names of all variables in your graph with:
print([v.name for v in tf.global_variables()])  # tf.all_variables() is the older, deprecated name
You can hopefully find the name that you are looking for there and then restore that tensor using graph.get_tensor_by_name as described above.
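The lookups described above can be tried on a toy graph. This is a sketch assuming tensorflow.compat.v1 is available; the names images, maxpool_out and maxpool_output are illustrative. Note that it lists operation names via graph.get_operations(), which also covers placeholders and pooling ops that a list of variables would not show:

```python
import numpy as np
import tensorflow.compat.v1 as tf

graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, [None, 4, 4, 1], name="images")
    pooled = tf.nn.max_pool2d(images, ksize=2, strides=2, padding="VALID")
    # Give the output an explicit name and also register it in a collection.
    pooled = tf.identity(pooled, name="maxpool_out")
    tf.add_to_collection("maxpool_output", pooled)

# Every op in the graph, named or auto-named.
op_names = [op.name for op in graph.get_operations()]

# Two ways to get the tensor back: by name (with the :0 suffix) or by collection.
by_name = graph.get_tensor_by_name("maxpool_out:0")
by_collection = graph.get_collection("maxpool_output")[0]

with tf.Session(graph=graph) as sess:
    batch = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
    result = sess.run(by_name, feed_dict={images: batch})
```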