How to plot weights of fully connected layer? - tensorflow

I am using a fully connected network with 4 input nodes and 2 output nodes. I store the weights of my network after training it completely. Suppose this is my weight matrix:
`W = np.array([[0.8,0.02],[0.5,0.4],[0.3,0.2],[0.1,0.7]])`
I want to visualize which weights each class has learned. How can I do that? The code I found for this uses plt.imshow. Should I simply call plt.imshow(W) to visualize the weights?

You should use TensorBoard for this. Also, you should not need to store the weights manually, as they are stored by TensorFlow. You can access them in a couple of different ways, such as with tf.trainable_variables(), or tape.watched_variables() in eager mode. Then it's just a matter of looping through the variables to find the weights you want.
To plot your weights in TensorBoard, check out this: https://www.tensorflow.org/api_docs/python/tf/contrib/summary
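For a matrix as small as the one in the question, a direct heatmap is also an option. The following is a minimal sketch, assuming matplotlib is installed; the axis labels, the colormap, and the `weights.png` file name are illustrative choices, not part of the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Weight matrix from the question: 4 input nodes (rows) x 2 output nodes (columns).
W = np.array([[0.8, 0.02], [0.5, 0.4], [0.3, 0.2], [0.1, 0.7]])

fig, ax = plt.subplots()
im = ax.imshow(W, cmap="viridis", aspect="auto")  # one cell per weight
ax.set_xlabel("output node (class)")
ax.set_ylabel("input node")
ax.set_xticks(range(W.shape[1]))
ax.set_yticks(range(W.shape[0]))
fig.colorbar(im, ax=ax, label="weight")
fig.savefig("weights.png")  # hypothetical output file name
```

Each column of the image then shows the weights feeding one output node, so the two classes can be compared side by side.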

Related

Training Tensorflow only one object

Following the corresponding TensorFlow documentation, I trained a model on 3 objects and got a result (it can recognize these objects). When I show it other objects (not one of the 3), it doesn't work correctly.
I want to train on only one object (for example, a cup) and recognize only this object. Is this possible with TensorFlow?
Your question doesn't provide enough details, but my guess is that you trained the network with a softmax activation and a Categorical or SparseCategorical cross-entropy loss. If my guess is right, such a network always assigns its prediction to one of the three classes, regardless of the actual data, i.e. there is no "none of them" option.
To train a network to recognize only one class of objects, give it a single output with one channel and a sigmoid activation. Use the BinaryCrossEntropy loss to train your model for the specific object. Provide a dataset that includes examples both with and without this object.
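The difference between the two output types can be sketched in plain NumPy. This is only an illustration: the logits and the 0.5 decision threshold are assumed values, not taken from the question.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical logits for an image that shows none of the trained objects.
logits_3class = np.array([-2.0, -3.0, -2.5])
probs = softmax(logits_3class)
predicted_class = int(np.argmax(probs))  # softmax sums to 1, so some class is always picked

# A single sigmoid output can instead report "object absent".
logit_cup = -2.0              # hypothetical logit of a one-output "cup" detector
p_cup = sigmoid(logit_cup)
is_cup = p_cup >= 0.5         # 0.5 is an assumed decision threshold
```

With the softmax head, even uniformly low logits produce a winning class; with the sigmoid head, a low logit simply yields a probability below the threshold.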

Why does TensorBoard show a single value as many in the histogram visualization?

I was trying to visualize the output of all the activation functions in each layer of my fully connected network, and I was surprised when I checked the last layer. It is a very simple regression model, so the output layer has one neuron. I know it would be better to visualize it as a scalar, but I was just trying to visualize it using histograms, and I noticed that the value is somehow split. Does this make any sense? I would rather expect that it cannot be visualized at all, or that the histogram would consist of a single point.

Merge weights of same model trained on 2 different computers using tensorflow

I was doing some research on training deep neural networks using TensorFlow. I know how to train a model. My problem is that I have to train the same model on 2 different computers with different datasets, then save the model weights. Later I have to merge the 2 model weight files somehow. I have no idea how to merge them. Is there a function that does this, or should the weights be averaged?
Any help on this problem would be useful
Thanks in advance
There is no generally sound way to merge weights from independently trained networks: if you average or otherwise combine them, the result will usually not mean anything. What you could do instead is combine predictions, but for that the training classes have to be the same.
This is not a programming limitation but a theoretical one.
It is better to merge weight updates (gradients) during the training and keep a common set of weights rather than trying to merge the weights after individual trainings have completed. Both individually trained networks may find a different optimum and e.g. averaging the weights may give a network which performs worse on both datasets.
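One reason averaging can fail is that two networks may compute the same function with their hidden units in a different order. The sketch below uses a hypothetical two-layer ReLU network with hand-picked weights to show this; it is an illustration of the effect, not a claim about any particular trained model.

```python
import numpy as np

def forward(W1, W2, x):
    """Two-layer network: linear, ReLU, linear."""
    return W2 @ np.maximum(W1 @ x, 0.0)

# Network A (hypothetical weights chosen for illustration).
W1_a = np.array([[1.0, 0.0], [0.0, 1.0]])
W2_a = np.array([[1.0, -1.0]])

# Network B: the same function with its two hidden units swapped.
perm = [1, 0]
W1_b = W1_a[perm, :]
W2_b = W2_a[:, perm]

# Naive merge: element-wise average of the two weight sets.
W1_avg = (W1_a + W1_b) / 2
W2_avg = (W2_a + W2_b) / 2

x = np.array([1.0, 0.0])
out_a = forward(W1_a, W2_a, x)        # output of network A
out_b = forward(W1_b, W2_b, x)        # identical to network A
out_avg = forward(W1_avg, W2_avg, x)  # differs from both
```

Here A and B agree on every input, yet the averaged output weights cancel each other out, so the merged network no longer reproduces either one.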
There are two things you can do:
Look at 'data parallel training': distributing forward and backward passes of the training process over multiple compute nodes each of which has a subset of the entire data.
In this case typically:
each node propagates a minibatch forward through the network
each node propagates the loss gradient backwards through the network
a 'master node' collects gradients from minibatches on all nodes and updates the weights correspondingly
and distributes the weight updates back to the compute nodes to make sure each of them has the same set of weights
(there are variants of the above to avoid that compute nodes idle too long waiting for results from others). The above assumes that Tensorflow processes running on the compute nodes can communicate with each other during the training.
Look at https://www.tensorflow.org/deploy/distributed for more details and an example of how to train networks over multiple nodes.
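The data-parallel steps described above can be simulated in a single process. This is a minimal NumPy sketch under stated assumptions: a linear model with squared error stands in for the network, each simulated node uses its full shard as one batch, and the learning rate and data are made up. It is not TensorFlow's distributed API.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of mean squared error for a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Each simulated node holds its own shard of the data.
shards = []
for _ in range(2):
    X = rng.normal(size=(64, 2))
    shards.append((X, X @ true_w))

w = np.zeros(2)        # the single shared set of weights
learning_rate = 0.1    # assumed value

for step in range(200):
    # Every node computes a gradient on its shard using the same weights.
    grads = [gradient(w, X, y) for X, y in shards]
    # The master averages the gradients and applies one update;
    # all nodes then continue from the updated weights.
    w = w - learning_rate * np.mean(grads, axis=0)
```

Because every node always starts from the same weights, there is only ever one model, and nothing needs to be merged afterwards.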
If you really have to train the networks separately, look at ensembling, see e.g. this page: https://mlwave.com/kaggle-ensembling-guide/ . In a nutshell, you would train individual networks on their own machines and then e.g. use an average or maximum over the outputs of both networks as a combined classifier / predictor.
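Combining outputs rather than weights can be sketched as follows. The probability arrays are hypothetical stand-ins for the predictions of two separately trained networks on the same inputs and the same classes.

```python
import numpy as np

# Hypothetical class probabilities from two separately trained networks
# for the same 3 inputs and the same 2 classes.
probs_a = np.array([[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]])
probs_b = np.array([[0.7, 0.3], [0.2, 0.8], [0.30, 0.70]])

# Average the outputs, not the weights.
probs_avg = (probs_a + probs_b) / 2
labels_avg = probs_avg.argmax(axis=1)

# Alternative: take the element-wise maximum over the two outputs.
probs_max = np.maximum(probs_a, probs_b)
labels_max = probs_max.argmax(axis=1)
```

Both networks have to be kept and evaluated at prediction time, which is the price of this approach compared with a single merged model.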

Changing a trained network to keep only a subset of its output

Suppose I have a trained TensorFlow classification network for 20 classes as in PASCAL VOC 2007: aeroplane, bicycle, ..., car, cat, ..., person, ..., tvmonitor.
Now, I would like to have a sub-network for only a subset of the classes, e.g., 3 classes: car, cat, person.
Then, I can use this network for testing or for re-training/fine-tuning on a new dataset, only for the 3 classes.
It should be possible to extract this sub-network out of the original network, since it is only the last layer that will change. We need to discard the neurons/weights for the discarded classes.
My question: Is there an easy way to do this in TensorFlow?
It will be great if you can point to some sample code or similar solution.
I have googled, but have not come across any mention of this.
The symmetric problem, expanding the number of classes without discarding the original weights, can potentially be useful for some people, but my current focus is the one above.
If you want to only keep the output for a few slices, you could simply extract the corresponding slices from the last layer.
For example, let's assume the last layer is fully connected. Its weights are a tensor of size num_previous x num_output.
You want to keep only a few of these outputs, say outputs 1, 22, and 42. You can get the weights of your new fully connected layer as:
outputs_to_keep = [1, 22, 42]
new_W = tf.transpose(tf.gather(tf.transpose(old_W), outputs_to_keep))
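The same selection can be checked in NumPy, where picking columns directly is equivalent to the transpose, gather, transpose sequence. In this sketch the layer sizes, the kept indices, and the bias vector `old_b` are assumptions for illustration; the answer above only mentions the weights.

```python
import numpy as np

num_previous, num_output = 5, 20   # hypothetical layer sizes
rng = np.random.default_rng(0)
old_W = rng.normal(size=(num_previous, num_output))
old_b = rng.normal(size=num_output)  # assumed bias vector, one entry per output

outputs_to_keep = [6, 7, 14]  # illustrative column indices

# Selecting columns directly is equivalent to transpose -> gather rows -> transpose.
new_W = old_W[:, outputs_to_keep]
new_b = old_b[outputs_to_keep]

# The reduced layer reproduces the original logits for the kept outputs.
x = rng.normal(size=num_previous)
old_logits = x @ old_W + old_b
new_logits = x @ new_W + new_b
```

If the layer has a bias, it needs the same slicing as the weights, otherwise the kept logits would shift.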
It is possible to extract a pretrained subnet as you said. It is called transfer learning. There are different ways to do it; here is one:
Find the layer you want to start with. You can use TensorBoard to find it and then use graph.get_tensor_by_name(). Usually you keep the convolutional layers and discard the fully connected ones.
Connect your new layers (normally fully connected ones) to the previous layer.
Freeze the variables (weights) of the pretrained layers using trainable=False. Alternatively, you can instruct the optimizer to update only the weights from the new layers.
Train your model with the new classes.
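The freezing step in the list above can be illustrated without TensorFlow. This is a NumPy stand-in under stated assumptions: `W_base` plays the role of the pretrained layers, `W_head` is the new fully connected layer, and the data, sizes, and step size are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" layer (hypothetical weights), kept frozen.
W_base = rng.normal(size=(4, 8))
W_base_before = W_base.copy()

# New fully connected head for 3 classes, trained from scratch.
W_head = np.zeros((8, 3))

X = rng.normal(size=(32, 4))       # toy inputs
y = rng.integers(0, 3, size=32)    # toy labels
Y = np.eye(3)[y]                   # one-hot targets

def loss_and_grad(W_head):
    features = np.maximum(X @ W_base, 0.0)       # frozen feature extractor
    logits = features @ W_head
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.mean(np.log(probs[np.arange(len(y)), y]))
    grad = features.T @ (probs - Y) / len(y)     # gradient w.r.t. the head only
    return loss, grad

initial_loss, _ = loss_and_grad(W_head)
for _ in range(200):
    _, grad = loss_and_grad(W_head)
    W_head -= 0.01 * grad   # only the new layer is updated
final_loss, _ = loss_and_grad(W_head)
```

No gradient is ever computed for `W_base`, which is the effect that trainable=False, or restricting the optimizer to the new variables, has in TensorFlow.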

pruning tensorflow connections and weights (using cifar10 cnn)

I'm using tensorflow to run a cnn for image classification.
I use the TensorFlow cifar10 CNN implementation (tensorflow cifar10).
I want to decrease the number of connections, meaning I want to prune the low-weight connections.
How can I create a new graph (subgraph) without some of the neurons?
TensorFlow does not allow you to lock/freeze a particular kernel of a particular layer, as far as I have found. The only way I've found to do this is to use the tf.assign() function, as shown in
How to freeze/lock weights of one Tensorflow variable (e.g., one CNN kernel of one layer
It's fairly cave-man but I've seen no other solution that works. Essentially, you have to .assign() the values every so often as you iterate through the data. Since this approach is so inelegant and brute-force, it's very slow. I do the .assign() every 100 batches.
Someone please post a better solution and soon!
The cifar10 model you point to, and for that matter, most models written in TensorFlow, do not model the weights (and hence, connections) of individual neurons directly in the computation graph. For instance, for fully connected layers, all the connections between the two layers, say, with M neurons in the layer below, and 'N' neurons in the layer above, are modeled by one MxN weight matrix. If you wanted to completely remove a neuron and all of its outgoing connections from the layer below, you can simply slice out a (M-1)xN matrix by removing the relevant row, and multiply it with the corresponding M-1 activations of the neurons.
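Removing one neuron by slicing can be sketched in NumPy as follows. The layer sizes and the removed index are hypothetical; the check at the end shows that slicing gives the same output as keeping the full matrix and silencing that neuron.

```python
import numpy as np

M, N = 4, 3   # hypothetical layer sizes
rng = np.random.default_rng(0)
W = rng.normal(size=(M, N))         # weights between the two layers
activations = rng.normal(size=M)    # outputs of the M neurons in the layer below

neuron_to_remove = 2                # illustrative index

# Drop the neuron's row from the weight matrix and its activation.
W_pruned = np.delete(W, neuron_to_remove, axis=0)               # shape (M-1, N)
activations_pruned = np.delete(activations, neuron_to_remove)   # shape (M-1,)

out_pruned = activations_pruned @ W_pruned

# Same result as keeping the full matrix but silencing that neuron.
silenced = activations.copy()
silenced[neuron_to_remove] = 0.0
out_silenced = silenced @ W
```

Unlike zeroing, slicing actually shrinks the matrix, so the pruned layer also needs less memory and computation.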
Another way is to add an additional mask to control the connections.
The first step involves adding mask and threshold variables to the
layers that need to undergo pruning. The variable mask is the same
shape as the layer's weight tensor and determines which of the weights
participate in the forward execution of the graph.
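The mask idea quoted above can be sketched in NumPy. This uses a simple fixed magnitude threshold, which is an assumption made for illustration; it is not the code of the contrib pruning library, and the weights and input are made up.

```python
import numpy as np

# Hypothetical layer weights: 4 inputs x 2 outputs.
W = np.array([[0.8, 0.02], [0.5, 0.4], [0.3, 0.2], [0.1, 0.7]])
threshold = 0.25  # assumed magnitude threshold

# The mask has the same shape as the weight tensor: 1 keeps a weight, 0 prunes it.
mask = (np.abs(W) >= threshold).astype(W.dtype)

# Only unmasked weights take part in the forward pass.
masked_W = W * mask
x = np.array([1.0, 2.0, 3.0, 4.0])
out = x @ masked_W

sparsity = 1.0 - mask.mean()  # fraction of weights that were pruned
```

The weight tensor keeps its original shape here, so the graph does not change; the pruned connections are only zeroed out.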
There is a pruning implementation under tensorflow/contrib/model_pruning that you can use to prune the model. Hopefully this helps you prune the model quickly.
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/model_pruning
I think google has an updated answer here : https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/model_pruning
Removing pruning nodes from the trained graph:
$ bazel build -c opt contrib/model_pruning:strip_pruning_vars
$ bazel-bin/contrib/model_pruning/strip_pruning_vars --checkpoint_path=/tmp/cifar10_train --output_node_names=softmax_linear/softmax_linear_2 --filename=cifar_pruned.pb
I suppose that cifar_pruned.pb will be smaller, since the pruned "or zero masked" variables are removed.