Feature visualization in TensorFlow or Keras is easy and can be found here: https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/ or Convolutional Neural Network visualization - weights or activations?
How do I do this in PyTorch?
I am using PyTorch with a pretrained resnet18 model. All I need is to input an image and get the activation of a specific layer (e.g. layer2.0.conv2). layer2.0.conv2 is specified in the pretrained model.
In simple words: how do I convert the code from the first link to PyTorch? How do I get a specific layer of the PyTorch resnet18, and how do I get the activation for an input image?
I tried this in TensorFlow and it worked, but not in PyTorch.
You would have to register PyTorch's hooks on a specific layer. See this tutorial for an intro to hooks.
Basically, a hook lets you capture the input/output of the forward/backward pass through a torch.nn.Module. The whole thing can be a bit complicated, so there is a library with a similar goal to yours (disclaimer: I'm the author), called torchfunc. In particular, torchfunc.hooks.recorders allows you to do what you want; see the code snippet and comments below:
import torch
import torchvision
import torchfunc

my_network = torchvision.models.resnet18(pretrained=True)
# Recorder saving inputs to all submodules
recorder = torchfunc.hooks.recorders.ForwardPre()
# Will register a hook for all submodules of resnet18
# You could specify some submodules by index or by layer type, see docs
recorder.modules(my_network)
# Push an example image through the network
my_network(torch.randn(1, 3, 224, 224))
You could register the recorder only for some layers (submodules) specified by index or layer type. To get the necessary info, run:
# The first (index 0) input recorded before going into the submodule at index 3
recorder.data[3][0]
# You can see all submodules and their positions by running this:
for i, submodule in enumerate(my_network.modules()):
    print(i, submodule)
# Or you can just print the network to get this info
print(my_network)
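If you would rather not add a dependency, the same thing can be done with a plain PyTorch forward hook. A minimal sketch (the dictionary key "layer2.0.conv2" is just a name I chose):

import torch
import torchvision

# Pretrained network and a dict to hold the captured activations
model = torchvision.models.resnet18(pretrained=True)
model.eval()
activations = {}

def save_activation(name):
    # Returns a hook that stores the module's output under `name`
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# layer2[0].conv2 is the layer2.0.conv2 submodule from the question
model.layer2[0].conv2.register_forward_hook(save_activation("layer2.0.conv2"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

print(activations["layer2.0.conv2"].shape)  # torch.Size([1, 128, 28, 28])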
I am trying to run some ML on my ESP32, and I want to use TensorFlow Lite Micro. But I don't really understand how they build up the layers. Here is the example of how to train the person detection model:
Person detection model training
It is clear, but at the end they say:
MobileNet v1 is a stack of 14 of these depthwise separable convolution layers with an average pool, then a fully-connected layer followed by a softmax at the end.
If I check the sample code where they build up the TF Lite Micro model, it only has these few lines:
static tflite::MicroMutableOpResolver<3> micro_op_resolver;
micro_op_resolver.AddAveragePool2D();
micro_op_resolver.AddConv2D();
micro_op_resolver.AddDepthwiseConv2D();
There is the average pool and the depthwise layer, but where does the Conv2D layer come from? And only one depthwise layer is registered, yet according to the documentation there are 14 depthwise layers in the model.
So the question is: is there any relation between the trained model and the model I should build in TensorFlow Lite Micro? If there is, how can I determine how to build it up? And if there is no relation, how should I build up the model?
They don't explicitly build the model; they rely on a model file that contains the architecture (source):
model = tflite::GetModel(g_person_detect_model_data);
Here g_person_detect_model_data.cc is generated from a tflite model (containing the architecture) with the following command (see "Converting into a C source file" in the README):
# Install xxd if it is not available
!apt-get -qq install xxd
# Save the file as a C source file
!xxd -i vww_96_grayscale_quantized.tflite > person_detect_model_data.cc
So the code you shared doesn't build the model. What you see is that, for performance reasons, they explicitly add the ops needed by the model instead of relying on the more general tflite::AllOpsResolver.
This is indicated by the comment above the code you shared:
// Pull in only the operation implementations we need.
// This relies on a complete list of all the ops needed by this graph.
// An easier approach is to just use the AllOpsResolver, but this will
// incur some penalty in code space for op implementations that are not
// needed by this graph.
//
// tflite::AllOpsResolver resolver;
// NOLINTNEXTLINE(runtime-global-variables)
static tflite::MicroMutableOpResolver<5> micro_op_resolver;
micro_op_resolver.AddAveragePool2D();
micro_op_resolver.AddConv2D();
micro_op_resolver.AddDepthwiseConv2D();
micro_op_resolver.AddReshape();
micro_op_resolver.AddSoftmax();
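For context, here is a minimal, hypothetical sketch of where such a .tflite file comes from when the model is trained with Keras (the tutorial itself trains with TF-Slim and converts a frozen graph, so the tiny stand-in model below is only an illustration of the idea that the architecture lives in the model file rather than in the C++ code):

import tensorflow as tf

# Tiny stand-in model; in the real tutorial the architecture (the 14 depthwise-separable
# blocks, average pool, fully-connected layer and softmax) comes from the trained network.
trained_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(96, 96, 1)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(trained_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # request quantization
tflite_model = converter.convert()

with open("person_detect.tflite", "wb") as f:
    f.write(tflite_model)
# This .tflite file is what gets turned into a C array with:
#   xxd -i person_detect.tflite > person_detect_model_data.cc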
I have heard that it is possible to use the pretrained Universal Sentence Encoder (USE) (a neural language model) from TF Hub as part of a trainable model, e.g. a sentence classifier. Some versions of USE rely on the SentencePiece sub-word tokenizer, which I also need. There are minimal instructions online for how to do this.
Here is how to use USE-lite with SentencePiece:
- https://tfhub.dev/google/universal-sentence-encoder-lite/2
Here is how to train a classifier based on a pretrained USE model:
- http://hunterheidenreich.com/blog/google-universal-sentence-encoder-in-keras/
- https://www.youtube.com/watch?v=gnz1CUzb5qo
And here is how to measure sentence similarity using both USE-lite and SentencePiece:
- https://github.com/tensorflow/hub/blob/master/examples/colab/semantic_similarity_with_tf_hub_universal_encoder_lite.ipynb
I have successfully reproduced the above pieces separately. I have then tried to combine the above ideas into a single POC that builds a classifier model based on USE-lite and SentencePiece, but I cannot see how to do it. I am currently stuck on the part where I modify the trainable classifier's first layer(s). I have tried to make it accept either (1) SentencePiece token IDs (in which case I tokenize the text outside of the TensorFlow graph) or (2) raw text (using SentencePiece as an op inside the TensorFlow graph). After that point, it should feed the tokenized text forward into the USE-lite model, either in a lambda or in some other way. Finally, the output of USE-lite should be fed into a dense layer (or two?) ending in softmax for computing class probabilities.
I am relatively new to TensorFlow. I imagine that the above sources would be sufficient for a more experienced TensorFlow developer to merge and make work for my use case. Let me know if you can provide any pointers. Thanks.
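For what it's worth, here is a minimal sketch of the tail end described above (a dense layer ending in softmax) trained on pre-computed USE-lite embeddings; the 512-dimensional embeddings and the 3 classes are placeholder assumptions, and wiring USE-lite itself into the graph is exactly the part I am stuck on:

import numpy as np
import tensorflow as tf

# Placeholder stand-ins for real USE-lite outputs (512-dim) and class labels (3 classes)
use_embeddings = np.random.rand(100, 512).astype("float32")
labels = np.random.randint(0, 3, size=(100,))

# Classifier head only: dense layer(s) ending in softmax over class probabilities
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(512,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(use_embeddings, labels, epochs=5)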
Is there a way to load a pretrained model in TensorFlow and remove the top layers of the network? I am looking at TensorFlow release r1.10.
The only documentation I could find is with tf.keras.Sequential.pop
https://www.tensorflow.org/versions/r1.10/api_docs/python/tf/keras/Sequential#pop
I want to manually prune a pretrained network by removing a bunch of top convolution layers and adding a custom fully convolutional layer.
EDIT:
The model is ssd_mobilenet_v1_coco downloaded from Tensorflow Model Zoo. I have access to both the frozen_inference_graph.pb model file and checkpoint file.
I do not have access to the Python code that was used to construct the model.
Thanks.
From inspecting the code, SSDMobileNetV1FeatureExtractor.extract_features delegates to research/slim's nets:
from nets import mobilenet_v1  # nets will have to be on your PYTHONPATH

with tf.variable_scope('MobilenetV1',
                       reuse=self._reuse_weights) as scope:
  with slim.arg_scope(
      mobilenet_v1.mobilenet_v1_arg_scope(
          is_training=None, regularize_depthwise=True)):
    with (slim.arg_scope(self._conv_hyperparams_fn())
          if self._override_base_feature_extractor_hyperparams
          else context_manager.IdentityContextManager()):
      _, image_features = mobilenet_v1.mobilenet_v1_base(
          ops.pad_to_multiple(preprocessed_inputs, self._pad_to_multiple),
          final_endpoint='Conv2d_13_pointwise',
          min_depth=self._min_depth,
          depth_multiplier=self._depth_multiplier,
          use_explicit_padding=self._use_explicit_padding,
          scope=scope)
The mobilenet_v1_base function takes a final_endpoint argument. Rather than prune the constructed graph, just construct the graph up until the endpoint you want.
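For example, a rough sketch along those lines (assuming research/slim's nets is on your PYTHONPATH; the 'Conv2d_11_pointwise' endpoint and the extra layer are just illustrative choices):

import tensorflow as tf
import tensorflow.contrib.slim as slim
from nets import mobilenet_v1

images = tf.placeholder(tf.float32, [None, 300, 300, 3])

with tf.variable_scope('MobilenetV1'):
    with slim.arg_scope(mobilenet_v1.mobilenet_v1_arg_scope()):
        # Stop building at an earlier endpoint instead of pruning afterwards
        net, end_points = mobilenet_v1.mobilenet_v1_base(
            images, final_endpoint='Conv2d_11_pointwise')

# Add your own fully convolutional head on top of the truncated base
custom_head = slim.conv2d(net, 256, [3, 3], scope='CustomConv')

# Note: variable names must line up with the checkpoint you want to restore from;
# inspect the checkpoint with tf.train.list_variables to check the scoping.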
Many thanks for the support!
I currently use TF-Slim, and TF Hub seems like a very useful addition for transfer learning. However, the following things are not clear from the documentation:
1. Is preprocessing done implicitly? Is this based on the "trainable=True/False" parameter in the module constructor?
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1", trainable=True)
When I use TF-Slim, I use the preprocessing method:
inception_preprocessing.preprocess_image(image, img_height, img_width, is_training)
2. How do I get access to AuxLogits for an Inception model? It seems to be missing:
import tensorflow_hub as hub
import tensorflow as tf
img = tf.random_uniform([10,299,299,3])
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1", trainable=True)
outputs = module(dict(images=img), signature="image_feature_vector", as_dict=True)
The output is
dict_keys(['InceptionV3/Mixed_6b', 'InceptionV3/MaxPool_5a_3x3', 'InceptionV3/Mixed_6c', 'InceptionV3/Mixed_6d', 'InceptionV3/Mixed_6e', 'InceptionV3/Mixed_7a', 'InceptionV3/Mixed_7b', 'InceptionV3/Conv2d_2a_3x3', 'InceptionV3/Mixed_7c', 'InceptionV3/Conv2d_4a_3x3', 'InceptionV3/Conv2d_1a_3x3', 'InceptionV3/global_pool', 'InceptionV3/MaxPool_3a_3x3', 'InceptionV3/Conv2d_2b_3x3', 'InceptionV3/Conv2d_3b_1x1', 'default', 'InceptionV3/Mixed_5b', 'InceptionV3/Mixed_5c', 'InceptionV3/Mixed_5d', 'InceptionV3/Mixed_6a'])
These are excellent questions; let me try to give good answers also for readers less familiar with TF-Slim.
1. Preprocessing is not done by the module, because it is a lot about your data, and not so much about the CNN architecture within the module. The module only handles transforming input values from the canonical [0,1] range into whatever the pre-trained CNN within the module expects.
Lengthy rationale: Preprocessing of images for CNN training usually consists of decoding the input JPEG (or whatever), selecting a (reasonably large) random crop from it, random photometric and geometric transformations (distort colors, flip left/right, etc.), and resizing to the common image size for a batch of training inputs. The TensorFlow Hub modules that implement https://tensorflow.org/hub/common_signatures/images leave all of that to your code around the module.
The primary reason is that the suitable random transformations depend a lot on your training task, but not on the architecture or trained state weights of the module. For example, color distortions will help if you classify cars vs dogs, but probably not for ripe vs unripe bananas, and so on.
Also, a batch of images that have been decoded but not yet cropped/resized are hard to represent as a single tensor (unless you make it a 1-D tensor of encoded strings, but that brings other problems, such as breaking backprop into module inputs for advanced uses).
Bottom line: The Python code using the module needs to do image preprocessing (except scaling values), for example, as in https://github.com/tensorflow/hub/blob/master/examples/image_retraining/retrain.py
The slim preprocessing methods conflate the dataset-specific random transformations (tuned for Imagenet!) with the re-scaling to the architecture's value range (which the Hub module does for you). That means they are not directly applicable here.
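As a rough sketch of what that surrounding preprocessing code might look like (the augmentations shown are arbitrary examples, and encoded_jpegs is an assumed string tensor of undecoded images):

import tensorflow as tf
import tensorflow_hub as hub

module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
height, width = hub.get_expected_image_size(module)  # 299 x 299 for Inception V3

def preprocess(encoded_jpeg):
    image = tf.image.decode_jpeg(encoded_jpeg, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)  # scale values to [0, 1]
    image = tf.image.random_flip_left_right(image)           # task-specific augmentation
    image = tf.image.resize_images(image, [height + 16, width + 16])
    image = tf.random_crop(image, [height, width, 3])        # random crop to the module's size
    return image

encoded_jpegs = tf.placeholder(tf.string, [None])            # batch of undecoded JPEGs
images = tf.map_fn(preprocess, encoded_jpegs, dtype=tf.float32)
features = module(images)  # the module only rescales [0, 1] to its internal value range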
2. Indeed, auxiliary heads are missing from the initial set of modules published under tfhub.dev/google/..., but I expect them to work fine for re-training anyway.
More details: Not all architectures have auxiliary heads, and even the original Inception paper says their effect was "relatively minor" [Szegedy et al. 2015, §5]. Using an image feature vector module for a custom classification task would burden the module consumer code with checking for aux features and, if found, putting aux logits and a loss term on top.
This complication did not seem to pull its weight, but more experiments might refute that assessment. (Please share in a GitHub issue if you know of any.)
For now, the only way to put an aux head onto https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1 is to copy and paste some lines from https://github.com/tensorflow/models/blob/master/research/slim/nets/inception_v3.py (search for "Auxiliary head logits") and apply that to the "InceptionV3/Mixed_6e" output that you saw.
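For illustration, a rough sketch of what those copied-and-adapted lines could look like (it follows the shapes of the slim implementation, but num_classes is your own, and the exact hyperparameters and initializers should be taken from inception_v3.py):

import tensorflow as tf
import tensorflow.contrib.slim as slim

def aux_head(mixed_6e, num_classes):
    # mixed_6e: the outputs["InceptionV3/Mixed_6e"] tensor, shape [batch, 17, 17, 768]
    with tf.variable_scope('AuxLogits'):
        net = slim.avg_pool2d(mixed_6e, [5, 5], stride=3, padding='VALID')
        net = slim.conv2d(net, 128, [1, 1])
        net = slim.conv2d(net, 768, [5, 5], padding='VALID')
        logits = slim.conv2d(net, num_classes, [1, 1],
                             activation_fn=None, normalizer_fn=None)
        return tf.squeeze(logits, [1, 2], name='SpatialSqueeze')  # [batch, num_classes]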
3. You didn't ask, but: for training, the module's documentation recommends passing hub.Module(..., tags={"train"}), or else batch norm operates in inference mode (and so would dropout, if the module had any).
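For example, a minimal sketch of that pattern (is_training is a Python bool you control yourself):

import tensorflow_hub as hub

is_training = True  # set according to the graph you are building (training vs. eval)
module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1",
                    trainable=True,
                    tags={"train"} if is_training else None)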
Hope this explains how and why things are.
Arno (from the TensorFlow Hub developers)
Is it possible in Caffe to get the gradients with respect to each layer of a CNN, edit them, and then apply the new gradients in the training process? If possible, using the pycaffe interface.
For example, in TensorFlow it can be done by means of these functions:
given_optimizer.compute_gradients(total_loss)
given_optimizer.apply_gradients(grads)
I'm not sure what you mean by "apply the new gradients in the training process", but you can access the gradients in the pycaffe interface:
import caffe
net = caffe.Net('/path/to/net.prototxt', '/path/to/weights.caffemodel', caffe.TEST)
# provide inputs to the net, do a pass so that meaningful data/gradients propagate to all the layers
net.forward_backward_all()
# once data/gradients are updated, you can access them
net.blobs['blob_name'].diff # access the gradient of blob 'blob_name'
net.layers[5].blobs[0].diff # access the gradient of the first parameter blob of the 6th layer
To map between layer names and layer indices, you can use this code:
list(net._layer_names).index('layer_name')
This will return the index of layer 'layer_name'.
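If by "apply the new gradients" you mean performing the parameter update yourself, a minimal sketch is a plain SGD step on the edited diffs (the learning rate and the gradient edit below are arbitrary placeholders, and there is no momentum or weight decay):

import caffe

net = caffe.Net('/path/to/net.prototxt', '/path/to/weights.caffemodel', caffe.TRAIN)
net.forward()   # assumes the net's input layers provide the data
net.backward()

lr = 0.01  # placeholder learning rate
for layer in net.layers:
    for blob in layer.blobs:               # parameter blobs (weights, biases) of this layer
        blob.diff[...] *= 0.5              # example edit: scale the gradients
        blob.data[...] -= lr * blob.diff   # apply the edited gradients as a plain SGD step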