In TensorFlow 2, we know that during backpropagation we need the weights and activations to calculate the partial derivatives. I can find the weights stored in the tf.keras.Model, but where are the activation values (the intermediate values used in the back-propagation calculations) stored?
The activations are under tf.keras.activations; check the API docs.
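For reference, a minimal sketch of what that module exposes (the tensor values are only illustrative):

import tensorflow as tf

# tf.keras.activations holds the activation functions themselves
x = tf.constant([-1.0, 0.0, 2.0])
print(tf.keras.activations.relu(x))      # -> [0., 0., 2.]
print(tf.keras.activations.get("relu"))  # look an activation up by name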
Related
Let's say that I have a model trained with TF2 (e.g., a model from the TF model zoo). How can I get the loss function of this model?
Note that I do not want the value of the loss for a given input (that can be obtained via the model.evaluate method); I want the loss function itself, such that:
I can take its gradient with respect to input or any desired parameter
I can pass it the labels and logits and it provides me the loss value
Note that I am using an already trained model (inheriting from tf.keras.Model); a minimal sketch of the intended usage is shown below.
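For context, a sketch of that kind of usage, assuming the model was trained with a standard Keras loss (SparseCategoricalCrossentropy is only an assumption here, and model, the input shape, and the label are placeholders):

import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)  # assumed loss

x = tf.random.normal([1, 224, 224, 3])   # placeholder input
y_true = tf.constant([3])                # placeholder label

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)                        # so we can take d(loss)/d(input)
    logits = model(x, training=False)
    loss_value = loss_fn(y_true, logits)

grad_wrt_input = tape.gradient(loss_value, x)                           # gradient w.r.t. the input
grad_wrt_params = tape.gradient(loss_value, model.trainable_variables)  # gradient w.r.t. parameters
del tape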
A layer (....) which is an input to the Conv operator producing the output array model/re_lu_1/Relu, is lacking min/max data, which is necessary for quantization. If accuracy matters, either target a non-quantized output format, or run quantized training with your model from a floating point checkpoint to change the input graph to contain min/max information. If you don't care about accuracy, you can pass --default_ranges_min= and --default_ranges_max= for easy experimentation.
For TensorFlow 1.x, if you want to quantize, you have to place fake quantization nodes in the graph to activate quantization of the model.
There are 3 phases of quantization (a rough code sketch follows the list):
Training part: load your model into a graph => create the training graph via contrib => train and store the weights in a checkpoint
Eval part: load your model into a graph without weights => create the eval graph => restore the checkpoint => export to a frozen model
TOCO/TFLite: convert the frozen model to a quantized model
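A rough sketch of those three phases for TF 1.x; build_model, the checkpoint/frozen-graph paths, and the input/output tensor names are placeholders, and the exact converter settings depend on your TF 1.x version:

import tensorflow as tf

# 1) Training part: build the graph, insert fake-quant nodes, train, save a checkpoint
train_graph = tf.Graph()
with train_graph.as_default():
    build_model(is_training=True)                      # your model-building code
    tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=2000)
    # ... train as usual, then tf.train.Saver().save(sess, "model.ckpt")

# 2) Eval part: rebuild without training ops, insert eval fake-quant nodes,
#    restore the checkpoint and freeze to a GraphDef
eval_graph = tf.Graph()
with eval_graph.as_default():
    build_model(is_training=False)
    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
    # ... restore "model.ckpt", then freeze with
    # tf.graph_util.convert_variables_to_constants(sess, eval_graph.as_graph_def(), ["output"])

# 3) TOCO/TFLite: convert the frozen model to a quantized model
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "frozen.pb", input_arrays=["input"], output_arrays=["output"])
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
converter.quantized_input_stats = {"input": (128, 127)}  # (mean, std dev) placeholders
tflite_model = converter.convert()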
However, the most important factor is the configuration of batch normalization in the model. After trying multiple configurations, the best one is using BatchNormalization from tensorflow.keras.layers without the fused option.
The reason is that TensorFlow wants to avoid the folding result being quantized; therefore, an activation placed behind batchnorm won't work. Details [here][1].
In short, this layer should be attached only under a tensorflow.keras.layers.Conv2D with its activation param passed (Relu/Relu6/Identity).
If you follow the above pattern (Conv2D => Activation => BatchNorm), the layer will not raise the "does not have MinMax information" error.
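A minimal sketch of that layer ordering (the input shape and filter counts are arbitrary):

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(224, 224, 3))                          # example input shape
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)   # Conv2D => Activation
x = layers.BatchNormalization(fused=False)(x)                         # non-fused BatchNorm
model = tf.keras.Model(inputs, x)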
I train the model in TensorFlow; the model has a dropout layer. I then convert it to TensorFlow.js and load it with loadFrozenModel(). Can I modify the dropout rate after calling model = tf.loadFrozenModel(...)?
Currently frozen models cannot be trained further. You can of course use them as a base for a transfer learning task, but the variables inside that model are frozen and not marked as updatable.
Using transfer learning, you can retrieve the layer before the dropout layer, replace the dropout layer, and train further.
The documentation of the Embedding layer (https://www.cntk.ai/pythondocs/layerref.html#embedding) shows that it can be initialized with pretrained embeddings using the weights parameter, but these embeddings are not updated during training.
Is there a way to initialize the Embedding layer with pretrained embeddings and still update them during training?
If not, what's the most efficient way to do batch embedding lookups with one-hot vectors?
Yes, just pass the initial values to the init argument instead. That will create a learnable parameter initialized with the array you pass in.
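A minimal sketch of that idea (the array and sizes are placeholders); here the learnable parameter is created directly, which is essentially what the Embedding layer builds for you:

import numpy as np
import cntk as C

pretrained_vectors = np.random.rand(10000, 300).astype(np.float32)  # placeholder (vocab, dim) array

# A learnable parameter initialized with the pretrained array; it will be
# updated by the learner during training (unlike weights=, which creates a Constant).
E = C.parameter(init=pretrained_vectors)

x = C.input_variable(10000, is_sparse=True)  # one-hot / sparse input
z = C.times(x, E)                            # the embedding lookup is a (sparse) matrix product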
After training a network using Keras:
I want to access the final trained weights of the network in some order.
I want to know the neuron activation values for every input passed. For example, after training, if I pass X as my input to the network, I want to know the neuron activation values for that X for every neuron in the network.
Does Keras provide API access to these things? I want to do further analysis based on the neuron activation values.
Update: I know I can do this using Theano directly, but Theano requires more low-level coding. And since Keras is built on top of Theano, I think there could be a way to do this?
If Keras can't do this, then between TensorFlow and Caffe, which can? Keras is the easiest to use, followed by TensorFlow/Caffe, but I don't know which of these provides the network access I need. The last option for me would be to drop down to Theano, but I think it'd be more time-consuming to build a deep CNN with Theano.
This is covered in the Keras FAQ: you basically want to compute the activations for each layer, and you can do it with this code:
from keras import backend as K

# The layer number
n = 3

# With a Sequential model
get_nth_layer_output = K.function([model.layers[0].input],
                                  [model.layers[n].output])
layer_output = get_nth_layer_output([X])[0]
Unfortunately you would need to compile and run a function for each layer, but this should be straightforward.
To get the weights, you can call get_weights() on any layer.
nth_weights = model.layers[n].get_weights()
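get_weights() returns a list of NumPy arrays (typically [kernel, bias] for a Dense or Conv layer). For example, to print the shapes of every layer's weights in order:

for i, layer in enumerate(model.layers):
    for w in layer.get_weights():
        print(i, layer.name, w.shape)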