How to get all layers from Keras Model with TensorFlow hub - tensorflow2.0

I use EfficientNetV2 from hub.KerasLayer and would like to see all layers when using model.summary(), but it shows only "keras_layer (KerasLayer)":
Layer (type)                 Output Shape              Param #
=================================================================
keras_layer (KerasLayer)     (None, 1280)              5919312
dropout (Dropout)            (None, 1280)              0
dense (Dense)                (None, 2)                 2562
=================================================================

This way you can "jump inside" the model:
import tensorflow_hub as hub

malli = hub.KerasLayer("https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_b0/feature_vector/2")
print("Number of weight tensors in the model:", len(malli.weights))
for i in range(len(malli.weights)):
    print("In layer", malli.weights[i].name, "the content is:", malli.weights[i])
Of course, the output is quite long, so modify the print according to your needs.

TensorFlow's SavedModel is essentially a computation graph. While you could in principle inspect its structure, there is no information about high-level architectural blocks.
If you'd like to access individual layers, a better option might be using the model from the authors' original implementation. It can be constructed as follows:
from effnetv2_model import get_model
model = get_model('efficientnetv2-s', weights='imagenet21k-ft1k', with_endpoints=True)
Pretrained weights are available for the same configurations as in TensorFlow Hub. model.summary() displays individual (Fused)MBConv blocks.
Also note that a future version of TensorFlow will include EfficientNet v2 (already available in nightly builds).
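For example, once that lands you should be able to build it directly from tf.keras.applications. A minimal sketch, assuming a TF build (nightly at the time of writing) that ships the EfficientNetV2B0 application:
import tensorflow as tf

# Assumes a TF build that already includes EfficientNet v2 (tf-nightly as of now).
# Unlike the hub.KerasLayer wrapper, this exposes every internal block,
# so base.summary() lists the individual (Fused)MBConv layers.
base = tf.keras.applications.EfficientNetV2B0(
    include_top=False, weights='imagenet', pooling='avg')
base.summary()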

Related

how to calculate the confidence of a softmax layer

I am working on a multi-class computer vision classification task, using a CNN with FC layers stacked on top and a softmax activation. The problem is this: say I am classifying animal categories; if I predict on a rock image, the model will return a high probability for the most similar animal category, because softmax produces a probability distribution compressed between 0 and 1. What can I use to determine the confidence of my model's probability output, i.e. whether I can rely on these probabilities or not?
PS: I don't want to add a no_label class.
Is it possible, using the Keras functional API, to have two outputs from the model: the pre-softmax (logits) output and the softmax output, without the pre-softmax (linear-activation) output updating the weights, since that would affect training?
Yes, you can do it like this:
import tensorflow as tf

inputs = tf.keras.layers.Input((128, 128, 3))
x = tf.keras.layers.Conv2D(32, 3)(inputs)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(128)(x)
non_softmax_output = tf.keras.layers.Dense(10)(x)  # pre-softmax logits
softmax_output = tf.keras.layers.Softmax()(non_softmax_output)
model = tf.keras.models.Model(inputs=inputs, outputs=[non_softmax_output, softmax_output])
model.summary()
>>>
Model: "model_1"
_________________________________________________________________
Layer (type)                   Output Shape              Param #
=================================================================
input_3 (InputLayer)           [(None, 128, 128, 3)]     0
conv2d_1 (Conv2D)              (None, 126, 126, 32)      896
max_pooling2d_1 (MaxPooling2D) (None, 63, 63, 32)        0
flatten_1 (Flatten)            (None, 127008)            0
dense_23 (Dense)               (None, 128)               16257152
dense_24 (Dense)               (None, 10)                1290
softmax (Softmax)              (None, 10)                0
=================================================================
Total params: 16,259,338
Trainable params: 16,259,338
Non-trainable params: 0
_________________________________________________________________
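To make sure training is driven only by the softmax output, a minimal sketch (assuming sparse integer labels; test_images is a placeholder name) is to pass no loss for the logits output when compiling; predict still returns both outputs:
# None skips the loss for the pre-softmax output, so it does not
# contribute to training; gradients flow only through the softmax head.
model.compile(optimizer='adam',
              loss=[None, 'sparse_categorical_crossentropy'])

logits, probs = model.predict(test_images)  # both outputs at inference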
The easier alternative is to just work with the predictions from the softmax layer. You don't gain much from the linear layer without the activation; those logits by themselves do not mean much. You could instead define a function outside the model that changes the predictions based on some threshold value.
Assume you define only one output in the above model, with a softmax layer. You can then define a function like this to get predictions based on a threshold value you choose:
import numpy as np

def modify_predict(test_images, threshold):
    predictions = model.predict(test_images)
    max_values = np.max(predictions, axis=1)
    labels = np.argmax(predictions, axis=1)
    # Use any indicator here instead of 999 for your no_label class
    new_predictions = np.where(max_values > threshold, labels, 999)
    return new_predictions
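Used like this, for example (the threshold is yours to choose; test_images is a placeholder name):
preds = modify_predict(test_images, threshold=0.9)
# preds holds class indices, or 999 wherever the softmax maximum
# was not confident enough.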
On the first part of your question: the only way you can know how your model will behave on non-animal pictures is by having non-animal pictures in your data.
There are two options.
The first is to include non-animal pictures in the training set (and dev and test sets), and to train the model to distinguish between animal and non-animal. You could either build a separate binary classification model to distinguish animal/non-animal (as already suggested in the comments; a minimal sketch follows at the end of this answer), or you could integrate it into one model by having a 'non-animal' class (although I recognise this last option is not something you want to do).
The second is to include non-animal pictures in the dev and test sets, but not in the training set. You can't then train the model to distinguish between animal and non-animal, but you can at least measure how it behaves on non-animal pictures, and perhaps create some sort of heuristic for selecting only some of your model's predictions. This seems like the worse option to me, even though it's generally accepted that dev and test sets can come from a different distribution to the training set. It's something one might do if there were only a small number of non-animal pictures available, but that surely can't be the case here.
There is, for example, a large labelled image database available at https://www.image-net.org/index.php
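For the separate binary gate in the first option, a minimal sketch (an untuned placeholder architecture, not a recommendation) could look like this:
import tensorflow as tf

# Binary animal / non-animal gate: run it first, and only trust the
# multi-class softmax probabilities when it predicts "animal".
gate = tf.keras.Sequential([
    tf.keras.layers.Input((128, 128, 3)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # P(animal)
])
gate.compile(optimizer='adam', loss='binary_crossentropy',
             metrics=['accuracy'])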

How to feed new vectors into recurrent and convolutional keras model for real-time/streaming/live inference?

I have successfully trained a Keras/TensorFlow model consisting of layers SimpleRNN→Conv1D→GRU→Dense. The model is meant to run on an Apple Watch for real-time inference, which means I want to feed it a new feature vector and predict a new output at each time step. My problem is that I don't know how to feed data into it such that the convolutional layer receives the latest k outputs from the RNN layer.
I can see three options:
1. Feed it one feature vector at a time, i.e. (1, 1, 6). In this case I assume that the convolutional layer will receive only one time step and hence zero-pad for all the previous samples.
2. Feed it the last k feature vectors for each time step, i.e. (1, 9, 6), where k = 9 is the CNN kernel length. In this case I assume that the state flow in the recurrent layers will not work.
3. Feed it the last k feature vectors every k-th time step, again where k = 9 is the CNN kernel length. I assume this would work, but it introduces unnecessary latency that I wish to avoid.
What I want is a model that I can feed with a new single feature vector for each time step, and it will automatically feed the last k outputs of the SimpleRNN layer into the following Conv1D layer. Is this possible with my current model? If not, can I work with the layer arguments, or can I introduce some kind of FIFO buffer layer between the SimpleRNN and Conv1D layer?
Here is my current model:
import tensorflow as tf
from tensorflow.keras.layers import Input, SimpleRNN, Conv1D, GRU, Dropout, Dense

feature_vector_size = 6
model = tf.keras.models.Sequential([
    Input(shape=(None, feature_vector_size)),
    SimpleRNN(16, return_sequences=True, name="rnn"),
    Conv1D(16, 9, padding="causal", activation="relu"),
    GRU(12, return_sequences=True, name="gru"),
    Dropout(0.2),
    Dense(1, activation=tf.nn.sigmoid, name="dense")
])
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
rnn (SimpleRNN) (None, None, 16) 368
_________________________________________________________________
conv1d (Conv1D) (None, None, 16) 2320
_________________________________________________________________
gru (GRU) (None, None, 12) 1080
_________________________________________________________________
dropout (Dropout) (None, None, 12) 0
_________________________________________________________________
dense (Dense) (None, None, 1) 13
=================================================================
Edit:
After researching the problem a bit, I have realized the following:
- The Conv1D layer will zero-pad in all three cases that I described, so option 3 won't work either. Setting padding="valid" solves this particular problem.
- The SimpleRNN and GRU layers must have stateful=True. I found this description of how to make a model stateful after it has been trained stateless: How to implement a forward pass in a Keras RNN in real-time?
- Keras sequence models seem to be made for complete, finite sequences only. The infinite streaming use case, with one time step at a time, isn't really supported.
However, the original question remains open: how can I build and/or feed new feature vectors into the model such that the convolutional layer receives the latest k outputs from the RNN layer?
For anyone else with the same problem: I couldn't solve the SimpleRNN-to-Conv1D data flow easily, so I ended up replacing the SimpleRNN layer with another Conv1D layer and setting padding="valid" on both Conv1D layers. The resulting model outputs exactly one time step when fed a sequence of c * (k - 1) + 1 time steps, where c is the number of Conv1D layers and k is the convolutional kernel length (c = 2 and k = 9 in my case, giving 17 time steps):
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, GRU, Dropout, Dense

feature_vector_size = 6
model = tf.keras.models.Sequential([
    Input(shape=(None, feature_vector_size)),
    Conv1D(16, 9, padding="valid", name="conv1d1"),
    Conv1D(16, 9, padding="valid", name="conv1d2"),
    GRU(12, return_sequences=True, name="gru"),
    Dropout(0.2),
    Dense(1, activation=tf.nn.sigmoid, name="dense")
])
After training, I make the GRU layer stateful according to How to implement a forward pass in a Keras RNN in real-time?. For real-time inference I keep a FIFO queue of the 17 latest feature vectors and feed all these 17 vectors into the model as an input sequence for each new time step.
I don't know if this is the best possible solution, but at least it works.
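For reference, a minimal sketch of that inference loop (assuming the model above with the GRU already made stateful per the linked answer, and using a plain deque as the FIFO buffer):
import numpy as np
from collections import deque

WINDOW = 2 * (9 - 1) + 1     # c * (k - 1) + 1 = 17 for c = 2, k = 9
fifo = deque(maxlen=WINDOW)  # automatically drops the oldest vector

def on_new_sample(feature_vector, model):
    """Call once per time step with a length-6 feature vector."""
    fifo.append(feature_vector)
    if len(fifo) < WINDOW:
        return None                               # still filling the buffer
    batch = np.asarray(fifo)[np.newaxis]          # shape (1, 17, 6)
    # The two valid convs reduce the 17 steps to exactly one new step;
    # the stateful GRU then carries its state across calls.
    return float(model.predict(batch)[0, -1, 0])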

How to access a particular layer of Huggingface's pre-trained BERT model?

For experimentation purposes, I need to access an Embedding layer of the encoder. That is, assuming the TensorFlow implementation, the layer defined as tf.keras.layers.Embedding(...).
For example, what is a way to set 'embeddings_regularizer=' argument of the Embedding() layer in the encoder part of the transformer?
You can iterate over the BERT model in the same way as any other model, like so:
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.Embedding):
        layer.embeddings_regularizer = argument
isinstance checks the type of the layer, so you can really put any layer type here and change whatever you need.
I haven't checked specifically whether embeddings_regularizer is available, but if you want to see which attributes and methods are available on a particular layer, run a debugger and call dir(layer) inside the loop above.
Updated question
The TFBertForSequenceClassification model has 3 layers:
>>> model.summary()
Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
bert (TFBertMainLayer)       multiple                  108310272
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0
_________________________________________________________________
classifier (Dense)           multiple                  1538
=================================================================
Total params: 108,311,810
Trainable params: 108,311,810
Non-trainable params: 0
Similarly, calling model.layers gives:
[<transformers.models.bert.modeling_tf_bert.TFBertMainLayer at 0x7efda85595d0>,
<tensorflow.python.keras.layers.core.Dropout at 0x7efd6000ae10>,
<tensorflow.python.keras.layers.core.Dense at 0x7efd6000afd0>]
We can access the layers inside TFBertMainLayer:
>>> model.layers[0]._layers
[<transformers.models.bert.modeling_tf_bert.TFBertEmbeddings at 0x7efda8080f90>,
<transformers.models.bert.modeling_tf_bert.TFBertEncoder at 0x7efda855ced0>,
<transformers.models.bert.modeling_tf_bert.TFBertPooler at 0x7efda84f0450>,
DictWrapper({'name': 'bert'})]
So from the above we can access the TFBertEmbeddings layer by:
model.layers[0].embeddings
OR
model.layers[0]._layers[0]
If you check the documentation (search for the "TFBertEmbeddings" class), you can see that this inherits from a standard tf.keras.layers.Layer, which means you have access to all the normal regularizer attributes, so you should be able to call something like:
from tensorflow.keras import regularizers
model.layers[0].embeddings.activity_regularizer = regularizers.l2(1e-5)
Or whatever argument / regularizer you need to change. See here for regularizer docs.

Change fully convolutional network input shape in TF 2.3 and tf.keras

I'm working with TensorFlow 2.3 and tf.keras.
I've trained a network on images with input shape (None, 120, 120, 12). Actually, I've also been able to train the model while declaring the input as (None, 128, 128, 12) but feeding it (None, 120, 120, 12) batches, because of a coding error; TF just printed a warning and didn't care. This wasn't the behavior in previous versions. My network has only convolutional layers and, if the input size has enough powers of 2 considering the depth, it produces an output image of the same shape as the input.
I've now fully trained this model and I'd like to apply it to images of different shape as well. Is there a proper way to change the input shape of my trained model? Or should I define a new model and then copy the weights layer by layer? Or should I just accept the warnings and forget about them, since it works anyway?
Ah, you again. I think your problem is basically simple: once you train your model with a given input size, the input must have the same shape when you run the model. However, if you want to take advantage of the trained model and believe the learnt features are not much different, then you can apply transfer learning and, of course, retrain. You don't have to copy weights; just freeze the model and train only the input and output. You can check this basic example with your VGG:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

base_model = keras.applications.VGG19(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=(224, 224, 3),
    include_top=False)
base_model.trainable = False

# VGG19 without its top is fully convolutional, so it accepts the new
# 150x150 input; TF only prints a shape warning.
inputs = layers.Input(shape=(150, 150, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) [(None, 150, 150, 3)] 0
_________________________________________________________________
vgg19 (Model) multiple 20024384
_________________________________________________________________
global_average_pooling2d (Gl (None, 512) 0
_________________________________________________________________
dense (Dense) (None, 1) 513
=================================================================
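From here it's the usual transfer-learning routine: compile and fit so that only the new head is trained while the frozen base keeps its weights (train_ds and val_ds below are placeholder dataset names):
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
# base_model is frozen, so only the Dense head (513 params) is updated.
model.fit(train_ds, validation_data=val_ds, epochs=5)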

Want to check Intermediate Operations inside Keras Layer

I am facing floating-point resolution loss during the convolution operation while porting my code to an embedded processor that supports only half precision, so I want to test the intermediate operations performed layer by layer in my Keras model, which performs well at full precision on my desktop.
In the following snippet of code I want to compute the 1D convolution on input data of shape 1500x3. The kernel size is 10 and the kernel shape is (10x3x16).
To compute the 1D convolution, Keras expands the dimensions of the input, adding one more dimension to make it suitable for a 2D convolution operation.
Then a series of operations is called, e.g. Conv2D followed by Squeeze and finally BiasAdd.
Finally, the output of the Conv1D layer is pushed into the conv1d_20/Elu layer.
Please see the attached picture for a full description of the operations involved.
Now, I want to test the output well before the actual output of the layer is produced.
Please see the below code:
import tensorflow as tf
from tensorflow import keras

Input_sequence = keras.layers.Input(shape=(1500, 3))
encoder_conv1 = keras.layers.Conv1D(filters=16, kernel_size=10,
                                    padding='same',
                                    activation=tf.nn.elu)(Input_sequence)
The Model summary shows:
Model: "model_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, 1500, 3)] 0
_________________________________________________________________
conv1d_20 (Conv1D) (None, 1500, 16) 496
I want to define the model output at conv1d_20/Conv2D, but that gives me an error. The following, however, is accepted at compilation:
encoder = keras.Model(inputs=autoencoder.input, outputs=autoencoder.get_layer('conv1d_20').output)
encoder.get_output_at(0)
It outputs
<tf.Tensor 'conv1d_20/Elu:0' shape=(?, 1500, 16) dtype=float32>
I want to test the output of the Conv2D operation, but it produces the output of conv1d_20/Elu instead. How can I run this test? Please help me.
[Image: Conv1D operation]
You can disable the bias (use_bias=False) and the activation function (activation=None) when defining the Conv1D operation:
Input_sequence = keras.layers.Input(shape=(1500, 3))
encoder_conv1 = keras.layers.Conv1D(filters=16, kernel_size=10,
                                    padding='same', use_bias=False,
                                    activation=None)(Input_sequence)
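With bias and activation disabled, the layer's output is the raw convolution result, which you can then post-process step by step yourself and compare against the full-precision pipeline. A sketch of that comparison (here x is a hypothetical (N, 1500, 3) input batch and trained_conv_layer the corresponding Conv1D layer of your trained model):
# Copy only the kernel; the probe's conv has no bias of its own.
kernel, bias = trained_conv_layer.get_weights()
probe = keras.Model(inputs=Input_sequence, outputs=encoder_conv1)
probe.layers[-1].set_weights([kernel])

raw_conv = probe.predict(x)       # output of the Conv2D/Squeeze step only
with_bias = raw_conv + bias       # reproduce the BiasAdd step
activated = tf.nn.elu(with_bias)  # reproduce conv1d_20/Elu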