Functional API or Custom Layer keras tensorflow

I was building a neural network in tensorflow keras and ended up with the following code as a step in the model:
enc = tfkl.Reshape((-1, 20,input_shape[1]))(input_layer)
encoder_output = []
for i in range(enc.shape[1]):
    o = encoder(enc[:, i, ...])
    encoder_output.append(o)
encoder_output = tf.stack(encoder_output, axis=1)
The code works fine. However, it produces a very long model summary and is generally not very elegant.
I don't know much about the functional API or custom layers, but I would like to know whether, in this case,
it is best to write a custom layer or to build a functional API block.
Thank you
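One option that avoids both the explicit loop and a hand-written custom layer is tf.keras.layers.TimeDistributed, which applies one layer or model to every slice along axis 1 with shared weights. Below is a minimal sketch, assuming the encoder consumes one chunk of shape (20, features) at a time; the sizes and the stand-in `encoder` are hypothetical and not taken from the question.

```python
import numpy as np
import tensorflow as tf

tfkl = tf.keras.layers

# Hypothetical sizes, chosen only for illustration.
seq_len, features, chunk = 60, 8, 20

# Stand-in for the question's `encoder`: maps (batch, chunk, features)
# to (batch, 16). Any layer or model with that contract would do.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(chunk, features)),
    tfkl.Flatten(),
    tfkl.Dense(16, activation="relu"),
])

input_layer = tf.keras.Input(shape=(seq_len, features))
# Split the sequence into chunks: (batch, 3, 20, 8).
enc = tfkl.Reshape((-1, chunk, features))(input_layer)
# Apply the same encoder to every chunk and stack the results on axis 1.
encoder_output = tfkl.TimeDistributed(encoder)(enc)
model = tf.keras.Model(input_layer, encoder_output)

out = model(np.zeros((2, seq_len, features), dtype="float32"))
```

With this wrapper the per-chunk calls should collapse into a single TimeDistributed entry in model.summary(), rather than one slicing op per chunk.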

Related

How to obtain the ResNet component of the Tensorflow implementation of SimCLR v2?

I am currently trying to create embeddings of images by passing them through pre-trained Neural Networks and getting the values obtained at the last layer just before the fully-connected ones. I did not have much problem doing it with Pytorch implementations of other Neural Networks. However, I am stuck with the Tensorflow implementation of SimCLR v2 and do not know how to proceed.
The official repo of SimCLR v2 is this one: https://github.com/google-research/simclr
And the paper is here: https://arxiv.org/abs/2006.10029v2
If I understood correctly the paper and the code, this architecture is composed of a backbone ResNet as well as a projection head. In my case, I am not interested in the projection head and just want to obtain the results of the output of the ResNet model.
Looking at the code in the colabs, I have managed to import pre-trained SimCLR models:
model_path = 'gs://simclr-checkpoints-tf2/simclrv2/pretrained/r50_1x_sk0/saved_model'
saved_model = tf.saved_model.load(model_path)
However, I do not know what to do to get the outputs of the ResNet. In all the colabs, they only get the outputs of the projection head which I am uninterested in.
for x in ds.take(1):
    image = x['image']
    labels = x['label']
    logits = saved_model(image, trainable=False)['logits_sup']
    pred = tf.argmax(logits, -1)
Moreover, the way the model is imported makes it difficult to get the variables and layers. For instance, if I try to obtain a summary of the model, I get this error:
'_UserObject' object has no attribute 'summary'
I also do not want to convert the weights of Tensorflow into Pytorch and import them into a pytorch ResNet.
What, then, would be the best way to isolate the ResNet from the overall SimCLR v2 architecture in order to get the outputs of its final layer?

Fine tuning embedding weights within my Tensorflow hub model for an unsupervised learning problem

Tensorflow Version: 1.15
I'm currently using the Universal Sentence Encoder embeddings for pairwise similarity. I'd like to fine-tune the Universal Sentence Encoder to improve embedding quality, and I've gotten to this point:
module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2", trainable=True)
variables_names = [v.name for v in tf.trainable_variables()]
with tf.Session() as sess111:
    init = tf.global_variables_initializer()
    sess111.run(init)
    values = sess111.run(variables_names)
    #for k, v in zip(variables_names, values):
    #    print ("Variable: ", k)
    print(values[0])
    module_embeds = module(sentences)
    values = sess111.run(variables_names)
    print(values[0])
My first thought was to pass sentences through the USE module thinking it would update the trainable variables within a tf session which wasn't the case. So at this point, I have access to each of the trainable variables but I'm not sure how to proceed. Reviewing this tensorflow hub issue, they mention the following strategy:
Define a loss, and add an optimizer for that loss, then running the optimizer will update the trained weights of the embed module.
I'm not entirely sure what the best way to do this would be for my use case. I've seen this notebook which retrains a classifier, but I can't grasp how we end up extracting tuned weights that can be used to generate new embeddings.
Any help or guidance would be much appreciated.
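The strategy quoted from the hub issue can be illustrated on a toy problem. The sketch below is not the Universal Sentence Encoder: a small embedding variable stands in for the module's trainable weights, and the pairwise target similarities are an assumption, since the question does not say what training signal is available. It uses the TF1-style graph API through tf.compat.v1 and only shows that a forward pass leaves the variables untouched while running the optimizer op changes them.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    # Toy stand-in for the hub module's trainable weights (hypothetical).
    emb = tf1.get_variable(
        "toy_embedding", shape=(4, 3),
        initializer=tf1.random_normal_initializer(seed=0))
    ids_a = tf1.placeholder(tf.int32, shape=(None,))
    ids_b = tf1.placeholder(tf.int32, shape=(None,))
    target = tf1.placeholder(tf.float32, shape=(None,))  # assumed similarity labels

    a = tf.nn.l2_normalize(tf.gather(emb, ids_a), axis=1)
    b = tf.nn.l2_normalize(tf.gather(emb, ids_b), axis=1)
    cosine = tf.reduce_sum(a * b, axis=1)

    # Define a loss and attach an optimizer to it.
    loss = tf.reduce_mean(tf.square(cosine - target))
    train_op = tf1.train.GradientDescentOptimizer(0.5).minimize(loss)
    init = tf1.global_variables_initializer()

feed = {ids_a: [0, 1], ids_b: [2, 3], target: [1.0, 0.0]}
with tf1.Session(graph=graph) as sess:
    sess.run(init)
    before = sess.run(emb)
    sess.run(cosine, feed_dict=feed)    # forward pass only: no update
    after_forward = sess.run(emb)
    sess.run(train_op, feed_dict=feed)  # optimizer step: weights change
    after_step = sess.run(emb)
```

After training, the tuned values live in the same variables, so evaluating the embedding tensors again in the same session (or after saving and restoring a checkpoint) would yield the updated embeddings.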

Can you feed a custom Keras layer with another layer instead of a Tensor?

Let's say I have the custom layer Node which inherits from keras.layers.Layer and should represent a single node in a neural network.
As far as I know, in order to feed a layer in keras you need to pass a tensor into it, but my desired syntax is something along the lines of:
n1 = Node()
n2 = Node()
n2(n1) # Instead of n2(n1.output) where n1.output is a Tensor
Is it considered bad practice to do something like that?
The Keras Functional API is a way to create models that are more flexible than the tf.keras.Sequential API. The functional API can handle models with non-linear topology, shared layers, and even multiple inputs or outputs.
The functional API can be used to create complex graphs of layers.
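Let's look at a very simple example:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))  # example input shape, not specified in this answer
x = layers.Dense(64)(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)
model = keras.Model(inputs=inputs, outputs=outputs)
Here you have 3 layers: Dense(64) -> Dense(64) -> Dense(10). The code first creates the 3-layer pipeline by calling each layer on the previous layer's output tensor, and then builds the Model by linking the inputs and outputs.
This is similar to your desired syntax.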
Refer to the Tensorflow Keras Functional API Guide
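Applied to the question's Node layer, the functional style could look like the sketch below. The Node implementation here is hypothetical (a single Dense unit), since the question does not show it. Each layer is still called on a tensor, but because a layer call returns a tensor, the calls can be nested so that no `.output` attribute is needed.

```python
import numpy as np
import tensorflow as tf


class Node(tf.keras.layers.Layer):
    """Hypothetical single-unit layer standing in for the question's Node."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(1, activation="tanh")

    def call(self, inputs):
        return self.dense(inputs)


inputs = tf.keras.Input(shape=(3,))
n1 = Node()
n2 = Node()
outputs = n2(n1(inputs))  # n1(inputs) is a tensor, which n2 then consumes
model = tf.keras.Model(inputs, outputs)

result = model(np.ones((2, 3), dtype="float32"))
```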

TensorFlow Graph to Keras Model?

Is it possible to define a graph in native TensorFlow and then convert this graph to a Keras model?
My intention is simply combining (for me) the best of the two worlds.
I really like the Keras model API for prototyping and new experiments, i.e. using the awesome multi_gpu_model(model, gpus=4) for training with multiple GPUs, saving/loading weights or whole models with oneliners, all the convenience functions like .fit(), .predict(), and others.
However, I prefer to define my model in native TensorFlow. Context managers in TF are awesome and, in my opinion, it is much easier to implement stuff like GANs with them:
with tf.variable_scope("Generator"):
    # define some layers
with tf.variable_scope("Discriminator"):
    # define some layers
# model losses
G_train_op = ...AdamOptimizer(...) \
    .minimize(gloss,
              var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
                                         scope="Generator"))
D_train_op = ...AdamOptimizer(...) \
    .minimize(dloss,
              var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
                                         scope="Discriminator"))
Another bonus is structuring the graph this way. In TensorBoard, debugging complicated native Keras models is hell since they are not structured at all. With heavy use of variable scopes in native TF you can "disentangle" the graph and look at a very structured version of a complicated model for debugging.
By utilizing this I can directly set up a custom loss function and do not have to freeze anything in every training iteration, since TF will only update the weights in the correct scope. That is (at least in my opinion) far easier than the Keras solution of looping over all the existing layers and setting .trainable = False.
TL;DR:
Long story short: I like the direct access to everything in TF, but most of the time a simple Keras model is sufficient for training, inference, ... later on. The model API is much easier and more convenient in Keras.
Hence, I would prefer to set up a graph in native TF and convert it to Keras for training, evaluation, and so on. Is there any way to do this?
I don't think it is possible to create a generic automated converter for any TF graph that will come up with a meaningful set of layers with proper names, simply because graphs are more flexible than a sequence of Keras layers.
However, you can wrap your model with the Lambda layer. Build your model inside a function, wrap it with Lambda and you have it in Keras:
def model_fn(x):
    layer_1 = tf.layers.dense(x, 100)
    layer_2 = tf.layers.dense(layer_1, 100)
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer

model.add(Lambda(model_fn))
That is what sometimes happens when you use multi_gpu_model: you end up with three layers: Input, model, and Output.
Keras Apologetics
However, integration between TensorFlow and Keras can be much tighter and more meaningful. See this tutorial for use cases.
For instance, variable scopes can be used pretty much like in TensorFlow:
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
    y = LSTM(32, name='mylstm')(x)
The same for manual device placement:
with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer will live on GPU:0
Custom losses are discussed here: Keras: clean implementation for multiple outputs and custom loss functions?
This is how my model defined in Keras looks in Tensorboard:
So, Keras is indeed only a simplified frontend to TensorFlow, so you can mix them quite flexibly. I would recommend inspecting the source code of the Keras model zoo for clever solutions and patterns that allow you to build complex models using the clean Keras API.
You can insert TensorFlow code directly into your Keras model or training pipeline! Since mid-2017, Keras has been fully adopted by and integrated into TensorFlow. This article goes into more detail.
This means that your TensorFlow model is already a Keras model and vice versa. You can develop in Keras and switch to TensorFlow whenever you need to. TensorFlow code will work with Keras APIs, including Keras APIs for training, inference and saving your model.
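As an illustration of mixing the two, the per-scope `var_list` idea from the question has a rough analogue in TF2: each Keras model exposes its own `trainable_variables`, so a manual training step can update only the generator without setting `.trainable = False` on the discriminator. The sketch below uses hypothetical toy models and a single generator step; it is not a full GAN training loop.

```python
import numpy as np
import tensorflow as tf

tfkl = tf.keras.layers

# Hypothetical toy generator and discriminator.
generator = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tfkl.Dense(2)])
discriminator = tf.keras.Sequential([tf.keras.Input(shape=(2,)), tfkl.Dense(1)])

g_opt = tf.keras.optimizers.Adam(1e-2)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

noise = tf.random.normal((8, 4), seed=0)
g_before = [w.numpy().copy() for w in generator.trainable_variables]
d_before = [w.numpy().copy() for w in discriminator.trainable_variables]

with tf.GradientTape() as tape:
    fake_logits = discriminator(generator(noise))
    # Generator loss: push the discriminator towards predicting "real".
    g_loss = bce(tf.ones_like(fake_logits), fake_logits)

# Differentiate and update the generator's variables only; this plays the
# role of var_list=...(scope="Generator") in the TF1 snippet.
g_vars = generator.trainable_variables
g_opt.apply_gradients(zip(tape.gradient(g_loss, g_vars), g_vars))

g_after = [w.numpy() for w in generator.trainable_variables]
d_after = [w.numpy() for w in discriminator.trainable_variables]
```

A discriminator step would follow the same pattern with its own optimizer and `discriminator.trainable_variables`.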

How to use TensorBoard and summary operations with the tf.layers module

I have followed the TensorFlow Layers tutorial to create a CNN for MNIST digit classification using TensorFlow's tf.layers module. Now I'm trying to learn how to use TensorBoard from TensorBoard: Visualizing Learning. Perhaps this tutorial hasn't been updated recently, because it says its example code is a modification of that tutorial's and links to it, but the code is completely different: it manually defines a single-hidden-layer fully-connected network.
The TensorBoard tutorial shows how to use tf.summary to attach summaries to a layer by creating operations on the layer's weights tensor, which is directly accessible because we manually defined the layer, and attaching tf.summary objects to those operations. To do this if I'm using tf.layers and its tutorial code, I believe I'd have to:
1. Modify the Layers tutorial's example code to use the non-functional interface (Conv2D instead of conv2d and Dense instead of dense) to create the layers
2. Use the layer objects' trainable_weights property to get the weight tensors and attach tf.summary objects to those
Is that the best way to use TensorBoard with tf.layers, or is there a way that's more directly compatible with tf.layers and the functional interface? If so, is there an updated official TensorBoard tutorial? It would be nice if the documentation and tutorials were more unified.
You should be able to use the output of your tf.layers call to get the activations. Taking the first convolutional layer of the linked layers tutorial:
# Convolutional Layer #1
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
You could do:
tensor_name = conv1.op.name
tf.summary.histogram(tensor_name + '/activation', conv1)
Not sure if this is the best way, but I believe it is the most direct way of doing what you want.
Hope this helps!
You can use something like this:
with tf.name_scope('dense2'):
    preds = tf.layers.dense(inputs=dense1, units=12,
                            activation=tf.nn.sigmoid, name="dense2")
    d2_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'dense2')
    tf.summary.histogram("weights", d2_vars[0])
    tf.summary.histogram("biases", d2_vars[1])
    tf.summary.histogram("activations", preds)
Another choice is to use tf.layers.Dense instead of tf.layers.dense (difference between d and D).
The paradigm for Dense is :
x = tf.placeholder(tf.float32, shape=[None, 100])  # dtype is a required argument
dlayer = tf.layers.Dense(hidden_unit)
y = dlayer(x)
With dlayer as intermediate, you're able to do:
k = dlayer.kernel
b = dlayer.bias
k_and_b = dlayer.weights
Note that you won't get the dlayer.kernel until you apply y = dlayer(x).
Things are similar for other layers, such as convolution layers. You can explore their attributes with any available auto-completion.
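For readers on TensorFlow 2, the same layer-object pattern can be sketched with tf.keras.layers.Dense and the TF2 summary writer. This is a different API from the tf.layers and tf.summary calls used in this thread, so treat it as an analogue rather than a drop-in replacement. The layer width and the temporary log directory are arbitrary choices made for this example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

hidden_units = 12  # hypothetical layer width
dlayer = tf.keras.layers.Dense(hidden_units, activation="sigmoid")
built_before_call = dlayer.built  # the weights do not exist yet

x = np.zeros((4, 100), dtype="float32")
y = dlayer(x)  # the first call creates kernel and bias

logdir = tempfile.mkdtemp()  # throwaway log directory for this sketch
writer = tf.summary.create_file_writer(logdir)
with writer.as_default():
    # In TF2 every summary call needs an explicit step.
    tf.summary.histogram("dense/kernel", dlayer.kernel.numpy(), step=0)
    tf.summary.histogram("dense/bias", dlayer.bias.numpy(), step=0)
    tf.summary.histogram("dense/activations", y, step=0)
writer.flush()
```

Pointing TensorBoard at `logdir` should then show the three histograms.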