What are symbolic tensors in TensorFlow and Keras? - tensorflow

What are symbolic tensors in TensorFlow and Keras? How are they different than other tensors? Why do they even exist? Where do they come up in TensorFlow and Keras? How should we deal with them or what problems can we face when dealing with them?
In the past, I had faced certain issues related to symbolic tensors, such as the _SymbolicException, but the documentation does not describe this concept. There's also another post where this question is also asked, but, in this post, I am focusing on this specific question, so that answers can be later used as a reference.

According to blog.tensorflow.org, a symbolic tensor differs from other tensors in that they do not specifically hold values.
Let's consider a simple example.
>>> a = tf.Variable(5, name="a")
>>> b = tf.Variable(7, name="b")
>>> c = (b**2 - a**3)**5
>>> print(c)
The output is as follows:
tf.Tensor(1759441920, shape=(), dtype=int32)
For the above, the values are specifically defined in tf.Variable format, and the output is in Tensor format. However, the tensor must contain a value in order to be considered as such.
Symbolic tensors are different in that no explicit values are required to define the tensor, and this has implications in terms of building neural networks with TensorFlow 2.0, which now uses Keras as the default API.
Here is an example of a Sequential neural network that is used to build a classification model for predicting hotel cancellation incidences (full Jupyter Notebook here if interested):
from tensorflow.keras import models
from tensorflow.keras import layers
model = models.Sequential()
model.add(layers.Dense(8, activation='relu', input_shape=(4,)))
model.add(layers.Dense(1, activation='sigmoid'))
This is a symbolically defined model, as no values are explicitly being defined in the network. Rather, a framework is created for the input variables to be read by the network, and then generate predictions.
In this regard, Keras has become quite popular given that it allows for building of graphs using symbolic tensors, while at the same time maintaining an imperative layout.

Related

How to use sample weights with tensorflow datasets?

I have been training a unet model for multiclass semantic segmentation in python using Tensorflow and Tensorflow Datasets.
I've noticed that one of my classes seems to be underrepresented in training. After doing some research, I found out about sample weights and thought it might be a good solution to my problem, but I've been having trouble deciphering the documentation on how to use it or find examples of it being used.
Could someone help explain how sample weights come into play with datasets for training or point me to an example where it is being implemented? Or even what type of input the model.fit function is expecting would be helpful.
From the documentation of tf.keras model.fit():
sample_weight
[...] This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance, instead provide the sample_weights as the third element of x.
What is meant by that? This is demonstrated for the Dataset case in one of the official documentation turorials:
sample_weight = np.ones(shape=(len(y_train),))
sample_weight[y_train == 5] = 2.0
# Create a Dataset that includes sample weights
# (3rd element in the return tuple).
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train, sample_weight))
# Shuffle and slice the dataset.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
model = get_compiled_model()
model.fit(train_dataset, epochs=1)
See the link for a full-fledged example.

Keras: Custom loss function with training data not directly related to model

I am trying to convert my CNN written with tensorflow layers to use the keras api in tensorflow (I am using the keras api provided by TF 1.x), and am having issue writing a custom loss function, to train the model.
According to this guide, when defining a loss function it expects the arguments (y_true, y_pred)
https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses
def basic_loss_function(y_true, y_pred):
return ...
However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is the output of the network). In my problem, this is not the case. How do implement this if my loss function depends on some training data that is unrelated to the tensors of the model?
To be concrete, here is my problem:
I am trying to learn an image embedding trained on pairs of images. My training data includes image pairs and annotations of matching points between the image pairs (image coordinates). The input feature is only the image pairs, and the network is trained in a siamese configuration.
I am able to implement this successfully with tensorflow layers and train it sucesfully with tensorflow estimators.
My current implementations builds a tf Dataset from a large database of tf Records, where the features is a dictionary containing the images and arrays of matching points. Before I could easily feed these arrays of image coordinates to the loss function, but here it is unclear how to do so.
There is a hack I often use that is to calculate the loss within the model, by means of Lambda layers. (When the loss is independent from the true data, for instance, and the model doesn't really have an output to be compared)
In a functional API model:
def loss_calc(x):
loss_input_1, loss_input_2 = x #arbirtray inputs, you choose
#according to what you gave to the Lambda layer
#here you use some external data that doesn't relate to the samples
externalData = K.constant(external_numpy_data)
#calculate the loss
return the loss
Using the outputs of the model itself (the tensor(s) that are used in your loss)
loss = Lambda(loss_calc)([model_output_1, model_output_2])
Create the model outputting the loss instead of the outputs:
model = Model(inputs, loss)
Create a dummy keras loss function for compilation:
def dummy_loss(y_true, y_pred):
return y_pred #where y_pred is the loss itself, the output of the model above
model.compile(loss = dummy_loss, ....)
Use any dummy array correctly sized regarding number of samples for training, it will be ignored:
model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
Another way of doing it, is using a custom training loop.
This is much more work, though.
Although you're using TF1, you can still turn eager execution on at the very beginning of your code and do stuff like it's done in TF2. (tf.enable_eager_execution())
Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough
Here, you calculate the gradients yourself, of any result regarding whatever you want. This means you don't need to follow Keras standards of training.
Finally, you can use the approach you suggested of model.add_loss.
In this case, you calculate the loss exaclty the same way I did in the first answer. And pass this loss tensor to add_loss.
You can probably compile a model with loss=None then (not sure), because you're going to use other losses, not the standard one.
In this case, your model's output will probably be None too, and you should fit with y=None.

How to re-initialize layer weights of an existing model in Keras?

The actual problem is generating random layer weights for an existing (already built) model in Keras. There are some solutions using Numpy [2] but it is not good to choice that solutions. Because, in Keras, there are special initializers using different distributions for each layer type. When Numpy is used instead of the initializers, the generated weights have different distribution then its original. Let's give an example:
Second layer of my model is a convolutional (1D) layer and its initializer is GlorotUniform [1]. If you generate random weights using Numpy, the distribution of generated weights will not be the GlorotUniform.
I have a solution for this problem but it has some problems. Here is what I have:
def set_random_weights(self, tokenizer, config):
temp_model = build_model(tokenizer, config)
self.model.set_weights(temp_model.get_weights())
I am building the existing model. After the building process, weights of the model are re-initialized. Then I get the re-initalized weights and set them to another model. Building model to generate new weights has redundant processes. So, I need a new solution without building a model and Numpy.
https://keras.io/initializers/
https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g
See previous answers to this question here.
Specifically, if you want to use the original weights initializer of a Keras layer, you can do the following:
import tensorflow as tf
import keras.backend as K
def init_layer(layer):
session = K.get_session()
weights_initializer = tf.variables_initializer(layer.weights)
session.run(weights_initializer)
layer = model.get_layer('conv2d_1')
init_layer(layer)

TensorFlow Graph to Keras Model?

Is it possible to define a graph in native TensorFlow and then convert this graph to a Keras model?
My intention is simply combining (for me) the best of the two worlds.
I really like the Keras model API for prototyping and new experiments, i.e. using the awesome multi_gpu_model(model, gpus=4) for training with multiple GPUs, saving/loading weights or whole models with oneliners, all the convenience functions like .fit(), .predict(), and others.
However, I prefer to define my model in native TensorFlow. Context managers in TF are awesome and, in my opinion, it is much easier to implement stuff like GANs with them:
with tf.variable_scope("Generator"):
# define some layers
with tf.variable_scope("Discriminator"):
# define some layers
# model losses
G_train_op = ...AdamOptimizer(...)
.minimize(gloss,
var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
scope="Generator")
D_train_op = ...AdamOptimizer(...)
.minimize(dloss,
var_list=tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
scope="Discriminator")
Another bonus is structuring the graph this way. In TensorBoard debugging complicated native Keras models are hell since they are not structured at all. With heavy use of variable scopes in native TF you can "disentangle" the graph and look at a very structured version of a complicated model for debugging.
By utilizing this I can directly setup custom loss function and do not have to freeze anything in every training iteration since TF will only update the weights in the correct scope, which is (at least in my opinion) far easier than the Keras solution to loop over all the existing layers and set .trainable = False.
TL;DR:
Long story short: I like the direct access to everything in TF, but most of the time a simple Keras model is sufficient for training, inference, ... later on. The model API is much easier and more convenient in Keras.
Hence, I would prefer to set up a graph in native TF and convert it to Keras for training, evaluation, and so on. Is there any way to do this?
I don't think it is possible to create a generic automated converter for any TF graph, that will come up with a meaningful set of layers, with proper namings etc. Just because graphs are more flexible than a sequence of Keras layers.
However, you can wrap your model with the Lambda layer. Build your model inside a function, wrap it with Lambda and you have it in Keras:
def model_fn(x):
layer_1 = tf.layers.dense(x, 100)
layer_2 = tf.layers.dense(layer_1, 100)
out_layer = tf.layers.dense(layer_2, num_classes)
return out_layer
model.add(Lambda(model_fn))
That is what sometimes happens when you use multi_gpu_model: You come up with three layers: Input, model, and Output.
Keras Apologetics
However, integration between TensorFlow and Keras can be much more tighter and meaningful. See this tutorial for use cases.
For instance, variable scopes can be used pretty much like in TensorFlow:
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
y = LSTM(32, name='mylstm')(x)
The same for manual device placement:
with tf.device('/gpu:0'):
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
y = LSTM(32)(x) # all ops / variables in the LSTM layer will live on GPU:0
Custom losses are discussed here: Keras: clean implementation for multiple outputs and custom loss functions?
This is how my model defined in Keras looks in Tensorboard:
So, Keras is indeed only a simplified frontend to TensorFlow so you can mix them quite flexibly. I would recommend you to inspect source code of Keras model zoo for clever solutions and patterns that allows you to build complex models using clean API of Keras.
You can insert TensorFlow code directly into your Keras model or training pipeline! Since mid-2017, Keras has fully adopted and integrated into TensorFlow. This article goes into more detail.
This means that your TensorFlow model is already a Keras model and vice versa. You can develop in Keras and switch to TensorFlow whenever you need to. TensorFlow code will work with Keras APIs, including Keras APIs for training, inference and saving your model.

Accessing neural network weights and neuron activations

After training a network using Keras:
I want to access the final trained weights of the network in some order.
I want to know the neuron activation values for every input passed. For example, after training, if I pass X as my input to the network, I want to know the neuron activation values for that X for every neuron in the network.
Does Keras provide API access to these things? I want to do further analysis based on the neuron activation values.
Update : I know I can do this using Theano purely, but Theano requires more low-level coding. And, since Keras is built on top of Theano, I think there could be a way to do this?
If Keras can't do this, then among Tensorflow and Caffe , which can? Keras is the easiest to use, followed by Tensorflow/Caffe, but I don't know which of these provide the network access I need. The last option for me would be to drop down to Theano, but I think it'd be more time-consuming to build a deep CNN with Theano..
This is covered in the Keras FAQ, you basically want to compute the activations for each layer, so you can do it with this code:
from keras import backend as K
#The layer number
n = 3
# with a Sequential model
get_nth_layer_output = K.function([model.layers[0].input],
[model.layers[n].output])
layer_output = get_nth_layer_output([X])[0]
Unfortunately you would need to compile and run a function for each layer, but this should be straightforward.
To get the weights, you can call get_weights() on any layer.
nth_weights = model.layers[n].get_weights()