Irreproducible results in TensorFlow

I have some very basic code that creates a single-layer Dense neural net and predicts the output for a deterministic input. The code is as follows:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.models.Sequential()
model.add(layers.Dense(units=10))

inp = np.ones((1, 10))
model.predict(inp)
But the output I get isn't deterministic across runs. I think it is related to the initialization of the weights and biases. So, how do I fix this without writing the initialization from scratch?

Set the global seed before initializing the model:
tf.random.set_seed(42)
You can also set a seed for specific parts of the model, e.g. the kernel_initializer of a Dense layer, but with that approach you may miss initializers that remain nondeterministic. In your case, setting the seed globally is the best solution.
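A minimal sketch of the fix applied to the snippet from the question (assuming TF 2.x, where tf.random.set_seed is available):
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

tf.random.set_seed(42)  # seed the global RNG before any weights are created

model = tf.keras.models.Sequential()
model.add(layers.Dense(units=10))
print(model.predict(np.ones((1, 10))))  # identical output across fresh runs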


Learning a Categorical Variable with TensorFlow Probability

I would like to use TFP to write a neural network whose outputs are the probabilities of a categorical variable with 3 classes, and train it using the negative log-likelihood.
As I'm taking my first steps with TF and TFP, I started with a toy model in which the input layer has only 1 unit receiving a null input, and the output layer has 3 units with a softmax activation function. The idea is that the biases should learn (up to an additive constant) the logs of the probabilities.
Below is my code: true_p holds the true parameters I use to generate the data and would like to learn, while learned_p is what I get from the NN.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from functions import nll  # user-defined negative log-likelihood loss
from tensorflow.keras.optimizers import SGD
import tensorflow.keras.layers as layers
import tensorflow_probability as tfp
tfd = tfp.distributions
# params
true_p = np.array([0.1, 0.7, 0.2])
n_train = 1000
# training data
x_train = np.array(np.zeros(n_train)).reshape((n_train,))
y_train = np.array(np.random.choice(len(true_p), size=n_train, p=true_p)).reshape((n_train,))
# model
input_layer = layers.Input(shape=(1,))
p_layer = layers.Dense(len(true_p), activation=tf.nn.softmax)(input_layer)
p_y = tfp.layers.DistributionLambda(tfd.Categorical)(p_layer)
model_p = keras.models.Model(inputs=input_layer, outputs=p_y)
model_p.compile(SGD(), loss=nll)
# training
hist_p = model_p.fit(x=x_train, y=y_train, batch_size=100, epochs=3000, verbose=0)
# check result
learned_p = np.round(model_p.layers[1].call(tf.constant([0], shape=(1, 1))).numpy(), 3)
learned_p
With this setup, I get the result:
>>> learned_p
array([[0.005, 0.989, 0.006]], dtype=float32)
I over-estimate the second category and can't really distinguish between the first and the third one. What's worse, if I plot the probabilities at the end of each epoch, they appear to converge monotonically to the vector [0, 1, 0], which doesn't make sense (it seems to me the gradient should push in the opposite direction once I start to over-estimate).
I really can't figure out what's going on here, but have the feeling I'm doing something plain wrong. Any idea? Thank you for your help!
For the record, I also tried other optimizers like Adam and Adagrad, playing with the hyper-parameters, but with no luck.
I'm using Python 3.7.9, TensorFlow 2.3.1 and TensorFlow probability 0.11.1
I believe the default argument to Categorical is not the vector of probabilities, but the vector of logits (values you'd take softmax of to get probabilities). This is to help maintain precision in internal Categorical computations like log_prob. I think you can simply eliminate the softmax activation function and it should work. Please update if it doesn't!
EDIT: alternatively you can replace the tfd.Categorical with
lambda p: tfd.Categorical(probs=p)
but you'll lose the aforementioned precision gains. Just wanted to clarify that passing probs is an option, just not the default.
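For concreteness, here is a sketch of both variants applied to the model definition from the question (reusing the names defined there):
# logits variant (recommended): drop the softmax so the Dense layer outputs raw logits
input_layer = layers.Input(shape=(1,))
logit_layer = layers.Dense(len(true_p))(input_layer)
p_y = tfp.layers.DistributionLambda(tfd.Categorical)(logit_layer)

# probs variant: keep the softmax, but tell Categorical it receives probabilities
p_layer = layers.Dense(len(true_p), activation=tf.nn.softmax)(input_layer)
p_y = tfp.layers.DistributionLambda(lambda p: tfd.Categorical(probs=p))(p_layer)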

how to convert saved model from sklearn into tensorflow/lite

Suppose I implement a classifier using the sklearn library. Is there a way to save the model, or convert it into a saved TensorFlow model, so that I can convert it to TensorFlow Lite later?
If you replicate the architecture in TensorFlow, which will be pretty easy given that scikit-learn models are usually rather simple, you can explicitly assign the parameters from the learned scikit-learn models to TensorFlow layers.
Here is an example with logistic regression turned into a single dense layer:
import tensorflow as tf
import numpy as np
from sklearn.linear_model import LogisticRegression
# some random data to train and test on
x = np.random.normal(size=(60, 21))
y = np.random.uniform(size=(60,)) > 0.5
# fit the sklearn model on the data
sklearn_model = LogisticRegression().fit(x, y)
# create a TF model with the same architecture
tf_model = tf.keras.models.Sequential()
tf_model.add(tf.keras.Input(shape=(21,)))
tf_model.add(tf.keras.layers.Dense(1))
# assign the parameters from sklearn to the TF model
tf_model.layers[0].weights[0].assign(sklearn_model.coef_.transpose())
tf_model.layers[0].bias.assign(sklearn_model.intercept_)
# verify the models do the same prediction
assert np.all((tf_model(x) > 0)[:, 0].numpy() == sklearn_model.predict(x))
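From there, the standard Keras-to-TFLite conversion applies; a short sketch, assuming TF 2.x:
# convert the replicated Keras model to TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model(tf_model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)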
It is not always easy to replicate a scikit-learn model in TensorFlow, though. For instance, scikit-learn has a lot of on-the-fly imputation utilities that would be tricky to reimplement in TensorFlow.

How to re-initialize layer weights of an existing model in Keras?

The actual problem is generating random layer weights for an existing (already built) model in Keras. There are solutions that use Numpy [2], but those are not a good choice, because Keras provides dedicated initializers that use a different distribution for each layer type. When Numpy is used instead of those initializers, the generated weights follow a different distribution than the original. An example:
The second layer of my model is a convolutional (1D) layer and its initializer is GlorotUniform [1]. If you generate random weights with Numpy, the distribution of the generated weights will not be GlorotUniform.
I have a solution for this problem but it has some problems. Here is what I have:
def set_random_weights(self, tokenizer, config):
    temp_model = build_model(tokenizer, config)
    self.model.set_weights(temp_model.get_weights())
I build the existing model a second time; during the build its weights are freshly initialized. Then I take those re-initialized weights and set them on my model. Building an entire model just to generate new weights involves redundant work, so I need a solution that avoids both building a model and Numpy.
[1] https://keras.io/initializers/
[2] https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g
See the previous answers to this question as well.
Specifically, if you want to use the original weights initializer of a Keras layer, you can do the following:
import tensorflow as tf
import keras.backend as K
def init_layer(layer):
    # TF1 / graph mode: re-run the variables' original initializer ops
    session = K.get_session()
    weights_initializer = tf.variables_initializer(layer.weights)
    session.run(weights_initializer)

layer = model.get_layer('conv2d_1')
init_layer(layer)
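In eager-mode TensorFlow 2.x, where sessions are gone, the same idea can be expressed by re-sampling from each layer's own initializer. A sketch, assuming the layer exposes the standard kernel_initializer and bias_initializer attributes (Dense and Conv layers do):
def reinit_layer_tf2(layer):
    # draw a fresh kernel from the layer's original initializer
    if hasattr(layer, 'kernel_initializer'):
        layer.kernel.assign(layer.kernel_initializer(shape=layer.kernel.shape))
    # likewise for the bias, if the layer has one
    if getattr(layer, 'bias', None) is not None:
        layer.bias.assign(layer.bias_initializer(shape=layer.bias.shape))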

Keras isn't raising Tensorflow errors?

I'm writing a custom Tensorflow loss function for Keras, and I tried debugging it by using Tensorflow assertions, but these don't seem to raise errors anywhere even when I'm sure they ought to. I can boil it down to the following example:
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf
import numpy as np
def demo_loss(y_true, y_pred):
    # this should always fail: a tensor of ones is not negative
    tf.assert_negative(tf.ones([1, 1]))
    return tf.square(y_true - y_pred)
model = Sequential()
model.add(Dense(1, input_dim=1, activation='linear'))
model.compile(optimizer='rmsprop', loss=demo_loss)
model.fit(np.ones((1000,1)), np.ones((1000,1)), epochs=10, batch_size=100)
This really seems to me like it should emit an InvalidArgumentError. Why doesn't it?
(Alternately, what's the more sensible way to debug my custom loss functions?)
Your TensorFlow code is not working because there is nothing which forces the assertion to be executed. To make it work you need to add a control dependency to it, something like:
def demo_loss(y_true, y_pred):
    # the control dependency forces the assert op to run whenever the loss is computed
    with tf.control_dependencies([tf.assert_negative(tf.ones([1, 1]))]):
        return tf.square(y_true - y_pred)
I'm not sure the code should stop at all... your loss function is compiled together with your model into a single graph, and that tf.assert op is totally disconnected from everything else.
These functions are not meant to be debugged. They are built for the highest possible performance; that's why they are compiled into a graph first, and the data is fed only later.
When I want to debug, I go for a little model and predict:
from keras.layers import Input, Lambda
from keras.models import Model

trueInput = Input(outputShape)
predInput = Input(outputShape)
output = Lambda(lambda x: demo_loss(x[0], x[1]))([trueInput, predInput])
debugModel = Model([trueInput, predInput], output)
Now use this model to predict:
results = debugModel.predict([someNumpyTrue, someNumpyPred])
You can divide the function in smaller functions, each one in a different Lambda layer, and see each output separately.
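For instance, a hypothetical two-stage split of demo_loss, reusing the names from the snippet above:
# expose each stage of the loss as a separate model output
diff = Lambda(lambda x: x[0] - x[1])([trueInput, predInput])
sq = Lambda(lambda d: d ** 2)(diff)
debugModel = Model([trueInput, predInput], [diff, sq])
diffs, squares = debugModel.predict([someNumpyTrue, someNumpyPred])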

Keras weights and get_weights() show different values

I am using Keras with TensorFlow. A Keras layer has a method "get_weights()" and an attribute "weights". My understanding is that "weights" gives the TensorFlow tensors of the weights, while "get_weights()" evaluates those tensors and returns the values as numpy arrays. However, the two actually show me different values. Here is code to reproduce the issue.
from keras.applications.vgg19 import VGG19
import tensorflow as tf
vgg19 = VGG19(weights='imagenet', include_top=False)
vgg19.get_layer('block5_conv1').get_weights()[0][0,0,0,0]
#result is 0.0028906602, this is actually the pretrained weight
sess = tf.Session()
sess.run(tf.global_variables_initializer())
#I have to run the initializer here. Otherwise, the next line will give me an error
sess.run(vgg19.get_layer('block5_conv1').weights[0][0,0,0,0])
#The result here is -0.017039195 for me. It seems to be a random number each time.
My Keras version is 2.0.6. My Tensorflow is 1.3.0. Thank you!
The method get_weights() is indeed just evaluating the values of the TensorFlow tensors given by the attribute weights. The reason I got different values between get_weights() and sess.run(weight) is that I was referring to the variables in two different sessions. When I ran vgg19 = VGG19(weights='imagenet', include_top=False), Keras had already created a TensorFlow session and initialized the weights with the pre-trained values in that session. Then I created another TensorFlow session called sess by running sess = tf.Session(). In this new session the weights were not initialized yet, so when I ran sess.run(tf.global_variables_initializer()), random numbers were assigned to them. The key is to make sure you are working with the same session when mixing TensorFlow and Keras. The following code shows that get_weights() and sess.run(weight) then give the same value.
import tensorflow as tf
from keras import backend as K
from keras.applications.vgg19 import VGG19
sess = tf.Session()
K.set_session(sess)
vgg19 = VGG19(weights='imagenet', include_top=False)
vgg19.get_layer('block5_conv1').get_weights()[0][0,0,0,0]
#result is 0.0028906602, this is actually the pretrained weight
sess.run(vgg19.get_layer('block5_conv1').weights[0][0,0,0,0])
#The result here is also 0.0028906602