How to Reproduce Same Result Using Conv2d in Tensorflow.Keras? - tensorflow

I have read many posts on Stack Overflow as well as GitHub on this topic, but I think my situation might be a little different.
My code starts as shown below, and I can reproduce the result 100% of the time if I only use Dense layers.
import numpy as np
import random as rn
import tensorflow as tf
import os
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(1)
rn.seed(2)
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
from tensorflow.keras import backend as K
tf.set_random_seed(3)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
However, I get different results on every run if I insert the single line "model.add(Conv2D(32, 3, activation='relu'))" before "model.add(Flatten())".
Input > Flatten > Dense produces a consistent result, but Input > Conv2D > Flatten > Dense produces a different result every time I run the code.
I'd appreciate any guidance.

Related

Feature extraction in pretrained model: different result from keras-gpu&tensorflow-gpu and keras&tensorflow base

I'm extracting features from a pretrained Keras model (VGG16). The task is simple, but I obtain different features (n x 4096) depending on whether I run the code on GPU (tensorflow-gpu and keras-gpu installed) or on CPU (tensorflow and keras installed).
The differences in the features are huge, and when I feed the extracted features to a classifier, I get much better results with the features obtained on GPU.
Could someone explain why?
Here is the code in case it is useful:
from keras.preprocessing import image
from keras.applications.vgg16 import VGG16
import tensorflow.keras
from keras.applications.vgg16 import preprocess_input
import numpy as np
from sklearn.cluster import KMeans
# %matplotlib inline
import matplotlib.pyplot as plt
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import decode_predictions
from keras.models import Model
from pickle import dump
model = VGG16(weights='imagenet')#, include_top=False
# remove the output layer
model = Model(inputs=model.inputs, outputs=model.layers[-2].output)
model.summary()
trainstack=np.load('.../All_imgs.npy')
trainstack=np.transpose(trainstack,[2,0,1,3])
trainstack=preprocess_input(trainstack)
trainstacklabel=np.load('.../All_labels.npy')
# note: preprocess_input was already applied to trainstack above, so this applies it a second time
img_data = preprocess_input(trainstack)
vgg16_feature_list=[]
for i in range(trainstack.shape[0]):
    v = img_data[i,:,:,:]
    v = v[np.newaxis,...]
    vgg16_feature = model.predict(v)
    print(vgg16_feature.shape)
    vgg16_feature_np = np.array(vgg16_feature)
    vgg16_feature_list.append(vgg16_feature_np.flatten())
np.save('.../features.npy',vgg16_feature_list)
Thank you very much in advance!!
I try to extract features from a pretrained model for combining them with other variables in a binary classifier.
I obtain very different extracted features when I run my code on GPU and CPU.
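To put a number on how far the CPU and GPU features diverge, the two saved arrays can be compared directly. The sketch below is only an illustration: the helper name `compare_features` is hypothetical and not part of the original code, and synthetic arrays stand in for the real features.

```python
import numpy as np

def compare_features(a, b):
    """Summarize how much two (n, d) feature matrices differ (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    abs_diff = np.abs(a - b)
    # Row-wise cosine similarity; the small epsilon avoids division by zero.
    eps = 1e-12
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    )
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "min_cosine": float(cos.min()),
    }

# Synthetic stand-ins for features extracted on CPU and on GPU.
rng = np.random.default_rng(0)
cpu_feats = rng.random((5, 4096))
gpu_feats = cpu_feats + 1e-6 * rng.standard_normal((5, 4096))
report = compare_features(cpu_feats, gpu_feats)
print(report)
```

Differences near float32 rounding level are normal between devices. Anything much larger suggests the two runs did not see the same inputs or weights, which is worth checking first.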

Why is TensorFlow Probability so slow?

I am trying to use TensorFlow to test a Kalman filter. I followed the official documentation (https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/LinearGaussianStateSpaceModel) to define the model, generate a sample, and finally calculate the log-likelihood of the sample.
I am running the code provided in the documentation:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
ndims = 2
step_std = 1.0
noise_std = 5.0
model = tfd.LinearGaussianStateSpaceModel(
    num_timesteps=1000,
    transition_matrix=tf.linalg.LinearOperatorIdentity(ndims),
    transition_noise=tfd.MultivariateNormalDiag(
        scale_diag=step_std**2 * tf.ones([ndims])),
    observation_matrix=tf.linalg.LinearOperatorIdentity(ndims),
    observation_noise=tfd.MultivariateNormalDiag(
        scale_diag=noise_std**2 * tf.ones([ndims])),
    initial_state_prior=tfd.MultivariateNormalDiag(
        scale_diag=tf.ones([ndims])))
x = model.sample(1) # Sample from the prior on sequences of observations.
lp = model.log_prob(x) # Marginal likelihood of a (batch of) observations.
print(lp)
It takes 30 seconds to calculate the log-likelihood. PS: I ran the code on Colab with a GPU.
My questions: why is it so slow, and how can I improve the performance?
Thanks.
Eager mode (the default in TF 2.x) is pretty slow in general. You can graph-ify this with tf.function:
lp = tf.function(model.log_prob, autograph=False, jit_compile=False)(x)
You can also set jit_compile to True and lower to XLA. That adds some compile time (possibly nontrivial) but will usually make the code faster, and the cost amortizes if you run it many times.
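Because tracing and XLA compilation happen on the first call, it helps to time the first call separately from later ones when judging whether the compile cost amortizes. The harness below is a sketch in plain Python: `time_calls` is a hypothetical helper, and a toy callable with an artificial one-off delay stands in for the compiled `log_prob`.

```python
import time

def time_calls(fn, *args, repeats=5):
    """Return (first_call_seconds, mean_seconds_of_later_calls) for fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)

class SlowFirstCall:
    """Toy stand-in: pays a one-off setup cost, like tracing or compilation."""
    def __init__(self):
        self.ready = False

    def __call__(self, x):
        if not self.ready:
            time.sleep(0.05)  # pretend compile time
            self.ready = True
        return x * 2

first, steady = time_calls(SlowFirstCall(), 21)
print(f"first call: {first:.4f}s, later calls: {steady:.6f}s")
```

With TensorFlow, the callable would be something like `tf.function(model.log_prob, jit_compile=True)` with `x` as the argument. Calling `.numpy()` on the result inside the timed function makes sure the computation has actually finished before the clock stops.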

Cannot get reproducible results with ImageDataGenerator in keras

I am trying to get reproducible results between multiple runs of the same script in Keras, but I get different ones on each run. My code looks like this:
import numpy as np
from numpy.random import seed
import random as rn
import os
seed_num = 1
os.environ['TF_CUDNN_DETERMINISTIC'] = '1'
os.environ['PYTHONHASHSEED'] = '1'
os.environ['TF_DETERMINISTIC_OPS'] = '1'
np.random.seed(seed_num)
rn.seed(seed_num)
import tensorflow as tf
tf.random.set_seed(seed_num)
import tensorflow.keras as ks
from tensorflow.python.keras import backend as K
...some imports...
from tensorflow.keras.preprocessing.image import ImageDataGenerator
.... data loading etc ....
generator = ImageDataGenerator(
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True)
generator.fit(X_train, seed=seed_num)
my_model.fit(generator.flow(X_train, y_train, batch_size=batch_size, shuffle=False, seed=seed_num), validation_data=(X_val, y_val), callbacks=callbacks , epochs=epochs, shuffle=False)
I identified the problem to be in ImageDataGenerator: when I set generator = ImageDataGenerator() without any augmentation, the results are reproducible. I am running on CPU, and the TensorFlow version is 2.4.1. What am I missing here?
Using a GPU while creating augmented images can produce nondeterministic results.
To get reproducible results with ImageDataGenerator on a GPU, one way is the following:
import random, os
import numpy as np
import tensorflow as tf
def set_seed(seed=0):
    np.random.seed(seed)
    tf.random.set_seed(seed)
    random.seed(seed)
    os.environ['TF_DETERMINISTIC_OPS'] = "1"
    os.environ['TF_CUDNN_DETERMINISTIC'] = "1"
    os.environ['PYTHONHASHSEED'] = str(seed)
set_seed()
Call set_seed() again right before model.fit():
set_seed()
model.fit(...)
Alternatively, you can install the package tensorflow-determinism:
pip install tensorflow-determinism
If you're using Google Colab, restart your runtime afterwards, or it probably won't work.
The package interacts with the GPU to produce deterministic results.
import random, os
import numpy as np
import tensorflow as tf
def set_seed(seed=0):
    os.environ['TF_DETERMINISTIC_OPS'] = '1'
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
set_seed()
# code
In this case too, call set_seed() again right before model.fit():
set_seed()
model.fit(...)
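The reason for calling `set_seed()` a second time is that anything that draws random numbers between the first call and `model.fit()` advances the generator state. The check below illustrates this with the Python and NumPy generators only, an assumption made so that it runs without TensorFlow or a GPU; `set_seed_py_np` and `draw` are hypothetical names.

```python
import os
import random
import numpy as np

def set_seed_py_np(seed=0):
    """Seed Python's and NumPy's global generators (TensorFlow omitted here)."""
    # Only affects subprocesses started later; the hash seed of the
    # current interpreter is fixed at startup.
    os.environ['PYTHONHASHSEED'] = str(seed)
    random.seed(seed)
    np.random.seed(seed)

def draw():
    """Stand-in for augmentation code that consumes random numbers."""
    return random.random(), np.random.uniform(-0.1, 0.1, size=3).tolist()

set_seed_py_np(0)
first = draw()      # values drawn right after seeding
drifted = draw()    # state has advanced, so these values differ

set_seed_py_np(0)   # re-seed, as done right before model.fit()
again = draw()

print(first == again)    # True: same seed, same starting state
print(first == drifted)  # False: the generator had already advanced
```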

Reproducible results with Keras with TensorFlow backend

I have my own network, but it gives me different outputs each time I run the code. I'm using Keras (with the TensorFlow backend) and wrote the following code for reproducibility. My data: 280 training samples, 27 validation samples, and 21 test samples.
# The following lines are for reproducibility
import os
import random as rn
import numpy as np
os.environ['PYTHONHASHSEED'] = '0'
# seed for the NumPy random number generator
np.random.seed(37)
rn.seed(1254) # seed for Python's built-in random number generator
import tensorflow as tf
tf.compat.v1.set_random_seed(89) #tf.set_random_seed(89)
import keras.backend.tensorflow_backend as K
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess=tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config= session_conf)
K.set_session(sess)

How to print full (not truncated) tensor in tensorflow?

Whenever I try printing, I always get truncated results:
import tensorflow as tf
import numpy as np
np.set_printoptions(threshold=np.nan)
tensor = tf.constant(np.ones(999))
tensor = tf.Print(tensor, [tensor])
sess = tf.Session()
sess.run(tensor)
As you can see, I've followed a guide I found in "Print full value of tensor into console or write to file in tensorflow".
But the output is simply
...\core\kernels\logging_ops.cc:79] [1 1 1...]
I want to see the full tensor, thanks.
This is solved easily by checking the Tensorflow API for tf.Print. Pass summarize=n where n is the number of elements you want displayed.
You can do it as follows in TensorFlow 2.x:
import numpy as np
import tensorflow as tf
tensor = tf.constant(np.ones(999))
tf.print(tensor, summarize=-1)
From TensorFlow docs -> summarize: The first and last summarize elements within each dimension are recursively printed per Tensor. If set to -1, it will print all elements of every tensor.
https://www.tensorflow.org/api_docs/python/tf/print
To print all tensors without truncation in TensorFlow 2.x:
import numpy as np
import sys
np.set_printoptions(threshold=sys.maxsize)
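A quick way to confirm that the NumPy threshold takes effect is to render an array to a string with and without the raised value and look for the `...` ellipsis. This is a small sketch using only NumPy; the array size of 2000 is arbitrary.

```python
import sys
import numpy as np

arr = np.ones(2000)

# With a threshold of 1000 elements (NumPy's default), long arrays
# are summarized with "...".
short = np.array2string(arr, threshold=1000)

# threshold=sys.maxsize disables summarization for this call.
full = np.array2string(arr, threshold=sys.maxsize)

print("..." in short)  # True
print("..." in full)   # False
```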