Model trained on tf2.3 performs poorly on tf2.6 - tensorflow

I've trained a model on TensorFlow 2.3 and am now running inference on TF 2.6. The model was originally trained on AWS, so the CUDA and cuDNN versions differ, as does the rest of the environment (Python 3.7.11 vs 3.8.12, NumPy 1.18.5 vs 1.19.5).
The model architecture and weights are exactly the same, yet running inference on the same input data produces two substantially different output vectors.
What is causing this issue and how do I make the results the same?

Not a root-cause fix, but I found a workaround: Keras 2.6.0 appears to be the problem. After downgrading from TF 2.6 to TF 2.5, the results match again.
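If you want to quantify the discrepancy between the two environments before and after downgrading, a minimal sketch (assuming you have exported each environment's predictions to .npy files; the file names below are placeholders) is:

import numpy as np

a = np.load("preds_tf23.npy")  # predictions from the TF 2.3 environment (placeholder name)
b = np.load("preds_tf26.npy")  # predictions from the TF 2.6 environment (placeholder name)

print("max abs diff :", np.max(np.abs(a - b)))
print("mean abs diff:", np.mean(np.abs(a - b)))
print("allclose (atol=1e-5):", np.allclose(a, b, atol=1e-5))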

Related

How to perform quantize aware training with tensorflow 1.15?

I am using TensorFlow 1.15 due to dependencies on several other modules and am struggling to do quantization-aware training. I came across tensorflow_model_optimization, but it works with TensorFlow 2.x. Is there any way quantization-aware training can be performed with TensorFlow 1.15?
No. The quantization-aware training API in tensorflow_model_optimization was introduced for TF 2.0 and there is no workaround for it in 1.x, so please upgrade to 2.x and let us know if you face any issues.
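If you do move to TF 2.x, a minimal quantization-aware training sketch with tensorflow_model_optimization looks roughly like this (the model, data, and hyperparameters below are placeholders, not a drop-in recipe):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder architecture; substitute your own Keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Wrap the model so fake-quantization ops are inserted during training.
q_aware_model = tfmot.quantization.keras.quantize_model(model)

q_aware_model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])

# x_train / y_train stand in for your training data.
q_aware_model.fit(x_train, y_train, epochs=1, validation_split=0.1)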

What effects should tensorflow.compat.v1.disable_v2_behavior() have on training using the Keras API?

I have a CNN that trains, on a few hundred thousand examples, to a validation accuracy of ~95% after one epoch. It's straightforward code, using Keras to define a network with the Sequential API. Originally I prepared and used this model on TF 1.3. When I port it over to TF 2.1, replacing the keras calls with tensorflow.keras, it quickly gets to ~60% and gets stuck there (seemingly for many epochs), and the training loss always seems to converge to the same value.
If I add tf.compat.v1.disable_v2_behavior() at the top of the script, it trains similarly to before.
The documentation states simply that it "switches all global behaviors that are different between TensorFlow 1.x and 2.x to behave as intended for 1.x". Since this is hidden behind the Keras API, I haven't found a clear answer to what this really means in practice. Why should I expect a VGG-like CNN, defined using Keras and trained with model.fit(), to work well without v2 behaviour but to fail so consistently with it?
Edit: disable_eager_execution() produces the same result, with improved performance.
Please try disabling eager execution and see if that helps.
tf.compat.v1.disable_eager_execution()
(Add this to the top of your script)
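For context, disable_eager_execution() has to be called before any graphs, ops, or models are created, so place it right after the import. A minimal sketch with a placeholder model:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # must run before building any model

# Placeholder network; your own Sequential CNN goes here.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(...) then runs in graph (TF1-style) mode.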

How to do parallel GPU inferencing in Tensorflow 2.0 + Keras?

Let me begin with the premise that I'm new to TensorFlow and deep learning in general.
I have a TF 2.0 Keras-style model trained using tf.Model.train(), two available GPUs, and I'm looking to cut inference times.
I trained the model, distributing it across the GPUs with the extremely handy tf.distribute.MirroredStrategy().scope() context manager:
mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    model.compile(...)
    model.train(...)
Both GPUs get used effectively (even if I'm not quite happy with the resulting accuracy).
I can't seem to find a similar strategy for distributing inference between GPUs with the tf.Model.predict() method: when I run model.predict() I get (obviously) usage from only one of the two GPUs.
Is it possible to instantiate the same model on both GPUs and feed them different chunks of data in parallel?
There are posts that suggest how to do it in TF 1.x, but I can't seem to replicate the results in TF 2.0:
https://medium.com/#sbp3624/tensorflow-multi-gpu-for-inferencing-test-time-58e952a2ed95
Tensorflow: simultaneous prediction on GPU and CPU
My mental struggles with the question are mainly:
TF 1.x is tf.Session()-based while sessions are implicit in TF 2.0; if I understand correctly, the solutions I read use a separate session for each GPU, and I don't really know how to replicate that in TF 2.0.
I don't know how to use the model.predict() method with a specific session.
I know the question is probably not well formulated, but I'd summarize it as:
Does anybody have a clue on how to run Keras-style model.predict() on multiple GPUs (inferencing on a different batch of data on each GPU in a parallel way) in TF2.0?
Thanks in advance for any help.
Try loading the model inside a tf.distribute.MirroredStrategy scope and using a larger batch_size:
mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    model = tf.keras.models.load_model(saved_model_path)
    result = model.predict(input_data, batch_size=greater_batch_size)  # input_data: your inference inputs
There still does not seem to be an official example for distributed inference. There is a potential solution using tf.distribute.MirroredStrategy here: https://github.com/tensorflow/tensorflow/issues/37686. However, it does not seem to fully utilize multiple GPUs.
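As an alternative, one rough workaround (a sketch only, not an official pattern; saved_model_path and the random input data are placeholders) is to load one replica of the model per GPU and feed each replica its own chunk of the batch:

import numpy as np
import tensorflow as tf

data = np.random.rand(1024, 224, 224, 3).astype("float32")  # placeholder inputs
half = len(data) // 2
chunks = [data[:half], data[half:]]

outputs = []
for gpu_id, chunk in enumerate(chunks):
    with tf.device(f"/GPU:{gpu_id}"):
        replica = tf.keras.models.load_model(saved_model_path)  # one copy per GPU
        outputs.append(replica(chunk, training=False))          # placed on that GPU

result = tf.concat(outputs, axis=0)

Because eager ops are dispatched asynchronously, the two replicas can overlap to some extent, but this is not guaranteed to give anywhere near a 2x speedup.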

Why I got different training results from using keras and tf.keras?

I was using TensorFlow 1.13 and standalone Keras for my research projects. Recently, because of some future warnings, I installed TensorFlow 2.0 and tried to use it.
Instead of using standalone Keras as I did before, I used tf.keras and built the same RNN model, i.e.
from keras.layers import Dense (what I used before)
vs.
from tensorflow.keras.layers import Dense (what I use now)
All other code is the same. However, I get somewhat worse results with the tf.keras.layers import, and I am pretty sure it's not a coincidence: I tried cross-validation and ran the models many times.
Does anyone have any idea why this happens? Are there any differences between tf.keras.layers and keras.layers? If so, how can we be careful in case we get such worse results?
tf.keras is TensorFlow's implementation of the Keras API. Ideally, using tf.keras should not give you worse results. However, there might be a mismatch between the two Keras versions, which may or may not explain the different results. You can check the version via tf.keras.__version__ and see whether it matches the standalone Keras version you used before.
For more details refer:
https://www.tensorflow.org/guide/keras/overview
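A quick way to compare the two versions (assuming both standalone Keras and TensorFlow are installed in the environment) is:

import keras
import tensorflow as tf

print("standalone Keras:", keras.__version__)
print("tf.keras        :", tf.keras.__version__)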

How does AWS-Sagemaker XGBoost perform compared to XGBoost installed locally?

I did hyperparameter tuning on two XGBoost models: the built-in XGBoost in AWS SageMaker and XGBoost installed locally, using the same parameter ranges. The optimized model from the former seems to perform worse than the latter (18% lower prediction accuracy on a binary classification problem). Has anyone encountered a similar problem, and if so, what would the possible reasons be? Thanks!