Run TensorFlow op in graph mode in tf 2.x - tensorflow

I would like to benchmark some TensorFlow operations (for example, against each other or against PyTorch). However, most of the time I end up writing something like:
import numpy as np
import tensorflow as tf
tf_device = '/GPU:0'
shape = [100000]  # example size
a = np.random.normal(scale=100, size=shape).astype(np.int64)
b = np.array(7).astype(np.int64)
with tf.device(tf_device):
    a_tf = tf.constant(a)
    b_tf = tf.constant(b)
%timeit tf.math.floormod(a_tf, b_tf)
The problem with this approach is that the computation runs in eager mode (in particular, I think it has to perform GPU-to-CPU placement). Ultimately, I want to use these ops in a tf.keras model, so I would like to evaluate their performance in graph mode.
What is the preferred way to do it?
My Google searches have turned up nothing, and I don't know how to use sessions as in TF 1.x.

What you are looking for is tf.function. See the tf.function tutorial and its API documentation.
As the tutorial says, in TensorFlow 2, eager execution is turned on by default. The user interface is intuitive and flexible (running one-off operations is much easier and faster), but this can come at the expense of performance and deployability. To get performant and portable models, use tf.function to make graphs out of your programs.
Check this code:
import numpy as np
import tensorflow as tf
import timeit
tf_device = '/GPU:0'
shape = [100000]
a = np.random.normal(scale=100, size=shape).astype(np.int64)
b = np.array(7).astype(np.int64)
@tf.function
def experiment(a_tf, b_tf):
    # Return the result so the op is not pruned from the traced graph.
    return tf.math.floormod(a_tf, b_tf)

with tf.device(tf_device):
    a_tf = tf.constant(a)
    b_tf = tf.constant(b)

# warm up (the first call traces the function and builds the graph)
experiment(a_tf, b_tf)
print("In graph mode:", timeit.timeit(lambda: experiment(a_tf, b_tf), number=10))
print("In eager mode:", timeit.timeit(lambda: tf.math.floormod(a_tf, b_tf), number=10))
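The warm-up call matters because the first call to a tf.function includes tracing time. A minimal sketch of a reusable timing helper, using only the standard library; the name `bench` and its parameters are hypothetical, and it accepts any zero-argument callable (for example `lambda: experiment(a_tf, b_tf)`):

```python
import timeit

def bench(fn, warmup=1, number=10, repeat=3):
    """Return the best average seconds per call of a zero-argument callable.

    `bench` is a hypothetical helper, not part of TensorFlow.
    """
    # Warm-up calls are excluded from the measurement (for a tf.function,
    # this is where tracing and graph construction happen).
    for _ in range(warmup):
        fn()
    # timeit.repeat returns one total time per repetition of `number` calls.
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    return min(totals) / number

# Stand-in workload so the sketch runs without TensorFlow.
print(bench(lambda: sum(range(1000))))
```

Taking the minimum over several repetitions is one common way to reduce noise from other processes; the mean or median would also be reasonable choices.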

Related

Why tensorflow probability is so slow?

I am trying to use TensorFlow Probability to test a Kalman filter. I followed the official documentation (https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/LinearGaussianStateSpaceModel) to define the model, generate a sample, and finally calculate the log-likelihood of the sample.
I am running the code provided in the documentation:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
ndims = 2
step_std = 1.0
noise_std = 5.0
model = tfd.LinearGaussianStateSpaceModel(
    num_timesteps=1000,
    transition_matrix=tf.linalg.LinearOperatorIdentity(ndims),
    transition_noise=tfd.MultivariateNormalDiag(
        scale_diag=step_std**2 * tf.ones([ndims])),
    observation_matrix=tf.linalg.LinearOperatorIdentity(ndims),
    observation_noise=tfd.MultivariateNormalDiag(
        scale_diag=noise_std**2 * tf.ones([ndims])),
    initial_state_prior=tfd.MultivariateNormalDiag(
        scale_diag=tf.ones([ndims])))
x = model.sample(1) # Sample from the prior on sequences of observations.
lp = model.log_prob(x) # Marginal likelihood of a (batch of) observations.
print(lp)
It takes 30 seconds to calculate the log-likelihood. PS: I ran the code on Colab with a GPU.
My questions: why is it so slow, and how can I improve the performance?
Thanks.
Eager mode (the default in TF) is pretty slow in general. You can graph-ify this with tf.function.
lp = tf.function(model.log_prob, autograph=False, jit_compile=False)(x)
You can also set jit_compile to True to lower the computation to XLA. That adds some compile time (possibly nontrivial) but usually makes the code faster, and the compile cost is amortized if you run it many times.
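Note that the one-liner above builds a new tf.function every time it is executed. A minimal sketch of creating the wrapper once and reusing it, so the tracing cost is paid only on the first call; `toy_log_prob` is a hypothetical stand-in for `model.log_prob` so that the example runs without TensorFlow Probability:

```python
import tensorflow as tf

trace_count = 0  # incremented only while TensorFlow traces the Python function

def toy_log_prob(x):
    # Hypothetical stand-in for model.log_prob: an unnormalized Gaussian log-density.
    global trace_count
    trace_count += 1
    return -0.5 * tf.reduce_sum(tf.square(x), axis=-1)

# Build the graph function once; jit_compile=True could be tried here for XLA.
fast_log_prob = tf.function(toy_log_prob, autograph=False, jit_compile=False)

x = tf.ones([4, 3])
first = fast_log_prob(x)   # traces the function, then runs the graph
second = fast_log_prob(x)  # same shape and dtype: reuses the traced graph
print("traces:", trace_count)
```

The Python body runs only during tracing, which is why the counter stays at one even though the function is called twice.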

Tensorflow operation seemingly not using GPU

I need to perform a job that averages large numbers of long vectors multiple times, and I would like this to be done on my GPU.
Monitoring nvtop and htop while running, I see that GPU (which always shows top activity when I train Keras models) is not being used at all in these operations, while CPU-use surges during these operations.
I have simulated it in the code snippet below (trying to minimize non-tf-work).
What am I doing wrong?
import tensorflow as tf
from tensorflow.math import add_n, add, scalar_mul
import numpy as np
tf.debugging.set_log_device_placement(True)
sess = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(sess)
os.environ["CUDA_VISIBLE_DEVICES"]="1"
#Make a random numpy matrix
vecs=np.random.rand(100, 300)
with sess.as_default():
    with tf.device('/GPU:0'):
        for _ in range(1000):
            #vecs=np.random.rand(100, 300)
            tf_vecs=tf.Variable(vecs, dtype=tf.float64)
            tf_invlgt=tf.Variable(1/np.shape(vecs)[0],dtype=tf.float64)
            vectors=tf.unstack(tf_vecs)
            sum_vecs=add_n(vectors)
            mean_vec=tf.Variable(scalar_mul(tf_invlgt, sum_vecs))
Thanks
Michael
I might be wrong, but could it be that CUDA_VISIBLE_DEVICES should be "0", like this:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"
See the related GitHub comment.
If it still does not work, you can also add a small piece of code to check whether TensorFlow can see the GPU devices:
from tensorflow.python.client import device_lib
def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']
This is mentioned here
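In TF 2.x a similar check can be done with the public tf.config API. A hedged sketch, assuming GPU index 0 is the device you want: the environment variables are set before TensorFlow is imported because CUDA reads them when it initializes, so setting them after a session has already been created (as in the question) most likely has no effect. The mean is computed with tf.reduce_mean here as a simpler stand-in for the unstack/add_n/scalar_mul sequence:

```python
import os

# Set these before TensorFlow initializes CUDA; before the import is safest.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # assumption: GPU 0 is the one you want

import tensorflow as tf

# An empty list means TensorFlow cannot see any GPU.
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)

# Average 100 vectors of length 300 and report where the result lives.
vecs = tf.random.uniform([100, 300], dtype=tf.float64)
mean_vec = tf.reduce_mean(vecs, axis=0)
print("Result placed on:", mean_vec.device)
```

If the printed device is a CPU even though a GPU is listed, tf.debugging.set_log_device_placement(True) from the question can help show where each op was placed.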

Saving, loading, and predicting from a TensorFlow Estimator model (2.0)

Is there a guide anywhere for serializing and restoring Estimator models in TF2? The documentation is very spotty, and much of it has not been updated for TF2. I've yet to see a clear and complete example anywhere of an Estimator being saved, loaded from disk, and used to predict from new inputs.
TBH, I'm a bit baffled by how complicated this appears to be. Estimators are billed as simple, relatively high-level ways of fitting standard models, yet the process for using them in production seems very arcane. For example, when I load a model from disk via tf.saved_model.load(export_path) I get an AutoTrackable object:
<tensorflow.python.training.tracking.tracking.AutoTrackable at 0x7fc42e779f60>
It's not clear why I don't get my Estimator back. It looks like there used to be a useful-sounding function tf.contrib.predictor.from_saved_model, but since contrib is gone, it does not appear to be in play anymore (except, it appears, in TFLite).
Any pointers would be very helpful. As you can see, I'm a bit lost.
Maybe the author doesn't need the answer anymore, but I was able to save and load a DNNClassifier using TensorFlow 2.1:
# training.py
from pathlib import Path
import tensorflow as tf
....
# Creating the estimator
estimator = tf.estimator.DNNClassifier(
    model_dir = <model_dir>,
    hidden_units = [1000, 500],
    feature_columns = feature_columns, # this is a list defined earlier
    n_classes = 2,
    optimizer = 'adam')
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
servable_model_path = Path(estimator.export_saved_model(<model_dir>, export_input_fn).decode('utf8'))
print(f'Model saved at {servable_model_path}')
For loading, you found the correct method; you just need to retrieve the predict_fn:
# testing.py
import tensorflow as tf
import pandas as pd
def predict_input_fn(test_df):
    '''Convert your dataframe using tf.train.Example() and tf.train.Features()'''
    examples = []
    ....
    return tf.constant(examples)
test_df = pd.read_csv('test.csv', ...)
# Loading the estimator
predict_fn = tf.saved_model.load(<model_dir>).signatures['predict']
# Predict
predictions = predict_fn(examples=predict_input_fn(test_df))
Hope that this can help other people too (:
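The elided body of predict_input_fn depends on your feature columns. As a minimal sketch, assuming two hypothetical numeric columns named `age` and `income`, each row could be serialized into a tf.train.Example as follows; the signature exported with build_parsing_serving_input_receiver_fn takes a batch of such serialized strings through its `examples` argument:

```python
import tensorflow as tf

def row_to_example(row):
    """Serialize a dict of numeric values into a tf.train.Example byte string."""
    feature = {
        name: tf.train.Feature(float_list=tf.train.FloatList(value=[float(value)]))
        for name, value in row.items()
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()

# Hypothetical rows; in practice these would be built from test_df.
rows = [{"age": 31.0, "income": 52000.0}, {"age": 45.0, "income": 61000.0}]

# One serialized string per row, as a 1-D string tensor.
examples = tf.constant([row_to_example(r) for r in rows])
print(examples.shape, examples.dtype)
```

String or integer columns would use bytes_list or int64_list instead of float_list, matching the dtypes declared in the feature columns.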

Using keras.layers.Add() in a keras.sequential model

Using TF 2.0 and tfp probability layers, I have constructed a keras.sequential model. I would like to export it for serving with TensorFlow Serving, and I would like to include the preprocessing and post processing steps in the servable.
My preprocessing steps are fairly simple: fill NAs with explicit values, encode a few strings as floats, normalize inputs, and denormalize outputs. For training, I have been doing the pre/post processing with pandas and numpy.
I know that I can export my Keras model's weights, wrap the keras.sequential model's architecture in a bigger TensorFlow graph, use low-level ops like tf.math.subtract(inputs, vector_of_feature_means) to do pre/post processing operations, define tf.placeholders for my inputs and outputs, and make a servable, but I feel like there has to be a cleaner way of doing this.
Is it possible to use keras.layers.Add() and keras.layers.Multiply() in a keras.Sequential model for explicit preprocessing steps, or is there some more standard way of doing these things?
As I understand it, the standard and efficient way to do this is to use TensorFlow Transform. Using TF Transform does not require adopting the entire TFX pipeline; it can also be used standalone.
TensorFlow Transform creates a Beam transformation graph, which injects these transformations as constants into the TensorFlow graph. Because the transformations are represented as constants in the graph, they are consistent across training and serving. The advantages of that consistency are:
Eliminates training-serving skew
Eliminates the need for separate preprocessing code in the serving system, which improves latency.
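To illustrate why baking the statistics in as constants avoids training-serving skew, here is a conceptual sketch in plain Python. It is not TF Transform's implementation; `analyze` and `transform` are hypothetical names. The full-pass statistics are computed once over the training data and then reused unchanged at serving time:

```python
def analyze(values):
    """Full pass over the training data: compute the constants to freeze."""
    return {"min": min(values), "max": max(values)}

def transform(value, constants):
    """Scale one value to [0, 1] using the frozen training-time constants."""
    span = constants["max"] - constants["min"]
    # Guard against a constant column; the fallback value is an arbitrary choice.
    return (value - constants["min"]) / span if span else 0.0

training_column = [10.0, 20.0, 30.0]
constants = analyze(training_column)  # done once, at training time

train_scaled = [transform(v, constants) for v in training_column]
serving_scaled = transform(25.0, constants)  # same constants at serving time
print(train_scaled, serving_scaled)
```

Because the serving path never recomputes the minimum and maximum, a request is scaled exactly as the training rows were.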
Sample code for TF Transform is shown below.
Code for importing the dependencies:
try:
    import tensorflow_transform as tft
    import apache_beam as beam
except ImportError:
    print('Installing TensorFlow Transform. This will take a minute, ignore the warnings')
    !pip install -q tensorflow_transform
    print('Installing Apache Beam. This will take a minute, ignore the warnings')
    !pip install -q apache_beam
    import tensorflow_transform as tft
    import apache_beam as beam
import tensorflow as tf
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata
from tensorflow_transform.tf_metadata import dataset_schema
Below is the preprocessing function, where all the transformations are defined. As of now, TF Transform doesn't provide a direct API for missing-value imputation, so for that step alone we have to write our own code using low-level APIs.
def preprocessing_fn(inputs):
    """Preprocess input columns into transformed columns."""
    # Since we are modifying some features and leaving others unchanged, we
    # start by setting `outputs` to a copy of `inputs`.
    outputs = inputs.copy()
    # Scale numeric columns to have range [0, 1].
    for key in NUMERIC_FEATURE_KEYS:
        outputs[key] = tft.scale_to_0_1(outputs[key])
    for key in OPTIONAL_NUMERIC_FEATURE_KEYS:
        # This is a SparseTensor because it is optional. Here we fill in a default
        # value when it is missing.
        # Note: tf.sparse_to_dense is a TF 1.x API (tf.compat.v1.sparse_to_dense in TF 2.x).
        dense = tf.sparse_to_dense(outputs[key].indices,
                                   [outputs[key].dense_shape[0], 1],
                                   outputs[key].values, default_value=0.)
        # Reshaping from a batch of vectors of size 1 to a batch of scalars.
        dense = tf.squeeze(dense, axis=1)
        outputs[key] = tft.scale_to_0_1(dense)
    # For all categorical columns except the label column, we generate a
    # vocabulary but do not modify the feature. This vocabulary is instead
    # used in the trainer, by means of a feature column, to convert the feature
    # from a string to an integer id.
    for key in CATEGORICAL_FEATURE_KEYS:
        tft.vocabulary(inputs[key], vocab_filename=key)
    # For the label column we provide the mapping from string to index.
    # Note: tf.contrib was removed in TF 2.x; this line is TF 1.x code.
    table = tf.contrib.lookup.index_table_from_tensor(['>50K', '<=50K'])
    outputs[LABEL_KEY] = table.lookup(outputs[LABEL_KEY])
    return outputs
Refer to the links below for detailed information and a TF Transform tutorial:
https://www.tensorflow.org/tfx/transform/get_started
https://www.tensorflow.org/tfx/tutorials/transform/census

calling tensorflow from keras.backend

I am learning Keras and TensorFlow for deep learning and I have a question.
With these imports:
import tensorflow as tf
from keras import backend as K
Is there a difference between these two calls:
K.foo
and
tf.foo
?
Under which conditions are they equivalent?
Yes, there may be a difference.
Keras is built on top of a backend. This backend may be Tensorflow, Theano or CNTK.
So, a function from keras will call a function from the backend, something like this:
# in keras.backend (pseudocode)
def foo(*args):
    # there may be some preprocessing or inversion in dimensions
    return tf.foo(*args_that_may_be_different)
It's impossible to have an answer for all functions. Some are indeed exactly the same, some may have a difference.
You can search the backend code, specifically the TensorFlow backend, and see how Keras handles each function.