What is the difference between JAX, Trax, and TensorRT, in simple terms? - tensorflow

I have been using TensorRT and TensorFlow-TRT to accelerate the inference of my DL algorithms.
Then I have heard of:
JAX https://github.com/google/jax
Trax https://github.com/google/trax
Both seem to accelerate DL, but I am having a hard time understanding them. Can anyone explain them in simple terms?

Trax is a deep learning framework created by Google and used extensively by the Google Brain team. It is an alternative to TensorFlow and PyTorch for implementing off-the-shelf, state-of-the-art deep learning models such as Transformers and BERT, mainly in the field of Natural Language Processing.
Trax is built on top of TensorFlow and JAX. JAX offers a NumPy-compatible API and adds automatic differentiation and just-in-time compilation. The important distinction between JAX and NumPy is that JAX uses a compiler called XLA (Accelerated Linear Algebra), which lets your NumPy-style code run on GPUs and TPUs rather than only on the CPU as in plain NumPy, thus speeding up computation.
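As a rough illustration of the point about JAX, here is a minimal sketch, assuming JAX is installed. The function name mean_squared_error and the toy data are made up for this example. The loss is written in NumPy style through jax.numpy, differentiated with jax.grad, and compiled through XLA with jax.jit.

```python
import jax
import jax.numpy as jnp


def mean_squared_error(weights, inputs, targets):
    # Written with the NumPy-style API that JAX exposes as jax.numpy.
    predictions = jnp.dot(inputs, weights)
    return jnp.mean((predictions - targets) ** 2)


# jax.grad differentiates with respect to the first argument (weights);
# jax.jit compiles the resulting function with XLA for the available backend.
loss_gradient = jax.jit(jax.grad(mean_squared_error))

# Toy data, invented for illustration only.
inputs = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
targets = jnp.array([1.0, 2.0, 3.0])
weights = jnp.zeros(2)

gradient = loss_gradient(weights, inputs, targets)
print(gradient.shape)   # one gradient entry per weight
print(jax.devices())    # the CPU, GPU, or TPU devices JAX found
```

The same code runs unchanged on CPU, GPU, or TPU; which device is used depends on the installed JAX build and the hardware available.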

Related

TensorFlow profiling for non-model computations

I have a computation that uses for loops, calls to TensorFlow matrix algorithms such as tf.linalg.lstsq, and TensorFlow iteration with tf.map_fn. I would like to profile it to see how much parallelism I am getting in tf.map_fn and in the matrix algorithms that get called.
This doesn't seem to be the use case for the TensorFlow Profiler, which is organized around the neural network model training loop.
Is there a way to use the TensorFlow Profiler for arbitrary TensorFlow computations, or is the go-to move in this case to use NVIDIA tools like nvprof?
I figured out that the nvprof, nvvp, and Nsight tools I was looking for are available as a Conda install of cudatoolkit-dev. Usage is described in this gist.
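For reference, the TensorFlow 2 profiler can also be started and stopped by hand around code that is not a training loop. The following is a minimal sketch under that assumption, not a tested recipe for the original workload. The helper solve_batch, the trace name, the tensor shapes, and the temporary log directory are all made up for illustration.

```python
import tempfile

import tensorflow as tf


def solve_batch(matrices, rhs):
    # Solve one least-squares problem per batch element with tf.map_fn.
    return tf.map_fn(
        lambda pair: tf.linalg.lstsq(pair[0], pair[1]),
        (matrices, rhs),
        fn_output_signature=tf.float32,
    )


# Hypothetical inputs: 8 independent problems, each 16 equations by 4 unknowns.
matrices = tf.random.normal([8, 16, 4])
rhs = tf.random.normal([8, 16, 1])

logdir = tempfile.mkdtemp()  # placeholder location for the profile data

tf.profiler.experimental.start(logdir)
# Trace labels this region so it can be found in the trace viewer.
with tf.profiler.experimental.Trace("lstsq_map_fn"):
    solution = solve_batch(matrices, rhs)
tf.profiler.experimental.stop()

print(solution.shape)
```

The resulting profile can be opened in TensorBoard's Profile tab. GPU kernel details depend on CUPTI being available, so the NVIDIA tools mentioned above may still give a fuller picture of parallelism.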

How do Ruy, XNNPACK, and Eigen work in TensorFlow Lite?

I heard from various sources (mostly the official documents) that TensorFlow Lite (for ARM) uses these three libraries, Ruy, Eigen, and XNNPACK, for its operation.
I understand they somehow accelerate the computation (mainly convolution) in TF Lite, but I'm not exactly sure what purpose each library serves. I know Eigen is a BLAS library, but I'm not sure what the others are and how they relate to each other in TF Lite.
Would someone care to explain what different purposes they serve and how they are used together in TF Lite? (Call stacks, maybe?)
I've been looking through the official documentation of each library, but I was unable to find many details for Ruy and XNNPACK. Ruy says that it provides efficient matrix multiplication, but isn't that what BLAS libraries do?
Older versions of TensorFlow Lite used the Eigen and Gemmlowp libraries to accelerate computation. However, on Arm platforms their performance was worse than that of, for example, the Arm Compute Library.
Around version 2.3, TensorFlow Lite replaced Eigen and Gemmlowp with the Ruy matrix multiplication library. They serve a similar purpose, but Ruy performs better. Ruy is the default on Arm platforms, but you can still compile TensorFlow Lite without it.
XNNPACK outperforms Ruy further still, but it focuses solely on floating-point operations.
For Ruy performance benchmarks, see this thread https://github.com/google/ruy/issues/195, and the benchmarks on Pixel 4 https://docs.google.com/spreadsheets/d/1CB4gsI7pujNRAf5Iz5vuD783QQqO2zOu8up9IpTKdlU/edit#gid=510573209.

fast.ai equivalent in tensorflow

Is there an equivalent or alternative library to fastai in TensorFlow for easier training and debugging of deep learning models, including analysis of the results of a trained model?
fastai is built on top of PyTorch; I am looking for something similar in TensorFlow.
The obvious choice would be to use tf.keras.
It is bundled with TensorFlow and is becoming its official "high-level" API -- to the point where in TF 2 you would probably need to go out of your way to avoid using it.
It is clearly the source of inspiration for fastai, which aims to ease the use of PyTorch as Keras does for TensorFlow, as the authors have mentioned time and again:
Unfortunately, Pytorch was a long way from being a good option for part one of the course, which is designed to be accessible to people with no machine learning background. It did not have anything like the clear simple API of Keras for training models. Every project required dozens of lines of code just to implement the basics of training a neural network. Unlike Keras, where the defaults are thoughtfully chosen to be as useful as possible, Pytorch required everything to be specified in detail. However, we also realised that Keras could be even better. We noticed that we kept on making the same mistakes in Keras, such as failing to shuffle our data when we needed to, or vice versa. Also, many recent best practices were not being incorporated into Keras, particularly in the rapidly developing field of natural language processing. We wondered if we could build something that could be even better than Keras for rapidly training world-class deep learning models.
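To give a sense of the compact training API the quote describes, here is a minimal tf.keras sketch. The synthetic data, layer sizes, and hyperparameters are made up for illustration and are not taken from fastai or from the answer.

```python
import numpy as np
import tensorflow as tf

# Synthetic data, invented for illustration: 256 samples, 10 features, 2 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(256, 10)).astype("float32")
labels = rng.integers(0, 2, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2),  # raw logits, one per class
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# For array inputs, fit() shuffles the training data each epoch by default.
history = model.fit(features, labels, epochs=2, batch_size=32, verbose=0)
print(sorted(history.history.keys()))
```

The returned history object records the loss and metrics per epoch, which is a starting point for the kind of result analysis the question asks about.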

What does DeepMind's Sonnet afford that Keras doesn't?

I'm really confused about the purpose of DeepMind's Sonnet library for TensorFlow. As far as I can tell from the documentation, it seems to do essentially what Keras does (flexible functional abstractions). Can someone tell me what the advantage of Sonnet is?
There isn't much difference between them. They are both:
High-level object oriented libraries that bring about abstraction when developing neural networks (NN) or other machine learning (ML) algorithms.
Built on top of TensorFlow (with the addition of Theano for Keras).
So why did they make Sonnet? It appears that Keras doesn't suit the needs of DeepMind, so DeepMind came up with Sonnet, a high-level object-oriented programming library built on top of TensorFlow, to address its research needs.
Keras and Sonnet are both trying to simplify deep reinforcement learning, with the major difference being Sonnet is specifically adapted to the problems that DeepMind explores.
The main advantage of Sonnet, from my perspective, is that you can use it to reproduce the research demonstrated in DeepMind's papers with greater ease than with Keras, since DeepMind uses Sonnet itself. Aside from that advantage, it's just another framework with which to explore deep RL problems.

Can we achieve the same computational scaling with TensorFlow as we could with MPI?

The reason I'm asking is that I've seen code around the web where people use MPI for distributed computing in order to scale their computational models. What I can't wrap my head around is that most of the examples I'm referring to are written in TensorFlow. Given that TensorFlow already supports MPI and gRPC, my question is whether we can achieve the same results purely with TensorFlow instead of using MPI.
To put it another way, what are the pros and cons of MPI compared with TF?
Thanks in advance!
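TF is a machine learning framework, and MPI is a message-passing standard with library implementations. TF is not an implementation of MPI. Distributed TF can use MPI as a communication layer (for example, through Horovod), although its built-in distributed runtime communicates over gRPC.
Bottom line: you cannot compare apples and oranges, nor MPI and TF.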