Compare deep learning frameworks: TensorFlow vs PaddlePaddle

I want to do research on deep learning, but I don't know which framework to choose between TensorFlow and PaddlePaddle. Can anyone compare the two frameworks? Which one is better, especially in terms of CPU running efficiency?

It really depends on what you are shooting for...
If you plan on training, CPU is not going to work well for you. Use colab or kaggle.
Assuming you do get a GPU, it depends if you want to focus on classification or object detection.
If you focus on classification, Keras is probably the easiest to work with, or PyTorch if you want some advanced stuff and the ability to change things.
If you plan on object detection, things get complicated... Inference is reasonably easy, but training is complicated. There are actually 4 platforms you should consider:
Tensorflow - powerful but very difficult to work with. If you do not use Keras (and for OD you usually can't), you need to preprocess the dataset into tfrecords, which is a pain (a minimal sketch of the TFRecord mechanism follows this list). The OD API has very cryptic messages and is very sensitive to the combination of TF version and API version. On the other hand, cool models like EfficientDet are more or less easy to use.
MMDetection - very powerful framework, has lots of advanced models, and once you understand how to work with it, you can easily work with any of the models it supports. Downside is that some models are slow to arrive (EfficientDet, for example).
PaddlePaddle - if you know Chinese, this should work OK, maybe. The documentation is a bit behind and usually requires lots of improvisation. Basically it is similar to MMDetection, just with a few unique models and a few missing ones.
detectron2 - I didn't work with this one, but it seems to support only a few models.
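As promised above, here is a minimal sketch of the TFRecord mechanism the TF Object Detection API builds on. It is only an illustration of how tf.train.Example and TFRecordWriter work (TF 2.x style); the real OD API expects many more feature keys (bounding boxes, class labels, ...), and the exact keys depend on the API version.

```python
# Minimal sketch of writing a TFRecord file; the TF Object Detection API
# expects many more feature keys (boxes, class labels, ...) - this only
# illustrates the general tf.train.Example mechanism.
import tensorflow as tf

def image_to_example(image_bytes, label):
    # Wrap raw values in tf.train.Feature protos.
    feature = {
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "image/label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

with tf.io.TFRecordWriter("train.tfrecord") as writer:
    with open("example.jpg", "rb") as f:  # hypothetical image file
        example = image_to_example(f.read(), label=1)
    writer.write(example.SerializeToString())
```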
You probably first need to define for yourself what you want to do, and then choose.
Good luck!

It is not that trivial. Some models run faster with one kind of framework, others with another. Furthermore, it depends on the hardware as well. See this blog. If inference is your only concern, then you can develop your model in any of the popular frameworks like TensorFlow, PyTorch, etc. In the end, convert your model to ONNX format and benchmark its performance with DNN-Bench to choose the best inference engine for your application.
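As a concrete illustration, exporting a PyTorch model to ONNX takes only a few lines. This is a generic sketch with a made-up model and input shape; TensorFlow models can be converted with the tf2onnx tool instead, and the benchmarking step with DNN-Bench is not shown here.

```python
# Minimal sketch: export a PyTorch model to ONNX so it can be benchmarked
# with different inference engines. Model and input shape are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

dummy_input = torch.randn(1, 128)  # example input with the expected shape
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=13,  # pick an opset supported by your target runtime
)
```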

Related

Which model (GPT2, BERT, XLNet, etc.) would you use for a text classification task? Why?

I'm trying to train a model for a sentence classification task. The input is a sentence (a vector of integers) and the output is a label (0 or 1). I've seen some articles here and there about using BERT and GPT2 for text classification tasks. However, I'm not sure which one I should pick to start with. Which of these recent NLP models, such as the original Transformer, BERT, GPT2, or XLNet, would you use to start with? And why? I'd rather implement it in Tensorflow, but I'm flexible about going with PyTorch too.
Thanks!
It highly depends on your dataset, and it is part of the data scientist's job to find which model is more suitable for a particular task in terms of the selected performance metric, training cost, model complexity, etc.
When you work on the problem you will probably test all of the above models and compare them. Which one of them should you choose first? Andrew Ng in "Machine Learning Yearning" suggests starting with a simple model so you can quickly iterate and test your ideas, data preprocessing pipeline, etc.
Don’t start off trying to design and build the perfect system. Instead, build and train a basic system quickly—perhaps in just a few days.
According to this suggestion, you can start with a simpler model such as ULMFiT as a baseline, verify your ideas and then move on to more complex models and see how they can improve your results.
Note that modern NLP models contain a large number of parameters and it is difficult to train them from scratch without a large dataset. That's why you may want to use transfer learning: you can download a pre-trained model, use it as a basis, and fine-tune it on your task-specific dataset to achieve better performance and reduce training time.
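To make the "start simple" advice concrete, a tiny baseline for the integer-sequence / binary-label setup described in the question could look like this in tf.keras. This is only a sketch with assumed vocabulary size and made-up layer sizes, not tuned for any particular dataset.

```python
# A deliberately simple baseline for binary sentence classification:
# integer token ids in, probability of label 1 out.
import tensorflow as tf

VOCAB_SIZE = 20000   # assumed vocabulary size

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# x_train: (num_samples, max_len) int array of padded token ids,
# y_train: (num_samples,) array of 0/1 labels.
# model.fit(x_train, y_train, validation_split=0.1, epochs=5, batch_size=32)
```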
I agree with Max's answer, but if the constraint is to use a state-of-the-art large pretrained model, there is a really easy way to do this: the library by HuggingFace called pytorch-transformers. Whether you choose BERT, XLNet, or whatever, they're easy to swap out. Here is a detailed tutorial on using that library for text classification.
EDIT: I just came across this repo, pytorch-transformers-classification (Apache 2.0 license), which is a tool for doing exactly what you want.
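For illustration, with recent versions of HuggingFace's library (now published as transformers, the successor to pytorch-transformers) swapping between BERT, XLNet, etc. is mostly a matter of changing the checkpoint name. A minimal sketch:

```python
# Minimal sketch of model swapping with HuggingFace transformers:
# changing MODEL_NAME is essentially all it takes to switch architectures.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"   # or e.g. "xlnet-base-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

inputs = tokenizer("This sentence is great!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, 2) -> untrained scores for the two classes
```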
Well, like others mentioned, it depends on the dataset, and multiple models should be tried and the best one chosen.
However, sharing my experience, XLNet beats all other models so far by a good margin. Hence, if learning is not the objective, I would simply start with XLNet, then try a few more down the line and conclude. It just saves time in exploring.
Below repo is excellent to do all this quickly. Kudos to them.
https://github.com/microsoft/nlp-recipes
It uses hugging face transformers and makes them dead simple. 😃
I have used XLNet, BERT, and GPT2 for summarization tasks (English only). Based on my experience, GPT2 works the best among all 3 on short paragraph-size notes, while BERT performs better for longer texts (up to 2-3 pages). You can use XLNet as a benchmark.

fast.ai equivalent in tensorflow

Is there any equivalent/alternative library to fastai in Tensorflow for easier training and debugging of deep learning models, including analysis of the results of a trained model?
Fastai is built on top of PyTorch; I'm looking for a similar one in Tensorflow.
The obvious choice would be to use tf.keras.
It is bundled with tensorflow and is becoming its official "high-level" API -- to the point where in TF 2 you would probably need to go out of your way not to use it at all.
It is clearly the source of inspiration for fastai to ease the use of pytorch as Keras does for tensorflow, as mentioned by the authors time and again:
Unfortunately, Pytorch was a long way from being a good option for part one of the course, which is designed to be accessible to people with no machine learning background. It did not have anything like the clear simple API of Keras for training models. Every project required dozens of lines of code just to implement the basics of training a neural network. Unlike Keras, where the defaults are thoughtfully chosen to be as useful as possible, Pytorch required everything to be specified in detail. However, we also realised that Keras could be even better. We noticed that we kept on making the same mistakes in Keras, such as failing to shuffle our data when we needed to, or vice versa. Also, many recent best practices were not being incorporated into Keras, particularly in the rapidly developing field of natural language processing. We wondered if we could build something that could be even better than Keras for rapidly training world-class deep learning models.
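To give a rough feel for that high-level workflow, training a small model with tf.keras takes only a handful of lines. This is a generic sketch based on the standard MNIST quickstart, not a feature-for-feature fastai equivalent.

```python
# Sketch of the tf.keras high-level workflow: define, compile, fit, evaluate.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)
```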

Which kind of high-level TensorFlow API should I learn?

I have been studying tensorflow for about one month. I just feel that creating a network with the primitive operations of Tensorflow is very verbose. Then I found some high-level APIs, such as TF-Slim, TF Learn, and Keras. But the multiple choices confuse me, so I don't know which one I should learn.
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow, but as far as I have investigated, it's only for convnets. The networks Keras can build are more diverse.
Can anyone give a comparison between them so that I can choose which high-level API to learn? In terms of:
1. popularity: which ones are the most popular ?
2. practicality: what kinds of network can they build ?
3. performance: what's their training/inference performance ?
... something else
Hope someone could give me a suggestion. Thanks.
I suggest you start with Keras.
It's very easy to learn, it has a broad user base (see Shobhit's link), there is a ton of reference code out there on GitHub and in tutorials / MOOCs / eBooks etc., and you can build almost anything with it. And I personally think it has good documentation (although some might disagree with that...).
Since it's an API that connects to Tensorflow, Theano, CNTK (and possibly more frameworks in the future) you have even more flexibility.
Don't worry too much about performance. That's really not important while you're learning.
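To give a feel for how much less verbose Keras is than primitive TensorFlow ops, a small convnet is only a few lines. This is a generic sketch with made-up layer sizes, written here with tf.keras; the standalone multi-backend Keras looks essentially the same with `import keras`.

```python
# Small convnet in Keras to illustrate how little code a network definition takes.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.summary()  # prints the layer stack and parameter counts
```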

How to specify the architecture of a deep neural network in Tensorflow?

I am a newbie in Tensorflow.
Actually, I am testing some examples from the Tensorflow website, and I am starting to understand some features of the framework, but what I don't understand is how I can design my architecture, i.e. the number of layers and the type of each layer ("conv", "pool", ...), and whether it is necessary to do that at all, because there are many predefined architectures like AlexNet.
Thanks,
I would strongly recommend working through their hands on tutorial, depending on if you have previous ML experience (https://www.tensorflow.org/get_started/mnist/pros) or not (https://www.tensorflow.org/get_started/mnist/beginners). The questions you are asking are answered in there.
The question of using predefined architectures versus self-defined ones depends on your use case. If you want to do something easy, like classifying whether there is a car in the scene or not, a shallower architecture might work better, because it is faster and a deeper one is overkill. However, most architectures are similar to the ones already defined in the literature.
Another question that naturally arises when talking about predefined architectures is transfer learning / fine tuning. Often predefined architectures have already been trained on some big dataset (mostly ImageNet) and already perform really well out of the box for many tasks. With little training data it makes a lot of sense to use this. With lots of training data it can hinder your progress though.
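A minimal sketch of that fine-tuning pattern in tf.keras, assuming an ImageNet-pretrained MobileNetV2 backbone and a made-up two-class task:

```python
# Sketch of transfer learning: reuse an ImageNet-pretrained backbone,
# freeze it, and train only a small new classification head.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # hypothetical 2-class task
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)
```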

regarding caffe to tensorflow

Currently, there are a lot of deep learning models developed in Caffe instead of tensorflow. If I want to re-write these models in tensorflow, how do I start? I am not familiar with the Caffe structure. It seems to me that there are some files storing only the model architecture. My guess is that I only need to understand and transfer that architecture design into Tensorflow; the input/output/training will be re-written anyway. Does this approach make sense?
I see that some Caffe implementations also need to hack into the original Caffe framework down to the C++ level and make some modifications. I am not sure under what kind of scenario the Caffe model developers need to go that deep. If I just want to re-implement their models in Tensorflow, do I need to check their C++ modifications, which are sometimes not documented at all?
I know there are some Caffe-to-Tensorflow conversion tools, but there are always some constraints, and I think re-writing the model directly may be more straightforward.
Any thoughts, suggestions, and links to tutorials are highly appreciated.
I have already asked a similar question.
To synthesize the possible answers:
You can either use pre-existing tools like ethereon's kaffe (which is really simple to use). But its simplicity comes at a cost: it is not easy to debug.
Or, as Yaroslav Bulatov answered, start from scratch and try to make each layer match. In this regard I would advise you to look at ry's github, which is a remarkable example: you basically have small helper functions which indicate how to reshape the weights appropriately from Caffe to Tensorflow, which is the only real thing you have to do to make simple models match, and it also provides an activations check layer by layer.
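The "reshape the weights appropriately" part usually boils down to a transpose like the sketch below: Caffe stores conv kernels as (out_channels, in_channels, height, width) while TensorFlow expects (height, width, in_channels, out_channels). The helper names and shapes here are just for illustration.

```python
# Helper sketch for porting Caffe weights to TensorFlow layout.
import numpy as np

def caffe_conv_to_tf(weights_caffe):
    # Caffe conv kernel layout: (out_channels, in_channels, height, width)
    # TensorFlow conv kernel layout: (height, width, in_channels, out_channels)
    return np.transpose(weights_caffe, (2, 3, 1, 0))

def caffe_fc_to_tf(weights_caffe):
    # Caffe InnerProduct layout: (out_features, in_features); TF Dense expects (in, out).
    # Note: if the FC layer follows a conv layer, the flattening order
    # (NCHW vs NHWC) also has to be accounted for, which is not shown here.
    return weights_caffe.T

# Example: a hypothetical 3x3 conv with 16 input and 32 output channels.
w_caffe = np.random.randn(32, 16, 3, 3).astype(np.float32)
w_tf = caffe_conv_to_tf(w_caffe)
print(w_tf.shape)  # (3, 3, 16, 32)
```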