What's the difference between training with Hugging Face's Trainer class and native PyTorch or TensorFlow? - tensorflow

I came across this article from Hugging Face. It shows training with the Trainer API and also with a native PyTorch training loop, and it talks about them as if they were interchangeable. But I haven't seen any explanation comparing the two. Are they the same? Is one faster than the other? Some third-party tutorials also use the two interchangeably.

Related

How to connect different deep learning architectures?

Based on 5 features extracted from a sample of binary files, the idea is to combine different deep learning models, each of them processing one feature.
Or, more simply, is there a way to connect a CNN and an RNN so that the output of the CNN becomes the input of the RNN?
Any help or reference would be appreciated.
The Keras Functional API can be used to combine different deep learning models.
It is much more flexible than the Keras Sequential API, in that it supports models with multiple inputs and multiple outputs.
You can implement non-linear topology with the Functional API.
For example:
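A minimal sketch of the idea, assuming TensorFlow 2 with tf.keras. The input shapes, layer sizes and names below are hypothetical and only there for illustration; they are not taken from the question. One branch feeds a CNN into an RNN, a second branch processes a separate feature vector, and the two are merged before the output layer:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical inputs: a sequence of 100 steps with 8 channels,
# and a separate vector of 5 hand-extracted features.
seq_in = tf.keras.Input(shape=(100, 8), name="sequence")
feat_in = tf.keras.Input(shape=(5,), name="features")

# CNN branch: the convolution output is still a sequence, so it can feed an RNN.
x = layers.Conv1D(32, kernel_size=3, activation="relu")(seq_in)
x = layers.MaxPooling1D(pool_size=2)(x)
# RNN consumes the CNN output and returns its final state.
x = layers.LSTM(16)(x)

# Second branch: a small dense network on the feature vector.
f = layers.Dense(8, activation="relu")(feat_in)

# Merge both branches and predict a single value.
merged = layers.concatenate([x, f])
out = layers.Dense(1, activation="sigmoid")(merged)

model = tf.keras.Model(inputs=[seq_in, feat_in], outputs=out)
```

More branches (one per feature, as in the question) could be added the same way and passed to the same concatenate call.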

Regression using MXNet

I have a regression model based on various independent features which eventually predicts a value using a custom loss function. It is somewhat similar to the link below.
https://www.evergreeninnovations.co/blog-quantile-loss-function-for-machine-learning/
The current model is built with the TensorFlow library, but now I want to use MXNet because of the speed and other advantages it provides. How do I write similar logic in MXNet with a custom loss function?
Simple regression with L2 loss is featured in two well-known tutorials. You can pick either of them and customize the loss:
In the D2L.ai book (used at many universities):
https://d2l.ai/chapter_linear-networks/linear-regression-gluon.html
In The Straight Dope (a guide to Gluon, the Python API of MXNet). A lot of that guide went into D2L.ai:
https://gluon.mxnet.io/chapter02_supervised-learning/linear-regression-gluon.html
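As a rough illustration of the kind of loss that could replace L2 in either tutorial, here is the quantile (pinball) loss from the linked blog post, sketched in plain NumPy rather than MXNet. The function name and the sample values are hypothetical, and the same expression would still have to be rewritten with MXNet's own array operations before it could be used for training there:

```python
import numpy as np

def quantile_loss(y_true, y_pred, q=0.5):
    """Mean pinball (quantile) loss for a quantile q between 0 and 1."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    error = y_true - y_pred
    # Under-prediction (error > 0) is weighted by q,
    # over-prediction (error < 0) by (1 - q).
    return float(np.mean(np.maximum(q * error, (q - 1.0) * error)))

# Hypothetical targets and predictions, only to exercise the function.
high = quantile_loss([1.0, 2.0, 4.0], [2.0, 2.0, 2.0], q=0.9)
low = quantile_loss([1.0, 2.0, 4.0], [2.0, 2.0, 2.0], q=0.1)
```

With q above 0.5 the loss penalizes under-prediction more heavily, which is what pushes the model towards the upper quantile.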

fast.ai equivalent in tensorflow

Is there an equivalent or alternative library to fastai for TensorFlow that makes it easier to train and debug deep learning models, including analysis of the results of a trained model?
fastai is built on top of PyTorch; I am looking for something similar for TensorFlow.
The obvious choice would be to use tf.keras.
It is bundled with TensorFlow and is becoming its official "high-level" API, to the point where in TF 2 you would probably need to go out of your way to avoid using it.
Keras was clearly the inspiration for fastai, which aims to ease the use of PyTorch much as Keras does for TensorFlow, as the authors have mentioned time and again:
Unfortunately, Pytorch was a long way from being a good option for part one of the course, which is designed to be accessible to people with no machine learning background. It did not have anything like the clear simple API of Keras for training models. Every project required dozens of lines of code just to implement the basics of training a neural network. Unlike Keras, where the defaults are thoughtfully chosen to be as useful as possible, Pytorch required everything to be specified in detail. However, we also realised that Keras could be even better. We noticed that we kept on making the same mistakes in Keras, such as failing to shuffle our data when we needed to, or vice versa. Also, many recent best practices were not being incorporated into Keras, particularly in the rapidly developing field of natural language processing. We wondered if we could build something that could be even better than Keras for rapidly training world-class deep learning models.

TensorFlow checkpoints and models vis-a-vis multi-gpu settings

Consider a practical situation that researchers using TensorFlow often find themselves in:
Multiple GPUs are available for training and I'd like to use them for speedup.
Subsequently, I'd like to give the trained model to a colleague or collaborator with a different number of GPUs (maybe just one).
It is important that the code does not need to be modified when it is shared with multiple collaborators.
However, the TensorFlow documentation and examples do not explain this scenario clearly.
The basic questions are:
How do I write code that trains a model on multiple GPUs and lets the model be restored easily from checkpoints?
How do I deal with the situation where my collaborators have a different number of GPUs? More precisely, what best practices should I follow to ensure that the code and model I share with them are easy for them to use?
Are there examples or best practices that other TensorFlow users facing the same situation can share?
NOTE: I am not looking for a ready-made solution. My main purpose is to understand a TensorFlow feature that is not very well documented.

CNTK time series anomaly detection tutorial or documentation (RNN/LSTM)?

Problem
Do you have a tutorial for LSTM or RNN time series anomaly detection using deep learning with CNTK? If not, can you make one or suggest a series of simple steps here for us to follow?
I am a software developer and a member of a team investigating the use of deep learning for anomaly detection on time series data we have. We have not found anything in your Python docs that can help us. It seems most of the tutorials are for visual recognition problems and not specific to the problem domain of interest to us.
Using LSTM and RNN in Anomaly Detection
I have found the following:
This link references why we are trying to use time series for anomaly detection
This paper convinced us that the first link is a respected approach to the problem in general
This link also outlined the same approach
I looked around for CNTK questions here but didn't find any similar one, so I hope this question helps other developers in the future.
Additional Notes and Questions
My problem is that I am finding CNTK not as simple to use or as well documented as I had hoped. Frankly, our framework and stack are heavy on .NET and Microsoft technologies. So I repeat the question for emphasis, with a few follow-ups:
Do you have any resources you can recommend to developers learning neural networks, deep learning, and so on, to help us understand what is going on under the hood with CNTK?
Build 2017 mentions that C# is supported by CNTK. Can you please point us to the documentation and support for this?
Most importantly, can you please help us get unstuck on time series anomaly detection using CNTK?
Thank you very much for your time and assistance in reading and answering this question.
Thanks for your feedback. Your suggestions help improve the toolkit.
First Bullet:
I suggest you start with the CNTK tutorials.
https://github.com/Microsoft/CNTK/tree/master/Tutorials
They are numbered from CNTK 101 to 301, and I suggest that you work through them. Even though many of them use image data, the concepts and models can also be used to build solutions with numerical data. The 101-103 series is great for understanding the basics of the train-test-predict workflow.
Second Bullet:
Once you have trained the model (using Python is recommended), model evaluation can be performed using different language bindings, C# being one of them.
https://github.com/Microsoft/CNTK/wiki/CNTK-Evaluation-Overview
Third Bullet:
There are different approaches suggested in the papers you have cited. All of them are possible in CNTK with some changes to the code in the tutorials.
The key tutorials for you would be CNTK 106, CNTK 105, and CNTK 202.
Anomaly as classification: This would involve labeling your target value as 1 of N classes, with one of the classes being "anomaly". Then you can combine 106 with 202 to classify the prediction.
Anomaly as an autoencoder: You will need to study the 105 autoencoder tutorial. Instead of a dense network, you could apply the same concept to recurrent networks. Train only on the normal data. Once trained, pass any data through the trained model. The difference between the input and its autoencoded version will be small for normal data but much larger for anomalies. The 105 tutorial uses images, but you can train these models with any numerical data.
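The scoring step of the autoencoder approach can be sketched without CNTK. In the sketch below, a linear projection fitted with NumPy's SVD stands in for the trained (recurrent) autoencoder, so this is not the CNTK 105 model. The synthetic sine-wave windows, the function names, and the "mean plus three standard deviations" threshold are all assumptions made for illustration:

```python
import numpy as np

def fit_linear_autoencoder(normal, n_components):
    """Stand-in for a trained autoencoder: project onto the top principal directions."""
    mean = normal.mean(axis=0)
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    basis = vt[:n_components]

    def reconstruct(x):
        # Encode into n_components values, then decode back to the input space.
        return (x - mean) @ basis.T @ basis + mean

    return reconstruct

def reconstruction_error(reconstruct, x):
    """Mean squared difference between each window and its reconstruction."""
    return np.mean((x - reconstruct(x)) ** 2, axis=1)

# Synthetic "normal" data: 200 windows of a sine wave with random phase and small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 20)
phases = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
normal = np.sin(t + phases) + 0.05 * rng.normal(size=(200, 20))

# Train only on normal data, then derive a threshold from its reconstruction errors.
reconstruct = fit_linear_autoencoder(normal, n_components=2)
train_err = reconstruction_error(reconstruct, normal)
threshold = train_err.mean() + 3.0 * train_err.std()  # assumed rule of thumb

# Score new windows: one clean sine wave and one with an injected spike.
clean = np.sin(t + 1.0)
spiked = np.sin(t).copy()
spiked[10] += 3.0
errors = reconstruction_error(reconstruct, np.stack([clean, spiked]))
is_anomaly = errors > threshold
```

With a real model, only fit_linear_autoencoder would be replaced by the trained network's forward pass; the error computation and thresholding stay the same. How the threshold is chosen is a design decision that depends on how many false alarms are acceptable.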
Hope you find these suggestions helpful.