TensorFlow 2 Detection Model Zoo metrics

TensorFlow 2 Detection Model Zoo metrics - tensorflow

I know it's a banality, but i'm really confused on what Speed (ms) and COCO mAP means HERE.
I get the idea, lower speed and higher mAP are better, but can i ask what does those metrics mean?
I have to write a report about a project that uses one of the model listed in the github model of tensorflow, so i would like a technical description of those two if possible. About COCO mAP i found something already, i'm trying to understand it, but nothing related to Speed. What does speed measure?
I'm sorry about the stupid question, but i like to fully understand things

It refers to inference speed, how much time it take for the network to provide an output based on your input.

Related

Which model (GPT2, BERT, XLNet and etc) would you use for a text classification task? Why?

I'm trying to train a model for a sentence classification task. The input is a sentence (a vector of integers) and the output is a label (0 or 1). I've seen some articles here and there about using Bert and GPT2 for text classification tasks. However, I'm not sure which one should I pick to start with. Which of these recent models in NLP such as original Transformer model, Bert, GPT2, XLNet would you use to start with? And why? I'd rather to implement in Tensorflow, but I'm flexible to go for PyTorch too.
Thanks!

It highly depends on your dataset and is part of the data scientist's job to find which model is more suitable for a particular task in terms of selected performance metric, training cost, model complexity etc.
When you work on the problem you will probably test all of the above models and compare them. Which one of them to choose first? Andrew Ng in "Machine Learning Yearning" suggest starting with simple model so you can quickly iterate and test your idea, data preprocessing pipeline etc.
Don’t start off trying to design and build the perfect system.
Instead, build and train a basic system quickly—perhaps in just a few
days
According to this suggestion, you can start with a simpler model such as ULMFiT as a baseline, verify your ideas and then move on to more complex models and see how they can improve your results.
Note that modern NLP models contain a large number of parameters and it is difficult to train them from scratch without a large dataset. That's why you may want to use transfer learning: you can download pre-trained model and use it as a basis and fine-tune it to your task-specific dataset to achieve better performance and reduce training time.

I agree with Max's answer, but if the constraint is to use a state of the art large pretrained model, there is a really easy way to do this. The library by HuggingFace called pytorch-transformers. Whether you chose BERT, XLNet, or whatever, they're easy to swap out. Here is a detailed tutorial on using that library for text classification.
EDIT: I just came across this repo, pytorch-transformers-classification (Apache 2.0 license), which is a tool for doing exactly what you want.

Well like others mentioned, it depends on the dataset and multiple models should be tried and best one must be chosen.
However, sharing my experience, XLNet beats all other models so far by a good margin. Hence if learning is not the objective, i would simple start with XLNET and then try a few more down the line and conclude. It just saves time in exploring.
Below repo is excellent to do all this quickly. Kudos to them.
https://github.com/microsoft/nlp-recipes
It uses hugging face transformers and makes them dead simple. 😃

I have used XLNet, BERT, and GPT2 for summarization tasks (English only). Based on my experience, GPT2 works the best among all 3 on short paragraph-size notes, while BERT performs better for longer texts (up to 2-3 pages). You can use XLNet as a benchmark.

Is this overfitting

I’m running a machine learning algorithm to answer True/False questions.
Assuming I use classification algo.
After running 1200 data, I got 30% of accuracy.
But then, I made a second algorithm to always negate the first algorithm’s answer
Thus it’s accuracy is 70%
Is this an overfitting for the second algo? Assuming my 1st algorithm consistenly predicts 30% accuracy

To your questions.
I feel like this answer kind of depends on the machine learning model which you choose and the training set. Most ML Models make mistakes initially. In your case if the training set of Algo 2 is 70% it might mean that it is good at predicting the wrong thing? If i'm understanding this correctly? All though this might be true in the beginning of the data negating a ML answer is a bad idea. The better idea is to prepare your data correctly and train it on a data set which is the best fit for your model.
Most Machine learning models make mistakes it is bound to happen. But the training set and all that data helps you to choose the right model. Data preparation is key in order to make your training set correctly. I know I'm bouncing all over the place. I apologize for that
For instance we might have a logistic regression model and we want to identify the individuals who have a certain condition versus those who don't. The first thing we do is properly prepare our data and then train it (this is the short version) but my point is training a model is very important it allows your ML model to be able to predict the accuracy of it.
I should say I really enjoy Machine Learning/ Deep Learning but I am no means an expert. I highly recommend this class though its how I started off understanding the fundamentals.
Coursera Andrew Ng course

YOLO vs Inception on unique images

I have images of unique products that are used at my workplace. I can't imagine that the inception database already has similar items that it has been trained on.
I tried to train a model using YOLO. It was taking a very very long time. Maybe 7minutes between epochs; and I wanted to do 1000 epochs due to small data size.
I used tiny-yolov2-voc cfg/weight on 1.0 GPU. I had a video of the item but i broke it up into frames so i could annotate. I then attempted to train on the images (not video). The products are healthcare related. Basically anything that a hospital would use.
Ive also used the inception method on images I got from Google. I noticed that inception method was very fast and resulted in accurate predictions. However, i'm worried that my images are too unique for inception to work.
Which method is best to use?
If you recommend YOLO, can you please provide suggestions on how to speed up the training phase?
If you recommend inception, can you please provide an explanation why it would work on unique images? I guess i'm having trouble understanding how inception knows which item i'm trying to train on without me providing annotations.
Thanks in advance

Just my impression (no recommendation or even related experience)
Having a look at the Hardware recommendations related to darknet a assumption is that you might stock up your own hardware to get faster results.
I read about the currently three different versions of YOLO and expect there are lot's of GFLOPS training included if you download the recommended files, but if the models never fit to your products then for you they never might be very helpful.
I must admit I've neither been active with YOLO nor with Tensorflow, so my impression might not be helpful at all.
If you see some videos of YOLO you can remark that sometimes a camel is labeled with horse and the accuracy seems being bad but it depends on the threshold that is applied to the images, so the videos look amazing as it seems the recognition is done so fast but with higher accuracy the process would slow down - also depending on the trained motives.
They never hide it though, they explain on an image where a dog is labeled as cow and a horse as sheep (Version 2) that in combination with darknet it's getting much faster but less accurate too, so usage of darknet is an important aspect too.
The information about details seems being quite bad on the websites of YOLO, they present it more like you'd do with a popstar, in comparison the website of Tensorflow looks more academic and is informing about the mathematics behind the framework.
Concerning Tensorflow I don't know about the hardware-recommendations, but as you wrote your results are useful, probably they are a bit or even much less.
My impression is that YOLO is primary intended for real-time detection in (live-)videos and needs much training for high accuracy. So depending on your use-case it might be right but you'd to invest in hardware probably for professional usage.
This is not an opinion against Tensorflow but that I had to verify more and it seems taking more time to get an impression. Concerning Tensorflow in the moment I even can't say if it can be used for real-time-detection, how accurate it is then and if the results are then still better then those of YOLO.
My assumption is that concerning both solutions it's a matter of involved elements (like the decision if to include darknet for speed), configuration, training and adjustments. Probably there is always something to increase in speed and accuracy, so investing in a system for recognition won't be static process with fixed end in timeline, but a steady process.
This is just a short overview of my impressions, I've never any experience with any recognition-software and hardly recommend that you make any decision based on my words.
Just if you want to do use any recognition software professional, especially for real-time-recognition, then you've to invest in hardware probably.

To my understanding of your problem you need you need inception with the capability of identifying your unique images. In this circumstance you can use transfer-learning on the inception model. With transfer-learning you can still train inception your own pictures while retaining the previous knowledge of inception.
More on transfer-learning

Process to build our own model for image detection

Currently, I am working on deep neural network for image detection and I founded a model called YOLO Network, and it's very powerful to make objects detections, but I have a question:
How can we design and concept our own model? Do we use a brut force for that, for example "I use 2 convolutional and 1 pooling layer and 1 fully connected layer" after that if the result is'nt good I change the number of layers and change the parameter until I find the best model, Please if there is anyone who knows some informations about that, show me how ?
I use Tensorflow.
Thanks,

There are a couple of papers addressing this issue. For example in http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf some general principles are mentioned, like preserving information by not having too rapid changes in any cut of the graph seperating the output from the input.
Another paper is https://arxiv.org/pdf/1606.02228.pdf where specific hyperparameter combinations are tried.
The remainder are just what you observe in practice and depends on your dataset and on your requirement. Maybe you have performance requirements because you want to deploy to mobile or you need more than 90 % accuracy. Then you will have to choose your model accordingly.

How to predict using Tensorflow?

This is a newbie question for the tensorflow experts:
I reading lot of data from power transformer connected to an array of solar panels using arduinos, my question is can I use tensorflow to predict the power generation in future.
I am completely new to tensorflow, if can point me to something similar I can start with that or any github repo which is doing similar predictive modeling.
Edit: Kyle pointed me to the MNIST data, which I believe is a Image Dataset. Again, not sure if tensorflow is the right computation library for this problem or does it only work on Image datasets?
thanks, Rajesh

Surely you can use tensorflow to solve your problem.
TensorFlow™ is an open source software library for numerical
computation using data flow graphs.
So it works not only on Image dataset but also others. Don't worry about this.
And about prediction, first you need to train a model(such as linear regression) on you dataset, then predict. The tutorial code can be found in tensorflow homepage .
Get your hand dirty, you will find it works on your dataset.
Good luck.

You can absolutely use TensorFlow to predict time series. There are plenty of examples out there, like this one. And this is a really interesting one on using RNN to predict basketball trajectories.
In general, TF is a very flexible platform for solving problems with machine learning. You can create any kind of network you can think of in it, and train that network to act as a model for your process. Depending on what kind of costs you define and how you train it, you can build a network to classify data into categories, predict a time series forward a number of steps, and other cool stuff.
There is, sadly, no short answer for how to do this, but that's just because the possibilities are endless! Have fun!

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas