How to determine what type of layers do I need for my Deep learning model? - tensorflow

Suppose that I have want to make a model that does something. Now when I search about the topic in Google or YouTube, I find many related tutorials and it seems like some clever programmer had already implemented that model with Deep learning.
But how do they know that what type of layers, what type of activation functions, loss functions, optimizer, number of units etc. they need to solve that certain problem using deep learning.
Are there any techniques for knowing this, or its just a matter of understanding and experience? Also it would be very helpful if somebody could point me to some videos or articles answering my question.

This is more of a matter of understanding and experience. When building a model from scratch, you must understand which optimizer, loss, etc. makes sense for your particular problem. In order to choose these appropriately, you must understand the differences between the available optimizers, loss functions, etc.
In regards to choosing how many layers and nodes, what batch size, what learning rate, etc.-- these are all hyperparameters that you will need to test and tune as you experiment with your model.
I have a Deep Learning Fundamentals YouTube playlist that you may find helpful. It covers the fundamental basics of each of these topics in short videos. Additionally, this Deep Learning with Keras playlist may also be beneficial if you're wanting to focus more on coding after getting the basic concepts down.

Thanks for the question.
The CS231n Stanford lectures on CNN is the best for beginners refer to the video lectures here and class notes are available here
After watching the lectures and completing the assignments, you will get a basic idea of what Deep Learning is and all the algorithms available etc.
But when it comes to solving real-world problems this won't be sufficient So take this course by Jeremy Howard where he teaches more on how to approach a problem using Kaggle platform.
Keep on solving more problems experimenting new models and algorithms using several platforms like hackerearth, Kaggle, topcoder etc.

Related

Deep learning for computer vision: What after MNIST stage?

I am trying to explore computer vision using deep learning techniques. I have gone through basic literature, made a NN of my own to classify digits using MNIST data(without using any library like TF,Keras etc, and in the process understood concepts like loss function, optimization, backward propagation etc), and then also explored Fashion MNIST using TF Keras.
I applied my knowledge gained so far to solve a Kaggle problem(identifying a plant type), but results are not very encouraging.
So, what should be my next step in progress? What should I do to improve my knowledge and models to solve more complex problems? What more books, literature etc should I read to move ahead of beginner stage?
You should try hyperparameter tuning, it will help improve your model performance. Feel free to surf around various articles, fine tuning your model will be the next step as you have fundamental knowledge regarding how model works.

YOLO vs Inception on unique images

I have images of unique products that are used at my workplace. I can't imagine that the inception database already has similar items that it has been trained on.
I tried to train a model using YOLO. It was taking a very very long time. Maybe 7minutes between epochs; and I wanted to do 1000 epochs due to small data size.
I used tiny-yolov2-voc cfg/weight on 1.0 GPU. I had a video of the item but i broke it up into frames so i could annotate. I then attempted to train on the images (not video). The products are healthcare related. Basically anything that a hospital would use.
Ive also used the inception method on images I got from Google. I noticed that inception method was very fast and resulted in accurate predictions. However, i'm worried that my images are too unique for inception to work.
Which method is best to use?
If you recommend YOLO, can you please provide suggestions on how to speed up the training phase?
If you recommend inception, can you please provide an explanation why it would work on unique images? I guess i'm having trouble understanding how inception knows which item i'm trying to train on without me providing annotations.
Thanks in advance
Just my impression (no recommendation or even related experience)
Having a look at the Hardware recommendations related to darknet a assumption is that you might stock up your own hardware to get faster results.
I read about the currently three different versions of YOLO and expect there are lot's of GFLOPS training included if you download the recommended files, but if the models never fit to your products then for you they never might be very helpful.
I must admit I've neither been active with YOLO nor with Tensorflow, so my impression might not be helpful at all.
If you see some videos of YOLO you can remark that sometimes a camel is labeled with horse and the accuracy seems being bad but it depends on the threshold that is applied to the images, so the videos look amazing as it seems the recognition is done so fast but with higher accuracy the process would slow down - also depending on the trained motives.
They never hide it though, they explain on an image where a dog is labeled as cow and a horse as sheep (Version 2) that in combination with darknet it's getting much faster but less accurate too, so usage of darknet is an important aspect too.
The information about details seems being quite bad on the websites of YOLO, they present it more like you'd do with a popstar, in comparison the website of Tensorflow looks more academic and is informing about the mathematics behind the framework.
Concerning Tensorflow I don't know about the hardware-recommendations, but as you wrote your results are useful, probably they are a bit or even much less.
My impression is that YOLO is primary intended for real-time detection in (live-)videos and needs much training for high accuracy. So depending on your use-case it might be right but you'd to invest in hardware probably for professional usage.
This is not an opinion against Tensorflow but that I had to verify more and it seems taking more time to get an impression. Concerning Tensorflow in the moment I even can't say if it can be used for real-time-detection, how accurate it is then and if the results are then still better then those of YOLO.
My assumption is that concerning both solutions it's a matter of involved elements (like the decision if to include darknet for speed), configuration, training and adjustments. Probably there is always something to increase in speed and accuracy, so investing in a system for recognition won't be static process with fixed end in timeline, but a steady process.
This is just a short overview of my impressions, I've never any experience with any recognition-software and hardly recommend that you make any decision based on my words.
Just if you want to do use any recognition software professional, especially for real-time-recognition, then you've to invest in hardware probably.
To my understanding of your problem you need you need inception with the capability of identifying your unique images. In this circumstance you can use transfer-learning on the inception model. With transfer-learning you can still train inception your own pictures while retaining the previous knowledge of inception.
More on transfer-learning

Which kinds of high level API of tensorflow should I learn?

I have studied tensorflow for about one month. I just feel that creating a network with primitive operations of Tensorflow is very verbose. Then I found some high level API, such as TF-Slim, TF Learn, Keras. But multiple choices confuse me so that I don't know which I should learn.
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow, but as I investigated, it's only for convnets. What networks Keras can build are more diverse.
Can Anyone give a comparision between them so that I could choose which high level API I should learn ? In terms of :
1. popularity: which ones are the most popular ?
2. practicality: what kinds of network can they build ?
3. performance: what's their training/inference performance ?
... something else
Hope someone could give me a suggestion. Thanks.
I suggest you start with Keras.
It´s very easy to learn, it has a broad user base (see Shobhits link), there is a ton of reference code out there on GitHub and in tutorials / MOOCs / eBooks etc. and you can build almost anything with it. And I personally think that is has a good documentation (although some might disagree with that...).
Since it´s an API that connects to Tensorflow, Theano, CNTK (and possibly more frameworks in the future) you have even more flexibility.
Don´t worry too much about performance. That´s really not important while youre learning.

CNTK time series anomaly detection tutorial or documentation (RNN/LTSM)?

Problem
Do you have a tutorial for LTSM or RNN time series anomaly detection using deep learning with CNTK? If not, can you make one or suggest a series of simple steps here for us to follow?
I am a software developer and a member of a team investigating using deep learning on time series data we have for anomaly detection. We have not found anything on your python docs that can help us. It seems most of the tutorials are for visual recognition problems and not specific to the problem domain of interest to us.
Using LTSM and RNN in Anomaly Detection
I have found the following
This link references why we are trying to use time series for anomaly detection
This paper convinced us that the first link is a respected approach to the problem in general
This link also outlined the same approach
I look around on CNTK here, but didn't find any similar question and so I hope this question helps other developers in the future.
Additional Notes and Questions
My problem is that I am finding CNTK not that simple to use or as well documented as I had hoped. Frankly, our framework and stack is heavy on .NET and Microsoft technologies. So I repeat the question again for emphasis with a few follow ups:
Do you have any resources you feel you can recommend to developers learning neural networks, deep learning, and so on to help us understand what is going on under the hood with CNTK?
Build 2017 mentions C# is supported by CNTK. Can you please point us in the direction of where the documentation and support is for this?
Most importantly can you please help get us unstuck on trying to do time series anomaly analysis for time series using CNTK?
Thank you very much for time and assistance in reading and asking this question
Thanks for your feedback. Your suggestions help improve the toolkit.
First Bullet
I would suggest that you can start with the CNTK tutorials.
https://github.com/Microsoft/CNTK/tree/master/Tutorials
They are designed from CNTK 101 to 301. Suggest that you work through them. Many of them even though uses image data, the concept and the models are amenable to build solutions with numerical data. 101-103 series are great to understand basics of the train-test-predict workflow.
Second Bullet:
Once you have trained the model (using Python recommended). The model evaluation can be performed using different language bindings, C# being one of them.
https://github.com/Microsoft/CNTK/wiki/CNTK-Evaluation-Overview
Third Bullet
There are different approaches suggested in the papers you have cited. All of them are possible to do in CNTK with some changes to the code in the tutorials.
The key tutorial for you would be CNTK 106, CNTK 105, and CNTK 202
Anomaly as classification: This would involve you label your target value as 1 of N classes, with one of the class being "anomaly". Then you can combine 106 with 202, to classify the prediction
Anomaly as an autoencoder: You can need to study 105 autoencoder. Now instead of a dense network, you could apply the concept for Recurrent networks. Train only on the normal data. Once trained, pass any data through the trained model. The difference between the input and autoencoded version will be small for normal data but the difference will be much larger for anomalies. The 105 tutorial uses images, but you can train these models with any numerical data.
Hope you find these suggestions helpful.

How to specify the architecture of deep neural network in Tensorflow?

I am newbie in Tensorflow
Actually, I am testing some example in Tensorflow web-site, and I start to understand some features of the framwork, but what I don't understand is how I can design my architecture, I mean number of layers, type of Layer "conv, pool...", and if it is necessery to do that, because there are many predifined architectures like AmexNet,
Thanks,
I would strongly recommend working through their hands on tutorial, depending on if you have previous ML experience (https://www.tensorflow.org/get_started/mnist/pros) or not (https://www.tensorflow.org/get_started/mnist/beginners). The questions you are asking are answered in there.
The question on using predefined architectures or self defined depends on your use case. If you want to do something easy like classifying if there is only a car in the scene or not a more shallow architecture might work better, because it is faster and a more deep one is overkill. However most architectures are similar to the ones already defined in literature.
Another question that arises naturally, while talking about pre defined architecture is about transfer learning / fine tuning. Often pre defined architectures are already learned on some big dataset (mostly ImageNet) and already perform really well out of the box for many tasks. With little training data it makes a lot of sense to use this. With lots of training data it can hinder your progress though.