Deep learning for computer vision: What after MNIST stage? - tensorflow

I am trying to explore computer vision using deep learning techniques. I have gone through basic literature, made a NN of my own to classify digits using MNIST data(without using any library like TF,Keras etc, and in the process understood concepts like loss function, optimization, backward propagation etc), and then also explored Fashion MNIST using TF Keras.
I applied my knowledge gained so far to solve a Kaggle problem(identifying a plant type), but results are not very encouraging.
So, what should be my next step in progress? What should I do to improve my knowledge and models to solve more complex problems? What more books, literature etc should I read to move ahead of beginner stage?

You should try hyperparameter tuning, it will help improve your model performance. Feel free to surf around various articles, fine tuning your model will be the next step as you have fundamental knowledge regarding how model works.

Related

Is there any Capsnet Implementation on CIFAR10 that achieve better accuracy than common CNN?

I implement Capsule Network by EM-Routing based-on Sara Sabour & Hinton's article, It works great on MNIST dataset and some other grayscale dataset as same as MNIST such as Hoda (Persian/Arabic Digits) but When I tried on CIFAR10 the accuracy was unbelievable low.
Yup, that is the current problem with Capsule Networks. It works well with MNIST because of the dataset's simplicity. All you require is some edges and blobs to be detected in order to classify each data. For more complex datasets, naively stacking the capsules and hoping it to perform well does not work. However, works are being performed currently to tweak the current CapsNet architecture to make it perform better than now. When CNN was developed during the days, it also had this same problem. It took many years for CNN to be what is it now.
Refer to this paper if you want to know the performance of CapsNet on different datasets : https://arxiv.org/abs/1712.03480
Earlier I mentioned there are works being done to improve CapsNet. However, there are already some works that has been done so far. You can refer to these:
http://openaccess.thecvf.com/content_CVPR_2019/papers/Rajasegaran_DeepCaps_Going_Deeper_With_Capsule_Networks_CVPR_2019_paper.pdf
http://proceedings.mlr.press/v97/jeong19b/jeong19b.pdf
Bear in mind that the time it takes to train CapsNet is very much higher than CNN. Therefore, it is not easy to test out these architectures.

How to determine what type of layers do I need for my Deep learning model?

Suppose that I have want to make a model that does something. Now when I search about the topic in Google or YouTube, I find many related tutorials and it seems like some clever programmer had already implemented that model with Deep learning.
But how do they know that what type of layers, what type of activation functions, loss functions, optimizer, number of units etc. they need to solve that certain problem using deep learning.
Are there any techniques for knowing this, or its just a matter of understanding and experience? Also it would be very helpful if somebody could point me to some videos or articles answering my question.
This is more of a matter of understanding and experience. When building a model from scratch, you must understand which optimizer, loss, etc. makes sense for your particular problem. In order to choose these appropriately, you must understand the differences between the available optimizers, loss functions, etc.
In regards to choosing how many layers and nodes, what batch size, what learning rate, etc.-- these are all hyperparameters that you will need to test and tune as you experiment with your model.
I have a Deep Learning Fundamentals YouTube playlist that you may find helpful. It covers the fundamental basics of each of these topics in short videos. Additionally, this Deep Learning with Keras playlist may also be beneficial if you're wanting to focus more on coding after getting the basic concepts down.
Thanks for the question.
The CS231n Stanford lectures on CNN is the best for beginners refer to the video lectures here and class notes are available here
After watching the lectures and completing the assignments, you will get a basic idea of what Deep Learning is and all the algorithms available etc.
But when it comes to solving real-world problems this won't be sufficient So take this course by Jeremy Howard where he teaches more on how to approach a problem using Kaggle platform.
Keep on solving more problems experimenting new models and algorithms using several platforms like hackerearth, Kaggle, topcoder etc.

Which kinds of high level API of tensorflow should I learn?

I have studied tensorflow for about one month. I just feel that creating a network with primitive operations of Tensorflow is very verbose. Then I found some high level API, such as TF-Slim, TF Learn, Keras. But multiple choices confuse me so that I don't know which I should learn.
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow, but as I investigated, it's only for convnets. What networks Keras can build are more diverse.
Can Anyone give a comparision between them so that I could choose which high level API I should learn ? In terms of :
1. popularity: which ones are the most popular ?
2. practicality: what kinds of network can they build ?
3. performance: what's their training/inference performance ?
... something else
Hope someone could give me a suggestion. Thanks.
I suggest you start with Keras.
It´s very easy to learn, it has a broad user base (see Shobhits link), there is a ton of reference code out there on GitHub and in tutorials / MOOCs / eBooks etc. and you can build almost anything with it. And I personally think that is has a good documentation (although some might disagree with that...).
Since it´s an API that connects to Tensorflow, Theano, CNTK (and possibly more frameworks in the future) you have even more flexibility.
Don´t worry too much about performance. That´s really not important while youre learning.

CNTK time series anomaly detection tutorial or documentation (RNN/LTSM)?

Problem
Do you have a tutorial for LTSM or RNN time series anomaly detection using deep learning with CNTK? If not, can you make one or suggest a series of simple steps here for us to follow?
I am a software developer and a member of a team investigating using deep learning on time series data we have for anomaly detection. We have not found anything on your python docs that can help us. It seems most of the tutorials are for visual recognition problems and not specific to the problem domain of interest to us.
Using LTSM and RNN in Anomaly Detection
I have found the following
This link references why we are trying to use time series for anomaly detection
This paper convinced us that the first link is a respected approach to the problem in general
This link also outlined the same approach
I look around on CNTK here, but didn't find any similar question and so I hope this question helps other developers in the future.
Additional Notes and Questions
My problem is that I am finding CNTK not that simple to use or as well documented as I had hoped. Frankly, our framework and stack is heavy on .NET and Microsoft technologies. So I repeat the question again for emphasis with a few follow ups:
Do you have any resources you feel you can recommend to developers learning neural networks, deep learning, and so on to help us understand what is going on under the hood with CNTK?
Build 2017 mentions C# is supported by CNTK. Can you please point us in the direction of where the documentation and support is for this?
Most importantly can you please help get us unstuck on trying to do time series anomaly analysis for time series using CNTK?
Thank you very much for time and assistance in reading and asking this question
Thanks for your feedback. Your suggestions help improve the toolkit.
First Bullet
I would suggest that you can start with the CNTK tutorials.
https://github.com/Microsoft/CNTK/tree/master/Tutorials
They are designed from CNTK 101 to 301. Suggest that you work through them. Many of them even though uses image data, the concept and the models are amenable to build solutions with numerical data. 101-103 series are great to understand basics of the train-test-predict workflow.
Second Bullet:
Once you have trained the model (using Python recommended). The model evaluation can be performed using different language bindings, C# being one of them.
https://github.com/Microsoft/CNTK/wiki/CNTK-Evaluation-Overview
Third Bullet
There are different approaches suggested in the papers you have cited. All of them are possible to do in CNTK with some changes to the code in the tutorials.
The key tutorial for you would be CNTK 106, CNTK 105, and CNTK 202
Anomaly as classification: This would involve you label your target value as 1 of N classes, with one of the class being "anomaly". Then you can combine 106 with 202, to classify the prediction
Anomaly as an autoencoder: You can need to study 105 autoencoder. Now instead of a dense network, you could apply the concept for Recurrent networks. Train only on the normal data. Once trained, pass any data through the trained model. The difference between the input and autoencoded version will be small for normal data but the difference will be much larger for anomalies. The 105 tutorial uses images, but you can train these models with any numerical data.
Hope you find these suggestions helpful.

How to predict using Tensorflow?

This is a newbie question for the tensorflow experts:
I reading lot of data from power transformer connected to an array of solar panels using arduinos, my question is can I use tensorflow to predict the power generation in future.
I am completely new to tensorflow, if can point me to something similar I can start with that or any github repo which is doing similar predictive modeling.
Edit: Kyle pointed me to the MNIST data, which I believe is a Image Dataset. Again, not sure if tensorflow is the right computation library for this problem or does it only work on Image datasets?
thanks, Rajesh
Surely you can use tensorflow to solve your problem.
TensorFlow™ is an open source software library for numerical
computation using data flow graphs.
So it works not only on Image dataset but also others. Don't worry about this.
And about prediction, first you need to train a model(such as linear regression) on you dataset, then predict. The tutorial code can be found in tensorflow homepage .
Get your hand dirty, you will find it works on your dataset.
Good luck.
You can absolutely use TensorFlow to predict time series. There are plenty of examples out there, like this one. And this is a really interesting one on using RNN to predict basketball trajectories.
In general, TF is a very flexible platform for solving problems with machine learning. You can create any kind of network you can think of in it, and train that network to act as a model for your process. Depending on what kind of costs you define and how you train it, you can build a network to classify data into categories, predict a time series forward a number of steps, and other cool stuff.
There is, sadly, no short answer for how to do this, but that's just because the possibilities are endless! Have fun!