Fairness metrics for multi-class classification - fairlearn

Are there any metrics implemented in Fairlearn, or any published papers I can refer to, for measuring fairness in multi-class classification where the metric is AP rather than accuracy? Thanks!

Update: The Fairlearn documentation now has a FAQ section on this topic: https://fairlearn.org/main/faq.html (search for "Does Fairlearn support multi-class classification?").
Previous answer:
Fairlearn's metrics are designed for binary classification or regression. You could, of course, evaluate each label individually (for example, one class versus the rest). If you have a specific idea of what you'd like to see, please open a new feature request.
Fairlearn does support a variety of metrics, not just accuracy. The user guide has a full list: https://fairlearn.org/v0.6.0/user_guide/assessment.html#scalar-results-from-metricframe
One example that comes to mind for a paper doing multi-class classification while thinking about fairness is CheXclusion by Seyyed-Kalantari et al. They mostly look into TPR differences when classifying chest x-rays.
The Fairlearn community would definitely be interested in hearing about your use case. Perhaps there's some way we can help. Feel free to reach out via Gitter or by creating your feature request (as mentioned above).
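As a rough illustration of evaluating each label individually, here is a minimal sketch in plain Python. It does not use Fairlearn's API: `average_precision` and `per_class_ap_by_group` are hypothetical helper names, and the one-vs-rest split and the simple tie handling (ties keep input order) are assumptions made for this example.

```python
from collections import defaultdict


def average_precision(y_true, scores):
    """Average precision for binary labels (1 = positive) and real-valued scores."""
    # Rank samples from highest to lowest score; ties keep their input order.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else float("nan")


def per_class_ap_by_group(y_true, class_scores, groups, classes):
    """Return {class: {group: AP}} using a one-vs-rest split for each class.

    y_true       : list of class labels, one per sample
    class_scores : list of dicts mapping class -> predicted score
    groups       : list of sensitive-feature values, one per sample
    """
    results = {}
    for cls in classes:
        by_group = defaultdict(lambda: ([], []))
        for label, scores, group in zip(y_true, class_scores, groups):
            labels_g, scores_g = by_group[group]
            labels_g.append(1 if label == cls else 0)  # one-vs-rest target
            scores_g.append(scores[cls])
        results[cls] = {
            group: average_precision(labels_g, scores_g)
            for group, (labels_g, scores_g) in by_group.items()
        }
    return results
```

Comparing the per-group values for each class (for example, the largest gap between groups) gives a per-class disparity. How to aggregate across classes is a separate design choice that this sketch leaves open.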

YOLO v3 complete architecture

I am attempting to implement YOLO v3 in Tensorflow-Keras from scratch, with the aim of training my own model on a custom dataset. By that, I mean without using pretrained weights. I have gone through all three papers for YOLOv1, YOLOv2 (YOLO9000) and YOLOv3, and although Darknet53 is used as the feature extractor for YOLOv3, I am unable to find the complete architecture that extends after it: the "detection" layers talked about here. After a lot of reading of blog posts on Medium, kdnuggets and other similar sites, I ended up with a few significant questions:
Have I missed the complete architecture of the detection layers (which extend after the Darknet53 feature extractor) somewhere in the YOLOv3 paper?
The author seems to use different image sizes at different stages of training. Does the network automatically do this upscaling/downscaling of images?
For preprocessing the images, is it really enough to just resize them and then normalize them (dividing by 255)?
Please be kind enough to point me in the right direction. I appreciate the help!

Is there a thorough exploration of the effect of momentum on Stochastic Gradient Descent?

Many CNN papers use momentum=0.9 when using Stochastic Gradient Descent for the weight update. There is good logic for using it, but what I am looking for is a thorough exploration of the effects of that parameter. I've searched across many papers, and there are some insights here and there, but I have not been able to find a comprehensive exploration. Also, does its usefulness vary across different computer vision tasks like classification, segmentation, and detection?
Here is a good review paper on this topic: "A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay" by Leslie N. Smith:
https://arxiv.org/pdf/1803.09820.pdf
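For readers who want to experiment with the parameter themselves, here is a minimal sketch of the classical (heavy-ball) momentum update on a one-dimensional problem. The function name `sgd_momentum` and the toy objective f(w) = w^2 are assumptions for illustration. Deep learning frameworks use slightly different but closely related formulations, and this toy uses exact rather than stochastic gradients.

```python
def sgd_momentum(grad_fn, w0, lr=0.01, momentum=0.9, steps=100):
    """Run gradient descent with classical momentum and return every iterate."""
    w, velocity = w0, 0.0
    history = [w]
    for _ in range(steps):
        # The velocity is a decaying sum of past gradients, scaled by the step size.
        velocity = momentum * velocity - lr * grad_fn(w)
        w = w + velocity
        history.append(w)
    return history


def grad(w):
    """Gradient of the toy objective f(w) = w**2 (hypothetical example)."""
    return 2.0 * w


without_momentum = sgd_momentum(grad, w0=1.0, momentum=0.0, steps=200)
with_momentum = sgd_momentum(grad, w0=1.0, momentum=0.9, steps=200)
print(abs(without_momentum[-1]), abs(with_momentum[-1]))
```

Sweeping `momentum` and `lr` together on a toy like this shows how the two interact, which is one of the themes of the paper above. It says nothing about how the effect differs between vision tasks.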

How to determine what type of layers do I need for my Deep learning model?

Suppose that I want to make a model that does something. When I search for the topic on Google or YouTube, I find many related tutorials, and it seems like some clever programmer has already implemented that model with deep learning.
But how do they know what type of layers, activation functions, loss functions, optimizer, number of units, etc. they need to solve that particular problem using deep learning?
Are there any techniques for knowing this, or is it just a matter of understanding and experience? It would also be very helpful if somebody could point me to some videos or articles answering my question.
This is more of a matter of understanding and experience. When building a model from scratch, you must understand which optimizer, loss, etc. makes sense for your particular problem. In order to choose these appropriately, you must understand the differences between the available optimizers, loss functions, etc.
With regard to choosing how many layers and nodes, what batch size, what learning rate, etc., these are all hyperparameters that you will need to test and tune as you experiment with your model.
I have a Deep Learning Fundamentals YouTube playlist that you may find helpful. It covers the fundamentals of each of these topics in short videos. Additionally, this Deep Learning with Keras playlist may also be beneficial if you want to focus more on coding after getting the basic concepts down.
Thanks for the question.
The Stanford CS231n lectures on CNNs are the best starting point for beginners. Refer to the video lectures here; the class notes are available here.
After watching the lectures and completing the assignments, you will have a basic idea of what deep learning is and which algorithms are available.
But when it comes to solving real-world problems, this won't be sufficient. So take this course by Jeremy Howard, where he teaches more about how to approach a problem using the Kaggle platform.
Keep solving more problems and experimenting with new models and algorithms on platforms like hackerearth, Kaggle, topcoder, etc.

CNTK time series anomaly detection tutorial or documentation (RNN/LSTM)?

Problem
Do you have a tutorial for LSTM or RNN time series anomaly detection using deep learning with CNTK? If not, can you make one or suggest a series of simple steps here for us to follow?
I am a software developer and a member of a team investigating the use of deep learning for anomaly detection on time series data we have. We have not found anything in your Python docs that can help us. It seems most of the tutorials are for visual recognition problems and not specific to the problem domain of interest to us.
Using LSTM and RNN in Anomaly Detection
I have found the following
This link references why we are trying to use time series for anomaly detection
This paper convinced us that the first link is a respected approach to the problem in general
This link also outlined the same approach
I looked around the CNTK questions here but didn't find any similar question, so I hope this question helps other developers in the future.
Additional Notes and Questions
My problem is that I am finding CNTK not as simple to use or as well documented as I had hoped. Frankly, our framework and stack are heavy on .NET and Microsoft technologies. So I repeat the question for emphasis, with a few follow-ups:
Do you have any resources you feel you can recommend to developers learning neural networks, deep learning, and so on to help us understand what is going on under the hood with CNTK?
Build 2017 mentions C# is supported by CNTK. Can you please point us in the direction of where the documentation and support is for this?
Most importantly, can you please help get us unstuck on doing time series anomaly analysis using CNTK?
Thank you very much for your time and assistance in reading and answering this question.
Thanks for your feedback. Your suggestions help improve the toolkit.
First bullet:
I suggest starting with the CNTK tutorials.
https://github.com/Microsoft/CNTK/tree/master/Tutorials
They are numbered from CNTK 101 to 301, and I suggest that you work through them. Even though many of them use image data, the concepts and the models can be adapted to build solutions with numerical data. The 101-103 series is great for understanding the basics of the train-test-predict workflow.
Second bullet:
Once you have trained the model (using Python is recommended), the model evaluation can be performed using different language bindings, C# being one of them.
https://github.com/Microsoft/CNTK/wiki/CNTK-Evaluation-Overview
Third bullet:
There are different approaches suggested in the papers you have cited. All of them are possible in CNTK with some changes to the code in the tutorials.
The key tutorials for you would be CNTK 106, CNTK 105, and CNTK 202.
Anomaly as classification: This would involve labeling your target value as one of N classes, with one of the classes being "anomaly". Then you can combine 106 with 202 to classify the prediction.
Anomaly as an autoencoder: You would need to study the autoencoder in tutorial 105. Instead of a dense network, you could apply the same concept with recurrent networks. Train only on the normal data. Once trained, pass any data through the trained model. The difference between the input and its autoencoded version will be small for normal data but much larger for anomalies. The 105 tutorial uses images, but you can train these models with any numerical data.
Hope you find these suggestions helpful.
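The scoring step of the autoencoder approach can be sketched independently of CNTK. The following is a minimal illustration in plain Python that assumes you already have reconstructions from a trained model. The function names are hypothetical, and the threshold rule (mean plus k standard deviations of the errors on normal data) is an assumption for this example, not something prescribed by the tutorials.

```python
import statistics


def reconstruction_errors(inputs, reconstructions):
    """Mean squared error between each input window and its reconstruction."""
    errors = []
    for original, rebuilt in zip(inputs, reconstructions):
        squared = [(a - b) ** 2 for a, b in zip(original, rebuilt)]
        errors.append(sum(squared) / len(squared))
    return errors


def fit_threshold(normal_errors, k=3.0):
    """Pick a cut-off from errors measured on normal (non-anomalous) data only."""
    mean = statistics.mean(normal_errors)
    spread = statistics.pstdev(normal_errors)
    return mean + k * spread


def flag_anomalies(errors, threshold):
    """Mark every window whose reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]
```

In practice the threshold would be fitted on held-out normal windows, and `k` tuned against whatever labeled anomalies are available.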

Object Oriented Bayesian Spam Filtering?

I was wondering if there is any good and clean object-oriented programming (OOP) implementation of Bayesian filtering for spam and text classification? This is just for learning purposes.
I definitely recommend Weka which is an Open Source Data Mining Software written in Java:
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
As mentioned above, it ships with a bunch of different classifiers like SVM, Winnow, C4.5, Naive Bayes (of course) and many more (see the API doc).
Note that many classifiers are known to perform much better than Naive Bayes in spam detection and text classification.
Furthermore, Weka brings you a very powerful GUI…
Check out Chapter 6 of Programming Collective Intelligence
Maybe https://ci-bayes.dev.java.net/ or http://www.cs.cmu.edu/~javabayes/Home/node2.html?
I never played with it either.
Here is an implementation of Bayesian filtering in C#: A Naive Bayesian Spam Filter for C# (hosted on CodeProject).
nBayes - another C# implementation hosted on CodePlex
In French, but you should be able to find the download link :)
PHP Naive Bayesian Filter
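For learning purposes, the core idea also fits in one small class. The following is a minimal object-oriented sketch in Python of a multinomial naive Bayes text classifier with add-one smoothing. It is not taken from any of the libraries mentioned above; the class name, method names, and tokenizer are hypothetical choices for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial naive Bayes over word counts, with additive smoothing."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        """Lowercase the text and split it into simple word tokens."""
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Update the counts with one labeled document."""
        tokens = self.tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def log_scores(self, text):
        """Return the unnormalized log posterior for every known label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = self.tokenize(text)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + self.smoothing * vocab_size
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                score += math.log((counts[token] + self.smoothing) / denominator)
            scores[label] = score
        return scores

    def classify(self, text):
        """Return the most probable label; requires at least one trained document."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


spam_filter = NaiveBayesClassifier()
spam_filter.train("win cash now", "spam")
spam_filter.train("cheap pills win prize", "spam")
spam_filter.train("meeting at noon", "ham")
spam_filter.train("lunch meeting tomorrow", "ham")
print(spam_filter.classify("win a prize"))
```

Working in log space avoids numerical underflow on long messages, and keeping the counts inside the object makes it easy to swap in a different tokenizer or smoothing value.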