Can a Bayesian filter be used to create multiple outputs?

I see that Bayesian filters work well for binary choices (spam vs. not spam, male vs. female, etc.). Is there any way for one to assign multiple values at once (e.g. php+javascript, house+yard)?
I've seen "Naive bayesian classifier - multiple decisions", but I want to know whether multiple outputs are possible.
If not, what other approaches are suggested for categorization (with or without learning), especially for PHP?

As the accepted answer to the question you linked says: "It's definitely possible to have more than two classes." In practice, one approach is to train multiple classifiers in parallel, e.g. one classifier for php vs. not php and another classifier for javascript vs. not javascript. Each classifier decides independently, so an item can receive several labels at once.
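The "one classifier per label" idea can be sketched roughly as follows. This is only a minimal illustration in plain Python: the class and function names (BinaryNaiveBayes, train_one_vs_rest, predict_labels) and the tiny training set are made up for the example, and a real filter would need proper tokenization and far more data.

```python
import math
from collections import Counter


class BinaryNaiveBayes:
    """Tiny multinomial naive Bayes for a single label: has it vs. does not."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, words, is_positive):
        self.word_counts[is_positive].update(words)
        self.doc_counts[is_positive] += 1
        self.vocab.update(words)

    def _log_score(self, words, cls):
        total_docs = sum(self.doc_counts.values())
        # Add-one (Laplace) smoothing on the prior and on the word likelihoods.
        score = math.log((self.doc_counts[cls] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[cls].values())
        for w in words:
            count = self.word_counts[cls][w]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, words):
        return self._log_score(words, True) > self._log_score(words, False)


def train_one_vs_rest(examples):
    """examples: list of (text, set_of_labels). Returns {label: classifier}."""
    labels = set().union(*(tags for _, tags in examples))
    classifiers = {label: BinaryNaiveBayes() for label in labels}
    for text, tags in examples:
        words = text.lower().split()
        for label, clf in classifiers.items():
            # Every example is a positive for its own labels, a negative for the rest.
            clf.train(words, label in tags)
    return classifiers


def predict_labels(classifiers, text):
    words = text.lower().split()
    return {label for label, clf in classifiers.items() if clf.predict(words)}


# Made-up training data, for illustration only.
examples = [
    ("echo array foreach mysql query", {"php"}),
    ("function foreach echo session cookie", {"php"}),
    ("document window onclick dom event", {"javascript"}),
    ("var function dom callback event", {"javascript"}),
    ("echo session form onclick dom event", {"php", "javascript"}),
    ("garden fence lawn mower", set()),
]
classifiers = train_one_vs_rest(examples)
predicted = predict_labels(classifiers, "echo session onclick dom")
print(sorted(predicted))  # ['javascript', 'php']
```

Because each label has its own yes/no classifier, the result can be empty, a single label, or several labels, which is what distinguishes this setup from a single classifier that picks exactly one class.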
Other widely used multivariate classification methods include:
artificial neural networks (for example, multilayer perceptrons)
(boosted) decision trees
support vector machines
If you have a more detailed follow-up question on this, post it on http://stats.stackexchange.com.
I'm not sure which libraries for such a task are available for PHP, but SWIG is a tool for making libraries written in C/C++ usable from PHP.

Related

Building a deep neural network that produces output that is distributed as multivariate Standard normal distribution

I'm looking for a way to build a deep neural network whose output follows a multivariate standard normal distribution, ~N(0, I).
I can use PyTorch or TensorFlow, whichever is easier for this task.
I have some input X, which for the purposes of this question can be assumed to be just a matrix of values drawn from the uniform distribution.
I feed the input into the network, whose architecture is not fixed yet.
In addition to other requirements I will have for the output, I want the values obtained over all possible x's to look like a multivariate standard normal distribution, ~N(0, I).
I think what is needed for this to happen is the right loss function.
To do this, I thought of two ways:
1. Use of statistical tests.
2. A loss that tests a large number of properties (mean, standard deviation, ...).
Realizing 2 sounds complicated, so I started with 1.
I looked for statistical tests already implemented as a loss function in one of the packages, and I did not find anything like that.
I implemented statistical tests myself to obtain output that follows a univariate standard normal distribution, and it seemed to work relatively well.
With the implementation of multivariate tests I got more entangled.
Do you know of any understandable TensorFlow/PyTorch functions that do something similar to what I'm trying to do?
Do you have another idea for how to do this?
Do you have any comments regarding the methods I am trying to work with?
Thanks
Using PyTorch functions can help you a lot. Since I don't know exactly what you want to do with these results, I can refer you to PyTorch with this link here.
In this link you will find all the PyTorch loss functions and the calculations used in each one of them. Just click on one, check how it works, and see if it's what you're looking for.
For the second topic you can look, at the same link, at the BCEWithLogitsLoss function, because it may be what you are looking for.
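The second idea from the question (a loss that checks properties such as mean and standard deviation) can be sketched as a moment-matching penalty. The snippet below is only an illustration in plain Python so that it runs without any framework; the function name moment_matching_loss is made up, and for actual training the same arithmetic would have to be written with PyTorch or TensorFlow tensor operations so that it is differentiable. Note also that matching the mean and covariance is necessary but not sufficient for normality, so this alone does not guarantee a Gaussian output.

```python
import random


def moment_matching_loss(samples):
    """Squared norm of the sample mean plus the squared Frobenius distance
    between the sample covariance and the identity matrix.

    samples: list of n rows, each a list of d floats (one network output each).
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    loss = sum(m * m for m in mean)  # the target mean is the zero vector
    for i in range(d):
        for j in range(d):
            cov_ij = sum(
                (row[i] - mean[i]) * (row[j] - mean[j]) for row in samples
            ) / (n - 1)
            target = 1.0 if i == j else 0.0  # the target covariance is I
            loss += (cov_ij - target) ** 2
    return loss


# Stand-ins for a batch of network outputs (assumed data, fixed seed).
rng = random.Random(0)
normal_batch = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(2000)]
uniform_batch = [[rng.random() for _ in range(2)] for _ in range(2000)]

normal_loss = moment_matching_loss(normal_batch)
uniform_loss = moment_matching_loss(uniform_batch)
print(normal_loss, uniform_loss)
```

The batch drawn from a standard normal gives a loss close to zero, while the uniform batch is penalized for both its shifted mean and its small variance. Higher moments, or a distance between distributions, could be added as further terms in the same way.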

Which model (GPT2, BERT, XLNet, etc.) would you use for a text classification task? Why?

I'm trying to train a model for a sentence classification task. The input is a sentence (a vector of integers) and the output is a label (0 or 1). I've seen some articles here and there about using BERT and GPT2 for text classification tasks. However, I'm not sure which one I should pick to start with. Which of the recent NLP models, such as the original Transformer, BERT, GPT2, or XLNet, would you start with, and why? I'd rather implement it in TensorFlow, but I'm flexible enough to go with PyTorch too.
Thanks!
It depends heavily on your dataset, and it is part of the data scientist's job to find which model is most suitable for a particular task in terms of the selected performance metric, training cost, model complexity, etc.
When you work on the problem you will probably test all of the above models and compare them. Which one should you choose first? Andrew Ng, in "Machine Learning Yearning", suggests starting with a simple model so you can quickly iterate and test your idea, data preprocessing pipeline, etc.
Don’t start off trying to design and build the perfect system.
Instead, build and train a basic system quickly—perhaps in just a few
days
According to this suggestion, you can start with a simpler model such as ULMFiT as a baseline, verify your ideas and then move on to more complex models and see how they can improve your results.
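To illustrate how small a first "basic system" can be, here is a rough sketch of a bag-of-words perceptron with an accuracy check. It is deliberately not ULMFiT or any of the pretrained models discussed here; it is a much simpler stand-in whose only purpose is to verify the data pipeline and give a number to beat. The function names and the toy sentences are made up for the example.

```python
from collections import defaultdict


def featurize(sentence):
    # Bag of words: lowercase and split on whitespace.
    return sentence.lower().split()


def train_perceptron(data, epochs=10):
    """data: list of (sentence, label) with label 0 or 1. Returns a weight dict."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, label in data:
            words = featurize(sentence)
            score = weights["<bias>"] + sum(weights[w] for w in words)
            predicted = 1 if score > 0 else 0
            if predicted != label:
                # Classic perceptron update: move weights toward the true label.
                step = 1.0 if label == 1 else -1.0
                weights["<bias>"] += step
                for w in words:
                    weights[w] += step
    return weights


def predict(weights, sentence):
    words = featurize(sentence)
    score = weights.get("<bias>", 0.0) + sum(weights.get(w, 0.0) for w in words)
    return 1 if score > 0 else 0


def accuracy(weights, data):
    return sum(predict(weights, s) == y for s, y in data) / len(data)


# Toy data, for illustration only.
train = [
    ("great movie loved it", 1),
    ("wonderful acting and great story", 1),
    ("terrible plot hated it", 0),
    ("boring and terrible acting", 0),
]
held_out = [("loved the story", 1), ("hated the boring plot", 0)]

weights = train_perceptron(train)
print(accuracy(weights, train), accuracy(weights, held_out))
```

Once a baseline like this runs end to end, the same accuracy function can be reused to compare it against a fine-tuned pretrained model on the same held-out set.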
Note that modern NLP models contain a large number of parameters, and it is difficult to train them from scratch without a large dataset. That's why you may want to use transfer learning: you can download a pre-trained model, use it as a basis, and fine-tune it on your task-specific dataset to achieve better performance and reduce training time.
I agree with Max's answer, but if the constraint is to use a state-of-the-art large pretrained model, there is a really easy way to do this: the library by HuggingFace called pytorch-transformers. Whether you choose BERT, XLNet, or whatever, they're easy to swap out. Here is a detailed tutorial on using that library for text classification.
EDIT: I just came across this repo, pytorch-transformers-classification (Apache 2.0 license), which is a tool for doing exactly what you want.
Well, as others mentioned, it depends on the dataset; multiple models should be tried and the best one chosen.
However, to share my experience, XLNet has beaten all the other models I have tried so far by a good margin. Hence, if learning is not the objective, I would simply start with XLNet, then try a few more down the line and conclude. It just saves time in exploring.
The repo below is excellent for doing all this quickly. Kudos to them.
https://github.com/microsoft/nlp-recipes
It uses hugging face transformers and makes them dead simple. 😃
I have used XLNet, BERT, and GPT2 for summarization tasks (English only). Based on my experience, GPT2 works the best among all 3 on short paragraph-size notes, while BERT performs better for longer texts (up to 2-3 pages). You can use XLNet as a benchmark.

How to specify the architecture of deep neural network in Tensorflow?

I am a newbie in TensorFlow.
I am testing some examples from the TensorFlow website, and I am starting to understand some features of the framework. What I don't understand is how I can design my own architecture, I mean the number of layers and the type of each layer (conv, pool, ...), and whether it is necessary to do that at all, because there are many predefined architectures like AlexNet.
Thanks,
I would strongly recommend working through their hands-on tutorial, either the one for people with previous ML experience (https://www.tensorflow.org/get_started/mnist/pros) or the one for beginners (https://www.tensorflow.org/get_started/mnist/beginners). The questions you are asking are answered in there.
Whether to use a predefined architecture or define your own depends on your use case. If you want to do something easy, like classifying whether or not there is a car in the scene, a shallower architecture might work better, because it is faster and a deeper one is overkill. However, most architectures are similar to the ones already defined in the literature.
Another question that arises naturally when talking about predefined architectures is transfer learning / fine-tuning. Predefined architectures have often already been trained on some big dataset (mostly ImageNet) and perform really well out of the box for many tasks. With little training data it makes a lot of sense to use this. With lots of training data it can hinder your progress though.
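To make "number of layers and type of layer" concrete, here is a rough sketch that writes a small conv/pool stack down as a list and traces the tensor shape through it. It is plain Python rather than TensorFlow so that it runs without the framework; the layer sizes are assumptions chosen for a 28x28 grayscale input, and trace_shapes is a made-up helper. In TensorFlow, entries like these would roughly correspond to layers such as tf.keras.layers.Conv2D, MaxPooling2D, Flatten and Dense.

```python
# Each entry is (layer_type, params); all sizes are illustrative assumptions.
architecture = [
    ("conv", {"filters": 32, "kernel": 5}),  # 'same' padding, stride 1
    ("pool", {"size": 2}),
    ("conv", {"filters": 64, "kernel": 5}),
    ("pool", {"size": 2}),
    ("flatten", {}),
    ("dense", {"units": 1024}),
    ("dense", {"units": 10}),                # one output per class
]


def trace_shapes(input_shape, layers):
    """Return the output shape after every layer for an (H, W, C) input."""
    shape = tuple(input_shape)
    shapes = []
    for kind, params in layers:
        if kind == "conv":       # 'same' padding keeps height and width
            h, w, _ = shape
            shape = (h, w, params["filters"])
        elif kind == "pool":     # non-overlapping pooling shrinks H and W
            h, w, c = shape
            shape = (h // params["size"], w // params["size"], c)
        elif kind == "flatten":
            size = 1
            for dim in shape:
                size *= dim
            shape = (size,)
        elif kind == "dense":
            shape = (params["units"],)
        else:
            raise ValueError(f"unknown layer type: {kind}")
        shapes.append(shape)
    return shapes


shapes = trace_shapes((28, 28, 1), architecture)
for (kind, _), shape in zip(architecture, shapes):
    print(f"{kind:8s} -> {shape}")
```

Changing the depth or the layer types is then just a matter of editing the list, and the shape trace shows immediately how the size of the flattened feature vector (and therefore the cost of the first dense layer) changes.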

Compare deep learning framework between TensorFlow and PaddlePaddle

I want to do research in deep learning, but I don't know which framework I should choose between TensorFlow and PaddlePaddle. Can anyone compare the two frameworks? Which one is better, especially in terms of running efficiency on CPU?
It really depends what you are shooting for...
If you plan on training, a CPU is not going to work well for you. Use Colab or Kaggle.
Assuming you do get a GPU, it depends on whether you want to focus on classification or object detection.
If you focus on classification, Keras is probably the easiest to work with, or PyTorch if you want some advanced stuff and to be able to change things.
If you plan on object detection (OD), things get complicated. Inference is reasonably easy, but training is complicated. There are actually 4 platforms you should consider:
TensorFlow - powerful but very difficult to work with. If you do not use Keras (and for OD you usually can't), you need to preprocess the dataset into tfrecords, and it is a pain. The OD API has very cryptic messages and is very sensitive to the combination of TF version and API version. On the other hand, cool models like EfficientDet are more or less easy to use.
MMDetection - a very powerful framework with lots of advanced models; once you understand how to work with it, you can easily work with any of the models it supports. The downside is that some models are slow to arrive (EfficientDet, for example).
PaddlePaddle - if you know Chinese, this should work OK, maybe. The documentation is a bit behind and usually requires lots of improvisation. Basically it is similar to MMDetection, just with a few unique models and a few missing models.
Detectron2 - I haven't worked with this one, but it seems to support only a few models.
You probably need to first define for yourself what you want to do, and then choose.
Good luck!
It is not that trivial. Some models run faster with one framework, others with another. Furthermore, it depends on the hardware as well. See this blog. If inference is your only concern, then you can develop your model in any of the popular frameworks like TensorFlow, PyTorch, etc. At the end, convert your model to ONNX format and benchmark its performance with DNN-Bench to choose the best inference engine for your application.
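Whatever tool is used, the measurement itself boils down to timing the same input through each candidate on the target hardware. A minimal sketch of such a timing harness is below; it does not use DNN-Bench or ONNX, and inference_a and inference_b are hypothetical stand-ins for the same model run through two different frameworks.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=20):
    """Return the median wall-clock seconds of fn(*args) over `repeats` runs."""
    for _ in range(warmup):  # warm-up runs are executed but not timed
        fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical stand-ins for one model exported to two inference engines.
def inference_a(batch):
    return [v * 2.0 for v in batch]


def inference_b(batch):
    return list(map(lambda v: v * 2.0, batch))


batch = [float(i) for i in range(10_000)]
candidates = [("a", inference_a), ("b", inference_b)]
results = {name: benchmark(fn, batch) for name, fn in candidates}
fastest = min(results, key=results.get)
print(results, fastest)
```

Using the median over several runs after a warm-up reduces the influence of one-off effects such as lazy initialization, which matters when the differences between frameworks are small.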

Object Oriented Bayesian Spam Filtering?

I was wondering if there is any good and clean object-oriented programming (OOP) implementation of Bayesian filtering for spam and text classification? This is just for learning purposes.
I definitely recommend Weka which is an Open Source Data Mining Software written in Java:
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
As mentioned above, it ships with a bunch of different classifiers like SVM, Winnow, C4.5, Naive Bayes (of course) and many more (see the API doc).
Note that a lot of classifiers are known to have much better performance than Naive Bayes in the field of spam detection or text classification.
Furthermore Weka brings you a very powerful GUI…
Check out Chapter 6 of Programming Collective Intelligence
Maybe https://ci-bayes.dev.java.net/ or http://www.cs.cmu.edu/~javabayes/Home/node2.html?
I never played with it either.
Here is an implementation of Bayesian filtering in C#: A Naive Bayesian Spam Filter for C# (hosted on CodeProject).
nBayes - another C# implementation hosted on CodePlex
In French, but you should be able to find the download link :)
PHP Naive Bayesian Filter