Does Accord.NET have Lasso and Ridge regression? - accord.net

Recently I discovered Lasso and Ridge regression in the scikit-learn Python library. But as a .NET developer I need the same functionality in the Accord.NET machine learning framework, and I am trying to work out whether they are available there. For example, I see L1 and L2 regularization on the Accord.NET web site, and I know that Lasso and Ridge are implemented with L1 and L2 regularization respectively. But I'm still not sure. Can anybody confirm or refute that L1 and L2 in Accord.NET are the same as Lasso and Ridge regularization in scikit-learn?

So far the answer is the following. In Python, Lasso (L1) and Ridge (L2) are implemented for linear regression, but Accord.NET implements L1 and L2 regularization only for logistic regression. This is clearly visible in the screenshot from their web site:
Because logistic and linear regression have different applications (like apples and oranges), I conclude that Ridge and Lasso linear regression are not implemented in Accord.NET.
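For reference on what the two penalties do: Ridge is ordinary least squares with an L2 penalty on the weights and has a closed-form solution, while Lasso's L1 penalty has no closed form and is usually solved by coordinate descent (which is what scikit-learn's Lasso does). A minimal numpy sketch of the Ridge closed form, on hypothetical data with an arbitrarily chosen alpha:

```python
import numpy as np

# Hypothetical data: y = 2*x0 - x1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=100)

alpha = 0.1  # regularization strength (arbitrary choice)

# Ridge closed form: w = (X^T X + alpha * I)^-1 X^T y
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(2), X.T @ y)

# With a small alpha the ridge solution stays close to the true weights.
print(w_ridge)
```

Any library that exposes "linear regression with an L2 penalty", under whatever name, is computing this same estimator.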

Related

Random forest via tensorflow 2.3 is it possible?

I would like to implement a random forest regression via TensorFlow 2.3, but I cannot find any example of that. Is it possible to do random forest regression via TensorFlow 2.3?
The same problem with SVM and SVR :/
I cannot use sklearn, because I have to use Go in the running system. Maybe I can do the random forest regression via sklearn, but how could I then read the model via TensorFlow? I think it is not possible.

Regression using MXNet

I have a regression model based on various independent features which eventually predicts a value with a custom loss function, somewhat similar to the link below.
https://www.evergreeninnovations.co/blog-quantile-loss-function-for-machine-learning/
The current model is built using the TensorFlow library, but now I want to use MXNet because of the speed and other advantages it provides. How do I write similar logic in MXNet with a custom loss function?
Simple regression with L2 loss is featured in two famous tutorials; you can pick either of them and customize the loss:
In the D2L.ai book (used at many universities):
https://d2l.ai/chapter_linear-networks/linear-regression-gluon.html
In The Straight Dope (the guide to the Python API of MXNet, gluon); a lot of that guide went into D2L.ai:
https://gluon.mxnet.io/chapter02_supervised-learning/linear-regression-gluon.html
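Whichever framework ends up being used, the custom loss from the linked post is the quantile (pinball) loss, which is easy to state framework-agnostically. A minimal numpy sketch with toy targets and an arbitrarily chosen quantile:

```python
import numpy as np

def quantile_loss(y_true, y_pred, q):
    """Pinball loss: under-prediction is penalized by q,
    over-prediction by (1 - q), averaged over the batch."""
    e = y_true - y_pred
    return np.mean(np.maximum(q * e, (q - 1) * e))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.5, 1.5])

# For q = 0.5 the pinball loss is exactly half the mean absolute error.
loss = quantile_loss(y_true, y_pred, q=0.5)
print(loss)  # 0.4166...
```

In gluon the same formula can be written with the framework's elementwise maximum op and dropped into the training loop from the tutorials above in place of the L2 loss.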

Is there a thorough exploration of the effect of momentum on Stochastic Gradient Descent?

Many CNN papers use momentum=0.9 with Stochastic Gradient Descent in the weight update. There is a good rationale for using it, but what I am looking for is a thorough exploration of the effects of that parameter. I've searched across many papers, and there are some insights here and there, but I have not been able to find a comprehensive exploration. Also, does its usefulness vary across different computer vision tasks like classification, segmentation, and detection?
Here is a good review paper on this topic: "A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay" by Leslie N. Smith:
https://arxiv.org/pdf/1803.09820.pdf
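For intuition on what the parameter does, the classic momentum update is just an exponentially decaying accumulation of past gradients. A toy sketch minimizing a one-dimensional quadratic, with an arbitrarily chosen learning rate:

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.9, steps=200):
    """Classic momentum update: v <- momentum * v - lr * grad(w); w <- w + v.
    With momentum=0.9, each past gradient keeps ~90% of its weight per step."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(w)
        w = w + v
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w_final = sgd_momentum(lambda w: 2 * (w - 3), w0=0.0)
print(w_final)  # converges near the minimum at w = 3
```

This is only the mechanics, not the empirical study the question asks for; the paper above is the closest thing to that.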

Why tensorflow's implementation of AdamOptimizer does not support L2 normalization

TensorFlow's implementation of AdamOptimizer does not have regularization parameters like those in ProximalAdamOptimizer, for example l2_regularization_strength. Is it necessary to add an L2 norm in AdamOptimizer?
TensorFlow's Adam implementation is just that: an implementation of Adam, exactly as it is defined and tested in the paper.
If you want to use Adam with L2 regularization for your problem you simply have to add an L2 regularization term to your loss with some regularization strength you can choose yourself.
I can't tell you if that is necessary or helpful or what regularization and regularization strength to use, because that highly depends on the problem and is rather subjective.
Usually you add the regularization to your loss yourself, as described here. However, tf.train.ProximalAdagradOptimizer includes a special non-standard regularization that is part of the algorithm itself, which is why it is built into that optimizer.
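As a concrete illustration of "add the term to your loss yourself", here is a minimal numpy sketch with hypothetical weights and an arbitrary regularization strength; the optimizer, Adam or otherwise, is then simply applied to the gradient of this total loss:

```python
import numpy as np

def loss_with_l2(y_true, y_pred, weights, l2_strength):
    """Data loss (MSE here) plus l2_strength times the sum of
    squared weights over all weight tensors of the model."""
    mse = np.mean((y_true - y_pred) ** 2)
    l2_penalty = l2_strength * sum(np.sum(w ** 2) for w in weights)
    return mse + l2_penalty

# Hypothetical model weights and a single (target, prediction) pair
weights = [np.array([1.0, -2.0]), np.array([0.5])]
total = loss_with_l2(np.array([1.0]), np.array([0.0]), weights,
                     l2_strength=0.01)
print(total)  # 1.0 (MSE) + 0.01 * (1 + 4 + 0.25) = 1.0525
```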

Training complexity of Linear SVM

Which is the actual computational complexity of the learning phase of SVM (let's say, that implemented in LibSVM)?
Thank you
Training complexity of nonlinear SVM is generally between O(n^2) and O(n^3), where n is the number of training instances. The following papers are good references:
Support Vector Machine Solvers by Bottou and Lin
SVM-optimization and steepest-descent line search by List and Simon
PS: If you want to use a linear kernel, do not use LIBSVM. LIBSVM is a general-purpose (nonlinear) SVM solver, and it is not an ideal implementation for linear SVM. Instead, you should consider tools like LIBLINEAR (by the same authors as LIBSVM), Pegasos, or SVM^perf. These have much better training complexity for linear SVM; training speed can be orders of magnitude better than with LIBSVM.
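To illustrate why the linear case is so much cheaper: solvers like Pegasos take stochastic subgradient steps on the L2-regularized hinge loss, doing O(d) work per sampled example instead of working with an n-by-n kernel matrix. A toy numpy sketch, with arbitrarily chosen hyperparameters and no bias term:

```python
import numpy as np

def pegasos(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos: stochastic subgradient descent on the L2-regularized
    hinge loss. Each update touches a single example, so one pass
    over the data costs O(n * d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            if y[i] * (X[i] @ w) < 1:  # margin violated: full subgradient
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:                      # only the regularizer shrinks w
                w = (1 - eta * lam) * w
    return w

# Toy linearly separable data: label is the sign of the first feature
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0])
w = pegasos(X, y)
accuracy = np.mean(np.sign(X @ w) == y)
print(accuracy)  # separable data: nearly all points classified correctly
```

Production solvers like LIBLINEAR use more refined methods (e.g. dual coordinate descent), but the per-example cost structure is the same.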
This is going to be heavily dependent on the SVM type and kernel. There is a rather technical discussion in http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf; for a quick answer, the same paper says to expect roughly n^2.