I am working on non-convex optimization these days, and a question came to mind about the application of non-convex optimization in deep learning: how can we be sure that our objective function is convex? Thanks
The standard definition: f is convex if its domain is a convex set and f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y) for all x, y in the domain and all 0 ≤ θ ≤ 1.
So if you could prove that for your function, you would know it's convex.
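A proof is the only way to be sure, but you can cheaply *disprove* convexity numerically by sampling the inequality above. A minimal sketch (the function names and tolerance are my own, for illustration); a failed check is a genuine counterexample, while a passed check is only evidence, not a proof:

```python
# Numerical convexity check: sample random pairs (x, y) and thetas and test
# f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y).
import numpy as np

def probably_convex(f, dim, trials=10000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        theta = rng.uniform()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + 1e-9:        # tolerance for floating-point noise
            return False            # definite counterexample: not convex
    return True                     # no violation found in `trials` samples

print(probably_convex(lambda v: np.sum(v**2), dim=3))   # True: ||v||^2 is convex
print(probably_convex(lambda v: np.sum(np.sin(v)), 3))  # False: sin is not convex
```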
In deep learning your objective function is generally non-convex, and it is very difficult to reason about the loss surface globally; that is why initialization and hyperparameter tuning become so important.
I keep hearing that GPUs are useful because they are quick at linear algebra.
I see how a GPU can be utilised to quickly perform linear calculations, and I see why that is useful, but I don't see why these calculations need to be linear.
Why can't we have each GPU core take in 4 numbers a, b, c, d and compute a^b + c^d, or any other nonlinear function?
If the answer is that linear algebra is more efficient: how is it more efficient, and how would one use linear algebra to compute or approximate an arbitrary nonlinear function? (If specificity is required, assume the function is a nonlinear polynomial.)
GPUs are used for pretty much everything. Your observation is not really about GPUs or programming; it is about the books and articles on the subject.
Here are the reasons why you mostly see examples about linear algebra:
1. Linear algebra is relatively simple, so it is easy to explain how massive parallelism helps.
2. Linear algebra is used for a lot of things. For some practical applications, speeding up just the linear algebra already yields a massive performance win, even though the matrices involved are assembled on the CPU with scalar code.
3. Linear algebra is simple enough to be abstracted away in a library like cuBLAS. Arbitrary nonlinear functions tend to require custom compute kernels, which is harder than just consuming a library someone else wrote. Nothing stops a GPU from evaluating nonlinear functions, though; see the sketch below.
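To make that last point concrete, here is a minimal sketch (assuming CuPy is installed with a working CUDA toolkit) that evaluates exactly the nonlinear function from the question, a^b + c^d, elementwise on the GPU. The reason it works well is the same reason linear algebra works well: every evaluation is independent, so they all run in parallel.

```python
# Elementwise nonlinear math on the GPU with CuPy (assumed installed).
import cupy as cp

n = 1_000_000
a = cp.random.rand(n, dtype=cp.float32)
b = cp.random.rand(n, dtype=cp.float32)
c = cp.random.rand(n, dtype=cp.float32)
d = cp.random.rand(n, dtype=cp.float32)

# Each of the n evaluations of a^b + c^d is independent, so the GPU
# schedules them across thousands of cores in parallel.
result = cp.power(a, b) + cp.power(c, d)
print(result[:5])
```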
GPUs are useful when the computations they need to perform can be parallelized, i.e., executed with a divide-and-conquer approach: split the problem into subproblems, solve each subproblem separately, and combine the partial results into the solution of the original problem.
Linear algebra makes intensive use of matrix multiplication, the classic example of a problem that parallelizes well, which is why GPUs are so efficient in the practical applications that require it, such as deep learning.
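To see why matrix multiplication parallelizes so well, note that every output entry depends only on one row of A and one column of B, so all entries can be computed independently. A plain-Python sketch of the idea (a real GPU library would of course not use Python loops):

```python
# Naive matrix multiplication: C[i][j] = sum_p A[i][p] * B[p][j].
# Every (i, j) pair below is an independent task, which is exactly
# the structure a GPU exploits.
def matmul_naive(A, B):
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            C[i][j] = sum(A[i][p] * B[p][j] for p in range(k))
    return C

print(matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```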
We used DeepXDE for solving differential equations (DeepXDE is a framework for solving differential equations, based on TensorFlow). It works fine, but the accuracy of the solution is limited, and optimizing the meta-parameters did not help. Is this limitation a well-known problem? How can the accuracy of the solutions be increased? We used the Adam optimizer; are there optimizers that are more suitable for numerical problems when high precision is needed?
(I think the problem is not specific to one concrete equation, but if needed I can add an example.)
There are actually some methods that can increase the accuracy of the model:
- Random resampling
- Residual-based Adaptive Refinement (RAR): https://arxiv.org/pdf/1907.04502.pdf
They even have an implemented example in their GitHub repository:
https://github.com/lululxvi/deepxde/blob/master/examples/Burgers_RAR.py
Also, you could try a different architecture, such as multi-scale Fourier networks. They seem to outperform standard PINNs in cases where the solution contains lots of "spikes".
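On the optimizer question: a common PINN recipe (not specific to DeepXDE, but supported by it) is to train with Adam first and then fine-tune with L-BFGS, which often reaches noticeably lower residuals. A minimal sketch, assuming the `data` and `net` objects were already built with DeepXDE's usual APIs:

```python
# Adam followed by L-BFGS fine-tuning in DeepXDE. `data` and `net` are
# assumed to exist (e.g. from a Burgers-type problem setup).
import deepxde as dde

model = dde.Model(data, net)

model.compile("adam", lr=1e-3)
model.train(iterations=15000)   # rough global fit (older versions use epochs=)

model.compile("L-BFGS")         # quasi-Newton fine-tuning pass
losshistory, train_state = model.train()
```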
I am looking for optimization modelling libraries in Python, like CVXPY and Pyomo, with support for complex variables (variables with real and imaginary parts) and non-linear problems. CVXPY supports complex variables but doesn't support nonlinear constraints. On the other hand, Pyomo supports nonlinear problems but doesn't support complex variables.
In short: I am working on a large-scale nonlinear and nonconvex optimization problem with some complex variables, and I am looking for something like CVXPY for these types of problems.
Any suggestions?
Thanks
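For what it's worth, one common workaround (not a library recommendation, just a modelling trick) is to split each complex variable z = x + iy into two real variables and rewrite all constraints in terms of x and y, which Pyomo handles fine. A hedged toy sketch with made-up numbers:

```python
# Complex variable z = x + i*y modelled as two real Pyomo variables.
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, minimize

m = ConcreteModel()
m.x = Var(initialize=1.0)   # real part of z
m.y = Var(initialize=1.0)   # imaginary part of z

# A modulus bound |z| <= 2 becomes the real nonlinear constraint x^2 + y^2 <= 4.
m.modulus = Constraint(expr=m.x**2 + m.y**2 <= 4.0)

# Example nonconvex objective written purely in the real/imaginary parts.
m.obj = Objective(expr=(m.x - 1.0)**2 + (m.y + 2.0)**2 + m.x * m.y, sense=minimize)

# Solve with any local NLP solver, e.g. (assuming Ipopt is installed):
# from pyomo.environ import SolverFactory
# SolverFactory("ipopt").solve(m)
```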
I have a question about modelling a simple quadratic assignment problem with Gurobi. As you know, the objective function is not linear. Can we model and solve it with Gurobi? (I am using the Gurobi/Python interface.)
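Gurobi does accept quadratic objectives directly (and, since version 9.0, even nonconvex continuous quadratics via the NonConvex=2 parameter), so a QAP can be written down as-is. A hypothetical toy sketch in gurobipy; the flow matrix F and distance matrix D are made up for illustration:

```python
# Tiny quadratic assignment problem in gurobipy.
import gurobipy as gp
from gurobipy import GRB

n = 3
F = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]   # made-up flows between facilities
D = [[0, 5, 4], [5, 0, 6], [4, 6, 0]]   # made-up distances between locations

m = gp.Model("qap")
x = m.addVars(n, n, vtype=GRB.BINARY, name="x")  # x[i,j]=1: facility i at location j

m.addConstrs(x.sum(i, "*") == 1 for i in range(n))  # each facility placed once
m.addConstrs(x.sum("*", j) == 1 for j in range(n))  # each location used once

# Quadratic objective: sum of flow * distance over all facility/location pairs.
m.setObjective(
    gp.quicksum(F[i][k] * D[j][l] * x[i, j] * x[k, l]
                for i in range(n) for j in range(n)
                for k in range(n) for l in range(n)),
    GRB.MINIMIZE,
)
m.optimize()
```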
What is the actual computational complexity of the learning phase of an SVM (say, the one implemented in LibSVM)?
Thank you
Training complexity of nonlinear SVM is generally between O(n^2) and O(n^3), where n is the number of training instances. The following papers are good references:
Support Vector Machine Solvers by Bottou and Lin
SVM-optimization and steepest-descent line search by List and Simon
PS: If you want to use a linear kernel, do not use LIBSVM. LIBSVM is a general-purpose (nonlinear) SVM solver and is not an ideal implementation for linear SVM. Instead, consider LIBLINEAR (by the same authors as LIBSVM), Pegasos, or SVM^perf. These have much better training complexity for linear SVM; training speed can be orders of magnitude better than with LIBSVM.
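You can see the gap for yourself through scikit-learn's wrappers: SVC wraps LIBSVM (general kernels) and LinearSVC wraps LIBLINEAR (linear only). A small timing sketch on synthetic data; exact timings will vary with your machine:

```python
# Compare LIBLINEAR (LinearSVC) vs LIBSVM (SVC with a linear kernel).
import time
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=20000, n_features=50, random_state=0)

t0 = time.time()
LinearSVC(dual=True).fit(X, y)        # LIBLINEAR: roughly linear in n
print("LIBLINEAR:", time.time() - t0, "s")

t0 = time.time()
SVC(kernel="linear").fit(X, y)        # LIBSVM: between O(n^2) and O(n^3)
print("LIBSVM:   ", time.time() - t0, "s")
```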
This is going to be heavily dependent on the SVM type and kernel. There is a rather technical discussion in the LIBSVM paper: http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf. For a quick answer: it says to expect roughly O(n^2).