Gauss-Newton products in TensorFlow

I would like to use the Gauss-Newton approximation to the Hessian as a metric for an optimization problem, such as the method used to fit the value function in GAE (https://arxiv.org/abs/1506.02438). Does anyone know how to compute these products efficiently? The issue is that I cannot compute Jacobians off the shelf in TensorFlow, which makes it hard to do the per-example rank-one computations. One solution is given in this technical report (https://arxiv.org/pdf/1510.01799.pdf), but it puts some constraints on the network architectures that can be used. Does a more general solution exist?

As of April 2017 there is no general-purpose, efficient way to compute per-example gradients in TensorFlow; calling tf.gradients on the examples one at a time is probably the best approach for now.
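For reference, in current TensorFlow 2.x a Gauss-Newton vector product G v = J^T H_L J v can be assembled without per-example Jacobians, using one forward-mode pass for J v, a Hessian-vector product of the loss with respect to the outputs, and one reverse-mode pass for J^T. The sketch below is not part of the original answer: model, loss_fn (returning a scalar) and v (a list of tensors shaped like the trainable variables) are assumed names, and it assumes a TF version that ships tf.autodiff.ForwardAccumulator and a model that has already been built.

    import tensorflow as tf

    def gauss_newton_vector_product(model, loss_fn, x, y, v):
        # Compute G v = J^T H_L J v without forming J or H_L explicitly,
        # where J = dz/dtheta for the network outputs z and H_L = d^2 L / dz^2.
        # `v` must be a list of tensors shaped like model.trainable_variables.
        params = model.trainable_variables
        with tf.GradientTape() as back_tape:                  # for the final J^T (.)
            # 1) J v via forward-mode autodiff (Jacobian-vector product).
            with tf.autodiff.ForwardAccumulator(params, v) as fwd:
                z = model(x)
            Jv = fwd.jvp(z)

            # 2) H_L (J v): Hessian-vector product of the loss w.r.t. the outputs z.
            with tf.GradientTape() as outer:
                outer.watch(z)
                with tf.GradientTape() as inner:
                    inner.watch(z)
                    loss = loss_fn(y, z)                      # scalar loss
                dL_dz = inner.gradient(loss, z)
                proj = tf.reduce_sum(dL_dz * tf.stop_gradient(Jv))
            HJv = outer.gradient(proj, z)

            # 3) J^T (H_L J v) via one reverse-mode pass through the network.
            back_target = tf.reduce_sum(z * tf.stop_gradient(HJv))
        return back_tape.gradient(back_target, params)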

Related

Optimization of data-driven function as Tensorflow model

I am trying to find the optimum of a data-driven function represented as a TensorFlow model.
That is, I trained a model to approximate a function, and now I want to find the optimum of this approximated function using an algorithm and software package / Python library like ipopt, ipyopt, casadi, .... Or is there a possibility to do this directly in TensorFlow? I also have to define constraints, so I can't just use simple autodiff to do gradient descent and optimize my input.
Does anyone have an idea how to realize this in an efficient way?
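One common pattern, sketched here under assumptions rather than taken from the thread: keep the trained network fixed and hand its value and gradient to a general nonlinear-programming solver. SciPy's SLSQP stands in below for ipopt/ipyopt/casadi, and f and input_dim are made-up names (an untrained stand-in model is used so the snippet runs on its own).

    import numpy as np
    import scipy.optimize
    import tensorflow as tf

    input_dim = 4                                    # assumed input dimensionality
    # Stand-in for the trained model; in practice `f` is your already-trained network.
    f = tf.keras.Sequential([tf.keras.layers.Dense(16, activation="tanh"),
                             tf.keras.layers.Dense(1)])

    def value_and_grad(x_np):
        # Return f(x) and df/dx as NumPy values for the SciPy solver.
        x = tf.convert_to_tensor(x_np[None, :], dtype=tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(x)
            y = f(x)[0, 0]                           # assumes a single scalar output
        grad = tape.gradient(y, x)[0]
        return float(y.numpy()), grad.numpy().astype(np.float64)

    # Example constraint: keep the input inside the unit ball, ||x||^2 <= 1.
    constraints = [{"type": "ineq", "fun": lambda x: 1.0 - np.sum(x ** 2)}]

    result = scipy.optimize.minimize(value_and_grad, x0=np.zeros(input_dim),
                                     jac=True, method="SLSQP",
                                     constraints=constraints)
    print(result.x, result.fun)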

Function inverse tensorflow

Is there a way to find the inverse of a neural network representation of a function in TensorFlow v1? I require this to find the optimal function in an optimization problem that I am solving.
To be precise, the optimal function is found by minimizing the error computed as the L2 norm of the difference between the approximated optimal function C* (coded as a neural network object) and the inverse of a value function V* (coded as another neural network object).
My problem is that I do not know how to write the inverse of V* in TensorFlow, as I cannot find something like tf.inverse().
Any help is much appreciated. Thanks.
Unless I am misunderstanding the situation, I believe that it is impossible to do this in a generalized way. Many functions do not have a perfect inverse. For a simple example, imagine a square(x) function that computes x^2. You might think that the inverse is sqrt(y), but in reality the "correct" result could be either sqrt(y) or -sqrt(y), with no way of telling which is correct.
Similarly, with most neural networks I imagine it would be impossible to find the "true" mathematical inverse. There are architectures that attempt to train a neural net and its inverse simultaneously (autoencoders and BiGAN/ALI come to mind), and for some nets it might be possible to train an inverse empirically, but these can have extremely varying levels of accuracy that depend heavily on many factors.
Depending on how much control you have over V*, you might be able to design it in such a way that it is mathematically invertible (and then you would have to manually code the inverse), or you might be able to make it a simpler model that is not based on a neural net. However, if V* is an arbitrary preexisting net, then you're probably out of luck.
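If V* happens to be one-to-one on the region of interest, one empirical route (with all the accuracy caveats above) is to regress x from V*(x) with a second network. A minimal sketch, assuming a trained Keras model V_star and an input dimensionality x_dim; an untrained stand-in is used here so the snippet runs on its own:

    import tensorflow as tf

    x_dim = 2                                        # assumed input dimensionality of V*
    # Stand-in for the existing trained network x -> V*(x).
    V_star = tf.keras.Sequential([tf.keras.layers.Dense(32, activation="tanh"),
                                  tf.keras.layers.Dense(1)])

    # Fit a second network to map V*(x) back to x over samples of the input domain.
    # This only makes sense where V* is one-to-one on the sampled region.
    inverse_net = tf.keras.Sequential([tf.keras.layers.Dense(64, activation="tanh"),
                                       tf.keras.layers.Dense(64, activation="tanh"),
                                       tf.keras.layers.Dense(x_dim)])

    x_samples = tf.random.uniform((10000, x_dim), minval=-1.0, maxval=1.0)
    y_samples = V_star(x_samples)                    # forward pass through the fixed net

    inverse_net.compile(optimizer="adam", loss="mse")
    inverse_net.fit(y_samples, x_samples, epochs=20, batch_size=128)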
Further reading:
SO: local inverse of a neural network
AI.SE: Can we get the inverse of the function that a neural network represents?

Accuracy of solutions of differential equations with DeepXDE

We used DeepXDE for solving differential equations. (DeepXDE is a framework for solving differential equations, based on TensorFlow.) It works fine, but the accuracy of the solution is limited, and optimizing the hyperparameters did not help. Is this limitation a well-known problem? How can the accuracy of the solutions be increased? We used the Adam optimizer; are there optimizers that are more suitable for numerical problems if high precision is needed?
(I think the problem is not specific to some concrete equation, but if needed I can add an example.)
There are actually some methods that could increase the accuracy of the model:
Random Resampling
Residual Adaptive Refinement (RAR): https://arxiv.org/pdf/1907.04502.pdf
They even have an implemented example in their github repository:
https://github.com/lululxvi/deepxde/blob/master/examples/Burgers_RAR.py
Also, you could try using a different architecture such as multi-scale Fourier NNs. They seem to outperform PINNs in cases where the solution contains lots of "spikes".
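For concreteness, the RAR idea in the linked example amounts to a loop like the sketch below; model, data, geomtime and pde are the usual objects from an existing DeepXDE script, and exact call signatures (e.g. epochs vs. iterations in model.train, or the residual shape returned by model.predict) vary between DeepXDE versions, so treat this as a rough outline rather than copy-paste code.

    import numpy as np

    # Residual Adaptive Refinement, loosely following the Burgers_RAR example above.
    # `model` (dde.Model), `data` (e.g. dde.data.TimePDE), `geomtime` (the geometry)
    # and `pde` (the residual function) come from an existing DeepXDE script.
    for _ in range(10):
        X = geomtime.random_points(100000)                 # candidate collocation points
        residual = np.abs(model.predict(X, operator=pde)).reshape(-1)
        worst = X[np.argsort(residual)[-10:]]              # largest PDE residuals
        data.add_anchors(worst)                            # add them as training points
        model.compile("adam", lr=1e-3)
        model.train(epochs=1000)                           # `iterations=` in newer versions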

Building a deep neural network that produces output that is distributed as multivariate Standard normal distribution

I'm looking for a way to build a deep neural network that produces output distributed as a multivariate standard normal distribution, ~N(0,1).
I can use Pytorch or TensorFlow, whichever is easier for this task.
I actually have some input X, which in terms of this question can be assumed to be just a matrix of values from the uniform distribution.
I put the input into the network, whose architecture can currently change.
And I want the output to satisfy, in addition to other requirements I will have of it, that if we look at the values obtained over all possible x's, they look like samples from a multivariate standard normal distribution ~N(0,1).
What I think needs to be done for this to happen is to choose the right loss function.
To do this, I thought of two ways:
Use of statistical tests.
A loss that tests a large number of properties (mean, standard deviation, ..).
Implementing option 2 sounds complicated, so I started with option 1.
I was looking for statistical tests already implemented in one of the packages as a loss function, and I did not find anything like that.
I implemented statistical tests by myself to obtain output that is univariate standard normal distribution - and it seemed to work relatively well.
With the implementation of multivariate tests I got more tangled up.
Do you know of any understandable TensorFlow/PyTorch functions that do something similar to what I'm trying to do?
Do you have another idea for the operation?
Do you have any comments regarding the methods I try to work with?
Thanks
Using PyTorch's built-in functions can help you a lot. Since I don't know exactly what you want to do with these results, I can refer you to the PyTorch documentation page that lists the loss functions.
There you will find all of the PyTorch loss functions and the calculations used in each one of them; just click on one and check how it works to see whether it's what you're looking for.
For your second approach, you could look on that same page at BCEWithLogitsLoss, because it may be what you are looking for.
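As one concrete way to realize option 2 from the question (a loss that checks a number of properties), which the answer above does not spell out, a moment-matching penalty in PyTorch could look like the following sketch; the choice of matching only the first two moments and the squared-error weighting are assumptions.

    import torch

    def moment_matching_loss(outputs: torch.Tensor) -> torch.Tensor:
        # Penalize deviation of the batch's empirical mean from 0 and of its
        # empirical covariance from the identity, i.e. the first two moments
        # of a multivariate standard normal. outputs has shape [batch, dim].
        mean = outputs.mean(dim=0)
        centered = outputs - mean
        cov = centered.T @ centered / (outputs.shape[0] - 1)
        eye = torch.eye(outputs.shape[1], device=outputs.device)
        return mean.pow(2).sum() + (cov - eye).pow(2).sum()

    # Usage sketch: add the term to whatever other loss you have, so the outputs
    # are pushed toward N(0, I) over each batch (only the first two moments):
    # loss = task_loss + lambda_mm * moment_matching_loss(net(x))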

Deep learning basic thoughts

I am trying to understand the basics of deep learning, and lately I have been reading a bit through deeplearning4j. However, I can't really find an answer to: how does the training performance scale with the amount of training data?
Apparently, the cost function always depends on all the training data, since it just sums the squared error per input. Thus, I guess at each optimization step all data points have to be taken into account. I mean, deeplearning4j has the dataset iterator and the INDArray, where the data can live anywhere and thus (I think) doesn't limit the amount of training data. Still, doesn't that mean that the amount of training data is directly related to the computation time per step of gradient descent?
DL4J uses iterators; Keras uses generators. It is the same idea: your data comes in mini-batches and is used for SGD. So the mini-batch size matters, not the total amount of data you have.
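To make that concrete, here is a small Keras sketch with invented numbers: each gradient step processes only batch_size examples, so growing the dataset increases the number of steps per epoch but not the cost of an individual step.

    import tensorflow as tf

    n_examples, batch_size = 100_000, 32      # dataset size vs. work per step
    x = tf.random.uniform((n_examples, 10))
    y = tf.random.uniform((n_examples, 1))

    # Each gradient step sees one mini-batch of `batch_size` examples; the total
    # dataset size only determines how many steps make up an epoch.
    dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(10_000).batch(batch_size)

    model = tf.keras.Sequential([tf.keras.layers.Dense(32, activation="relu"),
                                 tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")
    model.fit(dataset, epochs=1)              # n_examples / batch_size steps per epoch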
Fundamentally speaking, it doesn't (though your mileage may vary). You must research the right architecture for your problem. Adding new data records may introduce new features that are hard to capture with your current architecture. To be safe, I would always question my net's capacity: retrain your model and check whether the metrics drop.