Optimization of a data-driven function as a TensorFlow model

I am trying to find the optimum of a data-driven function represented as a TensorFlow model.
That is, I trained a model to approximate a function and now want to find the optimum of this approximated function using an algorithm and software package/Python library such as ipopt, ipyopt, casadi, etc. Or is there a way to do this directly in TensorFlow? I also have to define constraints, so I can't just use simple autodiff to do gradient descent and optimize my input.
Does anyone have an idea how to realize this in an efficient way?
Maybe this image visualizes my problem and helps to understand what I'm looking for.
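To illustrate the direction I'm imagining, here is a rough sketch (the model file, input dimension, bounds, and constraint are just placeholders): wrap the trained model as an objective whose gradient comes from tf.GradientTape, and hand value and gradient to a constrained solver such as SLSQP via scipy.optimize.minimize:

import numpy as np
import tensorflow as tf
from scipy.optimize import minimize

model = tf.keras.models.load_model("surrogate.h5")  # hypothetical trained model

def value_and_grad(x):
    # evaluate f(x) = model(x) and df/dx via autodiff
    x_t = tf.convert_to_tensor(x.reshape(1, -1), dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x_t)
        y = model(x_t)
    grad = tape.gradient(y, x_t)
    return float(y.numpy()), grad.numpy().ravel().astype(np.float64)

# placeholder constraint x0 + x1 <= 1 plus box bounds on both inputs
constraints = [{"type": "ineq", "fun": lambda x: 1.0 - x[0] - x[1]}]
bounds = [(0.0, 1.0), (0.0, 1.0)]

res = minimize(
    value_and_grad,
    x0=np.array([0.5, 0.5]),
    jac=True,  # value_and_grad returns (f, grad)
    method="SLSQP",
    bounds=bounds,
    constraints=constraints,
)
print(res.x, res.fun)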

Related

Function inverse tensorflow

Is there a way to find the inverse of neural network representation of a function in tensorflow v1? I require this to find the optimal function in an optimization problem that I am solving.
To be precise, the optimal function is found by minimizing the error computed as L2 norm of difference between the approximated optimal function C* (coded as a neural network object), and inverse of a value function V* (coded as another neural network object).
My problem is that I do not know how to write inverse of V* in tensorflow, as I cannot find something like tf.inverse().
Any help is much appreciated. Thanks.
Unless I am misunderstanding the situation, I believe that it is impossible to do this in a generalized way. Many functions do not have a perfect inverse. For a simple example, imagine a square(x) function that computes x². You might think that the inverse is sqrt(y), but in reality the "correct" result could be either sqrt(y) or -sqrt(y), with no way of telling which is correct.
Similarly, with most neural networks I imagine it would be impossible to find the "true" mathematical inverse. There are architectures that attempt to train a neural net and its inverse simultaneously (autoencoders and BiGAN/ALI come to mind), and for some nets it might be possible to train an inverse empirically, but these can have widely varying levels of accuracy that depend heavily on many factors.
Depending on how much control you have over V*, you might be able to design it in such a way that it is mathematically invertible (and then you would have to manually code the inverse), or you might be able to make it a simpler model that is not based on a neural net. However, if V* is an arbitrary preexisting net, then you're probably out of luck.
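That said, if an approximate preimage is good enough, you can "invert" numerically: fix a target value y and optimize the input until V(x) ≈ y. A rough sketch (the tiny placeholder net and all numbers are made up), keeping in mind that this only finds one of possibly many preimages and is sensitive to the starting point:

import tensorflow as tf

# placeholder stand-in for the V* network
V = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="tanh", input_shape=(1,)),
    tf.keras.layers.Dense(1),
])

def invert(v_target, x_init, steps=500, lr=1e-2):
    # gradient descent on the input x, not on the network weights
    x = tf.Variable(x_init, dtype=tf.float32)
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = tf.reduce_sum((V(x) - v_target) ** 2)
        opt.apply_gradients([(tape.gradient(loss, x), x)])
    return x.numpy()

x_hat = invert(v_target=0.3, x_init=[[0.0]])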
Further reading:
SO: local inverse of a neural network
AI.SE: Can we get the inverse of the function that a neural network represents?

Is there a tf.keras.optimizers implementation for L-BFGS?

Does anybody have a Tensorflow 2 tf.keras subclass for the L-BFGS algorithm? If one wants to use L-BFGS, one has currently two (official) options:
TF Probability
SciPy optimization
These two options are quite cumbersome to use, especially when using custom models. So I am planning to implement a custom subclass of tf.keras.optimizers to use L-BFGS. But before I start, I was curious, whether somebody already tackled this task?
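For reference, the TFP route currently looks roughly like this. A minimal sketch on a plain quadratic instead of a real model, since packing a keras model's weights into the flat parameter vector that lbfgs_minimize expects is exactly the cumbersome part:

import tensorflow as tf
import tensorflow_probability as tfp

def value_and_gradients(x):
    # L-BFGS needs f(x) and its gradient w.r.t. a flat parameter vector x
    return tfp.math.value_and_gradient(
        lambda x: tf.reduce_sum((x - 2.0) ** 2), x)

result = tfp.optimizer.lbfgs_minimize(
    value_and_gradients,
    initial_position=tf.zeros(5),
    max_iterations=100,
)
print(result.converged.numpy(), result.position.numpy())  # True, ~[2. 2. 2. 2. 2.]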
I've implemented an interface between keras and SciPy optimize.
https://github.com/pedro-r-marques/keras-opt
I'm using 'cg' by default but you should also be able to use 'l-bfgs'. Take a look at the unit tests for example usage. I will add documentation as soon as possible.
Does anybody have a Tensorflow 2 tf.keras subclass for the L-BFGS algorithm?
Yes, here's (yet another) implementation of L-BFGS (and any other scipy.optimize.minimize solver) for your consideration, in case it fits your use case:
https://pypi.org/project/kormos/
https://github.com/mbhynes/kormos
This package has a similar goal to Pedro's answer above, but I would recommend it over the keras-opt package if you run into issues with memory consumption during training. I implemented kormos when trying to build a Rendle-type factorization machine and kept OOMing with other full-batch solver implementations.
These two options are quite cumbersome to use, especially when using custom models. So I am planning to implement a custom subclass of tf.keras.optimizers to use L-BFGS. But before I start, I was curious, whether somebody already tackled this task?
Agreed, it's a little cumbersome to fit the signatures of tfp and scipy into the parameter fitting procedure in keras, because of the way that keras steps in and out of an optimizer that has persistent state between calls, which is not how most [old school?] optimization libraries work.
This is addressed specifically in the kormos package, since IMO during prototyping it's a pretty common workflow to alternate between a stochastic optimizer and a full-batch deterministic optimizer, and this should be simple enough to do ad hoc in the Python interpreter.
The package has models that extend keras.Model and keras.Sequential:
kormos.models.BatchOptimizedModel
kormos.models.BatchOptimizedSequentialModel
These can be compiled to be fit with either the standard or the scipy solvers; it would look something like this:
from tensorflow import keras
from kormos.models import BatchOptimizedSequentialModel
# Create an Ordinary Least Squares regressor
model = BatchOptimizedSequentialModel()
model.add(keras.layers.Dense(
    units=1,
    input_shape=(5,),
))
# compile the model for stochastic optimization
model.compile(loss=keras.losses.MeanSquaredError(), optimizer="sgd")
model.fit(...)
# compile the model for deterministic optimization using scipy.optimize.minimize
model.compile(loss=keras.losses.MeanSquaredError(), optimizer="L-BFGS-B")
model.fit(...)

APIs for making inferences in GPflow

I have built some Gaussian process models in GPflow and trained them successfully, but I cannot find APIs that help me make inferences straightforwardly in GPflow, such as separating the contributions of different kernels in a GPR model.
I know that I can do it manually, by calculating the covariance matrices, inverting, and multiplying, but such work gets quite annoying as the model grows more complex, like a multi-output SVGP model. Any suggestions?
Thanks in advance!
If you want to e.g. decompose an additive kernel, I think the easiest way for vanilla GPR would be to just swap out the kernel for the component you're interested in, while keeping the learned hyperparameters.
I'm not totally sure about it, but I think it could also work out for SVGP, since the approximation itself is just a standard GP using the same kernel but conditioned on the Inducing Points.
However, I'm not sure if the decomposition of the Variational approximation can be assumed to be close to the decomposition of the true posterior.
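For vanilla GPR, the manual route the question mentions is actually only a few lines once the model is trained, since the posterior mean contribution of one additive component c is K_c(X*, X)[K(X, X) + σ²I]⁻¹ Y. A rough sketch, assuming GPflow 2.x (the toy data and kernels are made up):

import numpy as np
import tensorflow as tf
import gpflow

# toy data: a smooth signal plus a linear trend
X = np.random.rand(100, 1)
Y = np.sin(10 * X) + 0.3 * X + 0.05 * np.random.randn(100, 1)

k_smooth = gpflow.kernels.SquaredExponential()
k_linear = gpflow.kernels.Linear()
model = gpflow.models.GPR(data=(X, Y), kernel=k_smooth + k_linear)
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

# component-wise posterior means at test points
Xnew = np.linspace(0, 1, 50).reshape(-1, 1)
K = model.kernel(X) + model.likelihood.variance * tf.eye(X.shape[0], dtype=tf.float64)
alpha = tf.linalg.solve(K, Y)
mean_smooth = tf.linalg.matmul(k_smooth(Xnew, X), alpha)
mean_linear = tf.linalg.matmul(k_linear(Xnew, X), alpha)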

How to run Tensorflow clustering algorithm model

I need to run the k-means algorithm from TensorFlow in Go, i.e. cluster a graph into subgraphs according to a node-similarity matrix.
I came across this article, which shows an example of how to run a Keras-trained model in Go. In that example the algorithm is of a supervised learning type. However, with clustering algorithms, as I understand it, there will be no model to save and export to a Go implementation.
The reason I am interested in TensorFlow is that I think its code is optimized and will run much faster than a k-means implementation in Go, even in the scenario I described above.
I need an opinion on whether:
It is indeed impossible to use a TensorFlow k-means algorithm in Go, and it is much better just to use k-means implemented in Go for this case.
It is possible to do this; some sort of example or ideas on how to do it would be very much appreciated.
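For reference, the TensorFlow k-means I have in mind is the estimator-based one; in Python it looks roughly like this (a sketch with made-up data). As far as I can tell, the entire learned state is just the cluster centers, which is what makes me unsure what there would be to export for Go:

import numpy as np
import tensorflow as tf

points = np.random.rand(1000, 2).astype(np.float32)  # placeholder feature vectors

def input_fn():
    return tf.compat.v1.train.limit_epochs(
        tf.convert_to_tensor(points), num_epochs=1)

kmeans = tf.compat.v1.estimator.experimental.KMeans(
    num_clusters=5, use_mini_batch=False)

for _ in range(10):  # each call is one sweep over the data
    kmeans.train(input_fn)

centers = kmeans.cluster_centers()              # the learned "model"
assignments = list(kmeans.predict_cluster_index(input_fn))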

Tensorflow: How to perform binary classification as pre-processing and perform linear regression training

In TensorFlow, you can perform either classification or linear regression to train your inputs against the labels. Is it possible to perform some classification of your inputs (as pre-processing, not necessarily using TensorFlow) and then decide whether to run the linear regression using TensorFlow?
For example, in an image denoising task, you have found that your linear regression algorithm provides a good smoothing effect along edges but at the same time also removes the details of texture objects. Therefore you would like to perform a binary classification to determine whether an input is a texture object: if it is not, run the linear regression algorithm using TensorFlow; otherwise, leave the texture object untouched.
I understand TensorFlow supports transfer learning, so I guess one possible solution is to perform the binary classification using TensorFlow and transfer the "texture classification" knowledge to instruct TensorFlow to apply the linear regression algorithm only when the input is not a texture object? Please correct me if I am wrong, as I am not too sure whether the above task is doable in TensorFlow (it would be great if you could describe how to do this in detail if it is :-) ).
I guess an alternative solution is to use some binary classification without TensorFlow and filter out (remove) the texture inputs before passing the rest to TensorFlow.
Please kindly tell me which of the above solutions (or any other solution) is better (if doable) for the above scenario. Any suggestions are welcome.
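To make the first solution concrete, here is the kind of two-stage pipeline I am picturing (a rough sketch; the shapes, layers, and threshold are placeholders, and both models are assumed to be trained separately on their own objectives):

import tensorflow as tf

n_features = 64  # placeholder flattened patch size

# binary "texture" classifier, outputs P(texture)
classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# linear regression model used for smoothing
regressor = tf.keras.Sequential([
    tf.keras.layers.Dense(n_features, input_shape=(n_features,)),
])

def process(batch):
    # smooth non-texture inputs, leave texture inputs untouched
    is_texture = classifier(batch) > 0.5          # shape (batch, 1), boolean
    smoothed = regressor(batch)                   # shape (batch, n_features)
    return tf.where(is_texture, batch, smoothed)  # condition broadcasts over features

out = process(tf.random.normal((8, n_features)))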