Forming conditional distributions in TensorFlow Probability

I am using TensorFlow Probability to build a VAE which includes image pixels as well as some other variables. The output of the VAE:
tfp.distributions.Independent(tfp.distributions.Bernoulli(logits=logits),
                              reinterpreted_batch_ndims=2,
                              name="decoder-dist")
I am trying to understand how to form other conditional distributions based on this which I can use with the inference methods (MCMC or VI). Say the output above was P(A,B,C | Z), how would I take that distribution to form a posterior P(A|B, C, Z) that I could perform inference on? I have been trying to read through the docs but I am having some trouble grasping them.

The answer to your question depends very much on the nature of the joint model within which you'd like to do the conditioning. Much has been written about the topic, and in short it's a very hard problem in general :). Without knowing a bit more about the particulars of your problem, it's nearly impossible to recommend a useful generic inference procedure. However, we do have a bunch of examples (scripts and Jupyter/Colab notebooks) in the TFP repo here: https://github.com/tensorflow/probability/tree/master/tensorflow_probability/examples
In particular, there's
The Hierarchical Linear Model example, which is a sort of Rosetta stone showing how to do posterior inference using Hamiltonian Monte Carlo (an MCMC technique) in TFP, R, and Stan,
The Linear Mixed Effects Model example, showing how you might use VI to solve a standard LME problem,
among many others. You can click the "Run in Google Colab" link at the top of any of these notebooks to open and run them on https://colab.research.google.com.
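To give a flavor of what posterior inference looks like there, here is a minimal, hedged sketch of running Hamiltonian Monte Carlo in TFP; the target density is a standard-normal placeholder standing in for the log-density of the conditional you actually want (e.g. log p(A | B, C, Z)), which you would derive from your own model:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def target_log_prob_fn(x):
    # Placeholder target: swap in the log-density of the conditional
    # distribution you want to sample from.
    return tfd.Normal(loc=0., scale=1.).log_prob(x)

kernel = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob_fn,
    step_size=0.1,
    num_leapfrog_steps=3)

samples, kernel_results = tfp.mcmc.sample_chain(
    num_results=1000,
    num_burnin_steps=500,
    current_state=tf.zeros([]),
    kernel=kernel)

The structure stays the same for a real model: only target_log_prob_fn and the shape of current_state change.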
Please also feel free to reach out to us via email at tfprobability@tensorflow.org. This is a public Google Group where users can engage directly with the team that builds TFP. If you give us some more detail there about what you'd like to do, we're happy to provide guidance on modeling and inference with TFP.
Hope this gives you at least a start in the right direction!

Related

Is TensorFlow the way to go for this optimization problem?

I have to optimize the result of a process that depends on a large number of variables, namely a laser engraving system where the engraving depth depends on the laser speed, distance, power and so on.
The final objective is the minimization of the engraving time, or the maximization of the laser speed. All the other parameters can vary, but must stay within safe bounds.
I have never used any machine learning tools, but to my very limited knowledge this seems like a good use case for TensorFlow or any other machine learning library.
I would experimentally gather data points to train the model, test it, and then use a gradient-descent optimizer to find the parameters (within bounds) that maximize the laser travel velocity.
Does this sound feasible? How would you approach such a problem? Can you link to any examples available online?
Thank you,
Riccardo
I'm not quite sure I understood the problem correctly; could you add some example data and the desired output?
As far as I understand, it could be feasible to use TensorFlow, but I believe there are better solutions to this problem. Let me expand on this.
TensorFlow is a framework focused on the development of deep learning models. These usually require lots of data (how much really depends on the problem), and I don't believe the data you could gather manually on your own would be enough unless your team is quite big or you already have some data collected.
Also, as you have a minimization (or maximization) problem with variables that lie within a known range, I think this is a case for Operations Research optimization rather than machine learning. Check this example of OR.
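To make the OR-style approach concrete, here is a hedged sketch in Python: fit a surrogate model to your measured data, then maximize laser speed under a depth constraint and safe bounds. The engraving_depth formula, TARGET_DEPTH, and the bounds below are all illustrative placeholders, not a physical model:

import numpy as np
from scipy.optimize import minimize

def engraving_depth(x):
    # Hypothetical surrogate fitted to your measured data points:
    # depth as a function of (speed, power, distance).
    speed, power, distance = x
    return 0.8 * power / (speed * (1.0 + 0.1 * distance))

TARGET_DEPTH = 0.5  # required depth (example value, arbitrary units)

# Maximizing speed is the same as minimizing its negation.
objective = lambda x: -x[0]

constraints = [{"type": "ineq",
                "fun": lambda x: engraving_depth(x) - TARGET_DEPTH}]
bounds = [(1.0, 100.0),  # speed: safe operating range
          (0.1, 50.0),   # power
          (0.5, 10.0)]   # distance

result = minimize(objective, x0=[10.0, 5.0, 1.0], bounds=bounds,
                  constraints=constraints, method="SLSQP")
print(result.x)  # speed, power, distance at the constrained optimum

In practice you would replace engraving_depth with a model fitted to your experiments (even a polynomial fit may do) and re-run the optimization as more data arrives.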

Converting a deep learning model from a GPU-powered framework, such as Theano, to a common, easily handled one, such as NumPy

I have been playing around with building some deep learning models in Python and now I have a couple of outcomes I would like to be able to show friends and family.
Unfortunately(?), most of my friends and family aren't really up to the task of installing any of the advanced frameworks that are more or less necessary to have when creating these networks, so I can't just send them my scripts in the present state and hope to have them run.
But then again, I have already created the nets, and just using the finished product is considerably less demanding than making it. We don't need advanced graph compilers or GPU compute power for the show and tell. We just need the ability to make a few matrix multiplications.
"Just" being a weasel word, regrettably. What I would like to do is convert the the whole model (connectivity,functions and parameters) to a model expressed in e.g. regular Numpy (which, though not part of standard library, is both much easier to install and easier to bundle reliably with a script)
I fail to find any ready solutions for this. (I find it difficult to pick specific search-engine keywords for it.) But it seems to me that I can't be the first guy who wants to use a ready-made deep learning model on a lower-spec machine operated by people who aren't necessarily inclined to spend months learning how to set the parameters in an artificial neural network.
Are there established ways of transferring a model from e.g. Theano to Numpy?
I'm not necessarily requesting those specific libraries. The main point is I want to go from a GPU-capable framework in the creation phase to one that is trivial to install or bundle in the usage phase, to lower or eliminate the barrier the dependencies create for users without extensive technical experience.
An interesting option for you would be to deploy your project to Heroku, as explained on this page:
https://github.com/sugyan/tensorflow-mnist
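If you do want the pure-NumPy route from the question, here is a minimal sketch under stated assumptions: you export each layer's weights from the training framework (e.g. get_value() on Theano shared variables, saved with np.savez) and re-implement the forward pass as plain matrix multiplications. The layer sizes and random weights below are stand-ins for your exported parameters:

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, params):
    # params is a list of (W, b) pairs, one per dense layer;
    # hidden layers use ReLU, the final layer is left linear here.
    for W, b in params[:-1]:
        x = relu(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Stand-in for weights exported from the training framework and
# reloaded via np.load("model_weights.npz") or similar.
rng = np.random.default_rng(0)
params = [(rng.standard_normal((784, 128)), np.zeros(128)),
          (rng.standard_normal((128, 10)), np.zeros(10))]

print(forward(rng.standard_normal((1, 784)), params).shape)  # (1, 10)

This only covers plain dense nets; convolutions or recurrent layers would need their own NumPy re-implementations, which is more work but follows the same pattern.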

How to test for convergence in a BUGS model?

I want to assess convergence in a BUGS model using the plot() command. An example of the output is in the following figure.
I am not sure that I can read this output well. Thanks to everyone :)
Unfortunately, it does not look as if you can confirm convergence from the figure that you are showing (EDIT: There is at least some information, see below). The left hand side of the figure is just a caterpillar plot, which effectively just shows the 95% intervals of the distribution for each parameter.
Assessing convergence is a much more nuanced process, as there are multiple ways to decide whether your model has converged. What you will want to determine is that your model has appropriately explored the parameter space for each parameter (through trace plots, the traceplot function in the coda library), between- and within-chain variance (the Gelman-Rubin diagnostic, gelman.diag in the coda library), and auto-correlation in your chains (autocorr.plot in coda). There are a variety of other measures that others have suggested for assessing whether your model has converged, and looking through the rest of the coda package will illustrate this.
I highly suggest that you go through the WinBUGS tutorial in their user manual (link to pdf); it has a section that addresses checking model convergence. You want to ensure that the trace plots are well mixed (see the tutorial for what that means), that your Gelman-Rubin diagnostic is < 1.10 for each parameter (a general rule), and that your chains are not too correlated (correlation reduces the effective sample size of your chains).
Good luck, and read up a bit on the subject, it will greatly benefit you if you are interested in Bayesian inference!
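For readers outside R/coda, here is a hand-rolled Python sketch of the Gelman-Rubin diagnostic, computed directly from its definition; chains is an (n_chains, n_samples) array for a single parameter, and for real work you should prefer coda or an equivalent maintained package:

import numpy as np

def gelman_rubin(chains):
    m, n = chains.shape                    # m chains, n samples each
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()  # mean within-chain variance
    B = n * chain_means.var(ddof=1)        # between-chain variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return np.sqrt(var_hat / W)            # R-hat; want < 1.1

rng = np.random.default_rng(0)
chains = rng.normal(size=(4, 1000))        # 4 well-mixed dummy chains
print(gelman_rubin(chains))                # ~1.0 for converged chains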
Edit
As @jacobsocolar pointed out, and I completely missed, the plots in this question do at least contain some information indicating that the model converged. I did not notice the R-hat plot to the right of the caterpillar plot. These values should be less than 1.1 for each parameter if the model did indeed converge. Eyeballing the above plot does hint that the model converged, but this would be far easier to see if there were a vertical line at the 1.1 mark on the plot, which there is not.
Your output figure is indeed enough to (begin to) assess convergence, contra M_Fidino's answer. Next to the caterpillar plot, there is a plot of 'r-hat' values. These are the Gelman-Rubin statistic (a comparison of between-chain and within-chain variance), and they are all < 1.10.
This is an encouraging first sign that the model has converged, assuming that the initial values were chosen to be nicely overdispersed.
Otherwise, I agree with everything in M_Fidino's answer.

Automated Design in CAD, Analysis in FEA, and Optimization

I would like to optimize a design by having an optimizer make changes to a CAD file, which is then analyzed with FEM, with the results fed back into the optimizer to make further changes to the design, until the solution converges to an optimum (mass, stiffness, or something else).
This is what I envision:
create a blueprint of the part in a CAD software (e.g. CATIA).
run an optimizer code (e.g. fmincon) from within a programming language (e.g. Python). The parameters of the optimizer are parameters of the CAD model (angles, lengths, thicknesses, etc.).
the optimizer evaluates a certain design (parameter set). The programming language calls the CAD software and modifies the design accordingly.
the programming language extracts some information (e.g. mass).
then the programming language extracts a STEP file and passes it to an FEA solver (e.g. Abaqus), where a predefined analysis is performed.
the programming language reads the results (e.g. max von Mises stress).
the results from CAD and FEM (e.g. mass and stress) are fed to the optimizer, which changes the design accordingly.
until it converges (a minimal Python skeleton of this loop is sketched below).
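To illustrate, here is a hedged Python skeleton of that loop; cad_update and run_fea are toy stand-in formulas (so the script runs end to end) for real calls into your CAD and FEA tools, and the bounds, stress limit, and penalty weight are made-up values:

import numpy as np
from scipy.optimize import minimize

def cad_update(x):
    # Replace with: write x into the CAD model, rebuild, export STEP,
    # and query the mass. Toy stand-in: mass grows with both parameters.
    return 7800.0 * x[0] * x[1] * 1e-3

def run_fea(x):
    # Replace with: submit the STEP file to the FEA solver and read
    # back max von Mises stress. Toy stand-in: stress falls as the
    # section thickens.
    return 4.0e8 / (x[0] * x[1])

ALLOWABLE_STRESS = 2.5e8  # Pa, example material limit

def objective(x):
    mass = cad_update(x)
    stress = run_fea(x)
    # Quadratic penalty keeps the optimizer inside the stress limit.
    penalty = max(0.0, stress - ALLOWABLE_STRESS) ** 2
    return mass + 1e-14 * penalty

bounds = [(1.0, 10.0), (0.5, 5.0)]  # safe parameter ranges
result = minimize(objective, x0=[5.0, 2.0], bounds=bounds,
                  method="Nelder-Mead")  # derivative-free, suits noisy black-box tools
print(result.x, result.fun)

A derivative-free method is used because finite-differencing through a full CAD rebuild plus FEA run per function evaluation is expensive and often noisy.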
I know this exists within closed architectures (e.g. Isight), but I want to use an open architecture where the optimizer is called from within an open programming language (ideally Python).
So finally, here are my questions:
Can it be done, as I described it or otherwise?
Any references or tutorials?
Which software do you recommend for programming, CAD and FEM?
Yes, it can be done. What you're describing is a small parametric structural-sizing multidisciplinary optimization (MDO) environment. Before you even begin coding up the tools or environment, I suggest doing some preliminary work in a few areas:
Carefully formulate the minimization problem (minimize f(x), where x is a vector containing ... variables, subject to ... constraints, etc.)
Survey and identify individual tools of interest
How would each tool work? Input variables? Output variables?
Outline in a Design Structure Matrix (a.k.a. N^2 diagram) how the tools will feed information (variables) to each other
What optimizer is best suited to your problem (MDF?)
Identify suitable convergence tolerance(s)
Once the above steps are taken, I would then start to think about MDO implementation details. Python, while not the fastest language, would be an ideal environment because many tools for solving MDO problems like yours were built in Python, and development time is low. I suggest going with the following packages:
OpenMDAO (http://openmdao.org/): a modern MDO platform written by NASA Glenn Research Center. The tutorials do a good job of getting you started. Note that each "discipline" in the Sellar problem, the 2nd problem in the tutorial, would include a call to your tool(s) instead of a closed-form equation. As long as you follow OpenMDAO's class framework, it does not care what each discipline is and treats it as a black box; it doesn't care what goes on between an input and an output (a minimal component sketch follows this list).
SciPy and NumPy: scientific and numerical computing packages (SciPy provides general-purpose optimizers)
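As promised above, here is a minimal, hedged sketch of a single black-box discipline in OpenMDAO's class framework (written against the more recent 2.x+ API, which postdates this answer; the toy closed-form objective stands in for your CAD/FEA calls):

import openmdao.api as om

class MassComp(om.ExplicitComponent):
    # Black-box discipline: in a real setup, compute() would call
    # your CAD rebuild and FEA post-processing instead of a formula.
    def setup(self):
        self.add_input('t', val=5.0)    # e.g. a thickness parameter
        self.add_output('mass', val=0.0)
        # Finite-difference the black box to get gradients for SLSQP.
        self.declare_partials('mass', 't', method='fd')

    def compute(self, inputs, outputs):
        outputs['mass'] = (inputs['t'] - 3.0) ** 2 + 10.0  # toy objective

prob = om.Problem()
prob.model.add_subsystem('disc', MassComp(), promotes=['*'])
prob.driver = om.ScipyOptimizeDriver()
prob.driver.options['optimizer'] = 'SLSQP'
prob.model.add_design_var('t', lower=1.0, upper=10.0)
prob.model.add_objective('mass')
prob.setup()
prob.run_driver()
print(prob.get_val('t'))  # converges near t = 3.0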
I don't know what software you have access to, but here are a few tool-related tips to help you in your tool survey and identification:
Abaqus has a Python API (http://www.maths.cam.ac.uk/computing/software/abaqus_docs/docs/v6.12/pdf_books/SCRIPT_USER.pdf)
If you need to use a program that does not have an API, you can automate the GUI using Python's win32com or Pywinauto (GUI automation) packages
For FEM/FEA, I used both MSC PATRAN and MSC NASTRAN on previous projects since they have command-line interfaces (read: easy to interface with via Python)
HyperSizer also has a Python API
Install Python(x,y) (https://code.google.com/p/pythonxy/) and use the Spyder Python IDE (included)
CATIA can be automated using win32com (quick Google search on how to do it: http://code.activestate.com/recipes/347243-automate-catia-v5-with-python-and-pywin32/)
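In that spirit, here is a hedged sketch of driving a CATIA parameter over COM from Python; the parameter name, value, and export path are illustrative and depend entirely on your model:

import win32com.client

# Attach to a running CATIA instance with a part document open.
catia = win32com.client.Dispatch("CATIA.Application")
doc = catia.ActiveDocument
part = doc.Part

# Look up a named design parameter, change it, and rebuild the part.
thickness = part.Parameters.Item("Thickness")  # hypothetical name
thickness.Value = 2.5                          # new value in model units
part.Update()

# Export the updated geometry as STEP for the FEA stage.
doc.ExportData(r"C:\work\design_iter.stp", "stp")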
Note: to give you some sort of development time-frame, what you're asking will probably take at least two weeks to develop.
I hope this helps.

Examples of Apache Commons Math optimization

I have a simple optimization problem and am looking for Java software for it.
The Apache Commons Math optimization software looks just like what I want, but I can't find documentation that suits my needs (where those needs are to be useful to a beginner / non-maths professional!).
Does anyone know of a worked, simple, example?
In case it helps, the problem is that I want to find the max r where
r1 = s1 * m1
r2 = s2 * m2
and there are some constraints and formulas defining the relationship between the variables. The Excel Solver works fine for this problem. I got LPSolve working great, but this problem requires a multiplication of s and m, so I understand LPSolve can't help, as this makes the problem non-linear.
I recently ported the derivative-free non-linear constrained optimization code COBYLA2 to Java. Since it does not explicitly rely on derivatives, the algorithm may require quite a few iterations for larger problems. Nonetheless, you are able to formulate your problem with both a non-linear objective function and (potentially) non-linear constraints.
You can read more about it and download the source code from here.
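For comparison, the same derivative-free algorithm is also available outside Java as method='COBYLA' in SciPy. Here is a hedged sketch of the question's problem, treating r as r1 + r2 for illustration; the budget and caps below are made-up constraints standing in for yours:

from scipy.optimize import minimize

# Maximize r = s1*m1 + s2*m2 by minimizing its negation.
def objective(x):
    s1, m1, s2, m2 = x
    return -(s1 * m1 + s2 * m2)

constraints = [
    # Example linking constraint: total m budget of 10.
    {"type": "ineq", "fun": lambda x: 10.0 - (x[1] + x[3])},
    # Non-negativity (COBYLA takes constraints, not bounds).
    {"type": "ineq", "fun": lambda x: x[0]},
    {"type": "ineq", "fun": lambda x: x[1]},
    {"type": "ineq", "fun": lambda x: x[2]},
    {"type": "ineq", "fun": lambda x: x[3]},
    # Example caps on the s variables.
    {"type": "ineq", "fun": lambda x: 5.0 - x[0]},
    {"type": "ineq", "fun": lambda x: 5.0 - x[2]},
]

result = minimize(objective, x0=[1.0, 1.0, 1.0, 1.0],
                  method="COBYLA", constraints=constraints)
print(result.x, -result.fun)  # optimum pushes s to its cap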
I am not aware of a simple Java-based NLP solver. (I did find an example of quadratic programming (QP) in Apache Commons Math, but it doesn't qualify, since you asked for a non-maths-professional example.)
I have two suggestions for you to solve your non-linear program:
1. Excel's Solver does have the ability to tackle non-linear problems. (Don't use LPSOLVE.) In fact, NLP is the default mode in Solver.
Here are two links to using Excel to solve NLPs: Example 1 - Step by step Solver walk-through that covers NLP and
Example 2 - A General Neural network example in Excel
Also for Excel, I like Paul Jensen's (UT Austin) ORMM add-ins.
He has a module called Teach NLP. Chapter 10 of his book deals with NLP and is available from his site.
2. If you are going to be doing even some amount of data analysis, then I recommend investing a few hours to download and learn the basics of R.
R has numerous packages and libraries for optimization. optim() and nlme are relevant for solving non-linear programs.
Just for completeness, I mention SAS, MATLAB and CPLEX as other options. If you have access to any of these, they all do a very good job with solving non-linear programs.
Hope these pointers help.