Function scaling in OpenMDAO - optimization

Is there any way to apply a logarithmic scaling to the design objectives/constraints in OpenMDAO? In my optimization problem, I have to deal with an objective function that takes large values and varies significantly over the design space (between the orders 10^6 and 10^7), so I would ideally like to make the driver handle the log of the objective. I have modified my objective function directly for now, but it would be more convenient to do it at the driver level. Is that possible?

Currently, OpenMDAO drivers don't support nonlinear scaling, so applying it through a component is the correct approach. In the past I've sometimes made a separate "objective" component that exists only to apply some transformation to the raw objective. That lets you drop it into your model when needed without changing the original calculations.

A middle ground between "modify the component" and "have the OpenMDAO driver API do it for you" is to use an ExecComp.
prob.model.add_subsystem('log_scale', om.ExecComp('log_f = log(f)'))
prob.model.connect('some_comp.f', 'log_scale.f')
prob.model.add_objective('log_scale.log_f')
The ExecComp will handle the derivatives of that transformation for you. All you have to do is connect your objective output into the right input on the ExecComp instance.
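For reference, a minimal end-to-end sketch of that pattern might look like the following; the upstream component, design variable, bounds, and numbers are placeholders invented for illustration, not part of the original answer:

import openmdao.api as om

prob = om.Problem()
# hypothetical upstream component whose large-valued output 'f' gets log-scaled below
prob.model.add_subsystem('some_comp', om.ExecComp('f = 1.0e6 * (x - 3.0)**2 + 1.0e6'), promotes=['x'])
prob.model.add_subsystem('log_scale', om.ExecComp('log_f = log(f)'))
prob.model.connect('some_comp.f', 'log_scale.f')

prob.model.add_design_var('x', lower=-10.0, upper=10.0)
prob.model.add_objective('log_scale.log_f')  # the driver sees log(f), not f

prob.driver = om.ScipyOptimizeDriver(optimizer='SLSQP')
prob.setup()
prob.set_val('x', 7.0)
prob.run_driver()
print(prob.get_val('some_comp.f'), prob.get_val('log_scale.log_f'))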

Is TensorFlow the way to go for this optimization problem?

I have to optimize the result of a process that depends on a large number of variables, i.e. a laser engraving system where the engraving depth depends on the laser speed, distance, power and so on.
The final objective is the minimization of the engraving time, or the maximization of the laser speed. All the other parameters can vary, but must stay within safe bounds.
I have never used any machine learning tools, but to my very limited knowledge this seems like a good use case for TensorFlow or any other machine learning library.
I would experimentally gather data points to train the algorithm, test it and then use a gradient descent optimizer to find the parameters (within bounds) that maximize the laser travel velocity.
Does this sound feasible? How would you approach such a problem? Can you link to any examples available online?
Thank you,
Riccardo
I’m not quite sure I understood the problem correctly; would you add some example data and the desired output?
As far as I understand, it could be feasible to use TensorFlow, but I believe there are better solutions to this problem. Let me expand on that.
TensorFlow is a framework focused on developing deep learning models. These usually require lots of data (the amount really depends on the problem), and I don't believe that manually gathering the data yourself would be enough unless your team is quite big or you already have some data collected.
Also, since you have a minimization (or maximization) problem with variables that lie within known ranges, I think this is a case for Operations Research optimization rather than Machine Learning. Check this example of OR.
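To make the OR-style framing concrete (this is not from the original answer), here is a rough sketch that maximizes laser speed under bounds and a minimum-depth requirement; the depth model and every number in it are made-up placeholders you would replace with a model fitted to your measurements:

from scipy.optimize import minimize

def depth(speed, power, distance):
    # hypothetical empirical model: depth grows with power, shrinks with speed and distance
    return 0.8 * power / (speed * distance)

TARGET_DEPTH = 0.05  # required engraving depth (assumed units)

def objective(x):
    speed, power, distance = x
    return -speed  # maximizing speed = minimizing its negative

def depth_constraint(x):
    speed, power, distance = x
    return depth(speed, power, distance) - TARGET_DEPTH  # must stay >= 0

bounds = [(10.0, 500.0),  # speed limits (assumed)
          (1.0, 50.0),    # power limits (assumed)
          (0.1, 5.0)]     # distance limits (assumed)

res = minimize(objective, x0=[100.0, 20.0, 1.0], bounds=bounds,
               constraints=[{'type': 'ineq', 'fun': depth_constraint}],
               method='SLSQP')
print(res.x, -res.fun)  # best (speed, power, distance) and the achievable speed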

idea behind xgboost/lightgbm/catboost in comparison

I'm trying to decide, which one of the following I will use in practice for regression tasks: xgboost, lightgbm or catboost (python 3).
So, what is the general idea behind each of them? Why should I choose one over another?
I'm not interested in a very slight difference in the accuracy score, like 0.781 vs 0.782. The results should be reliable, and the tool should be robust and convenient to use. A workhorse.
As I understand it, these methods differ mainly in how they are implemented; otherwise, they all implement gradient boosting (GBM).
So you should just try them and do some hyperparameter tuning (a small example follows this answer).
Also, it's a good idea to read this article:
catboost-vs-light-gbm-vs-xgboost
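As one possible starting point for that tuning (not from the original answer), here is a small sketch using scikit-learn's GridSearchCV on XGBoost; the synthetic data and grid values are assumptions:

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

# synthetic stand-in for your real regression data
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)

param_grid = {
    'n_estimators': [100, 300],
    'max_depth': [3, 6],
    'learning_rate': [0.05, 0.1],
}
search = GridSearchCV(XGBRegressor(random_state=0), param_grid, cv=3, scoring='r2')
search.fit(X, y)
print(search.best_params_, search.best_score_)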
You cannot determine a priori which tree algorithm (or any algorithm) will automatically be the best; this is a consequence of the no-free-lunch theorem: https://en.wikipedia.org/wiki/No_free_lunch_theorem
It's best to try them all out. You should also throw in Random Forest (RF) as another one to try.
I will say that http://CatBoost.ai (CB) does have one advantage over the others: if you have Categorical Variables, CB will most likely beat the others because it can handle categorical variables directly without One-Hot-Encoding.
You might try http://H2O.ai 's grid search which supports several algorithms (RF, XGBoost, GBM, Linear Regression) with Hypertuning of parameters to see which one works best. You can run this overnight. (CB is not included in H2O's grid search)
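To make "try them all out" concrete (this sketch is not from the original answers), you can cross-validate the three boosters plus a random forest on the same data and compare; the synthetic data and near-default settings below are assumptions, and on real data you would tune each model first:

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor

X, y = make_regression(n_samples=2000, n_features=20, noise=0.1, random_state=0)

models = {
    'xgboost': XGBRegressor(n_estimators=300, random_state=0),
    'lightgbm': LGBMRegressor(n_estimators=300, random_state=0),
    'catboost': CatBoostRegressor(n_estimators=300, random_state=0, verbose=0),
    'random_forest': RandomForestRegressor(n_estimators=300, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring='r2')
    print(f'{name:14s} mean R^2 = {scores.mean():.3f}')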

Particles passing through each other

I am writing code to simulate particle movement (currently 2D, hopefully 3D soon).
The thing is, if I use a relatively large timestep, particles end up passing through each other.
Do you have any suggestions that would allow me to correct that without using a really small step?
(It is in C++, if that makes much difference.)
The use of a timestep to advance the clock introduces model artifacts which can destroy the model's validity, as is happening in your case. Use discrete event scheduling instead. This paper from the Winter Simulation Conference 2005 describes how to implement movement in a discrete event framework. Your model will not only be more accurate, it will probably run much faster as well.
So you will have to do some sort of collision detection to see whether two objects would collide during a step.
Depending on your data structure, the detection could take many forms. If you just have a list of points, you would have to check them all against each other, which is O(N^2) per step (extending each particle by its movement vector to create a larger spatial footprint). The overlap test itself could be done with the GJK algorithm.
Using a spatial data structure could reduce the complexity by only running GJK on a pruned set of particles, i.e. skipping pairs that cannot possibly overlap.
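As a concrete illustration of the per-pair test (not from the original answers, and sketched in Python for brevity even though the question is about C++; the math ports directly), here is a swept-circle time-of-impact check for two particles moving with constant velocity over one step:

import math

def time_of_impact(p1, v1, r1, p2, v2, r2, dt):
    # earliest time in [0, dt] at which the two circles touch, or None
    # relative position and velocity of particle 2 with respect to particle 1
    px, py = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]
    rsum = r1 + r2
    # solve |p + v*t| = rsum, i.e. a*t^2 + b*t + c = 0
    a = vx * vx + vy * vy
    b = 2.0 * (px * vx + py * vy)
    c = px * px + py * py - rsum * rsum
    if a == 0.0:
        return 0.0 if c <= 0.0 else None    # no relative motion
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                         # trajectories never come within rsum
    t = (-b - math.sqrt(disc)) / (2.0 * a)  # first root = first contact
    return t if 0.0 <= t <= dt else None

# two particles heading toward each other: a large dt no longer lets them tunnel
print(time_of_impact((0.0, 0.0), (10.0, 0.0), 0.5,
                     (5.0, 0.0), (-10.0, 0.0), 0.5, dt=1.0))  # ~0.2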

Getting started with Finite Elements methods

There is a cubic block of fractured rock; the question is:
how to simulate fluid flow from top-side to down-side or left-side to right-side?
Is FEA (FEM,...) the only practical solution?
If so for the question above in its simplest conditions, that is, flow can happen only through fractures; no interaction between matrix and the fluid; etc etc how to have a quick simulation with FEA?
Is this practical, i.e., could someone proficient in FEA do this in a few minutes? Suppose a suitable mesh has already been generated.
If not, what would you recommend in order to get started rapidly and be able to solve such simple cases?
Does anybody have experience with a similar problem (flow modeling)? If so, what did you use and how did you carry out the job?
Note that we are aware of the availability of many FEM packages, e.g., FEniCS, OpenFOAM, ....
Your question refers to simulating fluid flow in a porous medium, i.e., the rock.
I highly recommend using the lattice Boltzmann method (LBM) instead of FEM-based methods. LBM handles flow in porous media by its very nature; Physical Review E contains publications about that approach. What is even more attractive, LBM can also be easily parallelized on a GPU.
There are a number of numerical techniques that could be used to solve this problem, finite elements being probably the most common. If you have a mesh of the fluid flow domain already (presumably the voids/cracks in the rock) it would be very straightforward to set up and run the flow model with pretty much any CFD package (finite element based or not) and most people with any exposure to FEA should be able to do it. I am assuming that you want to understand the fluid flow within the rock in some detail, rather than just evaluate the effects of the rock on the flow in some larger flow domain. In the latter case, there are other approaches which might be more computationally efficient.
You could use the one-dimensional form of Darcy's Law.
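For a quick back-of-the-envelope check before setting up a full simulation, the one-dimensional Darcy relation Q = -(k*A/mu) * (dp/L) is trivial to evaluate; the sketch below uses assumed, purely illustrative property values:

def darcy_flow_rate(k, A, mu, dp, L):
    # volumetric flow rate [m^3/s] through a sample of length L [m]
    # k: permeability [m^2], A: cross-section [m^2], mu: viscosity [Pa*s], dp: pressure change [Pa]
    return -k * A / mu * dp / L

# e.g. a 0.1 m cube of fractured rock with an assumed effective permeability of 1e-12 m^2,
# water (mu ~ 1e-3 Pa*s), and a 1 kPa pressure drop from top to bottom:
Q = darcy_flow_rate(k=1e-12, A=0.01, mu=1e-3, dp=-1.0e3, L=0.1)
print(Q)  # ~1e-7 m^3/s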

What's the fastest force-directed network graph engine for large data sets? [duplicate]

We currently have a dynamically updated network graph with around 1,500 nodes and 2,000 edges. It's ever-growing. Our current layout engine uses Prefuse - the force directed layout in particular - and it takes about 10 minutes with a hefty server to get a nice, stable layout.
I've looked a little at GraphViz's sfdp algorithm, but haven't tested it yet...
Are there faster alternatives I should look at?
I don't care about the visual appearance of the nodes and edges - we process that separately - just putting x, y on the nodes.
We do need to be able to tinker with the layout properties for specific parts of the graph, for instance, applying special tighter or looser springs for certain nodes.
Thanks in advance, and please comment if you need more specific information to answer!
EDIT: I'm particularly looking for speed comparisons between the layout engine options. Benchmarks, specific examples, or just personal experience would suffice!
I wrote a JavaScript-based graph drawing library, VivaGraph.js.
It calculates the layout and renders a graph with 2K+ vertices and 8.5K edges in ~10-15 seconds. If you don't need the rendering part, it should be even faster.
Here is a video demonstrating it in action: WebGL Graph Rendering With VivaGraphJS.
Online demo is available here. WebGL is required to view the demo but is not needed to calculate graphs layouts. The library also works under node.js, thus could be used as a service.
Example of API usage (layout only):
var graph = Viva.Graph.graph(),
layout = Viva.Graph.Layout.forceDirected(graph);
graph.addLink(1, 2);
layout.run(50); // runs 50 iterations of graph layout
// print results:
graph.forEachNode(function(node) { console.log(node.position); })
Hope this helps :)
I would have a look at OGDF, specifically http://www.ogdf.net/doku.php/tech:howto:frcl
I have not used OGDF, but I do know that the Fast Multipole Multilevel method is a well-performing algorithm, and when you're dealing with the kinds of runtimes involved in force-directed layout at the number of nodes you have, that matters a lot.
Why, among other reasons, that algorithm is awesome: the fast multipole method. It approximates the all-pairs (N-body) force computation, reducing what would otherwise be an O(N^2) evaluation to roughly O(N log N) or O(N) at the cost of a small, controllable error. Ideally, you'd have code from something like this: http://mgarland.org/files/papers/layoutgpu.pdf but I can't find it anywhere; maybe a CUDA solution isn't up your alley anyway.
Good luck.
The Gephi Toolkit might be what you need: some layouts are very fast yet with a good quality: http://gephi.org/toolkit/
30 seconds to 2 minutes are enough to lay out such a graph, depending on your machine.
You can use the ForceAtlas layout or the Yifan Hu Multilevel layout.
For very large graphs (50K+ nodes and 500K links), the OpenOrd layout will be a better fit.
In a commercial scenario, you might also want to look at the family of yFiles graph layout and visualization libraries.
Even the JavaScript version of it can perform layouts for thousands of nodes and edges using different arrangement styles. The "organic" layout style is an implementation of a force-directed layout algorithm similar in nature to the one used in Neo4j's browser application. But there are a lot more layout algorithms available that can give better visualizations for certain types of graph structures and diagrams. Depending on the settings and the structure of the problem, some of the algorithms take only seconds, while more complex implementations can also bring your JavaScript engine to its knees. The Java- and .NET-based variants still perform quite a bit better, as of today, but the JavaScript engines are catching up.
You can play with these algorithms and settings in this online demo.
Disclaimer: I work for yWorks, which is the maker of these libraries, but I do not represent my employer on SO.
I would take a look at http://neo4j.org/ - it's open source, which is beneficial in your case since you can customize it to your needs. The GitHub account can be found here.