How to input constraints on parameters? - particle-swarm

I am currently doing a materials prediction project using PSO and I was wondering if anyone can provide any expertise. I utilize PSO as my method of operation but I am trying to handle a constraint
For eg: I have 17 input parameters for the algorithm to take references from and make predictions. However, these 17 elements should not exceed 100%. May I know how do I input the constraints?
enter image description here

Apply the constraint before the objective function is updated but after the particle position has been updated. Let say after velocity/location update, your particle is now located at [5,5] while your constraint (Ub) is [4,3]. Simply modify your particle location to [4,3]. Other people use more exotic method such as 'bouncing', like when hitting a wall with a ball. E.g., original particle location is [3,3] with velc of [4,2] (same Ub). Due to the constraint and bouncing, the particle is now at [0,1] (3+((4-3)-3)).
Code example for the former method
% Fixing the Boundary
bindex_up = x(pop_iter,:) > ub;
bindex_down = x(pop_iter,:) < lb;
x(pop_iter,bindex_up)=ub(bindex_up);
x(pop_iter,bindex_down)=lb(bindex_down);
Do not change the particle position, but if the particle location is outside the Ub or Lb, apply a penalty to the fitness/obj function.
Nature Inspired Metaheuristic has more details on this subject (Constrain handling) https://dl.acm.org/doi/10.5555/1628847

Related

How to perform dynamic optimization for a nonlinear discrete optimization problem with nonlinear constraints, using non-linear solvers like SNOPT?

I am new to the field of optimization and I need help in the following optimization problem. I have tried to solve it using normal coding to make sure that I got he correct results. However, the results I got are different and I am not sure my way of analysis is correct or not. This is a short description of the problem:
The objective function shown in the picture is used to find the optimal temperature of the insulating system that minimizes the total cost over a given horizon.
[This image provides the mathematical description of the objective function and the constraints] (https://i.stack.imgur.com/yidrO.png)
The data of the problems are as follow:
1-
Problem data:
A=1.07×10^8
h=1
T_ref=87.5
N=20
p1=0.001;
p2=0.0037;
This is the curve I want to obtain
2- Optimization variable:
u_t
3- Model type:
The model is a nonlinear cost function with non-linear constraints and it is solved using non-linear solver SNOPT.
4-The meaning of the symbols in the objective and constrained functions
The optimization is performed over a prediction horizon of N years.
T_ref is The reference temperature.
Represent the degree of polymerization in the kth year.
X_DP Represents the temperature of the insulating system in the kth year.
h is the time step (1 year) of the discrete-time model.
R is the ratio of the load loss at the rated load to the no-load loss.
E is the activation energy.
A is the pre-exponential constant.
beta is a linear coefficient representing the cost due to the decrement of the temperature.
I have developed the source code in MATLAB, this code is used to check if my analysis is correct or not.
I have tried to initialize the Ut value in its increasing or decreasing states so that I can have the curves similar to the original one. [This is the curve I obtained] (https://i.stack.imgur.com/KVv2q.png)
I have tried to simulate the problem using conventional coding without optimization and I got the figure shown above.
close all; clear all;
h=1;
N=20;
a=250;
R=8.314;
A=1.07*10^8;
E=111000;
Tref=87.5;
p1=0.0019;
p2=0.0037;
p3=0.0037;
Utt=[80,80.7894736842105,81.5789473684211,82.3684210526316,83.1578947368421,... % The value of Utt given here represent the temperature increament over a predictive horizon.
83.9473684210526,84.7368421052632,85.5263157894737,86.3157894736842,...
87.1052631578947,87.8947368421053,88.6842105263158,89.4736842105263,...
90.2631578947369,91.0526315789474,91.8421052631579,92.6315789473684,...
93.4210526315790,94.2105263157895,95];
Utt1 = [95,94.2105263157895,93.4210526315790,92.6315789473684,91.8421052631579,... % The value of Utt1 given here represent the temperature decreament over a predictive horizon.
91.0526315789474,90.2631578947369,89.4736842105263,88.6842105263158,...
87.8947368421053,87.1052631578947,86.3157894736842,85.5263157894737,...
84.7368421052632,83.9473684210526,83.1578947368421,82.3684210526316,...
81.5789473684211,80.7894736842105,80];
Ut1=zeros(1,N);
Ut2=zeros(1,N);
Xdp =zeros(N,N);
Xdp(1,1)=1000;
Xdp1 =zeros(N,N);
Xdp1(1,1)=1000;
for L=1:N-1
for k=1:N-1
%vt(k+L)=Ut(k-L+1);
Xdq(k+1,L) =(1/Xdp(k,L))+A*exp((-1*E)/(R*(Utt(k)+273)))*24*365*h;
Xdp(k+1,L)=1/(Xdq(k+1,L));
Xdp(k,L+1)=1/(Xdq(k+1,L));
Xdq1(k+1,L) =(1/Xdp1(k,L))+A*exp((-1*E)/(R*(Utt1(k)+273)))*24*365*h;
Xdp1(k+1,L)=1/(Xdq1(k+1,L));
Xdp1(k,L+1)=1/(Xdq1(k+1,L));
end
end
% MATLAB code
for j =1:N-1
Ut1(j)= -p1*(Utt(j)-Tref);
Ut2(j)= -p2*(Utt1(j)-Tref);
end
sum00=sum(Ut1);
sum01=sum(Ut2);
X1=1./Xdp(:,1);
Xf=1./Xdp(:,20);
Total= table(X1,Xf);
Tdiff =a*(Total.Xf-Total.X1);
X22=1./Xdp1(:,1);
X2f=1./Xdp1(:,20);
Total22= table(X22,X2f);
Tdiff22 =a*(Total22.X2f-Total22.X22);
obj=(sum00+(Tdiff));
ob1 = min(obj);
obj2=sum01+Tdiff22;
ob2 = min(obj2);
plot(Utt,obj,'-o');
hold on
plot(Utt1,obj)

Usage of scipy.optimize.fmin_slsqp for Integer design variable

I'm trying to use the scipy.optimize.slsqp for an industrial-related constrained optimization. A highly non-linear FE model is used to generate the objective and the constraint functions, and their derivatives/sensitivities.
The objective function is in the form:
obj=a number calculated from the FE model
A series of constraint functions are set, and most of them are in the form:
cons = real number i - real number j (calculated from the FE model)
I would like to try to restrict the design variables to integers as that would be what input into the plant machine.
Another consideration is to have a log file recording what design variable have been tried. if a set of design variable (integer) is already tried for, skip the calculation, perturb the design variable and try again. By limiting the design variable to integers, we are able to limit the number of trials (while leaving the design variable to real, a change in the e.g. 8th decimal point could be regarded as untried values).
I'm using SLSQP as it is one of the SQP method (please correct me if I am wrong), and the it is said to be powerful to deal with nonlinear problems. I understand the SLSQP algorithm is a gradient-based optimizer and there is no way I can implement the restriction of the design variables being integer in the algorithm coded in FORTRAN. So instead, I modified the slsqp.py file to the following (where it calls the python extension built from the FORTRAN algorithm):
slsqp(m, meq, x, xl, xu, fx, c, g, a, acc, majiter, mode, w, jw)
for i in range(len(x)):
x[i]=int(x[i])
The code stops at the 2nd iteration and output the following:
Optimization terminated successfully. (Exit mode 0)
Current function value: -1.286621577077517
Iterations: 7
Function evaluations: 0
Gradient evaluations: 0
However, one of the constraint function is violated (value at about -5.2, while the default convergence criterion of the optimization code = 10^-6).
Questions:
1. Since the FE model is highly nonlinear, I think it's safe to assume the objective and constraint functions will be highly nonlinear too (regardless of their mathematical form). Is that correct?
2. Based on the convergence criterion of the slsqp algorithm(please see below), one of which requires the sum of all constraint violations(absolute values) to be less than a very small value (10^-6), how could the optimization exit with successful termination message?
IF ((ABS(f-f0).LT.acc .OR. dnrm2_(n,s,1).LT.acc).AND. h3.LT.acc)
Any help or advice is appreciated. Thank you.

Maya Mel project lookAt target into place after motion capture import

I have a facial animation rig which I am driving in two different manners: it has an artist UI in the Maya viewports as is common for interactive animating, and I've connected it up with the FaceShift markerless motion capture system.
I envision a workflow where a performance is captured, imported into Maya, sample data is smoothed and reduced, and then an animator takes over for finishing.
Our face rig has the eye gaze controlled by a mini-hierarchy of three objects (global lookAtTarget and a left and right eye offset).
Because the eye gazes are controlled by this LookAt setup, they need to be disabled when eye-gaze-including motion capture data is imported.
After the motion capture data is imported, the eye gazes are now set with motion capture rotations.
I am seeking a short Mel routine that does the following: it marches through the motion capture eye rotation samples, backwards calculates and sets each eyes' LookAt target position, and averages the two to get the global LookAt target's position.
After that Mel routine is run, I can turn the eye's LookAt constraint back on, the eyes gaze control returns to rig, nothing has changed visually, and the animator will have their eye UI working in the Maya viewport again.
I'm thinking this should be common logic for anyone doing facial mocap. Anyone got anything like this already?
How good is the eye tracking in the mocap? There may be issues if the targets are far away: depending on the sampling of the data, you may get 'crazy eyes' which seem not to converge, or jumpy data. If that's the case you may need to junk the eye data altogether, or smooth it heavily before retargeting.
To find the convergence of the two eyes, you try this (like #julian I'm using locators, etc since doing all the math in mel would be irritating).
1) constrain a locator to one eye so that one axis oriented along the look vector and the other is in the plane of the second eye. Let's say the eye aims down Z and the second eye is in the XZ plane
2) make a second locator, parented to the first, and constrained to the second eye in the same way: pointing down Z, with the first eye in the XZ plane
3) the local Y rotation of the second locator is the angle of convergence between the two eyes.
4) Figure out the focal distance using the law of sines and a cheat for the offset of the second eye relative to the first. The local X distance of the second eye is one leg of a right triangle. The angles of the triangle are the convergence angle from #3 and 90- the convergence angle. In other words:
focal distance eye_locator2.tx
-------------- = ---------------
sin(90 - eye_locator2.ry) sin( eye_locator2.ry)
so algebraically:
focal distance = eye_locator2.tx * sin(90 - eye_locator2.ry) / sin( eye_locator2.ry)
You'll have to subtract the local Z of eye2, since the triangle we're solving is shifted backwards or forwards by that much:
focal distance = (eye_locator2.tx * sin(90 - eye_locator2.ry) / sin( eye_locator2.ry)) - eye_locator2.tz
5) Position the target along the local Z direction of the eye locator at the distance derived above. It sounds like the actual control uses two look targets that can be moved apart to avoid crosseyes - it's kind of judgement call to know how much to use that vs the actual convergence distance. For lots of real world data the convergence may be way too far away for animator convenience: a target 30 meters away is pretty impractical to work with, but might be simulated with a target 10 meters away with a big spread. Unfortunately there's no empirical answer for that one - it's a judgement call.
I don't have this script but it would be fairly simple. Can you provide an example maya scene? You don't need any math. Here's how you could go about it:
Assume the axis pointing through the pupil is positive X, and focal length is 10 units.
Create 2 locators. Parent one to each eye. Set their translations to
(10, 0, 0).
Create 2 more locators in worldspace. Point constrain them to the others.
Create a plusMinusAverage node.
Connect the worldspace locator's translations to plusMinusAverage1 input 1 and 2
Create another locator (the lookAt)
Connect the output of plusMinusAverage1 to the translation of the lookAt locator.
Bake the translation of the lookAt locator.
Delete the other 4 locators.
Aim constrain the eyes' X axes to the lookAt.
This can all be done in a script using commands: spaceLocator, createNode, connectAttr, setAttr, bakeSimulation, pointConstraint, aimConstraint, delete.
The solution ended up being quite simple. The situation is motion capture data on the rotation nodes of the eyes, while simultaneously wanting (non-technical) animator over-ride control for the eye gaze. Within Maya, constraints have a weight factor: a parametric 0-1 value controlling the influence of the constraint. The solution is for the animator to simply key the eyes' lookAt constraint weight to 1 when they want control over the eye gaze, key those same weights to 0 when they want the motion captured eye gaze, and use a smooth transition of those constraint weights to mask the transition. This is better than my original idea described above, because the original motion capture data remains in place, available as reference, allowing the animator to switch back and forth if need be.

How to calculate continuous effect of gravitational pull between simulated planets

so I am making a simple simulation of different planets with individual velocity flying around space and orbiting each other.
I plan to simulate their pull on each other by considering each planet as projecting their own "gravity vector field." Each time step I'm going to add the vectors outputted from each planets individual vector field equation (V = -xj + (-yj) or some notation like it) except the one being effected in the calculation, and use the effected planets position as input to the equations.
However this would inaccurate, and does not consider the gravitational pull as continuous and constant. Bow do I calculate the movement of my planets if each is continuously effecting the others?
Thanks!
In addition to what Blender writes about using Newton's equations, you need to consider how you will be integrating over your "acceleration field" (as you call it in the comment to his answer).
The easiest way is to use Euler's Method. The problem with that is it rapidly diverges, but it has the advantage of being easy to code and to be reasonably fast.
If you are looking for better accuracy, and are willing to sacrifice some performance, one of the Runge-Kutta methods (probably RK4) would ordinarily be a good choice. I'll caution you that if your "acceleration field" is dynamic (i.e. it changes over time ... perhaps as a result of planets moving in their orbits) RK4 will be a challenge.
Update (Based on Comment / Question Below):
If you want to calculate the force vector Fi(tn) at some time step tn applied to a specific object i, then you need to compute the force contributed by all of the other objects within your simulation using the equation Blender references. That is for each object, i, you figure out how all of the other objects pull (apply force) and those vectors when summed will be the aggregate force vector applied to i. Algorithmically this looks something like:
for each object i
Fi(tn) = 0
for each object j ≠ i
Fi(tn) = Fi(tn) + G * mi * mj / |pi(tn)-pj(tn)|2
Where pi(tn) and pj(tn) are the positions of objects i and j at time tn respectively and the | | is the standard Euclidean (l2) normal ... i.e. the Euclidean distance between the two objects. Also, G is the gravitational constant.
Euler's Method breaks the simulation into discrete time slices. It looks at the current state and in the case of your example, considers all of the forces applied in aggregate to all of the objects within your simulation and then applies those forces as a constant over the period of the time slice. When using
ai(tn) = Fi(tn)/mi
(ai(tn) = acceleration vector at time tn applied to object i, Fi(tn) is the force vector applied to object i at time tn, and mi is the mass of object i), the force vector (and therefore the acceleration vector) is held constant for the duration of the time slice. In your case, if you really have another method of computing the acceleration, you won't need to compute the force, and can instead directly compute the acceleration. In either event, with the acceleration being held as constant, the position at time tn+1, p(tn+1) and velocity at time tn+1, v(tn+1), of the object will be given by:
pi(tn+1) = 0.5*ai(tn)*(tn+1-tn)2 + vi(tn)*(tn+1-tn)+pi(tn)
vi(tn+1) = ai(tn+1)*(tn+1-tn) + vi(tn)
The RK4 method fits the driver of your system to a 2nd degree polynomial which better approximates its behavior. The details are at the wikipedia site I referenced above, and there are a number of other resources you should be able to locate on the web. The basic idea is that instead of picking a single force value for a particular timeslice, you compute four force vectors at specific times and then fit the force vector to the 2nd degree polynomial. That's fine if your field of force vectors doesn't change between time slices. If you're using gravity to derive the vector field, and the objects which are the gravitational sources move, then you need to compute their positions at each of the four sub-intervals in order compute the force vectors. It can be done, but your performance is going to be quite a bit poorer than using Euler's method. On the plus side, you get more accurate motion of the objects relative to each other. So, it's a challenge in the sense that it's computationally expensive, and it's a bit of a pain to figure out where all the objects are supposed to be for your four samples during the time slice of your iteration.
There is no such thing as "continuous" when dealing with computers, so you'll have to approximate continuity with very small intervals of time.
That being said, why are you using a vector field? What's wrong with Newton?
And the sum of the forces on an object is that above equation. Equate the two and solve for a
So you'll just have to loop over all the objects one by one and find the acceleration on it.

Continuous collision detection between two moving tetrahedra

My question is fairly simple. I have two tetrahedra, each with a current position, a linear speed in space, an angular velocity and a center of mass (center of rotation, actually).
Having this data, I am trying to find a (fast) algorithm which would precisely determine (1) whether they would collide at some point in time, and if it is the case, (2) after how much time they collided and (3) the point of collision.
Most people would solve this by doing triangle-triangle collision detection, but this would waste a few CPU cycles on redundant operations such as checking the same edge of one tetrahedron against the same edge of the other tetrahedron upon checking up different triangles. This only means I'll optimize things a bit. Nothing to worry about.
The problem is that I am not aware of any public CCD (continuous collision detection) triangle-triangle algorithm which takes self-rotation in account.
Therefore, I need an algorithm which would be inputted the following data:
vertex data for three triangles
position and center of rotation/mass
linear velocity and angular velocity
And would output the following:
Whether there is a collision
After how much time the collision occurred
In which point in space the collision occurred
Thanks in advance for your help.
The commonly used discrete collision detection would check the triangles of each shape for collision, over successive discrete points in time. While straightforward to compute, it could miss a fast moving object hitting another one, due to the collision happening between discrete points in time tested.
Continuous collision detection would first compute the volumes traced by each triangle over an infinity of time. For a triangle moving at constant speed and without rotation, this volume could look like a triangular prism. CCD would then check for collision between the volumes, and finally trace back if and at what time the triangles actually shared the same space.
When angular velocity is introduced, the volume traced by each triangle no longer looks like a prism. It might look more like the shape of a screw, like a strand of DNA, or some other non-trivial shapes you might get by rotating a triangle around some arbitrary axis while dragging it linearly. Computing the shape of such volume is no easy feat.
One approach might first compute the sphere that contains an entire tetrahedron when it is rotating at the given angular velocity vector, if it was not moving linearly. You can compute a rotation circle for each vertex, and derive the sphere from that. Given a sphere, we can now approximate the extruded CCD volume as a cylinder with the radius of the sphere and progressing along the linear velocity vector. Finding collisions of such cylinders gets us a first approximation for an area to search for collisions in.
A second, complementary approach might attempt to approximate the actual volume traced by each triangle by breaking it down into small, almost-prismatic sub-volumes. It would take the triangle positions at two increments of time, and add surfaces generated by tracing the triangle vertices at those moments. It's an approximation because it connects a straight line rather than an actual curve. For the approximation to avoid gross errors, the duration between each successive moments needs to be short enough such that the triangle only completes a small fraction of a rotation. The duration can be derived from the angular velocity.
The second approach creates many more polygons! You can use the first approach to limit the search volume, and then use the second to get higher precision.
If you're solving this for a game engine, you might find the precision of above sufficient (I would still shudder at the computational cost). If, rather, you're writing a CAD program or working on your thesis, you might find it less than satisfying. In the latter case, you might want to refine the second approach, perhaps by a better geometric description of the volume occupied by a turning, moving triangle -- when limited to a small turn angle.
I have spent quite a lot of time wondering about geometry problems like this one, and it seems like accurate solutions, despite their simple statements, are way too complicated to be practical, even for analogous 2D cases.
But intuitively I see that such solutions do exist when you consider linear translation velocities and linear angular velocities. Don't think you'll find the answer on the web or in any book because what we're talking about here are special, yet complex, cases. An iterative solution is probably what you want anyway -- the rest of the world is satisfied with those, so why shouldn't you be?
If you were trying to collide non-rotating tetrahedra, I'd suggest a taking the Minkowski sum and performing a ray check, but that won't work with rotation.
The best I can come up with is to perform swept-sphere collision using their bounding spheres to give you a range of times to check using bisection or what-have-you.
Here's an outline of a closed-form mathematical approach. Each element of this will be easy to express individually, and the final combination of these would be a closed form expression if one could ever write it out:
1) The equation of motion for each point of the tetrahedra is fairly simple in it's own coordinate system. The motion of the center of mass (CM) will just move smoothly along a straight line and the corner points will rotate around an axis through the CM, assumed to be the z-axis here, so the equation for each corner point (parameterized by time, t) is p = vt + x + r(sin(wt+s)i + cos(wt + s)j ), where v is the vector velocity of the center of mass; r is the radius of the projection onto the x-y plane; i, j, and k are the x, y and z unit vectors; and x and s account for the starting position and phase of rotation at t=0.
2) Note that each object has it's own coordinate system to easily represent the motion, but to compare them you'll need to rotate each into a common coordinate system, which may as well be the coordinate system of the screen. (Note though that the different coordinate systems are fixed in space and not traveling with the tetrahedra.) So determine the rotation matrices and apply them to each trajectory (i.e. the points and CM of each of the tetrahedra).
3) Now you have an equation for each trajectory all within the same coordinate system and you need to find the times of the intersections. This can be found by testing whether any of the line segments from the points to the CM of a tetrahedron intersects the any of the triangles of another. This also has a closed-form expression, as can be found here.
Layering these steps will make for terribly ugly equations, but it wouldn't be hard to solve them computationally (although with the rotation of the tetrahedra you need to be sure not to get stuck in a local minimum). Another option might be to plug it into something like Mathematica to do the cranking for you. (Not all problems have easy answers.)
Sorry I'm not a math boff and have no idea what the correct terminology is. Hope my poor terms don't hide my meaning too much.
Pick some arbitrary timestep.
Compute the bounds of each shape in two dimensions perpendicular to the axis it is moving on for the timestep.
For a timestep:
If the shaft of those bounds for any two objects intersect, half timestep and start recurse in.
A kind of binary search of increasingly fine precision to discover the point at which a finite intersection occurs.
Your problem can be cast into a linear programming problem and solved exactly.
First, suppose (p0,p1,p2,p3) are the vertexes at time t0, and (q0,q1,q2,q3) are the vertexes at time t1 for the first tetrahedron, then in 4d space-time, they fill the following 4d closed volume
V = { (r,t) | (r,t) = a0 (p0,t0) + … + a3 (p3,t0) + b0 (q0,t1) + … + b3 (q3,t1) }
Here the a0...a3 and b0…b3 parameters are in the interval [0,1] and sum to 1:
a0+a1+a2+a3+b0+b1+b2+b3=1
The second tetrahedron is similarly a convex polygon (add a ‘ to everything above to define V’ the 4d volume for that moving tetrahedron.
Now the intersection of two convex polygon is a convex polygon. The first time this happens would satisfy the following linear programming problem:
If (p0,p1,p2,p3) moves to (q0,q1,q2,q3)
and (p0’,p1’,p2’,p3’) moves to (q0’,q1’,q2’,q3’)
then the first time of intersection happens at points/times (r,t):
Minimize t0*(a0+a1+a2+a3)+t1*(b0+b1+b2+b3) subject to
0 <= ak <=1, 0<=bk <=1, 0 <= ak’ <=1, 0<=bk’ <=1, k=0..4
a0*(p0,t0) + … + a3*(p3,t0) + b0*(q0,t1) + … + b3*(q3,t1)
= a0’*(p0’,t0) + … + a3’*(p3’,t0) + b0’*(q0’,t1) + … + b3’*(q3’,t1)
The last is actually 4 equations, one for each dimension of (r,t).
This is a total of 20 linear constraints of the 16 values ak,bk,ak', and bk'.
If there is a solution, then
(r,t)= a0*(p0,t0) + … + a3*(p3,t0) + b0*(q0,t1) + … + b3*(q3,t1)
Is a point of first intersection. Otherwise they do not intersect.
Thought about this in the past but lost interest... The best way to go about solving it would be to abstract out one object.
Make a coordinate system where the first tetrahedron is the center (barycentric coords or a skewed system with one point as the origin) and abstract out the rotation by making the other tetrahedron rotate around the center. This should give you parametric equations if you make the rotation times time.
Add the movement of the center of mass towards the first and its spin and you have a set of equations for movement relative to the first (distance).
Solve for t where the distance equals zero.
Obviously with this method the more effects you add (like wind resistance) the messier the equations get buts its still probably the simplest (almost every other collision technique uses this method of abstraction). The biggest problem is if you add any effects that have feedback with no analytical solution the whole equation becomes unsolvable.
Note: If you go the route of of a skewed system watch out for pitfalls with distance. You must be in the right octant! This method favors vectors and quaternions though, while the barycentric coords favors matrices. So pick whichever your system uses most effectively.