Taking the difference of 2 nodes in a decision problem while keeping the model as an MILP - optimization

To explain the question, it's best to start with this picture:
I am modeling an optimization decision problem, and a feature that I'm trying to implement is heat transfer between the process stages (a = 1, 2), taking into account which equipment type is chosen (j = 1, 2, 3) via the binary decision variable y.
The temperatures of the equipment are fixed values, and my goal is to obtain (in the case of the picture) dT = 120 - 70 = 50 while keeping the temperature difference a parameter (I want to keep the problem linear and need to multiply the temperature difference by another variable later on).
Things I have tried:
dT = T[a,j] - T[a-1,j]
(this obviously gives T[a-1,j] = 80, which is incorrect)
T[a-1] = sum(T[a-1,j] * y[a-1,j] for j in (1, 2, 3))
This will make the problem non-linear when I multiply it by another variable.
I am using pyomo with the linear solver glpk. Thank you for reading my post; if someone could help me with this, it would be greatly appreciated!

If you only have 2 stages and 3 pieces of equipment at each stage, you could reformulate: let a binary decision variable Y[i] represent each of the 9 possible connections, and let delta_T[i] be a parameter that holds the temperature difference associated with the same 9 connections, which can easily be calculated up front and stored as model data (a sketch of this is given below).
If you want to keep it double-indexed, and assuming that only 1 piece of equipment will be selected at each stage, you could take the sum-product of the selection variables and temperatures at each stage and subtract them:
dT[a] = sum(T[a, j]*y[a, j] for j in J) - sum(T[a-1, j]*y[a-1, j] for j in J)
for a ∈ {2, 3, ..., N}
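To make the first (connection-based) idea concrete, here is a hedged Pyomo sketch; the temperature values and component names are illustrative assumptions, not taken from the picture:

from pyomo.environ import ConcreteModel, Set, Param, Var, Binary, Constraint

# Assumed fixed temperatures per (stage, equipment type); only the 120 and 70
# of the chosen pair come from the question, the rest are made up.
T = {(1, 1): 120, (1, 2): 100, (1, 3): 80,
     (2, 1): 70,  (2, 2): 90,  (2, 3): 110}

m = ConcreteModel()
m.J = Set(initialize=[1, 2, 3])

# Y[i, j] = 1 if equipment i is chosen at stage 1 and equipment j at stage 2
m.Y = Var(m.J, m.J, domain=Binary)

# delta_T is pure data computed up front, so it never makes the model nonlinear
m.delta_T = Param(m.J, m.J,
                  initialize={(i, j): T[1, i] - T[2, j]
                              for i in [1, 2, 3] for j in [1, 2, 3]})

# exactly one connection is chosen between the two stages
m.one_connection = Constraint(expr=sum(m.Y[i, j] for i in m.J for j in m.J) == 1)

# If the original y[a, j] variables are kept, they can be linked with
#   sum(Y[i, j] for j in J) == y[1, i]  and  sum(Y[i, j] for i in J) == y[2, j]

Note that a later term like delta_T[i, j] * Y[i, j] * (another variable) still contains a binary-times-continuous product, but because delta_T is data, that product can be linearized exactly with standard big-M constraints instead of turning the model into a genuine nonlinear program.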

Gekko Variable Definition - Primary vs. Utility Decision Variable

I am trying to formulate and solve an optimization problem based on an article. The authors introduced 2 decision variables: the power of station i at time t, P_i,t, and a binary variable X_i,n, which is 1 if vehicle n is assigned to station i.
They introduced some other variables, called utility variables. For instance, the energy delivered from station i to vehicle n up to time t, E_i,t,n, which is calculated from the primary decision variables and a few fixed parameters.
My question is: should I define the utility variables as Gekko variables? If yes, which type is more appropriate?
I = 4 # number of stations
T = 24 # hours of simulation
N = 5 # number of vehicles
p = m.Array(m.Var,(I,T),lb=0,ub= params.ev.max_power)
x = m.Array(m.Var,(I,N),lb=0,ub=1, integer = True)
Should I define E as follows to solve these equations, as in the example below? This introduces extra variables that are not primary decision variables and that are calculated from other terms depending on the primary decision variables.
E = m.Array(m.Var,(I,T,N),lb=0)
for i in range(I):
    for n in range(N):
        for t in range(T):
            m.Equation(E[i][t][n] >= np.sum(0.25 * availability[n, :t] * p[i, :t]) - (M * (1 - x[i][n])))
            m.Equation(E[i][t][n] <= np.sum(0.25 * availability[n, :t] * p[i, :t]) + (M * (1 - x[i][n])))
            m.Equation(E[i][t][n] <= M * x[i][n])
            m.Equation(E[i][t][n] >= -M * x[i][n])
All of those variable definitions and equations look correct. Here are a few suggestions:
There is no availability[] variable defined yet. If availability is a function of other decision variables, then it is generally more efficient to use an m.Intermediate() definition to define it.
As the total number of decision variables increases, there is often a large increase in computational time. I recommend starting with a small problem initially and then scaling up to the larger-sized problem.
Try the gekko m.sum() instead of sum or np.sum() for potentially more efficient calculations. Using m.sum() does increase the model compile time but generally decreases the optimization solve time, so it is a trade-off.
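For illustration, here is a hedged sketch of how the m.sum() and m.Intermediate() suggestions could look on the energy constraints above (assuming, as in the question, that availability is a fixed NumPy array and M is a big-M constant):

E = m.Array(m.Var, (I, T, N), lb=0)
for i in range(I):
    for n in range(N):
        for t in range(T):
            # cumulative energy delivered by station i to vehicle n up to hour t;
            # m.Intermediate() stores the repeated expression once instead of
            # rebuilding it inside each of the four equations
            if t > 0:
                delivered = m.Intermediate(m.sum([0.25 * availability[n, k] * p[i, k]
                                                  for k in range(t)]))
            else:
                delivered = 0
            m.Equation(E[i][t][n] >= delivered - M * (1 - x[i][n]))
            m.Equation(E[i][t][n] <= delivered + M * (1 - x[i][n]))
            m.Equation(E[i][t][n] <= M * x[i][n])
            m.Equation(E[i][t][n] >= -M * x[i][n])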

Restrain variable to a bounded region (interval) formulation in Mixed Integer Linear Programming

I have 4 non-negative real variables: A, B, C and X. Based on the current problem that I have, I notice that the variable X must belong to the interval [B, C], and the relation is a bunch of if-else conditions like this:
If A < B:
    x = B
elseif A > C:
    x = C
elseif B <= A <= C:
    x = A
As you can see, it is quite difficult to reformulate this as a Mixed Integer Programming problem with corresponding decision variables (d1, d2 and d3). I have tried reading some instructions regarding if-then formulations using the big-M method at this site:
https://www.math.cuhk.edu.hk/course_builder/1415/math3220/L2%20(without%20solution).pdf but it seems that this problem is more challenging than their tutorial.
Could you kindly provide me with a formulation for this situation?
Thank you very much!
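For reference, a sketch of one standard big-M encoding of this clamp, assuming B <= C, binary case indicators d1, d2, d3, and M a valid upper bound on |A - B|, |A - C|, |X - A|, |X - B| and |X - C|:

d1 + d2 + d3 = 1                               (exactly one case applies)
A <= B + M*(1 - d1)                            (d1 = 1 is allowed only if A <= B)
A >= C - M*(1 - d2)                            (d2 = 1 is allowed only if A >= C)
B - M*(1 - d3) <= A <= C + M*(1 - d3)          (d3 = 1 is allowed only if B <= A <= C)
-M*(1 - d1) <= X - B <= M*(1 - d1)             (d1 = 1 forces X = B)
-M*(1 - d2) <= X - C <= M*(1 - d2)             (d2 = 1 forces X = C)
-M*(1 - d3) <= X - A <= M*(1 - d3)             (d3 = 1 forces X = A)

At the boundaries A = B or A = C two cases are feasible at once, but they force the same X, so the strict inequalities of the original if-else do not cause trouble.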

Ranking Big O Functions By Complexity

I am trying to rank these functions — 2n, n100, (n + 1)2, n·lg(n), 100n, n!, lg(n), and n99 + n98 — so that each function is the big-O of the next function, but I do not know a method of determining if one function is the big-O of another. I'd really appreciate if someone could explain how I would go about doing this.
Assuming you have some programming background, say you have the code below:
void SomeMethod(int x)
{
    for (int i = 0; i < x; i++)
    {
        // Do Some Work
    }
}
Notice that the loop runs for x iterations. Generalizing, we say that you will get the solution after N iterations (where N is the value of x, e.g. the number of items in the array/input).
So this type of implementation/algorithm is said to have a time complexity of order N, written as O(n).
Similarly, a nested for (2 loops) is O(n squared) => O(n^2).
If you make binary decisions that reduce the possibilities into halves and pick only one half for the solution, then the complexity is O(log n).
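A rough Python sketch of those last two cases (a nested loop doing about n*n units of work, and repeated halving doing about log2(n) steps):

def nested_loops(n):
    count = 0
    for i in range(n):           # outer loop: n iterations
        for j in range(n):       # inner loop: n iterations per outer iteration
            count += 1           # total work ~ n * n  ->  O(n^2)
    return count

def halvings(n):
    steps = 0
    while n > 1:                 # the search space is halved each time -> O(log n)
        n //= 2
        steps += 1
    return steps

print(nested_loops(8), halvings(8))   # prints: 64 3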
Found this link to be interesting.
For: Himanshu
While the link explains very well how the log2(N) complexity comes into the picture, let me put the same in my own words.
Suppose you have a pre-sorted list like:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Now, you have been asked to find whether 10 exists in the list. The first solution that comes to mind is to loop through the list and find it, which means O(n). Can it be made better?
Approach 1:
As we know, the list is already sorted in ascending order, so:
Break the list at the center (say at 5).
Compare the value at the center (5) with the search value (10).
If center value == search value => item found.
If center < search value => repeat the above steps for the right half of the list.
If center > search value => repeat the above steps for the left half of the list.
For this simple example we will find 10 after doing 3 or 4 breaks (at 5, then 8, then 9), depending on how you implement it.
That means for N = 10 items the search took 3 (or 4) lookups. Putting some mathematics over it:
2^3 + 2 = 10; for simplicity's sake let's say
2^3 ≈ 10 (nearly equal, just to keep the base-2 logarithm simple)
This can be rewritten as:
log2(10) ≈ 3 (again, nearly)
We know 10 was the number of items and 3 was the number of breaks/lookups we had to do to find the item. In general it becomes
log N = K
and that is the complexity of the algorithm above: O(log N).
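A small Python sketch of the halving search described above, counting how many lookups it takes to find 10 in that list:

def binary_search(sorted_list, target):
    lo, hi = 0, len(sorted_list) - 1
    lookups = 0
    while lo <= hi:
        lookups += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, lookups
        elif sorted_list[mid] < target:
            lo = mid + 1          # keep only the right half
        else:
            hi = mid - 1          # keep only the left half
    return -1, lookups

print(binary_search(list(range(1, 11)), 10))   # (9, 4): found after 4 lookups, roughly log2(10)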
Generally, when loops are nested we multiply their bounds: O(outer-loop bound * inner-loop bound), and so on. E.g. for (i to n) { for (j to k) { } }: for i = 1 the inner loop runs j = 1 to k, for i = 2 it again runs j = 1 to k, and so on, so O(max(i) * max(j)) implies O(n*k). Further, if you want to rank functions you need to recall how basic operations compare, e.g. O(n + n) (addition) < O(n * n) (multiplication); taking a logarithm shrinks its argument, giving O(log n) < O(n) < O(n + n) (addition) < O(n * n) (multiplication), and so on. In this way you can compare the other functions as well.
A better approach is to first write out a generalised expression for the function whose complexity you are calculating, like n! = n*(n-1)*(n-2)*...*(n-(n-1)); similarly, O(n^k) is the generalised form of polynomial worst-case complexity, so you can compare: if k = 2 then O(n^k) = O(n*n).

How are leaves' scores calculated in these XGBoost trees?

I am looking at the image below.
Can someone explain how they are calculated?
I thought it was -1 for a No and +1 for a Yes, but then I can't figure out how the little girl gets +0.1. And that doesn't work for tree 2 either.
I agree with @user1808924. I think it's still worth explaining how XGBoost works under the hood, though.
What is the meaning of the leaves' scores?
First, the scores you see in the leaves are not probabilities. They are regression values.
In gradient boosted trees there are only regression trees. To predict whether a person likes computer games or not, the model (XGBoost) treats it as a regression problem: the labels become 1.0 for Yes and 0.0 for No, and XGBoost fits regression trees to them. The trees then return values like +2, +0.1, -1, which we see at the leaves.
We sum up all the "raw scores" and then convert them to a probability by applying the sigmoid function.
How is the score in a leaf calculated?
The leaf score w is calculated by this formula:
w = - sum(g_i) / (sum(h_i) + lambda)
where g and h are the first derivative (gradient) and the second derivative (Hessian).
For the sake of demonstration, let's pick the leaf of the first tree that has the value -1. Suppose our objective function is the mean squared error (MSE) and we choose lambda = 0.
With MSE, we have g = (y_pred - y_true) and h = 1. I just dropped the constant 2; in fact, you can keep it and the result stays the same. Another note: at the t-th iteration, y_pred is the prediction we have after the (t-1)-th iteration (the best we've got until that time).
Some assumptions:
The girl, grandpa, and grandma do NOT like computer games (y_true = 0 for each person).
The initial prediction is 1 for all 3 people (i.e., we guess everyone loves games). Note that I chose 1 on purpose to get the same result as the first tree; in fact, the initial prediction can be the mean (default for mean squared error), the median (default for mean absolute error), ... of all the observations' labels in the leaf.
We calculate g and h for each individual:
g_girl = y_pred - y_true = 1 - 0 = 1. Similarly, we have g_grandpa = g_grandma = 1.
h_girl = h_grandpa = h_grandma = 1
Putting the g, h values into the formula above, we have:
w = -( (g_girl + g_grandpa + g_grandma) / (h_girl + h_grandpa + h_grandma) ) = -1
Last note: In practice, the score in leaf which we see when plotting the tree is a bit different. It will be multiplied by the learning rate, i.e., w * learning_rate.
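A tiny Python check of that arithmetic, under the same assumptions (squared-error objective, lambda = 0, initial prediction 1 for all three people):

y_true = [0.0, 0.0, 0.0]                      # girl, grandpa, grandma do not like games
y_pred = [1.0, 1.0, 1.0]                      # assumed initial prediction
lam = 0.0                                     # regularization term lambda

g = [p - t for p, t in zip(y_pred, y_true)]   # gradients for squared error
h = [1.0 for _ in y_true]                     # Hessians for squared error
w = -sum(g) / (sum(h) + lam)                  # leaf weight formula
print(w)                                      # -1.0, matching the -1 leaf of tree 1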
The values of leaf elements (aka "scores") - +2, +0.1, -1, +0.9 and -0.9 - were devised by the XGBoost algorithm during training. In this case, the XGBoost model was trained using a dataset where little boys (+2) appear somehow "greater" than little girls (+0.1). If you knew what the response variable was, then you could probably interpret/rationalize those contributions further. Otherwise, just accept those values as they are.
As for scoring samples, the first addend is produced by tree1 and the second addend by tree2. For little boys (age < 15, is male == Y, and uses computer daily == Y), tree1 yields 2 and tree2 yields 0.9.
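As a quick numeric check of how such raw scores become a probability (summing the quoted tree outputs and applying the sigmoid described in the other answer):

import math

raw_score = 2.0 + 0.9                        # tree1 + tree2 leaf values for the little boy
prob = 1.0 / (1.0 + math.exp(-raw_score))    # sigmoid squashes the raw margin into (0, 1)
print(round(prob, 3))                        # about 0.948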
Read this
https://towardsdatascience.com/xgboost-mathematics-explained-58262530904a
and then this
https://medium.com/#gabrieltseng/gradient-boosting-and-xgboost-c306c1bcfaf5
and the appendix
https://gabrieltseng.github.io/appendix/2018-02-25-XGB.html

Linear programming and event occurrence

Suppose we have N (in this example N = 3) events that can happen depending on some variables. Each of them generates a certain profit or loss (event1 = 300, event2 = -100, event3 = 200), and they are constrained by rules on when they happen:
event 1 happens only when x > 5,
event 2 happens only when x = 2 and y = 3
event 3 happens only when x is odd.
The problem is to find the maximum profit.
Assume x and y are integers >= 0.
In the real problem there are many events and many dimensions.
(so the solution should not be specific to this example)
My question is:
Is this a linear programming problem? If yes, please provide a solution to the example problem using this approach. If not, please suggest some algorithms to optimize such a problem.
This can be formulated as a mixed integer linear program: a linear program in which some of the variables are constrained to be integers. Unlike linear programs, the general integer program is NP-hard to solve. However, there are many commercial and open-source solvers that can efficiently solve large-scale problems. For up to 300 variables and constraints, you can use Excel's solver.
Here is a way to formulate the above constraints:
If you go down this route, you might find this document useful.
The last constraint is an interesting one. I am assuming that x has to be an integer; if x can be either integer or continuous, I will edit the answer accordingly.
I hope this helps!
Edit: L and U above should be interpreted as L1 and U1.
Edit 2: z2 needs to be changed to (1 - z2) in the 3rd and 4th constraints.
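Since the formulation itself is not reproduced here, the following is only a sketch of one possible big-M encoding of the three events in Pyomo; the bounds, the value of M and all names are assumptions, and it is not necessarily identical to the formulation referenced above:

from pyomo.environ import (ConcreteModel, Var, Objective, Constraint, Binary,
                           NonNegativeIntegers, maximize, SolverFactory, value)

M = 1000                    # assumed big-M constant, large enough for the bounds below
m = ConcreteModel()
m.x = Var(domain=NonNegativeIntegers, bounds=(0, 100))
m.y = Var(domain=NonNegativeIntegers, bounds=(0, 100))
m.z1 = Var(domain=Binary)   # 1 if event 1 occurs (x > 5)
m.z2 = Var(domain=Binary)   # 1 if event 2 occurs (x == 2 and y == 3)
m.z3 = Var(domain=Binary)   # 1 if event 3 occurs (x is odd)
m.k = Var(domain=NonNegativeIntegers)   # helper: x = 2k + z3 forces z3 = x mod 2

m.profit = Objective(expr=300 * m.z1 - 100 * m.z2 + 200 * m.z3, sense=maximize)

# event 1: z1 = 1 exactly when x >= 6 (x > 5 with integer x)
m.e1a = Constraint(expr=m.x >= 6 - M * (1 - m.z1))
m.e1b = Constraint(expr=m.x <= 5 + M * m.z1)

# event 3: the parity constraint makes z3 exactly the "x is odd" indicator
m.e3 = Constraint(expr=m.x == 2 * m.k + m.z3)

# event 2: z2 must be 1 whenever x == 2 and y == 3.  Binaries b[0..3] mean
# "x <= 1", "x >= 3", "y <= 2", "y >= 4"; if no escape holds, z2 is forced to 1.
# (The reverse implication is not needed here because event 2 only loses money.)
m.b = Var(range(4), domain=Binary)
m.e2a = Constraint(expr=m.x <= 1 + M * (1 - m.b[0]))
m.e2b = Constraint(expr=m.x >= 3 - M * (1 - m.b[1]))
m.e2c = Constraint(expr=m.y <= 2 + M * (1 - m.b[2]))
m.e2d = Constraint(expr=m.y >= 4 - M * (1 - m.b[3]))
m.e2e = Constraint(expr=m.b[0] + m.b[1] + m.b[2] + m.b[3] + m.z2 >= 1)

SolverFactory('glpk').solve(m)
print(value(m.x), value(m.profit))   # expect profit 500, e.g. x = 7 (odd and > 5)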
A specific answer:
This seems more like a mathematical calculation than a programming problem; can't you just run a loop for x = 1 -> 1000 to see what results occur?
For the example:
Since x = 2 with y = 3 gives the -100, and x <= 5 doesn't get the 300, all that really matters is x > 5 and x odd for the maximum result.
x = 7 gives 300 + 200 = 500, the maximum profit for x.
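A brute-force sketch of that loop idea in Python, checking the three event rules directly over a small range of x and y:

best = None
for x in range(0, 1001):
    for y in range(0, 1001):
        profit = 0
        if x > 5:
            profit += 300           # event 1
        if x == 2 and y == 3:
            profit += -100          # event 2
        if x % 2 == 1:
            profit += 200           # event 3
        if best is None or profit > best[0]:
            best = (profit, x, y)
print(best)                         # (500, 7, 0): x = 7 is odd and > 5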
A general answer:
I don't see how to answer the question without seeing what the events are and how the events affect x. Whether it's a linear or a functional (mathematical) answer seems rather beside the point of finding the desired solution.