minimize/1 is not rearranging the order of solutions - optimization

For Colombia's Observatorio Fiscal[1], I am coding a simple tax minimization problem, using CLP(R) (in SWI-Prolog). I want to use minimize/1 to find the least solution first. It is instead listing the bigger solution first. Here is the code:
:- use_module(library(clpr)).
deduction(_,3). % Anyone can take the standard deduction.
deduction(Who,D) :- itemizedDeduction(Who,D). % Or they can itemize.
income(joe,10). % Joe makes $10 a year.
itemizedDeduction(joe,4). % He can deduct more if he itemizes.
taxableIncome(Who,TI) :-
deduction(Who,D),
income(Who,I),
TI is I - D,
minimize(TI).
Here is what an interactive session looks like:
?- taxableIncome(joe,N).
N = 7 ;
N = 6 ;
false.
If I switch the word "minimize" to "maximize" it behaves identically. If I include no minimize or maximize clause, it doesn't look for a third solution, but otherwise it behaves the same:
?- taxableIncome(joe,N).
N = 7 ;
N = 6.
[1] The Observatorio Fiscal is a new organization that aims to model the Colombian economy, in order to anticipate the effects of changes in the law, similar to what the Congressional Budget Office or the Tax Policy Center do in the United States.

First, let's add the following definition to the program:
:- op(950,fy, *).
*_.
Using (*)/1, we can generalize away individual goals in the program.
For example, let us generalize away the minimize/1 goal by placing * in front:
taxableIncome(Who,TI) :-
deduction(Who,D),
income(Who,I),
TI #= I - D,
* minimize(TI).
We now get:
?- taxableIncome(X, Y).
X = joe,
Y = 7 ;
X = joe,
Y = 6.
This shows that CLP(R) in fact has nothing to do with this issue! These answers show that everything is already instantiated at the time minimize/1 is called, so there is nothing left to minimize.
To truly benefit from minimize/1, you must express the task in the form of CLP(R)—or better: CLP(Q)— constraints, then apply minimize/1 on a constrained expression.
Note also that in SWI-Prolog, both CLP(R) and CLP(Q) have elementary mistakes, and you cannot trust their results.

Per Mat's response, I rewrote the program expressing the constraints using CLP. The tricky bit was that I had to first collect all (both) possible values for deduction, then convert those values into a CLP domain. I couldn't get that conversion to work in CLP(R), but I could in CLP(FD):
:- use_module(library(clpfd)).
deduction(_,3). % Anyone can take the same standard deduction.
deduction(Who,D) :- % Or they can itemize.
itemizedDeduction(Who,D).
income(joe,10).
itemizedDeduction(joe,4).
listToDomain([Elt],Elt).
listToDomain([Elt|MoreElts],Elt \/ MoreDom) :-
MoreElts \= []
, listToDomain(MoreElts,MoreDom).
taxableIncome(Who,TI) :-
income(Who,I)
, findall(D,deduction(Who,D),DList)
, listToDomain(DList,DDomain)
% Next are the CLP constraints.
, DD in DDomain
, TI #= I-DD
, labeling([min(TI)],[TI]).

Related

Gekko Variable Definition - Primary vrs. Utility Decision Variable

I am trying to formulate and solve an optimization problem based on an article. The authors introduced 2 decision variables. Power of station i at time t, P_i,t, and a binary variable X_i,n which is 1 if vehicle n is assigned to station i.
They introduced some other variables, called utility variables. For instance, energy delivered from station i up to time t for vehicle n, E_i,t,n which is calculated based on primary decision variables and a few fix parameters.
My question is should I define the utility variables as Gekko variables? If yes, which type is more appropriate?
I = 4 # number of stations
T = 24 # hours of simulation
N = 5 # number of vehicles
p = m.Array(m.Var,(I,T),lb=0,ub= params.ev.max_power)
x = m.Array(m.Var,(I,N),lb=0,ub=1, integer = True)
Should I define E as follow to solve these equations as an example? This introduces extra variables that are not primary decision variables and are calculated based on other terms that depend on the primary decision variable.
E = m.Array(m.Var,(I,T,N),lb=0)
for i in range(I):
for n in range(N):
for t in range(T):
m.Equation(E[i][t][n] >= np.sum(0.25 * availability[n, :t] * p[i,:t]) - (M * (1 - x[i][n])))
m.Equation(E[i][t][n] <= np.sum(0.25 * availability[n, :t] * p[i,:t]) + (M * (1 - x[i][n])))
m.Equation(E[i][t][n] <= M * x[i][n])
m.Equation(E[i][t][n] >= -M * x[i][n])
All of those variable definitions and equations look correct. Here are a few suggestions:
There is no availability[] variable defined yet. If availability is a function of other decision variables, then it is generally more efficient to use an m.Intermediate() definition to define it.
As the total number of total decision variables increase, there is often a large increase in computational time. I recommend starting with a small problem initially and then scale-up to the larger sized problem.
Try the gekko m.sum() instead of sum or np.sum() for potentially more efficient calculations. Using m.sum() does increase the model compile time but generally decreases the optimization solve time, so it is a trade-off.

Restrain variable to a bounded region (interval) formulation in Mixed Integer Linear Programming

I have 4 non negative real variable that are A, B, C and X. Based on the current problem that I have, I notice that the variable X must belong to the interval of [B,C] and the relation will be a bunch of if-else conditions like this:
If A < B:
x = B
elseif A > C:
x = C
elseif B<=A<=C:
x = A
As you can see, it quite difficult to reformulate as a Mixed Integer Programming problem with corresponding decision variable (d1, d2 and d3). I have try reading some instructions regarding if-then formulation using big M method at this site:
https://www.math.cuhk.edu.hk/course_builder/1415/math3220/L2%20(without%20solution).pdf but it seem that this problem is more challenging than their tutorial.
Could you kindly provide me with a formulation for this situation ?
Thank you very much !

Ranking Big O Functions By Complexity

I am trying to rank these functions — 2n, n100, (n + 1)2, n·lg(n), 100n, n!, lg(n), and n99 + n98 — so that each function is the big-O of the next function, but I do not know a method of determining if one function is the big-O of another. I'd really appreciate if someone could explain how I would go about doing this.
Assuming you have some programming background. Say you have below code:
void SomeMethod(int x)
{
for(int i = 0; i< x; i++)
{
// Do Some Work
}
}
Notice that the loop runs for x iterations. Generalizing, we say that you will get the solution after N iterations (where N will be the value of x ex: number of items in array/input etc).
so This type of implementation/algorithm is said to have Time Complexity of Order of N written as O(n)
Similarly, a Nested For (2 Loops) is O(n-squared) => O(n^2)
If you have Binary decisions made and you reduce possibilities into halves and pick only one half for solution. Then complexity is O(log n)
Found this link to be interesting.
For: Himanshu
While the Link explains how log(base2)N complexity comes into picture very well, Lets me put the same in my words.
Suppose you have a Pre-Sorted List like:
1,2,3,4,5,6,7,8,9,10
Now, you have been asked to Find whether 10 exists in the list. The first solution that comes to mind is Loop through the list and Find it. Which means O(n). Can it be made better?
Approach 1:
As we know that List of already sorted in ascending order So:
Break list at center (say at 5).
Compare the value of Center (5) with the Search Value (10).
If Center Value == Search Value => Item Found
If Center < Search Value => Do above steps for Right Half of the List
If Center > Search Value => Do above steps for Left Half of the List
For this simple example we will find 10 after doing 3 or 4 breaks (at: 5 then 8 then 9) (depending on how you implement)
That means For N = 10 Items - Search time was 3 (or 4). Putting some mathematics over here;
2^3 + 2 = 10 for simplicity sake lets say
2^3 = 10 (nearly equals --- this is just to do simple Logarithms base 2)
This can be re-written as:
Log-Base-2 10 = 3 (again nearly)
We know 10 was number of items & 3 was the number of breaks/lookup we had to do to find item. It Becomes
log N = K
That is the Complexity of the alogorithm above. O(log N)
Generally when a loop is nested we multiply the values as O(outerloop max value * innerloop max value) n so on. egfor (i to n){ for(j to k){}} here meaning if youll say for i=1 j=1 to k i.e. 1 * k next i=2,j=1 to k so i.e. the O(max(i)*max(j)) implies O(n*k).. Further, if you want to find order you need to recall basic operations with logarithmic usage like O(n+n(addition)) <O(n*n(multiplication)) for log it minimizes the value in it saying O(log n) <O(n) <O(n+n(addition)) <O(n*n(multiplication)) and so on. By this way you can acheive with other functions as well.
Approach should be better first generalised the equation for calculating time complexity. liken! =n*(n-1)*(n-2)*..n-(n-1)so somewhere O(nk) would be generalised formated worst case complexity like this way you can compare if k=2 then O(nk) =O(n*n)

Constraining on a Minimum of Distinct Values with PuLP

I've been using the PuLP library for a side project (daily fantasy sports) where I optimize the projected value of a lineup based on a series of constraints.
I've implemented most of them, but one constraint is that players must come from at least three separate teams.
This paper has an implementation (page 18, 4.2), which I've attached as an image:
It seems that they somehow derive an indicator variable for each team that's one if a given team has at least one player in the lineup, and then it constrains the sum of those indicators to be greater than or equal to 3.
Does anybody know how this would be implemented in PuLP?
Similar examples would also be helpful.
Any assistance would be super appreciated!
In this case you would define a binary variable t that sets an upper limit of the x variables. In python I don't like to name variables with a single letter but as I have nothing else to go on here is how I would do it in pulp.
assume that the variables lineups, players, players_by_team and teams are set somewhere else
x_index = [i,p for i in lineups for p in players]
t_index = [i,t for i in lineups for t in teams]
x = LpVariable.dicts("x", x_index, lowBound=0)
t = LpVAriable.dicts("t", t_index, cat=LpBinary)
for l in teams:
prob += t[i,l] <=lpSum([x[i,k] for k in players_by_team[l]])
prob += lpSum([t[i,l] for l in teams]) >= 3

Linear programming and event occurrence

Suppose we have N (in this example N = 3) events that can happen depending on some variables. Each of them can generate certain profit or loses (event1 = 300, event2 = -100, event3 = 200), they are constrained by rules when they happen.
event 1 happens only when x > 5,
event 2 happens only when x = 2 and y = 3
event 3 happens only when x is odd.
The problem is to know the maximum profit.
Assume x, y are integer numbers >= 0
In the real problem there are many events and many dimensions.
(the solution should not be specific)
My question is:
Is this linear programming problem? If yes please provide solution to the example problem using this approach. If no please suggest some algorithms to optimize such problem.
This can be formulated as a mixed integer linear program. This is a linear program where some of the variables are constrained to be integer. Contrary to linear programs, solving the general integer program is NP-hard. However, there are many commercial or open source solvers that can solve efficiently large-scale problems. For up to 300 variables and constraints, you can use excel's solver.
Here is a way to formulate the above constraints:
If you go down this route, you might find this document useful.
the last constraint in an interesting one. I am assuming that x has to be integer, but if x can be either integer or continuous I will edit the answer accordingly.
I hope this helps!
Edit: L and U above should be interpreted as L1 and U1.
Edit 2: z2 needs to changed to (1-z2) on the 3rd and 4th constraint.
A specific answer:
seems more like a mathematical calculation than a programming problem, can't you just run a loop for x= 1->1000 to see what results occur?
for the example:
as x = 2 or 3 = -200 then x > 2 or 3, and if x < 5 doesn't get the 300, so all that is really happening is x > 5 and x = odd = maximum results.
x = 7 = 300 + 200 . = maximum profit for x
A general answer:
I don't see how to answer the question without seeing what the events are and how the events effect X ? Weather it's a linear or functional (mathematical) answer seems rather beside the point of finding the desired solution.