Fewer Integer planning variables vs. more Boolean planning variables - OptaPlanner

I am trying to model the following problem using Optaplanner:
A product is stored in a central distribution center (CDC) -> Supply
QuantityCDC is the available quantity in the CDC (10)
The product is required by a regional distribution center (RDC) -> Demand
QuantityRDC is the required quantity by the RDC (5)
Model 1
Planning variable: CDCtoRDCquantity (quantity to be sent from the CDC to the RDC), of type Integer
Value range: [0, min(QuantityCDC, QuantityRDC)] = [0, 5]
Objective function: maximize the satisfaction of the RDC
Model 2
Planning variables: SelectOption, one Boolean variable per option
List of options: 1, 2, 3, 4 and 5
Constraint: select at most one option (sum of SelectOption <= 1)
Objective function: maximize the satisfaction of the RDC
My question is: which of the two models will be faster and scale better? Model 1, using fewer Integer variables with well-defined value ranges, or Model 2, using more Boolean variables that model the range naturally but require an additional constraint?
(The example above is an oversimplification of the real problem that has more constraints: 10-20 regional distribution centers, 1000 products, not enough supply for all demand, not enough trucks,...)

Generally, if you can hard-code a constraint, you should prefer that.
Even the most efficient constraint will take time to process - if your data model makes it impossible for some constraints to be broken, you save time.
That said, both your options sound to me like there will be a whole lot of planning variables. While 2-3 planning variables is manageable, I definitely wouldn't go past that. Consider replacing variables with entities, such as an RDCEntity, which has two variables: the CDC, and the quantity taken.
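For illustration, here is a rough sketch of that entity structure, written as plain Python rather than an actual OptaPlanner domain model (which would be a Java class annotated with @PlanningEntity, with @PlanningVariable on the cdc and quantity fields and a @ValueRangeProvider supplying the quantity range); all class and field names here are made up:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Cdc:
    name: str
    available_quantity: int

@dataclass
class Rdc:
    name: str
    demanded_quantity: int

@dataclass
class RdcAllocation:
    # one planning entity per RDC (and per product, in the full problem)
    rdc: Rdc
    cdc: Optional[Cdc] = None        # planning variable 1: which CDC supplies this RDC
    quantity: Optional[int] = None   # planning variable 2: how much is shipped

    def quantity_range(self):
        # value range for the quantity variable, as in the question:
        # 0 .. min(supply at the chosen CDC, demand at this RDC)
        if self.cdc is None:
            return range(0, self.rdc.demanded_quantity + 1)
        return range(0, min(self.cdc.available_quantity, self.rdc.demanded_quantity) + 1)

alloc = RdcAllocation(rdc=Rdc("RDC-1", 5), cdc=Cdc("CDC-1", 10))
print(list(alloc.quantity_range()))   # 0..5

With this shape, the "select at most one option" rule of Model 2 disappears from the constraint set: shipping two different quantities to the same RDC is simply impossible by construction.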

Related

CPLEX: How to run subset constraints in OPL?

CPLEX; Mixed Integer Linear Programming; Constraint Formulation:
There are 150 towns in the dataset, each town having several markets (or "mandis"). Total number of mandis in the dataset = 1800. I have a binary decision variable y[mandi][days]. I want to add a constraint which states that y[mandi][days] is equal for all mandis within any given town on any given day. y[mandi][days] could be different/same for the mandis in different towns on the same day.
Sample Data
I'm inputting the data from Excel. Please see the attached image. Can you help me out with how to formulate this constraint in OPL?
One way to achieve the above is to specify individual constraints on the set of mandis within each town. However, the number of constraints, in that case, would become 150, each referring to one town. Also, we might receive from the client an additional list of mandis for some towns, which would distort the mandi-town mapping numbering, and I would have to change the mapping in CPLEX again. Is there a better way to do this, which could take the mapping directly from excel in the attached image format?
Instead of the decision variable
dvar boolean y[mandi][days]
why not use
dvar boolean y[town][days]
?
And then when you need the y for a given mandi, you first get the town of that mandi and then get its y.
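To make that lookup concrete, here is a small sketch of the idea (in Python rather than OPL, with made-up data): keep one decision value per (town, day) and resolve any mandi through a mandi-to-town mapping loaded from the spreadsheet, so adding mandis to a town never changes the model, only the mapping:

# hypothetical mandi -> town mapping, e.g. read from the Excel sheet
mandi_town = {"mandi_1": "town_A", "mandi_2": "town_A", "mandi_3": "town_B"}
days = ["mon", "tue"]

# one decision value per (town, day); here just a placeholder assignment
y_town = {(town, d): 0 for town in set(mandi_town.values()) for d in days}

def y(mandi, day):
    # the value for a mandi is the value of its town
    return y_town[(mandi_town[mandi], day)]

print(y("mandi_2", "mon"))   # same as y_town[("town_A", "mon")]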

How to model conditional summation (like sumif in Excel) in CP-SAT?

I am trying to model a variant of the Job Shop problem in CP-SAT. In this problem the machines have capacities >= 1, so I cannot directly use the AddNoOverlap constraint from CP-SAT.
The decision variables are START and FINISH times of job operations on machines.
I want to ensure that at any given time T on machine M, the number of active operations should be <= capacity at that time.
I am new to the Constraint Programming field. Please guide me on the most efficient way to model this in the Google CP-SAT solver.
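(Not part of the original thread.) One common way to express "at most capacity operations active at any time" in CP-SAT is the AddCumulative constraint over interval variables: each operation on a machine becomes an interval with demand 1 (or its actual resource usage), and the machine's capacity is the cumulative capacity. A minimal sketch with made-up data:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
horizon = 50

# hypothetical data: one machine with capacity 2, three operations of given durations
capacity = 2
durations = [5, 3, 4]

intervals = []
starts = []
ends = []
for i, dur in enumerate(durations):
    start = model.NewIntVar(0, horizon, f"start_{i}")
    end = model.NewIntVar(0, horizon, f"end_{i}")
    intervals.append(model.NewIntervalVar(start, dur, end, f"op_{i}"))
    starts.append(start)
    ends.append(end)

# at any time, the number of active operations on this machine is <= capacity
model.AddCumulative(intervals, [1] * len(intervals), capacity)

# example objective: finish as early as possible
makespan = model.NewIntVar(0, horizon, "makespan")
model.AddMaxEquality(makespan, ends)
model.Minimize(makespan)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print([solver.Value(s) for s in starts])

With one AddCumulative per machine (grouping that machine's intervals and demands), this replaces AddNoOverlap, which is just the special case capacity = 1.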

How can I minimize the cost in this situation?

Can someone help me?
Minimum cost flow with fixed costs and rewards for saturated arcs.
Consider the following variant of the minimum cost flow problem, where in addition to the network G = (V, A) with values b_i associated with the nodes i ∈ V, such that Σ_{i ∈ V} b_i = 0, and unit transport costs c_ij along the arcs (i, j) ∈ A, we also have that:
• each arc has an associated capacity d_ij indicating the maximum flow transportable along the arc;
• the number of arcs along which a strictly positive flow is sent is at most 100·p1% of the total number of arcs, and for each of these arcs a fixed cost K is paid;
• the number of saturated arcs (arcs along which the flow sent equals their capacity) is at least 100·p2% of the total number of arcs
(p2
Formulate the mathematical model for this problem, write it in AMPL, define the data of a particular instance and solve it. An analysis of what happens when some of the instance data are changed is also required. In particular, find an interval [p1, p2] as small as possible such that the problem still has a solution.
I'm not sure I clearly understand your problem, but I'll try to give a possible solution for each question:
You should have a non-negative variable, let's call it x[i,j], for each arc, which represents the flow passing along the arc between nodes i and j.
With this variable and the given parameter d[i,j] you can add a constraint expressing the capacity bound: x[i,j] <= d[i,j] for each (i,j) in A.
For the fixed costs I suggest adding the term sum{i in N, j in N} used[i,j] * k to the minimization objective, where used[i,j] is a binary variable that indicates whether the corresponding flow is non-zero. To relate the flow to this binary variable you should add an additional constraint like this:
x[i,j] <= d[i,j] * used[i,j]
As far as the number of saturated arcs is concerned, you can solve the max-flow problem, whose solution is found by consecutive iterations of an augmenting-path algorithm.
I'm not sure I have answered your questions; feel free to post your decision problem exactly (what the objective function is and what the constraints are).
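To make the fixed-cost part concrete, here is a small sketch of it as a MILP on a made-up instance, written in Python with PuLP rather than AMPL; the constraint x[i,j] <= d[i,j] * used[i,j] both enforces the capacity and forces used[i,j] to 1 whenever the arc carries flow:

from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

# hypothetical instance: b_i per node (supply > 0, demand < 0), (capacity d, unit cost c) per arc
b = {"s": 4, "a": 0, "t": -4}
arcs = {("s", "a"): (3, 1.0), ("s", "t"): (2, 2.0), ("a", "t"): (3, 1.5)}
K = 10.0    # fixed cost per arc with positive flow
p1 = 1.0    # at most 100*p1% of the arcs may carry positive flow

prob = LpProblem("fixed_charge_min_cost_flow", LpMinimize)
x = {(i, j): LpVariable(f"x_{i}_{j}", lowBound=0) for (i, j) in arcs}
used = {(i, j): LpVariable(f"used_{i}_{j}", cat=LpBinary) for (i, j) in arcs}

# objective: unit transport costs plus the fixed cost on used arcs
prob += lpSum(c * x[a] for a, (d, c) in arcs.items()) + K * lpSum(used.values())

# flow conservation: outflow - inflow = b_i at every node
for n in b:
    prob += (lpSum(x[a] for a in arcs if a[0] == n)
             - lpSum(x[a] for a in arcs if a[1] == n) == b[n])

# linking + capacity constraint from the answer: x[i,j] <= d[i,j] * used[i,j]
for a, (d, c) in arcs.items():
    prob += x[a] <= d * used[a]

# at most 100*p1% of the arcs carry positive flow
prob += lpSum(used.values()) <= p1 * len(arcs)

prob.solve()
print({a: x[a].value() for a in arcs})

The "at least 100·p2% saturated arcs" condition can be handled the same way with a second binary z[i,j], the mirror constraint x[i,j] >= d[i,j] * z[i,j], and sum of z >= p2 * |A|, instead of the separate max-flow computation mentioned above.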

Can AMPL handle this recursively or is remodeling necessary?

I'm using AMPL to model a production problem where I have two particular constraints that I am not sure how to handle.
subject to Constraint1 {t in T}:
prod[t] = sum{i in I} x[i,t]*u[i] + Recycle[f]*RecycledU[f];
subject to Constraint2 {t in T}:
Solditems[t]+Recycle[t]=prod[t];
EDIT: where x[i,t] is the amount of products from supply point i. u[i] denotes the "exchange rate" of the raw material from supply point i into the product, i.e. a percentage of the raw material becomes finished product, while some of it goes to waste. The same holds for RecycledU[f], where f is in F, which denotes the refinement station where it has been refined. The difference is that RecycledU[f] has a much lower waste percentage, because Recycle is already a finished product from f (albeit a much less profitable one). That is, Recycle has already gone through the process of being a raw material, x, and became a finished product in some earlier stage, or hopefully (if it can be modelled) in the same time period as this one. The actual model also contains things such as "products" and "refinement stations", but I figured those could be left out of this question to keep it simpler.
What I want to accomplish is that the amount of products produced is the sum of all items sold in time period t and the amount of products recycled in time period t (by recycled I mean that the finished product is kept at the production site for further refinement in some timestep g, g>t).
Is it possible to write two equality constraints for prod[t] like I have done? Also, how do I handle Recycle[t]? Can AMPL "understand" that, since these appear in the same time step, it must handle the constraints recursively, i.e. compute a solution for Recycle[t] and subsequently try to improve that solution in every time step?
EDIT: The time periods are expressed in years which is why I want to avoid having an expression with Recycle[t-1].
EDIT2: prod and x are parameters and Recycle and Solditems are variables.
Hope someone can shed some light on this!
Cenderze
The two constraints will be considered simultaneously (unless you explicitly exclude one from the problem). AMPL and optimization solvers don't have a notion of time steps; the complete problem is considered at once, so you might need to add linking constraints between time periods yourself. In particular, you might need to make sure that the inventory (such as the amount of finished product kept at the production site for further refinement) is carried over from one period to another, something like:
Recycle[t + 1] = Recycle[t] - RecycleDecrease + RecycleIncrease;
You have to figure out the expressions for the amounts by which Recycle is increased (RecycleIncrease) and decreased (RecycleDecrease).
Also if you want some kind of an iterative procedure with one constraint considered at a time instead, then you should use AMPL script.
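As a toy illustration of the linking constraint suggested above (sketched in Python with PuLP rather than AMPL, with invented parameters and prices, not the asker's actual model): production in period t is either sold or put into a recycle stock, and the stock at the end of each period is the previous stock plus what was recycled minus what was taken out for refinement:

from pulp import LpProblem, LpMaximize, LpVariable, lpSum

T = range(4)
prod = {t: 10 for t in T}   # hypothetical production parameter per period (prod is a parameter, per EDIT2)

sold = {t: LpVariable(f"sold_{t}", lowBound=0) for t in T}
recycle = {t: LpVariable(f"recycle_{t}", lowBound=0) for t in T}   # put into the recycle stock in period t
refined = {t: LpVariable(f"refined_{t}", lowBound=0) for t in T}   # taken out of the stock in period t
stock = {t: LpVariable(f"stock_{t}", lowBound=0) for t in T}       # stock at the end of period t

m = LpProblem("recycling_toy", LpMaximize)
m += lpSum(3.0 * sold[t] + 1.0 * refined[t] for t in T)            # made-up prices

for t in T:
    # everything produced in period t is either sold or recycled (Constraint2 in the question)
    m += sold[t] + recycle[t] == prod[t]
    # inventory balance linking consecutive periods, as in the answer:
    # stock[t] = stock[t-1] + increase (recycle) - decrease (refined); the stock before period 0 is empty
    previous = stock[t - 1] if t > 0 else 0
    m += stock[t] == previous + recycle[t] - refined[t]

m.solve()
print([(sold[t].value(), recycle[t].value(), refined[t].value()) for t in T])

The point of the sketch is the balance line: the solver sees all periods at once, and it is this constraint, not any notion of recursion, that ties period t to period t-1.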

Dimensional and unit analysis in a SQL database

Problem:
A relational database (Postgres) storing timeseries data of various measurement values. Each measurement value can have a specific "measurement type" (e.g. temperature, dissolved oxygen, etc) and can have specific "measurement units" (e.g. Fahrenheit/Celsius/Kelvin, percent/milligrams per liter, etc).
Question:
Has anyone built a similar database such that dimensional integrity is conserved? Have any suggestions?
I'm considering building a measurement_type and a measurement_unit table; both of these would have two columns, ID and text. Then I would create foreign keys to these tables in the measured_value table. Text worries me somewhat because there's the possibility of non-unique duplicates (e.g. 'ug/l' vs 'µg/l' for micrograms per liter).
The purpose of this would be so that I can both convert and verify units on queries, or via programming externally. Ideally, I would have the ability later to include strict dimensional analysis (e.g. linking µg/l to the value 'M/V' (mass divided by volume)).
Is there a more elegant way to accomplish this?
I produced a database sub-schema for handling units an aeon ago (okay, I exaggerate slightly; it was about 20 years ago, though). Fortunately, it only had to deal with simple mass, length, time dimensions - not temperature, or electric current, or luminosity, etc. Rather less simple was the currency side of the game - there were a myriad different ways of converting between one currency and another depending on date, currency, and period over which conversion rate was valid. That was handled separately from the physical units.
Fundamentally, I created a table 'measures' with an 'id' column, a name for the unit, an abbreviation, and a set of dimension exponents - one each for mass, length, time. This gets populated with names such as 'volume' (length = 3, mass = 0, time = 0), 'density' (length = -3, mass = 1, time = 0) - and the like.
There was a second table of units, which identified a measure and then the actual units used by a particular measurement. For example, there were barrels, and cubic metres, and all sorts of other units of relevance.
There was a third table that defined conversion factors between specific units. This consisted of two units and the multiplicative conversion factor that converted unit 1 to unit 2. The biggest problem here was the dynamic range of the conversion factors. If the conversion from U1 to U2 is 1.234E+10, then the inverse is a rather small number (8.103727714749e-11).
The comment from S.Lott about temperatures is interesting - we didn't have to deal with those. A stored procedure would have addressed that - though integrating one stored procedure into the system might have been tricky.
The scheme I described allowed most conversions to be described once (including hypothetical units such as furlongs per fortnight, or less hypothetical but equally obscure ones - outside the USA - like acre-feet), and the conversions could be validated (for example, both units in the conversion factor table had to have the same measure). It could be extended to handle most of the other units - though the dimensionless units such as angles (or solid angles) present some interesting problems. There was supporting code that would handle arbitrary conversions - or generate an error when the conversion could not be supported. One reason for this system was that the various international affiliate companies would report their data in their locally convenient units, but the HQ system had to accept the original data and yet present the resulting aggregated data in units that suited the managers - where different managers each had their own idea (based on their national background and length of duty in the HQ) about the best units for their reports.
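A toy sketch of that three-table design (in Python dictionaries rather than SQL, with made-up content), showing how a conversion can be validated against the measures before the factor is applied:

# measures: dimension exponents (mass, length, time)
measures = {
    "volume":  {"mass": 0, "length": 3, "time": 0},
    "density": {"mass": 1, "length": -3, "time": 0},
}

# units: each unit belongs to exactly one measure
units = {
    "m3":     "volume",
    "barrel": "volume",
    "kg/m3":  "density",
}

# conversion factors: multiplicative factor turning unit 1 into unit 2
conversions = {
    ("barrel", "m3"): 0.158987,
}

def convert(value, from_unit, to_unit):
    # validation step: both units must belong to the same measure (same dimension exponents)
    if units[from_unit] != units[to_unit]:
        raise ValueError("incompatible dimensions")
    if from_unit == to_unit:
        return value
    if (from_unit, to_unit) in conversions:
        return value * conversions[(from_unit, to_unit)]
    # fall back to the inverse factor (beware of the dynamic-range issue mentioned above)
    return value / conversions[(to_unit, from_unit)]

print(convert(10, "barrel", "m3"))   # 1.58987

In the database version the same validation is a join from the conversion-factor row through both unit rows to their shared measure row.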
"Text worries me somewhat because there's the possibility for non-unique duplicates"
Right. So don't use text as a key. Use the ID as a key.
"Is there a more elegant way to accomplish this?"
Not really. It's hard. Temperature is its own problem because temperature is itself an average and doesn't sum the way distance does; plus, F-to-C conversion is not a multiplication (as it is with every other unit conversion).
A note about conversions: a lot of units are linearly related, and can be converted using a formula like "y = A + Bx", where A and B are constants which could be stored in the database for each pair of units that you need to convert between. For example, for Celsius to Fahrenheit the constants are A=32, B=1.8.
However, there are also rare exceptions. Converting between logarithmic and non-logarithmic units, for example. Or converting between mass-per-volume and molar-mass-per-volume (in which case you would need to know the molar mass of the compound being measured).
Of course, if you are sure that all the conversions required by the system are linear, then there's no need for over-engineering, just store the two constants. You can then extract standardized results from the database using straight SQL joins with calculated fields.
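A tiny sketch of that scheme (Python, with a made-up lookup table): store (A, B) per ordered unit pair and apply y = A + B*x; for a pure ratio conversion A is simply 0:

# (from_unit, to_unit) -> (A, B) so that value_to = A + B * value_from
linear_conversions = {
    ("C", "F"):   (32.0, 1.8),
    ("F", "C"):   (-32.0 / 1.8, 1.0 / 1.8),
    ("km", "mi"): (0.0, 0.621371),
}

def convert_linear(value, from_unit, to_unit):
    a, b = linear_conversions[(from_unit, to_unit)]
    return a + b * value

print(convert_linear(100, "C", "F"))   # 212.0

The SQL equivalent is a join from the measured values to this constants table with a calculated field A + B * value.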