Transition matrix not included in my solution of my scheduling problem in CPLEX CP - optimization

The distance matrix in my noOverlap constraint does not seem to have any effect on the solution. I have formulated the distance matrix as a tuple set, in two different ways, as can be seen in the code. Both tuple sets seem to be correct, and the distance matrix is passed to the noOverlap constraint for the dvar sequence.
Nevertheless, I do not see the transition time between products in the optimal result. The next job seems to start the moment the previous job finishes, instead of waiting for the transition time. I would like this transition matrix to hold for both machine 1 and machine 2.
Could someone tell me what I did wrong in my model formulation? I have looked into the examples, but they seem to be constructed in the same way, so I do not know what I am doing wrong.
mod.
using CP;
// Number of Machines (Packing + Manufacturing)
int nbMachines = ...;
range Machines = 1..nbMachines;
// Number of Jobs
int nbJobs = ...;
range Jobs = 1..nbJobs;
int duration[Jobs,Machines] = ...;
int release = ...;
int due = ...;
tuple Matrix { int job1; int job2; int value; };
//{Matrix} transitionTimes ={<1,1,0>,<1,2,6>,<1,3,2>,<2,1,2>,<2,2,0>,<2,3,1>,<3,1,2>,<3,2,3>,<3,3,0>};
{Matrix} transitionTimes ={ <i,j, ftoi(abs(i-j))> | i in Jobs, j in Jobs };
dvar interval task[j in Jobs] in release..due;
dvar interval opttask[j in Jobs][m in Machines] optional size duration[j][m];
dvar sequence tool[m in Machines] in all(j in Jobs) opttask[j][m];
execute {
cp.param.FailLimit = 5000;
}
// Minimize the max timespan
dexpr int makespan = max(j in Jobs, m in Machines)endOf(opttask[j][m]);
minimize makespan;
subject to {
// Each job needs one unary resource of the alternative set s (28)
forall(j in Jobs){
alternative(task[j], all(m in Machines) opttask[j][m]);
}
forall(m in Machines){
noOverlap(tool[m],transitionTimes);
}
};
execute {
writeln(task);
};
dat.
nbMachines = 2;
nbJobs = 3;
duration = [
[5,6],
[3,4],
[5,7]
];
release = 1;
due = 30;

The transition matrix is indexed by interval type, so you should specify the types of the intervals in each sequence.
In your case, the type is the job id:
int JobId[j in Jobs] = j;
dvar sequence tool[m in Machines] in all(j in Jobs) opttask[j][m] types JobId;
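To make the effect of the transition matrix concrete, here is a minimal Python sketch (not OPL, and not part of CP Optimizer) of the condition it imposes on one machine: an interval that runs before another must end at least transition[type of first, type of second] time units before the other starts. The function name `transition_violations` and the two sample schedules are hypothetical; the matrix mirrors the abs(i-j) tuple set from the question. The check covers every ordered pair of intervals, which assumes the default (non-direct) form of noOverlap.

```python
from itertools import combinations

# Transition times between job types, mirroring the tuple set
# { <i, j, abs(i-j)> | i in Jobs, j in Jobs } from the question.
JOBS = range(1, 4)
TRANSITION = {(i, j): abs(i - j) for i in JOBS for j in JOBS}


def transition_violations(schedule, transition):
    """Return the (type_a, type_b) pairs on one machine that break the rule.

    schedule: list of (job_type, start, end) tuples for a single machine.
    A pair (a, b) with a scheduled before b is a violation when
    start(b) < end(a) + transition[type(a), type(b)].
    """
    ordered = sorted(schedule, key=lambda item: item[1])  # sort by start time
    violations = []
    for (type_a, _, end_a), (type_b, start_b, _) in combinations(ordered, 2):
        if start_b < end_a + transition[type_a, type_b]:
            violations.append((type_a, type_b))
    return violations


if __name__ == "__main__":
    # Job 3 starts the moment job 1 ends: the symptom described in the question.
    back_to_back = [(1, 1, 6), (3, 6, 11)]
    # Job 3 waits abs(1 - 3) = 2 time units after job 1 ends.
    with_gap = [(1, 1, 6), (3, 8, 13)]
    print(transition_violations(back_to_back, TRANSITION))
    print(transition_violations(with_gap, TRANSITION))
```

A checker like this can be run on the solver's output to confirm that the transition times are actually being respected.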

writing a boolean decision variable in cplex

Please help me: I have a boolean decision variable. Moreover, the set that t ranges over in the domain of this variable has to be calculated from an equation.

$$X_{ro}^{t} = \begin{cases} 1 & \text{if request } r \in R_\tau \text{ is assigned to offer } o \in O_\tau \text{ at time period } t \in \Gamma_{ro}, \\ 0 & \text{otherwise} \end{cases}$$

$$\Gamma_{ro} = [p, a] \cap z$$

In this case, I am not sure whether Γ_ro is a set or a parameter.
You could rely on a tuple set so that you do not have to build the full Cartesian product:
range R=1..4;
range O=1..3;
int Tstart[r in R][o in O]=rand(5);
int Tend[r in R][o in O]=Tstart[r][o]+rand(4);
tuple rot
{
int r;
int o;
int t;
}
sorted {rot} rots={<r,o,t> | r in R,o in O,t in Tstart[r][o]..Tend[r][o]};
dvar boolean x[rots];
minimize sum(i in rots) x[i];
subject to
{
}
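
As a rough illustration of why the tuple set helps, the following Python sketch builds the same kind of sparse index set and compares its size with the full product of requests, offers and periods. The fixed window formulas and the `HORIZON` range are hypothetical stand-ins for the rand(...) calls in the OPL model, chosen only so that the example is deterministic.

```python
from itertools import product

R = range(1, 5)         # requests 1..4
O = range(1, 4)         # offers 1..3
HORIZON = range(0, 8)   # hypothetical full set of time periods 0..7

# Hypothetical fixed windows standing in for rand(5) and rand(4):
# pair (r, o) is only valid for periods t_start..t_end (inclusive).
t_start = {(r, o): (r + o) % 5 for r, o in product(R, O)}
t_end = {(r, o): t_start[r, o] + (r * o) % 4 for r, o in product(R, O)}

# Sparse index set: only the (r, o, t) triples inside each window.
rots = sorted(
    (r, o, t)
    for r, o in product(R, O)
    for t in range(t_start[r, o], t_end[r, o] + 1)
)

# One boolean per valid triple, instead of one per element of R x O x HORIZON.
x = {key: False for key in rots}

if __name__ == "__main__":
    full = len(R) * len(O) * len(HORIZON)
    print(len(x), "variables instead of", full)
```

In this reading, Γ_ro corresponds to the set of periods `range(t_start[r, o], t_end[r, o] + 1)` computed from data, and the decision variable only exists for triples inside it.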

Using a solution from a model as an input to another one and Outputting each Solution Separately

I'm solving an optimization problem in which the result of one model must be used as an input to another model, for 180 iterations. I'm using CPLEX with the OPL language, without any add-on.
I tried saving the values from one model into an Excel file and reading them into the next model, but since I have to do this 180 times, I am worried that I will make an error and have to restart, or not even notice that I made one.
Is it possible to run this for 180 iterations and feed in each iteration's solution separately?
You can rely on a warm start for that.
Here are two simple examples in easy OPL.
Warm start from a file:
include "zoo.mod";
main {
var filename = "c:/temp/mipstart.mst";
thisOplModel.generate();
cplex.readMIPStarts(filename);
cplex.solve();
writeln("Objective: " + cplex.getObjValue());
}
Or through the API, with addMIPStart:
int nbKids=300;
// a tuple is like a struct in C, a class in C++ or a record in Pascal
tuple bus
{
key int nbSeats;
float cost;
}
// This is a tuple set
{bus} pricebuses=...;
// asserts help make sure data is fine
assert forall(b in pricebuses) b.nbSeats>0;
assert forall(b in pricebuses) b.cost>0;
// To compute the average cost per kid of each bus
// you may use OPL modeling language
float averageCost[b in pricebuses]=b.cost/b.nbSeats;
// Let us try first with a naïve computation, use the cheapest bus
float cheapestCostPerKid=min(b in pricebuses) averageCost[b];
int cheapestBusSize=first({b.nbSeats | b in pricebuses : averageCost[b]==cheapestCostPerKid});
int nbBusNeeded=ftoi(ceil(nbKids/cheapestBusSize));
float cost0=item(pricebuses,<cheapestBusSize>).cost*nbBusNeeded;
execute DISPLAY_Before_SOLVE
{
writeln("The naïve cost is ",cost0);
writeln(nbBusNeeded," buses ",cheapestBusSize, " seats");
writeln();
}
int naiveSolution[b in pricebuses]=
(b.nbSeats==cheapestBusSize)?nbBusNeeded:0;
// decision variable array
dvar int+ nbBus[pricebuses];
// objective
minimize
sum(b in pricebuses) b.cost*nbBus[b];
// constraints
subject to
{
sum(b in pricebuses) b.nbSeats*nbBus[b]>=nbKids;
}
float cost=sum(b in pricebuses) b.cost*nbBus[b];
execute DISPLAY_After_SOLVE
{
writeln("The minimum cost is ",cost);
for(var b in pricebuses) writeln(nbBus[b]," buses ",b.nbSeats, " seats");
}
main
{
thisOplModel.generate();
// Warm start the naïve solution
cplex.addMIPStart(thisOplModel.nbBus,thisOplModel.naiveSolution);
cplex.solve();
thisOplModel.postProcess();
}

Array issue with endbefore start in CPLEX

I am trying to add an endBeforeStart constraint to my constraint programming problem. However, I receive an error saying that the argument of my endBeforeStart is not an array type. I do not understand this, as I almost copied the constraint and data from the sched_seq example in CPLEX; I only changed the data to integers.
What I am trying to accomplish with the constraint is that task 3 and task 1 are performed before task 2 starts.
How can I fix the array error for this constraint?
Please find the relevant parts of my code below:
tuple Precedence {int pre;int post;};
{Precedence} Precedences = {<3,2>,<1,2>};
dvar interval task[j in Jobs] in release..due;
dvar interval opttask[j in Jobs][m in Machines] optional size duration[j][m];
dvar sequence tool[m in Machines] in all(j in Jobs) opttask[j][m]
dexpr int makespan = max(j in Jobs, m in Machines)(endOf(opttask[j][m]));
minimize makespan;
subject to {
// Each job needs one unary resource of the alternative set s (28)
forall(j in Jobs){
alternative(task[j], all(m in Machines) opttask[j][m]);
}
// No overlap on machines
forall(j in Jobs)
forall(p in Precedences)
endBeforeStart(opttask[j][p.pre],opttask[j][p.post]);
forall(m in Machines){
noOverlap(tool[m],transitionTimes);
}
};
execute {
writeln(task);
dat.
nbMachines = 2;
nbJobs = 3;
duration = [
[5,6],
[4,4],
[5,8]
];
release = 1;
due = 30;
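There are several errors in your model: the ranges are off, and the indices are inverted. opttask is indexed as [job][machine], but your constraint uses opttask[j][p.pre], which puts a job id in the machine position.
Also, next time, please post a complete program that shows the problem rather than a partial one; this may help you get answers more quickly.
A corrected program: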
using CP;
int nbMachines = 2;
int nbJobs = 3;
range Machines = 0..nbMachines-1;
range Jobs = 0..nbJobs-1;
int duration[Jobs][Machines] = [
[5,6],
[4,4],
[5,8]
];
int release = 1;
int due = 30;
tuple Precedence {int pre;int post;};
{Precedence} Precedences = {<2,1>,<0,1>};
dvar interval task[j in Jobs] in release..due;
dvar interval opttask[j in Jobs][m in Machines] optional size duration[j][m];
dvar sequence tool[m in Machines] in all(j in Jobs) opttask[j][m];
dexpr int makespan = max(j in Jobs, m in Machines)(endOf(opttask[j][m]));
minimize makespan;
subject to {
// Each job needs one unary resource of the alternative set s (28)
forall(j in Jobs){
alternative(task[j], all(m in Machines) opttask[j][m]);
}
// Precedences: on each machine, p.pre must end before p.post starts
forall(m in Machines)
forall(p in Precedences)
endBeforeStart(opttask[p.pre][m],opttask[p.post][m]);
};
execute {
writeln(task);
}
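You must have values in p.pre or p.post that are outside of the array indexing range. In opttask[j][p.pre], p.pre is used as a machine index, but with nbMachines = 2 there is no machine 3, and Precedences contains the value 3.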
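
A quick way to catch this kind of mistake before it reaches the solver is to check the precedence values against the range of the dimension they will index. The Python sketch below does that for the data in the question; the helper name `out_of_range` is hypothetical, and the 1-based ranges are an assumption carried over from the dat file (nbMachines = 2, nbJobs = 3).

```python
JOBS = range(1, 4)       # assumed 1..nbJobs with nbJobs = 3
MACHINES = range(1, 3)   # assumed 1..nbMachines with nbMachines = 2
PRECEDENCES = [(3, 2), (1, 2)]  # <pre, post> pairs from the question


def out_of_range(precedences, index_range):
    """Return the precedence values that are not valid indices in index_range."""
    return sorted({v for pair in precedences for v in pair if v not in index_range})


if __name__ == "__main__":
    # opttask[j][p.pre] uses p.pre as a machine index: 3 is not a machine.
    print(out_of_range(PRECEDENCES, MACHINES))
    # opttask[p.pre][m] uses p.pre as a job index: every value is a valid job.
    print(out_of_range(PRECEDENCES, JOBS))
```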

dispatch_apply leaves one thread "hanging"

I am experimenting with multithreading, following Apple's Concurrency Programming Guide. The multithreaded function (dispatch_apply) that replaces the for loop seems straightforward and works fine with a simple printf statement. However, if the block calls a more CPU-intensive calculation, the program never ends or never executes past dispatch_apply, and one thread (the main thread?) seems stuck at 100%.
#import <Foundation/Foundation.h>
#define THREAD_COUNT 16
unsigned long long longest = 0;
unsigned long long highest = 0;
void three_n_plus_one(unsigned long step);
int main(int argc, const char * argv[]) {
@autoreleasepool {
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_apply(THREAD_COUNT, queue, ^(size_t i) {
three_n_plus_one(i);
});
}
return 0;
}
void three_n_plus_one(unsigned long step) {
unsigned long long start = step;
unsigned long long end = 1000000;
unsigned long long identifier = 0;
while (start <= end) {
unsigned long long current = start;
unsigned long long sequence = 1;
while (current != 1) {
sequence += 1;
if(current % 2 == 0)
current = current / 2;
else {
current = (current * 3) + 1;
}
if (current > highest) highest = current;
}
if (sequence > longest) {
longest = sequence;
identifier = start;
printf("thread %lu, number %llu with %llu steps to one\n", step, identifier, longest);
}
start += (unsigned long long)THREAD_COUNT;
}
}
Still, the loop seems to be finished. From what I understand, this should be fairly straightforward, yet I'm left clueless as to what I'm doing wrong here.
What you're calling step is the index of the loop. It goes from 0 to THREAD_COUNT-1 in your code. Since you assign start to be step, your first iteration tries to compute the Collatz sequence starting at zero. That computes 0/2 == 0, which never reaches 1, so it is an infinite loop.
What you meant to write is:
unsigned long long start = step + 1;
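The effect of starting at zero can be reproduced with a small Python sketch of the same loop. The `max_steps` cap is an addition for this illustration only (the original code has no such guard); it lets the non-terminating case return instead of hanging. The function name `collatz_length` is hypothetical.

```python
def collatz_length(start, max_steps=10_000):
    """Count the terms of the 3n+1 sequence from start down to 1.

    Returns None if 1 is not reached within max_steps iterations; the cap
    exists only so that a start value that never reaches 1 can be shown.
    """
    current = start
    length = 1
    while current != 1:
        if length > max_steps:
            return None
        current = current // 2 if current % 2 == 0 else 3 * current + 1
        length += 1
    return length


if __name__ == "__main__":
    # Index 0 used directly as the start value: 0 // 2 == 0 forever.
    print(collatz_length(0))
    # Index shifted by one, as in the fix: the loop terminates.
    for index in range(4):
        print(index + 1, collatz_length(index + 1))
```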
Calling your block size "THREAD_COUNT" is misleading. The question is not how many threads are created (no threads may be created; that's up to the system). The question is how many chunks to divide the work into.
Note that reading and writing to longest and highest on multiple threads without synchronization is undefined behavior, so this code may do surprising things (particularly when optimized). Don't assume it's limited to getting the wrong values in longest and highest. The optimizer is allowed to assume no other thread touches those values while it runs, and can rearrange code dramatically based on that. But that's not the cause of this particular issue.
As Rob Napier said (+1), the reason one thread is “hanging” is because you have the endless loop when supplying zero, not because of any problem with dispatch_apply (called concurrentPerform in Swift).
But, the more subtle issue (and what makes concurrent code a little less “straightforward”) is that this code is not thread-safe. There are “data races”. You are accessing and mutating highest and longest concurrently from multiple threads. I would encourage using Thread Sanitizer (TSAN) when testing concurrent code, which is pretty good at identifying these data races.
E.g., edit your scheme and temporarily turn on the Thread Sanitizer. Then, when you run, it will warn you about the data races.
You can fix these races by synchronizing your access to these variables. A lock is one simple mechanism. I would also avoid synchronizing within the inner while loop, if you can. In this case, you can even move it out of the outer while loop, too: I might suggest local variables to keep track of the current “longest” sequence, the “highest” value, and the identifier for that longest sequence, and then only compare to and update the shared variables when you are done, outside of both loops.
E.g. perhaps:
- (void)three_n_plus_one:(unsigned long) step {
unsigned long long start = step + 1;
unsigned long long end = 1000000;
unsigned long long tempHighest = start;
unsigned long long tempLongest = 1;
unsigned long long tempIdentifier = start;
while (start <= end) {
unsigned long long current = start;
unsigned long long sequence = 1;
while (current != 1) {
sequence += 1;
if (current % 2 == 0)
current = current / 2;
else {
current = (current * 3) + 1;
}
if (current > tempHighest) tempHighest = current;
}
if (sequence > tempLongest) {
tempLongest = sequence;
tempIdentifier = start;
}
start += (unsigned long long)THREAD_COUNT;
}
[lock lock]; // synchronize updating of shared memory
if (tempHighest > highest) {
highest = tempHighest;
}
if (tempLongest > longest) {
longest = tempLongest;
identifier = tempIdentifier;
}
[lock unlock];
}
I used an NSLock, but use whatever synchronization mechanism you want. The idea is (a) to synchronize all interaction with shared memory; and (b) to reduce the number of synchronizations to a bare minimum. (In this case, a naïve synchronization approach was 200× slower than the above.)
When you are done fixing the data races, you can then turn TSAN off.
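The pattern described in this answer (accumulate in local variables, then merge once under a lock) can also be sketched in Python. This is an illustration of the synchronization structure only, not a port of the Objective-C code: the `Results` class, the `run` helper and the use of `ThreadPoolExecutor` are assumptions made for the sketch, and Python threads will not speed up this CPU-bound loop.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNKS = 16  # number of strided chunks of work, not a thread count


class Results:
    """Shared state; every read or write happens while holding `lock`."""

    def __init__(self):
        self.lock = threading.Lock()
        self.longest = 0
        self.highest = 0
        self.identifier = 0


def three_n_plus_one(step, end, shared):
    start = step + 1  # chunk index 0 must start at 1, never at 0
    local_longest, local_highest, local_identifier = 0, 0, 0
    while start <= end:
        current, sequence = start, 1
        local_highest = max(local_highest, current)
        while current != 1:
            sequence += 1
            current = current // 2 if current % 2 == 0 else 3 * current + 1
            if current > local_highest:
                local_highest = current
        if sequence > local_longest:
            local_longest, local_identifier = sequence, start
        start += CHUNKS
    with shared.lock:  # a single synchronization per chunk
        if local_highest > shared.highest:
            shared.highest = local_highest
        if local_longest > shared.longest:
            shared.longest = local_longest
            shared.identifier = local_identifier


def run(end):
    shared = Results()
    with ThreadPoolExecutor() as pool:
        # list(...) waits for all chunks and re-raises any worker exception
        list(pool.map(lambda step: three_n_plus_one(step, end, shared), range(CHUNKS)))
    return shared.longest, shared.identifier, shared.highest


if __name__ == "__main__":
    print(run(100))
```

Because the shared fields are only touched inside the `with shared.lock` block, the number of lock acquisitions equals the number of chunks, regardless of how many numbers each chunk processes.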

Is write_image atomic? Is it better to use atomic_max?

Full disclosure: I am cross-posting from the Khronos OpenCL forums, since I have not received any reply there so far:
https://community.khronos.org/t/is-write-image-atomic-is-it-better-than-atomic-max/106418
I’m writing a connected components labelling algorithm for images (2d and 3d); I found no existing implementations and decided to write one based on pointer jumping and a “recollection step” (btw: if you are aware of an easy-to-use, production ready connected component labelling let me know).
The “recollection” step kernel pseudocode for 2d images is as follows:
1) global_id = (x,y)
2) read v from img[x,y], decode it to a pair (tx,ty)
3) read v1 from img[tx,ty]
4) do some calculations to extract a boolean value C and a target value T from v1, v, and the neighbours of (x,y) and (tx,ty)
5) IF ( C ) THEN WRITE T INTO (tx,ty).
Q1: All the kernels where “C” is true will compete for writing. Suppose it does not matter which one wins (writes last). I’ve done some tests on an Intel GPU, and (with filtering disabled and clamping enabled) there seems to be no issue at all: write_image seems to be atomic, there is a winning value, and my algorithm converges very fast. Can I safely assume that write_image on “unfiltered” images is atomic?
Q2: What I really need is to write into (tx,ty) the maximum T obtained from each kernel. That would involve using buffers instead of images, doing the clamping myself (or using a larger buffer padded with zeroes), and using atomic_max in each kernel. I did not do this yet out of laziness, since I need to change my code to use a buffer just to test it, but I believe it would be far slower. Am I right?
For completeness, here is my actual kernel (to be optimized, any suggestions welcome!)
```
__kernel void color_components2(/* base image */ __read_only image2d_t image,
/* uint32 */ __read_only image2d_t inputImage1,
__write_only image2d_t outImage1) {
int2 gid = (int2)(get_global_id(0), get_global_id(1));
int x = gid.x;
int y = gid.y;
int lock = 0;
int2 size = get_image_dim(inputImage1);
const sampler_t sampler =
CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;
uint4 base = read_imageui(image, sampler, gid);
uint4 ui4a = read_imageui(inputImage1, sampler, gid);
int2 t = (int2)(ui4a[0] % size.x, ui4a[0] / size.x);
unsigned int m = ui4a[0];
unsigned int n = ui4a[0];
if (base[0] > 0) {
for (int a = -1; a <= 1; a++)
for (int b = -1; b <= 1; b++) {
uint4 tmpa =
read_imageui(inputImage1, sampler, (int2)(t.x + a, t.y + b));
m = max(tmpa[0], m);
uint4 tmpb = read_imageui(inputImage1, sampler, (int2)(x + a, y + b));
n = max(tmpb[0], n);
}
}
if(n > m) write_imageui(outImage1,t,(uint4)(n,0,0,0));
}
```