Using Google OR tools for crew scheduling - optimization

I'm searching for a proper way to use Google OR tools for solving an annual ship crew scheduling problem. I tried to follow the scheduling problem examples provided, but I couldn't find a way to set the 4-dimensional decision variable I need (D[i,j,k,t], with i for captains, j for engineers, k for ships and t for time periods (days or weeks)).
Although there are many examples given (for C#), the major problem I faced is how to set and utilize this main decision variable, and how to use the Decision Builder, since in all examples the variables had 2 dimensions and were flattened in order to make the comparisons. Unfortunately, I haven't found a way to use smaller D-variables, since the penalty score (this is a minimize-penalty problem) is estimated from the possible pairings of Captains-Engineers, Captains-Ships, and Engineers-Ships.

Why don't you create your 4D array, and then fill it with variables one by one?
Here is the code for matrices:
public IntVar[,] MakeIntVarMatrix(int rows, int cols, long min, long max) {
IntVar[,] array = new IntVar[rows, cols];
for (int i = 0; i < rows; ++i) {
for (int j = 0; j < cols; ++j) {
array[i,j] = MakeIntVar(min, max);
}
}
return array;
}
This being said, please use the CP-SAT solver as the original CP solver is deprecated.
For an introduction, see:
https://developers.google.com/optimization/cp/cp_solver#cp-solver_example/
https://github.com/google/or-tools/tree/master/ortools/sat/doc
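The nested-loop idea above extends to four dimensions in the same way. Here is a minimal Python sketch, assuming a hypothetical `make_var` factory that stands in for whatever call creates one solver variable (with CP-SAT in Python that would typically be something like `model.NewBoolVar(name)`). The helper names `make_var_4d` and `slice_ship_period` and the sizes are illustrative only, not part of OR-Tools:

```python
import itertools

def make_var_4d(n_captains, n_engineers, n_ships, n_periods, make_var):
    """Build D[i, j, k, t] as a dict keyed by index tuples.

    make_var is a hypothetical factory: it receives a name and returns
    one decision variable from whichever solver you use.
    """
    return {
        (i, j, k, t): make_var(f"D_{i}_{j}_{k}_{t}")
        for i, j, k, t in itertools.product(
            range(n_captains), range(n_engineers),
            range(n_ships), range(n_periods))
    }

def slice_ship_period(D, k, t):
    """All variables for ship k in period t (every captain/engineer pair)."""
    return [var for (i, j, kk, tt), var in D.items() if kk == k and tt == t]

# Stand-in factory (returns the name itself) so the sketch runs
# without a solver installed.
D = make_var_4d(2, 3, 2, 4, make_var=lambda name: name)
print(len(D))                      # 2 * 3 * 2 * 4 = 48 variables
print(slice_ship_period(D, 0, 0))  # the 6 captain/engineer pairs for ship 0, period 0
```

With a real model, a constraint such as "each ship gets exactly one captain/engineer pair per period" would be a sum over `slice_ship_period(D, k, t)`, so no flattening is needed.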


Determining growth function and Big O

Before anyone asks, yes this was a previous test question I got wrong and knew I got wrong because I honestly just don't understand growth functions and Big O. I've read the technical definition, I know what they are but not how to calculate them. My textbook gives examples off of real-life situations, but I still find it hard to interpret code. If someone can tell me their thought process on how they determine these, that would seriously help. (i.e. this section of code tells me to multiply n by x, etc, etc).
public static int sort(int lowI, int highI, int nums[]) {
int i = lowI;
int j = highI;
int pivot = nums[lowI +(highI-lowI)/2];
int counter = 0;
while (i <= j) {
while (nums[i] < pivot) {
i++;
counter++;
}
while (nums[j] > pivot) {
j--;
counter++;
}
counter++;
if (i <= j) {
NumSwap(i, j, nums); //saves i to temp and makes i = j, j = temp
i++;
j--;
}
}
if(lowI< j)
{
return counter + sort(lowI, j, nums);
}
if(i < highI)
{
return counter + sort(i, highI, nums);
}
return counter;
}
It might help for you to read some explanations of Big-O. I think of Big-O as the number of "basic operations" computed as the "input size" increases. For sorting algorithms, "basic operations" usually means comparisons (or counter increments, in your case), and the "input size" is the size of the list to sort.
When I analyze for runtime, I'll start by mentally dividing the code into sections. I ignore one-off lines (like int i = lowI;) because they're only run once, and Big-O doesn't care about constants (though, note in your case that int i = lowI; runs once with each recursion, so it's not only run once overall).
For example, I'd mentally divide your code into three overall parts to analyze: there's the main while loop while (i <= j), the two while loops inside of it, and the two recursive calls at the end. How many iterations will those loops run for, depending on the values of i and j? How many times will the function recurse, depending on the size of the list?
If I'm having trouble thinking about all these different parts at once, I'll isolate them. For example, how long will one of the inner while loops run for, depending on the values of i and j? Then, how long does the outer while loop run for?
Once I've thought about the runtime of each part, I'll bring them back together. At this stage, it's important to think about the relationships between the different parts. "Nested" relationships (i.e. the nested block loops a bunch of times each time the outer thing loops once) usually mean that the run times are multiplied. For example, since the inner while loops are nested within the outer while loop, the total number of iterations is (inner run time + other inner) * outer. It also seems like the total run time would look something like this - ((inner + other inner) * outer) * recursions - too.
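One way to check this kind of reasoning is to count the operations empirically and see how the count grows with the input size. The sketch below is a Python port of the question's counting logic, with one assumption that differs from the posted code: it recurses into both halves (the posted code returns after the first recursive call). The function names and the choice of random test data are mine, not the asker's:

```python
import math
import random

def count_sort(nums, low, high):
    """Quicksort nums[low..high] in place and return the operation count."""
    i, j = low, high
    pivot = nums[low + (high - low) // 2]
    counter = 0
    while i <= j:                      # outer loop
        while nums[i] < pivot:         # first inner loop
            i += 1
            counter += 1
        while nums[j] > pivot:         # second inner loop
            j -= 1
            counter += 1
        counter += 1
        if i <= j:
            nums[i], nums[j] = nums[j], nums[i]
            i += 1
            j -= 1
    if low < j:                        # recurse into the left part
        counter += count_sort(nums, low, j)
    if i < high:                       # recurse into the right part
        counter += count_sort(nums, i, high)
    return counter

def measure(n, seed=0):
    """Sort n pseudo-random integers; return the sorted list and the count."""
    rng = random.Random(seed)
    data = [rng.randrange(10 * n) for _ in range(n)]
    return data, count_sort(data, 0, n - 1)

for n in (1_000, 2_000, 4_000, 8_000):
    _, count = measure(n)
    # If the ratio in the last column stays roughly flat as n doubles,
    # the count is growing like n * log2(n) on this kind of input.
    print(n, count, round(count / (n * math.log2(n)), 2))
```

The i and j scans together cover each sub-array about once per call, so one level of recursion costs roughly n steps in total. Multiplying by the number of levels gives the overall growth.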

Variable names: How to abbreviate "index"?

I like to keep the names of my variables short but readable.
Still, when, for example, naming the variable that holds the index of an element in some list, I tend to use elemIndex because I don't know a proper (and universally understood) way of abbreviating the word "index".
Is there a canonical way of abbreviating "index"? Or is it best to spell it in full to avoid misunderstandings?
In my experience it depends on the context. Typically I can tell if something is an index from what it is used for, so I am often more interested in knowing what it is an index of.
My rule of thumb goes roughly like this:
If it is just a loop index in a short loop (e.g.: all fits on screen at once) and the context informs the reader what the index is doing, then you can probably get away with something simple like i.
Example: thresholding an image
//For each pixel in the image, threshold it
for (int i = 0; i < height; i++ ) {
for (int j = 0; j < width; j++) {
if (image[i][j] < 128) {
image[i][j] = 0;
} else {
image[i][j] = 255;
}
}
}
If the code section is larger, or you have multiple indices going on, indicate which list it is an index into:
File[] files_in_dir = ...;
int num_files = files_in_dir.length;
for (int fileIdx = 0; fileIdx < num_files; fileIdx++) { //for each file in dir.
...
}
If, however the index is actually important to the meaning of the code, then specify it fully, for example:
int imageToDeleteIdx = 3; //index of the image to be deleted.
image_list.delete(imageToDeleteIdx);
However code should be considered "write once, read many" and your effort should be allocated as such; i.e.: lots on the writing, so the reading is easy. To this end, as was mentioned by Brad M, never assume the reader understands your abbreviations. If you are going to use abbreviations, at least declare them in the comments.
Stick to established and well known conventions. If you use common conventions, people will have fewer surprises when they read your code.
Programmers are used to using a lot of conventions from mathematics. E.g. in mathematics we typically label indices:
i, j, k
While e.g. coordinates are referred to with letters such as:
x, y, z
This depends of course on context. E.g. using i to denote some global index would be a terrible idea. A good rule of thumb is to use short names for very local variables and longer names for more global functions and variables.
For me this style was influenced by Rob Pike, who elaborates more on this here. As someone with an interest in user interface design and experience I've also written more extensively about this.

Constrained Single-Objective Optimization

Introduction
I need to split an array filled with a certain type (let's take water buckets for example) with two values set (in this case weight and volume), while keeping the difference between the total weights of the two halves to a minimum (preferred) and the difference between their total volumes less than 1000 (required). This doesn't need to be a full-fledged genetic algorithm or something similar, but it should be better than what I currently have...
Current Implementation
Not knowing how to do it better, I started by splitting the array into two same-length arrays (the array can hold an odd number of items), filling a possibly empty spot with an item whose values are both 0. The sides don't need to have the same number of items; I just didn't know how to handle it otherwise.
After having these distributed, I'm trying to optimize them like this:
func (main *Main) Optimize() {
for {
difference := main.Difference(WEIGHT)
for i := 0; i < len(main.left); i++ {
for j := 0; j < len(main.right); j++ {
if main.DifferenceAfter(i, j, WEIGHT) < main.Difference(WEIGHT) {
main.left[i], main.right[j] = main.right[j], main.left[i]
}
}
}
if difference == main.Difference(WEIGHT) {
break
}
}
for main.Difference(CAPACITY) > 1000 {
leftIndex := 0
rightIndex := 0
liters := 0
weight := 100
for i := 0; i < len(main.left); i++ {
for j := 0; j < len(main.right); j++ {
if main.DifferenceAfter(i, j, CAPACITY) < main.Difference(CAPACITY) {
newLiters := main.Difference(CAPACITY) - main.DifferenceAfter(i, j, CAPACITY)
newWeight := main.Difference(WEIGHT) - main.DifferenceAfter(i, j, WEIGHT)
if newLiters > liters && newWeight <= weight || newLiters == liters && newWeight < weight {
leftIndex = i
rightIndex = j
liters = newLiters
weight = newWeight
}
}
}
}
main.left[leftIndex], main.right[rightIndex] = main.right[rightIndex], main.left[leftIndex]
}
}
Functions:
main.Difference(const) calculates the absolute difference between the two sides, the constant taken as an argument decides the value to calculate the difference for
main.DifferenceAfter(i, j, const) simulates a swap between the two buckets, i being the left one and j being the right one, and calculates the resulting absolute difference then, the constant again determines the value to check
Explanation:
Basically this starts by optimizing the weight, which is what the first for-loop does. On every iteration, it tries every possible combination of buckets that can be switched and if the difference after that is less than the current difference (resulting in better distribution) it switches them. If the weight doesn't change anymore, it breaks out of the for-loop. While not perfect, this works quite well, and I consider this acceptable for what I'm trying to accomplish.
Then it's supposed to optimize the distribution based on the volume, so the total difference is less than 1000. Here I tried to be more careful and search for the best combination in a run before switching it. Thus it searches for the bucket switch resulting in the biggest capacity change and is also supposed to search for a trade-off between this and the weight change, though I see the flaw that the first bucket combination tried will set the liters and weight variables, which greatly reduces the set of combinations that can be accepted afterwards.
Conclusion
I think I need to include some more math here, but I'm honestly stuck and don't know how to continue, so I'd like to get some help from you. Basically anything that can help me here is welcome.
As previously said, your problem is actually a constrained optimisation problem with a constraint on your difference of volumes.
Mathematically, this means minimising the difference of weights under the constraint that the difference of volumes is less than 1000. The simplest way to express it as an integer optimisation problem would be:
min |weights . x|
subject to |volumes . x| < 1000.0
for all i, x[i] = +1 or -1
Where a . b is the vector dot product and |.| is the absolute value (which can be linearised with one auxiliary variable). Once this problem is solved, all indices where x = +1 correspond to your first array, all indices where x = -1 correspond to your second array.
Unfortunately, 0-1 integer programming is known to be NP-hard. The simplest way of solving it is an exhaustive brute-force exploration of the space, but that requires testing all 2^n possible vectors x (where n is the length of your original weights and volumes vectors), which can quickly get out of hand. There is a lot of literature on this topic, with more efficient algorithms, but they are often highly specific to a particular set of problems and/or constraints. You can google "linear integer programming" to see what has been done on this topic.
I think the simplest might be to perform a heuristic-based brute force search, where you prune your search tree early when it would get you out of your volume constraint, and stay close to your constraint (as a general rule, the solution of linear optimisation problems are on the edge of the feasible space).
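For small n, the exhaustive search mentioned above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `best_split` and the sample numbers are made up, the volume limit of 1000 comes from the question, and no pruning is done (every sign vector is enumerated), so it is only practical for roughly n <= 20:

```python
import itertools

def best_split(weights, volumes, max_volume_diff=1000):
    """Try every x in {+1, -1}^n and keep the feasible one with the
    smallest |weights . x|. Returns (weight_diff, x) or None if no
    assignment satisfies |volumes . x| < max_volume_diff."""
    n = len(weights)
    if n == 0:
        return (0, ())
    best = None
    # Fix x[0] = +1: flipping every sign gives the same split mirrored,
    # so this halves the search space without losing any solution.
    for rest in itertools.product((1, -1), repeat=n - 1):
        x = (1,) + rest
        volume_diff = abs(sum(v * s for v, s in zip(volumes, x)))
        if volume_diff >= max_volume_diff:
            continue  # violates the required volume constraint
        weight_diff = abs(sum(w * s for w, s in zip(weights, x)))
        if best is None or weight_diff < best[0]:
            best = (weight_diff, x)
    return best

# Made-up sample data: four buckets.
result = best_split([5, 3, 2, 4], [100, 100, 100, 100])
if result is not None:
    weight_diff, x = result
    left = [i for i, s in enumerate(x) if s == 1]
    right = [i for i, s in enumerate(x) if s == -1]
    print(weight_diff, left, right)
```

A pruned version would build x one element at a time and abandon a partial assignment as soon as the remaining volumes can no longer bring the difference back under the limit.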
Here are a couple of articles you might want to read on this kind of optimisations:
UCLA Linear integer programming
MIT course on Integer programming
Carleton course on Binary programming
Articles on combinatorial optimisation & linear integer programming
If you are not familiar with optimisation articles or math in general, the Wikipedia articles provide a good introduction, but most articles on this topic quickly show some (pseudo)code you can adapt right away.
If your n is large, I think at some point you will have to make a trade off between how optimal your solution is and how fast it can be computed. Your solution is probably suboptimal, but it is much faster than the exhaustive search. There might be a better trade off, depending on the exact configuration of your problem.
It seems that in your case, the difference of weight is the objective, while the difference of volume is just a constraint. That means you are seeking solutions that optimize the weight difference (as small as possible) and satisfy the condition on the volume difference (total < 1000). In this case, it's a single-objective constrained optimization problem.
Whereas, if you are interested in multi-objective optimization, maybe you wanna look at the concept of Pareto Frontier: http://en.wikipedia.org/wiki/Pareto_efficiency . It's good for keeping multiple good solutions with advantages in different objective, i.e., not losing diversity.

What naming conventions should I use on the second integer on a nested for loop?

I'm pretty new to programming, and I was just wondering in the following case what would be an appropriate name for the second integer I use in this piece of code
for (int i = 0; i < 10; i++)
{
for (int x = 0; x < 10; x++)
{
//stuff
}
}
I usually just name it x but I have a feeling that this could get confusing quickly. Is there a standard name for this kind of thing?
Depending upon what you're iterating over, a name might be easy or obvious by context:
for (struct mail *mail=inbox->start; mail; mail++) {
for (struct attachment *att=mail->attachment[0]; att; att++) {
/* work on all attachments on all mails */
}
}
For the cases where i makes the most sense for an outer loop variable, convention uses j, k, l, and so on.
But when you start nesting, look harder for meaningful names. You'll thank yourself in six months.
You could opt to reduce the nesting by making a method call. Inside of this method, you would be using a local variable also named i.
for (int i = 0; i < 10; i++)
{
methodCall(array[i], array);
}
I have assumed you need to pass the element at position i in the outer loop as well as the array to be iterated over in the inner loop - this is an assumption as you may actually require different arguments.
As always, you should measure the performance of this - there shouldn't be a massive overhead in making a method call within a loop, but this depends on the language.
Personally I feel that you should give variables meaningful names - here i and x mean nothing and will not help you understand your code in 3 months time, at which point it will appear to you as code written by a dyslexic monkey.
Name variables so that other people can understand what your code is trying to accomplish. You will save yourself time in the long run.
Since you said you are beginning, I'd say it's beneficial to experiment with multiple styles.
For the purposes of your example, my suggestion is simply replace x with j.
There's tons of real code that will use the convention of i, j, and k for single letter nested loop variables.
There's also tons that uses longer more meaningful names.
But there's much less that looks like your example.
So you can consider it a step forward because your code looks more like real-world code.

What are the practical uses of modulus (%) in programming? [duplicate]

What are the practical uses of modulus? I know what modulo division is. The first scenarios that come to my mind are finding odd and even numbers, and clock arithmetic. But where else could I use it?
The most common use I've found is for "wrapping round" your array indices.
For example, if you just want to cycle through an array repeatedly, you could use:
int a[10];
for (int i = 0; true; i = (i + 1) % 10)
{
// ... use a[i] ...
}
The modulo ensures that i stays in the [0, 10) range.
I usually use them in tight loops, when I have to do something every X loops as opposed to on every iteration.
Example:
int i;
for (i = 1; i <= 1000000; i++)
{
do_something(i);
if (i % 1000 == 0)
printf("%d processed\n", i);
}
One use for the modulus operation is when making a hash table. It's used to convert the value out of the hash function into an index into the array. (If the hash table size is a power of two, the modulus could be done with a bit-mask, but it's still a modulus operation.)
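A minimal Python sketch of that mapping, using the built-in `hash` as a stand-in hash function; the function names and the table size of 16 are illustrative assumptions:

```python
def bucket_index(key, num_buckets):
    """Map a key to a bucket with the modulus operator."""
    return hash(key) % num_buckets

def bucket_index_pow2(key, num_buckets):
    """Same mapping with a bit-mask; only valid when num_buckets
    is a power of two."""
    return hash(key) & (num_buckets - 1)

NUM_BUCKETS = 16  # a power of two, so both functions agree
for key in ("apple", "pear", 42, -7, (1, 2)):
    idx = bucket_index(key, NUM_BUCKETS)
    # Both approaches always land on the same bucket here.
    print(repr(key), idx, idx == bucket_index_pow2(key, NUM_BUCKETS))
```

In Python both `%` with a positive modulus and `&` with a non-negative mask give a non-negative result even for negative hashes. In languages where `%` can return a negative remainder, the index needs an extra adjustment.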
To print a number as string, you need the modulus to find the value of a digit.
string number_to_string(uint number) {
string result = "";
while (number != 0) {
result = cast(char)((number % 10) + '0') ~ result;
// ^^^^^^^^^^^
number /= 10;
}
return result;
}
For the control number of international bank account numbers, the mod97 technique.
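As an illustration, the mod-97 check can be sketched in Python as follows. The function name is my own, and this sketch only verifies the check digits; it does not validate country-specific lengths or formats:

```python
def iban_is_valid(iban):
    """Check the mod-97 control number of an IBAN (check digits only)."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    # Move the country code and check digits (first 4 chars) to the end.
    rearranged = s[4:] + s[:4]
    # Replace each letter by two digits: A=10, B=11, ..., Z=35.
    # int(ch, 36) does exactly that and leaves digits unchanged.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    # The resulting (very large) integer must leave remainder 1.
    return int(digits) % 97 == 1

print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))  # widely used example IBAN
print(iban_is_valid("GB83 WEST 1234 5698 7654 32"))  # check digits altered
```

Python's arbitrary-precision integers make the single `% 97` possible. In languages with fixed-width integers the remainder is usually computed chunk by chunk.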
Also in large batches to do something after n iterations. Here is an example for NHibernate:
ISession session = sessionFactory.OpenSession();
ITransaction tx = session.BeginTransaction();
for ( int i=0; i<100000; i++ ) {
Customer customer = new Customer(.....);
session.Save(customer);
if ( i % 20 == 0 ) { //20, same as the ADO batch size
//Flush a batch of inserts and release memory:
session.Flush();
session.Clear();
}
}
tx.Commit();
session.Close();
The usual implementation of buffered communications uses circular buffers, and you manage them with modulus arithmetic.
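A minimal Python sketch of such a circular buffer, where every index update wraps around with the modulus. The class name, the overwrite-oldest policy when full, and the method names are my own assumptions for illustration:

```python
class RingBuffer:
    """Fixed-size FIFO buffer; when full, a push overwrites the oldest item."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._data = [None] * capacity
        self._head = 0   # index of the oldest stored item
        self._size = 0   # number of stored items

    def push(self, item):
        cap = len(self._data)
        tail = (self._head + self._size) % cap   # wrap the write position
        self._data[tail] = item
        if self._size == cap:
            self._head = (self._head + 1) % cap  # oldest item was overwritten
        else:
            self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty RingBuffer")
        item = self._data[self._head]
        self._head = (self._head + 1) % len(self._data)  # wrap the read position
        self._size -= 1
        return item

buf = RingBuffer(3)
for value in (1, 2, 3, 4):   # the fourth push overwrites the 1
    buf.push(value)
print(buf.pop(), buf.pop(), buf.pop())  # prints: 2 3 4
```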
For languages that don't have bitwise operators, modulus can be used to get the lowest n bits of a number. For example, to get the lowest 8 bits of x:
x % 256
which, for non-negative x, is equivalent to:
x & 255
Cryptography. That alone would account for an obscene percentage of modulus (I exaggerate, but you get the point).
Try the Wikipedia page too:
Modular arithmetic is referenced in number theory, group theory, ring theory, knot theory, abstract algebra, cryptography, computer science, chemistry and the visual and musical arts.
In my experience, any sufficiently advanced algorithm is probably going to touch on one or more of the above topics.
Well, there are many perspectives from which you can look at it. If you look at it as a mathematical operation, then it's just a modulo division. Strictly speaking we don't even need it, since whatever % does can also be achieved with repeated subtraction, but every programming language implements it in a very optimized way.
And modulo division is not limited to finding odd and even numbers or clock arithmetic. There are hundreds of algorithms which need the modulo operation, for example cryptography algorithms. So it's a general mathematical operation like +, -, *, /, etc.
Apart from the mathematical perspective, different languages use the % symbol for built-in data structures; in Perl, for example, %hash shows that the programmer declared a hash. So it all varies based on the programming language design.
So there are still a lot of other perspectives one can add to the list of uses of %.