AMPL: Modeling vehicles to depart "every n hours"

I want to model that departures from a node can only take place in an "every n hours" manner. I've started to model this using two variables: starttime[i,j,k] gives the time at which vehicle k departed i with j as destination, and x[i,j,k] is a binary variable having value 1 if vehicle k drove from i to j, and 0 otherwise. The model is:
maximize maxdrive: sum{i in V, j in V, k in K} traveltime[i,j]*x[i,j,k];

subject to TimeConstraint {k in K}:
    sum{i in V, j in V} (traveltime[i,j]+servicetime[i])*x[i,j,k] <= 1440;

subject to StartTime{i in V, j in V, k in K}:
    starttime[i,j,k] + traveltime[i,j] - 9000 * (1 - x[i,j,k]) <= starttime[j,i,k];

subject to yvar{i in V, j in V}:
    sum{k in K} x[i,j,k] <= maxVisits[i,j];

subject to Constraint1{i in V, j in V, k in K, g in V, h in K}:
    starttime[i,j,k] + TimeInterval[i]*x[i,j,k] <= starttime[i,g,h];
The constraint in question is "Constraint1", where i is the origin node, j the destination node, and k the vehicle. The index g is used to show that the later departure can be to any destination node. TimeInterval corresponds to the intended interval, i.e. if TimeInterval at i is 2 hours, the starttime of the next vehicle to depart from i must be at least 2 hours after the previous departure. The origins correspond to specific products (only available from the given origin node), whereas I want the vehicles not to be bound to a specific origin node - they should be able to jump between nodes to utilize backhauling etc. In other words, I want to impose this constraint without placing restrictions on the vehicles themselves, but rather on the origin nodes.
The objective function "maximize the traveltime" may seem strange, but the objective is really just a placeholder: if the constraints are met, the solution is adequate. Maximizing traveltime is merely an attempt to "force" the x variables to become 1.
The question is: how can I do this? With this formulation, all x[i,j,k] variables disappear from the answer. Without this constraint, some of the binary variables x become 1 and the others 0, and the solution meets the maxVisits requirement; with the constraint, all x variables become 0 and all starttimes become 0 as well. MINTO (the solver) doesn't state that the problem is infeasible either.
Also, how do I separate the vehicles so that the program recognizes this as a comparison between all departures? I would rather not introduce a time dimension, as it would add so many more variables.
EDIT: After trying a new model using a non-linear solver, I've seen some strange results. Specifically, I'm using the limit of 1440 (minutes) as an upper bound on how long a vehicle can operate each day. With the model below, the solution is 0 for every variable, but the starttime for all combinations of i,j,k is 720 (half of 1440). Does anyone have a clue what is causing this solution? How did this constraint remove the link between a starttime greater than 0 and x having to be 1?
subject to StartTimeSelf{i in V, j in V, k in K, g in K, h in V}:
    starttime[i,j,k]*x[i,j,k] + TimeInterval[i]*x[i,j,k] + y[i,k] <= starttime[i,h,g]*x[i,j,k];
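Note that, as posted, Constraint1 also instantiates the case g = j and h = k, i.e. it compares a departure with itself. That instance reduces to TimeInterval[i]*x[i,j,k] <= 0, which on its own forces every x to 0 whenever TimeInterval[i] > 0, and would explain the all-zero solution. A standard linear alternative is a big-M disjunction with an ordering binary that is only active when both departures are actually used. The sketch below shows that idea in Python with gurobipy (the library used in the Mosel question further down); it is an illustration only, and all the data in it (V, K, T, the big-M of 9000, the 1440-minute bound) are assumptions, not the asker's model.

import itertools
import gurobipy as gp
from gurobipy import GRB

V = ["Sto", "Kir", "Bod"]   # nodes (illustrative)
K = [0, 1]                  # vehicles (illustrative)
T = {v: 120 for v in V}     # minimum spacing per origin, in minutes (illustrative)
M = 9000                    # big-M, as in the question

m = gp.Model("spacing")
x = m.addVars(V, V, K, vtype=GRB.BINARY, name="x")
start = m.addVars(V, V, K, lb=0.0, ub=1440.0, name="start")

# For every origin i and every unordered pair of possible departures from i,
# enforce the spacing only if both departures are used; the binary y picks
# which of the two leaves first.
for i in V:
    deps = [(j, k) for j in V if j != i for k in K]
    for (j1, k1), (j2, k2) in itertools.combinations(deps, 2):
        y = m.addVar(vtype=GRB.BINARY, name=f"order_{i}_{j1}{k1}_{j2}{k2}")
        unused = 2 - x[i, j1, k1] - x[i, j2, k2]   # >= 1 relaxes both rows
        m.addConstr(start[i, j1, k1] + T[i]
                    <= start[i, j2, k2] + M * (1 - y) + M * unused)
        m.addConstr(start[i, j2, k2] + T[i]
                    <= start[i, j1, k1] + M * y + M * unused)

The same disjunction can be written in AMPL with an extra binary variable indexed over pairs of departures; the key point is that the spacing constraint must be relaxed unless both x variables equal 1.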

Related

Is it possible to explicitly and minimally index all integer solutions to x + y + z = 2n?

For example, the analogous two-dimensional question, x + y = 2n, is easy to solve: one can simply consider the pairs (i, 2n-i) for i = 1, 2, ..., n and thus index every solution exactly once. We note that there are n such pairs solving x + y = 2n for every fixed positive integer n, so the cardinality of such a set equals n, as expected.
However, when trying to repeat the same approach for x + y + z = 2n, it is not clear to me how (or whether it is possible) to write down a minimal set {(2n-i-j, i, j)} such that varying i and j over particular intervals produces every such triplet exactly once. It can be shown that the number of elements in such a minimal set equals the nearest integer to n^2/3.
It is not hard to see how one can obtain such an indexing with repetitions, or how one can algorithmically remove repetitions, but what I would like to know is whether there is a clean, general construction, as for the x + y = 2n case. Is this possible, or will one always have to artificially restrict certain values of the parameters on the intervals for which they are defined?
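As a concrete illustration of the "restricted parameter ranges" approach (a sketch of my own, not part of the question): ordering the parts as x >= y >= z, looping the smallest part i from 1 to 2n/3 and the middle part j from i to (2n-i)/2 produces every unordered triple exactly once.

def triples(n):
    total = 2 * n
    out = []
    for i in range(1, total // 3 + 1):            # smallest part
        for j in range(i, (total - i) // 2 + 1):  # middle part
            out.append((total - i - j, j, i))     # parts in decreasing order
    return out

# For n = 3 (so 2n = 6): [(4, 1, 1), (3, 2, 1), (2, 2, 2)], i.e. 3 triples,
# matching the nearest integer to n^2/3 = 3.
print(triples(3))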

Indexing variables in sets in Xpress Mosel

I'm trying to solve a linear relaxation of a problem I've already solved with a Python library in order to see if it behaves in the same way in Xpress Mosel.
One of the index sets I'm using is not the typical c=1..n but a set of sets, meaning I've taken the 1..n set and created all possible subsets of it (for example, the set 1..3 creates the set of sets {{1},{2},{3},{1,2},{1,3},{2,3},{1,2,3}}).
In one of my constraints, one of the indexes must run inside each one of those subsets.
The respective code in Python is as follows (using the Gurobi library):
import itertools
from gurobipy import LinExpr, GRB

cluster = [1, 2, 3, 4, 5, 6]
clusters1 = []
for L in range(1, len(cluster) + 1):
    for subset in itertools.combinations(cluster, L):
        clusters1.append(list(subset))

ConstraintA = LinExpr()
ConstraintB = LinExpr()
for i in range(len(nodes)):
    for j in range(len(nodes)):
        if i < j and A[i][j] == 1:
            for l in range(len(clusters1)):
                ConstraintA += z[i, j]
                for h in clusters1[l]:
                    ConstraintB += x[i][h] - x[j][h]
                model.addConstr(ConstraintA, GRB.GREATER_EQUAL, ConstraintB)
                ConstraintA = LinExpr()
                ConstraintB = LinExpr()
In case the code above is confusing (which I suspect it is), the constraint I'm trying to write is:
z(i,j) >= sum_{h in C1} (x(i,h) - x(j,h))   for all C1 in C
where C1 is each of those subsets.
Is there a way to do this in Mosel?
You could use some Mosel code along these lines. However, independently of the language you are using, be aware that the calculated 'set of all subsets' grows very quickly in size with an increasing number of elements in the original set C, so this constraint formulation will not scale up well:
declarations
  C: set of integer
  CS: set of set of integer
  z,x: array(I:range,J:range) of mpvar
end-declarations

C:=1..6
CS:=union(i in C) {{i}}
forall(j in 1..C.size-1)
  forall(s in CS | s.size=j, i in C | i > max(k in s) k) CS+={s+{i}}

forall(s in CS, i in I, j in J) z(i,j) >= sum(h in s) (x(i,h)-x(j,h))
Giving this some more thought, the following version, which works with lists in place of sets, is more efficient (that is, faster):
uses "mmsystem"
declarations
C: set of integer
L: list of integer
CS: list of list of integer
z,x: array(I:range,J:range) of mpvar
end-declarations
C:=1..6
L:=list(C)
qsort(SYS_UP, L) ! Making sure L is ordered
CS:=union(i in L) [[i]]
forall(j in 1..L.size-1)
forall(s in CS | s.size=j, i in L | i > s.last ) CS+=[s+[i]]
forall(s in CS,i in I, j in J) z(i,j) >= sum(h in s) (x(i,h)-x(j,h))
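To make the scaling warning concrete, here is a quick Python check (my own, reusing the question's itertools approach): a base set of size n has 2^n - 1 non-empty subsets, so every extra element doubles the number of constraints.

import itertools

for n in (6, 10, 15):
    base = range(1, n + 1)
    subsets = [s for L in range(1, n + 1)
               for s in itertools.combinations(base, L)]
    assert len(subsets) == 2**n - 1
    print(n, len(subsets))   # 6 -> 63, 10 -> 1023, 15 -> 32767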

Is this O(N) algorithm actually O(logN)?

I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
FOR j = 1 TO f[i]
k = k*10 + i;
My teacher said this is O(N). It seems to me more like an O(log N) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing, but it gets confusing because the integer you are using is named N, while it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is being overloaded as both the specific name and the generic figure of speech.
Suppose we say instead that your number is Z and its digits are counted in the array d and then their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O(size(d)) = O(log Z). This is basically what your second loop is doing in reverse: it executes once for each occurrence of each digit. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined that N is an integer which f counts the digits of, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N must be a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
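As a quick empirical check (a sketch of my own, not from the answers): rebuild a number from its digit counts and count the inner-loop iterations; the count equals the number of digits of N, i.e. roughly log10(N).

import math

def rebuild(N):
    f = [0] * 10
    for ch in str(N):              # tally digit frequencies
        f[int(ch)] += 1
    k, steps = 0, 0
    for i in range(10):            # outer loop: always 10 iterations
        for _ in range(f[i]):      # inner loop: f[i] iterations
            k = k * 10 + i         # rebuilds the digits in sorted order
            steps += 1
    return k, steps

k, steps = rebuild(12321)
print(k, steps, math.floor(math.log10(12321)) + 1)   # 11223 5 5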
I find your explanation a bit confusing, but let's assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense; O(length of N) makes more sense. I'll use n for the length of N.
Then f(i) = the number of times the digit i appears in N, computed by iterating over each digit of N, which makes computing f(i) O(n) (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times per digit, but n times in total (the sum of f(i) over all digits must be n)
It's tempting to say the algorithm is then O(10 + n*f(i)) = O(n^2) (dropping the constants), but f(i) is only computed 10 times, once each time the second loop is entered, so the cost is 10 + n + 10*n = 10 + 11n, which is O(n). If f(i) is an array rather than a function, each lookup is constant time and the total drops to 10 + n, still O(n).
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?

Segment tree - query complexity

I am having problems understanding segment tree complexity. It is clear that if you have an update function which has to change only one node, its complexity will be O(log n).
But I have no idea why the complexity of query(a,b), where (a,b) is the interval that needs to be checked, is O(log n).
Can anyone provide me with intuitive / formal proof to understand this?
There are four cases when querying the interval (x,y):
FIND(R, x, y)   % R is the node
  % Case 1: the node's interval matches the query exactly
  if R.first = x and R.last = y
    return {R}
  % Case 2: the query lies entirely in the left child
  if y <= R.middle
    return FIND(R.leftChild, x, y)
  % Case 3: the query lies entirely in the right child
  if x >= R.middle + 1
    return FIND(R.rightChild, x, y)
  % Case 4: the query spans both children
  P = FIND(R.leftChild, x, R.middle)
  Q = FIND(R.rightChild, R.middle + 1, y)
  return P union Q
Intuitively, each of the first three cases reduces the remaining tree height by 1. Since the tree has height log n, if only the first three cases occur, the running time is O(log n).
For the last case, FIND() divides the problem into two subproblems. However, we assert that this can happen at most once along each branch. After we call FIND(R.leftChild, x, R.middle), we are querying R.leftChild for the interval [x, R.middle]. R.middle is the same as R.leftChild.last. If x >= R.leftChild.middle + 1, then it is Case 3; if x <= R.leftChild.middle, then we will call
FIND ( R.leftChild.leftChild, x, R.leftChild.middle );
FIND ( R.leftChild.rightChild, R.leftChild.middle + 1, R.leftChild.last );
However, the second FIND() simply returns R.leftChild.rightChild.sum (Case 1) and therefore takes constant time, so the problem is not separated into two expensive subproblems (strictly speaking, the problem is separated, but one subproblem takes O(1) time to solve).
Since the same analysis holds on the rightChild of R, we conclude that after case4 happens the first time, the running time T(h) (h is the remaining level of the tree) would be
T(h) <= T(h-1) + c (c is a constant)
T(1) = c
which yields:
T(h) <= c * h = O(h) = O(log n) (since h is the height of the tree)
Hence we end the proof.
This is my first time contributing, so if there are any problems, please kindly point them out and I will edit my answer.
A range query using a segment tree basically involves recursing from the root node. You can think of the entire recursion process as a traversal on the segment tree: any time a recursion is needed on a child node, you are visiting that child node in your traversal. So analyzing the complexity of a range query is equivalent to finding the upper bound for the total number of nodes that are visited.
It turns out that at any arbitrary level, there are at most 4 nodes that can be visited. Since the segment tree has a height of log(n) and that at any level there are at most 4 nodes that can be visited, the upper bound is actually 4*log(n). The time complexity is therefore O(log(n)).
Now we can prove this with induction. The base case is the first level, where only the root node lies, so at most one node is visited there. Since the root has at most two child nodes, at most two nodes can be visited at the second level, which is within the bound of 4.
Now suppose it is true that at an arbitrary level (say level i) we visit at most 4 nodes. We want to show that we will visit at most 4 nodes at the next level (level i+1) as well. If we had visited only 1 or 2 nodes at level i, it's trivial to show that at level i+1 we will visit at most 4 nodes because each node can have at most 2 child nodes.
So let's focus on the assumption that 3 or 4 nodes were visited at level i, and try to show that at level i+1 we can also have at most 4 visited nodes. Now since the range query is asking for a contiguous range, we know that the 3 or 4 nodes visited at level i can be categorized into 3 partitions of nodes: a leftmost single node whose segment range is only partially covered by the query range, a rightmost single node whose segment range is only partially covered by the query range, and 1 or 2 middle nodes whose segment range is fully covered by the query range. Since the middle nodes have their segment range(s) fully covered by the query range, there would be no recursion at the next level; we just use their precomputed sums. We are left with possible recursions on the leftmost node and the rightmost node at the next level, which is obviously at most 4.
This completes the proof by induction. We have proven that at any level at most 4 nodes are visited. The time complexity for a range query is therefore O(log(n)).
An interval of length n can be represented by k canonical nodes, where k = O(log n).
We can prove this based on how the binary number system works.
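Here is a runnable sketch (my own, not from the answers) of the four-case FIND above, instrumented to count visited nodes; on any query the count stays within the 4-per-level bound, i.e. at most 4*log2(n) nodes for n leaves.

import math

def build(a, lo, hi):
    node = {"first": lo, "last": hi}
    if lo == hi:
        node["sum"] = a[lo]
    else:
        mid = (lo + hi) // 2
        node["middle"] = mid
        node["left"] = build(a, lo, mid)
        node["right"] = build(a, mid + 1, hi)
        node["sum"] = node["left"]["sum"] + node["right"]["sum"]
    return node

def find(R, x, y, visited):
    visited[0] += 1
    if R["first"] == x and R["last"] == y:      # Case 1: exact match
        return R["sum"]
    if y <= R["middle"]:                        # Case 2: go left
        return find(R["left"], x, y, visited)
    if x >= R["middle"] + 1:                    # Case 3: go right
        return find(R["right"], x, y, visited)
    return (find(R["left"], x, R["middle"], visited)    # Case 4: split
            + find(R["right"], R["middle"] + 1, y, visited))

a = list(range(1, 17))                  # n = 16 leaves
root = build(a, 0, len(a) - 1)
visited = [0]
print(find(root, 3, 12, visited))       # sum of a[3..12] = 85
print(visited[0], "<=", 4 * math.ceil(math.log2(len(a))))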

AMPL: Model terminals within a destination city

I've encountered a problem to which I have not found any solution while reading the AMPL documentation on sets.
What I want to model is that a city, say Kir, must receive a certain number of deliveries (for instance 9) from another city, for instance Sto. However, these deliveries must arrive in Kir at specific terminals, each terminal being open only for a short window (approximately 2 minutes) each day. The same must be true for the origin node. The route from Sto must be specified from a specific terminal (so the path can be "followed" in the results).
I've started to model this using the "set V within K" operation for sets, but that requires V to be the same set as, or a subset of, K, where K is the set representing the "nodes" (Kir, Sto, and so on) and V is the set of terminal names ("Terminal1", "Terminal2", etc.).
I've also started to look at, for instance, "set K dimension 4", defined as:
set K dimension 4;

data;
set K :=
  Sto Kir Terminal1 Terminal2
  Bod Kir Terminal3 Terminal2;
Here set K represents from which city (for example Sto) a delivery should be driven (to, for example, Kir), where the departing terminal in Sto is Terminal1 and the delivering terminal in Kir is Terminal2. This has the downside of having to specify a large number of combinations manually (there are approximately 22 terminals in Kir alone, etc.). I don't know how to model the constraints then, either. For instance, the "one dimension" set I've previously had:
subject to yvar{i in V, j in V}:
    sum{k in H} x[i,j,k] <= maxVisits[i,j];
where V is the set of cities alone, H is the set of vehicles, maxVisits represents the maximum number of deliveries from city i to city j, and x is 1 if a delivery is made from i to j using vehicle k. I don't understand how this could be modeled using the four-dimensional set K.
One way to model this is to index x over K and H and change the summation to include terminals:
var x{K, H} binary;

subject to yvar{i in V, j in V}:
    sum{(i,j,t,u) in K, k in H} x[i,j,t,u,k] <= maxVisits[i,j];
The indexing (i,j,t,u) in K in the summation will iterate over pairs of terminals that are endpoints of routes from city i to city j. Note that i and j are fixed here because they are defined in the constraint indexing {i in V, j in V}.
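To see what the indexing (i,j,t,u) in K does inside the summation, here is a tiny Python illustration (my own, with made-up data matching the example above): for fixed cities i and j, it iterates only over the terminal pairs that occur with them in the four-dimensional set.

K = {("Sto", "Kir", "Terminal1", "Terminal2"),
     ("Bod", "Kir", "Terminal3", "Terminal2")}

def terminal_pairs(K, i, j):
    # the pairs (t, u) such that (i, j, t, u) is a member of K
    return [(t, u) for (a, b, t, u) in K if a == i and b == j]

print(terminal_pairs(K, "Sto", "Kir"))   # [('Terminal1', 'Terminal2')]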