Transportation cost optimisation using OMPR for a large data set

I am solving a transport optimization problem given a set of constraints.
The following are the three key data sets that I have:
#demand file
demand - has the demand (DEMAND) across 4821 demand points (DPP), where DPP combines the sale point (D) and the product (PP)
head(demand)
D PP DEMAND DPP
1 ADILABAD (V) - T:11001 OPC:PACK 131.00 ADILABAD (V) - T:11001:OPC:PACK
2 ADILABAD (V) - T:13003 OPC:PACK 235.00 ADILABAD (V) - T:13003:OPC:PACK
3 ADILABAD (V) - T:2006 PPC:PACK 30.00 ADILABAD (V) - T:2006:PPC:PACK
4 ADILABAD (V) - T:4001 OPC:PACK 30.00 ADILABAD (V) - T:4001:OPC:PACK
5 ADILABAD (V) - T:7006 OPC:NPACK 34.84 ADILABAD (V) - T:7006:OPC:NPACK
6 AHMEDABAD:1001 OPC:PACK 442.10 AHMEDABAD:1001:OPC:PACK
#Capacity file
cc - has the capacity constraints (MinP, MaxP) across 1823 sources (SOURCE)
head(cc,4)
SOURCE MinP MaxP
1 CHILAMKUR:P:OPC:NPACK:0:R 900 10806
2 CHILAMKUR:P:OPC:NPACK:0:W 900 10806
3 CHILAMKUR:P:OPC:PACK:0:R 5628 67536
4 CHILAMKUR:P:OPC:PACK:0:W 5628 67536
#LandingCost file
LCMat - This is a matrix with the landing cost to deliver the product to each demand location (DPP) from a given source (SOURCE). It is a 1823 x 4821 matrix. Since landing costs do not exist from every source to every location, I have replaced the missing entries with a huge cost (10^6).
I am using the OMPR package in R to optimize shipping material to meet the demand.
This is potentially a very simple transport problem, but it is taking a lot of time. I am using a 16 GB RAM machine.
The following is the code. Could anyone guide me on what I should do better?
a = Sys.time()
grid = expand.grid(i = 1:nrow(LCMat), j = 1:ncol(LCMat))
grid_solve = grid[which(LCMat < 10^6), ]
grid_notsolve = grid[which(LCMat >= 10^6), ]
model <- MILPModel() %>%
  add_variable(x[grid$i, grid$j], lb = 0, type = "continuous") %>%
  add_constraint(x[grid_notsolve$i, grid_notsolve$j] == 0) %>%
  add_constraint(sum_over(x[i, j], i = 1:nrow(LCMat)) <= demand$DEMAND[j], j = 1:ncol(LCMat)) %>%
  add_constraint(sum_over(x[i, j], j = 1:ncol(LCMat)) <= cc$MaxP[i], i = 1:nrow(LCMat)) %>%
  add_constraint(sum_over(x[i, j], j = 1:ncol(LCMat)) >= cc$MinP[i], i = 1:nrow(LCMat)) %>%
  set_objective(sum_expr(LCMat[grid_solve$i, grid_solve$j] * x[grid_solve$i, grid_solve$j]), "min")
solution = model %>% solve_model(with_ROI(solver = "glpk", verbose = TRUE))
Sys.time() - a

Two options to potentially speed things up:
First, make sure you use the latest CRAN versions of ompr and listcomp.
Second, try to use filter conditions to create only the variables that are relevant to the model, instead of adding all nrow(LCMat) * ncol(LCMat) variables and then constraining (potentially) a lot of them to 0. Depending on how sparse your problem is, that can help as well. See the code below for an example.
The following code takes a sparse matrix (i.e. a matrix with many 0 elements, or 10^6 elements in your case) and generates x[i,j] variables only for the entries of sparse_matrix that are greater than 0. It hopefully illustrates how to use that feature so you can apply it to your case.
library(ompr)
sparse_matrix <- matrix(
  c(
    1, 0, 0, 1,
    0, 1, 0, 1,
    0, 0, 0, 1,
    1, 0, 0, 0
  ), byrow = TRUE, ncol = 4
)
is_connected <- function(i, j) {
  sparse_matrix[i, j] > 0
}
n <- nrow(sparse_matrix)
m <- ncol(sparse_matrix)
model <- MIPModel() |>
  add_variable(x[i, j], i = 1:n, j = 1:m, is_connected(i, j)) |>
  set_objective(sum_over(x[i, j], i = 1:n, j = 1:m, is_connected(i, j))) |>
  add_constraint(sum_over(x[i, j], i = 1:n, is_connected(i, j)) <= 1, j = 1:m)
variable_keys(model)
#> [1] "x[1,1]" "x[1,4]" "x[2,2]" "x[2,4]" "x[3,4]" "x[4,1]"
extract_constraints(model)
#> $matrix
#> 3 x 6 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 1 . . . . 1
#> [2,] . . 1 . . .
#> [3,] . 1 . 1 1 .
#>
#> $sense
#> [1] "<=" "<=" "<="
#>
#> $rhs
#> [1] 1 1 1
Created on 2022-03-12 by the reprex package (v2.0.1)

Both OMPR and GLPK are slow for large models.
Also, you are duplicating sum_over(x[i,j], j = 1:ncol(LCMat)) in the MinP and MaxP constraints. That leads to more nonzero elements than needed, and I usually try to prevent that, even at the expense of more variables. A sketch of one way to do that follows.
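For instance (a sketch only; the auxiliary variable $s_i$ is introduced here for illustration and is not part of the original model), you can add one "total shipped" variable per source: $s_i = \sum_j x_{ij}$ with $\text{MinP}_i \le s_i \le \text{MaxP}_i$ for all $i$. The long row sum then appears in a single equality constraint, and the capacity limits become simple bounds on $s_i$ instead of two long constraint rows.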

Related

Correlation of error terms in time-series model

I am reading a statistics book where it is mentioned that the attached top plot has no correlation between adjacent residuals, whereas the bottommost has correlation with ρ = 0.9. Can anybody please provide some direction on how to analyze this? Thank you very much for your time.
Correlated errors mean that adjacent residuals are dependent, with strength governed by $\rho$. One simple model is $Y_i = \mu + \rho\,\epsilon_{i-1} + \epsilon_i$ where $\epsilon_i \sim N(0, 1)$ independently for all $i$ (an MA(1) process; the simulation below takes $\mu = 0$). We can verify that the covariance between adjacent data points is $\rho$: $\mathrm{Cov}(Y_i, Y_{i-1}) = \mathrm{Cov}(\rho\,\epsilon_{i-1} + \epsilon_i,\ \rho\,\epsilon_{i-2} + \epsilon_{i-1}) = \mathrm{Cov}(\rho\,\epsilon_{i-1}, \epsilon_{i-1}) = \rho\,\mathrm{Var}(\epsilon_{i-1}) = \rho$. Since $\mathrm{Var}(Y_i) = 1 + \rho^2$, the lag-1 correlation is $\rho/(1+\rho^2)$, which grows with $\rho$. Code to demonstrate appears below:
set.seed(123)
epsilonX <- rnorm(100, 0, 1)
epsilonY <- rnorm(100, 0, 1)
epsilonZ <- rnorm(100, 0, 1)
X <- NULL
Y <- NULL
Z <- NULL
X[1] <- epsilonX[1]
Y[1] <- epsilonY[1]
Z[1] <- epsilonZ[1]
rhoX <- 0
rhoY <- 0.5
rhoZ <- 0.9
for (i in 2:100) {
  X[i] <- rhoX * epsilonX[i-1] + epsilonX[i]
  Y[i] <- rhoY * epsilonY[i-1] + epsilonY[i]
  Z[i] <- rhoZ * epsilonZ[i-1] + epsilonZ[i]
}
param <- par(no.readonly = TRUE)
par(mfrow = c(3, 1))
plot(X, type = 'o', xlab = '', ylab = 'Residual', main = expression(rho*"=0.0"))
abline(0, 0, lty = 2)
plot(Y, type = 'o', xlab = '', ylab = 'Residual', main = expression(rho*"=0.5"))
abline(0, 0, lty = 2)
plot(Z, type = 'o', xlab = '', ylab = 'Residual', main = expression(rho*"=0.9"))
abline(0, 0, lty = 2)
#par(param)
acf(X)
acf(Y)
acf(Z)
Note from the acf plots that the lag-1 correlation is insignificant for ρ = 0, higher for the ρ = 0.5 data (about 0.3), and higher still for the ρ = 0.9 data (about 0.5).
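These values agree with the theoretical lag-1 autocorrelation of this MA(1) model, $\rho_1 = \rho/(1+\rho^2)$: that gives $0.5/1.25 = 0.4$ for $\rho = 0.5$ and $0.9/1.81 \approx 0.50$ for $\rho = 0.9$, and sample estimates from only 100 points scatter around those values.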

How to solve simple linear programming problem with lpSolve

I am trying to maximize the function $a_1x_1 + \cdots +a_nx_n$ subject to the constraints $b_1x_1 + \cdots + b_nx_n \leq c$ and $x_i \geq 0$ for all $i$. For the toy example below, I've chosen $a_i = b_i$, so the problem is to maximize $0x_1 + 25x_2 + 50x_3 + 75x_4 + 100x_5$ given $0x_1 + 25x_2 + 50x_3 + 75x_4 + 100x_5 \leq 100$. Trivially, the maximum value of the objective function should be 100, but when I run the code below I get a solution of 2.5e+31. What's going on?
library(lpSolve)
a <- seq.int(0, 100, 25)
b <- seq.int(0, 100, 25)
c <- 100
optimal_val <- lp(direction = "max",
                  objective.in = a,
                  const.mat = b,
                  const.dir = "<=",
                  const.rhs = c,
                  all.int = TRUE)
optimal_val
The problem is that b is not a proper matrix. Before the lp call, you should do:
b <- seq.int(0, 100, 25)
b <- matrix(b,nrow=1)
That will give you an explicit 1 x 5 matrix:
> b
[,1] [,2] [,3] [,4] [,5]
[1,] 0 25 50 75 100
Now you will see:
> optimal_val
Success: the objective function is 100
Background: by default, R will treat a vector as a single-column matrix:
> matrix(c(1,2,3))
[,1]
[1,] 1
[2,] 2
[3,] 3

Hard Big O complexity for 3 loops

I am trying to calculate the Big O complexity for this code but I keep failing....
I tried to nest sums, or to count the number of steps for each case, like:
i=1: j=1; k=1 (1 step)
i=2: j=1,2; k=1,2,3,4 (4 steps)
...
i=n (writing n = 2^(log n)): j=1,2,4,8,16,...,n; k=1,2,3,...,n^2 (n^2 steps)
and then sum all the steps together. I need help.
for (int i = 1; i <= n; i *= 2)
    for (int j = 1; j <= i; j *= 2)
        for (int k = 1; k <= j*j; k++)
            // code line with complexity O(1)
Let's take a look at the number of times the inner loop runs: j^2. But j steps along in powers of 2 up to i, and i in turn steps in powers of 2 up to n. So let's "draw" a little graphic of the terms of the sum that would give us the total number of iterations:
----  1
 ^    1  4
 |    1  4  16
log2(n)     ...
 |    1  4  16  ...  n^2/16
 v    1  4  16  ...  n^2/16  n^2/4
----  1  4  16  ...  n^2/16  n^2/4  n^2
      |<-------log2(n)------->|
The graphic can be interpreted as follows: each value of i corresponds to a row, and each value of j is a column within that row. The number itself is the number of iterations k goes through. The values of j are the square roots of the numbers, and the values of i are the square roots of the last element in each row. The sum of all the numbers is the total number of iterations.
Looking at the bottom row, the terms of the sum are (2^z)^2 = 2^(2z) for z = 1 ... log2(n). The number of times each term appears in the sum is given by the height of its column. The height for a given term is log2(n) + 1 - z (basically a countdown from log2(n)).
So the final sum is
sum_{z=1}^{log2(n)} 2^(2z) * (log2(n) + 1 - z)
Here is what Wolfram Alpha has to say about evaluating the sum: http://m.wolframalpha.com/input/?i=sum+%28%28log%5B2%2C+n%5D%29+%2B+1+-+z%29%282%5E%282z%29%29%2C+z%3D1+to+log%5B2%2C+n%5D:
C1*n^2 - C2*log(n) - C3
Cutting out all the less significant terms and constants, the result is
O(n^2)
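As a sanity check on the closed form, here is a small brute-force counter (a Python sketch; the loop bounds mirror the code in the question) showing that the iteration count grows quadratically:
def count_iters(n):
    """Count how many times the innermost statement executes."""
    total = 0
    i = 1
    while i <= n:
        j = 1
        while j <= i:
            total += j * j  # the k-loop runs exactly j^2 times
            j *= 2
        i *= 2
    return total

for n in [2**e for e in range(4, 16, 2)]:
    print(n, count_iters(n), count_iters(n) / n**2)
For powers of two, the last column settles near a constant (16/9 ≈ 1.78), consistent with Θ(n^2).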
For the outermost loop:
sum_{i in {1, 2, 4, 8, 16, ...}} 1, i <= n (+)
<=>
sum_{i in {2^0, 2^1, 2^2, ... }} 1, i <= n
Let 2^I = i:
2^I = i <=> e^{I log 2} = i <=> I log 2 = log i <=> I = (log i)/(log 2)
Thus, (+) is equivalent to
sum_{I in {0, 1, ... }} 1, I <= floor((log n)/(log 2)) ~= log n (*)
Second outermost loop:
sum_{j in {1, 2, 4, 8, 16, ...}} 1, j <= i (++)
As above, 2^I = i, and let 2^J = j. Similarly to above,
(++) is equivalent to:
sum_{J in {0, 1, ... }} 1, J <= floor((log (2^I))/(log 2)) = floor(I/(log 2)) ~= I (**)
To recap, the outermost and second-outermost loops have now been reduced to
sum_{I in {0, 1, ... }}^{log n} sum_{J in {0, 1, ...}}^{I} ...
which, if there were no innermost loop, would be O((log n)^2).
The innermost loop is a trivial one, if we can express its largest bound in terms of n.
sum_{k in {1, 2, 3, 4, ...}} 1, k <= j^2 (+)
As above, let 2^J = j and note that j^2 = 2^(2J)
sum_{k in {1, 2, 3, 4, ...}} 1, k <= 2^(2J)
Thus, k is bounded by 2^(2 max(J)) = 2^(2 max(I)) = 2^(2 log2(n)) = n^2 (***)
Combining (*), (**) and (***), the asymptotic complexity of the three nested loops is
O(n^2 log^2 n) (or, O((n log n)^2)).
Note that this bounds every run of the innermost loop by its maximum, n^2, so it is a valid upper bound but not a tight one; summing the geometric series instead, as in the answer above, gives the tight bound Θ(n^2).

Sample without replacement

How to sample without replacement in TensorFlow? Like numpy.random.choice(n, size=k, replace=False) for some very large integer n (e.g. 100k-100M), and smaller k (e.g. 100-10k).
Also, I want it to be efficient and on the GPU, so other solutions like this with tf.py_func are not really an option for me. Anything which would use tf.range(n) or so is also not an option because n could be very large.
This is one way:
n = ...
sample_size = ...
idx = tf.random_shuffle(tf.range(n))[:sample_size]
EDIT:
I had posted the answer above but then read the last line of your post. I don't think there is a good way to do it if you absolutely cannot produce a tensor of size O(n) (numpy.random.choice with replace=False is also implemented as a slice of a permutation). You could resort to a tf.while_loop until you have unique indices:
n = ...
sample_size = ...
idx = tf.zeros([sample_size], dtype=tf.int64)
idx = tf.while_loop(
    # keep redrawing while idx contains duplicate values
    lambda idx: tf.size(tf.unique(idx)[0]) < tf.size(idx),
    lambda idx: tf.random_uniform([sample_size], maxval=n, dtype=tf.int64),
    [idx])[0]
EDIT 2:
About the average number of iterations in the previous method: if we call n the number of possible values and k the length of the desired vector (with k <= n), the probability that an iteration succeeds (i.e. draws k distinct values) is:
p = product((n - (i - 1)) / n for i in 1 .. k)
Since each iteration can be considered a Bernoulli trial, the average number of trials until the first success is 1 / p (proof here). Here is a function that calculates the average number of trials in Python for some k and n values:
def avg_iter(k, n):
    if k > n or n <= 0 or k < 0:
        raise ValueError()
    avg_it = 1.0
    for p in (float(n) / (n - i) for i in range(k)):
        avg_it *= p
    return avg_it
And here are some results:
+-------+------+----------+
| n | k | Avg iter |
+-------+------+----------+
| 10 | 5 | 3.3 |
| 100 | 10 | 1.6 |
| 1000 | 10 | 1.1 |
| 1000 | 100 | 167.8 |
| 10000 | 10 | 1.0 |
| 10000 | 100 | 1.6 |
| 10000 | 1000 | 2.9e+22 |
+-------+------+----------+
You can see it varies wildly depending on the parameters.
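The blow-up in the last row is predictable. Taking logs and using the expansion $\ln(1-x) \approx -x - x^2/2$ gives $-\ln p = -\sum_{i=1}^{k-1} \ln(1 - i/n) \approx k^2/(2n) + k^3/(6n^2)$, so the expected number of trials $1/p$ grows roughly like $e^{k^2/(2n)}$. For n = 10000 and k = 1000 this gives about $e^{51.7} \approx 3 \times 10^{22}$, in line with the last row of the table.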
It is possible, though, to construct a vector in a fixed number of steps, although the only algorithm I can think of is O(k^2). In pure Python it goes like this:
import random

def sample_wo_replacement(n, k):
    sample = [0] * k
    for i in range(k):
        # i values have already been drawn, so only n - i choices remain
        sample[i] = random.randint(0, n - 1 - i)
    # Shift each draw past the earlier draws to map it to an unused value
    for i, v in reversed(list(enumerate(sample))):
        for p in reversed(sample[:i]):
            if v >= p:
                v += 1
        sample[i] = v
    return sample

random.seed(100)
print(sample_wo_replacement(10, 5))
# [2, 8, 9, 7, 1]
print(sample_wo_replacement(10, 10))
# [6, 5, 8, 4, 0, 9, 1, 2, 7, 3]
This is a possible way to do it in TensorFlow (not sure if the best one):
import tensorflow as tf

def sample_wo_replacement_tf(n, k):
    # First loop
    sample = tf.constant([], dtype=tf.int64)
    i = 0
    sample, _ = tf.while_loop(
        lambda sample, i: i < k,
        # This is ugly but I did not want to define more functions
        lambda sample, i: (tf.concat([sample,
                                      tf.random_uniform([1], maxval=tf.cast(n - tf.shape(sample)[0], tf.int64),
                                                        dtype=tf.int64)],
                                     axis=0),
                           i + 1),
        [sample, i], shape_invariants=[tf.TensorShape((None,)), tf.TensorShape(())])
    # Second loop
    def inner_loop(sample, i):
        sample_size = tf.shape(sample)[0]
        v = sample[i]
        j = i - 1
        v, _ = tf.while_loop(
            lambda v, j: j >= 0,
            lambda v, j: (tf.cond(v >= sample[j], lambda: v + 1, lambda: v), j - 1),
            [v, j])
        return (tf.where(tf.equal(tf.range(sample_size), i), tf.tile([v], (sample_size,)), sample), i - 1)
    i = tf.shape(sample)[0] - 1
    sample, _ = tf.while_loop(lambda sample, i: i >= 0, inner_loop, [sample, i])
    return sample
And an example:
with tf.Graph().as_default(), tf.Session() as sess:
    tf.set_random_seed(100)
    sample = sample_wo_replacement_tf(10, 5)
    for i in range(10):
        print(sess.run(sample))
# [3 0 6 8 4]
# [5 4 8 9 3]
# [1 4 0 6 8]
# [8 9 5 6 7]
# [7 5 0 2 4]
# [8 4 5 3 7]
# [0 5 7 4 3]
# [2 0 3 8 6]
# [3 4 8 5 1]
# [5 7 0 2 9]
This is quite intensive on tf.while_loops, though, which are well known not to be particularly fast in TensorFlow, so I don't know how fast you can really get with this method without some kind of benchmarking.
EDIT 4:
One last possible method. You can divide the range of possible values (0 to n) in "chunks" of size c and pick a random amount of numbers from each chunk, then shuffle everything. The amount of memory that you use is limited by c, and you don't need nested loops. If n is divisible by c, then you should get about a perfect random distribution, otherwise values in the last "short" chunk would receive some extra probability (this may be negligible depending on the case). Here is a NumPy implementation. It is somewhat long to account for different corner cases and pitfalls, but if c ≥ k and n mod c = 0 several parts get simplified.
import numpy as np

def sample_chunked(n, k, chunk=None):
    chunk = chunk or n
    last_chunk = chunk
    parts = n // chunk
    # Distribute k among chunks
    max_p = min(float(chunk) / k, 1.0)
    max_p_last = max_p
    if n % chunk != 0:
        parts += 1
        last_chunk = n % chunk
        max_p_last = min(float(last_chunk) / k, 1.0)
    p = np.full(parts, 2)
    # Iterate until a valid distribution is found
    while not np.isclose(np.sum(p), 1) or np.any(p > max_p) or p[-1] > max_p_last:
        p = np.random.uniform(size=parts)
        p /= np.sum(p)
    dist = (k * p).astype(np.int64)
    sample_size = np.sum(dist)
    # Account for rounding errors
    while sample_size < k:
        i = np.random.randint(len(dist))
        while (dist[i] >= chunk) or (i == parts - 1 and dist[i] >= last_chunk):
            i = np.random.randint(len(dist))
        dist[i] += 1
        sample_size += 1
    while sample_size > k:
        i = np.random.randint(len(dist))
        while dist[i] == 0:
            i = np.random.randint(len(dist))
        dist[i] -= 1
        sample_size -= 1
    assert sample_size == k
    # Generate sample parts
    sample_parts = []
    for i, v in enumerate(np.nditer(dist)):
        if v <= 0:
            continue
        c = chunk if i < parts - 1 else last_chunk
        base = chunk * i
        sample_parts.append(base + np.random.choice(c, v, replace=False))
    sample = np.concatenate(sample_parts, axis=0)
    np.random.shuffle(sample)
    return sample

np.random.seed(100)
print(sample_chunked(15, 5, 4))
# [ 8  9 12 13  3]
A quick benchmark of sample_chunked(100000000, 100000, 100000) takes about 3.1 seconds on my computer, while I have not been able to run the previous algorithm (the sample_wo_replacement function above) to completion with the same parameters. It should be possible to implement it in TensorFlow, maybe using tf.TensorArray, although it would require significant effort to get it exactly right.
Use the Gumbel-max trick here: https://github.com/tensorflow/tensorflow/issues/9260
z = -tf.log(-tf.log(tf.random_uniform(tf.shape(logits), 0, 1)))
_, indices = tf.nn.top_k(logits + z, K)
indices are what you want. This trick is so easy!
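For intuition on why this works: adding independent Gumbel(0, 1) noise to the logits and taking the argmax samples index i with probability softmax(logits)_i, and taking the top K of the noisy logits extends this to a sample without replacement. Here is a minimal NumPy check of the K = 1 case (the weights w are made up for illustration):
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.5, 0.3, 0.2])  # illustrative target probabilities
logits = np.log(w)

counts = np.zeros_like(w)
for _ in range(100_000):
    g = -np.log(-np.log(rng.uniform(size=w.size)))  # Gumbel(0, 1) noise
    counts[np.argmax(logits + g)] += 1

print(counts / counts.sum())  # should be close to [0.5, 0.3, 0.2]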
The following works fairly fast on the GPU, and I did not encounter memory issues when using n~100M and k~10k (using NVIDIA GeForce GTX 1080 Ti):
def random_choice_without_replacement(n, k):
    """equivalent to 'numpy.random.choice(n, size=k, replace=False)'"""
    return tf.math.top_k(tf.random.uniform(shape=[n]), k, sorted=False).indices

How do I sum the coefficients of a polynomial in Maxima?

I came up with this nice thing, which I am calling 'partition function for symmetric groups'
Z[0]:1;
Z[n]:=expand(sum((n-1)!/i!*z[n-i]*Z[i], i, 0, n-1));
Z[4];
6*z[4]+8*z[1]*z[3]+3*z[2]^2+6*z[1]^2*z[2]+z[1]^4
The sum of the coefficients for Z[4] is 6+8+3+6+1 = 24 = 4!
which I am hoping corresponds to the fact that the group S4 has 6 elements like (abcd), 8 like (a)(bcd), 3 like (ab)(cd), 6 like (a)(b)(cd), and 1 like (a)(b)(c)(d)
So I thought to myself, the sum of the coefficients of Z[20] should be 20!
But life being somewhat on the short side, and fingers giving trouble, I was hoping to confirm this automatically. Can anyone help?
This sort of thing points a way:
Z[20],z[1]=1,z[2]=1,z[3]=1,z[4]=1,z[5]=1,z[6]=1,z[7]=1,z[8]=1;
But really...
I don't know a straightforward way to do that; coeff seems to handle only a single variable at a time. But here's a way to get the result you want. The basic idea is to extract the terms of Z[20] as a list, and then evaluate each term with z[1] = 1, z[2] = 1, ..., z[20] = 1.
(%i1) display2d : false $
(%i2) Z[0] : 1 $
(%i3) Z[n] := expand (sum ((n - 1)!/i!*z[n - i]*Z[i], i, 0, n-1)) $
(%i4) z1 : makelist (z[i] = 1, i, 1, 20);
(%o4) [z[1] = 1,z[2] = 1,z[3] = 1,z[4] = 1,z[5] = 1,z[6] = 1,z[7] = 1, ...]
(%i5) a : args (Z[20]);
(%o5) [121645100408832000*z[20],128047474114560000*z[1]*z[19],
67580611338240000*z[2]*z[18],67580611338240000*z[1]^2*z[18],
47703960944640000*z[3]*z[17],71555941416960000*z[1]*z[2]*z[17], ...]
(%i6) a1 : ev (a, z1);
(%o6) [121645100408832000,128047474114560000,67580611338240000, ...]
(%i7) apply ("+", a1);
(%o7) 2432902008176640000
(%i8) 20!;
(%o8) 2432902008176640000