Model exists relation in a linear program (PuLP) - optimization

I'm trying to implement an exists relation as part of an LP using PuLP. I'd like to check whether there is a j for which x_ij + x_kj = 2.
for i in range(g):
    for k in range(g):
        prob += lp.lpSum((x[(i, j)] + x[(k, j)] == 2) for j in range(t)) == y[(i, k)]
The variables are all binary. I tried to model it as a sum of boolean expressions, intending y_ik to end up as 1 if such a j exists and 0 otherwise. However, this approach doesn't work.
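A standard way to linearize this kind of exists/AND relation is to add an auxiliary binary variable z_ikj that equals 1 exactly when x_ij = x_kj = 1, and then tie y_ik to the z variables. Below is a minimal sketch in PuLP; the auxiliary variable z, the sizes of g and t, and the problem sense are illustrative assumptions, not part of the original model:
import pulp as lp

g, t = 4, 5  # illustrative sizes

prob = lp.LpProblem("exists_demo", lp.LpMinimize)  # sense is illustrative
x = lp.LpVariable.dicts("x", [(i, j) for i in range(g) for j in range(t)], cat="Binary")
y = lp.LpVariable.dicts("y", [(i, k) for i in range(g) for k in range(g)], cat="Binary")
# z[(i, k, j)] = 1 exactly when x[(i, j)] = x[(k, j)] = 1 (logical AND)
z = lp.LpVariable.dicts(
    "z", [(i, k, j) for i in range(g) for k in range(g) for j in range(t)], cat="Binary"
)

for i in range(g):
    for k in range(g):
        for j in range(t):
            prob += z[(i, k, j)] <= x[(i, j)]
            prob += z[(i, k, j)] <= x[(k, j)]
            prob += z[(i, k, j)] >= x[(i, j)] + x[(k, j)] - 1
            prob += y[(i, k)] >= z[(i, k, j)]  # any matching j forces y = 1
        # no matching j forces y = 0
        prob += y[(i, k)] <= lp.lpSum(z[(i, k, j)] for j in range(t))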

What is the time complexity of the following algorithm in Big Theta notation?

res = 0
for i in range(1, n):
    j = i
    while j % 2 == 0:
        j = j / 2
        res = res + j
I understand that an upper bound is O(n log n); however, I'm wondering if it's possible to find a tighter bound. I'm stuck with the analysis.
Some ideas that may be helpful:
You could create a function g(n) that annotates your function f(n) to count how many operations occur when running f(n):
def f(n):
    res = 0
    for i in range(1, n):
        j = i
        while j % 2 == 0:
            j = j / 2
            res = res + j
    return res
def g(n):
    comparisons = 0
    operations = 0
    assignments = 0
    assignments += 1
    res = 0
    assignments += 1  # i = 1
    comparisons += 1  # i < n
    for i in range(1, n):
        assignments += 1
        j = i
        operations += 1
        comparisons += 1
        while j % 2 == 0:
            operations += 1
            assignments += 1
            j = j / 2
            operations += 1
            assignments += 1
            res = res + j
            operations += 1
            comparisons += 1
        operations += 1   # i + 1
        assignments += 1  # assign to i
        comparisons += 1  # i < n ?
    return operations + comparisons + assignments
For n = 1, the code runs without hitting any loops: assigning the value of res; assigning i as 1; comparing i to n and skipping the loop as a result.
For n > 1, you get into the for loop, and the for statement is all that changes the loop variable, so the loop bookkeeping alone makes the complexity at least O(n).
Once in the loop:
if i is odd, then you only assign j, perform the mod operation, and compare to zero. That is the case for half the values of i, so each such pass through the loop adds only a small, fixed number of operations (including the loop bookkeeping). So that's still O(n), just with a larger constant.
if i is even, then we divide by 2 until it is odd. This is what we need to work out the impact of.
Based on my counting of the different operations, I get:
g_initial_setup = 3 (every time)
g_for_any_i = 6 (half the time, it is just this)
g_for_even_i = 6 for each time we divide by two (the other half of the time)
Across the values of i from 2 to n, half need at least one halving, a quarter need at least two, an eighth at least three, and so on. So, as n goes to infinity, the expected number of halvings per i approaches sum(1/2^m) for m from 1 to n, and we multiply that by the 6 operations done for each halving of j.
I would expect from this:
g(n) = 3 + (n * 6) + (n * 6) * sum( 1 / pow(2,m) for m between 1 and n )
Given that the infinite series sum(1/2^m) converges to 1, we can simplify that to:
g(n) = 3 + 12n as n approaches infinity.
That implies that the algorithm is O(n). Huh. I did not expect that.
Let's try out the function g(n) from above, counting all the operations that are occurring as f(n) is computed.
g(1) = 3 operations
g(2) = 9
g(3) = 21
g(4) = 27
g(5) = 45
g(10) = 123
g(100) = 1167
g(1000) = 11943
g(10000) = 119943
g(100000) = 1199931
g(1000000) = 11999919
g(10000000) = 119999907
Okay, unless I've really made a serious error here, it's O(n).
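As an independent check of the geometric-series argument, here is a small sketch that counts only the inner-loop halvings; the ratio to n should approach 1, which is what keeps the total work linear:
# Count the total number of times j is halved across all outer iterations.
def total_halvings(n):
    count = 0
    for i in range(1, n):
        j = i
        while j % 2 == 0:
            j //= 2
            count += 1
    return count

for n in (10, 100, 1000, 10000, 100000):
    print(n, total_halvings(n), total_halvings(n) / n)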

Maximizing with constraint for number of distinct SKU not greater than X

I'm building an optimization tool using PuLP.
Its purpose is to decide which SKUs to take and which SKUs to leave from each warehouse.
I'm having trouble with the following constraint:
"The number of different SKUs selected should not exceed 500"
That is to say, no matter how many units you take, as long as they span at most 500 varieties (different SKUs), it's all good.
This is what I've got so far
# simplex
df = pd.read_excel(ruta + "actual/202109.xlsx", nrows=20)  # read the new month's base
# Create variables and model
x = pulp.LpVariable.dicts("x", df.index, lowBound=0)
mod = pulp.LpProblem("Budget", pulp.LpMaximize)
# Objective function
objvals = {idx: (1.0) * (df['costo_unitario'][idx]) for idx in df.index}
mod += sum([x[idx] * objvals[idx] for idx in df.index])
# Lower and upper bounds:
for idx in df.index:
    mod += x[idx] <= df['unidades_sobrestock'][idx]
# Budget sum
mod += sum([x[idx] for idx in df.index]) <= max_uni
# Solve model
mod.solve()
# Output solution
for idx in df.index:
    print(str(idx) + " " + str(x[idx].value()))
print('Objective' + " " + str(pulp.value(mod.objective)))
In the same dataframe, I have a column with the SKU of each particular row: df['SKU'].
I'm imagining that the constraint should look something like:
for idx in df.index:
    mod += df['SKU'].count(distinct) <= 500
but that doesn't seem to work.
Thanks!
You will need a binary variable y[i] to indicate if a SKU is used. In math-like notation:
x[i] ≤ maxx[i]*y[i] (y[i] = 0 ==> x[i] = 0)
sum(i, y[i]) ≤ maxy (limit number of different SKUs)
y[i] ∈ {0,1} (binary variable)
where
maxx[i] = upper bound on x[i]
maxy = limit on number of different SKUs
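A sketch of that formulation in PuLP, reusing the names from the question's code. This assumes each row of df is a distinct SKU; if several rows can share a SKU, index y by df['SKU'].unique() instead and link every row of a SKU to that SKU's indicator:
# Binary indicator per row: y[idx] = 1 if any units of that row are taken.
y = pulp.LpVariable.dicts("y", df.index, cat="Binary")

# Linking constraints: x[idx] can only be positive when y[idx] = 1.
for idx in df.index:
    mod += x[idx] <= df['unidades_sobrestock'][idx] * y[idx]

# At most 500 different SKUs selected.
mod += pulp.lpSum([y[idx] for idx in df.index]) <= 500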

How to find the value of integer k efficiently for which q divides b ^ k finitely?

We are given two integers b and q, and we want to find the minimum integer k for which q divides b^k, or determine that no such k exists. Can we find k efficiently, rather than just iterating over each value of k (0, 1, 2, 3, ...) and checking whether (b^k) % q == 0?
First of all, k will never equal zero unless q = 1, and k will never equal one unless q divides b.
Next, if you can factorize q and b, then you can reason about them.
If q has any prime factors that are not factors of b at all, then k does not exist. Otherwise, k has to be large enough so that every prime factor of q is covered, to at least its multiplicity, by the factors of b^k.
Here's that idea as runnable Python (sympy's factorint does the prime factorization):
from math import ceil
from sympy import factorint

def min_k(b, q):
    if q == 1:
        return 0
    qfactors = factorint(q)  # {prime: exponent}
    bfactors = factorint(b)
    kmin = 0
    for f, qcount in qfactors.items():
        bcount = bfactors.get(f, 0)
        if bcount == 0:
            return -1  # k does not exist: q has a prime factor that b lacks
        # b**k contains f with multiplicity k*bcount, which must reach qcount
        kmin = max(kmin, ceil(qcount / bcount))
    return kmin
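A quick check with illustrative values:
print(min_k(12, 27))  # 3: 27 = 3**3 and each power of 12 contributes one factor of 3
print(min_k(2, 12))   # -1: 12 has the prime factor 3, which 2 lacks
print(min_k(4, 8))    # 2: 4**2 = 16 is divisible by 8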
If q = 1: k = 0
If b = q: k = 1
If b > q and q divides b: k = 1
If b < q and b divides q: k != 1
If b != q and they share no common factors: no integer k exists
We know,
Dividend = Divisor x Quotient + Remainder
=> Dividend = Divisor x Quotient [here, Remainder = 0]
Now minimize: the lower the value of the quotient, the lower the value of k.
If you take the quotient to be 1 (the lowest, but a special case), the formula for k becomes
k = log q / log b
I found a solution:
If q divides pow(b, k), then all prime factors of q are prime factors of b. So we can iterate q = q ÷ gcd(b, q) while gcd(q, b) ≠ 1. If q ≠ 1 after the iterations, q has prime factors that are not prime factors of b, so k doesn't exist; otherwise k equals the number of iterations.
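A direct translation of that idea into Python (a small sketch; the function name is just illustrative):
from math import gcd

def min_k_by_gcd(b, q):
    k = 0
    while gcd(b, q) != 1:
        q //= gcd(b, q)
        k += 1
    return k if q == 1 else -1  # -1 signals that no such k exists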

Iterating over multidimensional Numpy array

What is the fastest way to iterate over all elements in a 3D NumPy array? If array.shape = (r,c,z), there must be something faster than this:
import numpy as np

x = np.asarray(range(12)).reshape((1, 4, 3))

# function that sums nearest neighbor values
# e is my element location, d is the distance
def nn(arr, e, d=1):
    d = e[0]
    r = e[1]
    c = e[2]
    return (sum(arr[d, r-1, c-1:c+2]) + sum(arr[d, r+1, c-1:c+2])
            + arr[d, r, c-1] + arr[d, r, c+1])
Instead of creating a nested for loop like the one below to generate the values of e and run the function nn for each pixel:
for dim in range(z):
    for row in range(r):
        for col in range(c):
            e = (dim, row, col)
I'd like to vectorize my nn function in a way that extracts location information for each element (e = (0,1,1) for example) and iterates over ALL elements in my matrix without having to manually input each locational value of e OR creating a messy nested for loop. I'm not sure how to apply np.vectorize to this problem. Thanks!
It is easy to vectorize over the d dimension:
def nn(arr, e):
    r, c = e  # (e[0], e[1])
    # after indexing the row with a scalar, the column slice is axis 1;
    # the single-column terms are already 1-D over the first axis
    return (np.sum(arr[:, r-1, c-1:c+2], axis=1) + np.sum(arr[:, r+1, c-1:c+2], axis=1)
            + arr[:, r, c-1] + arr[:, r, c+1])
Now just iterate over the row and col dimensions, returning a vector that is assigned to the appropriate slot in x.
for row in <correct range>:
    for col in <correct range>:
        x[:, row, col] = nn(data, (row, col))
The next step is to build index arrays, along the lines of
rows = ...[:, None]
cols = ...
arr[:, rows-1, cols+2] + arr[:, rows, cols+2]  # etc.
This kind of problem has come up many times, with various descriptions - convolution, smoothing, filtering etc.
We could do some searches to find the best, or if you prefer, we could guide you through the steps.
Converting a nested loop calculation to Numpy for speedup
is a question similar to yours. There are only 2 levels of looping, and the sum expression is different, but I think it has the same issues:
for h in xrange(1, height-1):
    for w in xrange(1, width-1):
        new_gr[h][w] = (gr[h][w] + gr[h][w-1] + gr[h-1][w]
                        + t * gr[h+1][w-1] - 2 * (gr[h][w-1] + t * gr[h-1][w]))
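For reference, here is a sketch of the fully vectorized version that answer hints at: shifted slices compute the 8-neighbour sum for every interior cell in one shot. It assumes a plain sum, without the nodata masking used in the follow-up code below:
def nn_all(arr):
    # 8-neighbour sum for every interior (row, col) cell, across all d at once
    out = np.zeros_like(arr)
    out[:, 1:-1, 1:-1] = (
        arr[:, :-2, :-2] + arr[:, :-2, 1:-1] + arr[:, :-2, 2:]   # row above
        + arr[:, 2:, :-2] + arr[:, 2:, 1:-1] + arr[:, 2:, 2:]    # row below
        + arr[:, 1:-1, :-2] + arr[:, 1:-1, 2:]                   # left and right
    )
    return out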
Here's what I ended up doing. Since I'm returning the xv vector and slipping it into the larger 3D array lag, this should speed up the process, right? data is my input dataset.
def nn3d(arr, e):
    r, c = e
    n = np.copy(arr[:, r-1:r+2, c-1:c+2])
    n[:, 1, 1] = 0
    n3d = np.ma.masked_where(n == nodata, n)
    xv = np.zeros(arr.shape[0])
    for d in range(arr.shape[0]):
        if np.ma.count(n3d[d, :, :]) < 2:
            element = nodata
        else:
            element = np.sum(n3d[d, :, :]) / (np.ma.count(n3d[d, :, :]) - 1)
        xv[d] = element
    return xv

lag = np.zeros(shape=data.shape)
for r in range(1, data.shape[1]-1):  # boundary effects
    for c in range(1, data.shape[2]-1):
        lag[:, r, c] = nn3d(data, (r, c))
What you are looking for is probably np.nditer:
a = np.arange(6).reshape(2, 3)
for x in np.nditer(a):
    print(x, end=' ')
which prints
0 1 2 3 4 5

For loop with variable upper bound

I'd like to write a for loop with a variable upper limit in Mathematica 9. So, instead of
j = 0;
For[n = 1, n <= 3, n++, j = j + n];
j
(*6*)
I'd like to do
N = 3;
j = 0;
For[n = 1, n <= N, n++, j = j + n];
j
n
(*
0
1
*)
But, as shown, this does not give the right result at all; it would appear from the value of n that the body of the loop was not evaluated at all.
I've looked through the Mathematica docs both on For loops and on loops and control structures more generally (and also done some DuckDuckGo searches), but there's still something fundamental I'm missing. What is it?
For completeness, I should note that my ultimate goal is to put this in a function:
foo[N] =
  Module[{j = 0},
    For[n = 1, n <= N, n++, j = j + n;];
    j]
foo[3]
Your code shows several problems that are common for new users. For example:
N is a reserved word
You shouldn't start your identifiers with upper-case letters
The function foo[] should be defined with SetDelayed (:=) and not with Set (=)
You need to use patterns (_) in the function definition arguments
For[] loops, and iteration in general, should be avoided in Mathematica
I think you could carefully read all the answers to this post to get a better grip on Mathematica.
Anyway, your code may be rewritten as
foo[k_] := Module[{j = 0}, For[n = 1, n <= k, n++, j = j + n]; j]
foo[3]
(*6*)
But this is horrible Mathematica coding.
The following are much better ways in Mathematica:
foo[j_, k_] := Fold[Plus, j, Range@k]
foo[j_, k_] := j + Total@Range@k
foo[j_, k_] := j + Tr@Range@k