Maximizing with constraint for number of distinct SKU not greater than X - dataframe

I'm building an optimization tool using PuLP.
Its purpose is to decide which SKUs to take and which to leave at each warehouse.
I'm having trouble with the following constraint:
"The number of different SKUs selected should not exceed 500"
That is to say, no matter how many units you take, as long as they do not span more than 500 varieties (different SKUs), it's all good.
This is what I've got so far:
import pandas as pd
import pulp
# simplex
# ruta (base path) and max_uni (unit budget) are assumed to be defined earlier
df = pd.read_excel(ruta + "actual/202109.xlsx", nrows=20)  # read this month's new dataset
# Create variables and model
x = pulp.LpVariable.dicts("x", df.index, lowBound=0)
mod = pulp.LpProblem("Budget", pulp.LpMaximize)
# Objective function
objvals = {idx: (1.0)*(df['costo_unitario'][idx]) for idx in df.index}
mod += sum([x[idx]*objvals[idx] for idx in df.index])
# Lower and upper bounds:
for idx in df.index:
    mod += x[idx] <= df['unidades_sobrestock'][idx]
# Budget sum
mod += sum([x[idx] for idx in df.index]) <= max_uni
# Solve model
mod.solve()
# Output solution
for idx in df.index:
    print(str(idx) + " " + str(x[idx].value()))
print('Objective' + " " + str(pulp.value(mod.objective)))
In the same dataframe, the column df['SKU'] holds the SKU of each row.
I'm imagining that the constraint should look something like:
for idx in df.index:
mod += df['SKU'].count(distinct) <= 500
but that doesn't seem to work.
Thanks!

You will need a binary variable y[i] to indicate if a SKU is used. In math-like notation:
x[i] ≤ maxx[i]*y[i] (y[i] = 0 ==> x[i] = 0)
sum(i, y[i]) ≤ maxy (limit number of different SKUs)
y[i] ∈ {0,1} (binary variable)
where
maxx[i] = upperbound on x[i]
maxy = limit on number of different SKUs
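A minimal PuLP sketch of the formulation above, under some assumptions: the small DataFrame is made-up stand-in data (only the column names come from the question), `max_skus` plays the role of maxy, and one binary `y` is created per distinct SKU so that several rows sharing a SKU count once.

```python
import pandas as pd
import pulp

# Hypothetical stand-in data; only the column names follow the question.
df = pd.DataFrame({
    "SKU": ["A", "A", "B", "C"],
    "costo_unitario": [5.0, 4.0, 3.0, 2.0],
    "unidades_sobrestock": [10, 10, 10, 10],
})
max_uni = 100   # assumed unit budget
max_skus = 2    # limit on distinct SKUs (500 in the question)

mod = pulp.LpProblem("Budget", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", df.index, lowBound=0)
skus = df["SKU"].unique().tolist()
y = pulp.LpVariable.dicts("y", skus, cat="Binary")  # y[s] = 1 if SKU s is used

# Objective: same as in the question.
mod += pulp.lpSum(x[idx] * float(df.loc[idx, "costo_unitario"]) for idx in df.index)

# x[i] <= maxx[i] * y[sku of row i]: y = 0 forces x = 0.
for idx in df.index:
    upper = float(df.loc[idx, "unidades_sobrestock"])
    mod += x[idx] <= upper * y[df.loc[idx, "SKU"]]

mod += pulp.lpSum(x[idx] for idx in df.index) <= max_uni   # unit budget
mod += pulp.lpSum(y[s] for s in skus) <= max_skus          # distinct-SKU limit

mod.solve(pulp.PULP_CBC_CMD(msg=False))
chosen = sorted(s for s in skus if y[s].value() > 0.5)
print(chosen, pulp.value(mod.objective))
```

Because `y` is binary, the model becomes a mixed-integer program rather than a pure LP; the existing upper bound on each row doubles as the big-M coefficient, so no extra constant is needed.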

Related

What is the time complexity of the following algorithm in Big Theta notation?

res = 0
for i in range(1, n):
    j = i
    while j % 2 == 0:
        j = j/2
        res = res + j
I understand that an upper bound is O(n log n), but I'm wondering whether a tighter bound can be found. I'm stuck with the analysis.
Some ideas that may be helpful:
You could create a function g(n) that instruments your function f(n) to count how many operations occur when running f(n):
def f(n):
    res = 0
    for i in range(1, n):
        j = i
        while j % 2 == 0:
            j = j/2
            res = res + j
    return res
def g(n):
    comparisons = 0
    operations = 0
    assignments = 0
    assignments += 1
    res = 0
    assignments += 1  # i = 1
    comparisons += 1  # i < n
    for i in range(1, n):
        assignments += 1
        j = i
        operations += 1
        comparisons += 1
        while j % 2 == 0:
            operations += 1
            assignments += 1
            j = j/2
            operations += 1
            assignments += 1
            res = res + j
            operations += 1
            comparisons += 1
        operations += 1  # i + 1
        assignments += 1  # assign to i
        comparisons += 1  # i < n ?
    return operations + comparisons + assignments
For n = 1, the code runs without hitting any loops: assigning the value of res; assigning i as 1; comparing i to n and skipping the loop as a result.
For n > 1, you get into the for loop, and the for statement is the only thing changing the loop variable, so the complexity of the code is at least linear in n.
Once in the loop:
if i is odd, then you only assign j, perform the mod operation and compare to zero. That is the case for half the values of i, so half of the loop iterations add only a small, fixed number of operations (including the loop operations). So that's still O(n), just with a larger constant.
if i is even, then we divide by 2 until it is odd. This is the part whose impact we need to work out.
Based on my counting of the different operations, I get:
g_initial_setup = 3 (every time)
g_for_any_i = 6 (half the time, it is just this)
g_for_even_i = 6 for each time we divide by two (the other half of the time)
For a random even i between 2 and n, half the time we will only need to divide by two once, half the remaining time by two again, half the remaining time by two again, etc. So as n goes to infinity we get the series sum(1/2^m) for m ≥ 1, and we multiply that by the 6 operations done for each halving of j.
I would expect from this:
g(n) = 3 + (n * 6) + (n * 6) * sum( 1 / pow(2,m) for m between 1 and n )
Given that the infinite series sum(1/2^m) for m ≥ 1 converges to 1, we simplify that to:
g(n) = 3 + 12n as n approaches infinity.
That implies that the algorithm is O(n). Huh. I did not expect that.
Let's try out the function g(n) from above, counting all the operations that are occurring as f(n) is computed.
g(1) = 3 operations
g(2) = 9
g(3) = 21
g(4) = 27
g(5) = 45
g(10) = 99
g(100) = 1167
g(1000) = 11943
g(10000) = 119943
g(100000) = 1199931
g(1000000) = 11999919
g(10000000) = 119999907
Okay, unless I've really made a serious error here, it's O(n).
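The linear growth can be cross-checked without counting individual operations. The inner loop runs once for every factor of 2 in i, and the sum of those exponents over i = 1..m equals m minus the number of 1 bits in m (Legendre's formula applied to m!). A small sketch, where `count_halvings` is a helper name introduced here for illustration:

```python
def count_halvings(n):
    """Total number of inner-loop iterations performed by f(n)."""
    total = 0
    for i in range(1, n):
        j = i
        while j % 2 == 0:
            j //= 2      # integer division keeps j an int
            total += 1
    return total

for n in (10, 100, 1000, 10000):
    m = n - 1
    closed_form = m - bin(m).count("1")   # m minus the popcount of m
    print(n, count_halvings(n), closed_form)
```

Since the total number of halvings is always below n, the whole algorithm does Θ(n) work, which agrees with the roughly 12n operation counts listed above.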

Model exists relation in a linear program (PuLP)

I'm trying to implement an exists relation as part of an LP using PuLP. I'd like to check whether there is a j for which x_ij + x_kj = 2.
for i in range(g):
    for k in range(g):
        prob += lp.lpSum((x[(i, j)] + x[(k, j)] == 2) for j in range(t)) == y[(i, k)]
The variables are all binary. I tried to model it using a sum of boolean expressions, which should result in 1 if y_ik is 1 and 0 otherwise. However, this approach doesn't work.
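One likely reason the attempt fails is that, in PuLP, comparing an expression with `==` builds a constraint object rather than a 0/1 value, so it cannot be summed as a boolean. A common workaround (a sketch, not the only possible formulation) introduces an auxiliary binary z_ikj for "x_ij and x_kj are both 1" and links it with linear inequalities: z ≤ x_ij, z ≤ x_kj, z ≥ x_ij + x_kj - 1, y_ik ≥ z_ikj for every j, and y_ik ≤ sum over j of z_ikj. The variable z is an assumption introduced here. The brute-force check below, in plain Python, confirms that these inequalities leave exactly one feasible (z, y) for every choice of the x values.

```python
from itertools import product

def feasible(a, b, z, y):
    """Linear constraints for one pair (i, k); a[j] = x_ij, b[j] = x_kj."""
    for j in range(len(a)):
        # z[j] must equal a[j] AND b[j]
        if not (z[j] <= a[j] and z[j] <= b[j] and z[j] >= a[j] + b[j] - 1):
            return False
        # y must be at least every z[j]
        if not (y >= z[j]):
            return False
    # y cannot be 1 unless some z[j] is 1
    return y <= sum(z)

def check_linearization(t):
    """True if the constraints force y = 1 exactly when some j has a[j] + b[j] == 2."""
    bits = (0, 1)
    for a in product(bits, repeat=t):
        for b in product(bits, repeat=t):
            solutions = [(z, y)
                         for z in product(bits, repeat=t)
                         for y in bits
                         if feasible(a, b, z, y)]
            want_z = tuple(a[j] & b[j] for j in range(t))
            want_y = int(any(want_z))
            if solutions != [(want_z, want_y)]:
                return False
    return True

print(check_linearization(3))
```

In a PuLP model each inequality would become its own `prob += ...` line inside the loops over i, k and j, with z declared as an additional binary variable dictionary.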

Division by Zero error in calculating series

I am trying to compute a series, and I am running into an error whose cause I can't identify:
"RuntimeWarning: divide by zero encountered in double_scalars"
When I checked the code, it didn't seem to have any singularities, so I am confused. Here is the current code (log stands for the natural logarithm; edit: extended the code in case that helps):
from numpy import pi, log
#Create functions to calculate the sums
def phi(z: int):
    k = 0
    phi = 0
    #Loop through 1000 times to try to approximate the series value as if it went to infinity
    while k <= 100:
        phi += ((1/(k+1)) - (1/(k+(2*z))))
        k += 1
    return phi
def psi(z: int):
    psi = 0
    k = 1
    while k <= 101:
        psi += ((log(k))/( k**(2*z)))
        k += 1
    return psi
def sig(z: int):
    sig = 0
    k = 1
    while k <= 101:
        sig += ((log(k))**2)/(k^(2*z))
        k += 1
    return sig
def beta(z: int):
    beta = 0
    k = 1
    while k <= 101:
        beta += (1/(((2*z)+k)^2))
        k += 1
    return beta
#Create the formula to approximate the value. For higher accuracy, either calculate more derivatives of Bernoulli numbers or increase the boundary of k.
def Bern(z :int):
    #Define Euler–Mascheroni constant
    c = 0.577215664901532860606512
    #Begin computations (only approximation)
    B = (pi/6) * (phi(1) - c - 2 * log(2 * pi) - 1) - z * ((pi/6) * ((phi(1)- c - (2 * log(2 * pi)) - 1) * (phi(1) - c) + beta(1) - 2 * psi(1)) - 2 * (psi(1) * (phi(1) - c) + sig(1) + 2 * psi(1) * log(2 * pi)))
    #output
    return B
A = int(input("Choose any value: "))
print("The answer is", Bern(A + 1))
Any help would be much appreciated.
Are you sure you want ^, the bitwise exclusive-or operator, rather than **? I ran your code with z = 1, and on the second iteration k^(2*z) evaluated to 0 (2 ^ (2*1) = 0), which is where the division by zero comes from.
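To make the difference concrete, here is a short sketch comparing the two operators, followed by a version of `sig` that uses `**`. The `terms` parameter is an addition for illustration, and `math.log` stands in for the NumPy import; `beta` in the question uses `^` in the same way and would need the same change.

```python
from math import log

k, z = 2, 1
xor_result = k ^ (2 * z)     # bitwise XOR of 2 and 2
power_result = k ** (2 * z)  # 2 raised to the power 2
print(xor_result, power_result)

def sig(z, terms=101):
    """Partial sum of log(k)^2 / k^(2z) for k = 1..terms, using ** for the power."""
    return sum(log(k) ** 2 / k ** (2 * z) for k in range(1, terms + 1))

print(sig(1))
```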

Minimum of a variable and a constant in PULP python integer programming

I am stuck on an integer programming constraint using PuLP in Python. I have two variables, x1 and x2, and a constant y1. How do I write a constraint for x1 = min(x2, y1)?
I have written the two conditions below:
x1 <= y1
x1 <= x2
But this gives me x1 = 0 for my problem.
It should take one of the values x2 or y1.
Thanks in advance. I would really appreciate your help.
Code used:
import pandas as pd
import pulp
from pulp import *
data = pd.read_csv("Test.csv")
limit = LpVariable("limit", 0, 1000, cat='Integer')
sales = LpVariable.dicts("Sales", (i for i in data.index), lowBound=0, cat="Integer")
####### Defining the Problem
prob = pulp.LpProblem("Profit", pulp.LpMaximize)
prob += pulp.lpSum((1-data.loc[i,'Prize']) * sales[i] for i in data.index)
####### Constraints
for idx in data.index:
    max_sales = data.loc[idx, 'Sales'] + data.loc[idx, 'Rejec']
    prob += sales[idx] <= max_sales
    prob += sales[idx] <= limit
###### Getting the output
prob.solve()
for v in prob.variables():
    print(v.name, v.varValue)
print(value(prob.objective))
Data used (try.csv): [image not reproduced]
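The two upper bounds only cap x1; nothing stops the solver from choosing a smaller value such as 0 when the objective does not reward a larger one. One standard way to force x1 to equal min(x2, y1) is a big-M formulation with an extra binary variable. The sketch below is an illustration under assumptions: `b`, `M`, the values of `y1` and `x2`, and the deliberately adversarial objective are all made up for the demo.

```python
import pulp

y1 = 7      # the constant
M = 1000    # assumed big-M: an upper bound on both x2 and y1

prob = pulp.LpProblem("min_demo", pulp.LpMinimize)
x1 = pulp.LpVariable("x1", lowBound=0, upBound=M, cat="Integer")
x2 = pulp.LpVariable("x2", lowBound=0, upBound=M, cat="Integer")
b = pulp.LpVariable("b", cat="Binary")   # selects which lower bound is active

prob += x1                       # objective pulls x1 down on purpose
prob += x2 == 12                 # fix x2 for the demo
prob += x1 <= x2                 # upper bounds from the question
prob += x1 <= y1
prob += x1 >= x2 - M * b         # b = 0 forces x1 >= x2
prob += x1 >= y1 - M * (1 - b)   # b = 1 forces x1 >= y1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(x1.value())
```

Even though the objective minimizes x1, the solver cannot drop it below the smaller of x2 and y1, because one of the two lower bounds is always active.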

Generating all unique crossword puzzle grids

I want to generate all unique crossword puzzle grids of a certain grid size (4x4 is a good size). All possible puzzles, including non-unique puzzles, are represented by a binary string with the length of the grid area (16 in the case of 4x4), so all possible 4x4 puzzles are represented by the binary forms of all numbers in the range 0 to 2^16.
Generating these is easy, but I'm curious if anyone has a good solution for how to programmatically eliminate invalid and duplicate cases. For example, all puzzles with a single column or single row are functionally identical, hence eliminating 7 of those 8 cases. Also, according to crossword puzzle conventions, all squares must be contiguous. I've had success removing all duplicate structures, but my solution took several minutes to execute and probably was not ideal. I'm at something of a loss for how to detect contiguity so if anyone has ideas on this it'd be much appreciated.
I'd prefer solutions in python but write in whichever language you prefer. If anyone wants, I can post my python code for generating all grids and removing duplicates, slow as it may be.
Disclaimer: mostly untested, beyond confirming that each test does filter out some grids and fixing a few errors I spotted. It can certainly be optimized.
def is_valid_grid (n):
    row_mask = ((1 << n) - 1)
    top_row = row_mask << n * (n - 1)
    left_column = 0
    right_column = 0
    for row in range (n):
        left_column |= (1 << (n - 1)) << row * n
        right_column |= 1 << row * n

    def neighborhood (grid):
        return (((grid & ~left_column) << 1)
                | ((grid & ~right_column) >> 1)
                | ((grid & ~top_row) << n)
                | (grid >> n))

    def is_contiguous (grid):
        # Start with a single bit and expand with neighbors as long as
        # possible. If we arrive at the starting grid then it is
        # contiguous, else not.
        part = (grid ^ (grid & (grid - 1)))
        while True:
            expanded = (part | (neighborhood (part) & grid))
            if expanded != part:
                part = expanded
            else:
                break
        return part == grid

    def flip_y (grid):
        rows = []
        for k in range (n):
            rows.append (grid & row_mask)
            grid >>= n
        for row in rows:
            grid = (grid << n) | row
        return grid

    def rotate (grid):
        rotated = 0
        for x in range (n):
            for y in range (n):
                if grid & (1 << (n * y + x)):
                    rotated |= (1 << (n * x + (n - 1 - y)))
        return rotated

    def transform (grid):
        yield flip_y (grid)
        for k in range (3):
            grid = rotate (grid)
            yield grid
            yield flip_y (grid)

    def do_is_valid_grid (grid):
        # Any square in the topmost row?
        if not (grid & top_row):
            return False
        # Any square in the leftmost column?
        if not (grid & left_column):
            return False
        # Is contiguous?
        if not is_contiguous (grid):
            return False
        # Of all transformations, we pick only that which gives the
        # smallest number.
        for transformation in transform (grid):
            # A transformation can produce a grid without a square in the topmost row and/or leftmost column.
            while not (transformation & top_row):
                transformation <<= n
            while not (transformation & left_column):
                transformation <<= 1
            if transformation < grid:
                return False
        return True

    return do_is_valid_grid

def valid_grids (n):
    do_is_valid_grid = is_valid_grid (n)
    for grid in range (2 ** (n * n)):
        if do_is_valid_grid (grid):
            yield grid

for grid in valid_grids (4):
    print (grid)
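The integers printed above are hard to read directly. A small helper such as the following sketch can display one as a grid. It assumes the same bit layout as the code above (the highest n bits form the top row, and the highest bit within a row is the leftmost column) and, as a further assumption, draws 1 bits as '#'; `render` is a name introduced here.

```python
def render(grid, n):
    """Return an n-line picture of the grid integer, top row first."""
    row_mask = (1 << n) - 1
    lines = []
    for y in range(n - 1, -1, -1):               # highest bits are the top row
        row_bits = (grid >> (n * y)) & row_mask
        text = format(row_bits, "0{}b".format(n))  # most significant bit = leftmost
        lines.append(text.replace("1", "#").replace("0", "."))
    return "\n".join(lines)

# A 4x4 grid with only the leftmost column filled.
print(render(0b1000100010001000, 4))
```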