Proc Optmodel Conditional Constraint SAS - optimization

I'm fairly new to proc optmodel and having trouble sorting out some syntax.
Here is my dataset.
data opt_test;
input ID GRP $ x1 MIN MAX y z;
cards;
2 F 10 9 11 1.5 100
3 F 10 9 11 1.2 50
4 F 11 9 11 .9 20
8 G 5 4 6 1.2 300
9 G 6 4 6 .9 200
1 H 21 18 22 1.2 300
7 H 20 18 22 .8 1000
;
run;
There are a few things going on here:
The IDs within a GRP must have the same x2, which is constrained by MIN and MAX. I now wish to further constrain the increase/decrease of x2 based on the value of y. If y<1, I do not want x2 to go below .95*x1. If y>1, I do not want x2 to exceed 1.05*x1. I've looked online and tried a few things to make this happen. Here is my latest attempt, cond_1 and cond_2 are the problems of interest, as everything else works:
proc optmodel;
set<num> ID;
string GRP{ID};
set GRPS = setof{i in ID} GRP[i];
set IDperGRP{gi in GRPS} = {i in ID: GRP[i] = gi};
number x1{ID};
number MIN{ID};
number MAX{ID};
var x2{gi in GRPS} >= max{i in IDperGRP[gi]} MIN[i]
<= min{i in IDperGRP[gi]} MAX[i]
;
impvar x2byID{i in ID} = x2[GRP[i]];
number y{ID};
number z{ID};
read data opt_test into
ID=[ID]
GRP
x1
MIN
MAX
y
z
;
max maximize = sum{gi in GRPS} sum{i in IDperGRP[gi]}
(x2[gi]) * (1-(x2[gi]-x1[i])*y[i]/x1[i]) * z[i];
con cond_1 {i in ID}: x2[i] >=
if y[i]<1 then .95*x1[i] else 0;
con cond_2 {i in ID}: x2[i] <=
if y[i]>=1 then 1.05*x1[i] else 99999999;
solve;
create data results from [ID]={ID} x2=x2byID GRP x1 MIN MAX y z;
print x2 maximize;
quit;

I would calculate the global max and min in a data step outside of PROC OPTMODEL and then set the values. Like so:
data opt_test;
set opt_test;
if y < 1 then
min2 = .95*x1;
else
min2 = 0;
if y>=1 then
max2 = 1.05*x1;
else
max2 = 9999999999;
Min_old = min;
max_old = max;
MIN = max(min,min2);
MAX = min(max,max2);
run;
But you have a problem with group G. Use expand to see it.
proc optmodel;
set<num> ID;
string GRP{ID};
set GRPS = setof{i in ID} GRP[i];
set IDperGRP{gi in GRPS} = {i in ID: GRP[i] = gi};
number x1{ID};
number MIN{ID};
number MAX{ID};
var x2{gi in GRPS} >= max{i in IDperGRP[gi]} MIN[i]
<= min{i in IDperGRP[gi]} MAX[i]
;
impvar x2byID{i in ID} = x2[GRP[i]];
number y{ID};
number z{ID};
read data opt_test into
ID=[ID]
GRP
x1
MIN
MAX
y
z
;
max maximize = sum{gi in GRPS} sum{i in IDperGRP[gi]}
(x2[gi]) * (1-(x2[gi]-x1[i])*y[i]/x1[i]) * z[i];
/*con cond_1 {i in ID}: x2[i] >=
if y[i]<1 then .95*x1[i] else 0;
con cond_2 {i in ID}: x2[i] <=
if y[i]>=1 then 1.05*x1[i] else 99999999;*/
expand;
solve;
create data results from [ID]={ID} x2=x2byID GRP x1 MIN MAX y z;
print x2 maximize;
quit;
You will see that X2[G] is infeasible:
Var x2[G] >= 5.7 <= 5.25
X2[G] starts in [4,6];
For ID=8, X=5 and Y=1.2. By your logic, this sets the max to 5.25 (1.05*5).
Now X2[G] in [4,5.25]
For ID=9, X=6 and Y=0.9. By your logic this sets the min to 5.7 (0.95*6).
X2[G] in [5.7,5.25] <-- BAD!
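The same interval arithmetic can be reproduced outside SAS as a quick cross-check. The following is only a sketch in Python, not part of the OPTMODEL model: the rows are copied from the opt_test data set, and the helper name group_bounds is made up for this illustration.

```python
# Rows copied from the opt_test data set: (ID, GRP, x1, MIN, MAX, y)
rows = [
    (2, "F", 10, 9, 11, 1.5),
    (3, "F", 10, 9, 11, 1.2),
    (4, "F", 11, 9, 11, 0.9),
    (8, "G", 5, 4, 6, 1.2),
    (9, "G", 6, 4, 6, 0.9),
    (1, "H", 21, 18, 22, 1.2),
    (7, "H", 20, 18, 22, 0.8),
]

def group_bounds(rows):
    """Intersect the per-ID intervals for x2 within each group (hypothetical helper)."""
    bounds = {}
    for _id, grp, x1, lo, hi, y in rows:
        if y < 1:
            lo = max(lo, 0.95 * x1)   # x2 may not drop below 95% of x1
        if y >= 1:
            hi = min(hi, 1.05 * x1)   # x2 may not exceed 105% of x1
        g_lo, g_hi = bounds.get(grp, (float("-inf"), float("inf")))
        bounds[grp] = (max(g_lo, lo), min(g_hi, hi))
    return bounds

for grp, (lo, hi) in sorted(group_bounds(rows).items()):
    status = "ok" if lo <= hi else "INFEASIBLE"
    print(grp, round(lo, 4), round(hi, 4), status)
```

A group whose lower bound ends up above its upper bound has no feasible x2, which is what happens for G with this data.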

The biggest problem with the model in the question is that the indexing of the var x2 is incorrect. You could fix that by referring to the group of the ID as such:
con cond_1 {i in ID}: x2[GRP[i]] >=
if y[i] < 1 then .95*x1[i] else 0;
con cond_2 {i in ID}: x2[GRP[i]] <=
if y[i]>=1 then 1.05*x1[i] else 99999999;
but a description of the constraint that reads closer to the business problem is to put a filter in the constraint definition itself:
con Cond_1v2 {i in ID: y[i] < 1} : x2[GRP[i]] >= .95 * x1[i];
con Cond_2v2 {i in ID: Y[i] >= 1}: x2[GRP[i]] <= 1.05 * x1[i];
In either case, the problem becomes infeasible because of the constraint Cond_2v2, as you can see using expand (as @DomPazz pointed out), and in particular the expand / iis option, which will print the conflicting constraints when it can determine them:
solve with nlp / iis=on;
expand / iis;


Simulate data for repeated binary measures

I can generate a binary variable y as follows:
clear
set more off
gen y =.
replace y = rbinomial(1, .5)
How can I generate n variables y_1, y_2, ..., y_n with a correlation of rho?
This is @pjs's solution in Stata for generating pairs of variables:
clear
set obs 100
set seed 12345
generate x = rbinomial(1, 0.7)
generate y = rbinomial(1, 0.7 + 0.2 * (1 - 0.7)) if x == 1
replace y = rbinomial(1, 0.7 * (1 - 0.2)) if x != 1
summarize x y
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
x | 100 .72 .4512609 0 1
y | 100 .67 .4725816 0 1
correlate x y
(obs=100)
| x y
-------------+------------------
x | 1.0000
y | 0.1781 1.0000
And a simulation:
set seed 12345
tempname sim1
tempfile mcresults
postfile `sim1' mu_x mu_y rho using `mcresults', replace
forvalues i = 1 / 100000 {
quietly {
clear
set obs 100
generate x = rbinomial(1, 0.7)
generate y = rbinomial(1, 0.7 + 0.2 * (1 - 0.7)) if x == 1
replace y = rbinomial(1, 0.7 * (1 - 0.2)) if x != 1
summarize x, meanonly
scalar mean_x = r(mean)
summarize y, meanonly
scalar mean_y = r(mean)
corr x y
scalar rho = r(rho)
post `sim1' (mean_x) (mean_y) (rho)
}
}
postclose `sim1'
use `mcresults', clear
summarize *
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
mu_x | 100,000 .7000379 .0459078 .47 .89
mu_y | 100,000 .6999094 .0456385 .49 .88
rho | 100,000 .1993097 .1042207 -.2578483 .6294388
Note that in this example I use p = 0.7 and rho = 0.2 instead.
This is @pjs's solution in Stata for generating a time-series:
clear
set seed 12345
set obs 1
local p = 0.7
local rho = 0.5
generate y = runiform()
if y <= `p' replace y = 1
else replace y = 0
forvalues i = 1 / 99999 {
set obs `= _N + 1'
local rnd = runiform()
if y[`i'] == 1 {
if `rnd' <= `p' + `rho' * (1 - `p') replace y = 1 in `= `i' + 1'
else replace y = 0 in `= `i' + 1'
}
else {
if `rnd' <= `p' * (1 - `rho') replace y = 1 in `= `i' + 1'
else replace y = 0 in `= `i' + 1'
}
}
Results:
summarize y
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
y | 100,000 .70078 .4579186 0 1
generate id = _n
tsset id
corrgram y, lags(5)
-1 0 1 -1 0 1
LAG AC PAC Q Prob>Q [Autocorrelation] [Partial Autocor]
-------------------------------------------------------------------------------
1 0.5036 0.5036 25366 0.0000 |---- |----
2 0.2567 0.0041 31955 0.0000 |-- |
3 0.1273 -0.0047 33576 0.0000 |- |
4 0.0572 -0.0080 33903 0.0000 | |
5 0.0277 0.0032 33980 0.0000 | |
Correlation is a pairwise measure, so I'm assuming that when you talk about binary (Bernoulli) values Y1,...,Yn having a correlation of rho you're viewing them as a time series Yi: i = 1,...,n, of Bernoulli values having a common mean p, variance p*(1-p), and a lag 1 correlation of rho.
I was able to work it out using the definition of correlation and conditional probability. Given it was a bunch of tedious algebra and stackoverflow doesn't do math gracefully, I'm jumping straight to the result, expressed in pseudocode:
if Y[i] == 1:
    generate Y[i+1] as Bernoulli(p + rho * (1 - p))
else:
    generate Y[i+1] as Bernoulli(p * (1 - rho))
As a sanity check you can see that if rho = 0 it just generates Bernoulli(p)'s, regardless of the prior value. As you already noted in your question, Bernoulli RVs are binomials with n = 1.
This works for all 0 <= rho, p <= 1. For negative correlations, there are constraints on the relative magnitudes of p and rho so that the parameters of the Bernoullis are always between 0 and 1.
You can analytically check the conditional probabilities to confirm correctness. I don't use Stata, but I tested this pretty thoroughly in the JMP statistical software and it works like a charm.
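One way to do that analytic check without the algebra is to compute the moments exactly. This is a small sketch, assuming the chain is started from Bernoulli(p) so that Y[i] has mean p; the function name lag1_check is made up for this illustration.

```python
from fractions import Fraction

def lag1_check(p, rho):
    """Return (P(Y[i+1] = 1), lag-1 correlation) implied by the two conditional rules."""
    p1 = p + rho * (1 - p)        # P(Y[i+1] = 1 | Y[i] = 1)
    p0 = p * (1 - rho)            # P(Y[i+1] = 1 | Y[i] = 0)
    mean_next = p * p1 + (1 - p) * p0
    # E[Y[i] * Y[i+1]] is the probability that both are 1, i.e. p * p1.
    cov = p * p1 - p * mean_next
    # Both variances equal p * (1 - p) once mean_next == p.
    corr = cov / (p * (1 - p))
    return mean_next, corr

print(lag1_check(Fraction(7, 10), Fraction(1, 2)))
```

Using Fraction keeps the arithmetic exact, so the marginal mean and the correlation can be compared with p and rho directly instead of up to rounding error.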
IMPLEMENTATION (Python)
import random
def Bernoulli(p):
    return 1 if random.random() <= p else 0 # yields 1 w/ prob p, 0 otherwise
N = 100000
p = 0.7
rho = 0.5
last_y = Bernoulli(p)
for _ in range(N):
    if last_y == 1:
        last_y = Bernoulli(p + rho * (1 - p))
    else:
        last_y = Bernoulli(p * (1 - rho))
    print(last_y)
I ran this and redirected the results to a file, then imported the file into JMP. Analyzing it as a time series produced:
The sample mean was 0.69834, with a standard deviation of 0.4589785 [upper right of the figure]. The lag-1 estimates for autocorrelation and partial correlation are 0.5011 [bottom left and right, respectively]. These estimated values are all excellent matches to a Bernoulli(0.7) with rho = 0.5, as specified in the demo program.
If the goal is instead to produce (X,Y) pairs with the specified correlation, revise the loop to:
for _ in range(N):
    x = Bernoulli(p)
    if x == 1:
        y = Bernoulli(p + rho * (1 - p))
    else:
        y = Bernoulli(p * (1 - rho))
    print(x, y)

Calculate the sum of the digits using python [duplicate]

I would like to find the sum of the digits using Python. When I enter the birth year 1982, the result should be 1+9+8+2 = 20, and the final result 2+0 = 2.
I am posting this question because I didn't find any simple Python solution for this.
This is my code
num = int(input("Enter your birth year: "))
x = num //1000
x1 = (num - x*1000)//100
x2 = (num - x*1000 - x1*100)//10
x3 = num - x*1000 - x1*100 - x2*10
x4 = x+x1+x2+x3
num2 = int(x4)
x6 = num2 //10
x7 = (num2 -x6)//10
print("your birth number is" ,x6+x7)
But I am not getting the correct sum value.
Sum the digits of an integer until the result is a one-digit integer:
def sum_digits(num):
    num = str(num)
    if len(num) < 2:
        return int(num)
    else:
        return sum_digits(sum([int(dig) for dig in str(num)]))
>>> sum_digits(1982)
2
Or a simpler version for the case your number is a year:
def sum_digits(num):
    return sum([int(dig) for dig in str(num)])
Just call the function twice
>>> sum_digits(sum_digits(1982))
2
Try adding some debug statements to inspect values as your program runs.
num = int(input("Enter your birth year: "))
x = num //1000
x1 = (num - x*1000)//100
x2 = (num - x*1000 - x1*100)//10
x3 = num - x*1000 - x1*100 - x2*10
print (x, x1, x2, x3)
x4 = x+x1+x2+x3
print (x4)
num2 = int(x4)
x6 = num2 //10
x7 = (num2 -x6)//10
print (x6, x7)
print("your birth number is" ,x6+x7)
You'll quickly find your problem.
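For 1982 the debug prints show x4 = 20, x6 = 2 and x7 = 1: the ones digit is computed as (num2 - x6)//10 where it should be num2 - x6*10. A possible rewrite that avoids the hand-written place-value arithmetic is sketched below. It assumes a non-negative integer input, and the names digit_sum and birth_number are made up for this illustration.

```python
def digit_sum(n):
    """Add up the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        n, last = divmod(n, 10)   # peel off the ones digit
        total += last
    return total

def birth_number(year):
    """Repeat the digit sum until a single digit remains."""
    while year >= 10:
        year = digit_sum(year)
    return year

print("your birth number is", birth_number(1982))
```

Looping until the value is below 10 also covers years such as 1999, where two passes are not enough (28, then 10, then 1).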

Minimum of a variable and a constant in PULP python integer programming

I am stuck on an integer programming constraint using PuLP in Python. I have two variables, x1 and x2, and a constant y1. How do I write the constraint x1 = min(x2, y1)?
I have written the two conditions below:
x1 < y1;
x1 < x2
But this gives me x1 = 0 for my problem.
It should take one of the values x2 or y1.
Thanks in advance. I would really appreciate your help.
Code used:
import pandas as pd
from pulp import *
data = pd.read_csv("Test.csv")
limit = LpVariable("limit",0, 1000, cat='Integer')
sales = LpVariable.dicts("Sales", (i for i in data.index), lowBound=0, cat="Integer")
####### Defining the Problem
prob = pulp.LpProblem("Profit", pulp.LpMaximize)
prob += pulp.lpSum((1-data.loc[i,'Prize']) * sales[i] for i in data.index)
####### Constraints
for idx in data.index:
    max_sales = data.loc[idx, 'Sales'] + data.loc[idx, 'Rejec']
    prob += sales[idx] <= max_sales
    prob += sales[idx] <= limit
###### Getting the output
prob.solve()
for v in prob.variables():
    print(v.name, v.varValue)
print(value(prob.objective))
Data used (try.csv): image not available.

algorithm to deal with series of values

With a series with a START, INCREMENT, and MAX:
START = 100
INCREMENT = 30
MAX = 315
e.g. 100, 130, 160, 190, 220, 250, 280, 310
Given an arbitrary number X return:
the values remaining in the series where the first value is >= X
the offset Y (catch up amount needed to get from X to first value of the series).
Example
In:
START = 100
INCREMENT = 30
MAX = 315
X = 210
Out:
Y = 10
S = 220, 250, 280, 310
UPDATE -- From MBo answer:
float max = 315.0;
float inc = 30.0;
float start = 100.0;
float x = 210.0;
float k0 = ceil( (x-start) / inc) ;
float k1 = floor( (max - start) / inc) ;
for (int i=k0; i<=k1; i++)
{
    NSLog(@" output: %d: %f", i, start + i * inc);
}
output: 4: 220.000000
output: 5: 250.000000
output: 6: 280.000000
output: 7: 310.000000
MBo's integer approach would be nicer.
School math:
Start + k0 * Inc >= X
k0 * Inc >= X - Start
k0 >= (X - Start) / Inc
Programming math:
k0 = Ceil(1.0 * (X - Start) / Inc)
k1 = Floor(1.0 * (Max - Start) / Inc)
for i = k0 to k1 (including both ends)
output Start + i * Inc
Integer math:
k0 = (X - Start + Inc - 1) / Inc //such integer division makes ceiling
k1 = (Max - Start) / Inc //integer division makes flooring
for i = k0 to k1 (including both ends)
output Start + i * Inc
Example:
START = 100
INCREMENT = 30
MAX = 315
X = 210
k0 = Ceil((210 - 100) / 30) = Ceil(3.7) = 4
k1 = Floor((315 - 100) / 30) = Floor(7.2) = 7
first 100 + 4 * 30 = 220
last 100 + 7 * 30 = 310
Solve the inequality
X <= S + K*I <= M
This is equivalent to
K0 = Ceil((X - S) / I) <= K <= Floor((M - S) / I) = K1
and
Y = (S + K0*I) - X.
Note that K0 > K1 is possible, in which case there is no solution.
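The integer-math variant can be sketched in Python as follows. This is an illustration under a few assumptions: all inputs are integers, the increment is positive, and the function name remaining_series is made up. Clamping K0 at 0 for X below START is an addition not discussed in the answers above.

```python
def remaining_series(start, inc, maximum, x):
    """Return (Y, S): the catch-up offset and the series values that are >= x."""
    k0 = -(-(x - start) // inc)   # ceiling division using integers only
    k0 = max(k0, 0)               # assumption: never go below the first term
    k1 = (maximum - start) // inc # floor division
    values = [start + k * inc for k in range(k0, k1 + 1)]
    if not values:                # K0 > K1: nothing left in the series
        return None, []
    return values[0] - x, values

y, s = remaining_series(100, 30, 315, 210)
print(y, s)
```

The negated floor division works as a ceiling here because Python's // rounds toward negative infinity, unlike integer division in C-family languages, which truncates toward zero.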

Hoare Logic, calculate pre condition

if x < 15:
    x = x+1
else:
    x = 0
The postcondition is Q = {0 <= x <= 15}.
Is the correct precondition P1 = {-1 <= x} or P2 = {0 <= x <= 15}?
And how can I calculate it?
Both are valid preconditions for the code fragment and postcondition, so you want to choose the weaker one, which in this case is P1. (P2 specifies a narrower range of values for x, all of which are present in the range specified by P1.)
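To calculate it, work backwards through each branch. In the branch x < 15 the postcondition must hold for x+1, which gives -1 <= x <= 14; in the branch x >= 15 the result is 0, which always satisfies Q. The union of the two cases is -1 <= x, which is P1. A brute-force check over a finite range can confirm this; the sketch below assumes x is an integer, and the range of -50 to 50 is an arbitrary choice for the illustration.

```python
def run(x):
    """The code fragment from the question."""
    if x < 15:
        x = x + 1
    else:
        x = 0
    return x

def post(x):
    """Postcondition Q."""
    return 0 <= x <= 15

# All starting values in the sampled range for which Q holds afterwards.
ok_inputs = [x for x in range(-50, 51) if post(run(x))]
print(min(ok_inputs), max(ok_inputs))
```

Every sampled starting value from -1 upwards ends in a state satisfying Q, while every value below -1 does not, which agrees with P1 being the weakest precondition.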