Julia - Gurobi Callbacks on array of JuMP variables - optimization

With Gurobi and JuMP 0.21, the documentation shows how to access a variable's value inside a callback:
using JuMP, Gurobi, Test
model = direct_model(Gurobi.Optimizer())
@variable(model, 0 <= x <= 2.5, Int)
@variable(model, 0 <= y <= 2.5, Int)
@objective(model, Max, y)
cb_calls = Cint[]
function my_callback_function(cb_data, cb_where::Cint)
    # You can reference variables outside the function as normal
    push!(cb_calls, cb_where)
    # You can select where the callback is run
    if cb_where != GRB_CB_MIPSOL && cb_where != GRB_CB_MIPNODE
        return
    end
    # You can query a callback attribute using GRBcbget
    if cb_where == GRB_CB_MIPNODE
        resultP = Ref{Cint}()
        GRBcbget(cb_data, cb_where, GRB_CB_MIPNODE_STATUS, resultP)
        if resultP[] != GRB_OPTIMAL
            return  # Solution is something other than optimal.
        end
    end
    # Before querying `callback_value`, you must call:
    Gurobi.load_callback_variable_primal(cb_data, cb_where)
    x_val = callback_value(cb_data, x)
    y_val = callback_value(cb_data, y)
    # You can submit solver-independent MathOptInterface attributes such as
    # lazy constraints, user-cuts, and heuristic solutions.
    if y_val - x_val > 1 + 1e-6
        con = @build_constraint(y - x <= 1)
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    elseif y_val + x_val > 3 + 1e-6
        con = @build_constraint(y + x <= 3)
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    end
    if rand() < 0.1
        # You can terminate the callback as follows:
        GRBterminate(backend(model))
    end
    return
end
# You _must_ set this parameter if using lazy constraints.
MOI.set(model, MOI.RawParameter("LazyConstraints"), 1)
MOI.set(model, Gurobi.CallbackFunction(), my_callback_function)
optimize!(model)
@test termination_status(model) == MOI.OPTIMAL
@test primal_status(model) == MOI.FEASIBLE_POINT
@test value(x) == 1
@test value(y) == 2
That is, you would use x_val = callback_value(cb_data, x). But what should you do when you have an array of variables whose indices do not start at 1? My variables are not in a vector but are declared with:
@variable(m, x[i=1:n, j=i+1:n], Bin)
Should I access x with a double for loop over its two dimensions and call callback_value multiple times? If so, the range of j will differ for each i, won't it?

Use broadcasting:
x_val = callback_value.(Ref(cb_data), x)
Or just call callback_value(cb_data, x[i, j]) when you need the value.
For example:
using JuMP, Gurobi
model = Model(Gurobi.Optimizer)
@variable(model, 0 <= x[i=1:3, j=i+1:3] <= 2.5, Int)
function my_callback_function(cb_data)
    x_val = callback_value.(Ref(cb_data), x)
    display(x_val)
    for i=1:3, j=i+1:3
        con = @build_constraint(x[i, j] <= floor(Int, x_val[i, j]))
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    end
end
MOI.set(model, MOI.LazyConstraintCallback(), my_callback_function)
optimize!(model)
yields
julia> optimize!(model)
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (mac64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 0 rows, 3 columns and 0 nonzeros
Model fingerprint: 0x5d543c3a
Variable types: 0 continuous, 3 integer (0 binary)
Coefficient statistics:
Matrix range [0e+00, 0e+00]
Objective range [0e+00, 0e+00]
Bounds range [2e+00, 2e+00]
RHS range [0e+00, 0e+00]
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = -0.0
[2, 3] = -0.0
[1, 3] = -0.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = 2.0
[1, 3] = 2.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = 2.0
[1, 3] = 2.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = -0.0
[1, 3] = -0.0
Presolve time: 0.00s
Presolved: 0 rows, 3 columns, 0 nonzeros
Variable types: 0 continuous, 3 integer (0 binary)
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = -0.0
[2, 3] = -0.0
[1, 3] = -0.0
Found heuristic solution: objective 0.0000000
Explored 0 nodes (0 simplex iterations) in 0.14 seconds
Thread count was 8 (of 8 available processors)
Solution count 1: 0
Optimal solution found (tolerance 1.00e-04)
Best objective 0.000000000000e+00, best bound 0.000000000000e+00, gap 0.0000%
User-callback calls 31, time in user-callback 0.14 sec

Related

Best way to get joint probability matrix from categorical data

My goal is to build a joint probability matrix (counts are used here for simplicity) from data samples. My code produces the expected result, but I'm wondering how to optimize it. Here is my implementation:
import numpy as np

def Fill2DCountTable(arraysList):
    '''
    :param arraysList: List of arrays, length=2
        each array is of shape (k, sampleSize),
        k == 1 (or None. numpy will align it) if it's single variable
        else k for a set of variables of size k
    :return: xyJointCounts, xMarginalCounts, yMarginalCounts
    '''
    jointUniques, jointCounts = np.unique(np.vstack(arraysList), axis=1, return_counts=True)
    _, xReverseIndexs = np.unique(jointUniques[[0]], axis=1, return_inverse=True) ###HIGHLIGHT###
    _, yReverseIndexs = np.unique(jointUniques[[1]], axis=1, return_inverse=True)
    xyJointCounts = np.zeros((xReverseIndexs.max() + 1, yReverseIndexs.max() + 1), dtype=np.int32)
    xyJointCounts[tuple(np.vstack([xReverseIndexs, yReverseIndexs]))] = jointCounts
    xMarginalCounts = np.sum(xyJointCounts, axis=1) ###HIGHLIGHT###
    yMarginalCounts = np.sum(xyJointCounts, axis=0)
    return xyJointCounts, xMarginalCounts, yMarginalCounts

def Fill3DCountTable(arraysList):
    # :param arraysList: List of arrays, length=3
    jointUniques, jointCounts = np.unique(np.vstack(arraysList), axis=1, return_counts=True)
    _, xReverseIndexs = np.unique(jointUniques[[0]], axis=1, return_inverse=True)
    _, yReverseIndexs = np.unique(jointUniques[[1]], axis=1, return_inverse=True)
    _, SReverseIndexs = np.unique(jointUniques[2:], axis=1, return_inverse=True)
    SxyJointCounts = np.zeros((SReverseIndexs.max() + 1, xReverseIndexs.max() + 1, yReverseIndexs.max() + 1), dtype=np.int32)
    SxyJointCounts[tuple(np.vstack([SReverseIndexs, xReverseIndexs, yReverseIndexs]))] = jointCounts
    SMarginalCounts = np.sum(SxyJointCounts, axis=(1, 2))
    SxJointCounts = np.sum(SxyJointCounts, axis=2)
    SyJointCounts = np.sum(SxyJointCounts, axis=1)
    return SxyJointCounts, SMarginalCounts, SxJointCounts, SyJointCounts
My use case is conditional independence (CI) testing over variables. The sample size is usually quite large (~10k) and each variable's cardinality is relatively small (~10). I still find the speed unsatisfactory.
How can I best optimize this code, or even the logic around it? Some thoughts:
The ###HIGHLIGHT### lines. For a single X I may compute (X;Y1), (Y2;X), (X;Y3|S1)... many times. What if I cache, for each variable (and each conditioning set), a {uniqueValue: reverseIndex} dictionary and its marginal counts, so that the marginal counts are read directly (no sum needed) and the reverse indices come from a lookup (no unique needed)?
How can I use matrix parallelization to run CI tests in batch, i.e. compute (X;Y|S1), (X;Y|S2), (X;Y|S3)... simultaneously?
Would torch be faster than numpy on the same CPU? Or on a GPU?
It's an open question. Thank you for any ideas.
================== A test example is as follows ==================
xs = np.array( [2, 4, 2, 3, 3, 1, 3, 1, 2, 1] )
ys = np.array( [5, 5, 5, 4, 4, 4, 4, 4, 6, 5] )
Ss = np.array([ [1, 0, 0, 0, 1, 0, 0, 0, 1, 1],
[1, 1, 1, 0, 1, 0, 1, 0, 1, 0] ])
xyJointCounts, xMarginalCounts, yMarginalCounts = Fill2DCountTable([xs, ys])
SxyJointCounts, SMarginalCounts, SxJointCounts, SyJointCounts = Fill3DCountTable([xs, ys, Ss])
get 2D from (X;Y): xMarginalCounts=[3 3 3 1], yMarginalCounts=[5 4 1], and xyJointCounts (added axes name FYI):
xy| 4 5 6
--|-------
1 | 2 1 0
2 | 0 2 1
3 | 3 0 0
4 | 0 1 0
get 3D from (X;Y|{Z1,Z2}): SxyJointCounts is of shape 4x4x3, where the first 4 means the cardinality of {Z1,Z2} (00, 01, 10, 11 with respective SMarginalCounts=[3 3 1 3]). SxJointCounts is of shape 4x4 and SyJointCounts is of shape 4x3.
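The caching idea raised in the question can be sketched as follows: encode each variable (or conditioning set) to integer codes once, keep the codes, and build any joint table from the cached codes with a single np.bincount call. This is a minimal sketch under those assumptions, not a benchmarked solution; the helper names encode and joint_counts are hypothetical and do not appear in the original code.

```python
import numpy as np

def encode(values):
    """Map each sample (column) to an integer code; call once per variable and cache."""
    values = np.atleast_2d(values)  # shape (k, sampleSize)
    uniques, codes = np.unique(values, axis=1, return_inverse=True)
    # Flatten defensively: the shape of the inverse has varied across NumPy versions.
    return uniques, codes.reshape(-1)

def joint_counts(codes, cards):
    """Count co-occurrences of already-encoded variables (no further np.unique calls)."""
    flat = np.ravel_multi_index(tuple(codes), cards)  # one flat cell index per sample
    return np.bincount(flat, minlength=int(np.prod(cards))).reshape(cards)

# Test data from the question.
xs = np.array([2, 4, 2, 3, 3, 1, 3, 1, 2, 1])
ys = np.array([5, 5, 5, 4, 4, 4, 4, 4, 6, 5])
Ss = np.array([[1, 0, 0, 0, 1, 0, 0, 0, 1, 1],
               [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]])

# Encode once; these results are what would be cached per variable.
x_u, x_c = encode(xs)
y_u, y_c = encode(ys)
s_u, s_c = encode(Ss)

xy = joint_counts((x_c, y_c), (x_u.shape[1], y_u.shape[1]))
sxy = joint_counts((s_c, x_c, y_c), (s_u.shape[1], x_u.shape[1], y_u.shape[1]))
x_marginal = np.bincount(x_c)  # marginal straight from the cached codes
print(xy)
```

Because the codes are computed per variable rather than per pair, repeated tests such as (X;Y1), (X;Y2), (X;Y3|S1) reuse x_c without re-running np.unique on X. Whether this is faster in practice depends on the workload and should be measured.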

Convert a function from Python to TensorFlow

I am trying to convert the R3Det model, which outputs rotated bounding boxes, to a TensorFlow Lite model for on-device inference on mobile devices. The problem is that part of the inference model uses Python code wrapped by tf.py_func, which is not serializable. I am trying to convert the function to TensorFlow, but it contains a for loop and some OpenCV function calls, and I have no idea how to convert these into TensorFlow code. I would appreciate it if anybody could help me out with this. The Python function is given below.
import numpy as np
import cv2

def nms_rotate_cpu(boxes, scores, iou_threshold, max_output_size):
    """
    :param boxes: format [x_c, y_c, w, h, theta]
    :param scores: scores of boxes
    :param iou_threshold: iou threshold (0.7 or 0.5)
    :param max_output_size: max number of output
    :return: the remaining index of boxes
    """
    keep = []
    order = scores.argsort()[::-1]
    num = boxes.shape[0]
    # np.int was removed from recent NumPy releases; use an explicit integer type
    suppressed = np.zeros((num), dtype=np.int64)
    for _i in range(num):
        if len(keep) >= max_output_size:
            break
        i = order[_i]
        if suppressed[i] == 1:
            continue
        keep.append(i)
        r1 = ((boxes[i, 0], boxes[i, 1]), (boxes[i, 2], boxes[i, 3]), boxes[i, 4])
        area_r1 = boxes[i, 2] * boxes[i, 3]
        for _j in range(_i + 1, num):
            j = order[_j]
            # Skip boxes that are already suppressed (the original checked index i here,
            # which can never be true at this point; the result is the same either way)
            if suppressed[j] == 1:
                continue
            if np.sqrt((boxes[i, 0] - boxes[j, 0])**2 + (boxes[i, 1] - boxes[j, 1])**2) > (boxes[i, 2] + boxes[j, 2] + boxes[i, 3] + boxes[j, 3]):
                inter = 0.0
            else:
                r2 = ((boxes[j, 0], boxes[j, 1]), (boxes[j, 2], boxes[j, 3]), boxes[j, 4])
                area_r2 = boxes[j, 2] * boxes[j, 3]
                inter = 0.0
                try:
                    int_pts = cv2.rotatedRectangleIntersection(r1, r2)[1]
                    if int_pts is not None:
                        order_pts = cv2.convexHull(int_pts, returnPoints=True)
                        int_area = cv2.contourArea(order_pts)
                        # cfgs.EPSILON comes from the project's config module (not shown)
                        inter = int_area * 1.0 / (area_r1 + area_r2 - int_area + cfgs.EPSILON)
                except:
                    """
                    cv2.error: /io/opencv/modules/imgproc/src/intersection.cpp:247:
                    error: (-215) intersection.size() <= 8 in function rotatedRectangleIntersection
                    """
                    # print(r1)
                    # print(r2)
                    inter = 0.9999
            if inter >= iou_threshold:
                suppressed[j] = 1
    return np.array(keep, np.int64)
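One possible first step toward a tensor-only version is to separate the greedy selection loop from the OpenCV geometry. The NumPy sketch below assumes that a pairwise IoU matrix has already been computed by some other means (computing rotated IoU without OpenCV is not shown and is the harder part); the function name greedy_nms_from_iou and the toy inputs are hypothetical. With the geometry factored out, the remaining loop uses only comparisons and boolean masking.

```python
import numpy as np

def greedy_nms_from_iou(iou, scores, iou_threshold, max_output_size):
    """Greedy NMS given a precomputed (num, num) pairwise IoU matrix (assumed input)."""
    order = np.argsort(scores)[::-1]          # highest score first
    suppressed = np.zeros(len(scores), dtype=bool)
    keep = []
    for i in order:
        if len(keep) >= max_output_size:
            break
        if suppressed[i]:
            continue
        keep.append(i)
        # Suppress every box that overlaps the kept box too much.
        # Boxes already processed are unaffected because they are never revisited.
        suppressed |= iou[i] >= iou_threshold
    return np.array(keep, dtype=np.int64)

# Toy example: boxes 0 and 1 overlap heavily, box 2 is separate.
iou = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.1],
                [0.0, 0.1, 1.0]])
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms_from_iou(iou, scores, iou_threshold=0.5, max_output_size=10)
print(kept)
```

The inner loop of the original function becomes a single vectorized mask update here, which leaves only the outer loop to express with whatever looping construct the target framework provides.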

how to implement the variable array with one and zero in tensorflow

I'm totally new to TensorFlow, and I just want to implement a kind of selection function using matrix multiplication.
Example below:
#input:
I = [[9.6, 4.1, 3.2]]
#selection: (a single "1" value, and the others are "0"s)
s = tf.transpose(tf.Variable([[a, b, c]]))
e.g. s could be [[0, 1, 0]] or [[0, 0, 1]] or [[1, 0, 0]]
#result:(multiplication)
o = tf.matmul(I, s)
Sorry for the poor explanation.
I intend to find the 'solution' among distribution functions with different means and sigmas (values range from 0 to 1).
So now I have three variables: i, j, and index.
value1 = np.exp(-((index - m1[i]) ** 2.) / s1[i]** 2.)
value2 = np.exp(-((index - m2[j]) ** 2.) / s2[j]** 2.)
m1 = [1, 3, 5], s1 = [0.2, 0.4, 0.5]  # first graph
m2 = [3, 5, 7], s2 = [0.5, 0.5, 1.0]  # second graph
I want to maximize the total value,
e.g. value1 + value2 = 1 + 1 = 2, and one of the solutions is i = 2, j = 1, index = 5.
Or should I do this with a different module?
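Both parts of the question can be illustrated in plain NumPy, offered here as a sketch rather than a TensorFlow answer. The first part shows selection by multiplying with a one-hot column. The second part evaluates value1 + value2 for every (i, j, index) combination and picks the maximum; the candidate grid for index (integers 0 to 8) is an assumption, as the question does not state it, and s1, s2 are the sigma lists that the value1 and value2 formulas refer to.

```python
import numpy as np

# Selection by matrix multiplication: a one-hot column picks one entry of I.
I = np.array([[9.6, 4.1, 3.2]])
s = np.array([[0.0, 1.0, 0.0]]).T   # shape (3, 1)
o = I @ s                           # shape (1, 1), holds I[0, 1]

# Exhaustive search over (i, j, index) with the means and sigmas from the question.
m1 = np.array([1.0, 3.0, 5.0])
s1 = np.array([0.2, 0.4, 0.5])
m2 = np.array([3.0, 5.0, 7.0])
s2 = np.array([0.5, 0.5, 1.0])
index = np.arange(0.0, 9.0)         # assumed candidate grid: 0, 1, ..., 8

# value1[i, k] and value2[j, k] via broadcasting, each of shape (3, 9)
value1 = np.exp(-((index[None, :] - m1[:, None]) ** 2) / s1[:, None] ** 2)
value2 = np.exp(-((index[None, :] - m2[:, None]) ** 2) / s2[:, None] ** 2)

# total[i, j, k] = value1[i, k] + value2[j, k], shape (3, 3, 9)
total = value1[:, None, :] + value2[None, :, :]
best_i, best_j, best_k = np.unravel_index(np.argmax(total), total.shape)
print(best_i, best_j, index[best_k], total[best_i, best_j, best_k])
```

With only a handful of discrete choices, enumerating all combinations like this avoids having to make the one-hot selection differentiable.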

Tensorflow, i-th element min-max clamping

Given a tensor of rank 1, e.g. p = [x y z w], how can I "min-max clamp" it within the provided bounds, max = [1 10 5 3] and min = [-1 -10 -5 -3], so that the i-th element of p always lies within [min_i, max_i]?
Extra: Would it be possible to do this for ranks > 1?
I found the following solution adequate. See the documentation for tf.minimum and tf.maximum. Solution:
import tensorflow as tf
p = tf.Variable([-1, 1, 3, 7])
clamp_min = tf.Variable([1, 1, 1, 1])
clamp_max = tf.Variable([5, 5, 5, 5])
p = tf.minimum(p, clamp_max)
p = tf.maximum(p, clamp_min)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
print(sess.run(p))
Produces:
[1 1 3 5]
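For the "Extra" part of the question (ranks > 1), element-wise minimum and maximum can broadcast the per-element bounds along the leading dimensions. The sketch below uses NumPy as a stand-in, with the bounds from the question; the batch p is made up for illustration. tf.minimum and tf.maximum are documented as supporting broadcasting, so the same pattern is expected to carry over, though that is not verified here.

```python
import numpy as np

# Bounds from the question: one (min, max) pair per position along the last axis.
clamp_min = np.array([-1.0, -10.0, -5.0, -3.0])
clamp_max = np.array([1.0, 10.0, 5.0, 3.0])

# A rank-2 input (two rows of four elements); values chosen for illustration only.
p = np.array([[-2.0, 20.0, 0.0, 1.0],
              [0.5, -30.0, 6.0, -4.0]])

# The bounds of shape (4,) broadcast against p of shape (2, 4).
clamped = np.maximum(np.minimum(p, clamp_max), clamp_min)
print(clamped)
```

The bounds only need a shape that broadcasts against p, so the same two calls work for a single vector or for a batch of vectors.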

Tensorflow equivalent for this MATLAB code

I want to create a tensor whose upper-triangular part holds the values from a vector. In MATLAB this can be done with
a = [1 2 3 4 5 6 7 8 9 10];
b = triu(ones(5),1);
b = b'
b(b==1) = a
b = b'
My tensorflow implementation so far
b = tf.matrix_band_part(tf.ones([dim,dim]), 0, -1) # make upper triangular part 1
b = tf.transpose(b)
...
b = tf.transpose(b)
Who can help me?
I haven't seen a fantastic way to do it, but it's certainly possible. Here's one way (expanding the last dimension of a tensor into a matrix; the preceding dimensions may be batch dimensions):
import tensorflow as tf

def matrix_with_upper_values(upper_values):
    # Check that the input is at least a vector
    upper_values = tf.convert_to_tensor(upper_values)
    upper_values.get_shape().with_rank_at_least(1)
    # Put the batch dimensions last
    upper_values = tf.transpose(
        upper_values,
        tf.concat(0, [[tf.rank(upper_values) - 1],
                      tf.range(tf.rank(upper_values) - 1)]))
    input_shape = tf.shape(upper_values)[0]
    # Compute the size of the matrix that would have this upper triangle
    matrix_size = (1 + tf.cast(tf.sqrt(tf.cast(input_shape * 8 + 1, tf.float32)),
                               tf.int32)) // 2
    # Check that the upper triangle size is valid
    check_size_op = tf.Assert(
        tf.equal(matrix_size ** 2, input_shape * 2 + matrix_size),
        ["Not a valid upper triangle size: ", input_shape])
    with tf.control_dependencies([check_size_op]):
        matrix_size = tf.identity(matrix_size)
    # Compute indices for the whole matrix and the upper diagonal
    index_matrix = tf.reshape(tf.range(matrix_size ** 2),
                              [matrix_size, matrix_size])
    diagonal_indicies = (matrix_size * tf.range(matrix_size)
                         + tf.range(matrix_size))
    upper_triangular_indices, _ = tf.unique(tf.reshape(
        tf.matrix_band_part(
            index_matrix, 0, -1)  # upper triangular part
        - tf.diag(diagonal_indicies),  # remove diagonal
        [-1]))
    batch_dimensions = tf.shape(upper_values)[1:]
    return_shape_transposed = tf.concat(0, [[matrix_size, matrix_size],
                                            batch_dimensions])
    # Fill everything else with zeros; later entries get priority
    # in dynamic_stitch
    result_transposed = tf.reshape(
        tf.dynamic_stitch(
            [index_matrix,
             upper_triangular_indices[1:]],  # discard 0
            [tf.zeros(return_shape_transposed, dtype=upper_values.dtype),
             upper_values]),
        return_shape_transposed)
    # Transpose the batch dimensions to be first again
    return tf.transpose(
        result_transposed,
        tf.concat(0, [tf.range(2, tf.rank(upper_values) + 1), [0, 1]]))

with tf.Session():
    print(matrix_with_upper_values([1]).eval())
    print(matrix_with_upper_values([2,7,1]).eval())
    print(matrix_with_upper_values([3,1,4,1,5,9]).eval())
    print(matrix_with_upper_values([]).eval())
    print(matrix_with_upper_values([[2,7,1],[4,3,5]]).eval())
    print(matrix_with_upper_values(tf.zeros([0, 3])).eval())
Prints:
[[0 1]
[0 0]]
[[0 2 7]
[0 0 1]
[0 0 0]]
[[0 3 1 4]
[0 0 1 5]
[0 0 0 9]
[0 0 0 0]]
[[ 0.]]
[[[0 2 7]
[0 0 1]
[0 0 0]]
[[0 4 3]
[0 0 5]
[0 0 0]]]
[]
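For comparison, the same fill order (row by row across the strict upper triangle, as in the MATLAB snippet and the printed results above) can be sketched in NumPy with np.triu_indices. This is a reference sketch for checking expected values, not a TensorFlow implementation; the function name upper_from_vector is hypothetical, and it handles a single vector with a known matrix size rather than batches.

```python
import numpy as np

def upper_from_vector(values, n):
    """Place `values` into the strict upper triangle of an n x n matrix, row by row."""
    values = np.asarray(values)
    out = np.zeros((n, n), dtype=values.dtype)
    rows, cols = np.triu_indices(n, k=1)   # k=1 excludes the diagonal
    out[rows, cols] = values               # requires len(values) == n * (n - 1) // 2
    return out

b = upper_from_vector([3, 1, 4, 1, 5, 9], 4)
print(b)
```

Running it on the vector 1..10 with n = 5 reproduces the layout the MATLAB code builds with its two transposes.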