Tensorflow, i-th element min-max clamping - tensorflow

Given a rank-1 tensor, e.g. p = [x y z w], how can I "min-max clamp" it within the provided boundaries max = [1 10 5 3] and min = [-1 -10 -5 -3], so that the i-th element of p always lies between min[i] and max[i]?
Extra: would it be possible to do this for ranks > 1?

I found the following solution adequate. See the documentation for tf.minimum and tf.maximum. Solution:
import tensorflow as tf
p = tf.Variable([-1, 1, 3, 7])
clamp_min = tf.Variable([1, 1, 1, 1])
clamp_max = tf.Variable([5, 5, 5, 5])
p = tf.minimum(p, clamp_max)
p = tf.maximum(p, clamp_min)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
print(sess.run(p))
Produces:
[1 1 3 5]
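On the extra question about ranks > 1: the same minimum/maximum composition should work unchanged, because these element-wise ops broadcast a rank-1 bounds vector against the last axis of a higher-rank tensor. Below is a minimal sketch in NumPy, used here only so that it runs without a session; tf.minimum and tf.maximum follow the same broadcasting rules, and tf.clip_by_value also accepts tensor-valued bounds. The helper name clamp and the rank-2 input values are made up for illustration.

```python
import numpy as np

def clamp(p, lo, hi):
    """Element-wise clamp of p into [lo, hi]; lo and hi broadcast against p."""
    return np.minimum(np.maximum(p, lo), hi)

# Bounds from the question, one (min, max) pair per position on the last axis.
lo = np.array([-1.0, -10.0, -5.0, -3.0])
hi = np.array([1.0, 10.0, 5.0, 3.0])

# Hypothetical rank-2 input: every row is clamped with the same per-column bounds.
p = np.array([[-4.0, 20.0, 0.0, 2.0],
              [0.5, -30.0, 7.0, -3.5]])
clamped = clamp(p, lo, hi)
print(clamped)
```

If the bounds differ per row as well, passing lo and hi with the same shape as p works the same way.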

Related

takes 1 positional argument but 2 were given in @tf.function

I have a network written with TensorFlow Keras. Part of my code needs scipy.spatial.cKDTree, so I decorated my function with @tf.function. When I try to build the tree I get the following error. (Let me know if more details are required.)
The error happens when it tries to build the cKDTree. pc2e has shape=(46080, 3).
In similar questions I found that it could be caused by the Pillow version; I changed the version, but that didn't solve the error.
Also, is there a better way to have a KDTree in TensorFlow?
TypeError: in user code:
/home/***/My_Models.py:731 var_layer *
tree2 = cKDTree(pc2e, leafsize=500, balanced_tree=False)
ckdtree.pyx:522 scipy.spatial.ckdtree.cKDTree.__init__ **
TypeError: __array__() takes 1 positional argument but 2 were given
Process finished with exit code 1
The function:
@tf.function
def var_layer(self, inputs, output):  # output: x y z i j k w
    inputs_v = tf.Variable(inputs)
    pc1_raw, pc2_raw = tf.split(inputs_v, num_or_size_splits=2, axis=4)
    # B x T x W x H x Channels
    s0, s1, s2, s3, s4 = pc1_raw.shape[0], pc1_raw.shape[1], pc1_raw.shape[2], pc1_raw.shape[3], pc1_raw.shape[4]
    pc1 = tf.reshape(pc1_raw[:, -1, :, :, 0:3], shape=[-1, s2 * s3, 3])
    pc2 = tf.reshape(pc2_raw[:, -1, :, :, 0:3], shape=[-1, s2 * s3, 3])
    # normal2 = tf.reshape(pc2_raw[:, -1, :, :, 3:6], [-1, s2 * s3, 3])
    # normal1 = tf.reshape(pc1_raw[:, -1, :, :, 3:6], [-1, s2 * s3, 3])
    Rq, Tr3 = tfg.dual_quaternion.to_rotation_translation(output)
    R33 = tfg.rotation_matrix_3d.from_quaternion(Rq)
    RT = tf.concat([R33, tf.expand_dims(Tr3, axis=2)], -1)
    RT = tf.pad(RT, [[0, 0], [0, 1], [0, 0]], constant_values=[0.0, 0.0, 0.0, 1.0])
    pc1 = tf.pad(pc1, [[0, 0], [0, 0], [0, 1]], constant_values=1)
    pc1 = tf.transpose(pc1, perm=[0, 2, 1])
    pc1_tr = tf.linalg.matmul(RT, pc1)
    pc1_tr = pc1_tr[:, 0:3]
    pc1_tr = tf.transpose(pc1_tr, perm=[0, 2, 1])  # B x WH x 3
    # remove zero values
    for epoch in range(self.Epochs):
        pc2e = pc2[epoch]
        print(pc2e)
        tree2 = cKDTree(pc2e, leafsize=500, balanced_tree=False)
        dist_in, ind = tree2.query(pc1_tr[epoch], k=1)
        nonempty = np.count_nonzero(dist_in)
        dist_in = np.sum(np.abs(dist_in))
        if nonempty != 0:
            dist_in = np.divide(dist_in, nonempty)
        dist_p2p = dist_in
        print(dist_p2p)
    return dist_p2p
versions:
Tensorflow 2.3.0
Scipy 1.4.1
pillow==8.2.0
Input of the function is a point cloud with this shape: Batch x Time x W x H x Channels
and the size of pc2e is shape=(46080, 3)

Julia - Gurobi Callbacks on array of JuMP variables

With Gurobi and JuMP 0.21, it is well documented here how to access a variable inside a callback:
using JuMP, Gurobi, Test
model = direct_model(Gurobi.Optimizer())
@variable(model, 0 <= x <= 2.5, Int)
@variable(model, 0 <= y <= 2.5, Int)
@objective(model, Max, y)
cb_calls = Cint[]
function my_callback_function(cb_data, cb_where::Cint)
    # You can reference variables outside the function as normal
    push!(cb_calls, cb_where)
    # You can select where the callback is run
    if cb_where != GRB_CB_MIPSOL && cb_where != GRB_CB_MIPNODE
        return
    end
    # You can query a callback attribute using GRBcbget
    if cb_where == GRB_CB_MIPNODE
        resultP = Ref{Cint}()
        GRBcbget(cb_data, cb_where, GRB_CB_MIPNODE_STATUS, resultP)
        if resultP[] != GRB_OPTIMAL
            return  # Solution is something other than optimal.
        end
    end
    # Before querying `callback_value`, you must call:
    Gurobi.load_callback_variable_primal(cb_data, cb_where)
    x_val = callback_value(cb_data, x)
    y_val = callback_value(cb_data, y)
    # You can submit solver-independent MathOptInterface attributes such as
    # lazy constraints, user-cuts, and heuristic solutions.
    if y_val - x_val > 1 + 1e-6
        con = @build_constraint(y - x <= 1)
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    elseif y_val + x_val > 3 + 1e-6
        con = @build_constraint(y + x <= 3)
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    end
    if rand() < 0.1
        # You can terminate the callback as follows:
        GRBterminate(backend(model))
    end
    return
end
# You _must_ set this parameter if using lazy constraints.
MOI.set(model, MOI.RawParameter("LazyConstraints"), 1)
MOI.set(model, Gurobi.CallbackFunction(), my_callback_function)
optimize!(model)
@test termination_status(model) == MOI.OPTIMAL
@test primal_status(model) == MOI.FEASIBLE_POINT
@test value(x) == 1
@test value(y) == 2
i.e., you would use x_val = callback_value(cb_data, x). However, what should you do when you have an array of variables whose indices do not start at 1, i.e. the variables are not in a vector but are declared with:
@variable(m, x[i=1:n, j=i+1:n], Bin)
Should I access x with a double for loop over its two dimensions and call callback_value multiple times? If so, the indices for j will not be the same for every i, will they?
Use broadcasting:
x_val = callback_value.(Ref(cb_data), x)
Or just call callback_value(cb_data, x[i, j]) when you need the value.
For example:
using JuMP, Gurobi
model = Model(Gurobi.Optimizer)
@variable(model, 0 <= x[i=1:3, j=i+1:3] <= 2.5, Int)
function my_callback_function(cb_data)
    x_val = callback_value.(Ref(cb_data), x)
    display(x_val)
    for i=1:3, j=i+1:3
        con = @build_constraint(x[i, j] <= floor(Int, x_val[i, j]))
        MOI.submit(model, MOI.LazyConstraint(cb_data), con)
    end
end
MOI.set(model, MOI.LazyConstraintCallback(), my_callback_function)
optimize!(model)
yields
julia> optimize!(model)
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (mac64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 0 rows, 3 columns and 0 nonzeros
Model fingerprint: 0x5d543c3a
Variable types: 0 continuous, 3 integer (0 binary)
Coefficient statistics:
Matrix range [0e+00, 0e+00]
Objective range [0e+00, 0e+00]
Bounds range [2e+00, 2e+00]
RHS range [0e+00, 0e+00]
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = -0.0
[2, 3] = -0.0
[1, 3] = -0.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = 2.0
[1, 3] = 2.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = 2.0
[1, 3] = 2.0
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = 2.0
[2, 3] = -0.0
[1, 3] = -0.0
Presolve time: 0.00s
Presolved: 0 rows, 3 columns, 0 nonzeros
Variable types: 0 continuous, 3 integer (0 binary)
JuMP.Containers.SparseAxisArray{Float64,2,Tuple{Int64,Int64}} with 3 entries:
[1, 2] = -0.0
[2, 3] = -0.0
[1, 3] = -0.0
Found heuristic solution: objective 0.0000000
Explored 0 nodes (0 simplex iterations) in 0.14 seconds
Thread count was 8 (of 8 available processors)
Solution count 1: 0
Optimal solution found (tolerance 1.00e-04)
Best objective 0.000000000000e+00, best bound 0.000000000000e+00, gap 0.0000%
User-callback calls 31, time in user-callback 0.14 sec

Best way to get joint probability matrix from categorical data

My goal is to get the joint probability matrix (counts are used here as an example) from data samples. I can already get the expected result, but I'm wondering how to optimize it. Here is my implementation:
import numpy as np
def Fill2DCountTable(arraysList):
    '''
    :param arraysList: List of arrays, length=2
        each array is of shape (k, sampleSize),
        k == 1 (or None. numpy will align it) if it's single variable
        else k for a set of variables of size k
    :return: xyJointCounts, xMarginalCounts, yMarginalCounts
    '''
    jointUniques, jointCounts = np.unique(np.vstack(arraysList), axis=1, return_counts=True)
    _, xReverseIndexs = np.unique(jointUniques[[0]], axis=1, return_inverse=True) ###HIGHLIGHT###
    _, yReverseIndexs = np.unique(jointUniques[[1]], axis=1, return_inverse=True)
    xyJointCounts = np.zeros((xReverseIndexs.max() + 1, yReverseIndexs.max() + 1), dtype=np.int32)
    xyJointCounts[tuple(np.vstack([xReverseIndexs, yReverseIndexs]))] = jointCounts
    xMarginalCounts = np.sum(xyJointCounts, axis=1) ###HIGHLIGHT###
    yMarginalCounts = np.sum(xyJointCounts, axis=0)
    return xyJointCounts, xMarginalCounts, yMarginalCounts
def Fill3DCountTable(arraysList):
    # :param arraysList: List of arrays, length=3
    jointUniques, jointCounts = np.unique(np.vstack(arraysList), axis=1, return_counts=True)
    _, xReverseIndexs = np.unique(jointUniques[[0]], axis=1, return_inverse=True)
    _, yReverseIndexs = np.unique(jointUniques[[1]], axis=1, return_inverse=True)
    _, SReverseIndexs = np.unique(jointUniques[2:], axis=1, return_inverse=True)
    SxyJointCounts = np.zeros((SReverseIndexs.max() + 1, xReverseIndexs.max() + 1, yReverseIndexs.max() + 1), dtype=np.int32)
    SxyJointCounts[tuple(np.vstack([SReverseIndexs, xReverseIndexs, yReverseIndexs]))] = jointCounts
    SMarginalCounts = np.sum(SxyJointCounts, axis=(1, 2))
    SxJointCounts = np.sum(SxyJointCounts, axis=2)
    SyJointCounts = np.sum(SxyJointCounts, axis=1)
    return SxyJointCounts, SMarginalCounts, SxJointCounts, SyJointCounts
My use case is conditional independence testing over variables. The sample size is usually quite big (~10k) and each variable's categorical cardinality is relatively small (~10). I still find the speed unsatisfying.
How can I best optimize this code, or even the logic outside the code? Some thoughts:
The ###HIGHLIGHT### lines. For a single X I may calculate (X;Y1), (Y2;X), (X;Y3|S1)... many times. What if I cache each variable's (and each conditioning set's) {uniqueValue: reverseIndex} dictionary and its marginal counts, and then read marginalCounts directly (no need to sum) and obtain reverseIndexs by lookup (no need to call unique)?
How can I use matrix parallelization to run CI tests in batch, i.e. calculate (X;Y|S1), (X;Y|S2), (X;Y|S3)... simultaneously?
Will torch be faster than numpy on the same CPU? Or on GPU?
It's an open question. Thank you for any ideas. Big thanks for your help :)
================== A test example is as follows ==================
xs = np.array( [2, 4, 2, 3, 3, 1, 3, 1, 2, 1] )
ys = np.array( [5, 5, 5, 4, 4, 4, 4, 4, 6, 5] )
Ss = np.array([ [1, 0, 0, 0, 1, 0, 0, 0, 1, 1],
[1, 1, 1, 0, 1, 0, 1, 0, 1, 0] ])
xyJointCounts, xMarginalCounts, yMarginalCounts = Fill2DCountTable([xs, ys])
SxyJointCounts, SMarginalCounts, SxJointCounts, SyJointCounts = Fill3DCountTable([xs, ys, Ss])
get 2D from (X;Y): xMarginalCounts=[3 3 3 1], yMarginalCounts=[5 4 1], and xyJointCounts (added axes name FYI):
xy| 4 5 6
--|-------
1 | 2 1 0
2 | 0 2 1
3 | 3 0 0
4 | 0 1 0
get 3D from (X;Y|{Z1,Z2}): SxyJointCounts is of shape 4x4x3, where the first 4 means the cardinality of {Z1,Z2} (00, 01, 10, 11 with respective SMarginalCounts=[3 3 1 3]). SxJointCounts is of shape 4x4 and SyJointCounts is of shape 4x3.
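The caching idea raised in the question can be sketched roughly as follows: encode every variable (or conditioning set) to integer codes once, then build any joint table from the cached codes with a single np.bincount call instead of repeated np.unique calls. This is only a sketch under the assumption that all variables share the same samples; the helper names encode and count_table are hypothetical, not part of the original code.

```python
import numpy as np

def encode(values, axis=None):
    """Return (codes, cardinality): one integer code in 0..k-1 per sample."""
    uniques, codes = np.unique(values, axis=axis, return_inverse=True)
    k = uniques.shape[0] if axis is None else uniques.shape[axis]
    return codes.ravel(), k

def count_table(*encoded):
    """Contingency table over pre-encoded variables, using one bincount."""
    codes = tuple(c for c, _ in encoded)
    shape = tuple(k for _, k in encoded)
    flat = np.ravel_multi_index(codes, shape)  # one flat cell index per sample
    return np.bincount(flat, minlength=int(np.prod(shape))).reshape(shape)

# Test data from the question.
xs = np.array([2, 4, 2, 3, 3, 1, 3, 1, 2, 1])
ys = np.array([5, 5, 5, 4, 4, 4, 4, 4, 6, 5])
Ss = np.array([[1, 0, 0, 0, 1, 0, 0, 0, 1, 1],
               [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]])

# Encode once; these tuples can be cached and reused across many tests.
ex, ey, es = encode(xs), encode(ys), encode(Ss, axis=1)
xy_counts = count_table(ex, ey)        # shape (4, 3)
sxy_counts = count_table(es, ex, ey)   # shape (4, 4, 3)
print(xy_counts)
print(sxy_counts.sum(axis=(1, 2)))     # marginal counts of the conditioning set
```

Marginals still come from summing the table, but with cardinalities around 10 those sums are cheap compared with the per-call np.unique over 10k samples.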

Pairwise distance between a set of Matrices in Keras/Tensorflow

I want to calculate the pairwise distance between a set of tensors (e.g. 4 tensors). Each one is a 2-D tensor (a matrix). I don't know how to do this in a vectorized way. I wrote the following pseudocode to show what I need:
E.shape => [4,30,30]
sum = 0
for i in range(4):
    for j in range(4):
        res = calculate_distance(E[i], E[j])  # E[i] is one of the 30x30 tensors
        sum = sum + reduce_sum(res)
Here is my last try:
x_ = tf.expand_dims(E, 0)
y_ = tf.expand_dims(E, 1)
s = x_ - y_
P = tf.reduce_sum(tf.norm(s, axis=[-2, -1]))
This code works, but I don't know how to do it for a batch. For instance, when E.shape is [BATCH_SIZE, 4, 30, 30] my code doesn't work and runs out of memory. How can I do this efficiently?
Edit: After a day, I found a solution. It's not perfect, but it works:
res = tf.map_fn(lambda x: tf.map_fn(lambda y: tf.map_fn(lambda z: tf.norm(z - x), x), x), E)
res = tf.reduce_mean(tf.square(res))
Your solution with expand_dims should be okay if your batch size is not too large. However, given that your original pseudo code loops over range(4), you should probably expand axes 1 and 2, instead of 0 and 1.
You can check the shape of the tensors to ensure that you're specifying the correct axes. For example,
batch_size = 8
E_np = np.random.rand(batch_size, 4, 30, 30)
E = K.variable(E_np) # shape=(8, 4, 30, 30)
x_ = K.expand_dims(E, 1)
y_ = K.expand_dims(E, 2)
s = x_ - y_ # shape=(8, 4, 4, 30, 30)
distances = tf.norm(s, axis=[-2, -1]) # shape=(8, 4, 4)
P = K.sum(distances, axis=[-2, -1]) # shape=(8,)
Now P will be the sum of pairwise distances between the 4 matrices for each of the 8 samples.
You can also verify that the values in P are the same as what your pseudocode would compute:
answer = []
for batch_idx in range(batch_size):
    s = 0
    for i in range(4):
        for j in range(4):
            a = E_np[batch_idx, i]
            b = E_np[batch_idx, j]
            s += np.sqrt(np.trace(np.dot(a - b, (a - b).T)))
    answer.append(s)
print(answer)
[149.45960605637578, 147.2815068236368, 144.97487402393705, 146.04866735065312, 144.25537059201062, 148.9300986019226, 146.61229889228133, 149.34259789169045]
print(K.eval(P).tolist())
[149.4595947265625, 147.281494140625, 144.97488403320312, 146.04867553710938, 144.25537109375, 148.9300994873047, 146.6123046875, 149.34259033203125]
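TensorFlow can compute the Frobenius norm via the tf.norm function: for 2-D matrices it is the default, ord='euclidean'. Note that ord=1 over two axes, as used below, computes the matrix 1-norm instead, which is a different quantity; drop the ord argument to get the Frobenius distance asked about in the question.
The following solution isn't vectorized and assumes that the first dimension of E is known statically: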
E = tf.random_normal(shape=[5, 3, 3], dtype=tf.float32)
F = tf.split(E, E.shape[0])
total = tf.reduce_sum([tf.norm(tensor=(lhs-rhs), ord=1, axis=(-2, -1)) for lhs in F for rhs in F])
Update:
An optimized vectorized version of the same code:
E = tf.random_normal(shape=[1024, 4, 30, 30], dtype=tf.float32)
lhs = tf.expand_dims(E, axis=1)
rhs = tf.expand_dims(E, axis=2)
total = tf.reduce_sum(tf.norm(tensor=(lhs - rhs), ord=1, axis=(-2, -1)))
Memory concerns: when this code is evaluated, tf.contrib.memory_stats.MaxBytesInUse() reports a peak memory consumption of 73,729,792 bytes (about 74 MB), a relatively moderate overhead given that the raw lhs - rhs tensor alone takes 59 MB. Your OOM is most likely caused by duplicating the BATCH_SIZE dimension when you compute s = x_ - y_, because your batch size is much larger than the number of matrices (1024 vs 4).
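If even the (batch, 4, 4, 30, 30) difference tensor is too large, one possible way around it is to avoid materializing the differences at all. For the Frobenius distance, ||A - B||^2 = ||A||^2 + ||B||^2 - 2<A, B>, so a small Gram matrix per batch item is enough. Below is a minimal NumPy sketch of that identity, assuming the Frobenius norm is the intended distance; the function name is hypothetical, and the same steps map onto tf.reshape, tf.matmul and tf.sqrt.

```python
import numpy as np

def pairwise_frobenius_sum(E):
    """Sum over i, j of ||E[b, i] - E[b, j]||_F for each batch item b.

    E has shape (batch, n, h, w); the result has shape (batch,).
    """
    batch, n = E.shape[:2]
    flat = E.reshape(batch, n, -1)              # (batch, n, h*w)
    gram = flat @ flat.transpose(0, 2, 1)       # (batch, n, n) inner products
    sq_norms = np.einsum('bii->bi', gram)       # diagonal: squared norms
    sq_dist = sq_norms[:, :, None] + sq_norms[:, None, :] - 2.0 * gram
    # Round-off can push tiny values below zero; clip before the square root.
    return np.sqrt(np.maximum(sq_dist, 0.0)).sum(axis=(1, 2))

rng = np.random.default_rng(0)
E = rng.random((8, 4, 30, 30))
fast = pairwise_frobenius_sum(E)

# Reference: explicit difference tensor, as in the expand_dims versions.
diff = E[:, :, None] - E[:, None, :]            # (8, 4, 4, 30, 30)
slow = np.sqrt((diff ** 2).sum(axis=(-2, -1))).sum(axis=(1, 2))
print(np.allclose(fast, slow))
```

The largest intermediate here is the (batch, n, n) Gram matrix rather than a tensor n times the size of the input. When this is used inside a differentiable loss, the square root at the zero diagonal has an unbounded gradient, so the diagonal may need to be masked out or a small epsilon added.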

How to find an index of the first matching element in TensorFlow

I am looking for a TensorFlow way of implementing something similar to Python's list.index() function.
Given a matrix and a value to find, I want to know the first occurrence of the value in each row of the matrix.
For example,
m is a <batch_size, 100> matrix of integers
val = 23
result = [0] * batch_size
for i, row_elems in enumerate(m):
    result[i] = row_elems.index(val)
I cannot assume that 'val' appears only once in each row; otherwise I would have implemented it using tf.argmax(m == val). In my case, it is important to get the index of the first occurrence of 'val', not just any occurrence.
It seems that tf.argmax works like np.argmax (according to the test), which will return the first index when there are multiple occurrences of the max value.
You can use tf.argmax(tf.cast(tf.equal(m, val), tf.int32), axis=1) to get what you want. However, currently the behavior of tf.argmax is undefined in case of multiple occurrences of the max value.
If you are worried about undefined behavior, you can apply tf.argmin on the return value of tf.where as @Igor Tsvetkov suggested.
For example,
# test with tensorflow r1.0
import tensorflow as tf
val = 3
m = tf.placeholder(tf.int32)
m_feed = [[0  , 0  , val, 0  , val],
          [val, 0  , val, val, 0  ],
          [0  , val, 0  , 0  , 0  ]]
tmp_indices = tf.where(tf.equal(m, val))
result = tf.segment_min(tmp_indices[:, 1], tmp_indices[:, 0])
with tf.Session() as sess:
    print(sess.run(result, feed_dict={m: m_feed})) # [2, 0, 1]
Note that tf.segment_min will raise InvalidArgumentError when some row contains no val. In your code, row_elems.index(val) will also raise an exception when row_elems doesn't contain val.
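If rows without val should yield a sentinel rather than an error, the missing case can be handled explicitly. Here is a minimal sketch in NumPy, whose argmax is documented to return the first occurrence of the maximum; the function name and the -1 sentinel are assumptions made for illustration, and the same mask/any/where pattern can be written with tf.equal, tf.reduce_any and tf.where.

```python
import numpy as np

def first_index_per_row(m, val, missing=-1):
    """Column of the first occurrence of val in each row, or `missing`."""
    mask = (m == val)
    first = mask.argmax(axis=1)  # first True per row (0 when the row has none)
    return np.where(mask.any(axis=1), first, missing)

m = np.array([[0, 0, 3, 0, 3],
              [3, 0, 3, 3, 0],
              [0, 3, 0, 0, 0],
              [0, 0, 0, 0, 0]])  # last row contains no match
result = first_index_per_row(m, 3)
print(result)
```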
Looks a little ugly but works (assuming m and val are both tensors):
idx = list()
for t in tf.unstack(m, axis=0):  # tf.unpack was renamed to tf.unstack in TF 1.0
    idx.append(tf.reduce_min(tf.where(tf.equal(t, val))))
idx = tf.stack(idx, axis=0)  # likewise, tf.pack was renamed to tf.stack
EDIT:
As Yaroslav Bulatov mentioned, you could achieve the same result with tf.map_fn:
def index1d(t):
    return tf.reduce_min(tf.where(tf.equal(t, val)))
idx = tf.map_fn(index1d, m, dtype=tf.int64)
Here is another solution to the problem, assuming there is a hit on every row.
import tensorflow as tf
val = 3
m = tf.constant([
    [0  , 0  , val, 0  , val],
    [val, 0  , val, val, 0  ],
    [0  , val, 0  , 0  , 0  ]])
# replace all entries in the matrix either with its column index, or out-of-index-number
match_indices = tf.where(                          # [[5, 5, 2, 5, 4],
    tf.equal(val, m),                              #  [0, 5, 2, 3, 5],
    x=tf.range(tf.shape(m)[1]) * tf.ones_like(m),  #  [5, 1, 5, 5, 5]]
    y=(tf.shape(m)[1])*tf.ones_like(m))
result = tf.reduce_min(match_indices, axis=1)
with tf.Session() as sess:
    print(sess.run(result)) # [2, 0, 1]
Here is a solution that also handles the case where the element does not occur in a row (taken from a DeepMind GitHub repository):
def get_first_occurrence_indices(sequence, eos_idx):
    '''
    args:
        sequence: [batch, length]
        eos_idx: scalar
    '''
    batch_size, maxlen = sequence.get_shape().as_list()
    eos_idx = tf.convert_to_tensor(eos_idx)
    tensor = tf.concat(
        [sequence, tf.tile(eos_idx[None, None], [batch_size, 1])], axis = -1)
    index_all_occurrences = tf.where(tf.equal(tensor, eos_idx))
    index_all_occurrences = tf.cast(index_all_occurrences, tf.int32)
    index_first_occurrences = tf.segment_min(index_all_occurrences[:, 1],
                                             index_all_occurrences[:, 0])
    index_first_occurrences.set_shape([batch_size])
    index_first_occurrences = tf.minimum(index_first_occurrences + 1, maxlen)
    return index_first_occurrences
And:
import tensorflow as tf
mat = tf.Variable([[1,2,3,4,5], [2,3,4,5,6], [3,4,5,6,7], [0,0,0,0,0]], dtype = tf.int32)
idx = 3
first_occurrences = get_first_occurrence_indices(mat, idx)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
sess.run(first_occurrences) # [3, 2, 1, 5]