How to create a multi-dimensional one-hot tensor - tensorflow

I have a list of K (x_i, y_i) pairs where 0 <= x_i < X and 0 <= y_i < Y represented as a tensor of shape [K, 2].
I want to create a tensor T of shape [K, X, Y], where T[i, x, y] = 1 if x = x_i and y = y_i, 0 otherwise.
I know that for a list of indices I can use tf.one_hot, but I am not sure whether I can reuse it here, for example with something like tf.one_hot(pairs, depth=(X,Y)).

From this SO post we get a slick way to do this in numpy:
(np.arange(a.max()) == a[...,None]-1).astype(int)
Using that trick, we just have to port it to TensorFlow:
import numpy as np
import tensorflow as tf
# for the numpy, full credit to @Divakar and https://stackoverflow.com/questions/34987509/tensorflow-max-of-a-tensor-along-an-axis
print('first an awesome way to do it in numpy...')
a = np.array([[1,2,4],[3,1,0]])
print((np.arange(a.max()) == a[...,None]-1).astype(int))
# porting this to tensorflow...
print('\nnow in tensorflow...')
b = tf.constant([[1,2,4],[3,1,0]])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(tf.cast(tf.equal(tf.range(tf.reduce_max(b)),tf.reshape(b,[2,3,1])-1),tf.int32)))
Returns:
first an awesome way to do it in numpy...
[[[1 0 0 0]
[0 1 0 0]
[0 0 0 1]]
[[0 0 1 0]
[1 0 0 0]
[0 0 0 0]]]
now in tensorflow...
[[[1 0 0 0]
[0 1 0 0]
[0 0 0 1]]
[[0 0 1 0]
[1 0 0 0]
[0 0 0 0]]]
That was fun.
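The same broadcasting comparison can be aimed directly at the [K, 2] pairs from the question. Here is a minimal NumPy sketch, assuming made-up example pairs and sizes (X = 4, Y = 5); a TensorFlow port would presumably swap in tf.equal, tf.range and tf.logical_and, but that port is not tested here.

```python
import numpy as np

# Hypothetical example data: K = 3 pairs with X = 4, Y = 5.
X, Y = 4, 5
pairs = np.array([[0, 1], [3, 4], [2, 2]])   # shape [K, 2]

# Compare each x_i against range(X) and each y_i against range(Y).
x_mask = pairs[:, 0, None] == np.arange(X)   # shape [K, X]
y_mask = pairs[:, 1, None] == np.arange(Y)   # shape [K, Y]

# Broadcast the two masks against each other to get shape [K, X, Y].
T = (x_mask[:, :, None] & y_mask[:, None, :]).astype(int)

print(T.shape)             # (3, 4, 5)
print(T[1, 3, 4])          # 1
print(T.sum(axis=(1, 2)))  # [1 1 1]
```

Each slice T[i] contains exactly one 1, at position (x_i, y_i).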

I think the best solution uses tf.sparse_to_dense. For example, if we want ones in positions (6,2), (3,4), (4,5) of a 10x8 matrix:
indices = sorted([[6,2],[3,4],[4,5]])
one_hot_encoded = tf.sparse_to_dense(sparse_indices=indices, output_shape=[10,8], sparse_values=1)
with tf.Session() as session:
    tf.global_variables_initializer().run()
    print(one_hot_encoded.eval())
This returns the following:
[[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0.]]
Furthermore, the inputs (e.g. indices) may be tf.Variable objects; they do not need to be constants.
It has a couple of restrictions: the indices must be sorted (hence the sorted above) and must not be repeated.
You can also use tf.one_hot directly. In that case, you need the indices as two vectors, one with all the x values and one with all the y values, i.e. list(zip(*indices)). Then one can do:
import numpy as np
im_size = [10, 8]  # shape of the output matrix
# Convert to an array so the index arithmetic below is elementwise
new_indices = np.array(list(zip(*indices)))
# one of the following: the first one is for xy index convention:
flat_indices = new_indices[1] * im_size[1] + new_indices[0]
# this other for ij convention:
# flat_indices = new_indices[0] * im_size[1] + new_indices[1]
# Apply tf.one_hot to the flattened vector, then sum along the newly created dimension
one_hot_flat = tf.reduce_sum(tf.one_hot(flat_indices, depth=int(np.prod(im_size))), axis=0)
# Finally reshape
one_hot_encoded = tf.reshape(one_hot_flat, im_size)
with tf.Session() as session:
    tf.global_variables_initializer().run()
    print(one_hot_encoded.eval())
This returns the same as the above. However, the indices don't need to be sorted, and they can be repeated (in which case the corresponding entry will be the number of appearances; for a simple "1" everywhere, replace tf.reduce_sum with tf.reduce_max). This also supports variables.
However, for large indices / depths, memory consumption may be a problem. It creates a temporary N x W x H tensor, where N is the number of index tuples, and that can get expensive. Therefore, the first solution is probably preferable when possible.
Actually, if one is okay with using a sparse tensor, the most memory-efficient way is probably just:
sparse = tf.SparseTensor(indices=indices, values=[1]*len(indices), dense_shape=[10, 8])
When run, this returns a more cryptic:
SparseTensorValue(indices=array([[3, 4],
[4, 5],
[6, 2]]), values=array([1, 1, 1], dtype=int32), dense_shape=array([10, 8]))
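The flattened-index idea can also produce the per-pair [K, X, Y] tensor the question asks for. Here is a minimal NumPy sketch, assuming the ij convention and made-up example pairs, with np.eye standing in for tf.one_hot. One pair is repeated on purpose to show the difference between summing and taking the maximum over the K axis.

```python
import numpy as np

# Hypothetical example data: K = 3 pairs with X = 4, Y = 5; pair (0, 1) repeats.
X, Y = 4, 5
pairs = np.array([[0, 1], [3, 4], [0, 1]])

# Row-major ("ij") flat index of each pair.
flat = pairs[:, 0] * Y + pairs[:, 1]                   # shape [K]

# One-hot over X * Y classes (rows of the identity), then unflatten per pair.
T = np.eye(X * Y, dtype=int)[flat].reshape(-1, X, Y)   # shape [K, X, Y]

# Collapsing the K axis gives a single [X, Y] grid.
counts = T.sum(axis=0)   # how many times each cell appears
mask = T.max(axis=0)     # 1 wherever a cell appears at least once

print(counts[0, 1], mask[0, 1])   # 2 1
```

The intermediate T has K * X * Y entries, which is the memory cost mentioned above.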

Related

numpy function to use for mathematical dot product to produce scalar

Question
What numpy function to use for mathematical dot product in the case below?
Backpropagation for a Linear Layer
Define sample (2,3) array:
In [299]: dldx = np.arange(6).reshape(2,3)
In [300]: w
Out[300]:
array([[0.1, 0.2, 0.3],
[0. , 0. , 0. ]])
Element wise multiplication:
In [301]: dldx*w
Out[301]:
array([[0. , 0.2, 0.6],
[0. , 0. , 0. ]])
and summing on the last axis (size 3) produces a 2 element array:
In [302]: (dldx*w).sum(axis=1)
Out[302]: array([0.8, 0. ])
Your (6) is the first term, dropping the 0. One might argue that the use of a dot/inner in (5) is a bit sloppy.
np.einsum borrows ideas from physics, where dimensions may be higher. This case can be expressed as
In [303]: np.einsum('ij,ij->i',dldx,w)
Out[303]: array([0.8, 0. ])
inner and dot do more calculations than we want. We just want the diagonal:
In [304]: np.dot(dldx,w.T)
Out[304]:
array([[0.8, 0. ],
[2.6, 0. ]])
In [305]: np.inner(dldx,w)
Out[305]:
array([[0.8, 0. ],
[2.6, 0. ]])
In matmul/@ terms, the size 2 dimension is a 'batch' one, so we have to add dimensions:
In [306]: dldx[:,None,:]@w[:,:,None]
Out[306]:
array([[[0.8]],
[[0. ]]])
This is (2,1,1), so we need to squeeze out the 1s.
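To tie the variants together, here is a short sketch that computes the row-wise dot product four ways on the same sample arrays and checks that they agree. The einsum subscripts repeat j on both operands so that the products are taken elementwise before summing; the variable names are only for illustration.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])

by_sum = (dldx * w).sum(axis=1)                            # elementwise product, then sum
by_einsum = np.einsum('ij,ij->i', dldx, w)                 # same thing in index notation
by_diag = np.diag(np.dot(dldx, w.T))                       # diagonal of the full (2,2) product
by_matmul = (dldx[:, None, :] @ w[:, :, None]).squeeze()   # batched (1,3) @ (3,1), then squeeze

for result in (by_einsum, by_diag, by_matmul):
    assert np.allclose(result, by_sum)
print(by_sum.shape)   # (2,)
```

The dot/inner route computes the off-diagonal terms only to throw them away, so the first two forms are usually the cheaper choice.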

How to solve a facility location allocation (IP) problem in CVXPY

I am learning to solve optimization problems using CVXPY, so I started with the following simple facility location allocation problem.
The Code in CVXPY is given as:
import numpy as np
import cvxpy as cvx
Fi = np.array([1,1,1]) # Fixed cost of each facility
Ci = np.array([15, 10, 10]) # Capacity of each facility
Dj = np.array([5, 5, 5, 3, 3, 4]) # Demand of each demand point
n = len(Dj)
m = len(Fi)
Cij = np.ones((m,n)) # Variable cost coefficients (all ones)
constraints = []
# Decision Variables
Xij = cvx.Bool(m,n) # (m,n) vector
Yi = cvx.Bool(m) # column vector of length (m,1)
# Objective
fixed_cost = cvx.sum_entries(Fi*Yi)
var_cost = cvx.sum_entries(Cij.T * Dj *Xij)
total_cost = fixed_cost + var_cost
objective = cvx.Minimize(total_cost)
# Maximum facility locations to be selected?
constraints.append(cvx.sum_entries(Yi)==2)
# Sum of demands allocated to a facility shall be <= facility capacity -
# Capacity Fixed Cost
constraints.append(cvx.sum_entries(Dj * Xij.T, axis=0) <= Ci*Yi)
# Every demand point shall be supplied by only one facility.
constraints.append(cvx.sum_entries(Xij, axis=1) == 1)
# Solve the problem
prob = cvx.Problem(objective, constraints)
prob.solve(solver=cvx.GLPK_MI)
# Print the values
#print("status:", prob.status)
print("optimal value", prob.value)
print("Selected Facility Locations", Yi.value)
print("Assigned Nodes", Xij.value, )
As per the last constraint, a demand location should be supplied by only one facility, however the output of Xij.value shows wrong results.
Using CVXPY version: 0.4.10
status: optimal
optimal value 91.0
Selected Facility Locations [[1.]
[0.]
[1.]]
Assigned Nodes to Facility 1) [[1. 0. 0. 0. 0. 0.]]
Assigned Nodes to Facility 2) [[1. 0. 0. 0. 0. 0.]]
Assigned Nodes to Facility 3) [[1. 0. 0. 0. 0. 0.]]
The Xij.value should be something like this:
Using CVXPY version: 0.4.10
status: optimal
optimal value 91.0
Selected Facility Locations [[1.]
[1.]
[0.]]
Assigned Nodes to Facility 1) [[1. 1. 1. 0. 0. 0.]]
Assigned Nodes to Facility 2) [[0. 0. 0. 1. 1. 1.]]
Assigned Nodes to Facility 3) [[0. 0. 0. 0. 0. 0.]]
Which means, facility 1 and 2 are selected.
The first three points are allocated to facility 1 and the next three to facility 2.
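The reported output assigns exactly one demand point to every facility, which is what summing an (m, n) variable over axis=1 and setting it to 1 would enforce. Here is a small NumPy check of which axis matches "every demand point is supplied by one facility", assuming rows index facilities and columns index demand points (as in Xij = cvx.Bool(m,n)) and that sum_entries follows NumPy's axis convention. It uses the expected assignment from the question and does not run CVXPY itself.

```python
import numpy as np

Dj = np.array([5, 5, 5, 3, 3, 4])   # demand of each demand point
Ci = np.array([15, 10, 10])         # capacity of each facility

# The assignment the question expects: rows = facilities, columns = demand points.
X_expected = np.array([[1, 1, 1, 0, 0, 0],
                       [0, 0, 0, 1, 1, 1],
                       [0, 0, 0, 0, 0, 0]])
Y_expected = np.array([1, 1, 0])

per_demand = X_expected.sum(axis=0)     # facilities serving each demand point
per_facility = X_expected.sum(axis=1)   # demand points served by each facility
load = X_expected @ Dj                  # total demand routed to each facility

print(per_demand)                # [1 1 1 1 1 1]
print(per_facility)              # [3 3 0]
print(load <= Ci * Y_expected)   # [ True  True  True]
```

Under those assumptions, the "one facility per demand point" constraint corresponds to the axis=0 sum, so the axis=1 sum in the last constraint is a likely culprit.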

Creating all zeros except one nonzero element in tensorflow

I want to create an M*N tensor where all elements are zero except one random element per row, which should be one, but I don't know how.
This is one way to do that:
import tensorflow as tf
m = 4
n = 6
dt = tf.float32
random_idx = tf.random_uniform((m, 1), maxval=n, dtype=tf.int32)
result = tf.cast(tf.equal(tf.range(n)[tf.newaxis], random_idx), dtype=dt)
with tf.Session() as sess:
    print(sess.run(result))
Output:
[[ 0. 0. 0. 0. 0. 1.]
[ 0. 0. 1. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]]
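For comparison, the same broadcasting idea works in plain NumPy. This is a minimal sketch with an arbitrarily chosen seed; it mirrors the TensorFlow code above rather than replacing it.

```python
import numpy as np

m, n = 4, 6
rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility

# One random column index per row, shaped (m, 1) so it broadcasts against range(n).
random_idx = rng.integers(0, n, size=(m, 1))
result = (np.arange(n)[np.newaxis] == random_idx).astype(np.float32)

print(result.shape)         # (4, 6)
print(result.sum(axis=1))   # [1. 1. 1. 1.]
```

Comparing a (1, n) range against an (m, 1) column of indices yields an (m, n) boolean mask with exactly one True per row.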

convert from one-hot encoding to class label

I have tensor named y, which has values from one-hot encoding over class labels:
y = [[ 0. 0. 1. ..., 0. 0. 0.],[ 1. 0. 0. ..., 0. 0. 0.],[ 0. 0. 0. ..., 0. 1. 0.],
...,[ 0. 0. 0. ..., 0. 0. 0.],[ 0. 0. 0. ..., 0. 0. 1.],[ 0. 0. 1. ..., 0. 0. 0.]]
so here the first row has its third element set to '1', so it represents class label 2
for that image.
I am trying to get all class labels from the given one-hot encoded array;
for the given example it should be something like this:
y = [2,0,8,...,9,2]
I think the simplest way is:
import numpy as np
y = np.argmax(y, axis=1)
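The axis argument matters here. A small sketch on a made-up 4 x 3 one-hot array shows the per-row result next to what argmax returns when no axis is given:

```python
import numpy as np

# Hypothetical small example: 4 samples, 3 classes.
y = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.]])

labels = np.argmax(y, axis=1)   # index of the 1 in each row
print(labels)                   # [2 0 1 2]

# Without axis, argmax works on the flattened array and returns a single integer.
print(np.argmax(y))             # 2
```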

Plotting a histogram of a 2D NumPy array of (latitude, longitude), in order to determine the proper values for DBSCAN

I am trying to apply DBSCAN to a dataset of (Lon, Lat) points. The algorithm is very sensitive to its parameters, EPS and MinPts.
I would like to look at a histogram of the data to determine the proper values. Unfortunately, Matplotlib's hist() takes only a 1D array.
When passed a 2D matrix as argument, hist() treats each column as a separate input.
Scatter plot and histograms:
Does anyone have a way to solve this?
If you follow the DBSCAN article, you only need the 4-nearest-neighbor distance for each object, not all pairwise distances. I.e., a 1 dimensional array.
Instead of doing a histogram, they sort the values, and try to choose a knee in this plot.
1. Find the 4 nearest neighbors of each object.
2. Collect all 4NN distances in one array.
3. Sort this array in descending order.
4. Plot the resulting curve.
5. Look for a knee, often best at around 5%-10% of your x axis (so 95%-90% of objects are core points).
For details, see the original DBSCAN publication!
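The steps above can be sketched in NumPy as follows. This is a rough illustration on made-up random points, using brute-force Euclidean distances for clarity; for real latitude/longitude data a great-circle distance and a spatial index may be more appropriate.

```python
import numpy as np

rng = np.random.default_rng(2016)   # hypothetical data: 200 random 2D points
points = rng.random((200, 2))
k = 4

# Brute-force pairwise Euclidean distances (fine for small data only,
# since this builds the full N x N matrix).
diff = points[:, None, :] - points[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# After sorting each row, column 0 is the point itself (distance 0),
# so column k holds the distance to the k-th nearest neighbor.
k_dist = np.sort(dist, axis=1)[:, k]

# Sorted in descending order, this is the curve to plot and inspect for a knee.
curve = np.sort(k_dist)[::-1]
print(curve[:5])
```

Passing curve to a line plot (for example matplotlib's plt.plot) gives the sorted k-distance graph; the y value at the knee is a candidate for EPS.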
You could use numpy.histogram2d:
import numpy as np
np.random.seed(2016)
N = 100
arr = np.random.random((N, 2))
xedges = np.linspace(0, 1, 10)
yedges = np.linspace(0, 1, 10)
lat = arr[:, 0]
lng = arr[:, 1]
hist, xedges, yedges = np.histogram2d(lat, lng, (xedges, yedges))
print(hist)
yields
[[ 0. 0. 5. 0. 3. 0. 0. 0. 3.]
[ 0. 3. 0. 3. 0. 0. 4. 0. 2.]
[ 2. 2. 1. 1. 1. 1. 3. 0. 1.]
[ 2. 1. 0. 3. 1. 2. 1. 1. 3.]
[ 3. 0. 3. 2. 0. 1. 0. 2. 0.]
[ 3. 2. 3. 1. 1. 2. 1. 1. 0.]
[ 2. 3. 0. 1. 0. 1. 3. 0. 0.]
[ 1. 1. 1. 1. 2. 0. 2. 1. 1.]
[ 0. 1. 1. 0. 1. 1. 2. 0. 0.]]
Or to visualize the histogram:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.imshow(hist)
plt.show()