Tensorflow landmark heatmap

I am trying to draw landmark heatmaps with tensorflow.
My current approach is using tf.scatter_nd like this:
def draw_lmarks(x):
    def draw_lmarks_inner(x2):
        return tf.scatter_nd(x2[0], x2[1], shape=(IMGSIZE, IMGSIZE))
    ret = tf.map_fn(draw_lmarks_inner, x, dtype="float32")
    return tf.reshape(tf.reduce_max(ret, axis=0), [IMGSIZE, IMGSIZE, 1])

return tf.map_fn(draw_lmarks, [locations, vals], dtype="float32")
But this is quite slow, as I have to create an IMGSIZE*IMGSIZE image for each batch element times the number of landmarks.
So I poked around and found tf.tensor_scatter_nd_update, which I could use like:
img = tf.zeros((IMGSIZE, IMGSIZE), dtype="float32")

def draw_lmarks(x):
    return tf.tensor_scatter_nd_update(img, x[0], x[1])

imgs = tf.map_fn(draw_lmarks, [locations, vals], dtype="float32")
This allows me to generate only batch_size images, which runs considerably faster.
... BUT this doesn't keep the highest value at each point; it simply overwrites.
There is the tf.scatter_max function, which sounds like what I need, but it seems to expect differently shaped inputs.
Is there a way to use the second approach but, instead of overwriting values, take the maximum value at each point?
Shapes:
location = (-1, 68, 16, 16, 2)
vals = (-1, 68, 16, 16)
To visualize: this is what the second (faster) function returns [image omitted], while I need something like [image omitted].

I think you will be much better off by first setting the seeds of your landmarks and then convolving the result with your heatmap template. Something like:
import tensorflow as tf

num_loc = 10
im_dim = 32
locations = tf.random.uniform((num_loc, 2), maxval=im_dim, dtype=tf.int32)
# Float seeds, so the convolution below receives a float input.
centers = tf.scatter_nd(locations, tf.ones(num_loc), (im_dim, im_dim))
# heatmap_template: a float32 (k, k) blob to paint around each landmark.
heatmap = tf.nn.conv2d(centers[None, :, :, None],
                       heatmap_template[:, :, None, None],
                       strides=(1, 1, 1, 1), padding='SAME')[0, :, :, 0]
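Since heatmap_template is left undefined above, here is a minimal sketch of one possible template, a small Gaussian blob, that can be plugged into the snippet; the size and width are illustrative assumptions, not part of the answer:

import tensorflow as tf

# Hypothetical 7x7 Gaussian template (size and width are illustrative).
k = 7
coords = tf.cast(tf.range(k) - k // 2, tf.float32)
# exp(-r^2 / 4) peaks at 1.0 in the centre and decays towards the borders.
heatmap_template = tf.exp(-(coords[:, None] ** 2 + coords[None, :] ** 2) / 4.0)

Because the whole heatmap is then produced by a single convolution, this avoids building one IMGSIZE*IMGSIZE scatter per landmark, which was the bottleneck in the original approach.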

Related

How to make 2 tensors the same length by mean | median imputation of the shortest tensor?

I'm trying to subclass the base Keras layer to create a layer that will merge the rank 1 outputs of 2 layers of a skip connection by outputting the Dot product of the 2 tensors. The 2 incoming tensors are created by Dense layers parsed by a Neural Architecture Search algorithm that randomly selects the number of Dense units and hence the length of the 2 tensors. These of course will usually not be of the same length. I am trying an experiment to see if casting them to the same length by appending the shorter tensor with a mathematically meaningful imputation [e.g. mean | median | hypotenuse | cos | ... etc.] and then merging them by means of the dot product will outperform Add or Concatenate merging strategies.
To make them the same length, I try the overall strategy:
Find the shorter tensor.
Pass it to tf.reduce_mean() (aliasing the resulting mean as "rm" for the sake of discussion).
Create a list like [rm for _ in range(difference in length between the longer and the shorter tensor)]. Cast it as a tensor if necessary.
[pad | concatenate] the shorter tensor with the result of the operation above to make it equal in length (a minimal sketch of this strategy follows this list).
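For what it's worth, a minimal sketch of this padding strategy in plain TensorFlow, assuming batched rank-2 tensors of shape (batch, n); the helper name and the use of keepdims plus tf.repeat are my own illustration, not code from the question:

import tensorflow as tf

def pad_to_length(t, target_len):
    # Pad a (batch, n) tensor along axis 1 with its per-row mean up to target_len.
    diff = target_len - tf.shape(t)[1]
    rm = tf.reduce_mean(t, axis=1, keepdims=True)   # (batch, 1), stays rank 2
    filler = tf.repeat(rm, repeats=diff, axis=1)    # (batch, diff)
    return tf.concat([t, filler], axis=1)           # (batch, target_len)

a = tf.constant([[1., 2., 3., 4., 5.]])  # the longer tensor
b = tf.constant([[0., 9., 8.]])          # the shorter tensor
b_padded = pad_to_length(b, tf.shape(a)[1])
dot = tf.reduce_sum(a * b_padded, axis=1, keepdims=True)  # shape (1, 1)

Keeping keepdims=True (as in the alternative attempt below) is what preserves a defined rank, so the mean never collapses to a bare (None,) shape.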
Here is where I am running into a dead wall:
Since the tf operation reduce_mean returns a symbolic tensor whose shape is recorded as (None,) (rather than a known scalar of length 1), the tensors end up with a shape of (None,), which the tf.keras.layers.Dot layer refuses to ingest and throws a ValueError, since it does not see them as being the same length, even though they always will be:
KerasTensor(type_spec=TensorSpec(shape=(None,), dtype=tf.float32, name=None), name='tf.math.reduce_mean/Mean:0', description="created by layer 'tf.math.reduce_mean'")
ValueError: A Concatenate layer should be called on a list of at least 1 input. Received: input_shape=[[(None,), (None,)], [(None, 3)]]
My code (in the package/module):
import tensorflow as tf
import numpy as np


class Linear1dDot(tf.keras.layers.Layer):
    def __init__(self, input_dim=None):
        super(Linear1dDot, self).__init__()

    def __call__(self, inputs):
        max_len = tf.reduce_max(tf.Variable(
            [inp.shape[1] for inp in inputs]))
        print(f"max_len: {max_len}")
        for i in range(len(inputs)):
            inp = inputs[i]
            print(inp.shape)
            inp_length = inp.shape[1]
            if inp_length < max_len:
                print(f"{inp_length} < {max_len}")
                # pad_with = inp.reduce_mean()
                pad_with = tf.reduce_mean(inp, axis=1)
                print(pad_with)
                padding = [pad_with for _ in range(max_len - inp_length)]
                inputs[i] = tf.keras.layers.concatenate([padding, [inp]])
                # inputs[i] = tf.reshape(
                #     tf.pad(inp, padding, mode="constant"), (None, max_len))
        print(inputs)
        return tf.keras.layers.Dot(axes=1)(inputs)
...
# Alternatively substituting the last few lines with:
                pad_with = tf.reduce_mean(inp, axis=1, keepdims=True)
                print(pad_with)
                padding = tf.keras.layers.concatenate(
                    [pad_with for _ in range(max_len - inp_length)])
                inputs[i] = tf.keras.layers.concatenate([padding, [inp]])
                # inputs[i] = tf.reshape(
                #     tf.pad(inp, padding, mode="constant"), (None, max_len))
        print(inputs)
        return tf.keras.layers.Dot(axes=1)(inputs)
... and countless other permutations of attempts ...
Does anyone know a workaround or have any advice? (other than 'Don't try to do this.')?
In the parent folder of this module's package ...
Test to simulate a skip connection merging into the current layer:
from linearoneddot.linear_one_d_dot import Linear1dDot
x = tf.constant([1, 2, 3, 4, 5])
y = tf.constant([0, 9, 8])
inp1 = tf.keras.layers.Input(shape=3)
inp2 = tf.keras.layers.Input(shape=5)
xd = tf.keras.layers.Dense(3, "relu")(inp1)
yd = tf.keras.layers.Dense(5, 'elu')(inp2)
combined = Linear1dDot()([xd, yd]) # tf.keras.layers.Dot(axes=1)([xd, yd])
z = tf.keras.layers.Dense(2)(combined)
model = tf.keras.Model(inputs=[inp1, inp2], outputs=z)
print(model([x, y]))
print(model([np.random.random((3, 3)), np.random.random((3, 5))]))
Does anyone know a workaround that will get the mean of the shorter rank 1 tensor as a scalar, which I can then append/pad onto the shorter tensor up to a set intended length (the same length as the longer tensor)?
Try this, I hope it will work: pad the shorter input with ones, concatenate the padding with the input, take the dot product, and finally subtract the padding ones that were added in the dot product...
class Linear1dDot(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(Linear1dDot, self).__init__()

    def __call__(self, inputs):
        _input1, _input2 = inputs
        _input1_shape = _input1.shape[1]
        _input2_shape = _input2.shape[1]
        difference = tf.math.abs(_input1_shape - _input2_shape)
        padded_input = tf.ones(shape=(1, difference))

        if _input1_shape > _input2_shape:
            padded_tensor = tf.concat([_input2, padded_input], axis=1)
            scaled_output = tf.keras.layers.Dot(axes=1)([padded_tensor, _input1])
            scaled_output -= tf.reduce_sum(padded_input)
            return scaled_output
        else:
            padded_tensor = tf.concat([_input1, padded_input], axis=1)
            scaled_output = tf.keras.layers.Dot(axes=1)([padded_tensor, _input2])
            scaled_output -= tf.reduce_sum(padded_input)
            return scaled_output


x = tf.constant([[1, 2, 3, 4, 5, 9]])
y = tf.constant([[0, 9, 8]])
inp1 = tf.keras.layers.Input(shape=3)
inp2 = tf.keras.layers.Input(shape=5)
xd = tf.keras.layers.Dense(5, "relu")(x)
yd = tf.keras.layers.Dense(3, 'elu')(y)
combined = Linear1dDot()([xd, yd])  # tf.keras.layers.Dot(axes=1)([xd, yd])
Output:
<tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[4.4694786]], dtype=float32)>

How to apply dropout in tensorflow to multidimensional tensors?

I have a 3D tensor called X, of shape say [2,20,300], and I would like to apply dropout to only the third dimension. However, I want the dropped elements to be the same across the 20 instances (second dimension) but not necessarily across the first dimension.
What is the behaviour of the following:
tf.nn.dropout(X[0], keep_prob=p)
Would it only act on the dimension that I want? If so, then for multiple first dimensions, I could loop over them and apply the above line.
See the documentation of tf.nn.dropout:
By default, each element is kept or dropped independently. If noise_shape is specified, it must be broadcastable to the shape of x, and only dimensions with noise_shape[i] == shape(x)[i] will make independent decisions.
So it is as simple as:
import tensorflow as tf
import numpy as np

data = np.arange(300).reshape((1, 1, 300))
data = np.tile(data, (2, 20, 1))

data_op = tf.convert_to_tensor(data.astype(np.float32))
data_op = tf.nn.dropout(data_op, 0.5, noise_shape=[2, 1, 300])

with tf.Session() as sess:
    data = sess.run(data_op)

for b in range(2):
    for c in range(20):
        assert np.allclose(data[0, 0, :], data[0, c, :])
        assert np.allclose(data[1, 0, :], data[1, c, :])

print((data[0, 0, :] - data[1, 0, :]).sum())
# output: something != 0 with high probability
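For reference, a sketch of the same idea in TF 2.x eager mode, where the second argument of tf.nn.dropout is the drop rate rather than keep_prob; this adaptation is mine, not part of the answer:

import tensorflow as tf
import numpy as np

data = np.tile(np.arange(300, dtype=np.float32).reshape((1, 1, 300)), (2, 20, 1))
dropped = tf.nn.dropout(tf.convert_to_tensor(data), rate=0.5, noise_shape=[2, 1, 300])

# All 20 rows inside one batch element share the same dropout mask ...
assert np.allclose(dropped[0, 0, :], dropped[0, 1, :])
# ... while the two batch elements get independent masks (with high probability).
print(float(tf.reduce_sum(tf.abs(dropped[0, 0, :] - dropped[1, 0, :]))))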

Indexing tensor with another tensor

I have a tensor xi of shape (?, 20, 10) and another tensor y_data of shape (?, 20, 1). I want to use the y_data tensor to "index" the xi tensor in order to do something like tf.exp(xi[y_data] - tf.log(tf.reduce_sum(xi, axis=2))).
E.g. tf.exp(xi[:, :, 4] - tf.log(tf.reduce_sum(xi, axis=2))) results in a tensor of shape (?, 20). I just want to get the index, here 4, out of another tensor.
Thanks in advance!
In this case, I would use a loop on the possible values for y_data which I will assume go from 0 to 9.
result = tf.zeros(tf.shape(y_data), tf.float32)
for i in range(10):
    result = tf.where(tf.equal(y_data, i), tf.exp(xi[:, :, i:i+1]), result)
result = tf.reshape(result, [-1, 20])
result -= tf.log(tf.reduce_sum(xi, axis=2))
Probably not the most efficient but that's the only way I could think of.
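A quick way to exercise this with the shapes from the question (a sketch assuming TF 1.x placeholders; the feed values are random and purely illustrative):

import tensorflow as tf
import numpy as np

xi = tf.placeholder(tf.float32, [None, 20, 10])
y_data = tf.placeholder(tf.int32, [None, 20, 1])

result = tf.zeros(tf.shape(y_data), tf.float32)
for i in range(10):
    result = tf.where(tf.equal(y_data, i), tf.exp(xi[:, :, i:i+1]), result)
result = tf.reshape(result, [-1, 20])
result -= tf.log(tf.reduce_sum(xi, axis=2))

with tf.Session() as sess:
    out = sess.run(result, feed_dict={
        xi: np.random.rand(3, 20, 10).astype(np.float32),
        y_data: np.random.randint(0, 10, size=(3, 20, 1)).astype(np.int32),
    })
    print(out.shape)  # (3, 20)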

Why does sharing layers in keras make building the graph extremely slow (tensorflow backend)

I am building a graph where the input is split into a list of tensors of length 30. I then use a shared RNN layer on each element of the list.
It takes ~ 1 minute until the model is compiled. Does it have to be like this (why?) or is there anything I am doing wrong?
Code:
shared_lstm = keras.layers.LSTM(4, return_sequences=True)
shared_dense = TimeDistributed(keras.layers.Dense(1, activation='sigmoid'))
inp_train = keras.layers.Input([None, se.action_space, 3])
# Split each possible measured label into a list:
inputs_train = [ keras.layers.Lambda(lambda x: x[:, :, i, :])(inp_train) for i in range(se.action_space) ]
# Apply the shared weights on each tensor:
lstm_out_train = [shared_lstm(x) for x in inputs_train]
dense_out_train = [(shared_dense(x)) for x in lstm_out_train]
# Merge the tensors again:
out_train = keras.layers.Lambda(lambda x: K.stack(x, axis=2))(dense_out_train)
# "Pick" the unique element along where the inp_train tensor is == 1.0 (along axis=2, in the next time step, of the first dimension of axis=3)
# (please disregard this line if it seems too complex)
shift_and_pick_layer = keras.layers.Lambda(lambda x: K.sum(x[0][:, :-1, :, 0] * x[1][:, 1:, :, 0], axis=2))
out_train = shift_and_pick_layer([out_train, inp_train])
m_train = keras.models.Model(inp_train, out_train)

Model Won't Train (Loss Doesn't Move)

TL;DR: I can't find my mistake when using the Tensorflow optimizer to train an extremely small neural net. The loss either doesn't move or moves once then gets stuck (it seems to really like the value 0.693147 which is ln(2)...).
Issue and Code: I'm trying to implement the 12-net part of the cascade classifier in Li et al (here) in Tensorflow. It's an extremely simple net, but nothing I try seems to get it training.
import tensorflow as tf
import tensorflow.contrib.slim as slim
import cv2
import numpy as np
input_tensor = tf.placeholder(tf.float32, shape=[1, 12, 12, 3])
input_label = tf.placeholder(tf.float16, shape=[1, 2])
conv_1 = slim.conv2d(input_tensor, 16, (3, 3), scope='conv1')
pool_1 = slim.max_pool2d(conv_1, (3, 3), 2, scope='pool1')
flatten = slim.flatten(pool_1)
fully_con = slim.fully_connected(flatten, 16, scope='full_con')
fully_con_2 = slim.fully_connected(fully_con, 2, scope='output')
probs = tf.nn.softmax(fully_con_2)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=input_label, logits=fully_con_2))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001).minimize(loss)
This defines the net. It takes in a (for now, single) 12x12 image and label, does a single 3x3 convolution with stride 1 and 16 filters, a 3x3 max pool with stride 2, then fully connects to 16 features, and finally makes a binary classification. I am able to perform a forward pass through the code, so I don't think the issue is here. This is my training loop - I have 3 12x12 images (2 faces, 1 tree) and just alternately feed them to the optimizer (clearly not best training practice, but I'm just trying to get it to work):
if __name__ == '__main__':
    im = cv2.imread('resized.jpg').reshape(1, 12, 12, 3).astype('float16')
    im2 = cv2.imread('resized2.jpg').reshape(1, 12, 12, 3).astype('float16')
    im3 = cv2.imread('resize3.jpg').reshape(1, 12, 12, 3).astype('float16')
    im_lab_1 = np.array([[0, 1]], dtype='float16')
    im_lab_2 = np.array([[0, 1]], dtype='float16')
    im_lab_3 = np.array([[1, 0]], dtype='float16')
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(loss, feed_dict={input_tensor: im3, input_label: im_lab_3}))
        for i in range(50000):
            if i % 3 == 0:
                # _, l = sess.run([optimizer, loss], feed_dict=feed1)
                # print(l)
                optimizer.run(feed_dict={input_tensor: im, input_label: im_lab_1})
            elif i % 4 == 0:
                # _, l = sess.run([optimizer, loss], feed_dict=feed2)
                # print(l)
                optimizer.run(feed_dict={input_tensor: im2, input_label: im_lab_2})
            elif i % 5 == 0:
                optimizer.run(feed_dict={input_tensor: im3, input_label: im_lab_3})
        print(sess.run(loss, feed_dict={input_tensor: im3, input_label: im_lab_3}))
I've tried both optimizer.run(...) and the commented out sess.run([optimizer, loss]...). The first sess.run(loss...) seems to spit out something correct, but after that, the loss gets stuck and never moves again. Clearly, I'm doing something wrong here, and any help would be appreciated!