SSD mobilenet v2 model: how to decode the raw output? - tensorflow

I finally have my first SSD_mobilenet_v2 model running.
https://tfhub.dev/iree/lite-model/ssd_mobilenet_v2_100/fp32/default/1
But compared with other SSD_mobilenet_v1 models, the output is not decoded and has no NMS applied. As described on the web page:
raw_outputs/box_encodings: an fp32 array of shape [1, M, 4] containing decoded detection boxes without Non-Max suppression. M is the number of raw detections.
raw_outputs/class_predictions: an fp32 array of shape [1, M, 91] containing class logits for raw detection boxes. Score conversion layer not included. M is the number of raw detections.
Could anyone help me understand the next step: how do I convert the raw output bounding-box relative coords and raw class predictions into final bounding boxes and scores?
Below is what I have for the relative coords:
((1, 8004, 4),
 array([[[ 1.0489864 ,  0.2852646 , -7.456381  , -4.7595205 ],
         [-0.35197747,  1.0335344 , -2.378837  , -0.6666372 ],
         [ 0.8332888 ,  0.8247721 , -0.3024159 , -3.290514  ],
         ...,
         [ 0.1398478 , -0.25943977,  0.29718816, -1.410246  ],
         [-0.1175375 , -0.08291922, -1.461216  ,  0.22000816],
         [ 0.13272771, -0.51625276, -0.5618129 , -1.0699694 ]]],
       dtype=float32))
What is the meaning of the 4 channels for each predicted bounding box, and why are some values negative?
Regarding the raw class predictions, we have 8004 bounding boxes with 91 classes each, but why do we see negative values? Shouldn't probabilities be in the range 0-1?
((1, 8004, 91),
 array([[[-9.669849 , -4.4239364, -4.8566256, ..., -5.8348265, -5.1578894, -4.801747 ],
         [-9.669044 , -5.7234015, -6.342394 , ..., -7.4027104, -6.728588 , -6.4829254],
         [-9.590075 , -5.214742 , -6.874845 , ..., -6.9183044, -6.844805 , -6.4774513],
         ...,
         [-4.8303924, -5.1854134, -4.871473 , ..., -4.9025354, -4.829895 , -4.7791467],
         [-4.830332 , -4.9423876, -4.8391323, ..., -4.9813066, -4.8254986, -4.7414174],
         [-4.832156 , -4.925433 , -4.9521995, ..., -5.1177106, -4.7640305, -4.6455407]]],
       dtype=float32))
I sense I need to understand more about the basic definitions behind SSD MobileNet models. Could anyone kindly give me some guidance?
Some reference links or a small piece of code would be very helpful.
thanks in advance!
regards
Cliff
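For reference, here is a minimal sketch of the usual SSD post-processing, assuming the anchors are available in center form (ycenter, xcenter, h, w) from the model's anchor generator and that the common TF Object Detection API scale factors [10, 10, 5, 5] apply; the score conversion for this model family is typically a sigmoid, but check the pipeline config:
import numpy as np

def decode_and_score(box_encodings, class_logits, anchors,
                     scale_factors=(10.0, 10.0, 5.0, 5.0)):
    # box_encodings: [M, 4] raw (ty, tx, th, tw) offsets
    # class_logits:  [M, 91] raw class logits
    # anchors:       [M, 4] anchor boxes as (ycenter, xcenter, h, w)
    ty, tx, th, tw = np.split(box_encodings, 4, axis=-1)
    ya, xa, ha, wa = np.split(anchors, 4, axis=-1)
    # Undo the encoding relative to each anchor; the raw values are offsets,
    # not coordinates, which is why they can be negative.
    ycenter = ty / scale_factors[0] * ha + ya
    xcenter = tx / scale_factors[1] * wa + xa
    h = np.exp(th / scale_factors[2]) * ha
    w = np.exp(tw / scale_factors[3]) * wa
    boxes = np.concatenate([ycenter - h / 2, xcenter - w / 2,
                            ycenter + h / 2, xcenter + w / 2], axis=-1)
    # The class outputs are unbounded logits (hence the negative values);
    # squash them into [0, 1] scores (sigmoid assumed here).
    scores = 1.0 / (1.0 + np.exp(-class_logits))
    return boxes, scores
The resulting boxes are in normalized [ymin, xmin, ymax, xmax] form; after this step something like tf.image.non_max_suppression is applied per class to obtain the final detections.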

Related

Is it possible to enforce mathematical constraints between tensorflow neural network output nodes?

Basically this:
Is it possible to enforce mathematical constraints between tensorflow neural network output nodes in the last layer?
For example, monotonicity between nodes, such as output node 1 being larger than node 2, which in turn is larger than node 3, and so forth.
In general -- not really, not directly, at least. Keras layers support arguments for constraints on the weights, and you may be able to translate a desired output constraint into a weight constraint instead -- but otherwise, you will need to think about how to set up the structure of your network such that the constraints are fulfilled.
Here is a sketch for how a monotonicity constraint might be possible. Actually including this in a model likely requires creating a custom Layer subclass or perhaps using the functional API.
First, let's create some dummy data. This could be the output of a standard Dense layer (4 is batch size, 5 the number of outputs).
raw_outputs = tf.random.normal([4, 5])
>>> <tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[ 0.3989258 , -1.7693167 , 0.13419539, 1.1059834 , 0.3271042 ],
[ 0.6493515 , -1.4397207 , 0.05153034, -0.2730962 , -1.1569825 ],
[-1.3043666 , 0.20206456, -0.3841469 , 1.8338723 , 1.2728293 ],
[-0.3725195 , 1.1708363 , -0.01634515, -0.01382025, 1.2707714 ]],
dtype=float32)>
Next, make all outputs be positive using softplus. Think of this as the output activation function. Any function that returns values >= 0 will do. For example, you could use tf.exp but the exponential growth might lead to numerical issues. I would not recommend relu since the hard 0s prevent gradients from flowing -- usually a bad idea in the output layer.
positive_outputs = tf.nn.softplus(raw_outputs)
>>> <tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[0.9123723 , 0.15738781, 0.7624942 , 1.3918277 , 0.8700147 ],
[1.0696293 , 0.21268418, 0.71924424, 0.56589293, 0.2734058 ],
[0.24007489, 0.7992745 , 0.5194075 , 1.9821143 , 1.5197192 ],
[0.5241344 , 1.4409455 , 0.685008 , 0.68626094, 1.5181118 ]],
dtype=float32)>
Finally, use cumsum to add up the values:
constrained = tf.cumsum(positive_outputs, reverse=True, axis=-1)
>>> <tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[4.0940967, 3.1817245, 3.0243368, 2.2618425, 0.8700147],
[2.8408566, 1.7712271, 1.558543 , 0.8392987, 0.2734058],
[5.0605907, 4.8205156, 4.021241 , 3.5018334, 1.5197192],
[4.8544607, 4.3303266, 2.889381 , 2.204373 , 1.5181118]],
dtype=float32)>
As we can see, the outputs for each batch element are monotonically decreasing! This is because each of our original outputs (positive_outputs) basically just encodes how much is added at each unit, and because we forced them to be positive, the numbers can only get larger (or smaller in this case because of reverse=True in cumsum).
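If you want to use this inside a model, one option (a hedged sketch; the layer name and sizes are illustrative) is to wrap the two steps in a small custom layer, as mentioned above:
import tensorflow as tf

class MonotoneDecreasing(tf.keras.layers.Layer):
    # Forces the outputs of the previous layer to be monotonically decreasing.
    def call(self, inputs):
        positive = tf.nn.softplus(inputs)                   # non-negative increments
        return tf.cumsum(positive, reverse=True, axis=-1)   # accumulate them

# Hypothetical usage: five unconstrained units followed by the constraint.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(5),
    MonotoneDecreasing(),
])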
There are many ways to do this: one is letting the neurons learn it, another is a mathematical function, or both, via scores, weights, rewards, or learning from other neurons.
This is one way I teach the network to learn; you may study it from the online sample.
X = tf.compat.v1.placeholder(tf.float32, shape=(10, 88, 80, 4))
y = tf.compat.v1.placeholder(tf.float32, shape=(1, 1))
X_action = tf.compat.v1.get_variable('X_action', dtype=tf.float32, initializer=tf.random.normal((1, 1)))
in_training_mode = tf.compat.v1.get_variable('in_training_mode', dtype=tf.float32, initializer=tf.random.normal((1, 1)))
loss = tf.reduce_mean(input_tensor=tf.square((X * y) - (X * X_action)))  # squared-error loss
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate)
training_op = optimizer.minimize(loss)
This is another way I handle similar tasks with a simple configuration: just select the correct optimizer and loss function for faster learning. The previous sample simply uses a mean squared error.
optimizer = tf.keras.optimizers.Nadam(
learning_rate=0.00001, beta_1=0.9, beta_2=0.999, epsilon=1e-07,
name='Nadam'
)
lossfn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=lossfn, metrics=['accuracy'])

Sparse matrix visualisation

I'm working on FEM analysis. I just wanted to evaluate a simple matrix multiplication and see the numeric result. How can I see the elements of the sparse matrix?
the code that I have used for is:
U_h= 0.5 * np.dot(np.dot(U[np.newaxis], K), U[np.newaxis].T)
Since U is a 1x3 matrix, K is a 3x3 matrix, and U.T is a 3x1 matrix, I expect a 1x1 matrix with a single number in it. However, the result is "[[<3x3 sparse matrix of type '<class 'numpy.float64'>' with 3 stored elements in Compressed Sparse Row format>]]".
In [260]: M = sparse.random(5,5,.2, format='csr')
What you got was the repr format of the matrix:
In [261]: M
Out[261]:
<5x5 sparse matrix of type '<class 'numpy.float64'>'
with 5 stored elements in Compressed Sparse Row format>
In [262]: repr(M)
Out[262]: "<5x5 sparse matrix of type '<class 'numpy.float64'>'\n\twith 5 stored elements in Compressed Sparse Row format>"
The str format used by print is:
In [263]: print(M)
(1, 0) 0.7152749140462651
(1, 1) 0.4298096228326874
(1, 3) 0.8148327301300698
(4, 0) 0.23366934073409018
(4, 3) 0.6117499168861333
In [264]: str(M)
Out[264]: ' (1, 0)\t0.7152749140462651\n (1, 1)\t0.4298096228326874\n (1, 3)\t0.8148327301300698\n (4, 0)\t0.23366934073409018\n (4, 3)\t0.6117499168861333'
If the matrix isn't big, displaying it as a dense array is nice. M.toarray() does that, or for short:
In [265]: M.A
Out[265]:
array([[0. , 0. , 0. , 0. , 0. ],
[0.71527491, 0.42980962, 0. , 0.81483273, 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0.23366934, 0. , 0. , 0.61174992, 0. ]])
For a graphical inspection, use plt.spy().
See an applied example here.
See the reference manual here.
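As for the original computation, a minimal sketch of getting the scalar value, assuming U is a NumPy array and K is a scipy.sparse matrix (the sizes here are illustrative): keep the product in sparse-aware operators and pull the element out at the end, instead of passing the sparse matrix to np.dot.
import numpy as np
from scipy import sparse

K = sparse.random(3, 3, density=0.5, format='csr')   # example sparse stiffness matrix
U = np.array([1.0, 2.0, 3.0])                        # example displacement vector

U_h = 0.5 * U @ (K @ U)                                   # plain scalar
U_h_row = 0.5 * (U[np.newaxis] @ (K @ U[np.newaxis].T))   # 1x1 array, as in the question
print(U_h, U_h_row.item())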

2 dimensional value prediction with Tensorflow

I have searched for a long time and couldn't find a solution yet. I have a data set with 3 input signals, which produce irregular values, and an associated relative 2-dimensional position (x and y).
I want to build a TensorFlow estimator that predicts, based on these 3 input values, the x and y position for samples in which I only have the 3 input signals. The values are stored in a pandas DataFrame.
pd.DataFrame({'index': [0, 1, ..., 100], 'value': [-45, -38, ..., -90], 'signal_source': ['Jimmy', 'Bob', ..., 'Bob'], 'x_position': [2, 2, ..., 5], 'y_position': [3, 3, ..., 1]})
I couldn't find a TensorFlow estimator that outputs 2 numerical values and optimizes the Euclidean distance between the predicted position and the real position. Is there a name for this kind of problem, or can anyone help me build such an estimator, please?
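This is essentially multi-output regression. A minimal sketch using the Keras API (the column names follow the frame above; the encoding of signal_source and all hyperparameters are illustrative assumptions):
import pandas as pd
import tensorflow as tf

df = pd.DataFrame({'value': [-45, -38, -90],
                   'signal_source': ['Jimmy', 'Bob', 'Bob'],
                   'x_position': [2, 2, 5],
                   'y_position': [3, 3, 1]})

# Encode the categorical signal source as an integer feature (assumption).
df['source_id'] = df['signal_source'].astype('category').cat.codes
X = df[['value', 'source_id']].to_numpy(dtype='float32')
Y = df[['x_position', 'y_position']].to_numpy(dtype='float32')

def euclidean_loss(y_true, y_pred):
    # Mean Euclidean distance between predicted and true (x, y) positions.
    return tf.reduce_mean(tf.norm(y_true - y_pred, axis=-1))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(2),          # the two outputs: x and y
])
model.compile(optimizer='adam', loss=euclidean_loss)
model.fit(X, Y, epochs=10, verbose=0)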

Tensorflow prediction binary strings

I'm trying to create a convolutional neural network which predicts whether or not to sell for a hydropower dam; the issue I am having is the output. I have two inputs, price (a normalized float) and water inflow (either 1 or 0 at this point).
My issue is that running this and trying to get the answer as a set of 0/1 actions gives me floats, which only make sense if the output is interpreted as the corresponding number instead of the set of actions. This is fine when the number of actions is small, but will be horrible later on when the number of actions is extended.
Does anyone know how I can make it output the actions as either 0 or 1, instead of floats which seem to be the certainty of the prediction?
Meaning if there are 4 actions and the correct answer is 0, 1, 0, 1, then the predictions should be in the same form (4 actions, each either 0 or 1).
Any help would be much appreciated.
Binary output from Normalized Probability
What you are looking for is a method of converting your normalized probability output to a binary one.
This is very straightforward in TensorFlow and involves adding a tf.round function. The trick is to make sure you do not use the output of tf.round in training. This is best demonstrated using a working code example.
Working code example
This code calculates the XOR function using a neural net. The outputs are y_out (the probability output) and y_binary (the casting of the probability output to binary)
### imports
import tensorflow as tf
import numpy as np
### constant data
x = [[0.,0.],[1.,1.],[1.,0.],[0.,1.]]
y_ = [[1.,0.],[1.,0.],[0.,1.],[0.,1.]]
### induction
# 1x2 input -> 2x3 hidden sigmoid -> 3x2 softmax output
# Layer 0 = the x2 inputs
x0 = tf.placeholder( dtype=tf.float32 , shape=[None,2] )
y0 = tf.placeholder( dtype=tf.float32 , shape=[None,2] )
# Layer 1 = the 2x3 hidden sigmoid
m1 = tf.Variable( tf.random_uniform( [2,3] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
b1 = tf.Variable( tf.random_uniform( [3] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
h1 = tf.sigmoid( tf.matmul( x0,m1 ) + b1 )
# Layer 2 = the 3x2 softmax output
m2 = tf.Variable( tf.random_uniform( [3,2] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
b2 = tf.Variable( tf.random_uniform( [2] , minval=0.1 , maxval=0.9 , dtype=tf.float32 ))
y_logit = tf.matmul( h1,m2 ) + b2
y_out = tf.nn.softmax( y_logit )
y_binary = tf.round( y_out )
### loss
# loss : a loss function that uses y_logit or y_out , but NOT y_binary
loss = tf.reduce_sum( tf.square( y0 - y_out ) )
# training step
train = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
### training
# run 500 times using all the X and Y
# print out the loss and any other interesting info
with tf.Session() as sess:
    sess.run( tf.global_variables_initializer() )
    print("\nloss")
    for step in range(500) :
        sess.run(train, feed_dict={x0:x,y0:y_})
        if (step + 1) % 100 == 0 :
            print(sess.run(loss, feed_dict={x0:x,y0:y_}))
    y_out_value , y_binary_value = sess.run([y_out,y_binary], feed_dict={x0:x,y0:y_})

print("\nThe expected output is :")
print(np.array(y_))
print("\nThe softmax output is :")
print(np.array(y_out_value))
print("\nThe binary output is :")
print(np.array(y_binary_value))
print("")
Output
The expected output is :
[[ 1. 0.]
[ 1. 0.]
[ 0. 1.]
[ 0. 1.]]
The softmax output is :
[[ 0.96538627 0.03461381]
[ 0.81609273 0.18390732]
[ 0.11534476 0.88465524]
[ 0.0978259 0.90217412]]
The binary output is :
[[ 1. 0.]
[ 1. 0.]
[ 0. 1.]
[ 0. 1.]]
As you can see, you can retrieve the probability outputs OR the probabilities cast as binary and still have all the benefits of classic logits.
Cheers.
I guess it is important to note that, for a typical classification problem, the output of a neural net is actually the posterior probability computed for each of the classes present.
The figures returned tell you how likely the output is to be of class A, B, or C given the input x, so you cannot expect to always get 0 or 1.
# An example: given input x, suppose I get
#   output = [0.5, 0.2, 0.3]
# I predict the class should be A because it has a posterior of 0.5
# (the highest of the 3 values returned):
#   class = A (0.5)
# Or I might as well round it up; TensorFlow can do this for you.
So I guess you should take the output and apply probabilistic assumptions that fit your model, say that the highest value in the returned predictions gives the class the input belongs to.
It might not be realistic to expect an absolute one-or-zero prediction.
Be careful of the fact I wrote above; it's a common mistake. And please do read the paper below. Once you have posteriors, you can build further models on them. There is no limitation to what you can achieve!
For example, you can apply Gaussian mixture models, Markov models, decision trees, or combined expert systems on the output; those are the elegant and scientific approaches.
Read this paper for more info.
http://www.ee.iisc.ac.in/people/faculty/prasantg/downloads/NeuralNetworksPosteriors_Lippmann1991.pdf
Hope it helps!

matplotlib: Get the colormap array

I am new to matplotlib and have gotten stuck on colormaps.
In matplotlib, how do I get the whole array of RGB colors for a specific colormap, let's say "hot"? For example, if I were in MATLAB I would have just done this:
% in MATLAB
c = hot(256);
disp(c)
Any ideas?
You can look up the values by calling the colormap as a function, and it accepts numpy arrays to query many values at once:
In [12]: from matplotlib import cm
In [13]: cm.hot(range(256))
Out[13]:
array([[ 0.0416 , 0. , 0. , 1. ],
[ 0.05189484, 0. , 0. , 1. ],
[ 0.06218969, 0. , 0. , 1. ],
...,
[ 1. , 1. , 0.96911762, 1. ],
[ 1. , 1. , 0.98455881, 1. ],
[ 1. , 1. , 1. , 1. ]])
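If you want N evenly spaced samples regardless of the colormap's internal resolution, you can also evaluate it on a normalized range (a small sketch; 256 here just mirrors the MATLAB default):
import numpy as np
from matplotlib import cm

colors = cm.hot(np.linspace(0.0, 1.0, 256))   # RGBA array, shape (256, 4)
print(colors.shape)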
Got it! So you just go to the command window of your MATLAB and type
cmap = colormap(nameOfTheColormapYouWant)
Possible colormaps in MATLAB are: parula, jet, hsv, hot, cool, spring, summer, autumn, winter, gray, bone, copper, pink, lines, colorcube, prism, flag.
You get a matrix where each row is the color code used for the colormap.