Mean IoU from keras.metrics is not showing the right value - tensorflow

I'm trying to get the mean IoU of my binary semantic segmentation model using tensorflow.keras.metrics.MeanIoU. However, the output shows that the MeanIoU is 1.0, which cannot be correct because the loss (binary cross-entropy) is decreasing during training. Does anyone have an idea how I can get the right value?
Here is what I have tried so far.
import numpy as np
from tensorflow.keras.metrics import MeanIoU
#Test generator using validation data.
test_image_batch, test_mask_batch = val_img_gen.__next__()
print(test_mask_batch.shape) #(32, 256, 256, 3)
#Convert categorical to integer for visualization and IoU calculation
test_mask_batch_argmax = np.argmax(test_mask_batch, axis=3)
test_pred_batch = (model.predict(test_image_batch)> 0.5).astype(np.uint8)
print(test_pred_batch.shape) # (32, 256, 256, 1)
test_pred_batch_argmax = np.argmax(test_pred_batch, axis=3)
print(test_mask_batch.shape) #(32, 256, 256, 3)
n_classes = 2
IOU_keras = MeanIoU(num_classes=n_classes)
IOU_keras.update_state(test_pred_batch_argmax, test_mask_batch_argmax)
print("Mean IoU =", IOU_keras.result().numpy())
#output -- Mean IoU = 1.0
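
For context, a guess at the likely culprit (my own assumption, not a confirmed answer): np.argmax over the single-channel (32, 256, 256, 1) prediction always returns 0, which can make the metric degenerate, and MeanIoU.update_state expects y_true before y_pred. A minimal sketch of how the metric could be computed instead, assuming a per-pixel sigmoid output:
# Assumption: the model outputs per-pixel probabilities of shape (32, 256, 256, 1),
# and the 3-channel mask encodes only two classes (otherwise num_classes must change).
test_pred_batch = (model.predict(test_image_batch) > 0.5).astype(np.uint8)[..., 0]  # (32, 256, 256)
test_mask_batch_argmax = np.argmax(test_mask_batch, axis=3)                         # (32, 256, 256)

IOU_keras = MeanIoU(num_classes=2)
IOU_keras.update_state(test_mask_batch_argmax, test_pred_batch)  # y_true first, then y_pred
print("Mean IoU =", IOU_keras.result().numpy())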

Related

Why is batch_size being multiplied to GradientTape results in Tensorflow?

I'm trying to get the gradients of a loss function w.r.t. another tensor, but the gradients are being multiplied by the batch size that I feed into my model.
import tensorflow as tf
from tensorflow.keras import Sequential, layers
#Sample States and Returns
states = tf.random.uniform(shape = (100,4))
returns = tf.constant([float(i) for i in range(100)])
#Creating dataset to feed data to model
states = tf.data.Dataset.from_tensor_slices(states)
returns = tf.data.Dataset.from_tensor_slices(returns)
#zipping datasets into one
batch_size = 4
dataset = tf.data.Dataset.zip((states, returns)).batch(batch_size)
model = Sequential([layers.Dense(128, input_shape=(4,), activation=tf.nn.relu),
                    layers.Dense(1, activation=tf.nn.tanh)])

for state_batch, returns_batch in dataset:
    with tf.GradientTape(persistent=True) as tape:
        values = model(state_batch)
        loss = returns_batch - values
    # d_loss/d_values should be -1.0, but I'm getting -1.0 * batch_size
    print(tape.gradient(loss, values))
    break
Output:
tf.Tensor(
[[-4.]
[-4.]
[-4.]
[-4.]], shape=(4, 1), dtype=float32)
Expected Output:
tf.Tensor(
[[-1.]
[-1.]
[-1.]
[-1.]], shape=(4, 1), dtype=float32)
From the code, you can see that loss = returns - values. So it should be d_loss/d_values = -1.0, but the result I'm getting is d_loss/d_values = -1.0 * batch_size. Can someone please point out why this is happening? How can I get the correct results?
colab link : https://colab.research.google.com/drive/1x4pyGJ5ccRVSMzDAeLzcPXRtO7cNFnJf?usp=sharing
The problem is in this line:
loss = returns_batch - values
Here, returns_batch has shape (4,), but values has shape (4, 1). The subtraction broadcasts the tensors, resulting in a loss tensor of shape (4, 4) with four repeated columns. For this reason, changing a single element of values affects four elements of the loss, hence the scaled gradient value. You can fix it, for example, like this:
loss = returns_batch - tf.squeeze(values, axis=1)
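A minimal sketch (my own illustration, not part of the original answer) that makes the broadcast and the restored gradient visible:
import tensorflow as tf

values = tf.ones((4, 1))
returns_batch = tf.constant([1.0, 2.0, 3.0, 4.0])            # shape (4,)

print((returns_batch - values).shape)                         # (4, 4) -- the broadcast blows up the loss
print((returns_batch - tf.squeeze(values, axis=1)).shape)     # (4,)

with tf.GradientTape() as tape:
    tape.watch(values)                                        # values is a plain tensor, so watch it
    loss = returns_batch - tf.squeeze(values, axis=1)
print(tape.gradient(loss, values))                            # [[-1.] [-1.] [-1.] [-1.]]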

Why does tensorflow TFLiteConverter.from_session require the same size for input and output

I am trying to use TFLiteConverter to convert my network, so I tried the sample code first. It works, but after some modification it returns an error. It seems the input_array and output_array must be the same size. I just don't understand why. Can anybody help me?
I modified the sizes of img and var; var was changed from [1, 64, 64, 3] to [1, 64, 3, 1].
The complete code is pasted below.
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 1))
var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 3, 1))
val = tf.matmul(img, var)
out = tf.identity(val, name="out")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(val.shape)
    converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
    tflite_model = converter.convert()
    open("converted_model.tflite", "wb").write(tflite_model)
The ERROR message:
ValueError: Dimensions must be equal, but are 1 and 3 for 'MatMul' (op: 'BatchMatMulV2') with input shapes: [1,64,64,1], [1,64,3,1].
The problem is not with the TFLite conversion, but with building the graph in the first place.
tf.matmul operates on the innermost 2D matrices of your tensors. So in your case, you are trying to matrix-multiply a matrix of shape 64x1 by a matrix of shape 3x1, which is not valid. Matrix multiplication requires that the number of columns of the first operand equal the number of rows of the second operand, but here 1 != 3, so it doesn't work.
For example, replace the 3 with a 1 and it will work:
import tensorflow as tf
img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 1))
var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 1, 1))
val = tf.matmul(img, var)
out = tf.identity(val, name="out")
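As a quick sanity check (my own sketch; NumPy follows the same batched-matmul shape rules), the corrected shapes multiply while the original ones fail because of the 1 != 3 mismatch:
import numpy as np

# Inner matrices: (64, 1) @ (1, 1) -> (64, 1); the leading (1, 64) dims act as batch dims.
print(np.matmul(np.ones((1, 64, 64, 1)), np.ones((1, 64, 1, 1))).shape)   # (1, 64, 64, 1)

try:
    np.matmul(np.ones((1, 64, 64, 1)), np.ones((1, 64, 3, 1)))            # (64, 1) @ (3, 1) is invalid
except ValueError as e:
    print(e)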

Neural network output issue

I built a neural network with TensorFlow; here is the code:
class DQNetwork:
    def __init__(self, state_size, action_size, learning_rate, name='DQNetwork'):
        self.state_size = state_size
        self.action_size = action_size
        self.learning_rate = learning_rate

        with tf.variable_scope(name):
            # We create the placeholders
            self.inputs_ = tf.placeholder(tf.float32, shape=[state_size[1], state_size[0]], name="inputs")
            self.actions_ = tf.placeholder(tf.float32, [None, self.action_size], name="actions_")

            # Remember that target_Q is the R(s,a) + ymax Qhat(s', a')
            self.target_Q = tf.placeholder(tf.float32, [None], name="target")

            self.fc = tf.layers.dense(inputs=self.inputs_,
                                      units=50,
                                      kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                      activation=tf.nn.elu)

            self.output = tf.layers.dense(inputs=self.fc,
                                          units=self.action_size,
                                          kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                          activation=None)

            # Q is our predicted Q value.
            self.Q = tf.reduce_sum(tf.multiply(self.output, self.actions_))

            # The loss is the difference between our predicted Q_values and the Q_target
            # Sum(Qtarget - Q)^2
            self.loss = tf.reduce_mean(tf.square(self.target_Q - self.Q))
            self.optimizer = tf.train.AdamOptimizer(self.learning_rate).minimize(self.loss)
But I have an issue with the output:
the output should normally have the same size as "action_size", and the action_size value is 3,
but I get an output like [[5][3]] instead of just [[3]], and I really don't understand why...
This network has 2 dense layers, one with 50 perceptrons and the other with 3 perceptrons (= action_size).
state_size has the format [[9][5]].
If someone knows why my output is two-dimensional, I will be very thankful.
Your self.inputs_ placeholder has shape (5, 9). The first dense layer fc1 performs matmul(self.inputs_, fc1.w), where fc1.w has shape (9, 50), resulting in shape (5, 50). You then apply another dense layer whose weights have shape (50, 3), which results in the output shape (5, 3).
The same schematically:
matmul(shape(5, 9), shape(9, 50)) ---> shape(5, 50) # output of 1st dense layer
matmul(shape(5, 50), shape(50, 3)) ---> shape(5, 3) # output of 2nd dense layer
Usually, the first dimension of the input placeholder represents the batch size and the second dimension is the dimension of the input feature vector. So for each sample in the batch (the batch size is 5 in your case) you get an output of size 3.
To get probabilities, use this:
import tensorflow as tf
import numpy as np

inputs_ = tf.placeholder(tf.float32, shape=(None, 9))
actions_ = tf.placeholder(tf.float32, shape=(None, 3))

fc = tf.layers.dense(inputs=inputs_, units=2)
output = tf.layers.dense(inputs=fc, units=3)
reduced = tf.reduce_mean(output, axis=0)
probs = tf.nn.softmax(reduced)  # <-- probabilities

inputs_vals = np.ones((5, 9))
actions_vals = np.ones((1, 3))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(probs.eval({inputs_: inputs_vals,
                      actions_: actions_vals}))
    # [0.01858923 0.01566187 0.9657489 ]
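A quick check of the schematic above (a minimal sketch using the same tf.layers API; the static shapes already show the result without running a session):
import tensorflow as tf

inputs_ = tf.placeholder(tf.float32, shape=(5, 9))   # fixed (batch, features) shape as in the question
fc = tf.layers.dense(inputs=inputs_, units=50)       # matmul((5, 9), (9, 50)) -> (5, 50)
out = tf.layers.dense(inputs=fc, units=3)            # matmul((5, 50), (50, 3)) -> (5, 3)

print(fc.shape)    # (5, 50)
print(out.shape)   # (5, 3)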

ValueError: Cannot feed value of shape (64, 200, 75) for Tensor 'TargetsData/Y:0', which has shape '(200, 75)'

I know this is a dumb question, but I can't seem to figure it out. I feed in a NumPy array of shape (?, 200, 75) and get this error:
ValueError: Cannot feed value of shape (64, 200, 75) for Tensor 'TargetsData/Y:0', which has shape '(200, 75)'
Here is my code:
import numpy as np
import tflearn

print("loading features....")
features = np.load("features_xs.npy")
print("loading classes....")
classes = np.load("classes_xs.npy")

symbols = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p',
           'q','r','s','t','u','v','w','x','y','z',
           'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U',
           'V','W','X','Y','Z','1','2','3','4','5','6','7','8','9','0','.',',',
           '!','?',':',';','\'','(',')','-','_',' ','"',]
num_symbols = len(symbols)

input_layer = tflearn.input_data(shape=[None, 200, num_symbols])
input_layer = tflearn.flatten(input_layer)
dense1 = tflearn.fully_connected(input_layer, 1000, activation='tanh',
                                 regularizer='L2', weight_decay=0.001)
dense2 = tflearn.fully_connected(dense1, 2000, activation='tanh',
                                 regularizer='L2', weight_decay=0.001)
dense2 = tflearn.fully_connected(dense2, 1000, activation='tanh',
                                 regularizer='L2', weight_decay=0.001)
dropout2 = tflearn.dropout(dense2, 0.8)
final = tflearn.fully_connected(dropout2, (200 * num_symbols), activation='tanh')
reshape = tflearn.reshape(final, [200, num_symbols], name="Reshape")
Adam = tflearn.Adam(learning_rate=0.01)
net = tflearn.regression(reshape, optimizer=Adam,
                         loss='categorical_crossentropy')

# Training
model = tflearn.DNN(net, tensorboard_verbose=0)
model.fit(features, classes, n_epoch=1, show_metric=True, run_id="dense_model")
model.save("model")
num_symbols is equal to 75, in case you're wondering.
I can't find the solution; please help. Thanks.
Run the following code:
print(classes.shape)
You will get an output of (64, 200, 75), but your final reshape layer expects a shape of (200, 75). You will have to supply values with shape (200, 75) from your classes variable to resolve the error.
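Alternatively (an assumption on my part, not part of the original answer), a common way to make the network output match (batch, 200, 75) targets directly is to keep the batch dimension in the reshape by passing -1 as the leading dimension:
# Hypothetical change to the question's code: let the batch dimension pass through.
reshape = tflearn.reshape(final, [-1, 200, num_symbols], name="Reshape")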

Visualization of Keras Convolution Layer Outputs

I have written the following code for this question where there are two convolution layers (Conv1 and Conv2 for short) and I would like to plot all the outputs of each layer (it's self-contained). Everything is fine for Conv1, but I am missing something about Conv2.
I am feeding a 1x1x25x25 (num images, num channels, height, width (my convention, neither TF or Theano convention)) image to Conv1 which has FOUR 5x5 filters. That means its output shape is 4x1x1x25x25 (num filters, num images, num channels, height, width), resulting in 4 plots.
Now, this output is being fed to Conv2, which has SIX 3x3 filters. Hence, the output of Conv2 should be 6x(4x1x1x25x25), but it is not! It is rather 6x1x1x25x25. That means there are only 6 plots rather than 6x4, but why? The function below also prints the shape of each output, which is
(1, 1, 25, 25, 4)
-------------------
(1, 1, 25, 25, 6)
-------------------
but should be
(1, 1, 25, 25, 4)
-------------------
(1, 4, 25, 25, 6)
-------------------
Right?
import numpy as np
#%matplotlib inline #for Jupyter ONLY
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D
from keras import backend as K
model = Sequential()

# Conv1
conv1_filter_size = 5
model.add(Conv2D(nb_filter=4, nb_row=conv1_filter_size, nb_col=conv1_filter_size,
                 activation='relu',
                 border_mode='same',
                 input_shape=(25, 25, 1)))

# Conv2
conv2_filter_size = 3
model.add(Conv2D(nb_filter=6, nb_row=conv2_filter_size, nb_col=conv2_filter_size,
                 activation='relu',
                 border_mode='same'))
# The image to be sent through the model
img = np.array([
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[0.],[1.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[1.],[0.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]]])
def get_layer_outputs(image):
    '''This function extracts the numerical output of each layer.'''
    outputs = [layer.output for layer in model.layers]
    comp_graph = [K.function([model.input] + [K.learning_phase()], [output]) for output in outputs]

    # Feeding the image
    layer_outputs_list = [op([[image]]) for op in comp_graph]

    layer_outputs = []
    for layer_output in layer_outputs_list:
        print(np.array(layer_output).shape, end='\n-------------------\n')
        layer_outputs.append(layer_output[0][0])

    return layer_outputs

def plot_layer_outputs(image, layer_number):
    '''This function handles plotting of the layers'''
    layer_outputs = get_layer_outputs(image)

    x_max = layer_outputs[layer_number].shape[0]
    y_max = layer_outputs[layer_number].shape[1]
    n = layer_outputs[layer_number].shape[2]

    L = []
    for i in range(n):
        L.append(np.zeros((x_max, y_max)))

    for i in range(n):
        for x in range(x_max):
            for y in range(y_max):
                L[i][x][y] = layer_outputs[layer_number][x][y][i]

    for img in L:
        plt.figure()
        plt.imshow(img, interpolation='nearest')

plot_layer_outputs(img, 1)
The output of a convolution layer is bundled into one image with multiple channels. These can be thought of as feature channels, in contrast to color channels. For example, if a convolution layer has F filters, it will output an image with F channels, no matter how many (color or feature) channels the input image had. This is why Conv2 produces 6 feature maps rather than 6x4.
In more detail, each convolution filter convolves over all of its input channels, and the linear combination of those convolutions is fed to its activation function.
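For completeness, a minimal sketch (using the newer filters/kernel_size/padding Conv2D arguments rather than the question's nb_filter/border_mode ones) that confirms the channel counts:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
# Conv1: 4 filters -> 4 output channels, regardless of the single input channel.
model.add(Conv2D(4, (5, 5), activation='relu', padding='same', input_shape=(25, 25, 1)))
# Conv2: 6 filters -> 6 output channels, regardless of Conv1's 4 channels.
model.add(Conv2D(6, (3, 3), activation='relu', padding='same'))

model.summary()   # output shapes: (None, 25, 25, 4) and (None, 25, 25, 6)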