I created a TensorFlow model and trained it on 13 classes of clothes. It works for the most part, but if I give the model an image of a human face, or anything other than what it was trained on, it still always returns a prediction. This is a problem: for example, when I use it in an Android app and the photo is bad or completely unrelated to the model, it still gives a prediction. How can I fix that?
My Model:
base_model = tf.keras.applications.InceptionResNetV2(
    include_top=False,
    weights='imagenet',
    input_shape=(image_width, image_height, 3)
)
base_model.trainable = False
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.BatchNormalization(renorm=True),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(categorys_size, activation='softmax')])
How I predict on Android using TensorFlow Lite:
val outputs = aiModel.process(input)
val outputFeature0 = outputs.outputFeature0AsTensorBuffer
val confidences = outputFeature0.floatArray
var maxPos = 0
var maxConfidence = 0f
for (i in confidences.indices) {
    Log.d("Debug", "Ai $i: ${confidences[i]}")
    if (confidences[i] > maxConfidence) {
        maxConfidence = confidences[i]
        maxPos = i
    }
}
As suggested in the answer, I changed my output layer's activation function to sigmoid. These are the test results:
Last layer in the model:
tf.keras.layers.Dense(categorys_size, activation='sigmoid')])
Trained item prediction:
0: 0.008963287
1: 0.31714076
2: 0.3260302
3: 0.16326761
4: 0.98595774
5: 0.99898005
6: 0.73391
7: 0.105021596
8: 0.6949855
9: 0.23735091
10: 0.10482347
11: 0.44860303
12: 0.972826
13: 0.048107207
Prediction: 5
Non-trained 'False Data':
0: 0.021587461
1: 0.04939705
2: 0.9529922
3: 0.97660494
4: 0.19534937
5: 0.0668163
6: 0.032978743
7: 0.5045526
8: 0.6911509
9: 0.98788655
10: 0.9927062
11: 0.11638865
12: 0.62173533
13: 0.23424217
Prediction: 10
The model's top prediction is always above 50%, no matter what.
Change your output layer activation function to a sigmoid. It outputs a value between 0 and 1 for every class independently, so each value can be read as a per-class score. With softmax, the outputs always add up to 1, so they can never all be close to 0.
In the post-processing of your prediction, you iterate over the outputs and show the most probable class to the user. If every value in your output vector is below a certain threshold, you can simply not return a prediction at all.
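A minimal sketch of that thresholding step, written in Python for illustration. The function name `predict_or_reject`, the 0.8 threshold and the sample score vectors are all hypothetical; none of them come from the question's model.

```python
def predict_or_reject(confidences, threshold=0.8):
    """Return the index of the highest score, or None if no score reaches the threshold."""
    best_index = max(range(len(confidences)), key=lambda i: confidences[i])
    if confidences[best_index] < threshold:
        return None  # nothing is confident enough: report "unknown" instead of a class
    return best_index

# Hypothetical score vectors, for illustration only.
print(predict_or_reject([0.05, 0.93, 0.10]))  # index of the confident class
print(predict_or_reject([0.20, 0.35, 0.10]))  # None: every score is below the threshold
```

The threshold would need to be tuned on held-out data that includes unrelated images. A threshold only helps if unrelated images actually receive low scores, and the test results in the question suggest that this is not guaranteed.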
I am trying to use TF to solve a custom gym environment, all within Google Colab.
The main script is the TF "DQN Tutorial" available here.
In place of env_name = "CartPole-v0" I am using env_name = "gym_examples/GridWorld-v0", where gym_examples/GridWorld-v0 is the sample custom environment described in the gym documentation here. (That example uses gym v0.25.0 but TF requires gym <= v0.23.0, so I also had to tweak the rendering code a bit to make it work in v0.23.0.)
The environment loads fine via env = suite_gym.load(env_name), and subsequent code cells run fine as well, until the following two cells:
fc_layer_params = (100, 50)
action_tensor_spec = tensor_spec.from_spec(env.action_spec())
num_actions = action_tensor_spec.maximum - action_tensor_spec.minimum + 1
# Define a helper function to create Dense layers configured with the right
# activation and kernel initializer.
def dense_layer(num_units):
    return tf.keras.layers.Dense(
        num_units,
        activation=tf.keras.activations.relu,
        kernel_initializer=tf.keras.initializers.VarianceScaling(
            scale=2.0, mode='fan_in', distribution='truncated_normal'))
# QNetwork consists of a sequence of Dense layers followed by a dense layer
# with `num_actions` units to generate one q_value per available action as
# its output.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
q_values_layer = tf.keras.layers.Dense(
    num_actions,
    activation=None,
    kernel_initializer=tf.keras.initializers.RandomUniform(
        minval=-0.03, maxval=0.03),
    bias_initializer=tf.keras.initializers.Constant(-0.2))
q_net = sequential.Sequential(dense_layers + [q_values_layer])
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
train_step_counter = tf.Variable(0)
agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)
agent.initialize()
After that cell, I get an error:
ValueError: Exception encountered when calling layer "sequential_2" (type Sequential).
Layer "dense_6" expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor: shape=(1, 2), dtype=int64, numpy=array([[2, 2]])>, <tf.Tensor: shape=(1, 2), dtype=int64, numpy=array([[3, 2]])>]
Call arguments received by layer "sequential_2" (type Sequential):
• inputs={'agent': 'tf.Tensor(shape=(1, 2), dtype=int64)', 'target': 'tf.Tensor(shape=(1, 2), dtype=int64)'}
• network_state=()
• kwargs={'step_type': 'tf.Tensor(shape=(1,), dtype=int32)', 'training': 'None'}
In call to configurable 'DqnAgent' (<class 'tf_agents.agents.dqn.dqn_agent.DqnAgent'>)
I'm too much of a TF novice to understand what's going on here. I suspect it's because the action state changed from 2 states (in CartPole) to 4 (in the custom GridWorld environment). But beyond that I cannot figure it out.
This can be solved by using an Embedding layer as your first layer. In this example (Embedding(16, 4)), 16 is the grid size (4x4), and 4 is the output dimension.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
For example, replacing the line above with the code below resolves the error.
dense_layers = [
    # First layer
    tf.keras.layers.Embedding(16, 4),
    # Other layers
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu)
]
Source and further explanation:
https://martin-ueding.de/posts/reinforcement-learning-with-frozen-lake/
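To illustrate what Embedding(16, 4) does conceptually, here is a pure-Python sketch of an embedding lookup: each of the 16 grid cells gets its own 4-dimensional vector. The row-major `cell_index` helper and the random table are assumptions made for illustration. The source does not say how the environment's observations (a dict holding two (row, col) pairs) are reduced to a single integer.

```python
import random

GRID_SIDE = 4                      # 4x4 grid, as in the answer above
NUM_CELLS = GRID_SIDE * GRID_SIDE  # 16, the first argument of Embedding(16, 4)
EMBED_DIM = 4                      # the second argument of Embedding(16, 4)

def cell_index(row, col, side=GRID_SIDE):
    """Flatten a (row, col) position to one integer in [0, side * side).

    ASSUMPTION: row-major numbering; the original answer does not specify this.
    """
    return row * side + col

# Stand-in for the layer's trainable table: one EMBED_DIM-vector per grid cell.
rng = random.Random(0)
table = [[rng.uniform(-0.05, 0.05) for _ in range(EMBED_DIM)]
         for _ in range(NUM_CELLS)]

def embed(index):
    """Look up the vector for one cell index, which is all an embedding layer does."""
    return table[index]

vector = embed(cell_index(2, 2))
print(len(table), len(vector))
```

In the real layer the table entries are learned during training instead of staying at their random initial values.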
I am attempting to print the predicted probabilities of each class outcome from my trained model, when I present new raw data. This is a multi-class classification problem, with 8 outputs and 21 inputs.
I am able to print 1 outcome when I present new data, for example:
"Example 0 prediction: 1 (15.0%)"
Instead, I would expect to see something similar to the below, where the probability of each class (0, 1, 2, 3, 4, 6, Wide, Out) is shown:
Example 0 prediction 0: (12.5%), prediction 1: (12.5%), prediction 2: (12.5%), prediction 3: (12.5%), prediction 4: (12.5%), prediction 6: (12.5%), prediction Wide: (12.5%), prediction Out: (12.5%)
Please note I have tried searching for similar issues including here, here and here as well as consulted the TensorFlow documentation. However, these mainly discuss alterations to the model itself e.g. softmax activation on the final layer, categorical crossentropy as the loss function etc. so that probabilities are generated.
I have included the model architecture as well as the prediction code for full visibility.
Model:
earlystopping = callbacks.EarlyStopping(monitor ="val_loss",
mode ="min", patience = 125,
restore_best_weights = True)
#define Keras
model = Sequential()
model.add(Dense(50, input_dim=21))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5,input_shape=(50,)))
model.add(Dense(50))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5,input_shape=(50,)))
model.add(Dense(8, activation='softmax'))
#compile the keras model
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
model.fit(X, dummy_y, validation_split=0.25, epochs=1000, batch_size=100, verbose=1, callbacks=[earlystopping])
_, accuracy3 = model.evaluate(X, dummy_y, verbose=0)
print('Accuracy: %.2f' % (accuracy3*100))
Making predictions:
class_names = ['0', '1', '2','3','4','6','Wide','Out']
predict_dataset = tf.convert_to_tensor([
[1,5,1,0.459,0.322,0.041,0.002,0.103,0.032,0.041,14,0.404,0.284,0.052,0.008,0.128,0.044,0.037,0.043,54,0,],
[1,18,5,0.512,0.286,0,0,0.083,0.024,0.095,13,0.24,0.44,0.08,0,0.08,0.08,0,0.08,173,3],
[2,11,13,0.5,0.417,0,0,0.083,0,0.083,82,0.35,0.36,0.042,0.003,0.135,0.039,0.051,0.02,51,7]
])
predictions = model(predict_dataset, training=False)
for i, logits in enumerate(predictions):
    class_idx = tf.argmax(logits).numpy()
    p = tf.nn.softmax(logits)[class_idx]
    name = class_names[class_idx]
    print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100*p))
Output:
Example 0 prediction: 1 (15.0%)
Example 1 prediction: 1 (16.0%)
Example 2 prediction: 0 (16.9%)
I have tried making changes to the for loop which makes use of TensorFlow's logits, but I am still unable to get it to print each outcome and associated probability.
Any guidance is much appreciated.
In the end, instead of trying to implement a for loop, I just printed each outcome from the NumPy array.
Not the cleanest of ways, but it does the job. Hopefully useful to someone in the future.
predict_dataset = tf.convert_to_tensor([
[1,5,1,0.459,0.322,0.041,0.002,0.103,0.032,0.041,14,0.404,0.284,0.052,0.008,0.128,0.044,0.037,0.043,54,0,155]
])
predictions = model3(predict_dataset, training=False)
predictions2 = predictions.numpy()
prob_0 = predictions2[0,0]
prob_1 = predictions2[0,1]
prob_2 = predictions2[0,2]
prob_3 = predictions2[0,3]
prob_4 = predictions2[0,4]
prob_wide = predictions2[0,5]
prob_6 = predictions2[0,6]
prob_wicket = predictions2[0,7]
print(prob_0)
print(prob_1)
print(prob_2)
print(prob_3)
print(prob_4)
print(prob_wide)
print(prob_6)
print(prob_wicket)
Output
0.28349978
0.32451397
0.06382967
0.0053077294
0.20397986
0.07999096
6.386134e-08
0.038877998
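For reference, the same result can be produced with a loop. The sketch below formats every class probability of one example on a single line. The helper name `format_probabilities` is hypothetical, the class names are taken from the question, and the probabilities are copied from the output above. Whether that class order matches the label encoding used in training is an assumption. Note that the question's model already ends in a softmax layer, so its outputs are probabilities; applying tf.nn.softmax to them a second time, as the question's loop does, flattens them towards a uniform distribution.

```python
def format_probabilities(example_idx, probs, class_names):
    """Build one line listing every class with its probability as a percentage."""
    parts = ["{}: {:.1f}%".format(name, 100 * p)
             for name, p in zip(class_names, probs)]
    return "Example {} ".format(example_idx) + ", ".join(parts)

class_names = ['0', '1', '2', '3', '4', '6', 'Wide', 'Out']
# Probabilities copied from the output above; with a real model, each row of
# predictions.numpy() would be passed in instead.
probs = [0.28349978, 0.32451397, 0.06382967, 0.0053077294,
         0.20397986, 0.07999096, 6.386134e-08, 0.038877998]
print(format_probabilities(0, probs, class_names))
```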
I have been struggling for the last hour to understand what I am doing wrong. I am a novice in NNs, but this is not my first code.
def simple_model(lr=0.1):
    X = Input(shape=(6144,))
    out = Dense(1)(X)
    model = Model(inputs=X, outputs=out)
    opt = tf.keras.optimizers.SGD(learning_rate=lr)
    model.compile(optimizer=opt, loss='mean_squared_error')
    model.summary()
    return model
mod = simple_model()
a = np.zeros(6144)
v = mod.predict(a)
Running this, I get the following error:
WARNING:tensorflow:Model was constructed with shape (None, 6144) for input Tensor("input_1:0", shape=(None, 6144), dtype=float32), but it was called on an input with incompatible shape (32, 1).
......
ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 6144 but received input with shape [32, 1]
Where does this [32, 1] come from ?!
I am sure there is some silly mistake in my code, but can't see it :(
P.S. It does compile the model and prints the summary before throwing the error.
mod = simple_model()
a = np.zeros(6144)
#Add this line
a = np.expand_dims(a,axis=0)
v = mod.predict(a)
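The error appears because Keras expects the first axis of the input to be the batch axis. A 1-D array of shape (6144,) is therefore read as 6144 samples with one feature each, and predict splits them into batches of 32 (its default batch_size), which is where [32, 1] comes from. np.expand_dims adds the missing batch axis, turning the array into a batch of one sample with shape (1, 6144).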
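The shape arithmetic behind the [32, 1] in the error message can be sketched in plain Python. This is a simplified model of the behaviour, not Keras's actual code. The helper name `first_batch_shape` is hypothetical, and 32 is the documented default batch_size of predict.

```python
def first_batch_shape(input_shape, batch_size=32):
    """Approximate the shape of the first batch that predict() hands to the model."""
    if len(input_shape) == 1:
        # A 1-D array is read as N samples with one feature each.
        input_shape = (input_shape[0], 1)
    num_samples = input_shape[0]
    return (min(batch_size, num_samples),) + tuple(input_shape[1:])

print(first_batch_shape((6144,)))    # (32, 1): the shape reported in the error
print(first_batch_shape((1, 6144)))  # (1, 6144): one sample with 6144 features
```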
I am trying to build an ensemble DNN model. I train e.g. 5 models, take the weights and average them. After that I wanted to clone the first model and assign the new weights. But it does not work.
The Model is built like this:
def build_DNN_model(self):
    # initialize the DNN
    ann = tf.keras.models.Sequential()
    # add first hidden layer
    num_neurons = self.num_neurons
    ann.add(tf.keras.layers.Dense(units=num_neurons, activation='relu', kernel_initializer=tf.constant_initializer(1.)))
    ann.add(tf.keras.layers.Dropout(0.5))
    # add second hidden layer
    ann.add(tf.keras.layers.Dense(units=num_neurons, activation='relu'))
    ann.add(tf.keras.layers.Dropout(0.5))
    # add output layer
    ann.add(tf.keras.layers.Dense(units=1))
    # compile
    ann.compile(optimizer='adam', loss='mean_squared_error')
    return ann
Then the model is fitted to the data. I actually build 5 models and fit all of them to the same data.
After that I create a list of Keras model objects, called "members".
Now I would like to assign my new weights to a clone of one of the models. But even if I do this:
members[0].set_weights(members[0].get_weights())
it returns None.
I am using TensorFlow 2.
I would appreciate your help very much.
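You should define the input shape in the first layer of your model.
After doing this, I simply created two models like yours (m1, m2) and assigned the weights of m1 to m2; they are then the same. Note that set_weights updates the model in place and returns None, so None as a return value is expected.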
def build_DNN_model(input_dim):
    # initialize the DNN
    ann = tf.keras.models.Sequential()
    # add first hidden layer
    num_neurons = 32
    ann.add(tf.keras.layers.Dense(units=num_neurons, activation='relu',
                                  kernel_initializer=tf.constant_initializer(1.),
                                  input_dim=input_dim))
    ann.add(tf.keras.layers.Dropout(0.5))
    # add second hidden layer
    ann.add(tf.keras.layers.Dense(units=num_neurons, activation='relu'))
    ann.add(tf.keras.layers.Dropout(0.5))
    # add output layer
    ann.add(tf.keras.layers.Dense(units=1))
    # compile
    ann.compile(optimizer='adam', loss='mean_squared_error')
    return ann
m1 = build_DNN_model((100))
m2 = build_DNN_model((100))
m2.set_weights(m1.get_weights())
# check the weights
[(w1==w2).all() for w1,w2 in zip(m1.get_weights(),m2.get_weights())]
# [True, True, True, True, True, True]
EDIT1: assign random weights to m1:
m1.set_weights([np.random.uniform(0,1, i.shape) for i in m1.get_weights()])
EDIT2: here you can find a working implementation of model_weight_ensemble for your context, from https://machinelearningmastery.com/polyak-neural-network-model-weight-ensemble/
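The averaging step itself can be sketched without any framework. The helper below takes the weights of several models as nested Python lists and returns their element-wise mean. The name `average_nested` is hypothetical, and a plain equal-weight mean is an assumption (the linked article may weight members differently). With Keras, each array returned by get_weights() could be converted with `.tolist()` and turned back into a NumPy array before being passed to set_weights.

```python
def average_nested(values):
    """Element-wise mean of identically shaped nested lists of numbers."""
    first = values[0]
    if isinstance(first, (int, float)):
        # Leaf level: values is a list of scalars, one per ensemble member.
        return sum(values) / len(values)
    # Group the matching sub-lists of every member and recurse into each group.
    return [average_nested(list(group)) for group in zip(*values)]

# Two hypothetical members, each with a 1x2 "kernel" and a 1-element "bias".
member_a = [[[1.0, 2.0]], [3.0]]
member_b = [[[3.0, 4.0]], [5.0]]
print(average_nested([member_a, member_b]))
```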
Creating a simple model:
def create_model1():
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(13,)))
    model.add(tf.keras.layers.Dense(units=6, activation='relu', name='d1'))
    model.add(tf.keras.layers.Dense(units=2, activation='softmax', name='d2'))
    return model
Looking at layers:
model.layers
Output:
[<tensorflow.python.keras.layers.core.Dense at 0x2193acc95c8>,
<tensorflow.python.keras.layers.core.Dense at 0x2193ad3ad08>]
Looking at the weights of 2nd dense layer:
model.layers[1].weights
Output:
[<tf.Variable 'd2/kernel:0' shape=(6, 2) dtype=float32, numpy=
array([[ 0.11061734, 0.61788374],
[ 0.31208295, 0.19295567],
[-0.6812483 , 0.05383837],
[ 0.39284903, 0.69312006],
[-0.519426 , 0.67820543],
[-0.7337165 , 0.11025453]], dtype=float32)>,
<tf.Variable 'd2/bias:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>]
Setting weights:
new_weights = [tf.random.uniform(shape = (6,2)), tf.random.uniform(shape = (2,))]
model.layers[1].set_weights(new_weights)
When setting weights, the shapes in new_weights must match the shapes of that particular layer's weights.
Here, new_weights is a list containing two values: the first element is the kernel and the second element is the bias.
In this TF tutorial, the U-Net model is divided into 2 parts. The first is the contraction part, where they use MobileNet, and it is not trainable. In the second part, I'm not able to understand which layers are being trained. As far as I can see, only the last layer, Conv2DTranspose, seems trainable. Am I right?
And if I am, how can a single layer do a task as complex as segmentation?
Tutorial link: https://www.tensorflow.org/tutorials/images/segmentation
The code for the Image Segmentation Model, from the Tutorial is shown below:
def unet_model(output_channels):
    inputs = tf.keras.layers.Input(shape=[128, 128, 3])
    x = inputs
    # Downsampling through the model
    skips = down_stack(x)
    x = skips[-1]
    skips = reversed(skips[:-1])
    # Upsampling and establishing the skip connections
    for up, skip in zip(up_stack, skips):
        x = up(x)
        concat = tf.keras.layers.Concatenate()
        x = concat([x, skip])
    # This is the last layer of the model
    last = tf.keras.layers.Conv2DTranspose(
        output_channels, 3, strides=2,
        padding='same')  # 64x64 -> 128x128
    x = last(x)
    return tf.keras.Model(inputs=inputs, outputs=x)
The first part of the model, downsampling, does not use the entire MobileNet architecture but only the outputs of these intermediate layers,
'block_1_expand_relu', # 64x64
'block_3_expand_relu', # 32x32
'block_6_expand_relu', # 16x16
'block_13_expand_relu', # 8x8
'block_16_project'
of the pre-trained MobileNet model, which are non-trainable.
The second part of the model (the part you are interested in), before the final Conv2DTranspose layer, is the upsampling part, which is defined in this list:
up_stack = [
    pix2pix.upsample(512, 3),  # 4x4 -> 8x8
    pix2pix.upsample(256, 3),  # 8x8 -> 16x16
    pix2pix.upsample(128, 3),  # 16x16 -> 32x32
    pix2pix.upsample(64, 3),   # 32x32 -> 64x64
]
This means it calls a function named upsample from the pix2pix module. The code for the pix2pix module is available in this GitHub link.
Code for the function, upsample is shown below:
def upsample(filters, size, norm_type='batchnorm', apply_dropout=False):
    """Upsamples an input.
    Conv2DTranspose => Batchnorm => Dropout => Relu
    Args:
      filters: number of filters
      size: filter size
      norm_type: Normalization type; either 'batchnorm' or 'instancenorm'.
      apply_dropout: If True, adds the dropout layer
    Returns:
      Upsample Sequential Model
    """
    initializer = tf.random_normal_initializer(0., 0.02)
    result = tf.keras.Sequential()
    result.add(
        tf.keras.layers.Conv2DTranspose(filters, size, strides=2,
                                        padding='same',
                                        kernel_initializer=initializer,
                                        use_bias=False))
    if norm_type.lower() == 'batchnorm':
        result.add(tf.keras.layers.BatchNormalization())
    elif norm_type.lower() == 'instancenorm':
        result.add(InstanceNormalization())
    if apply_dropout:
        result.add(tf.keras.layers.Dropout(0.5))
    result.add(tf.keras.layers.ReLU())
    return result
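This means that the second part of the model consists of the upsampling layers defined above, with 512, 256, 128 and 64 filters respectively. These layers are newly created rather than taken from the pre-trained MobileNet, and Keras layers are trainable by default, so all four upsampling blocks are trained together with the final Conv2DTranspose layer, not only the last layer.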
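To get a feel for how much of the model sits in the decoder, here is a rough parameter count for the upsampling blocks and the final layer. The filter counts, kernel size and use_bias=False come from the code above. The encoder channel widths (320 for the bottleneck; 576, 192, 144 and 96 for the skip connections) and output_channels=3 are assumptions that do not appear in this thread, so check them against model.summary() before relying on the numbers. The helper names are hypothetical.

```python
def conv2d_transpose_params(kernel_size, in_channels, filters, use_bias):
    """Weights of a Conv2DTranspose layer: one k x k kernel per (input, output) channel pair."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)

def batchnorm_trainable_params(channels):
    """BatchNormalization trains a scale (gamma) and an offset (beta) per channel."""
    return 2 * channels

def decoder_trainable_params(bottleneck_channels, skip_channels, up_filters,
                             output_channels, kernel_size=3):
    """Sum the trainable parameters of the upsample blocks plus the last layer."""
    total = 0
    in_channels = bottleneck_channels
    for filters, skip in zip(up_filters, skip_channels):
        total += conv2d_transpose_params(kernel_size, in_channels, filters, use_bias=False)
        total += batchnorm_trainable_params(filters)
        in_channels = filters + skip  # Concatenate stacks the skip onto the upsampled map
    # Final Conv2DTranspose, which uses the Keras default use_bias=True.
    total += conv2d_transpose_params(kernel_size, in_channels, output_channels, use_bias=True)
    return total

# ASSUMED encoder widths and output classes; not stated in the thread.
print(decoder_trainable_params(320, [576, 192, 144, 96], [512, 256, 128, 64], 3))
```

Under these assumptions, the four upsampling blocks account for nearly all of the decoder's trainable parameters, while the final layer contributes only a few thousand.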