handling unlabelled pixels in semantic segmentation/unet model - tensorflow

I have label data consisting of 4 values [0,1,2,3].
It has 3 defined labels [1,2,3]; 0 marks an unlabelled pixel.
The goal is to assign each 0 (unlabelled pixel) to one of the three classes [1,2,3].
The following is a U-Net model run on example data.
import numpy as np
import tensorflow as tf

data = np.random.randint(low=1, high=29, size=(300, 160, 160, 10))  # (samples, width, height, channels)
labels = np.random.randint(low=0, high=4, size=(300, 160, 160))  # (samples, width, height); high is exclusive, so labels are 0..3
input_dim = (160, 160, 10)  # (width, height, channels)
n_class = len(np.unique(labels))
model = unet_model()
model.compile(optimizer=tf.keras.optimizers.Adam(0.0001),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
model.fit(data,
          labels,
          epochs=10,
          verbose=2)
However, the model predicts 0 as a class as well.
How can I handle unlabelled pixels (0s) so that the model predicts only one of [1,2,3]?
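
One possible approach (an assumption here, not something confirmed in the thread) is to shift the labelled classes 1..3 down to 0..2, so that the network has exactly three outputs, and to give unlabelled pixels zero weight in the loss. The NumPy sketch below only illustrates the computation; `prepare_targets` and `masked_sparse_ce` are hypothetical helper names. In Keras, the same per-pixel weights could likely be passed through the `sample_weight` argument of `model.fit`.

```python
import numpy as np

def prepare_targets(labels):
    """Map labels {0: unlabelled, 1, 2, 3} to class ids {0, 1, 2} plus a 0/1 weight mask."""
    weights = (labels > 0).astype(np.float32)       # 0 where unlabelled, 1 elsewhere
    targets = np.where(labels > 0, labels - 1, 0)   # placeholder class 0 for unlabelled pixels
    return targets, weights

def masked_sparse_ce(probs, targets, weights, eps=1e-7):
    """Mean cross-entropy over labelled pixels only.

    probs:   (..., 3) softmax outputs of a 3-class model
    targets: (...)    integer class ids in 0..2
    weights: (...)    1.0 for labelled pixels, 0.0 for unlabelled ones
    """
    # Pick the predicted probability of the target class for every pixel.
    picked = np.take_along_axis(probs, targets[..., None], axis=-1)[..., 0]
    per_pixel = -np.log(np.clip(picked, eps, 1.0))
    # Unlabelled pixels have weight 0, so they do not contribute to the loss.
    return float((per_pixel * weights).sum() / max(weights.sum(), 1.0))

labels = np.array([[0, 1], [2, 3]])
targets, weights = prepare_targets(labels)
probs = np.full((2, 2, 3), 1.0 / 3.0)  # a model that is uniformly unsure
loss = masked_sparse_ce(probs, targets, weights)
```

At prediction time, taking the argmax over the three outputs and adding 1 gives a class in [1,2,3] for every pixel, including the ones that were unlabelled during training.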

Related

How can I concatenate Tensorflow Dataset columns?

I have a Keras model whose input layer has shape (n, 288, 1), where 288 is the number of features. I am building the dataset with tf.data.experimental.make_batched_features_dataset, and the input reaching the model has shape (n, 1, 1), which means the model receives one feature at a time. How can I build an input tensor of shape (n, 288, 1)? In other words, how can I combine all my features into one tensor?
Here is my code for the model:
import logging

import tensorflow as tf
from tensorflow import keras

# `features` is presumably the project's own feature-definition module (not shown here).


def _gzip_reader_fn(filenames):
    """Small utility returning a record reader that can read gzip'ed files."""
    return tf.data.TFRecordDataset(filenames, compression_type='GZIP')


def _input_fn(file_pattern, tf_transform_output, batch_size):
    """Generates features and label for tuning/training.

    Args:
      file_pattern: input tfrecord file pattern.
      tf_transform_output: A TFTransformOutput.
      batch_size: representing the number of consecutive elements of returned
        dataset to combine in a single batch

    Returns:
      A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
    """
    transformed_feature_spec = (
        tf_transform_output.transformed_feature_spec().copy())
    dataset = tf.data.experimental.make_batched_features_dataset(
        file_pattern=file_pattern,
        batch_size=batch_size,
        features=transformed_feature_spec,
        reader=_gzip_reader_fn,
        label_key=features.transformed_name(features.LABEL_KEY))
    return dataset


# Parameters with defaults must come after those without, so nb_classes is last.
def _build_keras_model(input_shape, learning_rate, nb_classes=2):
    # Keras needs the feature definitions at compile time.
    input_shape = (288, 1)
    input_layer = keras.layers.Input(input_shape)
    padding = 'valid'
    if input_shape[0] < 60:
        padding = 'same'
    conv1 = keras.layers.Conv1D(filters=6, kernel_size=7, padding=padding, activation='sigmoid')(input_layer)
    conv1 = keras.layers.AveragePooling1D(pool_size=3)(conv1)
    conv2 = keras.layers.Conv1D(filters=12, kernel_size=7, padding=padding, activation='sigmoid')(conv1)
    conv2 = keras.layers.AveragePooling1D(pool_size=3)(conv2)
    flatten_layer = keras.layers.Flatten()(conv2)
    output_layer = keras.layers.Dense(units=nb_classes, activation='sigmoid')(flatten_layer)
    model = keras.models.Model(inputs=input_layer, outputs=output_layer)
    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    # Compile Keras model
    model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
    model.summary(print_fn=logging.info)
    return model
This is the error:
tensorflow:Model was constructed with shape (None, 288, 1) for input Tensor("input_1:0", shape=(None, 288, 1), dtype=float32), but it was called on an input with incompatible shape (128, 1, 1).
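
The shape (128, 1, 1) in the error suggests that each feature arrives as its own (batch, 1) column in the feature dictionary. One way to combine them, sketched here in NumPy under that assumption, is to concatenate the columns in a fixed order and then add a trailing axis. The helper `stack_features` and the feature names `f0` ... `f287` are hypothetical; inside a tf.data pipeline the equivalent step would probably be a `dataset.map` that applies `tf.concat` to the dictionary values.

```python
import numpy as np

def stack_features(feature_dict, feature_names):
    """Concatenate per-feature columns of shape (n, 1) into one (n, num_features, 1) array."""
    columns = [
        np.asarray(feature_dict[name], dtype=np.float32).reshape(-1, 1)
        for name in feature_names  # fixed order, so feature i always lands at index i
    ]
    stacked = np.concatenate(columns, axis=1)  # (n, num_features)
    return stacked[..., np.newaxis]            # (n, num_features, 1)

# Hypothetical batch: 288 features, each a (128, 1) column.
names = [f"f{i}" for i in range(288)]
batch = {name: np.random.rand(128, 1) for name in names}
x = stack_features(batch, names)
```

Iterating over an explicit list of names, rather than over the dictionary itself, keeps the feature order identical between training and serving.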

Preprocessing test images using opencv for prediction

I am working on a dataset of grayscale images that are saved in RGB format. I trained VGG16 on this dataset and preprocessed the images this way:
train_data_gen = ImageDataGenerator(rescale=1./255,
                                    rotation_range=20,
                                    width_shift_range=0.2,
                                    height_shift_range=0.2,
                                    horizontal_flip=True)
validation_data_gen = ImageDataGenerator(rescale=1./255)
train_gen = train_data_gen.flow_from_directory(trainPath,
                                               target_size=(224, 224),
                                               batch_size=64,
                                               class_mode='categorical')
validation_gen = validation_data_gen.flow_from_directory(validationPath,
                                                         target_size=(224, 224),
                                                         batch_size=64,
                                                         class_mode='categorical')
When the training was done, both training and validation accuracy were high (92%).
In the prediction phase, I first tried to preprocess images as indicated in https://keras.io/applications/ :
img = image.load_img(img_path, target_size=(image_size,image_size))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
However, the test accuracy was very low, around 50%. I ran the prediction again on training and validation samples and also got a low accuracy of around 50%, which means that the problem is in the prediction phase.
Instead, I preprocessed images using OpenCV library, and the accuracy was better, but still not as expected. I tried to make the prediction on train samples (where accuracy during training was 92%), and during the prediction I got 82%. Here is the code:
img = cv2.imread(imagePath)
#np.flip(img, axis=-1)
img = cv2.resize(img, (224, 224),
                 interpolation=cv2.INTER_AREA)
img = np.reshape(img,
                 (1, img.shape[0], img.shape[1], img.shape[2]))
img = img/255.
The result is the same with/without flipping the image. What's wrong with the preprocessing step?
Thanks
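The error was in the interpolation parameter of cv2.resize: it should be cv2.INTER_NEAREST instead of cv2.INTER_AREA. flow_from_directory resizes with interpolation='nearest' by default, so the prediction pipeline has to resize the same way to match what the network saw during training.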

a problem using LSTM network (neural networks)

I'm trying to create a speaker diarization system using an LSTM (I want the network to tell the difference between speakers).
This is the model I've created:
model = Sequential()
model.add(LSTM(768, batch_input_shape=(39, 40, 1), return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768, return_sequences=True))
model.add(Dense(256))
model.add(LSTM(768, return_sequences=True))
model.add(Dense(4))
There are 4 different speakers.
In my dataset I have the array 'features' (of length 256, for 256 speech segments).
Each segment in 'features' is represented by 39 vectors, and each of these vectors has size 40.
Each of these 39 vectors is extracted from a different time window (I used log mel filterbank energies).
I also have the array 'labels', which also has length 256 and contains the label for each segment.
I used 'to_categorical' for it:
labels = tf.keras.utils.to_categorical(labels, num_classes=4)
I tried using a generator to feed the data to the network, but it didn't work.
This is the class I used:
class KerasBatchGenerator(object):
    def __init__(self, features, batch_size, labels):
        self.features = features
        self.batch_size = batch_size
        self.labels = labels

    def generate(self):
        while True:
            for i in self.labels:
                for j in self.features:
                    temp = [j, i]
                    # temp = np.expand_dims(temp, axis=1)
                    temp = np.expand_dims(temp, axis=2)
                    yield tuple(temp)
And the code I used to run the network is:
train_data_generator = KerasBatchGenerator(features, batch_size, labels)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(train_data_generator.generate(), 100, 1)
please help!!!
If I guessed correctly, you want to classify which speaker spoke each input.
In that case the output of your final layer should have shape (batch_size, numOfClasses), i.e. (39, 4).
But if you take a close look at the model summary, the output shape of the final layer is (39, 40, 4).
To get the proper shape, remove the argument return_sequences=True from the last LSTM layer.
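
Separately from the output shape, the generator in the question pairs every label with every feature vector, so segments and labels get mixed up. A minimal sketch of a generator that keeps each segment with its own label is shown below. It assumes (this is not stated in the question) that 'features' can be stacked into an array of shape (256, 39, 40) and that the model input is declared as 39 time steps of 40 values each, rather than batch_input_shape=(39, 40, 1). The name `batch_generator` and the random data are placeholders.

```python
import numpy as np

def batch_generator(features, labels, batch_size):
    """Yield (x, y) batches forever, keeping each segment paired with its own label.

    features: (num_segments, 39, 40) array, 39 time windows of 40 filterbank energies
    labels:   (num_segments, 4) one-hot array
    """
    num_segments = len(features)
    while True:
        # Step through the data in full batches; a trailing partial batch is skipped.
        for start in range(0, num_segments - batch_size + 1, batch_size):
            stop = start + batch_size
            yield features[start:stop], labels[start:stop]

features = np.random.rand(256, 39, 40).astype(np.float32)  # placeholder data
labels = np.eye(4, dtype=np.float32)[np.random.randint(0, 4, size=256)]  # one-hot labels
x, y = next(batch_generator(features, labels, batch_size=32))
```

With return_sequences=True removed from the last LSTM layer, the model output has shape (batch, 4), which lines up with the `y` batches produced here.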

Are these images too 'noisy' to be correctly classified by a CNN?

I'm attempting to build an image classifier to identify between 2 types of images on property sites. I've split my dataset into 2 categories: [Property, Room]. I'm hoping to be able to differentiate between whether the image is of the outside of some property or a room inside the property.
Below are 2 examples of the types of image I am using. My dataset consists of 800 images for each category, plus a separate set of an additional 160 images for each category (not present in the training set).
I always seem to get reasonable results in training, but when I test against some real samples the model usually ends up classifying all of the images into a single category.
Below you can see the model I am using:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=10,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)  # set validation split
validate_datagen = ImageDataGenerator(rescale=1./255)
IMG_HEIGHT = IMG_WIDTH = 128
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (11, 11), activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3), padding='same'),
    tf.keras.layers.MaxPooling2D(11, 11),
    # tf.keras.layers.Dropout(0.5),
    # Second convolutional layer
    tf.keras.layers.Conv2D(64, (11, 11), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(11, 11),
    # tf.keras.layers.Dropout(0.5),
    # Flattening
    tf.keras.layers.Flatten(),
    # Full connection
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
from tensorflow.keras.optimizers import RMSprop
model.compile(
    optimizer=RMSprop(lr=0.001),
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# now train the model
history = model.fit_generator(
    train_generator,
    validation_data=validation_generator,
    steps_per_epoch=75,  # 100
    epochs=5,  # 15, or 20, and 100 steps per epoch
    validation_steps=50,
    verbose=1
)
# Predict image
def load_image(img_path, show=False):
    test_image = image.load_img(img_path, target_size=(IMG_HEIGHT, IMG_WIDTH))
    test_image = image.img_to_array(test_image)
    test_image /= 255.
    test_image = np.expand_dims(test_image, axis=0)
    return test_image

def predict_image(img_path, show=False):
    loaded_img = load_image(img_path, show)
    pred = model.predict(loaded_img)
    return 'property' if pred[0][0] == 0.0 else 'room'

print('Prediction is...')
print(predict_image('path/to/my/img'))
Can anyone suggest the possible reasons for this? I've tried using different epochs and batch sizes, augmenting the images further, changing the Conv2D and Pooling layer size but nothing seems to help.
Do I perhaps not have enough data, or are they bad images to begin with? This is my first foray into ML, so apologies if any of these questions seem obvious.
You are not post-processing the output of the classifier correctly. It outputs a probability in [0, 1], with values < 0.5 corresponding to the first class and values >= 0.5 to the second class. Comparing the output with exactly 0.0 will almost never be true, so nearly every image ends up labelled 'room'. You should change the code accordingly.
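
A minimal sketch of that thresholding is shown below. The helper name `label_from_probability` is hypothetical, and the class order ("property" first, "room" second) simply follows the question's code; with flow_from_directory it is worth confirming the actual mapping through the generator's `class_indices` attribute.

```python
import numpy as np

def label_from_probability(prob, threshold=0.5, classes=("property", "room")):
    """Map a sigmoid output in [0, 1] to a class name."""
    # Values below the threshold belong to the first class, the rest to the second.
    return classes[1] if prob >= threshold else classes[0]

pred = np.array([[0.12]])  # same shape as the model.predict output for one image
result = label_from_probability(pred[0][0])
```

In `predict_image`, the return line would then become `return label_from_probability(pred[0][0])`.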
Try data augmentation: it applies random transformations to the images, such as random rotation, random zoom, random horizontal flip, width shift and height shift. Also try adding batch normalisation.

Resolving: Error when checking input: expected dense_125_input to have 2 dimensions, but got array with shape (192, 192, 1)

I'm trying to get my first net running. The following error occurs:
ValueError: Error when checking input: expected dense_125_input to have 2 dimensions, but got array with shape (192, 192, 1)
# ... images 300 px width/height
def preprocess_image(image):
    image = tf.image.decode_jpeg(image, channels=1)
    image = tf.image.resize(image, [192, 192])
    image /= 255.0  # normalize to [0,1] range
    return image

# creating the dataset
def prepare_data_train(path, label_from_filename, show=False):
    images = []
    labels = []
    for file in glob.glob(path + '*.jpg'):
        label = label_from_filename(file)
        if label != False:
            images.append(file)
            labels.append(label)
    path_ds = tf.data.Dataset.from_tensor_slices(images)
    image_ds = path_ds.map(load_and_preprocess_image, num_parallel_calls=AUTOTUNE)
    label_ds = tf.data.Dataset.from_tensor_slices(tf.cast(labels, tf.int32))
    image_label_ds = tf.data.Dataset.zip((image_ds, label_ds))
    # shuffling, batch size
    BATCH_SIZE = 20
    image_count = len(images)
    # Setting a shuffle buffer size as large as the dataset ensures that the data is
    # completely shuffled.
    ds = image_label_ds.shuffle(buffer_size=image_count)
    ds = ds.repeat()
    ds = ds.batch(BATCH_SIZE)
    # `prefetch` lets the dataset fetch batches in the background while the model is training.
    ds = ds.prefetch(buffer_size=AUTOTUNE)
    keras_ds = ds.map(change_range)
    image_batch, label_batch = next(iter(keras_ds))
    return image_label_ds
# running ...
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(192,)))
model.add(Dense(2, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
train_ds = prepare_data_train(path_train, label_from_filename, False)
validation_ds = prepare_data_test(path_test, label_from_filename, False)
# error when fitting
history = model.fit(train_ds,
                    batch_size=20,
                    epochs=10,
                    verbose=2,
                    validation_steps=2,
                    steps_per_epoch=2,
                    validation_data=validation_ds)
How can I resolve it? Is reshaping needed, and if so, how?
Based on the images, the net should predict 1 or 2.
The error comes from this line in your code:
model.add(Dense(100, activation='relu', input_shape=(192,)))
Namely, the shape of your input is 3 dimensional [width, height, channels] or [192, 192, 1]. So, if you really want to have that dense layer at the start, change the model definition to:
model = Sequential()
model.add(Flatten(input_shape=[192, 192, 1]))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))
The line model.add(Flatten(input_shape=[192, 192, 1])) will flatten your input to be a single vector for each element in the batch. Then, you can proceed as you want.
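
As a rough illustration of what Flatten does to the shapes, here is a NumPy sketch (not the Keras implementation itself; the batch size of 20 is taken from the question and the zero-filled array is a placeholder):

```python
import numpy as np

# A batch of 20 grayscale images: (batch, height, width, channels).
batch = np.zeros((20, 192, 192, 1), dtype=np.float32)

# Flatten keeps the batch axis and collapses everything else into one vector per image.
flat = batch.reshape(len(batch), -1)

# The Dense(100) layer that follows then needs one weight per flattened input, plus biases.
dense_params = flat.shape[1] * 100 + 100
```

Each image becomes a vector of 192 * 192 * 1 values, which is the 2-dimensional (batch, features) input that a Dense layer expects.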
This error occurs when you pass incorrectly shaped training data to model.fit. Check your train_ds with print(train_ds) before passing it to model.fit. It should print something like:
<SkipDataset shapes: ((192, 192, 1), ()), types: (tf.float32, tf.int64)>