Train YOLO with height = 320 and width = 320 - object-detection

I am training an object detection model with two classes using YOLOv3. In the yolov3.cfg config file, the defaults are height = 416 and width = 416. I want to train with height = 320 and width = 320, so I made those changes and started training, but this gives me average loss = NaN. When I train with 416 x 416 it works completely fine.
So how can I train my model with 320 x 320 instead of 416 x 416?

So YOLOv3 has a default height and width of 416. To get 320, I changed it when producing the .tflite model instead of in the .cfg.
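For reference, a minimal sketch of fixing the input shape at conversion time. It assumes the trained network has already been rebuilt as a Keras model named yolo (e.g. by a .weights-to-Keras converter); the name yolo and the output file name are assumptions, not from the post:

import tensorflow as tf

# Hypothetical `yolo`: a Keras reimplementation of the trained YOLOv3 network.
# Wiring it to a fixed 320x320 input bakes that shape into the .tflite file.
inputs = tf.keras.Input(shape=(320, 320, 3))
fixed = tf.keras.Model(inputs, yolo(inputs))

converter = tf.lite.TFLiteConverter.from_keras_model(fixed)
with open('yolov3_320.tflite', 'wb') as f:
    f.write(converter.convert())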

Related

fixed_shape_resizer of 320 x 320 in YOLOV3: changing height and width of config file

I have a dataset of two classes: bottle and mouse. I annotated the images and I am training with YOLOv3.
step1: trained with yolov3
step2: convert .weights to .tflite
step3: create metadata for .tflite file
step4: integrate the metadata .tflite file into a Kotlin application in Android Studio
Now, to get my detection model into the Android app, I have to ensure that fixed_shape_resizer is 320 x 320. In order to implement this I have changed the height and width in the yolov3.cfg file:
batch=4
subdivisions=16
width=320
height=320
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 4000
policy=steps
steps=3800,4200
scales=.1,.1

[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
But when I use the above .cfg file to train with the command below, my loss = NaN:
!darknet/darknet detector train dataset/labelled_data.data darknet/cfg/yolov3_custom.cfg custom_weight/darknet53.conv.74 -dont_show
How do I solve this issue?
Is there any other way to set fixed_shape_resizer in YOLOv3?
Which YOLO model is better to deploy on a phone?

Custom Object Detection TFLite model error (pascal voc) - ValueError: The size of the train_data (0) couldn't be smaller than batch_size (2)

I am trying to build a custom object detection model in a Jupyter notebook using TFLite Model Maker, but I have some problems.
My images are annotated in Pascal VOC format (not CSV), so I split the train/test data into different folders:
import os
import random
import shutil

os.mkdir('C:/Users/user/Desktop/GradProject5/train_test_split/train')
os.mkdir('C:/Users/user/Desktop/GradProject5/train_test_split/test')

image_paths = os.listdir('C:/Users/user/anaconda3/envs/DarkflowTest/data/dataset')
random.shuffle(image_paths)

# copy 80% of the image/annotation pairs to train/, the rest to test/
for i, image_path in enumerate(image_paths):
    if i < int(len(image_paths) * 0.8):
        shutil.copy(f'C:/Users/user/anaconda3/envs/DarkflowTest/data/dataset/{image_path}', 'C:/Users/user/Desktop/GradProject5/train_test_split/train')
        shutil.copy(f'C:/Users/user/anaconda3/envs/DarkflowTest/data/annotations/{image_path.replace("jpg", "xml")}', 'C:/Users/user/Desktop/GradProject5/train_test_split/train')
    else:
        shutil.copy(f'C:/Users/user/anaconda3/envs/DarkflowTest/data/dataset/{image_path}', 'C:/Users/user/Desktop/GradProject5/train_test_split/test')
        shutil.copy(f'C:/Users/user/anaconda3/envs/DarkflowTest/data/annotations/{image_path.replace("jpg", "xml")}', 'C:/Users/user/Desktop/GradProject5/train_test_split/test')

train_image_dir = 'C:/Users/user/Desktop/GradProject5/train_test_split/train/'
test_image_dir = 'C:/Users/user/Desktop/GradProject5/train_test_split/test/'
#annotations_dir = 'C:/Users/user/anaconda3/envs/DarkflowTest/data/annotations/'

train_data = object_detector.DataLoader.from_pascal_voc(train_image_dir+'image/', train_image_dir+'xml/', label_map={1:"pill", 2:"text"})
test_data = object_detector.DataLoader.from_pascal_voc(test_image_dir+'image/', test_image_dir+'xml/', label_map={1:"pill", 2:"text"})
Then I loaded the train data and test data with DataLoader.
model = object_detector.create(train_data, model_spec=spec, batch_size=2, train_whole_model=True)
Then I tried to create the model, but I got this error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 model = object_detector.create(train_data, model_spec=spec, batch_size=2, train_whole_model=True)

~\anaconda3\lib\site-packages\tensorflow_examples\lite\model_maker\core\task\object_detector.py in create(cls, train_data, model_spec, validation_data, epochs, batch_size, train_whole_model, do_train)
    285     if do_train:
    286       tf.compat.v1.logging.info('Retraining the models...')
--> 287       object_detector.train(train_data, validation_data, epochs, batch_size)
    288     else:
    289       object_detector.create_model()

~\anaconda3\lib\site-packages\tensorflow_examples\lite\model_maker\core\task\object_detector.py in train(self, train_data, validation_data, epochs, batch_size)
    137     # TODO(b/171449557): Upstream this to the parent class.
    138     if len(train_data) < batch_size:
--> 139       raise ValueError('The size of the train_data (%d) couldn\'t be smaller '
    140                        'than batch_size (%d). To solve this problem, set '
    141                        'the batch_size smaller or increase the size of the '

ValueError: The size of the train_data (0) couldn't be smaller than batch_size (2). To solve this problem, set the batch_size smaller or increase the size of the train_data.
Am I getting the error because
train_data = object_detector.DataLoader.from_pascal_voc(train_image_dir+'image/', train_image_dir+'xml/', label_map={1:"pill", 2:"text"})
failed to load the train data?
I think I did everything right, but I am still struggling with this error.
Please help me if you know the solution to this problem!
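As a quick check (a sketch, not from the original post): the traceback shows Model Maker calls len(train_data), so you can print it yourself to confirm whether anything was loaded:

# 0 here means from_pascal_voc found no image/annotation pairs,
# i.e. the directory paths (or their trailing slashes) are wrong
print(len(train_data), len(test_data))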
I met the same issue; simply removing the trailing / from the test image directory made the issue go away.
from tflite_model_maker import model_spec, object_detector

train_image_dir = '/Users/twinssbc/VSProject/mobile_phone/train'
train_process_image_dir = '/Users/twinssbc/VSProject/mobile_phone/train_preprocess'
train_label_dir = '/Users/twinssbc/VSProject/mobile_phone/train_label'

# note: no trailing / on the directory paths passed to from_pascal_voc
# (label_map and the test_* directories are defined elsewhere)
train_data_loader = object_detector.DataLoader.from_pascal_voc(train_process_image_dir, train_label_dir, label_map=label_map)
spec = model_spec.get('efficientdet_lite4')
model = object_detector.create(train_data_loader, spec, batch_size=5, epochs=50, train_whole_model=True)

test_data_loader = object_detector.DataLoader.from_pascal_voc(test_process_image_dir, test_label_dir, label_map=label_map)
model.evaluate(test_data_loader)
Check the format of your Pascal VOC XML. I solved this issue by reformatting XML files that began with a declaration like <?xml version="1.0" encoding="utf-8"?>; you might need to remove this line. And if you still get a file-not-found error, check that the extension in your XML filename matches the actual file name in the folder in use. P.S. this happened to me because I used Label Studio to annotate and tried to use the data in TFLite Model Maker directly.
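A minimal sketch of the XML clean-up described above; the folder path is an assumption:

import glob

# strip a leading XML declaration (<?xml version="1.0" encoding="utf-8"?>)
# from every annotation file in the (assumed) training folder
for path in glob.glob('train_test_split/train/*.xml'):
    with open(path, encoding='utf-8') as f:
        lines = f.readlines()
    if lines and lines[0].lstrip().startswith('<?xml'):
        with open(path, 'w', encoding='utf-8') as f:
            f.writelines(lines[1:])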

handling unlabelled pixels in semantic segmentation/unet model

I have label data consisting of 4 values [0,1,2,3].
It has 3 defined labels [1,2,3], where 0 refers to an unlabelled pixel.
The goal is to predict each 0 (unlabelled pixel) as one of the three classes [1,2,3].
Following is a U-Net model run on example data:
import numpy as np
import tensorflow as tf

data = np.random.randint(low=1, high=29, size=(300, 160, 160, 10))  # (samples, width, height, channels)
labels = np.random.randint(low=0, high=4, size=(300, 160, 160))     # (samples, width, height), values in [0,1,2,3]
input_dim = (160, 160, 10)  # (width, height, channels)
n_class = len(np.unique(labels))

model = unet_model()
model.compile(optimizer=tf.keras.optimizers.Adam(0.0001),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
model.fit(data,
          labels,
          epochs=10,
          verbose=2)
However, it is predicting 0s as well.
How can I handle the unlabelled pixels (0s) so that the model predicts only one of [1,2,3]?
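One common approach (a sketch, not a definitive fix): treat label 0 as an "ignore" class. Give the model 3 output channels, shift the real labels [1,2,3] down to [0,1,2], and mask the unlabelled pixels out of the loss so they never contribute a gradient:

import tensorflow as tf

def masked_sparse_ce(y_true, y_pred):
    # y_true: (batch, H, W) with 0 = unlabelled; y_pred: (batch, H, W, 3)
    y_true = tf.cast(y_true, tf.int32)
    mask = tf.cast(tf.not_equal(y_true, 0), tf.float32)  # 1 where labelled
    shifted = tf.maximum(y_true - 1, 0)                  # [1,2,3] -> [0,1,2]
    loss = tf.keras.losses.sparse_categorical_crossentropy(shifted, y_pred)
    return tf.reduce_sum(loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)

model.compile(optimizer=tf.keras.optimizers.Adam(0.0001), loss=masked_sparse_ce)

At inference time, take the argmax over the 3 channels and add 1 to map predictions back to [1,2,3].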

Partially random elements in batch after Dataset.shuffle()

I'm using tf.data.Dataset and tf.keras in TF 2.1 to train on a dataset, but I am seeing strange behavior: the resulting batches are not as fully random as I expected. I usually see elements from only 2 classes in one batch even though my dataset has 4 classes. My code is as follows:
import glob
import os
import numpy as np
import tensorflow as tf

def process_train_sample(file_path):
    # derive the class index from the first path component under train_data_dir
    sp = tf.strings.regex_replace(file_path, train_data_dir, '')
    cls = tf.math.argmax(tf.cast(tf.math.equal(tf.strings.split(sp, os.path.sep)[0], ['A','B','C','D']), tf.int64))
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=3)  # RGB
    img = tf.image.resize(img, (224, 224))
    img = tf.cast(img, tf.float32)
    img = img - np.array([123.68, 116.779, 103.939])
    img = img / 255.0
    cls = tf.expand_dims(cls, 0)
    return img, cls

train_data_list = glob.glob(os.path.join(train_data_dir, '**', '*.jpg'), recursive=True)
train_data_list = tf.data.Dataset.from_tensor_slices(train_data_list)
train_ds = train_data_list.map(process_train_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE)
train_ds = train_ds.shuffle(10000)
train_ds = train_ds.batch(batch_size)

for img, cls in train_ds.take(10):
    print('img: ', img.numpy().shape, 'cls: ', cls.numpy())

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
              metrics=['categorical_accuracy', 'categorical_crossentropy'])
model.fit(train_ds, epochs=50)
When training on a dataset with 4 classes - A, B, C, D - I found that the training accuracy does not increase stably; instead, it fluctuates up and down. I then checked my data input pipeline by printing the labels batch by batch, as in the for-loop above, and found that each batch contains elements from only 2 classes instead of 4. It seems the dataset is not shuffled as I expected, which may be why the accuracy does not increase steadily. But I don't see what's wrong in my code.
In .shuffle(10000), the 10000 is the buffer size, which means it will sample from the first 10000 images. As you have ~30000 images, this results in images from only the first and second classes in the first batches. As you continue training, you will start to sample from classes (1,2,3), then only (2,3), then (2,3,4), then (3,4), then (3,4,1), then (4,1), then (4,1,2), then (1,2), then (1,2,3), and so on. Try setting the shuffle buffer size to 30000 if you have the memory; if you don't, first shuffle your list of paths, and then use a large batch size.
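A sketch of that second suggestion, using the variable names from the question: shuffle the full path list once up front (cheap, since it is only strings), then keep the runtime buffer on top:

import random

random.shuffle(train_data_list)  # full-dataset shuffle of the path list
train_ds = tf.data.Dataset.from_tensor_slices(train_data_list)
train_ds = train_ds.map(process_train_sample, num_parallel_calls=tf.data.experimental.AUTOTUNE)
train_ds = train_ds.shuffle(10000, reshuffle_each_iteration=True)
train_ds = train_ds.batch(batch_size)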

Backpropagation Using Tensorflow and Numpy MSE not Dropping

I am trying to implement backpropagation without using the GradientDescentOptimizer from TF; I just want to update my own weights and biases. The problem is that the Mean Square Error (cost) is not approaching zero. It just stays at 0.2xxx. Is it because of my inputs, which are 520x1600 (yes, each input has 1600 units and yes, there are 520 of them), or is my number of neurons in the hidden layer problematic? I have tried implementing this using the GradientDescentOptimizer and minimize(cost), which works fine (the cost reduces to near zero as training goes on), so maybe I have an issue in my code for updating the weights and biases.
Here's my code:
import tensorflow as tf
import numpy as np
from BPInputs40 import pattern, desired

# get the inputs and desired outputs; 520 inputs, each with 1600 units
train_in = pattern
train_out = desired

learning_rate = tf.constant(0.5)
num_input_neurons = len(train_in[0])
num_output_neurons = len(train_out[0])
num_hidden_neurons = 20

# weight matrix initialization with random values
w_h = tf.Variable(tf.random_normal([num_input_neurons, num_hidden_neurons]), dtype=tf.float32)
w_o = tf.Variable(tf.random_normal([num_hidden_neurons, num_output_neurons]), dtype=tf.float32)
b_h = tf.Variable(tf.random_normal([1, num_hidden_neurons]), dtype=tf.float32)
b_o = tf.Variable(tf.random_normal([1, num_output_neurons]), dtype=tf.float32)

# Model input and output
x = tf.placeholder("float")
y = tf.placeholder("float")

def sigmoid(v):
    return tf.div(tf.constant(1.0), tf.add(tf.constant(1.0), tf.exp(tf.negative(v * 0.001))))

def derivative(v):
    return tf.multiply(sigmoid(v), tf.subtract(tf.constant(1.0), sigmoid(v)))

output_h = tf.sigmoid(tf.add(tf.matmul(x, w_h), b_h))
output_o = tf.sigmoid(tf.add(tf.matmul(output_h, w_o), b_o))

error = tf.subtract(output_o, y)  # (1x35)
mse = tf.reduce_mean(tf.square(error))

# deltas for the output layer
delta_o = tf.multiply(error, derivative(output_o))
delta_b_o = delta_o
delta_w_o = tf.matmul(tf.transpose(output_h), delta_o)

# deltas backpropagated to the hidden layer
delta_backprop = tf.matmul(delta_o, tf.transpose(w_o))
delta_h = tf.multiply(delta_backprop, derivative(output_h))
delta_b_h = delta_h
delta_w_h = tf.matmul(tf.transpose(x), delta_h)

# updating the weights
train = [
    tf.assign(w_h, tf.subtract(w_h, tf.multiply(learning_rate, delta_w_h))),
    tf.assign(b_h, tf.subtract(b_h, tf.multiply(learning_rate, tf.reduce_mean(delta_b_h, 0)))),
    tf.assign(w_o, tf.subtract(w_o, tf.multiply(learning_rate, delta_w_o))),
    tf.assign(b_o, tf.subtract(b_o, tf.multiply(learning_rate, tf.reduce_mean(delta_b_o, 0))))
]

sess = tf.Session()
sess.run(tf.global_variables_initializer())

err, target = 1, 0.005
epoch, max_epochs = 0, 2000000
while epoch < max_epochs:
    epoch += 1
    err, _ = sess.run([mse, train], {x: train_in, y: train_out})
    if epoch % 1000 == 0:
        print('Epoch:', epoch, '\nMSE:', err)

answer = tf.equal(tf.floor(output_o + 0.5), y)
accuracy = tf.reduce_mean(tf.cast(answer, "float"))
print(sess.run([output_o], feed_dict={x: train_in, y: train_out}))
print("Accuracy: ", (1 - err) * 100, "%")
Update: I got it working now. The MSE dropped to almost zero once I increased the number of neurons in the hidden layer. I tried using 5200 and 6400 neurons for the hidden layer, and with just 5000 epochs the accuracy was almost 99%. Also, the largest learning rate I used was 0.1, because above that the MSE would not get close to zero.
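In script form, the changes described in the update amount to (values taken from the text above):

num_hidden_neurons = 5200         # was 20; 6400 also worked
learning_rate = tf.constant(0.1)  # largest rate that still converged
max_epochs = 5000                 # ~99% accuracy after this many epochs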
I'm not an expert in this field, but it looks like your weights are updated correctly, and the fact that your MSE decreases from some higher value to 0.2xxx is a strong indicator of that. I would definitely try running this problem with far more hidden neurons (e.g. 500).
Btw, are your inputs normalized? If not, that could well be the reason.
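For what it's worth, a minimal normalization sketch, assuming train_in is (or can become) a NumPy array:

import numpy as np

# zero mean, unit variance per feature before feeding the network
train_in = np.asarray(train_in, dtype=np.float32)
train_in = (train_in - train_in.mean(axis=0)) / (train_in.std(axis=0) + 1e-8)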