fixed_shape_resizer of 320 x 320 in YOLOv3: changing height and width of config file

I have a dataset with two classes, bottle and mouse. I annotated the images and I am training with YOLOv3. My pipeline is:
step 1: train with YOLOv3
step 2: convert .weights to .tflite
step 3: create metadata for the .tflite file
step 4: integrate the .tflite file with metadata into a Kotlin application in Android Studio
Now, to get my detection model into the Android app, I have to ensure that the fixed_shape_resizer is 320 x 320. To implement this I changed the height and width in the yolov3.cfg file:
batch=4
subdivisions=16
width=320
height=320
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 4000
policy=steps
steps=3800,4200
scales=.1,.1
[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
But when I train with the above .cfg file using the command below, my average loss = NaN.
!darknet/darknet detector train dataset/labelled_data.data darknet/cfg/yolov3_custom.cfg custom_weight/darknet53.conv.74 -dont_show
How do I solve this issue?
Is there any other way to set a fixed_shape_resizer in YOLOv3?
Which YOLO model is better to deploy on a phone?
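For reference, one way to confirm what input shape an exported .tflite model actually expects is to query the TFLite interpreter; a minimal sketch, assuming a placeholder model path yolov3_custom.tflite:

import tensorflow as tf

# Minimal sketch: check the input the converted model expects.
# "yolov3_custom.tflite" is a placeholder path for the converted model.
interpreter = tf.lite.Interpreter(model_path="yolov3_custom.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
print(input_details[0]["shape"])   # should read [1, 320, 320, 3] for a 320 x 320 model
print(input_details[0]["dtype"])   # float32 unless the model was quantized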

Related

Post-training int8 quantization and pruning of my model after training it with ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8

I'm trying to run inference with my model on an Arduino Nano 33 BLE. To do so, I trained my model using ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8 and got a model of 6.5 MB with 88% mAP@0.5 IoU, which is nice. I then tried to quantize the model to int8, but the model size increased to 11.5 MB and the accuracy became unusable. I don't know what happened; if someone can help, that would be great.
My code to quantize the model:
import glob
import numpy as np
import tensorflow as tf
from PIL import Image

def representative_dataset_gen():
    folder = "/content/dataset/train/images"
    image_size = 320
    raw_test_data = []
    files = glob.glob(folder + '/*.jpeg')
    for file in files:
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        # Scale the image into the [-1, 1] range as float32
        image = (2.0 / 255.0) * np.asarray(image, dtype=np.float32) - 1.0
        image = image[np.newaxis, :, :, :]
        raw_test_data.append(image)
    for data in raw_test_data:
        yield [data]

converter = tf.lite.TFLiteConverter.from_saved_model('/content/gdrive/MyDrive/customTF2/data/tflite/saved_model')
converter.representative_dataset = representative_dataset_gen
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_model = converter.convert()

with open('/mydrive/customTF2/data/tflite/saved_model/detect8.tflite', 'wb') as f:
    f.write(tflite_model)
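To check whether full-integer quantization actually took effect, a quick inspection of the converted file helps; a minimal sketch, reusing the output path from the script above:

import os
import tensorflow as tf

# Minimal sketch: verify the converted model is really int8 and check its size on disk.
model_path = '/mydrive/customTF2/data/tflite/saved_model/detect8.tflite'
print("size on disk (MB):", os.path.getsize(model_path) / 1e6)

interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
print("input dtype:", interpreter.get_input_details()[0]["dtype"])    # should be int8
print("output dtype:", interpreter.get_output_details()[0]["dtype"])  # should be int8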
Also, if there is a way to prune the model, that would help reduce the size to less than 1 MB.
I also tried YOLOv5, and pruned and quantized that model down to 1.9 MB, but I couldn't go further. I then tried to convert the .tflite model to a .h model to run inference on an ESP32 instead (since my .tflite model is larger than 1 MB), but the model size also increased, to 11 MB.
In short, I tried post-training quantization for my model, but the model size increased instead of decreasing, and on top of that the model performance dropped drastically. As for pruning, I couldn't get it to work with MobileNetV2, and I hope someone can help.
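For reference, a minimal magnitude-pruning sketch with the tensorflow_model_optimization toolkit. It assumes a plain Keras model object (the placeholder keras_model) and a placeholder train_dataset; the sparsity schedule values are only illustrative:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Minimal sketch: wrap a Keras model (placeholder: keras_model) with magnitude pruning,
# fine-tune briefly, then strip the pruning wrappers before export.
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.3, final_sparsity=0.8,   # illustrative values
        begin_step=0, end_step=1000)
}
model_for_pruning = prune_low_magnitude(keras_model, **pruning_params)
model_for_pruning.compile(optimizer='adam', loss='mse')   # placeholder loss
model_for_pruning.fit(train_dataset,
                      epochs=2,
                      callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers; the zeroed weights then compress well
# once the exported .tflite file is gzipped.
pruned_model = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

Note that TF2 Object Detection API detection models are not exposed as plain Keras models, which is likely why applying this wrapper to the SSD MobileNetV2 checkpoint directly is difficult.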

Train YOLO with height = 320 and width = 320

I am training my object detection model with two classes using YOLOv3. In the yolov3.cfg config file the default is height = 416 and width = 416. I want to train with height = 320 and width = 320, so I made these changes and started training, but this gives me average loss = NaN. When I train with 416 x 416 it works completely fine.
So how can I train my model with 320 x 320 instead of 416 x 416?
YOLOv3 has a default height and width of 416, so to get 320 I changed the input size on the .tflite model instead of in the .cfg.
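A minimal sketch of that approach, assuming a placeholder path yolov3_416.tflite for a model converted at 416 x 416: the interpreter's input tensor can be resized before the tensors are allocated.

import numpy as np
import tensorflow as tf

# Minimal sketch: load a model converted at 416 x 416 and resize its input to 320 x 320.
# "yolov3_416.tflite" is a placeholder path.
interpreter = tf.lite.Interpreter(model_path="yolov3_416.tflite")
input_index = interpreter.get_input_details()[0]["index"]
interpreter.resize_tensor_input(input_index, [1, 320, 320, 3])
interpreter.allocate_tensors()

dummy = np.zeros((1, 320, 320, 3), dtype=np.float32)
interpreter.set_tensor(input_index, dummy)
interpreter.invoke()   # succeeds only if every op in the graph supports the new shape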

TensorFlow, gradients become NaN even when I clip them

It seems like I have an exploding-gradient issue during the training of my reinforcement learning policy.
However, I'm clipping the gradients by norm, with 0.2 as the clipping factor.
I've checked both my inputs and my loss, and none of them are NaN. Only my gradients have an issue.
All of the gradients, without exception, become NaN in a single step, and I don't understand how that is possible since I'm clipping them. Shouldn't TensorFlow transform the NaN gradients into a clipped vector?
Here is the input data when the NaN gradients appear:
INPUT : [0.1, 0.0035909, 0.06, 0.00128137, 0.6, 0.71428571, 0.81645947, 0.46802986, 0.04861736, 0.01430704, 0.08, 0.08966659, 0.02, 0.]
Here are the previous loss values (the last one being the value at the step where the gradients become NaN):
[-0.0015171316, -0.0015835371, 0.0002261286, 0.0003917102, -0.0024305983, -0.0054471847, 0.00082066684, 0.0038477872, 0.012144111]
Here is the network I'm using; hiddens_dim is a list containing the number of nodes of the consecutive Dense layers (I'm building those layers dynamically):
class NeuralNet(tf.keras.Model):
    def __init__(self, hiddens_dim=[4, 4], out_dim=2):   # out_dim assumed to be 2, matching the output-layer gradients below
        super().__init__()
        self.out_dim = out_dim
        # Hidden layers, built dynamically from hiddens_dim
        self.hidden_layers = [tf.keras.layers.Dense(hidden_dim,
                                                    activation='elu',
                                                    kernel_initializer=tf.keras.initializers.VarianceScaling(),
                                                    kernel_regularizer=tf.keras.regularizers.L1(l1=0.001),
                                                    name=f'hidden_{i}')
                              for i, hidden_dim in enumerate(hiddens_dim)]
        # Output layer
        self.output_layer = tf.keras.layers.Dense(self.out_dim,
                                                  activation='softmax',
                                                  kernel_initializer=tf.keras.initializers.GlorotNormal(),
                                                  name='output')

    def call(self, input):
        x = input
        for layer in self.hidden_layers:
            x = layer(x)
        output = self.output_layer(x)
        return output
Now here is the part where I apply the gradients manually:
model = NeuralNet([4, 4])
optim = tf.keras.optimizers.Adam(learning_rate=0.01)
clip_norm = 0.2   # the clipping factor mentioned above
...
with tf.GradientTape() as tape:
    loss = compute_loss(rewards, log_probs)
grads = tape.gradient(loss, model.trainable_variables)
grads = [tf.clip_by_norm(grad, clip_norm=clip_norm) for grad in grads]
optim.apply_gradients(zip(grads, model.trainable_variables))
And finally, here are the gradients from the previous iteration, right before the catastrophe:
Gradient Hidden Layer 1 : [
[-0.00839788, 0.00738428, 0.0006091 , 0.00240378],
[-0.00171666, 0.00157034, 0.00012367, 0.00051114],
[-0.0069742 , 0.00618575, 0.00050313, 0.00201353],
[-0.00263796, 0.00235524, 0.00018991, 0.00076653],
[-0.01119559, 0.01178695, 0.0007518 , 0.00383774],
[-0.08692611, 0.07620181, 0.00630627, 0.02480747],
[-0.10398869, 0.09012008, 0.00754619, 0.02933704],
[-0.04725896, 0.04004722, 0.00343443, 0.01303552],
[-0.00493888, 0.0043246 , 0.00035772, 0.00140733],
[-0.00559061, 0.00484629, 0.00040546, 0.00157689],
[-0.00595227, 0.00524359, 0.00042967, 0.00170693],
[-0.02488269, 0.02446024, 0.00177054, 0.00796351],
[-0.00850916, 0.00703857, 0.00062265, 0.00229139],
[-0.00220688, 0.00196331, 0.0001586 , 0.0006386 ]]
Gradient Hidden Layer 2 : [
[-2.6317715e-04, -2.1482834e-04, 3.0761934e-04, 3.1322116e-04],
[ 8.4564053e-03, 6.7548533e-03, -9.8721031e-03, -1.0047102e-02],
[-3.8322039e-05, -3.1298561e-05, 4.3669730e-05, 4.4472294e-05],
[ 3.6933038e-03, 2.9515910e-03, -4.3102605e-03, -4.3875999e-03]]
Gradient Output Layer : [
[-0.0011955 , 0.0011955 ],
[-0.00074397, 0.00074397],
[-0.0001833 , 0.0001833 ],
[-0.00018749, 0.00018749]]
I'm not very familiar with TensorFlow, so maybe I'm not training the model correctly? However, the model seemed to train correctly before the gradients went crazy.
I know I can use many other methods to counter exploding gradients (batch norm, dropout, decreasing the learning rate, etc.), but I want to understand why gradient clipping is not working here. I thought that, by definition, a gradient can't explode when we clip it.
Thank you
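One detail worth illustrating with a minimal sketch: tf.clip_by_norm rescales a tensor by its norm, so if a gradient already contains NaN, the norm is NaN and the clipped result is still NaN. Clipping limits magnitude; it cannot repair NaN values, which typically come from the loss or forward pass (e.g. a log of zero in the log-probabilities).

import tensorflow as tf

# Minimal sketch: clipping does not remove NaNs, because the norm itself becomes NaN.
grad = tf.constant([float('nan'), 1.0, -2.0])
print(tf.clip_by_norm(grad, clip_norm=0.2).numpy())   # [nan nan nan]

# A finite but large gradient, by contrast, is clipped as expected.
big = tf.constant([300.0, -400.0])
print(tf.clip_by_norm(big, clip_norm=0.2).numpy())    # rescaled so its norm is 0.2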

Is there a simple way to subset 10 percent of train and test data using ImageDataGenerator?

I have a structure of directories like this:
-root_dir
--train
---dog (contains 750 images of dogs)
---cat (contains 750 images of cats)
---mouse (contains 750 images of mice)
--test
---dog (contains 250 images of dogs)
---cat (contains 250 images of cats)
---mouse (contains 250 images of mice)
That is how I load the data:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_data_gen = ImageDataGenerator(rescale=1./255)
train_data = train_data_gen.flow_from_directory(directory='/root_dir/train/',
                                                target_size=(224, 224),
                                                class_mode='categorical',
                                                batch_size=32,
                                                seed=42)

test_data_gen = ImageDataGenerator(rescale=1./255)
test_data = test_data_gen.flow_from_directory(directory='/root_dir/test/',
                                              target_size=(224, 224),
                                              class_mode='categorical',
                                              batch_size=32,
                                              seed=42)
It works fine.
train_data contains 750 images of each class.
However, I need to run fast experiments on only 10 percent of the data.
I need a train_data_10_percent_subset that contains 75 randomly chosen images of each class.
Is there a simple way with ImageDataGenerator to randomly choose 10 percent of the images in each sub-folder of the train directory?
I need a generator that yields 75 images of each class from the train sub-folders.
You can do this:
train_data_gen = ImageDataGenerator(rescale=1./255, validation_split=.1)
train_data = train_data_gen.flow_from_directory(directory='/root_dir/train/',
                                                target_size=(224, 224),
                                                class_mode='categorical',
                                                batch_size=32,
                                                seed=42,
                                                subset='validation')
Setting validation_split to .1 reserves 10% of the data for validation, and setting subset='validation' makes train_data contain that 10% of the training data.
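If you also want the remaining 90% from the same generator, a minimal sketch under the same assumptions as above is to request the complementary subset and check the file counts:

# Minimal sketch: the complementary 90% split from the same generator,
# plus a quick check of how many files each iterator picked up.
train_data_90 = train_data_gen.flow_from_directory(directory='/root_dir/train/',
                                                   target_size=(224, 224),
                                                   class_mode='categorical',
                                                   batch_size=32,
                                                   seed=42,
                                                   subset='training')

print(train_data.samples)      # expected: 225 (75 per class)
print(train_data_90.samples)   # expected: 2025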

Feeding tf.data Dataset with multidimensional output to Keras model

I want to feed a tf.data Dataset to a Keras model, but I get the following error:
AttributeError: 'DatasetV1Adapter' object has no attribute 'ndim'
This dataset will be used to solve a segmentation problem, so both the input and the output will be images (3D tensors).
The dataset is created with this code:
import tensorflow as tf

dataset = tf.data.Dataset.list_files(TRAIN_PATH + "*.png", shuffle=False)

def process_path(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_png(img, channels=3)
    train_image_path = tf.strings.regex_replace(file_path, "image", "mask")
    mask = tf.io.read_file(train_image_path)
    mask = tf.image.decode_png(mask, channels=1)
    mask = tf.squeeze(mask)
    mask = tf.one_hot(tf.cast(mask, tf.int32), Num_Classes, axis=-1)
    return img, mask

dataset = dataset.map(process_path)
dataset = dataset.batch(32, drop_remainder=True)
Taking an item from the dataset shows that I get a tuple containing an input tensor and an output tensor, whose dimensions are correct:
Input: (batch-size, image height, image width, 3 channels)
Output: (batch-size, image height, image width, 4 channels)
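For reference, one way to inspect those shapes without iterating manually (a minimal sketch, assuming the dataset built above and TF 2.x eager execution):

# Minimal sketch: show the (input, output) structure the dataset yields.
print(dataset.element_spec)
# e.g. (TensorSpec(shape=(32, None, None, 3), ...), TensorSpec(shape=(32, None, None, 4), ...))

for img_batch, mask_batch in dataset.take(1):
    print(img_batch.shape, mask_batch.shape)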
When fitting the model I get an error:
model.fit(dataset, epochs = 50)
I've solved the problem by moving to Keras 2.4.3 and TensorFlow 2.2.
Everything was right, but apparently the previous release of Keras did not handle this kind of tf.data dataset correctly.
Here's a tutorial I've found very useful on this.