raise ValueError('Image with id {} already added.'.format(image_id)) in Tensorflow object detection api - tensorflow

Training runs fine with ssd_mobilenet_v1_coco in the TensorFlow Object Detection API.
I get this error while testing:
File "/home/hipstudents/anaconda3/envs/tensorflow_gpuenv/lib/python3.6/site-packages/object_detection-0.1-py3.6.egg/object_detection/utils/object_detection_evaluation.py", line 203, in add_single_ground_truth_image_info
raise ValueError('Image with id {} already added.'.format(image_id))
Please help.
System Info:
What is the top-level directory of the model you are using: ~/
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes, written scripts to convert .xml files to tf record
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): Compiled from source
TensorFlow version (use command below): 1.11.0
Bazel version (if compiling from source): 0.16.1
CUDA/cuDNN version: 9.0.176, cuDNN: 9.0
GPU model and memory: GeForce GTX1080Ti, 11GB
Exact command to reproduce: python eval.py --logtostderr --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --checkpoint_dir=training/ --eval_dir=eval/
I created the dataset manually, labeled it with labelImg, generated a CSV file of image annotations and file names, and then created the TFRecords. I followed this tutorial: https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
My TFRecord generator for the training and test images:
"""
Usage:
# From tensorflow/models/
# Create train data:
python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=train.record
# Create test data:
python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import
import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict
flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'Field':
        return 1
    else:
        return None
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'Images')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

if __name__ == '__main__':
    tf.app.run()

In the ssd_mobilenet_v1_coco.config file, num_examples was 8000, but my test dataset has only 121 samples. I forgot to update that and got a kind of error I couldn't find on the Internet. Since it is such a silly mistake, probably very few people have made it, but this answer might help someone who does. I changed the following in the config file and the error was resolved:
eval_config: {
  # Number of test images. In my case 121; previously it was 8000.
  num_examples: 121
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
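If you are not sure how many examples an evaluation TFRecord actually contains, you can count them directly. Here is a minimal sketch using the TF 1.x API already used in the scripts above; 'test.record' is the file name from the usage notes and stands in for your own:
import tensorflow as tf

# Count the records in the evaluation TFRecord so that
# eval_config.num_examples can be set to match (121 in my case).
num_examples = sum(1 for _ in tf.python_io.tf_record_iterator('test.record'))
print(num_examples)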

In my case, the problem was that I had included images multiple times when constructing the tfrecord files. Though it is obvious now, I hadn't noticed that many categories of the Open Images Dataset share the same images (which then carry the same id in the evaluation, hence the error). Once I corrected the algorithm creating the tfrecords, the error was gone.
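To check a TFRecord for this, here is a minimal sketch (TF 1.x, matching the scripts above) that counts how often each image/source_id occurs; 'test.record' is an assumed file name:
import collections
import tensorflow as tf

# Any source_id seen more than once will trigger the
# "Image with id ... already added." ValueError during evaluation.
counts = collections.Counter()
for record in tf.python_io.tf_record_iterator('test.record'):
    example = tf.train.Example.FromString(record)
    counts[example.features.feature['image/source_id'].bytes_list.value[0]] += 1
duplicates = {k: v for k, v in counts.items() if v > 1}
print(duplicates or 'no duplicate ids found')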

I solved the problem with this article: https://www.coder.work/article/3120495
It came down to just adding two lines to the eval_config:
eval_config {
  num_examples: 50
  use_moving_averages: false
  metrics_set: "coco_detection_metrics"
}

Related

AttributeError: 'DataFrame' object has no attribute '_data' [Not a duplicate]

I was trying to run main.py but it threw an AttributeError.
My Python version is Python 3.5. I am using the CNTK Docker release 2.6-cpu-python3.5. I cannot update the Python version because of CNTK. It only supports Python 3.5 and will only run in Ubuntu 16.04.
Pandas version: pandas==0.25.3
The Error
Traceback (most recent call last):
  File "/workspace/main.py", line 5, in <module>
    from model import extract_patches, score_patch, del_cache
  File "/workspace/model.py", line 2, in <module>
    from regressionModel import extract_features, predict_label
  File "/workspace/regressionModel.py", line 26, in <module>
    regression_model = read_model['model'][0]
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py", line 2898, in __getitem__
    if self.columns.is_unique and key in self.columns:
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 5063, in __getattr__
    return object.__getattribute__(self, name)
  File "pandas/_libs/properties.pyx", line 65, in pandas._libs.properties.AxisProperty.__get__
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 5063, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute '_data'
main.py
import os
import flask
import numpy as np
from flask import jsonify, request
from model import extract_patches, score_patch, del_cache
app = flask.Flask(__name__)
@app.route('/url/<path:argument>')
def url(argument):
    # create a patch folder
    patch_path = './patches'
    if not os.path.exists(patch_path):
        os.mkdir(patch_path)
    # get image url from the query string
    imageURL = request.url.split('=', 1)[1]
    # extract patches from imageURL
    dimension, face_loc, image_dim = extract_patches(imageURL)
    # score each patch
    patch_score = score_patch(patch_path)
    # delete the downloaded image and the patches from local
    del_cache(patch_path)
    if os.path.exists('temp.jpg'):
        os.remove('temp.jpg')
    data = dict()
    data['patch_score'] = []
    for key in dimension:
        tmp = []
        tmp[:] = dimension[key]
        tmp.append(patch_score[key])
        data['patch_score'].append(tmp)
    data['image_score'] = round(np.mean(list(patch_score.values())), 2)
    data['face_loc'] = face_loc['face_loc']
    data['img_dim'] = image_dim
    return jsonify(patch_score=str(data['patch_score']), image_score=str(data['image_score']), face_loc=str(data['face_loc']), image_dim=str(data['img_dim']))

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=9580)  # port number can be changed in your case
model.py
import getPatches
from regressionModel import extract_features, predict_label
import os
import shutil
def extract_patches(imageURL):
    patch_path = './patches'
    dimension_dict = dict()
    face_dict = dict()
    image_dim = []
    try:
        dim, face, img = getPatches.extract_patches(imageURL, dimension_dict, face_dict, image_dim, patch_path)
        print("extract patches pass")
    except:
        print('cannot extract patches from the image')
    return dim, face, img

def score_patch(patch_path):
    patch_score = dict()
    for file in next(os.walk(patch_path))[2]:
        file_path = os.path.join(patch_path, file)
        score_features = extract_features(file_path)[0].flatten()  # extract features from CNTK pretrained model
        pred_score_label = predict_label(score_features)  # score the extracted features using trained regression model
        patch_score[file.split('.')[0]] = float("{0:.2f}".format(pred_score_label[0]))
    return patch_score

def infer_label(patch_score, label_mapping):
    max_score_name, max_score_value = max(patch_score.items(), key=lambda x: x[1])
    pred_label = label_mapping[round(max_score_value) - 1]
    return pred_label

def del_cache(patch_folder):
    shutil.rmtree(patch_folder)
    return
regressionModel.py
import numpy as np
import pandas as pd
import cntk as C
from PIL import Image
import pickle
from cntk import load_model, combine
import cntk.io.transforms as xforms
from cntk.logging import graph
from cntk.logging.graph import get_node_outputs
pretrained_model = 'ResNet152_ImageNet_Caffe.model'
pretrained_node_name = 'pool5'
regression_model = 'cntk_regression.dat'
image_width = 224
image_height = 224
# load CNTK pretrained model
#model_file = os.path.join(pretrained_model_path, pretrained_model_name)
loaded_model = load_model(pretrained_model) # a full path is required
node_in_graph = loaded_model.find_by_name(pretrained_node_name)
output_nodes = combine([node_in_graph.owner])
# load the stored regression model
read_model = pd.read_pickle(regression_model)
regression_model = read_model['model'][0]
train_regression = pickle.loads(regression_model)
def extract_features(image_path):
    img = Image.open(image_path)
    resized = img.resize((image_width, image_height), Image.ANTIALIAS)
    bgr_image = np.asarray(resized, dtype=np.float32)[..., [2, 1, 0]]
    hwc_format = np.ascontiguousarray(np.rollaxis(bgr_image, 2))
    arguments = {loaded_model.arguments[0]: [hwc_format]}
    output = output_nodes.eval(arguments)
    return output

def predict_label(features):
    return train_regression.predict(features.reshape(1, -1))
https://pypi.org/project/cntk/#files has CNTK 2.7 for Python 3.6. Still an obsolete version, but not quite as obsolete.
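For background, this AttributeError typically means the pickle was written by a newer pandas than the one reading it: newer pandas stores a DataFrame's internals under _mgr, while pandas 0.25.3 looks for _data. A minimal workaround sketch, assuming you can temporarily run the pandas version that created cntk_regression.dat:
import pandas as pd

# Run this with the pandas version that originally wrote the pickle,
# then load the re-saved file from the pandas 0.25.3 environment.
read_model = pd.read_pickle('cntk_regression.dat')
read_model.to_pickle('cntk_regression_py35.dat')  # hypothetical output name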

Tensorflow lite only using the first item in the labelmap.txt file when identifying items

I have installed tensorflow 1.15 and created a custom model. I converted it into a .tflite file so tensorflow lite can read it. Then I ran the following code:
import os
import argparse
import cv2
import numpy as np
import sys
import glob
import importlib.util
parser = argparse.ArgumentParser()
parser.add_argument('--modeldir', help='Folder the .tflite file is located in', required=True)
parser.add_argument('--graph', help='Name of the .tflite file, if different than detect.tflite', default='detect.tflite')
parser.add_argument('--labels', help='Name of the labelmap file, if different than labelmap.txt', default='labelmap.txt')
parser.add_argument('--threshold', help='Minimum confidence threshold for displaying detected objects', default=0.5)
parser.add_argument('--image', help='Name of the single image to perform detection on. To run detection on multiple images, use --imagedir', default=None)
parser.add_argument('--imagedir', help='Name of the folder containing images to perform detection on. Folder must contain only images.', default=None)
parser.add_argument('--edgetpu', help='Use Coral Edge TPU Accelerator to speed up detection', action='store_true')
args = parser.parse_args()
MODEL_NAME = args.modeldir
GRAPH_NAME = args.graph
LABELMAP_NAME = args.labels
min_conf_threshold = float(args.threshold)
use_TPU = args.edgetpu
IM_NAME = args.image
IM_DIR = args.imagedir
if (IM_NAME and IM_DIR):
    print('Error! Please only use the --image argument or the --imagedir argument, not both. Issue "python TFLite_detection_image.py -h" for help.')
    sys.exit()
if (not IM_NAME and not IM_DIR):
    IM_NAME = 'test1.jpg'

pkg = importlib.util.find_spec('tflite_runtime')
if pkg:
    from tflite_runtime.interpreter import Interpreter
    if use_TPU:
        from tflite_runtime.interpreter import load_delegate
else:
    from tensorflow.lite.python.interpreter import Interpreter
    if use_TPU:
        from tensorflow.lite.python.interpreter import load_delegate

if use_TPU:
    if (GRAPH_NAME == 'detect.tflite'):
        GRAPH_NAME = 'edgetpu.tflite'

CWD_PATH = os.getcwd()
if IM_DIR:
    PATH_TO_IMAGES = os.path.join(CWD_PATH, IM_DIR)
    images = glob.glob(PATH_TO_IMAGES + '/*')
elif IM_NAME:
    PATH_TO_IMAGES = os.path.join(CWD_PATH, IM_NAME)
    images = glob.glob(PATH_TO_IMAGES)

PATH_TO_CKPT = os.path.join(CWD_PATH, MODEL_NAME, GRAPH_NAME)
PATH_TO_LABELS = os.path.join(CWD_PATH, MODEL_NAME, LABELMAP_NAME)

with open(PATH_TO_LABELS, 'r') as f:
    labels = [line.strip() for line in f.readlines()]
if labels[0] == '???':
    del(labels[0])

if use_TPU:
    interpreter = Interpreter(model_path=PATH_TO_CKPT, experimental_delegates=[load_delegate('libedgetpu.so.1.0')])
    print(PATH_TO_CKPT)
else:
    interpreter = Interpreter(model_path=PATH_TO_CKPT)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]
floating_model = (input_details[0]['dtype'] == np.float32)
input_mean = 127.5
input_std = 127.5

for image_path in images:
    image = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    imH, imW, _ = image.shape
    image_resized = cv2.resize(image_rgb, (width, height))
    input_data = np.expand_dims(image_resized, axis=0)
    if floating_model:
        input_data = (np.float32(input_data) - input_mean) / input_std
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    boxes = interpreter.get_tensor(output_details[0]['index'])[0]  # Bounding box coordinates of detected objects
    classes = interpreter.get_tensor(output_details[1]['index'])[0]  # Class index of detected objects
    scores = interpreter.get_tensor(output_details[2]['index'])[0]  # Confidence of detected objects
    for i in range(len(scores)):
        if ((scores[i] > min_conf_threshold) and (scores[i] <= 1.0)):
            ymin = int(max(1, (boxes[i][0] * imH)))
            xmin = int(max(1, (boxes[i][1] * imW)))
            ymax = int(min(imH, (boxes[i][2] * imH)))
            xmax = int(min(imW, (boxes[i][3] * imW)))
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (10, 255, 0), 2)
            object_name = labels[int(classes[i])]  # Look up object name from "labels" array using class index
            label = '%s: %d%%' % (object_name, int(scores[i] * 100))  # Example: 'person: 72%'
            labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2)  # Get font size
            label_ymin = max(ymin, labelSize[1] + 10)  # Make sure not to draw label too close to top of window
            cv2.rectangle(image, (xmin, label_ymin - labelSize[1] - 10), (xmin + labelSize[0], label_ymin + baseLine - 10), (255, 255, 255), cv2.FILLED)  # Draw white box to put label text in
            cv2.putText(image, label, (xmin, label_ymin - 7), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 0), 2)
    cv2.imshow('Object detector', image)
    if cv2.waitKey(0) == ord('q'):
        break
cv2.destroyAllWindows()
Now, my custom model seems to work. It locates the items in the image correctly, but it labels everything with the first item in labelmap.txt. For example:
labelmap.txt:
key
remote
The model identifies the remotes in the images but labels them as "key" because that is the first entry in labelmap.txt. I don't know why this is happening; can someone please help me? I am sorry if anything is unclear. Please let me know and I will try my best to clarify.
I followed this guide: https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi
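One way to narrow this down (a diagnostic sketch, not a confirmed fix) is to print the raw class indices the interpreter returns, reusing the variables from the loop above. If every index is 0, the model itself is predicting class 0 for everything; if the indices vary, the index-to-line mapping of labelmap.txt is off (for example, a '???' background line being present or absent shifts everything by one):
# Diagnostic sketch: dump what the model actually returns per detection.
for i in range(len(scores)):
    if scores[i] > min_conf_threshold:
        print('raw class index:', int(classes[i]),
              'score: %.2f' % scores[i],
              'label:', labels[int(classes[i])])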

I need help.. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 0: invalid start byte

from __future__ import division
from __future__ import print_function
from __future__ import absolute_import
import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict
flags = tf.compat.v1.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('image_dir', '', 'Path to images')
FLAGS = flags.FLAGS
# replace row_label with the name you annotated your images as
def class_text_to_int(row_label):
    if row_label == 'Masked':
        return 1
    elif row_label == 'No_Masked':
        return 2
    else:
        return None
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.io.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer = tf.io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

if __name__ == '__main__':
    tf.compat.v1.app.run()
This is my code, named generate_tfrecord.py.
I downloaded it from GitHub as my first TensorFlow TFRecord-making example, but it raises the error above.
I am Korean, and I found that this error can occur when the computer name contains Korean characters.
But when I typed 'hostname' in my cmd, it returned 'DESKTOP-7AU~~~', which does not include Korean letters.
If you comment with the code or information you need, I will try to provide it.
In my images/all folder there are 764 pairs of image + XML files, and I have already run "xml_to_csv.py".
The code is from https://github.com/Bengemon825/TF_Object_Detection2020
The simplest way: rename your hostname using only ASCII characters.
You can search Google for how to rename a hostname.
This problem is caused by Python reading non-Unicode characters that cannot be decoded as UTF-8.
I had a very similar problem and here is how I solved it; it took me many hours to figure out.
If you are a Mac user, macOS keeps 'invisible' folder-organizing files named .DS_Store in every folder. When iterating through your images folder, the code runs into these .DS_Store files, which the utf-8 decoder cannot decode. Deleting them is totally harmless; they do in fact re-appear, but you don't have to worry about that.
So you can get rid of them like this:
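(The original post's exact command was not preserved; below is a minimal Python sketch, where 'Images' stands in for your images folder.)
import os

# Walk the images folder and delete any macOS .DS_Store files found.
for root, dirs, files in os.walk('Images'):
    for name in files:
        if name == '.DS_Store':
            os.remove(os.path.join(root, name))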
OR (I preferred this option once I had figured out the problem): in your code, explicitly bypass them with a check that only admits the .xml, .csv, or .txt files you are actually working with in your images folder/directory. Something like:
path = 'path to folder containing your .xml files or .csv files or .txt files'
for fname in os.listdir(path):
    if not fname.endswith('.xml'):
        continue  # skip .DS_Store and anything else that is not an annotation file
I have also realized that when people use this generate_tfrecord.py as-is, many tend to forget to fill in their file paths correctly. The same happens with the create_pascal_tf_record.py script of the object_detection API for TensorFlow.
For example, in your code above, flags.DEFINE_string('csv_input', '', 'Path to the CSV input') needs the empty '' filled in with your CSV directory path, e.g. flags.DEFINE_string('csv_input', 'add your csv directory path here', 'Path to the CSV input'). You have to do the same for all the flags.DEFINE_string instances, or else spell the paths out explicitly on the command line if you don't want to rely on the defaults.
I hope this is helpful to anyone using a Mac and running into all sorts of UnicodeDecodeError for TFRecord files. I'm not sure whether Windows users run into something similar. There could also be other causes, but for me this happened to be it.

Tensorflow Object-Detection Fine-Tuning leads to incorrect accuracy values

I am working with the TensorFlow Object Detection API and want to use a Faster R-CNN ResNet101 model pre-trained on Kitti image data and fine-tune it on Cityscapes image data. I downloaded the pre-trained model here.
This script creates the tfrecord files. I use this script to create tfrecord files from Cityscapes (CS) images.
The CS tf_records are afterwards used to fine-tune the pre-trained ResNet model. For this task, I use this command:
python3.5 model_main.py --pipeline_config_path={Path to config file in ../samples/configs/} --model_dir={Output directory} --num_train_steps={Train Steps} --sample_1_of_n_eval_examples=1 --alsologtostderr
Using only CS training and validation data leads to a COCO accuracy of -1.000:
Average Precision (AP) #[ IoU=0.5:0.95 | area=all | maxDets=100 ] = -1.000
....
I tried different things:
Train on CS data and validate on Kitti data. This led to a COCO accuracy that is not -1.000 but very low, between 0.01 and 1.5% (after 10,000 training steps).
Looked at the Tensorboard visualizations. The loss falls from 0.05 to 0.01 over the first 1,500 iterations, then stays around 2.5e-4 for the last 8,500 iterations and does not change much. (I would upload an image if I knew how.)
Fine-tuned the pre-trained model with manipulated Kitti data. I changed the script that creates the Kitti tfrecord files: I deleted all of the unused fields (like 3D annotations and so on) so that the records have similar content to the CS tfrecords I created (code below). Using these manipulated Kitti data also leads to a validation accuracy that seems normal (around 70-80%). Therefore, I expect that the error is not caused by a missing attribute in the tfrecords.
Inference on the CS data with the pre-trained ResNet model leads to an accuracy around 20%, which is what I expect. Kitti inference leads to an accuracy around 85%.
Using CS tfrecords with the following content per image:
tf_example = tf.train.Example(features=tf.train.Features(feature={
    'image/height': dataset_util.int64_feature(height),
    'image/width': dataset_util.int64_feature(width),
    'image/filename': dataset_util.bytes_feature(filename.encode('utf8')),
    'image/source_id': dataset_util.bytes_feature(filename.encode('utf8')),
    'image/encoded': dataset_util.bytes_feature(encoded_image_data),
    'image/format': dataset_util.bytes_feature(image_format.encode('utf8')),
    'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
    'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
    'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
    'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
    'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
    'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
    'image/object/class/label': dataset_util.int64_list_feature(classes),
}))
return tf_example
Using this code to encode an image:
with tf.gfile.GFile(os.path.join(image_path, '{}'.format(currentImageName)), 'rb') as fid:
    encoded_image_data = fid.read()
encoded_image_io = io.BytesIO(encoded_image_data)
Could the encoding of the data be the reason? Or what else could be the source of the error? As mentioned, I tried several things and none of them worked as expected. Fine-tuning should not be that hard, or am I missing some point?
As mentioned in point 4, I tested the inference and the tf_record files, and therefore I expect that it is possible to fine-tune the model.
In general, I expect the accuracy not to be close to 0% after 10,000 iterations.
Everything looks a bit strange and I do not know what the error is. Therefore, I would appreciate any hint/remark/solution for this issue.
EDIT:
def create_tf_example(currentName, anno_path, image_path):
    currentNameSplit = currentName.split('.')[0]
    currentImageName = currentNameSplit + '.png'
    with tf.gfile.GFile(os.path.join(image_path, '{}'.format(currentImageName)), 'rb') as fid:
        encoded_image_data = fid.read()
    encoded_image_io = io.BytesIO(encoded_image_data)
    image = Image.open(encoded_image_io)
    image = np.asarray(image)
    width = int(image.shape[1])
    height = int(image.shape[0])
    filename = os.path.join(image_path, '{}'.format(currentImageName))
    image_format = 'png'  # b'jpeg' or b'png'
    with open(anno_path + currentName) as file:
        lines = file.readlines()
    xmins = []         # List of normalized left x coordinates in bounding box (1 per box)
    xmaxs = []         # List of normalized right x coordinates in bounding box (1 per box)
    ymins = []         # List of normalized top y coordinates in bounding box (1 per box)
    ymaxs = []         # List of normalized bottom y coordinates in bounding box (1 per box)
    classes_text = []  # List of string class name of bounding box (1 per box)
    classes = []       # List of integer class id of bounding box (1 per box)
    for li in range(len(lines)):
        print('Lines[li]: {}'.format(lines[li]))
        xmins.append(float(lines[li].split()[0]) / width)
        xmaxs.append(float(lines[li].split()[2]) / width)
        ymins.append(float(lines[li].split()[1]) / height)
        ymaxs.append(float(lines[li].split()[3]) / height)
        classID = lines[li].split()[4]
        if int(classID) == 0:
            className = 'Car'
            classes_text.append(className.encode('utf8'))
            classID = 0
            classes.append(classID + 1)  # add 1 because class 0 is always reserved for 'background'
        elif int(classID) == 1:
            className = 'Person'
            classes_text.append(className.encode('utf8'))
            classID = 1
            classes.append(classID + 1)
        else:
            print('Error with Image Annotations in {}'.format(currentName))
    difficult_obj = [0] * len(xmins)
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename.encode('utf8')),
        'image/source_id': dataset_util.bytes_feature(filename.encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format.encode('utf8')),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer_training = tf.python_io.TFRecordWriter(FLAGS.output_path_Training)
    writer_valid = tf.python_io.TFRecordWriter(FLAGS.output_path_Test)
    writer_test = tf.python_io.TFRecordWriter(FLAGS.output_path_Valid)
    allAnnotationFiles = []
    os.chdir(FLAGS.anno_path)
    for file in sorted(glob.glob("*.{}".format('txt'))):
        allAnnotationFiles.append(file)
    counter = 0
    for currentName in allAnnotationFiles:
        if counter < 2411:
            tf_example = create_tf_example(currentName, FLAGS.anno_path, FLAGS.image_path)
            writer_training.write(tf_example.SerializeToString())
            counter += 1
        elif counter > 2411 and counter < 2972:
            tf_example = create_tf_example(currentName, FLAGS.anno_path, FLAGS.image_path)
            writer_valid.write(tf_example.SerializeToString())
            counter += 1
        elif counter <= 3475:
            tf_example = create_tf_example(currentName, FLAGS.anno_path, FLAGS.image_path)
            writer_test.write(tf_example.SerializeToString())
            counter += 1
    writer_training.close()
    writer_test.close()
    writer_valid.close()

if __name__ == '__main__':
    tf.app.run()
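Since the symptoms point at the evaluation data rather than the model, one sanity check is to parse a few examples back out of a finished tfrecord file and confirm that the fields the evaluator needs survive the round trip. A minimal sketch using the same TF 1.x API as above; 'train.record' is a placeholder path:
import tensorflow as tf

# Read the first three examples back and print the fields used in evaluation.
for i, record in enumerate(tf.python_io.tf_record_iterator('train.record')):
    example = tf.train.Example.FromString(record)
    feat = example.features.feature
    print('source_id:', feat['image/source_id'].bytes_list.value[0])
    print('size:', feat['image/width'].int64_list.value[0], 'x',
          feat['image/height'].int64_list.value[0])
    print('labels:', list(feat['image/object/class/label'].int64_list.value))
    print('xmins:', list(feat['image/object/bbox/xmin'].float_list.value))
    if i >= 2:
        break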

HOG + SVM training with INRIA dataset, TypeError: samples is not a numpy array, neither a scalar

I'm working on pedestrian detection with a team. I am trying to figure out an error that keeps showing up: "TypeError: samples is not a numpy array, neither a scalar", which points to the line svm.train(X_data, cv2.ml.ROW_SAMPLE, labels12).
I tried following dozens of online guides but I still couldn't solve the problem, and I'm also very new to this.
import cv2
import numpy as np
from skimage import feature
from skimage import exposure
import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
# training
X_data = []
labels1 = []
label = []
files = glob.glob("new_pos_1/crop*.PNG")
for myFile in files:
    # print(myFile)
    image = cv2.imread(myFile)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    X_data.append(image)
    labels1.append('Pedestrian')
print('X_data shape:', np.array(X_data).shape)
labels12 = np.array([labels1])
print('labels12 shape:', np.array(labels12).shape)
print('labels shape:', np.array(labels1).shape)

# Testing
Y_data = []
files = glob.glob("new_pos_1/person*.PNG")
for myFile in files:
    # print(myFile)
    image = cv2.imread(myFile)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    Y_data.append(image)
    label.append('Pedestrian')
print('Y_data shape:', np.array(Y_data).shape)
print('label shape:', np.array(label).shape)

hog_features = []
for image in np.array(X_data):
    (fd, hogImage) = feature.hog(image, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2),
                                 transform_sqrt=True, block_norm="L2-Hys", visualise=True)
    hogImage = exposure.rescale_intensity(hogImage, out_range=(0, 255))
    hogImage = hogImage.astype("uint8")
    hog_features.append(fd)
print("I'm done hogging")
print(hog_features)

svm = cv2.ml.SVM_create()
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setType(cv2.ml.SVM_C_SVC)
svm.setC(2.67)
svm.setGamma(5.383)
print("Done initializing SVM parameters")

# Train SVM on training data
svm.train(X_data, cv2.ml.ROW_SAMPLE, labels12)
print("Done trainning")
svm.save('svm_data.dat')
print("SAVED.")
#testResponse = svm.predict(testData)[1].ravel()
cv2.waitKey(0)
The line near the beginning that says labels12 = np.array([labels1]) is something I added to try to fix the error, to no avail.
This is the original website that helped me write this code: https://www.learnopencv.com/handwritten-digits-classification-an-opencv-c-python-tutorial/
You should also do X_data2 = np.array([X_data]) and call svm.train(X_data2, cv2.ml.ROW_SAMPLE, labels12).
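For reference, cv2.ml.SVM's train() expects a 2-D float32 sample matrix and integer responses, so the usual missing step is converting the Python lists. Below is a minimal sketch that trains on the HOG features computed above, under two assumptions not in the original post: every descriptor fd has the same length, and the single class 'Pedestrian' is mapped to the integer 1:
import numpy as np
import cv2

# Stack the HOG descriptors into an (n_samples, n_features) float32 matrix.
samples = np.array(hog_features, dtype=np.float32)
# cv2's SVM needs integer class ids, not strings like 'Pedestrian'.
responses = np.array([1] * len(hog_features), dtype=np.int32)

svm = cv2.ml.SVM_create()
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setType(cv2.ml.SVM_C_SVC)
svm.train(samples, cv2.ml.ROW_SAMPLE, responses)
svm.save('svm_data.dat')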