ValueError: setting an array element with a sequence at fit(X, y) in k-nearest neighbor - numpy

I get an error at this line: neigh.fit(X, y):
ValueError: setting an array element with a sequence.
I checked the fit function and X should be: {array-like, sparse matrix, BallTree, cKDTree}.
My X is a list of lists, where the first element is a solidity number and the second element is a Hu-moments list (7 cells).
If I change it and take only the first Hu-moment value, so that I have a pure list of lists,
it gives this error: query data dimension must match BallTree data dimension.
My code:
listafeaturevector = list()
path = 'imgknn/'
for infile in glob.glob(os.path.join(path, '*.jpg')):
    print("current file is: " + infile)
    gray = cv2.imread(infile, 0)

    element = cv2.getStructuringElement(cv2.MORPH_CROSS, (6, 6))
    graydilate = cv2.erode(gray, element)

    ret, thresh = cv2.threshold(graydilate, 127, 255, cv2.THRESH_BINARY_INV)
    imgbnbin = thresh

    # CONTOURS
    contours, hierarchy = cv2.findContours(imgbnbin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    print(len(contours))

    for i in range(0, len(contours)):
        fv = list()  # 1 feature vector

        # HUMOMENTS
        #print("humoments")
        mom = cv2.moments(contours[i], 1)
        Humoments = cv2.HuMoments(mom)
        #print(Humoments)
        fv.append(Humoments)  # query data dimension must match BallTree data dimension

        # SOLIDITY
        area = cv2.contourArea(contours[i])
        hull = cv2.convexHull(contours[i])  # it has many values
        hull_area = cv2.contourArea(hull)
        solidity = float(area) / hull_area
        fv.append(solidity)
        #fv.append(elongation)

        listafeaturevector.append(fv)
        print("i have done")

print(len(listafeaturevector))
lenmatrice = len(listafeaturevector)

# KNN
X = listafeaturevector
y = [0, 1, 2, 3] * (lenmatrice / 4)

from sklearn.neighbors import KNeighborsClassifier
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y)  # ValueError: setting an array element with a sequence.

print(neigh.predict([[1.1]]))
print(neigh.predict_proba([[0.9]]))
If I try to convert it into a NumPy array:
listafv = np.dstack(listafeaturevector)
listafv = np.rollaxis(listafv, -1)
print(listafv.shape)
data = listafv.reshape((lenmatrice, -1))
print(data.shape)

# KNN
X = data
I get: setting an array element with a sequence.

A couple of suggestions/questions:
Humoments = cv2.HuMoments(mom)
What is the type of the return value Humoments? A float or a list? If it is a float, that is fine.
for each image file
    for i in range(0, len(contours)):
        fv = list()  # 1 feature vector
        ...
        fv.append(Humoments)
        ...
        fv.append(solidity)
        listafeaturevector.append(fv)
The above code does not seem correct. For your problem, I think you need to construct one feature vector per image, so anything related to image i should go into the same feature vector x_i; then you combine all the feature vectors to get the list of feature vectors X. However, you append to listafeaturevector (your X) in the inner-most loop, once per contour, which is obviously not what you want.
Second, you loop over the number of elements in contours; are you sure that number stays the same for each image? Otherwise the number of features (|x_i|) differs across images, and that can cause the error
setting an array element with a sequence.
Third, are you clear about how you want to classify the images? What are the target values/labels of the different images? I see you just set the labels with [0,1,2,3] * (lenmatrice/4). Can you elaborate on what you are trying to do with those images? Do they contain different types of objects? Do they show different patterns? Do the images describe different topics/colors? If so, you give each type a different label - either 0, 1, 2 or 'red', 'white', 'black' (assuming you have only 3 types). The actual values of the labels do not matter; what matters is how many distinct values there are. I am trying to understand what the labels mean in your case.
On the other hand, if you only want to retrieve similar images, you don't need to use a classifier or specify a label for each image. Instead, try to use NearestNeighbors.
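If retrieval is the goal, a minimal sketch of that route could look like the following; it assumes listafeaturevector already holds flat, fixed-length rows (see the next point about structure):

import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.asarray(listafeaturevector, dtype=float)   # shape (n_samples, n_features)

nn = NearestNeighbors(n_neighbors=3).fit(X)
distances, indices = nn.kneighbors(X[:1])         # the 3 samples most similar to the first one
print(indices)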
print(neigh.predict([[1.1]]))
print(neigh.predict_proba([[0.9]]))
Fourth, the above two lines of test code are not correct. You need to pass an X-like object in order to get a prediction from the classifier. That is to say, you need a feature vector x with exactly the same structure as the ones you constructed for your training examples (with all h, e, s values in the same order).
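As a concrete illustration (a sketch based on the code in the question, not a drop-in fix): flatten the 7 Hu moments and append the solidity, so that every fv is a flat list of 8 numbers, and query with a vector of that same length:

fv = list(Humoments.flatten()) + [solidity]   # 8 plain floats per contour, no nested arrays
listafeaturevector.append(fv)

# ... later, after building all feature vectors:
X = np.asarray(listafeaturevector, dtype=float)
neigh.fit(X, y)
print(neigh.predict([X[0]]))                  # the query must also have 8 features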

Related

Computing Bounding Boxes from a Mask-Image (Tensorflow or other)

I'm looking for ways to convert a mask (a Height x Width boolean image) into a series of bounding boxes (see example picture below, which I hand-drew), with boxes encircling the "islands of truth".
Specifically, I'm looking for a way that would work with standard TensorFlow ops (though all input is welcome). I want this so I can convert the model to TFLite without adding custom ops and recompiling from source. But in general it would just be nice to be aware of different ways of doing this.
Notes:
I already have a solution involving non-standard Tensorflow, based on tfa.image.connected_components (see solution here). However that op is not included in Tensorflow Lite. It also feels like it does something slightly harder than necessary (finding connected components feels harder than just outlining blobs on an image without worrying about whether they are connected or not)
I know I haven't specified here exactly how I'd like the boxes generated (e.g. whether separate "yin-yang-style" connected components should have separate boxes even if they overlap, etc.). Really I'm not worried about the details, just that the resulting boxes look "reasonable".
Some related questions (please read before flagging as duplicate!):
Converting a binary mask into a bounding box in tensorflow asks about creating a single bounding box, which is significantly easier.
Generating bounding boxes from heatmap data (similar, but asks the slightly broader question of converting from "heatmap", and does not specify Tensorflow).
Create Bounding Boxes from Image Labels assumes the image has already been segmented into components (called "labels" there)
I'm ideally looking for something that does not need training (e.g. YOLO-style regression) and just works out of the box (heh).
Edit Here is an example mask image: https://github.com/petered/data/blob/master/images/example_mask3.png which can be loaded into a mask with
mask = cv2.imread(os.path.expanduser('~/Downloads/example_mask3.png')).mean(axis=2) > 50
Well, not sure if this is doable with just tensorflow ops, but here is a Python/Numpy implementation (which uses a very inefficient double-for loop). In principle, it should be fast if vectorized (again, not sure if possible) or written in C, because it just does 2 passes over the pixels to compute the boxes.
I'm not sure if this algorithm has an existing name, but if not I would call it Downright Boxing because it involves extending the mask-segments down and to the right in order to find boxes.
Here's the result on the mask in the question (with a few extra shapes added as examples):
def mask_to_boxes(mask: Array['H,W', bool]) -> Array['N,4', int]:
    """ Convert a boolean (Height x Width) mask into a (N x 4) array of NON-OVERLAPPING bounding boxes
    surrounding "islands of truth" in the mask.  Boxes indicate the (Left, Top, Right, Bottom) bounds
    of each island, with Right and Bottom being NON-INCLUSIVE (ie they point to the indices AFTER the island).

    This algorithm (Downright Boxing) does not necessarily put separate connected components into
    separate boxes.

    You can "cut out" the island-masks with
        boxes = mask_to_boxes(mask)
        island_masks = [mask[t:b, l:r] for l, t, r, b in boxes]
    """
    max_ix = max(s + 1 for s in mask.shape)   # Use this to represent background
    # These arrays will be used to carry the "box start" indices down and to the right.
    x_ixs = np.full(mask.shape, fill_value=max_ix)
    y_ixs = np.full(mask.shape, fill_value=max_ix)

    # Propagate the earliest x-index in each segment to the bottom-right corner of the segment
    for i in range(mask.shape[0]):
        x_fill_ix = max_ix
        for j in range(mask.shape[1]):
            above_cell_ix = x_ixs[i-1, j] if i > 0 else max_ix
            still_active = mask[i, j] or ((x_fill_ix != max_ix) and (above_cell_ix != max_ix))
            x_fill_ix = min(x_fill_ix, j, above_cell_ix) if still_active else max_ix
            x_ixs[i, j] = x_fill_ix

    # Propagate the earliest y-index in each segment to the bottom-right corner of the segment
    for j in range(mask.shape[1]):
        y_fill_ix = max_ix
        for i in range(mask.shape[0]):
            left_cell_ix = y_ixs[i, j-1] if j > 0 else max_ix
            still_active = mask[i, j] or ((y_fill_ix != max_ix) and (left_cell_ix != max_ix))
            y_fill_ix = min(y_fill_ix, i, left_cell_ix) if still_active else max_ix
            y_ixs[i, j] = y_fill_ix

    # Find the bottom-right corners of each segment
    new_xstops = np.diff((x_ixs != max_ix).astype(np.int32), axis=1, append=False) == -1
    new_ystops = np.diff((y_ixs != max_ix).astype(np.int32), axis=0, append=False) == -1
    corner_mask = new_xstops & new_ystops
    y_stops, x_stops = np.array(np.nonzero(corner_mask))

    # Extract the boxes, getting the top-left corners from the index arrays
    x_starts = x_ixs[y_stops, x_stops]
    y_starts = y_ixs[y_stops, x_stops]
    ltrb_boxes = np.hstack([x_starts[:, None], y_starts[:, None], x_stops[:, None] + 1, y_stops[:, None] + 1])
    return ltrb_boxes
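As a usage sketch (not part of the original answer; the Array[...] annotations above are just shorthand for NumPy arrays, and the file path is the one from the question):

import os
import cv2
import numpy as np

# Load the example mask from the question
mask = cv2.imread(os.path.expanduser('~/Downloads/example_mask3.png')).mean(axis=2) > 50

boxes = mask_to_boxes(mask)
print(boxes.shape)   # (N, 4): one (Left, Top, Right, Bottom) row per island

# Draw the boxes on a copy of the mask for visual inspection
vis = mask.astype(np.uint8) * 255
for l, t, r, b in boxes:
    cv2.rectangle(vis, (int(l), int(t)), (int(r) - 1, int(b) - 1), 128, 1)
cv2.imwrite('boxes_preview.png', vis)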

Applying Tensorflow Dataset .map() to subsequent dataset elements

I've got a TFRecordDataset and I'm trying to preprocess the features of two subsequent elements by means of the map() API.
dataset_ext = dataset.map(lambda x: tf.py_function(parse_data, [x], [tf.float32]))
As map applies the function parse_data to every dataset element, I don't know what parse_data should look like in order to keep track of the feature extracted from the previous dataset element.
Can anyone help? Thank you
EDIT: I'm working on the Waymo dataset, so each element is a frame. You can refer to https://github.com/Jossome/Waymo-open-dataset-document for its structure.
This is my parse function parse_data:
from waymo_open_dataset import dataset_pb2 as open_dataset
def parse_data(input_data):
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(input_data.numpy()))
    av_speed = (frame.images[0].velocity.v_x, frame.images[0].velocity.v_y, frame.images[0].velocity.v_z)
    return av_speed
I'd like to build a dataset whose features are the car speed and acceleration, defined as the speed variation between subsequent frames (the first value can be 0).
One way I thought of is to give the map function dataset and dataset.skip(1) as inputs, but I'm not sure about it yet.
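For what it's worth, a rough sketch of that zip idea (my own assumption, not a tested pipeline) could pair every frame with the next one and compute the speed difference inside map:

import numpy as np
import tensorflow as tf

def parse_speed(raw_frame):
    # parse_data is the function above; pack its 3 values into one float32 vector
    return np.asarray(parse_data(raw_frame), dtype=np.float32)

# Element t comes from `dataset`, element t+1 from `dataset.skip(1)`
pairs = tf.data.Dataset.zip((dataset, dataset.skip(1)))

def speed_and_accel(prev_raw, curr_raw):
    prev_speed = tf.py_function(parse_speed, [prev_raw], tf.float32)
    curr_speed = tf.py_function(parse_speed, [curr_raw], tf.float32)
    return curr_speed, curr_speed - prev_speed   # (speed, acceleration between consecutive frames)

dataset_ext = pairs.map(speed_and_accel)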
I am not sure, but it might be unnecessary to make your mapped function a tf.py_function. What parse_data should look like depends on your dataset dataset_ext. If it yields, for example, two file paths (one instance of input data and one instance of output data), the mapping function should take 2 arguments and return 2 arguments.
For example: if your dataset contains images and you want them to be randomly cropped each time an example of your dataset is drawn the mapping function looks like this:
def process_img_random_crop(img_in, img_out, output_shape):
    merged = tf.stack([img_in, img_out])
    mergedCrop = tf.image.random_crop(merged, size=(2,) + output_shape)
    img_in_cropped, img_out_cropped = tf.unstack(mergedCrop, 2, 0)
    return img_in_cropped, img_out_cropped
I call it as follows:
image_ds_test = image_ds_test.map(lambda i, o: process_img_random_crop(i, o, output_shape=(64, 64, 1)), num_parallel_calls=tf.data.experimental.AUTOTUNE)
What exactly is your plan with dataset_ext and what does it contain?
Edit:
Okay, got what you meant with the two frames. The map function is applied to each entry of your dataset separately, so if you need cross-entry information, a single entry of your dataset needs to contain two frames. With this more complicated set-up, I would suggest you use a tensorflow Sequence: the explanation from the tensorflow team is pretty straightforward. Hope this helps!
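A bare-bones sketch of what such a Sequence could look like (my own illustration; it assumes you have already extracted the per-frame speeds into a NumPy array of shape (num_frames, 3)):

import numpy as np
import tensorflow as tf

class ConsecutiveFrameSequence(tf.keras.utils.Sequence):
    """Each item is built from a pair of consecutive frames: (speed, acceleration)."""

    def __init__(self, speeds, batch_size):
        self.speeds = speeds          # shape (num_frames, 3), extracted beforehand
        self.batch_size = batch_size

    def __len__(self):
        # one usable example per consecutive pair of frames
        return int(np.ceil((len(self.speeds) - 1) / self.batch_size))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = min(start + self.batch_size, len(self.speeds) - 1)
        speed = self.speeds[start + 1:stop + 1]
        accel = speed - self.speeds[start:stop]
        return speed, accel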

What is an effective way to pad a variable length dataset for batching in Tensorflow that does not have exact

I am trying to integrate the Dataset API into my input pipeline. Before this integration, the program used tf.train.batch_join(), which had dynamic padding enabled. Hence, this would batch elements and pad them according to the largest one in the mini-batch.
image, width, label, length, text, filename = tf.train.batch_join(
    data_tuples,
    batch_size=batch_size,
    capacity=queue_capacity,
    allow_smaller_final_batch=final_batch,
    dynamic_pad=True)
For the Dataset API, however, I was unable to find an exact alternative to this. I cannot use padded batch, since the dimensions of the images do not have a set threshold - the image width could be anything. My partner and I were able to come up with a workaround using tf.contrib.data.bucket_by_sequence_length(). Here is an excerpt:
dataset = dataset.apply(tf.contrib.data.bucket_by_sequence_length(
    element_length_func=_element_length_fn,
    bucket_batch_sizes=np.full(len([0]) + 1, batch_size),
    bucket_boundaries=[0]))
What this basically does is dump all the elements into the overflow bucket, since the boundary is set to 0. It then batches from that bucket, and since bucketing pads the elements according to the largest one, everything ends up padded.
Is there a better way to achieve this functionality?
I met exactly the same problem, and now I know how to solve it. If your input_data has only one dimension of variable length, try passing tf.contrib.data.bucket_by_sequence_length to the dataset.apply() function, and make bucket_batch_sizes = [batch_size] * (len(buckets) + 1). There is also another way to do it, just as @mrry has said in the comments.
iterator = dataset.make_one_shot_iterator()
item = iterator.get_next()
padded_shapes = []
for i in item:
    padded_shapes.append(i.get_shape())
padded_shapes = tf.contrib.framework.nest.pack_sequence_as(item, padded_shapes)
dataset = dataset.padded_batch(batch_size, padded_shapes)
If one dimension in the shape of a tensor is None or -1, then padded_batch will pad the tensor on that dimension to the max length in the batch.
My training data has two features of variable length, and this method works fine.
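To illustrate that behaviour with a toy sketch (made-up shapes, TF 2.x eager style, not the asker's pipeline): mark the variable width dimension as None, and padded_batch pads it to the longest element in each batch.

import tensorflow as tf

# Toy "images": fixed height 32, variable width
widths = [10, 17, 25, 8]
dataset = tf.data.Dataset.from_generator(
    lambda: (tf.ones([32, w]) for w in widths),
    output_types=tf.float32,
    output_shapes=tf.TensorShape([32, None]))

# None (or -1) means "pad this dimension to the longest element in the batch"
batched = dataset.padded_batch(2, padded_shapes=[32, None])

for batch in batched:
    print(batch.shape)   # (2, 32, 17) then (2, 32, 25)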

How to detect only one specified class instead of all classes in tensorflow object detection?

I trained my dataset with six classes and it works fine in detecting the different classes. Is it possible to modify the object detector script to detect only one specified class instead of all six classes? Or must I retrain my dataset for one class again from scratch? Thanks a lot for any recommendation.
Here is the drawing part of my object detector script:
vis_util.visualize_boxes_and_labels_on_image_array(
    image,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=1,
    agnostic_mode=False,
    groundtruth_box_visualization_color='black',
    skip_scores=False,
    skip_labels=False,
    min_score_thresh=0.80)
Unless you change the code, you're going to get probabilities for all the classes. Of course, you can select the highest one among them. Makes sense?
It might not be the best solution to this problem, but you could try making a copy of the label_map.pbtxt-file (one to alter and one for safe-keeping) and delete all labels but the one you are interested in, in one of them.
Then you can lower the min_score_thresh to maybe 0.1 or something (or not modify this parameter at all), and only detect the one label you kept in the label_map.pbtxt-file.
If you are using the Object detection API from GitHub, the mscoco_label_map.pbtxt-file can be found in models-master/research/object_detection/data/ (remember to open it with a text-editor)
Before you call the visualization function, add the following code:
objectOfInterest = 1   # class number of the object of interest, as per the label file
box = np.asarray(boxes)
cls = np.asarray(classes).astype(np.int32)
scr = np.asarray(scores)
boolar = (cls == objectOfInterest)   # boolean mask selecting only the class of interest
classes = np.extract(boolar, cls)
scores = np.extract(boolar, scr)
boxes = np.extract(boolar, box)
The code suggested by Suman was almost perfect, but the "boxes" array needs to keep its 4-value box coordinate tuples. To select a specific class, you also need to select the matching box coordinate tuples. So I've added some lines before the code suggested by Suman. Check the code below:
objectOfInterest = 1   # class number of the object of interest, as per the label file
box = np.asarray(boxes)
cls = np.asarray(classes).astype(np.int32)
scr = np.asarray(scores)

# keep only the box coordinate tuples that belong to the class of interest
boxes = []
for i in range(len(cls)):
    if cls[i] == objectOfInterest:
        boxes.append(box[i])
boxes = np.array(boxes)

boolar = (cls == objectOfInterest)
classes = np.extract(boolar, cls)
scores = np.extract(boolar, scr)
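For what it's worth, a simpler variant (my own suggestion, not from either answer) is plain NumPy boolean indexing, which keeps each 4-value box row intact without an explicit loop:

objectOfInterest = 1                        # class id of interest, as per the label file
boxes = np.squeeze(boxes)                   # shape (N, 4)
classes = np.squeeze(classes).astype(np.int32)
scores = np.squeeze(scores)

keep = classes == objectOfInterest          # boolean mask over the N detections
boxes, classes, scores = boxes[keep], classes[keep], scores[keep]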

One hot encoding variable n m mode numbers

I'm doing image classification with image labels of mode numbers m and n. I'm also varying the convnet training to take in different maximum mode numbers to test it out, before fully training on a massive data set.
Silly question, but given a label (m, n), how do I one-hot encode it into an array of length n*m?
Thanks.
Edit: Yikes, this is actually really simple:
modeNum = ...  # some integer

def getLabel(n, m):
    array = np.zeros((modeNum, modeNum), dtype='int8')
    array[n, m] = 1
    label = np.ndarray.flatten(array)
    return label
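A quick usage sketch (with an assumed modeNum of 3, using the function above) showing that the single 1 lands at index n*modeNum + m, so (n, m) can be recovered with np.argmax and divmod:

import numpy as np

modeNum = 3                      # assumed maximum mode number for this example

label = getLabel(1, 2)
print(label)                     # [0 0 0 0 0 1 0 0 0] -> the 1 sits at index 1*3 + 2 = 5
n, m = divmod(int(np.argmax(label)), modeNum)
print(n, m)                      # 1 2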