I used the TensorFlow label_image example to detect and localize objects of 10 classes in an image. Now I want to remove the multiple rectangles predicted for a single object using tensorflow::ops::NonMaxSuppression, but I don't know how to use it in my code. Please help me solve this. Like in this picture:

You can use the function below to draw only the boxes whose score exceeds a threshold. It is taken from the TensorFlow Object Detection API.
import collections
import six

# Taken from the TensorFlow Object Detection API
# (object_detection/utils/visualization_utils.py). STANDARD_COLORS,
# _get_multiplier_for_color_randomness and the draw_*_on_image_array helpers
# are defined in that same module.
def visualize_boxes_and_labels_on_image_array(
        image,
        boxes,
        classes,
        scores,
        category_index,
        instance_masks=None,
        instance_boundaries=None,
        keypoints=None,
        track_ids=None,
        use_normalized_coordinates=False,
        max_boxes_to_draw=20,
        min_score_thresh=.5,
        agnostic_mode=False,
        line_thickness=4,
        groundtruth_box_visualization_color='black',
        skip_scores=False,
        skip_labels=False,
        skip_track_ids=False):
    """Overlay labeled boxes on an image with formatted scores and label names.

    This function groups boxes that correspond to the same location
    and creates a display string for each detection and overlays these
    on the image. Note that this function modifies the image in place, and returns
    that same image.

    Args:
      image: uint8 numpy array with shape (img_height, img_width, 3)
      boxes: a numpy array of shape [N, 4]
      classes: a numpy array of shape [N]. Note that class indices are 1-based,
        and match the keys in the label map.
      scores: a numpy array of shape [N] or None. If scores=None, then
        this function assumes that the boxes to be plotted are groundtruth
        boxes and plot all boxes as black with no classes or scores.
      category_index: a dict containing category dictionaries (each holding
        category index `id` and category name `name`) keyed by category indices.
      instance_masks: a numpy array of shape [N, image_height, image_width] with
        values ranging between 0 and 1, can be None.
      instance_boundaries: a numpy array of shape [N, image_height, image_width]
        with values ranging between 0 and 1, can be None.
      keypoints: a numpy array of shape [N, num_keypoints, 2], can
        be None
      track_ids: a numpy array of shape [N] with unique track ids. If provided,
        color-coding of boxes will be determined by these ids, and not the class
        indices.
      use_normalized_coordinates: whether boxes is to be interpreted as
        normalized coordinates or not.
      max_boxes_to_draw: maximum number of boxes to visualize. If None, draw
        all boxes.
      min_score_thresh: minimum score threshold for a box to be visualized
      agnostic_mode: boolean (default: False) controlling whether to evaluate in
        class-agnostic mode or not. This mode will display scores but ignore
        classes.
      line_thickness: integer (default: 4) controlling line width of the boxes.
      groundtruth_box_visualization_color: box color for visualizing groundtruth
        boxes
      skip_scores: whether to skip score when drawing a single detection
      skip_labels: whether to skip label when drawing a single detection
      skip_track_ids: whether to skip track id when drawing a single detection

    Returns:
      uint8 numpy array with shape (img_height, img_width, 3) with overlaid boxes.
    """
    # Create a display string (and color) for every box location, group any boxes
    # that correspond to the same location.
    box_to_display_str_map = collections.defaultdict(list)
    box_to_color_map = collections.defaultdict(str)
    box_to_instance_masks_map = {}
    box_to_instance_boundaries_map = {}
    box_to_keypoints_map = collections.defaultdict(list)
    box_to_track_ids_map = {}
    if not max_boxes_to_draw:
        max_boxes_to_draw = boxes.shape[0]
    for i in range(min(max_boxes_to_draw, boxes.shape[0])):
        # Only boxes above the score threshold are kept for drawing.
        if scores is None or scores[i] > min_score_thresh:
            box = tuple(boxes[i].tolist())
            if instance_masks is not None:
                box_to_instance_masks_map[box] = instance_masks[i]
            if instance_boundaries is not None:
                box_to_instance_boundaries_map[box] = instance_boundaries[i]
            if keypoints is not None:
                box_to_keypoints_map[box].extend(keypoints[i])
            if track_ids is not None:
                box_to_track_ids_map[box] = track_ids[i]
            if scores is None:
                box_to_color_map[box] = groundtruth_box_visualization_color
            else:
                display_str = ''
                if not skip_labels:
                    if not agnostic_mode:
                        if classes[i] in six.viewkeys(category_index):
                            class_name = category_index[classes[i]]['name']
                        else:
                            class_name = 'N/A'
                        display_str = str(class_name)
                if not skip_scores:
                    if not display_str:
                        display_str = '{}%'.format(int(100*scores[i]))
                    else:
                        display_str = '{}: {}%'.format(display_str, int(100*scores[i]))
                if not skip_track_ids and track_ids is not None:
                    if not display_str:
                        display_str = 'ID {}'.format(track_ids[i])
                    else:
                        display_str = '{}: ID {}'.format(display_str, track_ids[i])
                box_to_display_str_map[box].append(display_str)
                if agnostic_mode:
                    box_to_color_map[box] = 'DarkOrange'
                elif track_ids is not None:
                    prime_multipler = _get_multiplier_for_color_randomness()
                    box_to_color_map[box] = STANDARD_COLORS[
                        (prime_multipler * track_ids[i]) % len(STANDARD_COLORS)]
                else:
                    box_to_color_map[box] = STANDARD_COLORS[
                        classes[i] % len(STANDARD_COLORS)]

    # Draw all boxes onto image.
    for box, color in box_to_color_map.items():
        ymin, xmin, ymax, xmax = box
        if instance_masks is not None:
            draw_mask_on_image_array(
                image,
                box_to_instance_masks_map[box],
                color=color)
        if instance_boundaries is not None:
            draw_mask_on_image_array(
                image,
                box_to_instance_boundaries_map[box],
                color='red',
                alpha=1.0)
        draw_bounding_box_on_image_array(
            image,
            ymin,
            xmin,
            ymax,
            xmax,
            color=color,
            thickness=line_thickness,
            display_str_list=box_to_display_str_map[box],
            use_normalized_coordinates=use_normalized_coordinates)
        if keypoints is not None:
            draw_keypoints_on_image_array(
                image,
                box_to_keypoints_map[box],
                color=color,
                radius=line_thickness / 2,
                use_normalized_coordinates=use_normalized_coordinates)

    return image
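Note that the function above only filters boxes by score; it does not merge several boxes that cover the same object. As a rough illustration of what a non-maximum suppression step does with boxes and scores, here is a minimal pure-Python sketch. The names iou and greedy_nms and the sample boxes are made up for this example, and the boxes are assumed to be in the same (ymin, xmin, ymax, xmax) order used above. It is not the TensorFlow op itself, only the same greedy idea.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5, score_threshold=0.0):
    """Return indices of the boxes to keep, highest score first."""
    # Visit candidates from the highest score to the lowest.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        if len(keep) >= max_output_size:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes cover the same object.
boxes = [(0.10, 0.10, 0.50, 0.50), (0.12, 0.11, 0.52, 0.49), (0.60, 0.60, 0.90, 0.90)]
scores = [0.90, 0.75, 0.80]
kept = greedy_nms(boxes, scores, max_output_size=10, iou_threshold=0.5)
print(kept)
```

With several classes, one common choice is to run this once per class so that boxes of different classes do not suppress each other.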


How to cut and paste a part of an image randomly to a different location using tensorflow?

I am trying to implement a custom layer in keras.layers that performs a custom image augmentation. My idea is to cut out a part of an image from a random location and paste it at a different random location in the same image. The code I have written below works well for a PIL Image, but when I integrate it into my final code (which is a TensorFlow model), I get an error saying that the tensor doesn't support item assignment.
Below is the class that I have implemented:
class Cut_Paste(layers.Layer):
    def __init__(self, x_scale = 10, y_scale = 10, IMG_SIZE = (224,224), **kwargs):
        super().__init__(**kwargs)
        # defining the x span and the y span of the box to cutout
        # x_scale and y_scale are taken as inputs as % of the width and height of the image
        self.size_x, self.size_y = IMG_SIZE
        self.span_x = int(x_scale*self.size_x*0.01)
        self.span_y = int(y_scale*self.size_y*0.01)

    #getting the vertices for cut and paste
    def get_vertices(self):
        #determining random points for cut and paste
        """ since the images in the dataset have the object of interest in the center of
        the Image, the cutout will be taken from the central 25% of the image"""
        fraction = 0.25
        # upper bounds inferred from the "central 25%" description (truncated in the post)
        vert_x = random.randint(int(self.size_x*0.5*(1-fraction)),
                                int(self.size_x*0.5*(1+fraction)))
        vert_y = random.randint(int(self.size_y*0.5*(1-fraction)),
                                int(self.size_y*0.5*(1+fraction)))
        start_x = int(vert_x-self.span_x/2)
        start_y = int(vert_y-self.span_y/2)
        end_x = int(vert_x+self.span_x/2)
        end_y = int(vert_y+self.span_y/2)
        return start_x, start_y, end_x, end_y

    def call(self, image):
        #getting random vertices for cutting
        cut_start_x, cut_start_y, cut_end_x, cut_end_y = self.get_vertices()
        #getting the image as a sub-image
        #image = tf.Variable(image)
        sub_image = image[cut_start_x:cut_end_x,cut_start_y:cut_end_y,:]
        #getting random vertices for pasting
        paste_start_x, paste_start_y, paste_end_x, paste_end_y = self.get_vertices()
        #replacing a part of the image at random location with sub_image
        image[paste_start_x:paste_end_x,
              paste_start_y:paste_end_y,:] = sub_image
        return image
I am calling it from my model class this way:
class Contrastive_learning_model(keras.Model):
    def __init__(self):
        self.cut_paste = Cut_Paste(**cut_paste_augmentation)
    def train_step(self, data):
        augmented_images_2 =
I have removed the part of the code which is irrelevant. But upon executing this is the error I get:
TypeError: 'tensorflow.python.framework.ops.EagerTensor' object does not support item assignment
I understood from other sources that it is not possible to do item assignment in tensor. So here I am seeking help to do this in an easier way. I need to use tensors for this. Any help will be much appreciated.
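TensorFlow tensors are immutable, so unlike PyTorch tensors they do not support item assignment.
A workaround is to convert the tensor to a NumPy array, which does support it, for example by going through tf.Variable as follows. Note that .numpy() is only available in eager execution, so this will not work inside a graph-compiled train_step unless the model is run eagerly.
image = tf.Variable(image).numpy()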
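Once the image is a NumPy array, the cut-and-paste itself is plain slice assignment. Below is a minimal sketch of that step, assuming eager execution and an array indexed as image[x, y, channel] like in the question. The function cut_paste and its cut_box and paste_origin parameters are hypothetical names for this example, not part of the original layer.

```python
import numpy as np


def cut_paste(image, cut_box, paste_origin):
    """Copy the region cut_box = (x0, y0, x1, y1) so it starts at paste_origin = (px, py).

    Returns a new array; the input is left untouched.
    """
    x0, y0, x1, y1 = cut_box
    px, py = paste_origin
    out = np.array(image, copy=True)        # NumPy arrays allow item assignment
    patch = out[x0:x1, y0:y1, :].copy()     # copy so overlapping regions are safe
    out[px:px + (x1 - x0), py:py + (y1 - y0), :] = patch
    return out


# Stand-in for a real image: random uint8 pixels of shape (224, 224, 3).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
result = cut_paste(image, cut_box=(100, 100, 122, 122), paste_origin=(10, 30))
```

The result can be turned back into a tensor with tf.convert_to_tensor(result) before it is passed on to the rest of the model.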

Camera calibration python

Good evening. I'm trying to calibrate a camera. I followed the code posted on the OpenCV website, but when I run it, the code goes through the images I have given it and then finishes without producing the calibration parameters. Here is the error message I get:
error: (-215:Assertion failed) nimages > 0 in function 'cv::calibrateCameraRO'
#!/usr/bin/env python
import cv2 as cv
import numpy as np
import os
import glob
# Defining the dimensions of checkerboard
# NOTE: the CHECKERBOARD definition is missing from the post; (6, 9) is an
# assumed placeholder for the number of inner corners per row and column.
CHECKERBOARD = (6, 9)
size = (1376, 917)
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# Defining the world coordinates for 3D points
objp = np.zeros((CHECKERBOARD[0] * CHECKERBOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:CHECKERBOARD[0], 0:CHECKERBOARD[1]].T.reshape(-1, 2)
#prev_img_shape = None
# Creating vector to store vectors of 3D points for each checkerboard image
objpoints = []
# Creating vector to store vectors of 2D points for each checkerboard image
imgpoints = []
# Extracting path of individual image stored in a given directory
images = glob.glob('*.jpeg')
for image in images:
    img = cv.imread(image)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    # Find the chess board corners
    # If desired number of corners are found in the image then ret = true
    ret, corners = cv.findChessboardCorners(gray, CHECKERBOARD, None)
    # If desired number of corner are detected,
    # we refine the pixel coordinates and display
    # them on the images of checker board
    if ret == True:
        # (append calls assumed, as in the OpenCV tutorial; without them
        # objpoints and imgpoints stay empty)
        objpoints.append(objp)
        # refining pixel coordinates for given 2d points.
        corners2 = cv.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        imgpoints.append(corners2)
        # Draw and display the corners
        cv.drawChessboardCorners(img, CHECKERBOARD, corners2, ret)
    cv.imshow('img', img)
    cv.waitKey(0)
cv.destroyAllWindows()
# Performing camera calibration by
# passing the value of known 3D points (objpoints)
# and corresponding pixel coordinates of the
# detected corners (imgpoints)
ret, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, size, None, None)
print("\n camera Calibrated", ret)
print("\nCamera matrix:\n", mtx)
print("\ndist:\n", dist)
print("\nrotation vector : \n", rvecs)
print("\n translation vector : \n", tvecs)
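The `nimages > 0` assertion is raised when the list of object points handed to cv.calibrateCamera is empty. In a script like the one above, that typically means either glob matched no files or findChessboardCorners never succeeded. A minimal sketch of a guard that reports which of the two happened is shown below. The helper name check_calibration_inputs is hypothetical, and the final call uses dummy values only to show how it would be used.

```python
def check_calibration_inputs(image_paths, objpoints, imgpoints):
    """Raise a descriptive error before calling cv.calibrateCamera.

    Returns the number of usable views when everything looks consistent.
    """
    if len(image_paths) == 0:
        raise ValueError(
            "No images matched the glob pattern; check the directory and extension.")
    if len(objpoints) == 0:
        raise ValueError(
            "Corners were not found in any of the %d images; check the "
            "checkerboard dimensions." % len(image_paths))
    if len(objpoints) != len(imgpoints):
        raise ValueError(
            "objpoints and imgpoints differ in length: %d vs %d."
            % (len(objpoints), len(imgpoints)))
    return len(objpoints)


# Dummy stand-ins for one image path and one detected view.
n_views = check_calibration_inputs(["board_01.jpeg"], [object()], [object()])
```

In the calibration script this would be called as check_calibration_inputs(images, objpoints, imgpoints) right before cv.calibrateCamera.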


I am using a Faster R-CNN Inception v2 model with the TensorFlow Object Detection API. To remove redundant overlapping detections, I read that NMS (non-maximum suppression) should be applied.
One way of doing this is to adjust the NMS IoU threshold, first_stage_nms_iou_threshold, in the config file.
What exactly is this parameter? What value should it be set to (the default is 0.7)?
Why is it called first_stage_nms_iou_threshold? Why first stage only?
Is there another easy and more effective way of removing redundant detections?
I can't answer your first and second questions, but I had the same problem with overlapping bounding boxes and used the following code to fix them manually. You need to know the x1, y1, x2, y2 coordinates of the overlapping bounding boxes.
# import the necessary packages
from nms import non_max_suppression_slow
import numpy as np
import cv2
# path to your image
# and the coordinates x1,y1,x2,y2 of the overlapping bounding boxes
images = [
    ("path/to/your/image", np.array([
        (664, 0, 988, 177),
        (670, 10, 1000, 188),
        (685, 20, 1015, 193),
        (47, 100, 357, 500),
        (55, 105, 362, 508),
        (68, 120, 375, 520),
        (978, 80, 1093, 206)]))]
# loop over the images
for (imagePath, boundingBoxes) in images:
    # load the image and clone it
    print("[x] %d initial bounding boxes" % (len(boundingBoxes)))
    image = cv2.imread(imagePath)
    orig = image.copy()
    # loop over the bounding boxes for each image and draw them
    for (startX, startY, endX, endY) in boundingBoxes:
        cv2.rectangle(orig, (startX, startY), (endX, endY), (0, 0, 255), 2)
    # perform non-maximum suppression on the bounding boxes
    pick = non_max_suppression_slow(boundingBoxes, 0.3)
    print("[x] after applying non-maximum, %d bounding boxes" % (len(pick)))
    # loop over the picked bounding boxes and draw them
    for (startX, startY, endX, endY) in pick:
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)
    # display the images
    cv2.imshow("Original", orig)
    cv2.imshow("After NMS", image)
    # wait for a key press so the windows stay open
    cv2.waitKey(0)
You also need this module, saved as nms.py so that the import above works:
# import the necessary packages
import numpy as np
def non_max_suppression_slow(boxes, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []
    # initialize the list of picked indexes
    pick = []
    # grab the coordinates of the bounding boxes
    x1 = boxes[:,0]
    y1 = boxes[:,1]
    x2 = boxes[:,2]
    y2 = boxes[:,3]
    # compute the area of the bounding boxes and sort the bounding
    # boxes by the bottom-right y-coordinate of the bounding box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)
    # keep looping while some indexes still remain in the indexes
    # list
    while len(idxs) > 0:
        # grab the last index in the indexes list, add the index
        # value to the list of picked indexes, then initialize
        # the suppression list (i.e. indexes that will be deleted)
        # using the last index
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        suppress = [last]
        # loop over all indexes in the indexes list
        for pos in range(0, last):
            # grab the current index
            j = idxs[pos]
            # find the largest (x, y) coordinates for the start of
            # the bounding box and the smallest (x, y) coordinates
            # for the end of the bounding box
            xx1 = max(x1[i], x1[j])
            yy1 = max(y1[i], y1[j])
            xx2 = min(x2[i], x2[j])
            yy2 = min(y2[i], y2[j])
            # compute the width and height of the bounding box
            w = max(0, xx2 - xx1 + 1)
            h = max(0, yy2 - yy1 + 1)
            # compute the ratio of overlap between the computed
            # bounding box and the bounding box in the area list
            overlap = float(w * h) / area[j]
            # if there is sufficient overlap, suppress the
            # current bounding box
            if overlap > overlapThresh:
                suppress.append(pos)
        # delete all indexes from the index list that are in the
        # suppression list
        idxs = np.delete(idxs, suppress)
    # return only the bounding boxes that were picked
    return boxes[pick]
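The inner Python loop in the function above can be replaced by array operations. The following is a sketch of such a vectorized variant under the same assumptions: boxes as (x1, y1, x2, y2) pixel coordinates, sorting by the bottom edge, and overlap measured as intersection area divided by the area of the other box rather than a true IoU. The name non_max_suppression_fast is chosen for this example only.

```python
import numpy as np


def non_max_suppression_fast(boxes, overlap_thresh):
    """Vectorized variant of the loop-based version; same overlap rule."""
    boxes = np.asarray(boxes, dtype=float)
    if len(boxes) == 0:
        return []
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)
    pick = []
    while len(idxs) > 0:
        i = idxs[-1]              # box with the lowest bottom edge among those left
        pick.append(i)
        rest = idxs[:-1]
        # Intersection of box i with every remaining box, computed in one go.
        w = np.maximum(0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        overlap = (w * h) / area[rest]
        # Keep only the boxes that do not overlap box i too much.
        idxs = rest[overlap <= overlap_thresh]
    return boxes[pick].astype(int)


# Same sample boxes as in the example script above.
sample = [
    (664, 0, 988, 177), (670, 10, 1000, 188), (685, 20, 1015, 193),
    (47, 100, 357, 500), (55, 105, 362, 508), (68, 120, 375, 520),
    (978, 80, 1093, 206),
]
picked = non_max_suppression_fast(sample, 0.3)
print(len(picked), "boxes kept")
```

Because each pass handles all remaining boxes at once, the running time is dominated by the number of boxes kept rather than by the number of box pairs.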

How to use tf.image.crop_and_resize

I'm working on the ROI pooling layer used in Fast R-CNN, and I usually work with TensorFlow. I found that tf.image.crop_and_resize can act as the ROI pooling layer.
But I have tried many times and cannot get the result I expected. Or is the result I got actually the correct one?
Here is my code:
import cv2
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
img_path = r'F:\IMG_0016.JPG'
img = cv2.imread(img_path)
img = img.reshape([1,580,580,3])
img = img.astype(np.float32)
#img = np.concatenate([img,img],axis=0)
img_ = tf.Variable(img) # img shape is [580,580,3]
boxes = tf.Variable([[100,100,300,300],[0.5,0.1,0.9,0.5]])
box_ind = tf.Variable([0,0])
crop_size = tf.Variable([100,100])
#b = tf.image.crop_and_resize(img,[[0.5,0.1,0.9,0.5]],[0],[50,50])
c = tf.image.crop_and_resize(img_,boxes,box_ind,crop_size)
sess = tf.Session()
a = c.eval(session=sess)
Here are my original image and the results a[0] and a[1].
If I am doing this wrong, can anyone show me how to use this function? Thanks.
Actually, there's no problem with Tensorflow here.
From the doc of tf.image.crop_and_resize (emphasis is mine) :
boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4].
The i-th row of the tensor specifies the coordinates of a box in the
box_ind[i] image and is specified in normalized coordinates [y1, x1,
y2, x2]. A normalized coordinate value of y is mapped to the image
coordinate at y * (image_height - 1), so as the [0, 1] interval of
normalized image height is mapped to [0, image_height - 1] in image
height coordinates. We do allow y1 > y2, in which case the sampled
crop is an up-down flipped version of the original image. The width
dimension is treated similarly. Normalized coordinates outside the [0,
1] range are allowed, in which case we use extrapolation_value to
extrapolate the input image values.
The boxes argument needs normalized coordinates. That's why you get a black box with your first set of coordinates [100,100,300,300] (not normalized, and no extrapolation value provided), and not with your second set [0.5,0.1,0.9,0.5].
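To make the normalization concrete, here is a small sketch that converts a pixel box into the normalized form, following the mapping quoted above (a normalized y corresponds to the pixel row y * (image_height - 1)). The helper name normalize_box is hypothetical, and the 580 x 580 size is taken from the reshape in the question.

```python
def normalize_box(box_px, image_height, image_width):
    """Map a pixel box (y1, x1, y2, x2) to normalized [y1, x1, y2, x2]."""
    y1, x1, y2, x2 = box_px
    return (
        y1 / (image_height - 1),
        x1 / (image_width - 1),
        y2 / (image_height - 1),
        x2 / (image_width - 1),
    )


# The pixel box from the question, expressed for a 580 x 580 image.
box = normalize_box((100, 100, 300, 300), image_height=580, image_width=580)
print(box)
```

The resulting tuple is what would go into the boxes argument in place of [100,100,300,300].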
As for why matplotlib shows you gibberish on your second attempt, it's simply because you're using the wrong datatype.
Quoting the matplotlib documentation of plt.imshow (emphasis is mine):
All values should be in the range [0 .. 1] for floats or [0 .. 255]
for integers. Out-of-range values will be clipped to these bounds.
As you're using floats outside the [0,1] range, matplotlib clips your values to 1. That's why you get those colored pixels (either solid red, solid green or solid blue, or a mix of these). Cast your array to uint8 to get an image that makes sense.
plt.imshow( a[1].astype(np.uint8))
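The effect of the datatype can be checked without plotting anything. The sketch below uses made-up pixel values and a hypothetical helper to_displayable; it rounds and clips before casting, which is a slightly more defensive choice than the plain astype call above (astype alone truncates the fractional part).

```python
import numpy as np


def to_displayable(a):
    """Convert a float image holding 0..255 values into uint8 for plt.imshow."""
    return np.clip(np.rint(a), 0, 255).astype(np.uint8)


# Made-up float pixel values such as an interpolating crop might return.
crop = np.array([[-3.0, 127.6, 300.0]], dtype=np.float32)
as_uint8 = to_displayable(crop)          # integers in [0, 255]
as_unit_float = as_uint8 / 255.0         # floats in [0, 1], also valid for imshow
print(as_uint8, as_unit_float)
```

Either as_uint8 or as_unit_float can be passed to plt.imshow without the values being clipped to a single color.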
Edit :
As requested, I will dive a bit more into
[When providing non normalized coordinates and no extrapolation values], why I just get a blank result?
Quoting the doc :
Normalized coordinates outside the [0, 1] range are allowed, in which
case we use extrapolation_value to extrapolate the input image values.
So, normalized coordinates outside [0,1] are allowed. But they still need to be normalized !
With your example, [100,100,300,300], the coordinates you provide makes the red square. Your original image is the little green dot in the upper left corner! The default value of the argument extrapolation_value is 0, so the values outside the frame of the original image are inferred as [0,0,0] hence the black.
But if your use case needs another value, you can provide it. The pixels will take an RGB value of extrapolation_value%256 on each channel. This option is useful if the zone you need to crop is not fully included in your original images. (A possible use case would be sliding windows, for example.)
It seems that tf.image.crop_and_resize expects pixel values in the range [0,1].
Changing your code to
test = tf.image.crop_and_resize(image=image_np_expanded/255., ...)
solved the problem for me.
Yet another variant is to use the tf.image.central_crop function.
Below is a concrete implementation of the tf.image.crop_and_resize API. tf version 1.14
import tensorflow as tf
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
def single_data_2(img_path):
    img = tf.read_file(img_path)
    img = tf.image.decode_bmp(img,channels=1)
    img_4d = tf.expand_dims(img, axis=0)
    processed_img = tf.image.crop_and_resize(img_4d,boxes=
    processed_img_2 = tf.squeeze(processed_img,0)
    raw_img_3 = tf.squeeze(img_4d,0)
    return raw_img_3, processed_img_2
def plot_two_image(raw,processed):
    raw_ = fig.add_subplot(1,2,1)
    raw_.set_title('Raw Image')
    processed_ = fig.add_subplot(1,2,2)
    processed_.set_title('Processed Image')
img_path = 'D:/samples/your_bmp_image.bmp'
raw_img, process_img = single_data_2(img_path)
Below is my working code, where the output image is not black. It may be of help to someone.
for idx in range(len(bboxes)):
    if bscores[idx] >= Threshold:
        #Region of Interest
        y_min = int(bboxes[idx][0] * im_height)
        x_min = int(bboxes[idx][1] * im_width)
        y_max = int(bboxes[idx][2] * im_height)
        x_max = int(bboxes[idx][3] * im_width)
        class_label = category_index[int(bclasses[idx])]['name']
        bbox.append([x_min, y_min, x_max, y_max, class_label, float(bscores[idx])])
        #Crop Image - Working Code
        # cast to uint8, since encode_jpeg below expects a uint8 tensor
        cropped_image = tf.image.crop_to_bounding_box(image, y_min, x_min, y_max - y_min, x_max - x_min).numpy().astype(np.uint8)
        # encode_jpeg encodes a tensor of type uint8 to string
        output_image = tf.image.encode_jpeg(cropped_image)
        # decode_jpeg decodes the string tensor to a tensor of type uint8
        #output_image = tf.image.decode_jpeg(output_image)
        score = bscores[idx] * 100
        file_name = tf.constant(OUTPUT_PATH+image_name[:-4]+'_'+str(idx)+'_'+class_label+'_'+str(round(score))+'%'+'_'+os.path.splitext(image_name)[1])
        # tf.io.write_file is assumed here; the call name is garbled in the original post
        writefile = tf.io.write_file(file_name, output_image)

Color map an image with TensorFlow?

I'm saving grayscale images in TFRecord files. The idea then was to color map them on my GPU (only using TF of course) so they get three channels (They are going to be used on a pre-trained VGG-16 model so they have to have three channels).
Does anyone have any idea how to do this properly?
I tried to do it with my homemade TF color mapping script, using for-loops, tf.scatter_nd and a mapping array with shape = (256,3)... but it took forever.
cmp = [[255,255,255],
cmp = tf.convert_to_tensor(cmp, tf.int32) # (256, 3)
hot = tf.zeros([224,224,3], tf.int32)
for i in range(img_rgb.shape[2]):
    for j in range(img_rgb.shape[1]):
        for k in range(img_rgb.shape[0]):
            indices = tf.constant([[k,j,i]])
            updates = tf.Variable([cmp[img_rgb[k,j,i],i]])
            shape = tf.constant([256, 3])
            hot = tf.scatter_nd(indices, updates, shape)
This was my attempt. I know it's not optimal in any way, but it was the only solution I could come up with.
Thanks to work by jimfleming:
import matplotlib
import matplotlib.cm
import tensorflow as tf
def colorize(value, vmin=None, vmax=None, cmap=None):
    """
    A utility function for TensorFlow that maps a grayscale image to a matplotlib
    colormap for use with TensorBoard image summaries.

    - value: 2D Tensor of shape [height, width] or 3D Tensor of shape
      [height, width, 1].
    - vmin: the minimum value of the range used for normalization.
      (Default: value minimum)
    - vmax: the maximum value of the range used for normalization.
      (Default: value maximum)
    - cmap: a valid cmap named for use with matplotlib's `get_cmap`.
      (Default: 'gray')

    Example usage:
        output = tf.random_uniform(shape=[256, 256, 1])
        output_color = colorize(output, vmin=0.0, vmax=1.0, cmap='plasma')
        tf.summary.image('output', output_color)

    Returns a 3D tensor of shape [height, width, 3].
    """
    # normalize
    vmin = tf.reduce_min(value) if vmin is None else vmin
    vmax = tf.reduce_max(value) if vmax is None else vmax
    value = (value - vmin) / (vmax - vmin) # vmin..vmax
    # squeeze last dim if it exists
    value = tf.squeeze(value)
    # quantize
    indices = tf.to_int32(tf.round(value * 255))
    # gather
    cm = matplotlib.cm.get_cmap(cmap if cmap is not None else 'gray')
    # note: the .colors attribute exists only on listed colormaps such as 'plasma' or 'viridis'
    colors = tf.constant(cm.colors, dtype=tf.float32)
    value = tf.gather(colors, indices)
    return value
You could also try tf.image.grayscale_to_rgb, although there seems to be only one choice of color map, gray.
We're here to help. If everyone wrote optimal code, there would be no need for Stackoverflow. :)
Here's how I would do it in place of the last 7 lines (untested code):
conv_img = tf.gather( params = cmp,
                      indices = img_rgb[ :, :, 0 ] )
Basically, no need for the for loops, Tensorflow will do that for you, and much quicker. tf.gather() will collect elements from cmp according to the indices provided, which here would be the 0th channel of img_rgb. Each collected element will have the three channels from cmp so when you put them all together, it will form an image.
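The same lookup-table idea can be tried out in NumPy, where indexing a table with an integer array plays the role of tf.gather. The sketch below uses a made-up black-to-red table and a tiny 2 x 2 image purely for illustration; any (256, 3) table would work the same way.

```python
import numpy as np

# Hypothetical 256-entry color table: a simple black-to-red ramp.
lut = np.zeros((256, 3), dtype=np.uint8)
lut[:, 0] = np.arange(256)

# A tiny grayscale image of shape (H, W) with values in 0..255.
gray = np.array([[0, 128], [255, 64]], dtype=np.uint8)

# Indexing the table with the image gathers one RGB row per pixel,
# giving an array of shape (H, W, 3) without any Python loops.
color = lut[gray]
print(color.shape)
```

Each pixel value selects one row of the table, so the output gains a trailing dimension of size 3.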
I don't have time to test right now, gotta run, sorry. Hope it works.