tensorflow object detection API: training is very slow


I am currently studying the Google TensorFlow Object Detection API. When I try to retrain the model on the Oxford-IIIT Pet dataset, training is very slow.
Here is what I have found so far:
Most of the time only 2% of the GPU is utilized.
CPU utilization is 60%, so it seems the GPU is not starved by input; otherwise the CPU should be near 100% utilization.
I am trying to profile it with the TensorFlow profiler, but I am in a bit of a hurry, so any idea or suggestion would be helpful.

I found the problem. It is an input issue: my tfrecord file is somehow corrupted, so the input thread sometimes hangs.
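For anyone who wants a quick check before training, here is a minimal sketch that walks the record framing of a TFRecord file using only the standard library. It relies on the documented TFRecord layout (an 8-byte little-endian length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload). The function name count_tfrecord_frames is my own, and the sketch does not verify the checksums, so it only catches truncated or mis-framed files, not every kind of corruption.

```python
import os
import struct


def count_tfrecord_frames(path):
    """Walk the framing of a TFRecord file and return the number of records.

    Raises ValueError if the file ends in the middle of a record.
    Checksums are NOT verified (CRC32C is not in the standard library).
    """
    size = os.path.getsize(path)
    pos = 0
    count = 0
    with open(path, "rb") as f:
        while pos < size:
            header = f.read(12)  # uint64 length + uint32 checksum of the length
            if len(header) < 12:
                raise ValueError("truncated header after record %d" % count)
            (length,) = struct.unpack("<Q", header[:8])
            end = pos + 12 + length + 4  # payload + uint32 checksum of the payload
            if end > size:
                raise ValueError("record %d runs past the end of the file" % count)
            f.seek(end)  # skip the payload instead of reading it
            pos = end
            count += 1
    return count
```

If the function raises, or the count differs from the number of examples you wrote, the file is worth regenerating.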

There are many possible reasons for this. The most common is a problem with your record file. Some checks should be run before adding an image and its contour to the record file.
First, check the image before writing it to the record:
def checkJPG(fn):
    # Try to decode the file in a throwaway graph; any failure marks it as corrupted.
    with tf.Graph().as_default():
        try:
            image_contents = tf.read_file(fn)
            image = tf.image.decode_jpeg(image_contents, channels=3)
            init_op = tf.initialize_all_tables()
            with tf.Session() as sess:
                sess.run(init_op)
                tmp = sess.run(image)
        except Exception:
            print("Corrupted file: ", fn)
            return False
    return True
Also check the width and height of each contour, and make sure no contour touches or crosses the image borders:
boxW = xmax - xmin
boxH = ymax - ymin
if boxW == 0 or boxH == 0:
    print("...ONE CONTOUR SKIPPED... (boxW | boxH) = 0")
    continue
if boxW*boxH < 100:
    print("...ONE CONTOUR SKIPPED... (boxW*boxH) < 100")
    continue
if xmin / width <= 0 or xmax / width <= 0 or ymin / height <= 0 or ymax / height <= 0:
    print("...ONE CONTOUR SKIPPED... (x | y) <= 0")
    continue
if xmin / width >= 1 or xmax / width >= 1 or ymin / height >= 1 or ymax / height >= 1:
    print("...ONE CONTOUR SKIPPED... (x | y) >= 1")
    continue
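The same checks can be wrapped in one function so they are easy to reuse and test. This is only a sketch: the name box_is_usable and the min_area parameter are my own, and unlike the snippet above it also rejects boxes with negative width or height (xmax < xmin), which is an extra assumption on my part.

```python
def box_is_usable(xmin, ymin, xmax, ymax, width, height, min_area=100):
    """Return True if a pixel-space box is worth writing to the record file."""
    box_w = xmax - xmin
    box_h = ymax - ymin
    # Degenerate or inverted boxes are skipped.
    if box_w <= 0 or box_h <= 0:
        return False
    # Very small boxes are skipped (same threshold of 100 pixels as above).
    if box_w * box_h < min_area:
        return False
    # Normalized coordinates must lie strictly between 0 and 1.
    for value, extent in ((xmin, width), (xmax, width), (ymin, height), (ymax, height)):
        if not 0 < value / extent < 1:
            return False
    return True
```

Inside the loop that builds the record you would then write something like "if not box_is_usable(...): continue".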
Another possible reason is that the evaluation record file contains too much data. It is better to put only 10 images in the evaluation record file and change the evaluation config like this:
eval_config {
  num_visualizations: 10
  num_examples: 10
  eval_interval_secs: 3000
  max_evals: 1
  use_moving_averages: false
}

As far as I can see, it is not utilizing the GPU at the moment.
Have you tried optimizing GPU usage with the parameters described in the TensorFlow performance guide?
https://www.tensorflow.org/performance/performance_guide#optimizing_for_gpu

Related

How to define prob_threshold to avoid double counting during object detection?

I am developing an object detection application using an SSD model. I have defined the bounding box and the prob_threshold, but when I run the code I realise that the model double counts people in the frame. Please see my code below:
## Setting Pro_threshold for person detection filtering
try:
    prob_threshold = float(os.environ['PROB_THRESHOLD'])
except:
    prob_threshold = 0.4
def draw_boxes(frame, result, width, height):
    """
    :Draws bounding box when person is detected on video frame
    :and the probability is more than the specified threshold
    """
    present_count = 0
    for obj in result[0][0]:
        conf = obj[2]
        if conf >= prob_threshold:
            xmin = int(obj[3] * width)
            ymin = int(obj[4] * height)
            xmax = int(obj[5] * width)
            ymax = int(obj[6] * height)
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 3)
            present_count += 1
    return frame, present_count

To make sure people in the video frame were not double counted, I first initialised the variables and then used an if statement to calculate the duration each person spent in the video frame.
## Initialise variables##
present_request_id = 0
present_count = 0
start_time = 0
last_count = 0
total_count = 0
## Calculating the duration a person spent on video#
if present_count < last_count and int(time.time() - start_time) >=1:
    duration = int(time.time() - start_time)
    if duration > 0:
        # Publish messages to the MQTT server
        client.publish("person/duration",
                       json.dumps({"duration": duration + lagtime}))
    else:
        lagtime += 1
        log.warning(lagtime)
What fixed it was adding the condition below and experimenting with the number of seconds; in my case I tried values between 1 and 3 seconds:
int(time.time() - start_time) >=1
See the GitHub repo for an explanation.
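The idea behind that time condition can be sketched without any video code: accept a drop in the per-frame count only after it has lasted a minimum time, so a detection that flickers off for a frame or two is not counted as a new person when it comes back. The class below is a hypothetical illustration (DebouncedCounter, min_gap_s and update are my own names, not part of the code above).

```python
class DebouncedCounter:
    """Count arrivals, ignoring drops in the raw count shorter than min_gap_s."""

    def __init__(self, min_gap_s=1.0):
        self.min_gap_s = min_gap_s
        self.stable_count = 0    # people currently believed to be in frame
        self.total = 0           # people counted so far
        self._drop_since = None  # time at which the raw count first fell

    def update(self, raw_count, now):
        if raw_count >= self.stable_count:
            # Arrivals are accepted immediately and added to the total.
            self.total += raw_count - self.stable_count
            self.stable_count = raw_count
            self._drop_since = None
        elif self._drop_since is None:
            # First frame of a possible departure: start the timer.
            self._drop_since = now
        elif now - self._drop_since >= self.min_gap_s:
            # The drop lasted long enough to be treated as a real departure.
            self.stable_count = raw_count
            self._drop_since = None
        return self.total
```

In the frame loop you would call counter.update(present_count, time.time()) once per frame and publish the returned total. The right min_gap_s depends on the frame rate and on how often the detector misses, so it still needs the same kind of experimenting.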

RuntimeError: libpng signaled error while visualizing cnn layers

I am visualizing the layers of a CNN with Keras, using an MNIST test image. The model summary is here
The code for visualization is as follows:
layer_names = []
for layer in model.layers[:12]:
    layer_names.append(layer.name) # Names of the layers, so you can have them as part of your plot
images_per_row = 16
for layer_name, layer_activation in zip(layer_names, activations): # Displays the feature maps
    n_features = layer_activation.shape[-1] # Number of features in the feature map
    size = layer_activation.shape[1] #The feature map has shape (1, size, size, n_features).
    n_cols = n_features // images_per_row # Tiles the activation channels in this matrix
    display_grid = np.zeros((size * n_cols, images_per_row * size))
    for col in range(n_cols): # Tiles each filter into a big horizontal grid
        for row in range(images_per_row):
            channel_image = layer_activation[0,
                                             :, :,
                                             col * images_per_row + row]
            channel_image -= channel_image.mean() # Post-processes the feature to make it visually palatable
            channel_image /= channel_image.std()
            channel_image *= 64
            channel_image += 128
            channel_image = np.clip(channel_image, 0, 255).astype('uint8')
            display_grid[col * size : (col + 1) * size, # Displays the grid
                         row * size : (row + 1) * size] = channel_image
    scale = 1. / size
    plt.figure(figsize=(scale * display_grid.shape[1],
                        scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')
This code visualizes the output of the first two layers and shows an image with the filters. With the third layer, however, it throws the following error:
RuntimeError: libpng signaled error
<Figure size 1152x0 with 1 Axes>
I have tried uninstalling and reinstalling matplotlib, but it still does not work.

It’s a logic error:
<Figure size 1152x0 with 1 Axes>
implies that scale * display_grid.shape[0] == 0, which can only happen if n_cols is zero after this line:
n_cols = n_features // images_per_row
That happens whenever n_features < images_per_row, because the integer division rounds down to zero.
There should be a nicer error in future versions of matplotlib.
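One way to avoid the zero-height grid is to round the number of grid rows up instead of down and to never ask for more images per row than there are features. A minimal sketch, with a helper name (grid_shape) that is my own:

```python
import math


def grid_shape(n_features, images_per_row=16):
    """Return (n_rows, per_row) for tiling n_features channel images."""
    if n_features < 1:
        raise ValueError("a layer needs at least one feature map to plot")
    per_row = min(images_per_row, n_features)
    n_rows = math.ceil(n_features / per_row)  # round up so no channel is dropped
    return n_rows, per_row
```

With rounding up, the last grid row can be partly empty, so the tiling loop has to skip channel indices that are >= n_features. The sketch also assumes 4-D activations; a layer whose output is not shaped (1, size, size, n_features) needs to be skipped separately.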

Rotating a 2d sub-array using numpy without aliasing effects

I would like to rotate only the positive value pixels in my 2d array some degree about the center point. The data represents aerosol concentrations from a plume dispersion model, and the chimney position is the origin of rotation.
I would like to rotate this dispersion pattern given the wind direction.
The concentrations are first calculated for a wind direction along the x-axis and then translated to their rotated position using a 2d linear rotation about the center point of my array (the chimney position) for all points whose concentration is > 0.
The input X,Y to the rotation formula are pixel indexes.
My problem is that the output is aliased since integers become floats. In order to obtain integers, I rounded up or down the output. However, this creates null cells which become increasingly numerous as the angle increases.
Can anyone help me find a solution to my problem? I would like to fix this problem if possible using numpy, or a minimum of packages...
The part of my script that deals with computing the concentrations and rotating the pixel by 50°N is the following. Thank you for your help.
def linear2D_rotation(xcoord,ycoord,azimuth_degrees):
    radians = (90 - azimuth_degrees) * (np.pi / 180) # in radians
    xcoord_rotated = (xcoord * np.cos(radians)) - (ycoord * np.sin(radians))
    ycoord_rotated = (xcoord * np.sin(radians)) + (ycoord * np.cos(radians))
    return xcoord_rotated,ycoord_rotated
u_orient = 50 # wind orientation in degrees from North
kernel = np.zeros((NpixelY, NpixelX)) # initialize matrix
Yc = int((NpixelY - 1) / 2) # position of central pixel
Xc = int((NpixelX - 1) / 2) # position of central pixel
nk = 0
for Y in list(range(0,NpixelX)):
    for X in list(range(0,NpixelY)):
        # compute concentrations only in positive x-direction
        if (X-Xc)>0:
            # number of pixels to origin point (chimney)
            dx = ((X-Xc)+1)
            dy = ((Y-Yc)+1)
            # distance of point to origin (chimney)
            DX = dx*pixel_size_X
            DY = dy*pixel_size_Y
            # compute diffusivity coefficients
            Sy, Sz = calcul_diffusivity_coeff(DX, stability_class)
            # concentration at ground level below the centerline of the plume
            C = (Q / (2 * np.pi * u * Sy * Sz)) * \
                np.exp(-(DY / (2 * Sy)) ** 2) * \
                (np.exp(-((Z - H) / (2 * Sz)) ** 2) + np.exp(-((Z + H) / (2 * Sz)) ** 2)) # at point away from center line
            C = C * 1e9 # convert MBq to Bq
            # rotate only if concentration value at pixel is positive
            if C > 1e-12:
                X_rot, Y_rot = linear2D_rotation(xcoord=dx, ycoord=dy,azimuth_degrees=u_orient)
                X2 = int(round(Xc+X_rot))
                Y2 = int(round(Yc-Y_rot)) # Y increases downwards
                # pixels that fall out of bounds -> ignore
                if (X2 > (NpixelX - 1)) or (X2 < 0) or (Y2 > (NpixelY - 1)):
                    continue
                else:
                    # replace new pixel position in kernel array
                    kernel[Y2, X2] = C
The original array to be rotated
The rotated array by 40°N showing the data loss

Your problem description is not 100% clear, but here are a few recommendations:
1.) Don't reinvent the wheel. There are standard solutions for things like rotating pixels. Use them! In this case
scipy.ndimage.affine_transform for performing the rotation
a homogeneous coordinate matrix for specifying the rotation
nearest neighbor interpolation (parameter order=0 in code below).
2.) Don't loop where not necessary. The speed you gain by not processing non-positive pixels is nothing against the speed you lose by looping. Compiled functions can ferry around a lot of redundant zeros before hand-written python code catches up with them.
3.) Don't expect a solution that maps pixels one-to-one: after rotation there will always be grid points that are no one's nearest neighbor and points that are the nearest neighbor of several others. With that in mind, you may want to consider a higher-order, smoother interpolation.
Comparing your solution to the standard tools solution we find that the latter
gives a comparable result much faster and without those hole artifacts.
Code (without plotting). Please note that I had to transpose and flipud to align the results :
import numpy as np
from scipy import ndimage as sim
from scipy import stats
def mock_data(n, Theta=50, put_neg=True):
    y, x = np.ogrid[-20:20:1j*n, -9:3:1j*n, ]
    raster = stats.norm.pdf(y)*stats.norm.pdf(x)
    if put_neg:
        y, x = np.ogrid[-5:5:1j*n, -3:9:1j*n, ]
        raster -= stats.norm.pdf(y)*stats.norm.pdf(x)
        raster -= (stats.norm.pdf(y)*stats.norm.pdf(x)).T
    return {'C': raster * 1e-9, 'Theta': Theta}
def rotmat(Theta, offset=None):
    theta = np.radians(Theta)
    c, s = np.cos(theta), np.sin(theta)
    if offset is None:
        return np.array([[c, -s], [s, c]])
    R = np.array([[c, -s, 0], [s, c,0], [0,0,1]])
    to, fro = np.identity(3), np.identity(3)
    offset = np.asanyarray(offset)
    to[:2, 2] = offset
    fro[:2, 2] = -offset
    # shift the center to the origin, rotate, shift back
    return to @ R @ fro
def f_pp(C, Theta):
    m, n = C.shape
    clipped = np.maximum(0, 1e9 * C)
    clipped[:, :n//2] = 0
    M = rotmat(Theta, ((m-1)/2, (n-1)/2))
    return sim.affine_transform(clipped, M, order = 0)
def linear2D_rotation(xcoord,ycoord,azimuth_degrees):
    radians = (90 - azimuth_degrees) * (np.pi / 180) # in radians
    xcoord_rotated = (xcoord * np.cos(radians)) - (ycoord * np.sin(radians))
    ycoord_rotated = (xcoord * np.sin(radians)) + (ycoord * np.cos(radians))
    return xcoord_rotated,ycoord_rotated
def f_OP(C, Theta):
    kernel = np.zeros_like(C)
    m, n = C.shape
    for Y in range(m):
        for X in range(n):
            if X > n//2:
                c = C[Y, X] * 1e9
                if c > 1e-12:
                    dx = X - n//2 + 1
                    dy = Y - m//2 + 1
                    X_rot, Y_rot = linear2D_rotation(xcoord=dx, ycoord=dy,azimuth_degrees=Theta)
                    X2 = int(round(n//2+X_rot))
                    Y2 = int(round(m//2-Y_rot)) # Y increases downwards
                    # pixels that fall out of bounds -> ignore
                    if (X2 > (n - 1)) or (X2 < 0) or (Y2 > (m - 1)):
                        continue
                    else:
                        # replace new pixel position in kernel array
                        kernel[Y2, X2] = c
    return kernel
n = 100
data = mock_data(n, 70)
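To make it clearer why the standard tools leave no holes, here is a dependency-free sketch of the underlying idea, inverse mapping: instead of pushing each source pixel to a rounded target position, every output pixel looks up the source pixel that would rotate onto it. It is written with plain lists so the index arithmetic stays visible, and the function name rotate_nearest and the sign convention of the angle are my own assumptions. For real data the compiled affine_transform is the better choice.

```python
import math


def rotate_nearest(src, angle_deg):
    """Rotate a 2-D list of numbers about its centre by inverse mapping.

    Each output cell samples the nearest source cell, so cells inside the
    rotated footprint are never left empty.
    """
    rows, cols = len(src), len(src[0])
    cy, cx = (rows - 1) / 2, (cols - 1) / 2
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            dx, dy = x - cx, y - cy
            # Apply the inverse rotation to find where this output cell came from.
            sx = round(cx + c * dx + s * dy)
            sy = round(cy - s * dx + c * dy)
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = src[sy][sx]
    return out
```

Because the loop runs over output cells, some source values may be sampled twice and others not at all, but no hole can appear inside the rotated pattern.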

word2vec_basic not working (Tensorflow)

I am new to word-embedding and Tensorflow. I am working on a project where I need to apply word2vec to health data.
I used the code from the TensorFlow website (word2vec_basic.py). I modified this code slightly to make it read my data instead of "text8.zip", and it runs normally until the last step:
num_steps = 100001
with tf.Session(graph=graph) as session:
    # We must initialize all variables before we use them.
    tf.initialize_all_variables().run()
    print('Initialized')
    average_loss = 0
    for step in range(num_steps):
        batch_data, batch_labels = generate_batch(
            batch_size, num_skips, skip_window)
        feed_dict = {train_dataset : batch_data, train_labels : batch_labels}
        _, l = session.run([optimizer, loss], feed_dict=feed_dict)
        average_loss += l
        if step % 2000 == 0:
            if step > 0:
                average_loss = average_loss / 2000
            # The average loss is an estimate of the loss over the last 2000 batches.
            print('Average loss at step %d: %f' % (step, average_loss))
            average_loss = 0
        # note that this is expensive (~20% slowdown if computed every 500 steps)
        if step % 10000 == 0:
            sim = similarity.eval()
            for i in range(valid_size):
                valid_word = reverse_dictionary[valid_examples[i]]
                top_k = 8 # number of nearest neighbors
                nearest = (-sim[i, :]).argsort()[1:top_k+1]
                log = 'Nearest to %s:' % valid_word
                for k in range(top_k):
                    close_word = reverse_dictionary[nearest[k]]
                    log = '%s %s,' % (log, close_word)
                print(log)
    final_embeddings = normalized_embeddings.eval()
This code is exactly the same as the example, so I don't think it is wrong. The error it gave is:
KeyError Traceback (most recent call last)
<ipython-input-20-fc4c5c915fc6> in <module>()
34 for k in xrange(top_k):
35 print(nearest[k])
---> 36 close_word = reverse_dictionary[nearest[k]]
37 log_str = "%s %s," % (log_str, close_word)
38 print(log_str)
KeyError: 2868
I changed the size of the input data but it still gives the same error.
I would really appreciate it if someone could give me some advice on how to fix this problem.

If your vocabulary is smaller than the default maximum (50000), you should change that number.
At the end of step 2, set vocabulary_size to the actual dictionary size:
data, count, dictionary, reverse_dictionary = build_dataset(words)
del words # Hint to reduce memory.
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10], [reverse_dictionary[i] for i in data[:10]])
#add this line to modify
vocabulary_size = len(dictionary)
print('Dictionary size', len(dictionary))
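To see why the KeyError appears, it helps to look at a toy version of the dictionary-building step. The embedding matrix has vocabulary_size rows, so the similarity matrix has vocabulary_size columns and argsort can return any index below 50000, while reverse_dictionary only has as many keys as there are distinct words. The build_dataset below is a simplified stand-in written for this illustration, not the tutorial's exact function.

```python
import collections


def build_dataset(words, vocabulary_size):
    """Map words to integer ids, keeping at most vocabulary_size - 1 words plus 'UNK'."""
    count = [("UNK", -1)]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = {word: idx for idx, (word, _) in enumerate(count)}
    data = [dictionary.get(word, 0) for word in words]  # 0 is the id of 'UNK'
    reverse_dictionary = {idx: word for word, idx in dictionary.items()}
    return data, dictionary, reverse_dictionary


words = "the cat sat on the mat the end".split()
data, dictionary, reverse_dictionary = build_dataset(words, vocabulary_size=50000)

# The corpus has only 6 distinct words (+UNK), so ids 7..49999 have no entry.
vocabulary_size = len(dictionary)
```

After the last line, every index the model can produce is a valid key of reverse_dictionary, which is what the fix above achieves.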

Stimuli changes with every frame being displayed.

I have a bit of code (displayed below) that is supposed to display the stimulus for 10 frames. We need pretty exact display times, so using number of frames is a must instead of core.wait(xx) as the display time won't be as precise.
Instead of drawing the stimulus once and leaving it up for another 9 frames, the stimulus is re-drawn on every frame.
# Import what is needed
import numpy as np
from psychopy import visual, event, core, logging
from math import sin, cos
import random, math
win = visual.Window(size=(1366, 768), fullscr=True, screen=0, allowGUI=False, allowStencil=False,
monitor='testMonitor', color=[0,0,0], colorSpace='rgb',
blendMode='avg', useFBO=True,
units='deg')
### Definitions of libraries
'''Parameters :
numpy - python package of numerical computations
visual - where all visual stimulus live
event - code to deal with mouse + keyboard input
core - general function for timing & closing the program
logging - provides function for logging error and other messages to one file
random - options for creating arrays of random numbers
sin & cos - for geometry and trigonometry
math - mathematical operations '''
# this is supposed to record all frames
win.setRecordFrameIntervals(True)
win._refreshThreshold=1/65.0+0.004 #i've got 65Hz monitor and want to allow 4ms tolerance
#set the log module to report warnings to the std output window (default is errors only)
logging.console.setLevel(logging.WARNING)
nIntervals=5
# Create space variables and a window
lineSpaceX = 0.55
lineSpaceY = 0.55
patch_orientation = 45 # zero is vertical, going anti-clockwise
surround_orientation = 90
#Jitter values
g_posJitter = 0.05 #gaussian positional jitter
r_posJitter = 0.05 #random positional jitter
g_oriJitter = 5 #gaussian orientation jitter
r_oriJitter = 5 #random orientation jitter
#create a 1-Dimentional array
line = np.array(range(38)) #with values from (0-37) #possibly not needed 01/04/16 DK
#Region where the rectangular patch would appear
#x_rand=random.randint(1,22) #random.randint(Return random integers from low (inclusive) to high (exclusive).
#y_rand=random.randint(1,25)
x_rand=random.randint(6,13) #random.randint(Return random integers from low (inclusive) to high (inclusive).
y_rand=random.randint(6,16)
#rectangular patch dimensions
width=15
height=12
message = visual.TextStim(win,pos=(0.0,-12.0),text='...Press SPACE to continue...')
fixation = visual.TextStim(win, pos=(0.0,0.0), text='X')
# Initialize clock to record response time
rt_clock = core.Clock()
#Nested loop to draw anti-aliased lines on grid
#create a function for this
def myStim():
    for x in xrange(1,33): #32x32 grid. When x is 33 will not execute loop - will stop
        for y in xrange(1,33): #When y is 33 will not execute loop - will stop
            ##Define x & y value (Gaussian distribution-positional jitter)
            x_pos = (x-32/2-1/2 )*lineSpaceX + random.gauss(0,g_posJitter) #random.gauss(mean,s.d); -1/2 is to center even-numbered stimuli; 32x32 grid
            y_pos = (y-32/2-1/2 )*lineSpaceY + random.gauss(0,g_posJitter)
            if (x >= x_rand and x < x_rand+width) and (y >= y_rand and y < y_rand+height): # note only "=" on one side
                Line_Orientation = random.gauss(patch_orientation,g_oriJitter) #random.gauss(mean,s.d) - Gaussian func.
            else:
                Line_Orientation = random.gauss(surround_orientation,g_oriJitter) #random.gauss(mean,s.d) - Gaussian func.
            #Line_Orientation = random.gauss(Line_Orientation,g_oriJitter) #random.gauss(mean,s.d) - Gaussian func.
            #stimOri = random.uniform(xOri - r_oriJitter, xOri + r_oriJitter) #random.uniform(A,B) - Uniform func.
            visual.Line(win, units = "deg", start=(0,0), end=(0.0,0.35), pos=(x_pos,y_pos), ori=Line_Orientation, autoLog=False).draw() #Gaussian func.
for frameN in range (10):
    myStim()
    win.flip()
print x_rand, y_rand
print keys, rt #display response and reaction time on screen output window
I have tried to use the following code to keep it displayed (by not clearing the buffer). But it just draws over it several times.
for frameN in range(10):
    myStim()
    win.flip(clearBuffer=False)
I realize that the problem could be because I have .draw() in the function that I have defined def myStim():. However, if I don't include the .draw() within the function - I won't be able to display the stimuli.
Thanks in advance for any help.

If I understand correctly, the problem you are facing is that you have to re-draw the stimulus on every flip, but your current drawing function also recreates the entire (random) stimulus, so:
the stimulus changes on each draw between flips, although you need it to stay constant, and
you get a (on some systems quite massive) performance penalty by re-creating the entire stimulus over and over again.
What you want instead is: create the stimulus once, in its entirety, before presentation; and then have this pre-generated stimulus drawn on every flip.
Since your stimulus consists of a fairly large number of visual elements, I would suggest using a class to store the stimulus in one place.
Essentially, you would replace your myStim() function with this class (note that I stripped out most comments, re-aligned the code a bit, and simplified the if statement):
class MyStim(object):
    def __init__(self):
        self.lines = []
        for x in xrange(1, 33):
            for y in xrange(1, 33):
                x_pos = ((x - 32 / 2 - 1 / 2) * lineSpaceX +
                         random.gauss(0, g_posJitter))
                y_pos = ((y - 32 / 2 - 1 / 2) * lineSpaceY +
                         random.gauss(0, g_posJitter))
                if ((x_rand <= x < x_rand + width) and
                        (y_rand <= y < y_rand + height)):
                    Line_Orientation = random.gauss(patch_orientation,
                                                    g_oriJitter)
                else:
                    Line_Orientation = random.gauss(surround_orientation,
                                                    g_oriJitter)
                current_line = visual.Line(
                    win, units="deg", start=(0, 0), end=(0.0, 0.35),
                    pos=(x_pos, y_pos), ori=Line_Orientation,
                    autoLog=False
                )
                self.lines.append(current_line)
    def draw(self):
        for line in self.lines:
            line.draw()
What this code does on instantiation is in principle identical to your myStim() function: it creates a set of (random) lines. But instead of drawing them onto the screen right away, they are all collected in the list self.lines, and will remain there until we actually need them.
The draw() method traverses through this list, element by element (that is, line by line), and calls every line's draw() method. Note that the stimuli do not have to be re-created every time we want to draw the whole set, but instead we just draw the already pre-created lines!
To get this working in practice, you first need to instantiate the MyStim class:
myStim = MyStim()
Then, whenever you want to present the stimulus, all you have to do is
myStim.draw()
win.flip()
Here is the entire, modified code that should get you started:
import numpy as np
from psychopy import visual, event, core, logging
from math import sin, cos
import random, math
win = visual.Window(size=(1366, 768), fullscr=True, screen=0, allowGUI=False, allowStencil=False,
monitor='testMonitor', color=[0,0,0], colorSpace='rgb',
blendMode='avg', useFBO=True,
units='deg')
# this is supposed to record all frames
win.setRecordFrameIntervals(True)
win._refreshThreshold=1/65.0+0.004 #i've got 65Hz monitor and want to allow 4ms tolerance
#set the log module to report warnings to the std output window (default is errors only)
logging.console.setLevel(logging.WARNING)
nIntervals=5
# Create space variables and a window
lineSpaceX = 0.55
lineSpaceY = 0.55
patch_orientation = 45 # zero is vertical, going anti-clockwise
surround_orientation = 90
#Jitter values
g_posJitter = 0.05 #gaussian positional jitter
r_posJitter = 0.05 #random positional jitter
g_oriJitter = 5 #gaussian orientation jitter
r_oriJitter = 5 #random orientation jitter
x_rand=random.randint(6,13) #random.randint(Return random integers from low (inclusive) to high (inclusive).
y_rand=random.randint(6,16)
#rectangular patch dimensions
width=15
height=12
message = visual.TextStim(win,pos=(0.0,-12.0),text='...Press SPACE to continue...')
fixation = visual.TextStim(win, pos=(0.0,0.0), text='X')
# Initialize clock to record response time
rt_clock = core.Clock()
class MyStim(object):
    def __init__(self):
        self.lines = []
        for x in xrange(1, 33):
            for y in xrange(1, 33):
                x_pos = ((x - 32 / 2 - 1 / 2) * lineSpaceX +
                         random.gauss(0, g_posJitter))
                y_pos = ((y - 32 / 2 - 1 / 2) * lineSpaceY +
                         random.gauss(0, g_posJitter))
                if ((x_rand <= x < x_rand + width) and
                        (y_rand <= y < y_rand + height)):
                    Line_Orientation = random.gauss(patch_orientation,
                                                    g_oriJitter)
                else:
                    Line_Orientation = random.gauss(surround_orientation,
                                                    g_oriJitter)
                current_line = visual.Line(
                    win, units="deg", start=(0, 0), end=(0.0, 0.35),
                    pos=(x_pos, y_pos), ori=Line_Orientation,
                    autoLog=False
                )
                self.lines.append(current_line)
    def draw(self):
        for line in self.lines:
            line.draw()
myStim = MyStim()
for frameN in range(10):
    myStim.draw()
    win.flip()
# Clear the screen
win.flip()
print x_rand, y_rand
core.quit()
Please do note that even with this approach, I am dropping frames on a 3-year-old laptop computer with relatively weak integrated graphics chip. But I suspect a modern, fast GPU would be able to handle this amount of visual objects just fine. In the worst case, you could pre-create a large set of stimuli, save them as a bitmap file via win.saveMovieFrames(), and present them as a pre-loaded SimpleImageStim during your actual study.

We Keep Coding
