When I run TensorFlow.js in the browser, especially on a phone, predictions take a really long time and sometimes don't work at all. I'm already using the optimized graph. I want to know if there is any way to speed it up, whether by running a warm-up prediction before the page loads so that the next one is faster, or anything else.
I am using the InceptionV3 architecture, and the input size is 299 by 299; if I could make that smaller it might go faster, but that would mean retraining my model. Note: I am not training with TensorFlow.js, only making predictions. Here is the relevant code:
var ctx = canvas.getContext('2d');
var file = ctx.getImageData(0, 0, 120, 120);               // raw RGBA pixels from the canvas
const raw = tf.fromPixels(file).toFloat();
const resized = tf.image.resizeBilinear(raw, [299, 299]);  // InceptionV3 input size
const offset = tf.scalar(127);
const normalized = resized.sub(offset).div(offset);        // scale to roughly [-1, 1]
const batched = normalized.expandDims(0);                  // add batch dimension
const f = model.execute(batched).dataSync();               // dataSync() blocks until the result is ready
I've created an ONNX model for object detection with Visual Studio and ML.NET Model Builder, using VOTT to define the 4 objects I want to detect.
I'm testing the model as explained in the tutorial, and it works well; the result is correct:
var sampleData = new MLModel1.ModelInput()
{
    ImageSource = @"C:\Data\sample1.jpg",
};

// Load model and predict output
var result = MLModel1.Predict(sampleData);
The problem is that it takes 5 seconds (10 seconds on the first run, 5 on the following ones).
The sample is a 700x400 pixel image (85 KB), and the computer has an Intel i7 at 2.9 GHz.
Why is it so slow? Am I doing something wrong, or is this the speed I should expect?
Here's the image; the objects to detect are the REF, the LOT, the hourglass icon and the factory icon.
Is there any other technique I could use to have a faster detection of these objects?
Thanks
I have converted a custom Keras model to a LayersModel for TensorFlow.js. I tested the model by uploading an image and calling the prediction after the upload was done. Snippet for the prediction:
let img = document.getElementById('image');
let offset = tf.scalar(255);
let tensorImg = tf.browser.fromPixels(img).resizeNearestNeighbor([224, 224]).toFloat().expandDims();
let tensorImg_scaled = tensorImg.div(offset);
prediction = await model.predict(tensorImg_scaled).data();
With this code my predictions follow the original model, and the confidence values change as they should. However, my intention is to analyze the webcam feed every second. A function containing this code is called every second:
const video = document.querySelector("video");
let offset = tf.scalar(255);
let tensorImg = tf.browser.fromPixels(video).resizeNearestNeighbor([224, 224]).toFloat().expandDims();
let tensorImg_scaled = tensorImg.div(offset);
prediction = await model.predict(tensorImg_scaled).data();
With video I get awful results: the prediction is always something like Float32Array(3) [6.18722574920633e-16, 1, 3.5979095258653615e-8], with the middle confidence value always being 1 or 0.9999.
What could be the problem here? Calling the video prediction snippet less often, say every 5 seconds, does not help.
Any help with video predictions is super appreciated; it is a final project for uni and the panic is starting to creep in... Many thanks!
Even though video is technically made of individual frames, the important point is that those frames form a sequence. Your model is not performing well because it was trained to work on a single frame at a time. When dealing with video data you would normally use a CNN (for spatial features) followed by an LSTM (for temporal features).
In your case, what you could do is implement a rolling prediction over K frames, i.e., the prediction for a frame is the average of the predictions over the last K frames.
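For illustration, a minimal sketch of that rolling average in Python with NumPy (predict_frame and K are placeholders for your own single-frame prediction call and window size; the same few lines translate directly to JavaScript for a TensorFlow.js app):
from collections import deque
import numpy as np

K = 10                    # how many recent frames to average over
window = deque(maxlen=K)  # old predictions fall out automatically

def rolling_predict(frame, predict_frame):
    window.append(predict_frame(frame))  # raw per-frame prediction, e.g. a length-3 array
    return np.mean(window, axis=0)       # smoothed prediction over the last K frames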
I have been tinkering around a lot with TensorFlow in the past few days; however, I am quite unsure whether a function I wrote would break backpropagation in a neural network. I thought I'd ask here before I try to integrate this function into a NN. The basic setup is that I want to add two matrices with
op = tf.add(tfObject, tfImageBackground)
where tfImageBackground is some constant image (i.e. an RGBA image of size 800 x 800 with R = G = B = A = 0) and tfObject is again a matrix with the same dimensions; however, we get that one from the function I am unsure about:
def getObject(vector):
    objectId = vector[0]
    x = vector[1]
    y = vector[2]
    xEnd = baseImageSize - (x + objectSize)
    yStart = baseImageSize - (y + objectSize)
    padding = tf.convert_to_tensor([[x, xEnd], [yStart, y], [0, 0]])
    RTensor = tfObjectMatrix[objectId, :, :, 0:1]
    GTensor = tfObjectMatrix[objectId, :, :, 1:2]
    BTensor = tfObjectMatrix[objectId, :, :, 2:3]
    ATensor = tfObjectMatrix[objectId, :, :, 3:4]
    paddedR = tf.pad(tensor=RTensor,
                     paddings=padding,
                     mode='CONSTANT',
                     name='padAverageRed',
                     constant_values=255)
    # ... the same padding is generated for every channel ...
    finalTensor = tf.concat([paddedR, paddedG, paddedB, paddedA], 2)
    return finalTensor
The tfObjectMatrix is a list of images which never change.
I did check whether I was able to generate a tf.gradient from the op, which turned out to work. I am unsure whether that is sufficient for backpropagation to work, though.
Thanks for your time and effort. Any input at all would be greatly appreciated.
TensorFlow will backpropagate through everything by default. As per your code, everything will receive gradients from a training operation created by an optimizer. So, to answer your question, backpropagation will work.
The only thing to consider is that you say tfObjectMatrix is a list of images that will not change, so you might not want it to receive any gradients. In that case you might want to look into tf.stop_gradient() and maybe use it like OM = tf.stop_gradient(tfObjectMatrix), then work with that OM in your function.
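For illustration, a minimal sketch of that suggestion in TF1 style, assuming tfObjectMatrix is held as an ordinary variable (the shape and data below are placeholders, not the asker's actual values):
import tensorflow as tf
import numpy as np

# Placeholder stack of ten 64x64 RGBA images standing in for the fixed object images.
object_images = np.zeros((10, 64, 64, 4), dtype=np.float32)
tfObjectMatrix = tf.Variable(object_images, name='object_matrix')

# OM behaves exactly like tfObjectMatrix in the forward pass, but gradients
# stop here, so an optimizer will never update the object images.
OM = tf.stop_gradient(tfObjectMatrix)
Using OM in place of tfObjectMatrix inside getObject() keeps the forward computation identical while excluding the image stack from training.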
Hi, I have a question: how can I run prediction with input data whose shape is not fixed? I will try to describe it clearly:
I use MTCNN for face detection (it's OK if you're unfamiliar with it), and it employs 3 networks: PNet, RNet and ONet. PNet detects a large number of proposal face bounding boxes, then these boxes are refined coarse-to-fine by the remaining nets one after another, finally giving precise face bounding boxes. When an image is fed into PNet, its size is not fixed, and the number of proposal boxes output by PNet is also not fixed; the same goes for RNet and ONet. Following another MTCNN implementation, I set large data_shapes (e.g. image size, batch size) when I bind the modules, initialize everything to zero, and then run prediction. That works, but isn't it redundant computation? (Question 1)
PNet:
max_img_w = 1000
max_img_h = 1000
sym, arg_params, aux_params = mx.model.load_checkpoint('det1', 0)
self.PNets = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
self.PNets.bind(data_shapes=[('data', (1, 3, max_img_w, max_img_h))], for_training=False)
self.PNets.set_params(arg_params, aux_params)
RNet:
sym, arg_params, aux_params = mx.model.load_checkpoint('det2', 0)
self.RNet = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
self.RNet.bind(data_shapes=[('data', (2048, 3, 24, 24))], for_training=False)
self.RNet.set_params(arg_params, aux_params, allow_missing=True)
ONet:
sym, arg_params, aux_params = mx.model.load_checkpoint('det3', 0)
self.ONet = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
self.ONet.bind(data_shapes=[('data', (256, 3, 48, 48))], for_training=False)
self.ONet.set_params(arg_params, aux_params, allow_missing=True)
I also tried mx.mod.Module.reshape before prediction, which adjusts the data shape according to the previous network's output, but I get this error (Question 2):
AssertionError: Shape of unspecified array arg:prob1_label changed. This can cause the new executor to not share parameters with the old one. Please check for error in the network. If this is intended, set partial_shaping=True to suppress this warning.
One more thing is that the MTCNN code (https://github.com/pangyupo/mxnet_mtcnn_face_detection) primarily uses a deprecated function to load the models:
self.PNet = mx.model.FeedForward.load('det1', 0)
A single line works with arbitrary data_shapes, so why was this function deprecated? (Question 3)
I also found a small difference: after loading the model, FeedForward takes 0 MB of memory before the first prediction, but mx.mod.Module takes up memory as soon as it is loaded, and its usage increases noticeably after making one prediction.
You can use Gluon, the MXNet imperative API, and that will let you use different batch sizes.
If, as in this case, your model was trained using the symbolic API or has been exported in the serialized MXNet format ('-0001.params' and '-symbol.json' files, for example), you can load it in Gluon this way:
import mxnet as mx
from mxnet import gluon

ctx = mx.cpu()
sym = mx.sym.load_json(open('det1-symbol.json', 'r').read())
PNet = gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))
PNet.load_params('det1-0001.params', ctx=ctx)
Then you can use it the following way:
# a given batch size (1)
data1 = mx.nd.ones((1, C, W, H))
output1 = PNet(data1)
# a different batch size (5)
data2 = mx.nd.ones((5, C, W, H))
output2 = PNet(data2)
And it would work.
You can get started with MXNet Gluon with the official 60-minute crash course.
The weights retrieved from the restored model don't change, and the input is also constant, but the output of the 'Relu:0' operation gives different results each time.
Below is my code:
import cv2
import numpy as np
import tensorflow as tf

sess = tf.Session()
saver = tf.train.import_meta_graph('checkpoints/checkpoints_otherapproach_1/cameranetwork_RAID_CNN-3100.meta')
saver.restore(sess, tf.train.latest_checkpoint(checkpoint_dir='checkpoints/checkpoints_otherapproach_1/'))

images = tf.get_default_graph().get_tensor_by_name('images:0')
phase = tf.get_default_graph().get_tensor_by_name('phase:0')
Activ = tf.get_default_graph().get_tensor_by_name('network/siamese_model/convolution_1/conv_1/Relu:0')

image_array = np.zeros(shape=[1, 3, 128, 64, 3])  # *******
imagepath = 'RAiD_Dataset' + '/images_afterremoving_persons_notinallcameras/' + 'test' + '/camera_' + str(1)
fullfile_name = imagepath + "/" + 'camera_1_person_23_index_1.jpg'
image_array[0][0] = cv2.imread(fullfile_name)
image_array[0][1] = image_array[0][0]
image_array[0][2] = image_array[0][0]
image_array = image_array.astype(np.float32)

feed_dict_values = {images: image_array, phase: False}
temp2 = sess.run(Activ, feed_dict=feed_dict_values)
temp1 = sess.run(Activ, feed_dict=feed_dict_values)
print((temp1 == temp2).all())  # output is False
There are two possible reasons for this:
Some of the TensorFlow ops inherit non-deterministic behavior from CUDA. This results in small numerical differences (which might be amplified by non-linearities). See this answer on how to try running your model on a single CPU thread; a minimal sketch of that configuration is given below. If the two arrays turn out to be identical under that condition, then this is the case.
I'm assuming that you know the graph you are loading, but the graph itself might produce inconsistent results 'by design' due to operations that deliberately introduce randomness or inconstant data. For example, consider operations that use the random number generator, or operations that update variables (e.g., tf.assign) each time Activ is evaluated.
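For reference, a minimal sketch of the single-CPU-thread check mentioned in the first point, using the same TF1 session API as the question (whether this alone makes your particular graph deterministic is something to verify):
import tensorflow as tf

# Disable the GPU and restrict TensorFlow to one intra-op and one inter-op thread,
# which rules out CUDA and thread-scheduling non-determinism.
config = tf.ConfigProto(
    device_count={'GPU': 0},
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1)
sess = tf.Session(config=config)
# ...then restore the checkpoint and evaluate Activ twice with this session as before.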