Background
I'm using source code from TensorFlow's object detection, as well as Firebase's MLInterpreter. I'm trying to stick closely to the prescribed steps in the documentation. During training, I can see on TensorBoard that the model is training properly, but somehow I am not exporting and wiring things up correctly for inference. Here are the details:
Commands I used, from training through .tflite file
First, I submit the training job using an ssd_mobilenet_v1 config file. The config file is more or less the same as the one TensorFlow provides by default; I have only modified the class count and the bucket name.
gcloud ml-engine jobs submit training `whoami`_<JOB_NAME>_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.12 \
--job-dir=gs://<BUCKET_NAME>/model_dir \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
-- \
--model_dir=gs://<BUCKET_NAME>/model_dir \
--pipeline_config_path=gs://<BUCKET_NAME>/data/ssd_mobilenet_v1.config
Then I export the tflite_graph.pb file:
python models/research/object_detection/export_tflite_ssd_graph.py \
--input_type image_tensor \
--pipeline_config_path ssd_mobilenet_v1.config \
--trained_checkpoint_prefix model.ckpt-264012 \
--output_directory exported_tflite
Great, at this point I have tflite_graph.pb, and need to get from there to the actual .tflite file:
tflite_convert \
--output_file=model.tflite \
--graph_def_file=exported_tflite/tflite_graph.pb \
--input_arrays=normalized_input_image_tensor \
--output_arrays=TFLite_Detection_PostProcess \
--input_shapes=1,300,300,3 \
--allow_custom_ops
Performing inference with Swift and Firebase
I'd like to eventually use AVFoundation to capture images from the camera, but to make this more readable I'll post just the relevant parts of the code:
Here's where the model is initialized and ioOptions are set. I found a comment at the top of export_tflite_ssd_graph (used above) that I used to determine the ioOptions, but I continue to be unconvinced that I configured those properly:
guard let modelPath = Bundle.main.path(forResource: "model", ofType: "tflite") else {
self.interpreter = nil;
super.init()
return;
}
let localModel = CustomLocalModel(modelPath: modelPath)
self.interpreter = ModelInterpreter.modelInterpreter(localModel: localModel)
do {
try self.ioOptions.setInputFormat(index: 0, type: .float32, dimensions: [1, 300, 300, 3])
try self.ioOptions.setOutputFormat(index: 0, type: .float32, dimensions: [1, 10, 4])
try self.ioOptions.setOutputFormat(index: 1, type: .float32, dimensions: [1, 10])
try self.ioOptions.setOutputFormat(index: 2, type: .float32, dimensions: [1, 10])
try self.ioOptions.setOutputFormat(index: 3, type: .float32, dimensions: [1])
} catch let error as NSError {
print("Failed to set input or output format with error: \(error.localizedDescription)")
}
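For reference, here is how I understand the four outputs configured above: boxes as [1, 10, 4], classes as [1, 10], scores as [1, 10], and a single detection count. The sketch below is written in TypeScript only to keep it language-neutral. The Detection type, the decodeDetections name, the 0.5 threshold and the [ymin, xmin, ymax, xmax] box order are my own assumptions, not something taken from Firebase or confirmed by the export script.

```typescript
// Hypothetical container for one decoded detection (not a Firebase type).
interface Detection {
  box: [number, number, number, number]; // assumed order: [ymin, xmin, ymax, xmax]
  classIndex: number;
  score: number;
}

// Turn the four flattened output arrays into a list of detections.
function decodeDetections(
  boxes: number[],   // flattened [1, N, 4]
  classes: number[], // flattened [1, N]
  scores: number[],  // flattened [1, N]
  count: number,     // the single value of the [1] output
  minScore = 0.5     // arbitrary threshold, chosen for illustration
): Detection[] {
  const detections: Detection[] = [];
  const n = Math.min(Math.round(count), scores.length);
  for (let i = 0; i < n; i++) {
    if (scores[i] < minScore) continue; // skip low-confidence entries
    detections.push({
      box: [boxes[4 * i], boxes[4 * i + 1], boxes[4 * i + 2], boxes[4 * i + 3]],
      classIndex: Math.round(classes[i]),
      score: scores[i],
    });
  }
  return detections;
}
```

If the output indices were wired differently from this assumption, the printed "classes" and "scores" would be swapped with each other or with the boxes, which is one thing worth ruling out.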
After setting things up, I use the following lines to perform inference later on. Basically, I convert the databuffer to CGImage, do some resizing, and then repack the RGB values into a buffer that I can pass to the model for inference:
// Draw the image in a context
guard let context = CGContext(
data: nil,
width: image.width, height: image.height,
bitsPerComponent: 8, bytesPerRow: image.width * 4,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
) else {
return;
}
context.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
guard let imageData = context.data else { return; }
// "image" is now a CGImage
let inputs = ModelInputs()
var inputData = Data()
do {
for row in 0 ..< 300 {
for col in 0 ..< 300 {
let offset = 4 * (col * context.width + row)
// (Ignore offset 0, the unused alpha channel)
let red = imageData.load(fromByteOffset: offset+1, as: UInt8.self)
let green = imageData.load(fromByteOffset: offset+2, as: UInt8.self)
let blue = imageData.load(fromByteOffset: offset+3, as: UInt8.self)
var normalizedRed = Float32(red) / 255.0
var normalizedGreen = Float(green) / 255.0
var normalizedBlue = Float(blue) / 255.0
// Append normalized values to Data object in RGB order.
let elementSize = MemoryLayout.size(ofValue: normalizedRed)
var bytes = [UInt8](repeating: 0, count: elementSize)
memcpy(&bytes, &normalizedRed, elementSize)
inputData.append(&bytes, count: elementSize)
memcpy(&bytes, &normalizedGreen, elementSize)
inputData.append(&bytes, count: elementSize)
memcpy(&bytes, &normalizedBlue, elementSize)
inputData.append(&bytes, count: elementSize)
}
}
try inputs.addInput(inputData)
} catch let error {
print("Failed to add input: \(error)")
}
guard let interpret = self.interpreter else { return; }
print("Running interpreter")
interpret.run(inputs: inputs, options: self.ioOptions) { outputs, error in
guard error == nil, let outputs = outputs else { return; }
do {
try print(outputs.output(index: 1))
try print(outputs.output(index: 2))
...
} catch let error {
print(error)
}
}
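Two things in the repacking loop above may be worth double-checking, so here is a small language-neutral sketch (TypeScript, no Firebase or CoreGraphics calls). In a row-major bitmap the byte offset of a pixel is normally row * bytesPerRow + col * 4, whereas the loop computes 4 * (col * width + row), which would read the image transposed. Also, as far as I know the float SSD MobileNet preprocessing maps pixels to [-1, 1] rather than [0, 1]. Both points are assumptions about what the model expects, and the function names are hypothetical.

```typescript
// Byte offset of pixel (row, col) in a row-major buffer with 4 bytes per pixel.
function pixelOffset(row: number, col: number, bytesPerRow: number): number {
  return row * bytesPerRow + col * 4;
}

// Map an 8-bit channel value to [-1, 1] (assumed input range of the model).
function normalizeChannel(value: number): number {
  return (value - 127.5) / 127.5;
}

// Repack an XRGB buffer (alpha-skip-first layout) into RGB floats, row by row.
function packRgb(pixels: Uint8Array, width: number, height: number): Float32Array {
  const out = new Float32Array(width * height * 3);
  let i = 0;
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const offset = pixelOffset(row, col, width * 4);
      // Offset + 0 is the unused alpha byte.
      out[i++] = normalizeChannel(pixels[offset + 1]); // red
      out[i++] = normalizeChannel(pixels[offset + 2]); // green
      out[i++] = normalizeChannel(pixels[offset + 3]); // blue
    }
  }
  return out;
}
```

Note that the loop in my code also iterates over 300 x 300 while the context is created with image.width and image.height, so the sketch assumes the image has already been resized to the model's input size.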
Problem / Question
I actually get an output finally, after a few hours of trying to get the data into a format that doesn't throw errors.
The problem is, the output probabilities are really low and the classes are almost never correct. I know that my model has better accuracy than this, and am feeling like I've done something wrong between getting the checkpoint files and actually running inference on the .tflite file.
Can anybody who has worked with object detection see where I may have gone off course?
Related
For the pre-trained model in python we can reset input/output shapes:
from tensorflow import keras
# Load the model
model = keras.models.load_model('models/generator.h5')
# Define arbitrary spatial dims, and 3 channels.
inputs = keras.Input((None, None, 3))
# Trace out the graph using the input:
outputs = model(inputs)
# Override the model:
model = keras.models.Model(inputs, outputs)
The source code
I'm trying to do the same in TFJS:
// Load the model
this.model = await tf.loadLayersModel('/assets/fast_srgan/model.json');
// Define arbitrary spatial dims, and 3 channels.
const inputs = tf.layers.input({shape: [null, null, 3]});
// Trace out the graph using the input.
const outputs = this.model.apply(inputs) as tf.SymbolicTensor;
// Override the model.
this.model = tf.model({inputs: inputs, outputs: outputs});
TFJS does not support one of the layers in the model:
...
u = keras.layers.Conv2D(filters, kernel_size=3, strides=1, padding='same')(layer_input)
u = tf.nn.depth_to_space(u, 2) # <- TFJS does not support this layer
u = keras.layers.PReLU(shared_axes=[1, 2])(u)
...
I wrote my own:
import * as tf from '@tensorflow/tfjs';
export class DepthToSpace extends tf.layers.Layer {
constructor() {
super({});
}
computeOutputShape(shape: Array<number>) {
// I think the issue is here
// because the error occurs during initialization of the model
return [null, ...shape.slice(1, 3).map(x => x * 2), 32];
}
call(input): tf.Tensor {
const result = tf.depthToSpace(input[0], 2);
return result;
}
static get className() {
return 'TensorFlowOpLayer';
}
}
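One detail about the computeOutputShape above: when the spatial dimensions are null (arbitrary size), `x * 2` evaluates to 0 in JavaScript, and the channel count is hard-coded to 32. A minimal sketch of a shape function that keeps null dimensions and derives the channels from the block size could look like this. The function name and the Dim type are hypothetical, and it assumes NHWC layout; it is not part of the TFJS API.

```typescript
// A dimension is either a known size or null (unknown / arbitrary).
type Dim = number | null;

// Output shape of depth-to-space for an NHWC input shape.
function depthToSpaceOutputShape(shape: Dim[], blockSize: number): Dim[] {
  const [batch, height, width, channels] = shape;
  // Keep unknown spatial dims unknown instead of turning them into 0.
  const scale = (d: Dim): Dim => (d === null ? null : d * blockSize);
  // Channels shrink by blockSize^2 because they are moved into space.
  const outChannels = channels === null ? null : channels / (blockSize * blockSize);
  return [batch, scale(height), scale(width), outChannels];
}
```

For example, an input shape of [null, 48, 48, 128] with block size 2 gives [null, 96, 96, 32], and [null, null, null, 128] gives [null, null, null, 32].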
Using the model:
tf.tidy(() => {
let img = tf.browser.fromPixels(this.imgLr.nativeElement, 3);
img = tf.div(img, 255);
img = tf.expandDims(img, 0);
let sr = this.model.predict(img) as tf.Tensor;
sr = tf.mul(tf.div(tf.add(sr, 1), 2), 255).arraySync()[0];
tf.browser.toPixels(sr as tf.Tensor3D, this.imgSrCanvas.nativeElement);
});
but I get the error:
Error: Input 0 is incompatible with layer p_re_lu: expected axis 1 of input shape to have value 96 but got shape 1,128,128,32.
The pre-trained model was trained with 96x96 pixel images. If I use a 96x96 image, it works. But if I try to use other sizes (for example 128x128), it doesn't work. In Python, we can easily reset input/output shapes. Why doesn't it work in JS?
To define a new model from the layers of the previous model, you need to use tf.model
this.model = tf.model({inputs: inputs, outputs: outputs});
I tried to debug this class:
import * as tf from '@tensorflow/tfjs';
export class DepthToSpace extends tf.layers.Layer {
constructor() {
super({});
}
computeOutputShape(shape: Array<number>) {
return [null, ...shape.slice(1, 3).map(x => x * 2), 32];
}
call(input): tf.Tensor {
const result = tf.depthToSpace(input[0], 2);
return result;
}
static get className() {
return 'TensorFlowOpLayer';
}
}
and saw that when I do not try to rewrite the size, the computeOutputShape method is called only twice, but it is called 4 times when I try to reset inputs/outputs. So I opened the model's JSON file, changed the inputs from [null, 96, 96, 32] to [null, 128, 128, 32], and removed these lines:
// Define arbitrary spatial dims, and 3 channels.
const inputs = tf.layers.input({shape: [null, null, 3]});
// Trace out the graph using the input.
const outputs = this.model.apply(inputs) as tf.SymbolicTensor;
// Override the model.
this.model = tf.model({inputs: inputs, outputs: outputs});
And now it works with 128x128 images. It looks like the piece of code above adds layers instead of rewriting them.
I am using p5 to return the vector path of a drawn line. All the vectors in the line are pushed into an array that holds all the vectors. I'm trying to use this as a tensor but I keep getting an error saying
Error when checking model input: the Array of Tensors that you are passing to your model is not the size the model expected. Expected to see 1 Tensor(s), but instead got the following list of Tensor(s):
When I opened the array on the dev tool, each vector was printed like this:
0: Vector {p5: p5, x: 0.5150300601202404, y: -0.25450901803607207, z: 0}
Could it be the p5 text in the vector array that's giving me the error? Here's my model and fit code:
let vectorpath = []; //vector path array
// model, setting layers till next '-----'
const model = tf.sequential();
model.add(tf.layers.dense({units: 4, inputShape: [2, 2], activation: 'sigmoid'}));
model.add(tf.layers.dense({units: 2, activation: 'sigmoid'}));
console.log(JSON.stringify(model.outputs[0].shape));
model.weights.forEach(w => {
console.log(w.name, w.shape);
});
// -----
//this is under the draw function so it is continually updated
const labels = tf.randomUniform([0, 1]);
function onBatchEnd(batch, logs) {
console.log('Accuracy', logs.acc);
}
model.fit(vectorpath, labels, {
epochs: 5,
batchSize: 32,
callbacks: {onBatchEnd}
}).then(info => {
console.log('Final accuracy', info.history.acc);
});
What could be causing the error, and how can I fix it?
The question's pretty vague but I'm really just not sure.
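To make the shape question concrete: model.fit expects tensors, while vectorpath holds p5.Vector objects. A minimal sketch of the conversion step, assuming only the x and y fields matter, is below. PointLike and vectorsToRows are hypothetical names; the resulting rows could then be handed to tf.tensor2d, which would give a tensor of shape [n, 2].

```typescript
// Minimal shape of a p5.Vector as far as this sketch is concerned.
interface PointLike {
  x: number;
  y: number;
}

// Strip each vector down to a plain [x, y] row of numbers.
function vectorsToRows(vectors: PointLike[]): number[][] {
  return vectors.map(v => [v.x, v.y]);
}
```

With rows of length 2, the first layer's inputShape would presumably need to be [2] rather than [2, 2], and the labels would need one row per input row; both are guesses about the intended model.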
I trained my model using Keras in Python and I converted my model to a tfjs model to use it in my webapp. I also wrote a small prediction script in python to validate my model on unseen data. In python it works perfectly, but when I'm trying to predict in my webapp it goes wrong.
This is the code I use in Python to create tensors and predict based on these created tensors:
input_dict = {name: tf.convert_to_tensor([value]) for name, value in sample_v.items()}
predictions = model.predict(input_dict)
classes = predictions.argmax(axis=-1)
In TFJS, however, it seems I can't pass a dict (or object) to the predict function, and if I write code to convert it to an array of tensors (as I found in some places online), it still doesn't seem to work.
Object.keys(input).forEach((k) => {
input[k] = tensor1d([input[k]]);
});
console.log(Object.values(input));
const prediction = await model.executeAsync(Object.values(input));
console.log(prediction);
If I do the above, I get the following error: The shape of dict['key_1'] provided in model.execute(dict) must be [-1,1], but was [1]
If I then convert it to this code:
const input = { ...track.audioFeatures };
Object.keys(input).forEach((k) => {
input[k] = tensor2d([input[k]], [1, 1]);
});
console.log(Object.values(input));
I get the error that some dtypes have to be int32 but are float32. No problem, I can set the dtype manually:
const input = { ...track.audioFeatures };
Object.keys(input).forEach((k) => {
if (k === 'int_key') {
input[k] = tensor2d([input[k]], [1, 1], 'int32');
} else {
input[k] = tensor2d([input[k]], [1, 1]);
}
});
console.log(Object.values(input));
I still get the same error, but if I print it, I can see the datatype is set to int32.
I'm really confused as to why this is and why I can't just do like python and just put a dict (or object) in TFJS, and how to fix the issues I'm having.
Edit 1: Complete Prediction Snippet
const model = await loadModel();
const input = { ...track.audioFeatures };
Object.keys(input).forEach((k) => {
if (k === 'time_signature') {
input[k] = tensor2d([parseInt(input[k], 10)], [1, 1], 'int32');
} else {
input[k] = tensor2d([input[k]], [1, 1]);
}
});
console.log(Object.values(input));
const prediction = model.predict(Object.values(input));
console.log(prediction);
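One thing I have not ruled out: Object.values returns the tensors in the object's own key order, which is not necessarily the order in which the model lists its inputs. If the int32 tensor ends up in a slot that expects float32 (or the other way round), that might explain why the dtype error persists. A small sketch of ordering the values by an explicit list of input names follows; orderByInputNames is a hypothetical helper, and where the names come from (for example the model's input list) is an assumption.

```typescript
// Return the feature values in the order given by inputNames.
function orderByInputNames<T>(features: Record<string, T>, inputNames: string[]): T[] {
  return inputNames.map(name => {
    if (!(name in features)) {
      throw new Error(`Missing feature: ${name}`);
    }
    return features[name];
  });
}
```

The same helper works whether the values are raw numbers or already-created tensors, since it only reorders them.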
Edit 2: added full error message
My environment:
ubuntu 18.04
rtx 2080ti
cuda 10.1
node v12.16.3
tfjs 1.7.4
The saved_model is efficientdet-d0,
and inference follows the steps described in "inference step".
To parse the image data with JS, I converted img.png to img.jpg; the saved_model gives the same result on img.jpg as on img.png.
The command I used to convert the saved_model to a tfjs graph model is:
tensorflowjs_converter --input_format=tf_saved_model /tmp/saved_model ~/DATA/http_models/specDetection/
My test code is:
var tfc = require("@tensorflow/tfjs-converter");
var tf = require("@tensorflow/tfjs-core");
var jpeg_js = require("jpeg-js");
var fs = require("fs");
async function loadModel() {
var modelUrl = "http://localhost:8000/model.json"
var model = await tfc.loadGraphModel(modelUrl);
return model;
}
async function detect() {
var model = await loadModel();
var img = fs.readFileSync("~/SRC/automl_test/efficientdet/img.jpg");
const input = jpeg_js.decode(img,{useTArray:true,formatAsRGBA:false});
const batched = tf.tidy(() => {
const img = tf.browser.fromPixels(input);
// Reshape to a single-element batch so we can pass it to executeAsync.
return img.expandDims(0);
});
const result = await model.executeAsync({'image_arrays:0':batched},['detections:0']);
console.log(result);
}
detect();
When I run detection on img.jpg with my test code, nothing is detected: the size of the result is 0.
What should I do to solve this problem?
Thanks for any clue.
edit:
code 1:
var img = fs.readFileSync("~/DATA/http_models/specDetection/test.jpg");
var dataJpegJs = jpeg_js.decode(img,{useTArray:true,formatAsRGBA:false})
var batched = tf.browser.fromPixels({data:dataJpegJs.data, width: dataJpegJs.width, height:dataJpegJs.height},3);
batched = batched.slice([0,0,0],[-1,-1,3]);
var result = await model.executeAsync({'image_arrays:0':batched.expandDims(0)},['detections:0']);
result = tf.slice(result,[0,0,1],[1,-1,4]);
code 2:
var img = fs.readFileSync("~/DATA/http_models/specDetection/test.jpg");
var dataJpegJs = jpeg_js.decode(img,{useTArray:true,formatAsRGBA:true})
var batched = tf.browser.fromPixels({data:dataJpegJs.data, width: dataJpegJs.width, height:dataJpegJs.height},4);
batched = batched.slice([0,0,0],[-1,-1,3]);
var result = await model.executeAsync({'image_arrays:0':batched.expandDims(0)},['detections:0']);
result = tf.slice(result,[0,0,1],[1,-1,4]);
Code 1 gives a bad result and code 2 gives a correct result.
Code 2 decodes the JPEG with formatAsRGBA:true and sets numChannels=4 in tf.browser.fromPixels. So jpeg-js must decode the JPEG to RGBA for this to work correctly.
I think it is a bug in jpeg-js. Or am I just not familiar with JPEG encoding?
The tensor is not generated correctly. fromPixels is mostly used to get a tensor from an HTMLImageElement. Printing a summary of the tensor and comparing it with the one generated in Python is enough to tell.
Is there an issue with jpeg-js?
First we need to know how ImageData works. Each ImageData pixel consists of 4 numerical values: R, G, B, A. When the data decoded by jpeg_js.decode with 3 channels (formatAsRGBA:false) is passed to tf.browser.fromPixels, it is still treated as image data. Consider the data [a, b, c, d, e, f] = jpeg_js.decode("path", {formatAsRGBA:false}) and the tensor t created from it:
t = tf.browser.fromPixels({data, width: 2, height: 1}). How is it interpreted? tf.browser.fromPixels will create an ImageData of height 1 and width 2. Consequently, the imageData will be of size 1 * 2 * 4 (instead of 1 * 2 * 3), with all its values set to 0. Then the decoded data is copied into the imageData, so imageData = [a, b, c, d, e, f, 0, 0].
As a result, the slice t.slice([0, 0, 0], [-1, -1, 3]) will be [a, b, c, e, f, 0].
Neither is jpeg_js the issue, nor tf.browser.fromPixels. This is how imageData works
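The description above can be reproduced with a few lines of plain TypeScript. This is only a simulation of the behaviour described in this answer (RGB bytes copied into a zero-filled RGBA buffer, then the first three channels of each pixel taken); it is not the actual tf.browser.fromPixels implementation, and the function name is hypothetical.

```typescript
// Copy 3-channel data into a 4-channel buffer, then keep channels 0..2 per pixel.
function simulateRgbInRgbaBuffer(rgb: number[], width: number, height: number): number[] {
  // Zero-filled buffer sized for RGBA, as ImageData would be.
  const rgba = new Uint8Array(width * height * 4);
  // The RGB bytes are copied in as-is, so they no longer line up with pixels.
  rgba.set(rgb.slice(0, rgba.length));
  const firstThree: number[] = [];
  for (let p = 0; p < width * height; p++) {
    firstThree.push(rgba[4 * p], rgba[4 * p + 1], rgba[4 * p + 2]);
  }
  return firstThree;
}
```

With [a, b, c, d, e, f] = [1, 2, 3, 4, 5, 6], width 2 and height 1, the buffer becomes [1, 2, 3, 4, 5, 6, 0, 0] and the function returns [1, 2, 3, 5, 6, 0], which matches [a, b, c, e, f, 0].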
What can be done?
Keep the alpha channel of the decoded image with formatAsRGBA:true.
Instead of using tf.browser.fromPixels, use tf.tensor directly to create the tensor:
const img = tf.tensor(input.data, [input.height, input.width, 3])
Another option is to use tfjs-node (@tensorflow/tfjs-node), where tf.node.decodeImage can decode an image into a tensor.
const img = fs.readFileSync("path/of/image");
const tensor = tf.node.decodeImage(img)
// use the tensor for prediction
Unlike jpeg-js, which works only for JPEG-encoded images, it can decode a wider range of image formats.
I am trying to convert my custom Keras model, with two bidirectional GRU layers, to tf-lite for use on mobile devices. I converted my model to the protobuf format and tried to convert it with the code given by TensorFlow:
converter = tf.lite.TFLiteConverter.from_frozen_graph('gru.pb', input_arrays=['input_array'], output_arrays=['output_array'])
tflite_model = converter.convert()
When I execute this it runs for a bit and then I get the following error:
F tensorflow/lite/toco/tooling_util.cc:1455] Should not get here: 5
So I looked up that file and it states the following:
void MakeArrayDims(int num_dims, int batch, int height, int width, int depth,
std::vector<int>* out_dims) {
CHECK(out_dims->empty());
if (num_dims == 0) {
return;
} else if (num_dims == 1) {
CHECK_EQ(batch, 1);
*out_dims = {depth};
} else if (num_dims == 2) {
*out_dims = {batch, depth};
} else if (num_dims == 3) {
CHECK_EQ(batch, 1);
*out_dims = {height, width, depth};
} else if (num_dims == 4) {
*out_dims = {batch, height, width, depth};
} else {
LOG(FATAL) << "Should not get here: " << num_dims;
}
}
Which seems to be correct since I am using 5 dimensions: [Batch, Sequence, Height, Width, Channels]
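To make the branching explicit, here is the same logic ported to TypeScript as I read it from the C++ above. It is only a sketch for illustration (CHECK failures and LOG(FATAL) are modelled as thrown errors, and the camel-case name is mine), not code from TensorFlow.

```typescript
// Port of the dimension logic quoted above: only 0 to 4 dimensions are handled.
function makeArrayDims(
  numDims: number,
  batch: number,
  height: number,
  width: number,
  depth: number
): number[] {
  if (numDims === 0) {
    return [];
  } else if (numDims === 1) {
    if (batch !== 1) throw new Error("CHECK failed: batch == 1");
    return [depth];
  } else if (numDims === 2) {
    return [batch, depth];
  } else if (numDims === 3) {
    if (batch !== 1) throw new Error("CHECK failed: batch == 1");
    return [height, width, depth];
  } else if (numDims === 4) {
    return [batch, height, width, depth];
  }
  // Anything with more than 4 dimensions ends up here.
  throw new Error(`Should not get here: ${numDims}`);
}
```

Calling it with numDims = 5 throws "Should not get here: 5", the same message I get from the converter, so a 5-dimensional array anywhere in the graph would be enough to trigger it.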
Google didn't help me much with this issue, but maybe I am using the wrong search terms.
So is there any way to avoid this error, or does tf-lite simply not support sequences?
ps.
I am using TensorFlow 1.14 with python3 in the given docker container.