MXNET build model error on r - mxnet

When I try to use mxnet to build a feedforward model it appeared the following error:
Error in mx.io.internal.arrayiter(as.array(data), as.array(label), unif.rnds, :
basic_string::_M_replace_aux
I follow the R regression example on mxnet website but I change the data into my own data which contains 109 examples and 1876 variables. The first several steps can run without error until ran the model building step. I just can't understand the error information mean. I wonder that it is because of my dataset or the way I deal with the data.

Can you provide the code snippet you are using? That gives more details on the issue. Also, any stacktrace will be useful.
You get this error message mainly due to invalid column/row access and shape (dimension) mismatch. Can you verify if you are using correct "index" values in creating matrix. Let me know if this fixes the issue.
However, MXNet can be better at printing details about error in the stacktrace. I have created a issue to follow up on this - https://github.com/dmlc/mxnet/issues/4206

Related

GluonCV inference with finetuned model - “Please make sure source and target networks have the same prefix” error

I used GluonCV to finetune an object detection model in order to recognize some custom classes, mostly following the related tutorial.
I tried using both “ssd_512_resnet50_v1_coco” and “ssd_512_mobilenet1.0_coco” as base models, and the training process ended successfully (the accuracy on the validation dataset is reasonably high).
The problem is, I tried running inference with the newly trained model, by using for example:
classes = ["CML_mug", "person"]
net = gcv.model_zoo.get_model('ssd_512_mobilenet1.0_custom',
classes=classes,
pretrained_base=False,
ctx=ctx)
net.load_params("saved_weights/-0070.params", ctx=ctx)
but I get the error:
AssertionError: Parameter 'mobilenet0_conv0_weight' is missing in file: saved_weights/CML_mobilenet_00/-0000.params, which contains parameters: 'ssd0_ssd0_mobilenet0_conv0_weight', 'ssd0_ssd0_mobilenet0_batchnorm0_gamma', 'ssd0_ssd0_mobilenet0_batchnorm0_beta', ..., 'ssd0_ssd0_ssdanchorgenerator2_anchor_2', 'ssd0_ssd0_ssdanchorgenerator3_anchor_3', 'ssd0_ssd0_ssdanchorgenerator4_anchor_4', 'ssd0_ssd0_ssdanchorgenerator5_anchor_5'. Please make sure source and target networks have the same prefix.
So, it seems the network parameters are named differently in the .params file and in the model I’m using for inference. Specifically, in the .params file, the name of the network weights is prefixed by the string “ssd0_ssd0_”, which lead to the error when invoking net.load_parameters.
I did this whole procedure a few times in the past without having problems, did anything change? I’m running it on Ubuntu 18.04, with mxnet-mkl (1.6.0) and gluoncv (0.7.0).
I tried loading the .params file by:
from mxnet import nd
model = nd.load(0070.param)
and I wanted to modify it and remove the “ssd0_ssd0_” string that is causing the problem.
I’m trying to navigate the dictionary, but between the keys I only found a:
ssd0_resnetv10_conv0_weight
so, slightly different than indicated in the error.
Anyway, this way of fixing the issue would be a little cumbersome, I’d prefer a more direct way.
Ok, fixed it. Basically, during training I was saving the .params file by using:
net.export(param_file)
and, as I said, loading them during inference by:
net.load_parameters(param_file)
However, it doesn’t work this way, but it does if instead of export I use:
net.save_parameters(param_file)

unsigned int overflow error in converting image to MNIST format

I'm a newbie of deep learning utilizing tensorflow.
I want to make the own model that predict my custom images that are constructed on the grayscale.
But the only thing that I know is MNIST example utilizing tensorflow.
So I used a converting module from this repo but the error had been occurred such as this.
Images like to convert was constructed as 80,680 of training images, 20,170 of test images.
I really don't know why this error has occurred.
Please help me.
The script you're referring to doesn't correctly set up the headers for the MNIST format. It was addressed in a previous Github issue that has since been deleted, but my modification:
header = array('B')
header.extend([0,0,8,1,0,0])
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[2:][2:],16))
to
header = array('B')
header.extend([0,0,8,1])
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[4:][:2],16))
header.append(int('0x'+hexval[6:][:2],16))
header.append(int('0x'+hexval[8:][:2],16))
should get it working. Hope this helps!

Tensorflow Lite export looks like it do not add weigths and add unsupported operations

I want to reload some of my model variables with the saved weight in the chheckpoint and then export it to the tflite file.
The question is a bit tricky without see code, so I made this Colab jupyter notebook with the complete code to explain it better (All code is working, you can actually copy in a new collab and change if you want):
https://colab.research.google.com/drive/1wSor4CxEz36LgElVi4y_N8uiSt4-j9b2#scrollTo=XKBQzoW_wd4A
I got it working but with two issues:
The exported .tflite file is like 3Ks, so I do not believe it is the entire model with the weights in it. Only the input is an image of 128x128x3, one weight for each is more than 3K.
When I finally import the model in Android, I have this error: "Didn't find custom op for name 'VariableV2' /n Didn't find custom op for name 'ReorderAxes' /n Registration failed."
Maybe the last error is cause the save/restore operations? They look like are there when I save the graph definition.
Thanks in advance.
I realize my problem.. I'm trying to convert to TFLITE a model without previously freezing it, TFLITE do not allow "VariableV2" nodes cause they should not be there..
All the problem is corrected freezing the model like this:
output_graph_def = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ["output"])
I lost some time looking for that, hope it helps.

what's the pipeline to train tensorflow attention-ocr on customized dataset?

I've read some questions on stackoverflow about attention-ocr, and most of them are about the implementation detail of a specific step. What I wanted to know is the pipeline for us to fine-tune this model on our own dataset.
As far as I know, the steps should be:
0) Should we first download FSNS dataset?? I tried to bypass this step and try running inference on just one image, but it always give me error:"ImportError: No module named 'fsns". So I wonder if this error will go away once I set my own dataset up.
1) Store our data in the same format as FSNS. (Links on this topic: How to create dataset in the same format as the FSNS dataset?, how to create cutomized dataset for google tensorflow attention ocr? )
2) Download the pre-trained checkpoint(http://download.tensorflow.org/models/attention_ocr_2017_08_09.tar.gz)
3) Somehow modify the 'model.py' to fit your own purpose.
4) Somehow modify the 'train.py' to train your own module using tensorflow serving.
I am still on the early stage (creating own dataset) on this project now, and confused on how to do it and what's the next stage.
The error was caused by incorrect version of Python. They should be run with Python 2, and you can just change the 'import' sentence to solve this error. Try to change the 'import fsns' to 'from datasets import fsns'.

Tensorflow Warning while saving SparseTensor

I'm working on a TensorFlow project where 'targets' is defined as:
targets = tf.sparse_placeholder(tf.int32, name='targets')
Now saving my model with saver.save(sess, model_path, meta_graph_suffix='meta', write_meta_graph=True) gives me the following error:
WARNING:tensorflow:Error encountered when serializing targets.
Type is unsupported, or the types of the items don't match field type in CollectionDef.
'SparseTensor' object has no attribute 'name'
I believe the warning is printed in the following lines of code: https://github.com/tensorflow/tensorflow/blob/f974e8d0c2420c6f7e2a2791febb4781a266823f/tensorflow/python/training/saver.py#L1452
Reloading the model with saver.restore(session, save_path) seems to work though.
Has anyone seen this issue before? Why would serializing a SparseTensor give that warning? Is there any way to avoid this warning?
I'm using TensorFlow version 0.10.0rc0 python 2.7 GPU version. I can't provide a minimal example, it doesn't happen all the time, only in certain configurations. And I can't share the model I currently have this issue with.
The component placeholders (for indices, values, and possibly shape) somehow get added to some collections. If you trace through the code in saver.py, you can see ops.get_all_collection_keys() being used.
This should be a benign warning. I will forward to the team to see if something can be done to improve this handling.
The warning means that a SparseTensor type of operation has been added to a collection whose to_proto() implementation expects a "name" field.
I'd consider this a bug if you intend to restore the complete graph from meta_graph, including all the Python objects, and you should find out which operation added that SparseTensor into a collection.
If you never intend to restore from meta_graph, then you can ignore this error.
Hope that helps.
Sherry