I've been hand-rolling augmenters using imgaug, as I really like some of the options that are not available in the tf object detection api. For instance, I use motion blur because so much of my data has fast-moving, blurry objects.
How can I best integrate my augmentation sequence with the api for on-the-fly training?
E.g., say I have an augmenter:
import imgaug.augmenters as iaa

# Apply between zero and two of the listed augmenters to each image
aug = iaa.SomeOf((0, 2), [
    iaa.Fliplr(0.5),
    iaa.Flipud(0.5),
    iaa.Affine(rotate=(-10, 10)),
])
Is there some way to configure the object detection api to work with this?
What I am currently doing is using imgaug to generate (augmented) training data and then creating tfrecord files from each iteration of this augmentation pipeline. This is very inefficient: I am saving large amounts of data to disk rather than running augmentation on the fly during training.
Someone has made a repo for this:
https://github.com/JinLuckyboy/TensorFlowObjectDetectionAPI-with-imgaug
Sorry this is not a code answer and I have not actually looked into it, so I will not mark this as officially answered. If I ever get a chance to test it I will let people know.
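In the meantime, a general pattern worth trying (a hedged sketch of my own, not taken from that repo) is to wrap the imgaug sequence with tf.numpy_function inside a tf.data pipeline, so augmentation runs on the fly instead of being written to disk. Note this sketch handles images only; geometric augmenters would also need to transform the box targets:

import tensorflow as tf
import imgaug.augmenters as iaa

aug = iaa.SomeOf((0, 2), [
    iaa.Fliplr(0.5),
    iaa.Flipud(0.5),
    iaa.Affine(rotate=(-10, 10)),
])

def _augment(image):
    # imgaug operates on plain numpy uint8 arrays
    return aug.augment_image(image)

def tf_augment(image, label):
    # Wrap the numpy-based augmenter so it can run inside tf.data
    aug_image = tf.numpy_function(_augment, [image], tf.uint8)
    aug_image.set_shape([None, None, 3])  # static shape is lost across numpy_function
    return aug_image, label

# dataset = dataset.map(tf_augment, num_parallel_calls=tf.data.AUTOTUNE)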
Related
In TF 2.x there is a whole set of image augmentation APIs; take tf.image.stateless_random_flip_up_down as an example. Most of these perform the stated operation at random. What I would like to find out is whether there is a way to interrogate exactly what was performed on a specific image in a specific batch. This information is critical if the prediction target involves localization, such as points or bounding boxes: when an affine transform (like a translation) is applied to the image, the same operation should be applied to the targets (y) in a consistent manner.
As far as I can tell, the image transform APIs in TF 2.x do not return this piece of information. I would like to know whether there is an easier way than writing custom ops of my own. I did this for the older Keras data augmentation API in the past by subclassing, and would prefer not to repeat the tedium if possible.
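One workaround, rather than interrogating the op, is to draw the random decision yourself with the same kind of stateless RNG and then apply the deterministic transform to both the image and the targets. A minimal sketch for an up-down flip, assuming boxes in normalized [ymin, xmin, ymax, xmax] format:

import tensorflow as tf

def flip_up_down_with_boxes(image, boxes, seed):
    # Sample the flip decision ourselves, so the exact same decision
    # can be applied to the localization targets
    do_flip = tf.random.stateless_uniform([], seed=seed) < 0.5

    def flipped():
        img = tf.image.flip_up_down(image)
        ymin, xmin, ymax, xmax = tf.unstack(boxes, axis=-1)
        new_boxes = tf.stack([1.0 - ymax, xmin, 1.0 - ymin, xmax], axis=-1)
        return img, new_boxes

    return tf.cond(do_flip, flipped, lambda: (image, boxes))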
I guess your main goal is to write a data augmentation pipeline that changes both the image and its labels.
In that case, I would recommend the Albumentations library. It is a very popular open-source library that can easily be integrated with the TensorFlow and PyTorch frameworks.
Here is its documentation: https://albumentations.ai/
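For example, here is a minimal sketch of a pipeline that transforms an image and its bounding boxes together (the box format and field names are assumptions for illustration):

import albumentations as A

transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.2),
    ],
    # bbox_params tells Albumentations to transform the boxes along with the image
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["class_labels"]),
)

out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
aug_image, aug_bboxes = out["image"], out["bboxes"]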
Let me know if it helps!
I haven't tried TensorFlow yet, but I'm still curious: how, and in what form (data type, file type), does it store the acquired learning of a machine-learning program for later use?
For example, TensorFlow was used to sort cucumbers in Japan. The computer took a long time to learn, from the example images it was given, what good cucumbers look like. In what form was that learning saved for future use?
I ask because it would be inefficient if the program had to re-learn the images every time it needs to sort cucumbers.
Ultimately, a high-level way to think about a machine learning model is as three components: the code for the model, the data for that model, and the metadata needed to make the model run.
In TensorFlow, the code for this model is written in Python and is saved in what is known as a GraphDef. This uses a serialization format created at Google called Protocol Buffers (protobuf); other libraries commonly use formats such as Python's native pickle.
The main reason you write this code is to "learn" from some training data. What is learned is ultimately a large set of matrices full of numbers: the "weights" of the model. These too are stored using protobuf, although other formats like HDF5 exist.
TensorFlow also stores metadata associated with this model: for instance, what the input should look like (e.g. an image? some text?) and what the output should be (e.g. a class of image, say cucumber grade 1 or 2, with scores or without). This too is stored in protobuf.
At prediction time, your code loads the graph, the weights, and the metadata, then takes some input data and produces an output. More information here.
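As a concrete illustration (a minimal TF 2.x sketch; the names are placeholders), saving writes all three pieces to a SavedModel directory so the learning never has to be redone:

import tensorflow as tf

# Train once, then persist the graph, weights, and metadata together
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
tf.saved_model.save(model, "cucumber_model")

# Later, in a different process, restore without re-training
restored = tf.saved_model.load("cucumber_model")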
Are you talking about the symbolic math library, or the idea of tensor flow in general? Please be more specific here.
Here are some resources that discuss the library and tensor flow
These are some tutorials
And here is some background on the field
And this is the github page
If you want a more specific answer, please give more details as to what sort of work you are interested in.
Edit: So I'm presuming your question is more about the general field than any particular application. Your question is still too vague for this site, but I'll try to point you toward a few resources you might find interesting.
TensorFlow, as used in image recognition, typically acts on an ANN (Artificial Neural Network). What this means is that the TensorFlow library does the number crunching for the neural network, which I'm sure you can read all about with a quick Google search.
The point is that TensorFlow isn't a form of machine learning in itself; it serves as a number-crunching library for large-scale deep learning, much as numpy does for Python in general. You should read more here.
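As a tiny illustration of that number-crunching role, TensorFlow exposes array operations much like numpy, with GPU execution and automatic differentiation on top:

import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0], [6.0]])
print(tf.matmul(a, b))  # a matrix multiply, analogous to numpy.matmul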
I have been looking at the high-level estimator interface in Tensorflow, walked through fairly well in the wide_n_deep tutorial. It doesn't seem to allow streaming input, which I think I require for a training set that doesn't fit in memory.
Does the high-level API support this? I was reading the source, and I can't quite tell. It looks like maybe I could write an input function that had generators instead of arrays, but maybe the code precludes that.
P.S. This is somewhat related to this question, but I want to stick to the high-level API if I can.
You can certainly train on data that does not fit into memory with TensorFlow's high-level APIs; just use the Dataset API. Search for
"The Dataset API supports a variety of file formats so that you can process large datasets that do not fit in memory" on that page. If you want to use Datasets with Estimators, search for "input_fn" on the same page.
I am training an ssd_inception neural network using the TensorFlow Object Detection API. In the pipeline config file there are preprocessor options to augment images during training. Is there any way to introduce a probability of applying a given preprocessing step, e.g. a 20% chance that the image's contrast will be changed? If not, are there any plans to add this?
We don't have any current plan to do that. But feel free to send in a pull request. We are happy to review.
See https://github.com/tensorflow/models/blob/master/object_detection/builders/preprocessor_builder.py and https://github.com/tensorflow/models/blob/master/object_detection/core/preprocessor.py to get started.
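If you do attempt a pull request, one possible shape for it (purely a hypothetical sketch, not existing API) is a wrapper that gates any preprocessing function on a random draw:

import tensorflow as tf

def with_probability(preprocess_fn, prob):
    # Hypothetical helper: apply preprocess_fn with probability `prob`,
    # otherwise pass the image through unchanged
    def wrapped(image):
        apply_op = tf.random.uniform([]) < prob
        return tf.cond(apply_op, lambda: preprocess_fn(image), lambda: image)
    return wrapped

# e.g. a 20% chance of a contrast change:
# maybe_contrast = with_probability(lambda im: tf.image.adjust_contrast(im, 1.5), 0.2)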
For the toy example A2 that is part of the Beta 12 release, it is said that there are two options for training:
A2_RunCntk_py3.py (python API)
A2_RunCntk.py (brain_script)
Are the models trained by these two methods the same? In other words, can I load the model trained with BrainScript into the Python API and then run detection on other test images?
Also see Object Detection using Fast R-CNN.
Yes, it is possible to use Python to load a model you trained with BrainScript. A few gotchas in doing this correctly are described here. We are working on making things work seamlessly, without too much Python code for massaging the data.
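For example, loading a BrainScript-trained model from Python looks like this (the file name is a placeholder):

import cntk

# Load the model file produced by the BrainScript training run
model = cntk.load_model("A2_model.dnn")
print(model.arguments)  # inspect the expected inputs
print(model.outputs)    # and the produced outputs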