I am using the TF Object Detection API. I have a custom data set. I am training using SLURM jobs and calling the API scripts from within there. I am looking to try and tune hyperparameters found in the pipeline.config files. Unfortunately, in the documentation, this kind of process is not outlined. It seems like the process is to either use the sample configs or tune the hyperparameters by hand.
Tuning by hand is somewhat feasible. For example, trying three values each for two parameters (batch size and steps) results in nine different .config files, but adding a third hyperparameter boosts that to twenty-seven files I need to keep track of. This does not seem like a good way to do it, particularly because it is clumsy and limits the values I can try.
It seems like there are libraries out there that hook into Keras and other more high-level frameworks, but I have found nothing that looks like it can take the results of the Object Detection API and actually optimize it.
Is it possible to do this with a pre-built library I don't know about? To minimize errors, I would like to avoid editing the API implementation or coding this myself.
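For what it's worth, the bookkeeping for a plain grid can be scripted rather than done by hand. The following is only a minimal sketch, not a pre-built tuning library: the template, the grid values, the file naming scheme, and the `write_configs` helper are all hypothetical, and a real pipeline.config contains many more fields than the two shown here.

```python
import itertools
from pathlib import Path

# Hypothetical template: only the fields being tuned are shown. A real
# pipeline.config has many more fields that would be copied through unchanged.
TEMPLATE = """train_config {{
  batch_size: {batch_size}
  num_steps: {num_steps}
}}
"""

# Hypothetical search space: three values for each of two hyperparameters.
GRID = {
    "batch_size": [8, 16, 32],
    "num_steps": [10000, 25000, 50000],
}


def write_configs(out_dir, grid=GRID, template=TEMPLATE):
    """Write one config file per grid point and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = sorted(grid)
    paths = []
    # itertools.product enumerates every combination of the listed values.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        tag = "_".join(f"{name}{value}" for name, value in params.items())
        path = out_dir / f"pipeline_{tag}.config"
        path.write_text(template.format(**params))
        paths.append(path)
    return paths
```

Calling `write_configs("configs")` with the grid above produces nine files. Adding a third key with three values to `GRID` (and a matching placeholder to the template) produces twenty-seven, and each SLURM job can then be pointed at one of the generated files.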
Related
In TF 2.x there is a whole set of image augmentation APIs, for example tf.image.stateless_random_flip_up_down. Most of these perform their operation at random. What I would like to find out is whether there is a way to determine exactly what was performed on a specific image in a specific batch. This information is critical if the prediction target involves localization, such as points or bounding boxes: when an affine transform (like a translation) is applied to the image, the same operation must be applied to the targets (y) so that they stay consistent.
I think none of the image transform APIs in TF 2.x return this piece of information. I would like to see if there is an easier way than creating custom ones of my own. I have done this for the older Keras data augmentation API in the past by subclassing, and would prefer not to repeat the tedium if possible.
I guess your main goal is to write a data augmentation pipeline that changes both image and its labels.
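To make that concrete, here is a rough, library-free sketch of what such a pipeline has to do for a single operation: draw the random decision once, apply it to the image and the boxes together, and report what was done. The function name, the plain-list image representation, and the normalized [ymin, xmin, ymax, xmax] box layout are assumptions made for illustration only.

```python
import random


def random_flip_up_down(image, boxes, rng):
    """Flip an image and its boxes vertically with probability 0.5.

    image: list of pixel rows (top row first).
    boxes: list of [ymin, xmin, ymax, xmax] in normalized [0, 1] coordinates.
    Returns (image, boxes, flipped) so the caller knows what was applied.
    """
    flipped = rng.random() < 0.5
    if not flipped:
        return image, boxes, False
    flipped_image = image[::-1]  # reverse the row order
    # A vertical flip maps y to 1 - y, so ymin and ymax swap roles.
    flipped_boxes = [
        [1.0 - ymax, xmin, 1.0 - ymin, xmax]
        for ymin, xmin, ymax, xmax in boxes
    ]
    return flipped_image, flipped_boxes, True


image = [[1, 2], [3, 4], [5, 6], [7, 8]]
boxes = [[0.0, 0.0, 0.25, 1.0]]  # a box covering the top quarter
new_image, new_boxes, flipped = random_flip_up_down(image, boxes, random.Random(0))
```

Writing this by hand for every transform (rotations, crops, scaling) is exactly the tedium you describe.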
Well, then I would recommend using the Albumentations library instead. It is a very popular open-source library that can easily be integrated with the TensorFlow and PyTorch frameworks.
Here is its documentation: https://albumentations.ai/
Let me know if it helps!
I am trying to write code that is compatible with both eager and graph execution. However, there is very little information online on how to do this; it is literally a footnote on TensorFlow's website. Furthermore, what they have written is confusing:
The same code written for eager execution will also build a graph during graph execution. Do this by simply running the same code in a new Python session where eager execution is not enabled.
This implies that a same code solution is possible, where the only change required is the addition or removal of tf.enable_eager_execution().
Currently I use tf.keras to define my model and tf.data for my input pipeline. However, many eager operations don't work in graph, with the opposite also being true.
For example, I keep track of my number of epochs using tf.train.Checkpoint(). In eager mode, after restoring I can access it using epochs.numpy() to assign its value to a local variable. However, this does not work with graphs, which instead would require sess.run(epochs) due to the values not being defined during execution.
Again, to compute my gradients in eager I need to use some form of autograd, in my case tf.GradientTape(). This is not compatible with graphs, as "tf.GradientTape.gradients() does not support graph control flow."
I see that tfe.py_func exists, but once again, this only works when eager is not enabled, thus not helping for this problem.
So how do I make a same code solution, when it seems that many aspects of eager and graph directly conflict with each other?
I haven't tried Tensorflow yet, but I am curious: how, and in what form (data type, file type), does it store what a machine learning program has learned for later use?
For example, Tensorflow was used to sort cucumbers in Japan. The computer took a long time to learn from the example images what good cucumbers look like. In what form was that learning saved for future use?
I ask because it would be inefficient if the program had to re-learn from the images every time it needs to sort cucumbers.
Ultimately, a high level way to think about a machine learning model is three components - the code for the model, the data for that model, and metadata needed to make this model run.
In Tensorflow, the code for this model is written in Python and is saved in what is known as a GraphDef. This uses a serialization format created at Google called Protobuf. Other libraries commonly use other serialization formats, such as Python's native Pickle.
The main reason you write this code is to "learn" from some training data - which is ultimately a large set of matrices, full of numbers. These are the "weights" of the model - and this too is stored using ProtoBuf, although other formats like HDF5 exist.
Tensorflow also stores Metadata associated with this model - for instance, what should the input look like (eg: an image? some text?), and the output (eg: a class of image aka - cucumber1, or 2? with scores, or without?). This too is stored in Protobuf.
During prediction time, your code loads up the graph, the weights and the metadata, and takes some input data to produce an output. More information here.
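As a toy illustration of the train-once, load-later idea, the sketch below fits a tiny linear model, writes its weights and some metadata to a file, and then predicts from the reloaded file without retraining. It deliberately uses JSON and plain Python rather than Tensorflow's own formats, and every name in it (`train`, `save_model`, `load_model`, `predict`, the metadata fields) is hypothetical.

```python
import json


def train(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b with plain gradient descent (the slow 'learning' step)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}


def save_model(path, weights):
    """Persist the weights plus metadata describing the expected input/output."""
    payload = {
        "weights": weights,
        "metadata": {"input": "one float", "output": "one float"},
    }
    with open(path, "w") as f:
        json.dump(payload, f)


def load_model(path):
    """Read the saved weights and metadata back from disk."""
    with open(path) as f:
        return json.load(f)


def predict(model, x):
    """Apply the stored weights to a new input; no training happens here."""
    weights = model["weights"]
    return weights["w"] * x + weights["b"]
```

The expensive call is `train`; once `save_model` has run, any later session only needs `load_model` and `predict`, which is the same division of labour the graph, weights and metadata files provide in Tensorflow.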
Are you talking about the symbolic math library, or the idea of tensor flow in general? Please be more specific here.
Here are some resources that discuss the library and tensor flow
These are some tutorials
And here is some background on the field
And this is the github page
If you want a more specific answer, please give more details as to what sort of work you are interested in.
Edit: So I'm presuming your question is more related to the general field of tensor flow than any particular application. Your question still is too vague for this website, but I'll try to point you toward a few resources you might find interesting.
The tensorflow used in image recognition often uses an ANN (Artificial Neural Network) as the object on which to act. What this means is that the tensorflow library helps in the number crunching for the neural network, which I'm sure you can read all about with a quick google search.
The point is that tensorflow isn't a form of machine learning itself; it serves more as a useful number-crunching library for large-scale deep learning simulations, similar to something like numpy in python. You should read more here.
I am working on a project that aims to detect objects in certain difficult circumstances. I ran a test with Mask_RCNN on a dataset that contains that specific type of difficult examples and it did a pretty good job in some of them.
Surprisingly, however, some other examples were not detected, for no obvious reason. To understand the reason behind this performance difference, I've been advised to use Tensorboard. But then I realized that it's mostly used for the training phase, as I understood from this video.
At the end of the video, however, they mention a Tensorboard integration project, namely the Tensorflow Debugger Integration. Unfortunately, I could not find any further information about what became of that feature.
Is there any way to visualize weights and activation maps inside a CNN during inference/evaluation phase?
The main difference between training and inference time for tensorboard will be the global_step value. Most graphs display global step as the x-axis. You can supply your own global step counter if you like, but you'll have to decide what the x-axis should represent to you in this case since "time" isn't really a logical construct during inference. Other tabs such as the images tab don't have a time component, so using them should be the same as during training.
The tensorflow debugger is a nice terminal debugger, but wouldn't really be related to what you're trying to do here. It's certainly not a visualization tool.
Another approach might be to simply generate your own plots and output a set of PDFs with the various visualizations you need using standard tools like matplotlib for each test image. I've found tools like XnView make it really easy to look through a lot of PDF visualizations to understand what's going on. I've used this approach quite effectively. If you want to view many hundreds or thousands of results quickly you might have an easier time if all the visuals are just dumped out to a directory.
I have been looking at the high-level estimator interface in Tensorflow, walked through fairly well in the wide_n_deep tutorial. It doesn't seem to allow streaming input, which I think I require for a training set that doesn't fit in memory.
Does the high-level API support this? I was reading the source, and I can't quite tell. It looks like maybe I could write an input function that had generators instead of arrays, but maybe the code precludes that.
P.S. Sort of related to this question, but I want to stick to the high-level API if I could.
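The generator idea mentioned above might look roughly like the sketch below, which reads a CSV file row by row and yields fixed-size batches so that only one batch is held in memory at a time. The file layout (headerless, label in the last column) and the `batch_generator` name are assumptions for illustration. How such a generator would be wired into an Estimator input function is not shown here; wrapping it with tf.data.Dataset.from_generator is one possibility, but that part is an assumption.

```python
import csv


def batch_generator(csv_path, batch_size):
    """Yield (features, labels) batches, reading the file one row at a time.

    Assumes a headerless CSV whose last column is the label (hypothetical layout).
    """
    features, labels = [], []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            # All columns but the last are features; the last is the label.
            *xs, y = (float(v) for v in row)
            features.append(xs)
            labels.append(y)
            if len(features) == batch_size:
                yield features, labels
                features, labels = [], []
    if features:  # final partial batch
        yield features, labels
```

Because the file is consumed lazily, the size of the training set on disk does not affect peak memory use, only the batch size does.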
You can certainly train on data that does not fit into memory with TensorFlow using the high-level APIs. Just use the Dataset API. You can search for:
"The Dataset API supports a variety of file formats so that you can process large datasets that do not fit in memory" in that page. If you want to use Datasets with Estimators, search for "input_fn" on that page.