how to manage batches for model.provide_groundtruth - tensorflow

I'm trying to use the TensorFlow 2 Object Detection API with a custom multi-class dataset to train an SSD. I based my code on the example provided in the documentation: https://github.com/tensorflow/models/blob/master/research/object_detection/colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb
My current problem appears when I start the fine-tuning:
InvalidArgumentError: The first dimension of paddings must be the rank of inputs[2,2] [6] [Op:Pad]
That seems to be related to the model.provide_groundtruth call in train_step_fn. I read my data from a TFRecord file (tf.data.TFRecordDataset), mapped it to a dataset, and divided it into batches using padded_batch. This seems to be the correct way to feed the images to the training step, but now my problem is the ground truth, because it is also converted to batches of shape [batch_size, num_detections, coordinate_bbox]. Is this the problem? Any idea how to fix this issue?
Thanks
P.S. I also tried modifying the pipeline.config file and running model_main_tf2.py, the way it was done in the past with TensorFlow 1, but this method is buggy.

Just to share with everyone what resolved my issue: I had managed to split the images and the ground truth into batches correctly, but I never converted my labels to one-hot encoding.
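The conversion described above can be sketched as follows. This is a minimal NumPy illustration, not the asker's actual code: the class IDs, num_classes, and label_id_offset are made-up values, and it assumes class IDs in the records start at 1, as in the linked colab. In TensorFlow, tf.one_hot(indices, depth) performs the same conversion.

```python
import numpy as np

num_classes = 3        # hypothetical number of classes
label_id_offset = 1    # assumption: class IDs in the records start at 1

# Hypothetical class IDs for the boxes of one image.
class_ids = np.array([1, 3, 2, 3])

# Shift to zero-based indices, then pick rows of the identity matrix.
zero_indexed = class_ids - label_id_offset
one_hot = np.eye(num_classes, dtype=np.float32)[zero_indexed]

print(one_hot.shape)  # (4, 3): one row per box, one column per class
```

In the linked colab, provide_groundtruth is given per-image lists, so each image would contribute one [num_boxes, num_classes] array of classes like this alongside its [num_boxes, 4] array of boxes.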

Related

How do Keras and PyTorch handle the last batch when batch_size is not a multiple of the size of the training data?

Before you suggest or report this as a duplicate: I am asking HOW the two libraries handle the problem, NOT WHAT HAPPENS. I know they can make a batch from the remaining data and that they handle it; all I am asking is HOW.
If I have 100 images as training data and batch_size=15, the last batch will have 10 images to train. My question is how this works when the Input() layer already expects data of shape (batch_size, channels, height, width) for PyTorch and (batch_size, height, width, channels) for Keras with TensorFlow as the backend.
If the last batch has size 10, isn't the model supposed to throw an error, because it will get (10,1,28,28) in place of (15,1,28,28), given that we have 28x28-pixel grayscale images?
What is happening behind the scenes?
If you look at the docs, for example, you can see that the batch size is optional. That is because the batch size is treated as a variable in every iteration: the layers' weights do not depend on it, so a batch of any size can pass through.
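One way to see why a smaller last batch is not a problem: a layer's weights are sized by the feature dimensions only, so the same weights apply to any number of rows. The sketch below uses plain NumPy instead of Keras or PyTorch, with a made-up dense layer of 32 units; it illustrates the idea and is not either library's internals.

```python
import numpy as np

num_samples, batch_size = 100, 15
data = np.zeros((num_samples, 1, 28, 28), dtype=np.float32)

# Weights of a hypothetical dense layer: shaped by features, not by batch size.
weights = np.zeros((1 * 28 * 28, 32), dtype=np.float32)

output_shapes = []
for start in range(0, num_samples, batch_size):
    batch = data[start:start + batch_size]      # last slice holds only 10 samples
    flat = batch.reshape(batch.shape[0], -1)    # (n, 784), where n varies per batch
    output_shapes.append((flat @ weights).shape)

print(output_shapes[0], output_shapes[-1])
```

The loop produces seven batches: six with 15 rows and a final one with 10, all multiplied by the same weight matrix.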

TensorFlow Wide and Deep Model: how many features can I use?

In this wide and deep model with TensorFlow https://www.tensorflow.org/tutorials/wide_and_deep, is there a limit on the number of features? I mean, is it possible to use 20 columns for training and prediction?
I tried to train my model with 20 columns and then predict, but I got the error below:
Exception during running the graph: Unable to get element as bytes.
I didn't really understand this error, but I think it is linked to the number of features, because when I tried with 19 columns, prediction worked!
PS: I'm working on GCP with GCS and GCMLE
Here is the model on my github https://github.com/SofiaAmel/censusTest/blob/master/trainer/model.py
There are no limits to the number of columns. The error you are seeing probably indicates some problem specifically with the column you added.

tensorflow retrain.py understanding train_batch_size

I'm working my way through the TensorFlow InceptionV3 tutorial: https://www.tensorflow.org/tutorials/image_retraining#bottlenecks
I came across the following paragraph:
By default this script will run 4,000 training steps. Each step chooses ten images at random from the training set, finds their bottlenecks from the cache, and feeds them into the final layer to get predictions. Those predictions are then compared against the actual labels to update the final layer's weights through the back-propagation process.
Do the "ten images at random" mean that train_batch_size=10? Meanwhile in the source code I found this:
parser.add_argument(
    '--train_batch_size',
    type=int,
    default=100,
    help='How many images to train on at a time.'
)
Does this mean I'm interpreting the paragraph incorrectly? If so, what does train_batch_size mean, and how is it different from the ten random images? Or does it simply mean that the tutorial page is out of date with the actual code?
Source Code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py
Turns out it was a typo. The 10 random images is actually supposed to be 100 random images, which corresponds to train_batch_size.
Pull request that addressed the issue:
https://github.com/tensorflow/tensorflow/pull/17638
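For illustration, here is how a flag like this could drive the per-step sampling that the tutorial paragraph describes. This is a simplified sketch, not the code from retrain.py: the image list is a stand-in, and random.sample is an assumed way to pick the batch.

```python
import argparse
import random

parser = argparse.ArgumentParser()
parser.add_argument(
    '--train_batch_size',
    type=int,
    default=100,
    help='How many images to train on at a time.'
)
# Parse an empty argument list so the default (100) is used.
flags = parser.parse_args([])

# Stand-in for the training set: hypothetical image file names.
training_images = ['img_%04d.jpg' % i for i in range(1000)]

# One training step draws train_batch_size images at random.
step_batch = random.sample(training_images, flags.train_batch_size)
print(len(step_batch))
```

Passing ['--train_batch_size', '10'] to parse_args instead of the empty list would reproduce the ten images per step that the tutorial text mentioned.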

Data augmentation in Tensorflow using Estimator API and TFRecords dataset

I'm using TensorFlow 1.3's Estimator API to perform some image classification. Since I have a considerable amount of data, I gave TFRecords a go. I saved the file and can read the examples into a Dataset using a parser function inside the input_fn of the estimator model. So far so good.
The issue is when I want to do some image augmentation (rotating and shearing in this case).
1) I tried using tf.contrib.keras.preprocessing.image.random_shear and the like. It turns out Keras doesn't like the format of TF's shape ('Dimension'), and I can't cast it to a list because its arguments are the axis indexes, not the actual values.
2) Then I tried using tf.contrib.image.rotate and tf.contrib.image.transform with random values in my chosen range. This time I get the error "NotFoundError: Op type not registered 'ImageProjectiveTransform' in binary running on MYPC. Make sure the Op and Kernel are registered in the binary running in this process.", which is an open issue (https://github.com/tensorflow/tensorflow/issues/9672). At the moment I can't move away from Windows, so I would be very interested in possible alternatives.
3) I searched for a way to read the TFRecords, convert them to NumPy arrays, and do the augmentation with other tools, but I can't find a way to do this from within the input_fn, where I can't access the session.
Thanks!
Have you tried using the function from the answer to the question below? tensorflow: how to rotate an image for data augmentation?

How should I structure my labels for TensorFlow?

I'm trying to use TensorFlow to train output servo commands given an input image.
I plan on using a file as @mrry suggested in this question, with the images like so:
../some/path/some_img.JPG *some_label*
My question is, what are the label formats I can provide to TensorFlow and what structures are suggested?
My data is basically n servo commands from 0-10 seconds. A vector would work great:
[0,2,4,3]
or similarly:
[0,.25,.4,.3]
I couldn't find much about labels in the docs. Can anyone shed any light on TensorFlow labels?
And a very related question is what is the best way to structure these for TensorFlow to properly learn from them?
In TensorFlow, labels are just generic tensors. You can use any kind of tensor to store your labels. In your case, a 1-D tensor with shape (4,) seems to be what you want.
Labels differ from the rest of the data only in how they are used in the computational graph. Usually, labels are used only inside the loss function, while the other data is propagated through the whole network. For your problem, a regression with a 4-dimensional output should work.
Also, look at my newest comment on the (old) question. Using slice_input_producer seems to be preferable in your case.
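To make this concrete, here is a minimal NumPy sketch of labels stored as plain float arrays of shape (4,) and used only inside a loss. The prediction values are made up, and mean squared error is an assumed choice of regression loss; the answer does not name one.

```python
import numpy as np

# Labels for two images: four servo commands each (values from the question).
labels = np.array([[0.0, 2.0, 4.0, 3.0],
                   [0.0, 0.25, 0.4, 0.3]], dtype=np.float32)

# Hypothetical network outputs for the same two images.
predictions = np.array([[0.5, 2.0, 3.0, 3.0],
                        [0.0, 0.25, 0.4, 0.3]], dtype=np.float32)

# The labels enter the computation only here, in the loss.
mse = np.mean((predictions - labels) ** 2)
print(labels[0].shape, float(mse))
```

The same structure carries over to TensorFlow: the label tensor has one row per image and is compared against the network's 4-value output when the loss is computed.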