How do I selectively augment subsets of data based on their labels? - tensorflow

I'm working on a regression neural network using Keras 1.2.1, tensorflow backend, and generators for on-the-fly image augmentation.
I want to augment my shuffled dataset, based on the labels associated with each image.
For example, at each epoch, I only want to include, say, 25% of the images that are labeled as 0.00.
On the other hand, if the image is labeled as, say, <= -0.20, I want to rotate/flip/shear it by some random amount.
The question is: how can I selectively augment image data based on its label?
Is this possible?

You can do it in NumPy using boolean indexing. TensorFlow also lets you do that; see tf.boolean_mask.
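As a rough illustration of the boolean-indexing idea, here is a minimal NumPy sketch. The function name `select_and_augment`, its parameters, and the N x H x W x C image layout are assumptions made for this example, and a horizontal flip stands in for the rotate/flip/shear step. Each zero-labelled sample is kept with probability 0.25, so the kept share is only roughly 25% per epoch.

```python
import numpy as np

def select_and_augment(images, labels, rng, keep_zero_frac=0.25, flip_threshold=-0.20):
    """Build one epoch's subset: subsample zero labels, flip strongly negative ones."""
    images = np.asarray(images)
    labels = np.asarray(labels, dtype=float)

    is_zero = np.isclose(labels, 0.0)
    # Keep every non-zero sample; keep each zero-labelled sample
    # with probability keep_zero_frac.
    keep = ~is_zero | (rng.random(labels.shape[0]) < keep_zero_frac)

    out_images = images[keep].copy()
    out_labels = labels[keep].copy()

    # Boolean mask over the kept samples that should be augmented.
    augment = out_labels <= flip_threshold
    # Horizontal flip (axis 2 is width for N x H x W x C) as a stand-in
    # for a real augmentation. Labels are left unchanged here.
    out_images[augment] = out_images[augment][:, :, ::-1]
    return out_images, out_labels
```

A generator could call this once per epoch, for example with `rng = np.random.default_rng()`, and then yield batches from the returned arrays.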


Define TF model with variable input dimensions

I want to implement a tf model with a tweets-set as input and sentiment (or price movement prediction of the underlying asset) as output. Notice that my input is not a single tweet, but a set of tweets published over the same narrow time frame. The model architecture would look something like this:
I use the same model, Trainable Model, to predict the individual sentiments s_i. I then average these sentiments to compute the overall sentiment of the tweet set, which I consider my output.
Now my question is: Can I implement something like this in tensorflow?
One of the main difficulties I can think of is that the input shape is not fixed. It depends on the number of tweets n published in that time frame. I read about tf.placeholder, but it doesn't seem to be suitable here, because it still requires a constant input dimension (How to feed input with changing size in Tensorflow).
Also what possibilities does tensorflow offer in order to define such custom models (not fully connected, custom computations e.g. averaging the sentiments etc.)?
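To make the described computation concrete, here is a minimal NumPy sketch (not TensorFlow) of a shared per-tweet model followed by averaging. It assumes each tweet has already been turned into a fixed-length feature vector; the tanh linear scorer, `weights`, and `bias` are hypothetical stand-ins for Trainable Model. Because the same weights are applied to every row, the number of tweets n can differ from one set to the next.

```python
import numpy as np

def tweet_sentiment(features, weights, bias):
    """Shared per-tweet model: one sentiment in (-1, 1) for each row of features."""
    # features has shape (n, d); n may differ between tweet sets.
    return np.tanh(features @ weights + bias)

def set_sentiment(features, weights, bias):
    """Average the per-tweet sentiments s_i over a set of any size n."""
    return float(np.mean(tweet_sentiment(features, weights, bias)))
```

For instance, `set_sentiment(np.ones((5, 3)), np.zeros(3), 0.0)` and a call with a `(12, 3)` feature array use the same parameters.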

how to detect not in trained-for-category images when using resnet50

I have trained resnet50 on four categories of images. It works fantastic when I feed it an image in any one of the four categories -- I have essentially 100% accuracy on images in these categories.
However, when I feed my trained Resnet50 model an image of a similar object that is not in one of the original four categories, the prediction comes back as one of the four existing classes. By this I mean that, in the array returned with the likelihood of each category, the likelihood of one of the categories is in many cases basically 1. For example, when I query the model about an image that is not in one of the four categories, the prediction array will look like
[1.3492944e-07 9.9999988e-01 8.3132584e-14 1.4716975e-24]
Here is the prediction array for an image that the model was trained on:
[1.8217645e-27 1.0000000e+00 3.6731971e-32 0.0000000e+00]
These scores are different, but not much different. Many of the images that are not in one of the trained-for categories have a 1.00000000 for one of the labels.
I had been planning on dealing with the oddball images by looking at the prediction array to see if the max(category labels prediction) was below some threshold. But most of my max(category labels predictions) are all above .99999 and so I can't differentiate between images in the training set and images not part of the training set.
I plan to train my model for N buckets. When I am running the system I will occasionally have images that are not in one of the N buckets and I need to know that. I don't care what they are, I just want to know when an image is not in one of the N buckets.
Resnet50 does a great job of forcing everything into one of the categories, even when it is not.
My images are super well defined! I wonder if I am somehow overtraining or overlooking some other obvious error.
Here is an example of an image that was correctly categorized:
[Image: in training set and correctly categorized]
Here is an image that is not part of the training set that was then categorized into one of the categories:
[Image: not in training set and incorrectly categorized]
In summary: I am trying to sort images and I need to know when one of the images is not part of the training categories so I can reject that image. Restated, I want to sort images into buckets: known, trained for buckets, and one unknown bucket.
Is there any way to do this?
Should I use a different classifier than Resnet50?
My images are grayscale, bicubic interpolated during resize (large to smaller), 150x150. I have about 1,600 training images and 200 validation images per category. My accuracy and val_accuracy are .9997 after 3 epochs.
[Figure: training and validation accuracy]
[Figure: training and validation loss]
Your model only knows about 4 classes. It, or any other model such as MobileNet, will always look at an image and assign probabilities to each of the 4 classes. You could put in a picture of a water buffalo and it would still try to classify it. Usually, but not always, if the out-of-class image is very different from your training images, the class with the highest probability will have a probability well below 1.0. In your case, however, the out-of-class image is NOT all that different from the images in your dataset, hence the high (false) predicted probability.
All I can think of is this: if your out-of-class images will be generically similar to each other, you could create a 5th class. Gather some "typical" out-of-class images, add them to the data you have, and train the model on these 5 classes. I made a model that classified 50 different dog breeds. It was extremely accurate. I put in a picture of Donald Trump and he was predicted as being a chihuahua!
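The 5th-class suggestion could be wired up roughly as follows. This is a minimal NumPy sketch; the helper names `add_unknown_class` and `is_rejected`, the integer label encoding 0 to 3 for the trained categories, and index 4 for the extra class are all assumptions for illustration.

```python
import numpy as np

def add_unknown_class(images, labels, other_images, unknown_index=4):
    """Append out-of-class images to the training set under an extra label."""
    labels = np.asarray(labels)
    all_images = np.concatenate([images, other_images], axis=0)
    other_labels = np.full(len(other_images), unknown_index, dtype=labels.dtype)
    all_labels = np.concatenate([labels, other_labels], axis=0)
    return all_images, all_labels

def is_rejected(probabilities, unknown_index=4):
    """True where the most likely class is the extra 'unknown' bucket."""
    return np.argmax(probabilities, axis=-1) == unknown_index
```

At prediction time, `is_rejected(model_output)` would flag the images to route to the unknown bucket. How well this works depends on how representative the gathered out-of-class images are.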

How to predict Faster RCNN model in batches?

I have a trained RCNN (Keras-Retinanet) model, and I can predict only one image at a time.
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
Full script is here.
Is there a way to predict multiple images at a time?
The reason you expand the dims there is to artificially add a batch dimension. Simply stack a bunch of images together and pass it into this function to get a batch of results.
Said differently, right now you are passing in a batch that contains a single image (that is what np.expand_dims(image, axis=0) produces).
You could instead pass in:
[image_1, image_2, image_3, ...] (as a NumPy array, not a Python list)
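A minimal sketch of building such a batch with NumPy follows. The helper name `make_batch` is made up for this example, and it assumes all images have already been preprocessed to the same height, width, and channel count, since `np.stack` needs equal shapes.

```python
import numpy as np

def make_batch(images):
    """Stack equally sized H x W x C images into one N x H x W x C array."""
    return np.stack(images, axis=0)

# With the model from the question, the call would then look like:
# boxes, scores, labels = model.predict_on_batch(make_batch([image_1, image_2]))
```

For a single image, `make_batch([image])` gives the same array as `np.expand_dims(image, axis=0)`.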

How can I evaluate FaceNet embeddings for face verification on LFW?

I am trying to create a script that can evaluate a model on the LFW dataset. As a process, I read pairs of images (using the LFW annotation list), track and crop the face, align it, pass it through a pre-trained FaceNet model (.pb, using TensorFlow), and extract the features. The feature vector size is (1,128) and the input image is (160,160).
To evaluate the verification task, I am using a Siamese architecture. That is, I pass a pair of images (same or different person) through two identical models ([2 x facenet]; this is equivalent to passing a batch of two images through a single network) and calculate the Euclidean distance between the embeddings. Finally, I train a linear SVM classifier to output 0 when the embedding distance is small and 1 otherwise, using the pair labels. This way I am trying to learn a threshold to be used while testing.
Using this architecture I am getting a score of 60% at most. On the other hand, using the same architecture with other models (e.g. vgg-face), where the features are 4096-dimensional [fc7:0] (not embeddings), I am getting 90%. I definitely cannot replicate the scores that I see online (99.x%), but with the embeddings the score is very low. Is there something wrong with the pipeline in general? How can I evaluate the embeddings for verification?
Never mind, the approach is correct. The FaceNet model that is available online is poorly trained, and that is the reason for the poor score. Since this model was trained on another dataset and not the original one described in the paper (obviously), the verification score will be lower than expected. However, if you set a constant threshold to the desired value, you can probably increase true positives, but at the cost of F1 score.
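For reference, the distance-plus-threshold evaluation described in the question can be sketched without an SVM by scanning candidate thresholds directly. This is a simplified stand-in, not the asker's pipeline: the function names are hypothetical, and pairs are labelled with a boolean `is_same` rather than the 0/1 encoding used above.

```python
import numpy as np

def pair_distances(emb_a, emb_b):
    """Euclidean distance between corresponding rows of two embedding arrays."""
    return np.linalg.norm(emb_a - emb_b, axis=1)

def best_threshold(distances, is_same):
    """Pick the distance cut-off that maximises accuracy on labelled pairs."""
    candidates = np.sort(distances)
    best_t, best_acc = candidates[0], 0.0
    for t in candidates:
        # A pair is predicted "same person" when its distance is at most t.
        acc = np.mean((distances <= t) == is_same)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return float(best_t), float(best_acc)
```

To get an honest score, the threshold would be chosen on one split of the pairs and the accuracy reported on a separate, held-out split.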
You can use a similarity search engine. Either using approximated kNN search libraries such as Faiss or Nmslib, cloud-ready similarity search open-source tools such as Milvus, or production-ready managed service such as

How should I structure my labels for TensorFlow?

I'm trying to use TensorFlow to train output servo commands given an input image.
I plan on using a file as @mrry suggested in this question, with the images like so:
../some/path/some_img.JPG *some_label*
My question is, what are the label formats I can provide to TensorFlow and what structures are suggested?
My data is basically n servo commands from 0-10 seconds. A vector would work great:
or similarly:
I couldn't find much about labels in the docs. Can anyone shed any light on TensorFlow labels?
And a very related question is what is the best way to structure these for TensorFlow to properly learn from them?
In TensorFlow, labels are just generic tensors. You can use any kind of tensor to store your labels. In your case a 1-D tensor with shape (4,) seems to be what you want.
Labels differ from the rest of the data only in how they are used in the computational graph. Labels should (usually) only be used inside the loss function, while the other data is propagated through the whole network. For your problem, a regression with a 4-dimensional output should work.
Also, look at my newest comment to the (old) question. Using the slice_input_producer seems to be preferable in your case.
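As a small illustration of labels being plain numeric vectors used only in the loss, here is a NumPy sketch. It assumes a hypothetical file format in which each line holds an image path followed by four space-separated servo values (the actual label layout is not given in the question), and that paths contain no spaces. The helper names are made up for this example.

```python
import numpy as np

def parse_label_line(line):
    """Split 'path v1 v2 v3 v4' into the path and a float32 label vector."""
    parts = line.split()
    path = parts[0]
    label = np.array([float(p) for p in parts[1:]], dtype=np.float32)
    return path, label

def mean_squared_error(predictions, labels):
    """Regression loss: the only place where the labels enter the computation."""
    diff = np.asarray(predictions) - np.asarray(labels)
    return float(np.mean(diff ** 2))
```

With four values per line, each parsed label has shape (4,), matching the 1-D tensor suggested above.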