I want to use the KITTI dataset to train a model with the YOLO algorithm.
In general, a YOLO label file for an image contains one line per object, giving the object's class and bounding-box coordinates in the following format:
<object-class> <x_center> <y_center> <width> <height>
The question is: how can I add different fields to this algorithm and train it?
What do you mean by "add different fields to this algorithm"?
If you want to train a standard YOLO model, you can't just add parameters such as truncated, occluded, etc. to the label file. The best way I see is to convert the labels from KITTI to YOLO format.
You can search for open-source solutions that do this conversion.
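As a rough sketch of what such a conversion does: a KITTI object label line starts with the class name, then truncated, occluded, and alpha, then the 2D box as left, top, right, bottom in pixels. YOLO wants the box centre and size normalised by the image size. The class mapping and the function name below are made up for illustration, and the image size has to come from your own images.

```python
# Hypothetical mapping from KITTI class names to YOLO class ids; adjust to your classes.
CLASS_IDS = {"Car": 0, "Pedestrian": 1, "Cyclist": 2}


def kitti_to_yolo(line, img_width, img_height, class_ids=CLASS_IDS):
    """Convert one KITTI label line to a YOLO label line.

    Returns None for classes that are not in class_ids (for example "DontCare").
    """
    fields = line.split()
    name = fields[0]
    if name not in class_ids:
        return None
    # Fields 4..7 hold the 2D bounding box in pixels: left, top, right, bottom.
    # truncated, occluded and alpha (fields 1..3) are simply dropped here.
    left, top, right, bottom = map(float, fields[4:8])
    x_center = (left + right) / 2.0 / img_width
    y_center = (top + bottom) / 2.0 / img_height
    width = (right - left) / img_width
    height = (bottom - top) / img_height
    return f"{class_ids[name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up label line and image size, only to show the call.
example = "Car 0.00 0 -1.57 100.00 150.00 300.00 250.00 1.5 1.6 3.7 1.0 1.5 10.0 -1.6"
print(kitti_to_yolo(example, img_width=1000, img_height=500))
# 0 0.200000 0.400000 0.200000 0.200000
```

If you really need truncated or occluded during training, that is no longer a standard YOLO label and the model and loss would have to be changed as well.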
I would like to change the input and output size of a convolutional TensorFlow model that I am importing from TensorFlow Hub.
What is the best way to do this? I think it would be easier if I could convert the model to Keras format, but I have not succeeded in that either.
This is the model: https://tfhub.dev/intel/midas/v2_1_small/1
The format of the input is determined by the publisher of the model. Some models are flexible about the dimensions of the input and some require input with very specific dimensions. In the latter case, the best way is to resize the input as needed before feeding it to the model.
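To illustrate the "resize before feeding" idea, here is a minimal NumPy-only sketch using nearest-neighbour sampling. The 256x256 target size and the function name are assumptions for the example; check the model page for the size and channel layout it actually expects. In a real pipeline you would normally use a library routine such as tf.image.resize instead.

```python
import numpy as np


def resize_nearest(image, out_height, out_width):
    """Nearest-neighbour resize of an (H, W, C) array to (out_height, out_width, C)."""
    in_height, in_width = image.shape[:2]
    # For each output row/column, pick the index of the source row/column to copy.
    rows = np.arange(out_height) * in_height // out_height
    cols = np.arange(out_width) * in_width // out_width
    return image[rows[:, None], cols]


# Made-up input image; 256x256 is an assumed target size, not taken from the model docs.
image = np.random.rand(480, 640, 3).astype(np.float32)
resized = resize_nearest(image, 256, 256)
batch = resized[np.newaxis, ...]  # add a batch dimension before calling the model
print(batch.shape)
```

The output of the model can be resized back to the original image size the same way.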
I want to train Mask-RCNN on my own dataset. I already have the segmented images (ground truths) of leaves which look something like the image below:
How can I load the dataset for training Mask RCNN?
Since Mask RCNN is pre-trained on the COCO dataset, you need to train it on your own images. To do that, you have to label them first. Since a mask is involved, use a tool such as the VGG annotator to do the necessary annotation and labelling; it will generate a JSON file based on your classes. Then, depending on your requirements, you run the .py files for your classes, train the model, and then run it for testing.
You will have to convert this to TFRecords for the Mask RCNN model to be able to read the images and their annotations. Please refer to this Medium article: 'https://medium.com/#vijendra1125/custom-mask-rcnn-using-tensorflow-object-detection-api-101149ce0765'
You can use COCO annotations (here is an example) to annotate your dataset and then just run it the way you would with the COCO dataset.
You can also check this code: https://github.com/matterport/Mask_RCNN/blob/master/samples/shapes/train_shapes.ipynb
I'm working on a regression neural network using Keras 1.2.1 with the TensorFlow backend, and generators for on-the-fly image augmentation.
I want to augment my shuffled dataset, based on the labels associated with each image.
For example, at each epoch, I only want to include, say 25%, of the images that are labeled as 0.00.
On the other hand, if the image is labeled as, say, <= -.20, I want to rotate/flip/shear it by some random amount.
The question is: how can I selectively choose to augment image data based on its label?
Is this possible?
You can do it in NumPy using boolean indexing. TensorFlow also lets you do that; see tf.boolean_mask.
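A minimal sketch of the boolean-indexing approach in NumPy, assuming a batch of images shaped (N, H, W, C) and one float label per image. The function name and parameters are made up for the example, and a horizontal flip stands in for whatever rotate/flip/shear you actually want. Each zero-labelled sample is kept with probability 0.25, so the kept share is only approximately 25% per call.

```python
import numpy as np


def select_and_augment(images, labels, keep_zero_fraction=0.25,
                       augment_threshold=-0.20, rng=None):
    """Drop most zero-labelled samples and flip the ones at or below the threshold."""
    rng = np.random.default_rng() if rng is None else rng
    is_zero = labels == 0.0
    # Keep every non-zero sample; keep each zero sample with the given probability.
    keep = ~is_zero | (rng.random(labels.shape[0]) < keep_zero_fraction)
    # Boolean indexing returns copies, so the caller's arrays are not modified.
    images, labels = images[keep], labels[keep]
    # Augment only the samples whose label is at or below the threshold.
    augment = labels <= augment_threshold
    images[augment] = images[augment][:, :, ::-1]  # horizontal flip along the width axis
    return images, labels


# Made-up batch: 8 tiny "images" and their labels.
images = np.random.rand(8, 4, 4, 1).astype(np.float32)
labels = np.array([0.0, -0.5, 0.3, 0.0, 0.0, -0.2, 0.1, 0.0])
batch_images, batch_labels = select_and_augment(images, labels)
print(batch_images.shape, batch_labels)
```

You would call something like this inside your generator on every batch, so a different random subset of the zero-labelled images is used each epoch.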
I am interested in using TensorFlow to train a CNN for binary classification on my data.
Now I wonder how to set the filter values and the number of output nodes in the convolution layers.
I have read many tutorials and examples. However, most of them use image data, and I cannot compare that with my data, which is customer data, not pixels.
So could you give me some suggestions on this issue?
If your data varies in time or space, then you can use a CNN. I am currently working with an EEG data set, which varies in time. You can also refer to this paper
http://www.nlpr.ia.ac.cn/english/irds/People/lwang/M-MCG_EN/Publications/2015/YD2015ACPR.pdf
where the input data (which is not an image) is presented to the CNN as an image.
You have to reshape the data to be 4-D. In this example, I have only 4 columns, so each row becomes a 2x2 "image" with 1 channel.
import numpy as np
x_train = np.reshape(x_train, (x_train.shape[0], 2, 2, 1))
x_test = np.reshape(x_test, (x_test.shape[0], 2, 2, 1))
This is a good example of using non-image data:
https://github.com/fengjiqiang/LSTM-Wind-Speed-Forecasting
You just need to change the following:
prediction_cols
feature_cols
features
and dataload
This tutorial is for text:
Here!
You might use one of the following classes:
class Dataset: Represents a potentially large set of elements.
class FixedLengthRecordDataset: A Dataset of fixed-length records from one or more binary files.
class Iterator: Represents the state of iterating through a Dataset.
class TFRecordDataset: A Dataset comprising records from one or more TFRecord files.
class TextLineDataset: A Dataset comprising lines from one or more text files.
Tutorial
official documentation
I'm trying to use TensorFlow to train a model that outputs servo commands given an input image.
I plan on using a file as @mrry suggested in this question, with the images listed like so:
../some/path/some_img.JPG *some_label*
My question is, what are the label formats I can provide to TensorFlow and what structures are suggested?
My data is basically n servo commands from 0-10 seconds. A vector would work great:
[0,2,4,3]
or similarly:
[0,.25,.4,.3]
I couldn't find much about labels in the docs. Can anyone shed any light on TensorFlow labels?
And a very related question is what is the best way to structure these for TensorFlow to properly learn from them?
In TensorFlow, labels are just generic tensors. You can use any kind of tensor to store your labels. In your case, a 1-D tensor with shape (4,) seems to be what you want.
Labels differ from the rest of the data only in how they are used in the computational graph. Usually, labels are used only inside the loss function, while the other data is propagated through the whole network. For your problem, a regression with a 4-dimensional output should work.
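To make the "labels only appear in the loss" point concrete, here is a small NumPy sketch rather than actual TensorFlow code. The label values are taken from the question's example vector, the predictions are made up, and mean squared error is just one common choice of regression loss, not something the question prescribes.

```python
import numpy as np

# One label vector of shape (4,) per image, stacked into a (batch, 4) array.
labels = np.array([[0.0, 0.25, 0.4, 0.3],
                   [0.0, 0.25, 0.4, 0.3]], dtype=np.float32)

# Stand-in for the network output on the same batch (made-up numbers).
predictions = np.array([[0.1, 0.20, 0.5, 0.3],
                        [0.0, 0.30, 0.4, 0.2]], dtype=np.float32)


def mean_squared_error(predictions, labels):
    """Average squared difference over all samples and all 4 outputs."""
    return float(np.mean((predictions - labels) ** 2))


# The labels are used only here, when the loss is computed.
loss = mean_squared_error(predictions, labels)
print(loss)
```

In a TensorFlow graph the same thing holds: the images flow through the network, and the label tensor is only fed into the loss op.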
Also, look at my newest comment on the (old) question. Using slice_input_producer seems preferable in your case.