COCO json annotation to YOLO txt format - tensorflow

how to convert a single COCO JSON annotation file into a YOLO darknet format?? like below
each individual image has separate filename.txt file

My classmates and I have created a python package called PyLabel to help others with this task and other labelling tasks.
Our package does this conversion! You can see an example in this notebook https://github.com/pylabel-project/samples/blob/main/coco2yolov5.ipynb.
You're answer should be in there! But you should be able to do this conversion by doing something like:
!pip install pylabel
from pylabel import importer
dataset = importer.ImportCoco(path=path_to_annotations, path_to_images=path_to_images)
dataset.export.ExportToYoloV5(dataset)
You can find the source code that is used behind the scenes here https://github.com/pylabel-project/

I built a tool
https://github.com/tw-yshuang/coco2yolo
Download this repo and use the following command:
python3 coco2yolo.py [OPTIONS]
coc2yolo
Usage: coco2yolo.py [OPTIONS] [CAT_INFOS]...
Options:
-ann-path, --annotations-path TEXT
JSON file. Path for label. [required]
-img-dir, --image-download-dir TEXT
The directory of the image data place.
-task-dir, --task-categories-dir TEXT
Build a directory that follows the task-required categories.
-cat-t, --category-type TEXT Category input type. (interactive | file) [default: interactive]
-set, --set-computing-type TEXT
Set Computing for the data. (union | intersection) [default: union]
--help Show this message and exit.

There is an open-source tool called makesense.ai for annotating your images. You can download YOLO txt format once you annotate your images. But you won't be able to download the annotated images.

There is three ways.
use roboflow https://roboflow.com/formats (You can find another solution also)
You can find some usage guide for roboflow. e.g.
https://medium.com/red-buffer/roboflow-d4e8c4b52515
search 'convert coco format to yolo format' -> you will find some open-source codes to convert annotations to yolo format.
write your own code to convert coco format to yolo format

Related

Extracting output from joblib function

I'm trying to extract the output shown when saving a Tensorflow model using joblib's dump function, I can't seem to find the right documentation to extract it. I've attached an image below with the desired output highlighted under the "INFO" section.
Code
Thank you!

Remove header from .MAT files loaded by loadmat() from scipy.io library

I am new to both python and tensorflow.
I am trying to make a input pipeline for a generative adversarial network with input complex number data in .mat format and loaded it with loadmat() from scipy.io library. Now I am trying to prepare my data for giving input to my network and i tried from_tensor_slices(). But it can not be converted into tensor because of the headers in it. I looked up how to remove header from files by python and found some techniques that can be applied to .csv file but nothing on .mat files. How can I remove the header from .mat files? Also, the loadmat() function returns a list of dictionary I think. How can I extract the data from the file under such condition? Thank you.

Is it possible to re-train a mobilenet neural network with png images?

I'm currently working on a mobilenet pre-trained network which I would like to re-train with a dataset which contains png images.
I call the retrain script as follow :
python scripts/retrain.py
--bottleneck_dir=tf_files/bottlenecks
--how_many_training_steps=200
--model_dir=tf_files/models/
--summaries_dir=tf_files/training_summaries/"mobilenet_0.50_224"
--output_graph=tf_files/retrained_graph.pb
--output_labels=tf_files/retrained_labels.txt
--architecture mobilenet_0.50_224
--image_dir=tf_files/data
It seems like the images needs to be jpg, is it any way to work with png images instead ?
Can confirm it doesn't work with png files. I have, however, written a bash script that when placed in the same directory as the subclasses of the dataset can convert the images to jpg.
first you need to install imagemagick package by:
sudo apt-get install imagemagick
then you can run this script:
#!/bin/bash
for d in */ ; do
cd "$d"
for p in * ; do
IFS='.' read -r -a array <<< "$p"
convert "$p" "${array[0]}".jpg
done
cd ..
done
edit:
retrain.py does have a list with valid extensions (line 151):
extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
I didn't try to add 'png' to the list though

How to manually create text summaries in TensorFlow?

First of all, I already know how to manually add float or image summaries. I can construct a tf.Summary protobuf manually. But what about text summaries? I look at the definition for summary protobuf here, but I don't find a "string" value option there.
TensorBoard's text plugin offers a pb method that lets you create text summaries outside of a TensorFlow environment.
https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/text/summary.py#L74
Example usage:
import tensorboard as tb
text_summary_proto = tb.summary.pb('fooTag', 'text data')
John Hoffman's answer is great, though the tb.summary.pb API seems not available as of TF 1.x. You can instead use the following APIs:
tb.summary.text_pb("key", "content of the text data")
Just FYI, tb.summary has many similar methods for other types of summary as well:
'audio', audio_pb',
'custom_scalar', 'custom_scalar_pb',
'histogram', 'histogram_pb',
'image', 'image_pb',
'pr_curve', 'pr_curve_pb',
'pr_curve_raw_data_op',
'pr_curve_raw_data_pb',
'pr_curve_streaming_op',
'scalar', 'scalar_pb',
'text', 'text_pb'

TensorFlow: Opening log data written by SummaryWriter

After following this tutorial on summaries and TensorBoard, I've been able to successfully save and look at data with TensorBoard. Is it possible to open this data with something other than TensorBoard?
By the way, my application is to do off-policy learning. I'm currently saving each state-action-reward tuple using SummaryWriter. I know I could manually store/train on this data, but I thought it'd be nice to use TensorFlow's built in logging features to store/load this data.
As of March 2017, the EventAccumulator tool has been moved from Tensorflow core to the Tensorboard Backend. You can still use it to extract data from Tensorboard log files as follows:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
event_acc = EventAccumulator('/path/to/summary/folder')
event_acc.Reload()
# Show all tags in the log file
print(event_acc.Tags())
# E. g. get wall clock, number of steps and value for a scalar 'Accuracy'
w_times, step_nums, vals = zip(*event_acc.Scalars('Accuracy'))
Easy, the data can actually be exported to a .csv file within TensorBoard under the Events tab, which can e.g. be loaded in a Pandas dataframe in Python. Make sure you check the Data download links box.
For a more automated approach, check out the TensorBoard readme:
If you'd like to export data to visualize elsewhere (e.g. iPython
Notebook), that's possible too. You can directly depend on the
underlying classes that TensorBoard uses for loading data:
python/summary/event_accumulator.py (for loading data from a single
run) or python/summary/event_multiplexer.py (for loading data from
multiple runs, and keeping it organized). These classes load groups of
event files, discard data that was "orphaned" by TensorFlow crashes,
and organize the data by tag.
As another option, there is a script
(tensorboard/scripts/serialize_tensorboard.py) which will load a
logdir just like TensorBoard does, but write all of the data out to
disk as json instead of starting a server. This script is setup to
make "fake TensorBoard backends" for testing, so it is a bit rough
around the edges.
I think the data are encoded protobufs RecordReader format. To get serialized strings out of files you can use py_record_reader or build a graph with TFRecordReader op, and to deserialize those strings to protobuf use Event schema. If you get a working example, please update this q, since we seem to be missing documentation on this.
I did something along these lines for a previous project. As mentioned by others, the main ingredient is tensorflows event accumulator
from tensorflow.python.summary import event_accumulator as ea
acc = ea.EventAccumulator("folder/containing/summaries/")
acc.Reload()
# Print tags of contained entities, use these names to retrieve entities as below
print(acc.Tags())
# E. g. get all values and steps of a scalar called 'l2_loss'
xy_l2_loss = [(s.step, s.value) for s in acc.Scalars('l2_loss')]
# Retrieve images, e. g. first labeled as 'generator'
img = acc.Images('generator/image/0')
with open('img_{}.png'.format(img.step), 'wb') as f:
f.write(img.encoded_image_string)
You can also use the tf.train.summaryiterator: To extract events in a ./logs-Folder where only classic scalars lr, acc, loss, val_acc and val_loss are present you can use this GIST: tensorboard_to_csv.py
Chris Cundy's answer works well when you have less than 10000 data points in your tfevent file. However, when you have a large file with over 10000 data points, Tensorboard will automatically sampling them and only gives you at most 10000 points. It is a quite annoying underlying behavior as it is not well-documented. See https://github.com/tensorflow/tensorboard/blob/master/tensorboard/backend/event_processing/event_accumulator.py#L186.
To get around it and get all data points, a bit hacky way is to:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
class FalseDict(object):
def __getitem__(self,key):
return 0
def __contains__(self, key):
return True
event_acc = EventAccumulator('path/to/your/tfevents',size_guidance=FalseDict())
It looks like for tb version >=2.3 you can streamline the process of converting your tb events to a pandas dataframe using tensorboard.data.experimental.ExperimentFromDev().
It requires you to upload your logs to TensorBoard.dev, though, which is public. There are plans to expand the capability to locally stored logs in the future.
https://www.tensorflow.org/tensorboard/dataframe_api
You can also use the EventFileLoader to iterate through a tensorboard file
from tensorboard.backend.event_processing.event_file_loader import EventFileLoader
for event in EventFileLoader('path/to/events.out.tfevents.xxx').Load():
print(event)
Surprisingly, the python package tb_parse has not been mentioned yet.
From documentation:
Installation:
pip install tensorflow # or tensorflow-cpu pip install -U tbparse # requires Python >= 3.7
Note: If you don't want to install TensorFlow, see Installing without TensorFlow.
We suggest using an additional virtual environment for parsing and plotting the tensorboard events. So no worries if your training code uses Python 3.6 or older versions.
Reading one or more event files with tbparse only requires 5 lines of code:
from tbparse import SummaryReader
log_dir = "<PATH_TO_EVENT_FILE_OR_DIRECTORY>"
reader = SummaryReader(log_dir)
df = reader.scalars
print(df)