I have a few questions about TensorFlow. I'm following the "TensorFlow for Poets" tutorial (https://petewarden.com/2016/02/28/tensorflow-for-poets/), and I got the expected result.
However, I would like to know two things:
1. How to classify more than one image at a time?
2. How to extract the result in .txt format?
Thank you
I had the same issue, so I built the TensorPy GitHub repo to easily handle image classifications of either individual or multiple images directly from web pages.
How it works: for multiple images, it first scrapes all the image links from a given web page. It then downloads those images to a temporary folder and converts them to JPEG format. Finally, it uses TensorFlow to classify all those images and prints the results, which you can easily send to a txt file by adding " > output.txt" to the end of your command line run statement.
See the video tutorial link in the repo, and swap out the individual image file from the example for a web page. Since you probably want your own customization, feel free to look at how the code works so that you can create your own version as you need it.
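This is not TensorPy's actual code, but the scraping step can be sketched with nothing beyond the Python standard library (the names here are mine):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageLinkParser(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    # Resolve relative srcs against the page URL.
                    self.links.append(urljoin(self.base_url, value))

def image_links(html, base_url):
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

From there it is a matter of downloading each link, converting to JPEG, and feeding the files to the classifier.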
After creating my solution, I saw that there are also other good solutions available online. Check out Siraj's image classification tutorial, which has a link to the associated GitHub repo in the video description.
UPDATE:
If you're just looking to run TensorFlow's classify_image.py on multiple image files in a folder, you can easily create a bash script for that:
for i in temp_image_folder/*.jpg; do
  python classify_image.py --image="$i"
done
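If you also want the results collected into a txt file (the second question above), the same loop can be driven from Python. This is a generic sketch, not part of the tutorial; the classify command is passed in as a parameter:

```python
import glob
import os
import subprocess

def classify_folder(folder, out_path, command):
    """Run `command` once per .jpg in `folder` and collect everything the
    command prints into `out_path`.  `command` is an argv prefix such as
    ["python", "classify_image.py"]; "--image=<path>" is appended for
    each file, mirroring the bash loop above."""
    with open(out_path, "w") as out:
        for image in sorted(glob.glob(os.path.join(folder, "*.jpg"))):
            result = subprocess.run(command + ["--image=" + image],
                                    capture_output=True, text=True)
            out.write("== " + image + " ==\n")
            out.write(result.stdout)
```

Note that this still starts a fresh Python process (and reloads the model) for every image, just like the bash version; avoiding that means moving the loop inside the classifier script itself.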
I am currently using the "find" command.
find ./ -type f -iname "*.jpg" -exec python3 classify_image.py --image={} \;
But I'm also looking for a solution that does not have to reload the complete script for every image.
I am trying to follow this tutorial on using Yolov4 with transfer learning to detect a new object: https://sandipanweb.wordpress.com/2022/01/17/custom-object-detection-with-transfer-learning-with-pre-trained-yolo-v4-model/. In trying to set up the initial file structure on Colab, I'm having to guess where files and folders are needed, and could use some help confirming whether this is the right file/folder structure. Or, if there is a more detailed tutorial, or a way to understand what Yolov4 is going to need, that would be great.
1. There are several references to build/darknet/x64/data/. From what I can see when I explore the two downloads, this is really /content/build/darknet/x64/data/ in Colab. Is this a correct understanding?
2. Under the data folder, is Yolov4 looking for the train folder and the valid folder? That is, is it looking for /content/build/darknet/x64/data/train/ and /content/build/darknet/x64/data/valid/? I'm guessing the answer is yes.
3. Does Yolov4 need the _darknet.labels file along with all of the image and image annotation files? I am guessing yes, because this is what's in the raccoon dataset.
4. The build/darknet/x64/data/train.txt and build/darknet/x64/data/valid.txt files are to have the names of the images, so I'm guessing that each name includes the .jpg extension, because the tutorial specifically refers to images. The reason I question this is that Yolov4 should also need the annotation file names, but those are not referenced in the tutorial. If Yolov4 strips the .jpg and adds .txt to get the annotation file name, that's great; but if it needs the file name without the extension so that it can add either extension to access both files, then I didn't understand that from the tutorial.
Any guidance would really be appreciated!
After working on this, here are some notes that have helped me get it to work:
After executing !git clone https://github.com/AlexeyAB/darknet/, change directory to /content/darknet before executing !wget -P build/darknet/x64/ https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137.
Although it may be obvious: to "change the Makefile to enable GPU and opencv", click the "Files" icon in the left margin in Colab. In the darknet folder, open the Makefile (double-click to open) and change the entries for GPU, CUDNN, and OPENCV from 0 to 1. Ctrl-S to save the file, then run make to compile the darknet executable.
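The relevant lines near the top of the Makefile then read as follows (flipped from 0 to 1; other flags such as CUDNN_HALF can be left at 0):

```makefile
GPU=1
CUDNN=1
OPENCV=1
```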
To get around the path names referenced in the tutorial, I used full path names. Also, since the files referenced disappear when the Colab runtime is terminated, I copied the files from folders in Drive to local folders, some of which I needed to create. Here is the code I used:
!mkdir /content/darknet/build/darknet/x64/data/train
!mkdir /content/darknet/build/darknet/x64/data/valid
!cp /content/drive/MyDrive/your_folder_name/obj.data /content/darknet/build/darknet/x64/data/obj.data
!cp /content/drive/MyDrive/your_folder_name/obj.names /content/darknet/build/darknet/x64/data/obj.names
!cp /content/drive/MyDrive/your_folder_name_with_train_images/* /content/darknet/build/darknet/x64/data/train/
!cp /content/drive/MyDrive/your_folder_name_with_valid_images/* /content/darknet/build/darknet/x64/data/valid/
!cp /content/drive/MyDrive/your_folder_name/train.txt /content/darknet/build/darknet/x64/data/train.txt
!cp /content/drive/MyDrive/your_folder_name/valid.txt /content/darknet/build/darknet/x64/data/valid.txt
!cp /content/drive/MyDrive/your_folder_name/yolov4_train.cfg /content/darknet/build/darknet/x64/cfg/yolov4_train.cfg
In the folders with the images, I also included a _darknet.labels file (which just has the label for the single object you are detecting) and the annotation files (each has a zero as the first entry, then the bounding box coordinates as center x, center y, width, height, with just spaces in between; see the tutorial for an example). This was the answer for my question 3.
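The normalization that the annotation files expect can be sketched like this (the helper name is mine, not from the tutorial):

```python
def yolo_annotation(class_id, box, image_w, image_h):
    """Convert a pixel-space box (xmin, ymin, xmax, ymax) into the
    normalized 'class cx cy w h' annotation line, where center and size
    are fractions of the image width and height."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2.0 / image_w
    cy = (ymin + ymax) / 2.0 / image_h
    w = (xmax - xmin) / float(image_w)
    h = (ymax - ymin) / float(image_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, cx, cy, w, h)
```

For a single-object dataset the class id is always 0, matching the single label in _darknet.labels.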
The train.txt and valid.txt files in my question 4 do work with just the .jpg file names.
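A small helper (my own, hypothetical) can generate such a list from an image folder:

```python
import os

def write_image_list(image_dir, list_path):
    """List every .jpg in image_dir, one name per line, the way
    train.txt / valid.txt are laid out here (bare .jpg names; depending
    on how your obj.data paths are set up, you may need full paths)."""
    names = sorted(f for f in os.listdir(image_dir)
                   if f.lower().endswith(".jpg"))
    with open(list_path, "w") as out:
        for name in names:
            out.write(name + "\n")
```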
The tutorial mentions that you need to change the number of filters and classes, but note this has to be changed in three locations.
Last, if you just want the prediction to print out in Colab for an image, this can be done by modifying /content/darknet/darknet.py before running make. In the detect_image function, insert a print(sorted(predictions, key=lambda x: x[1])) before the return. When make is done, this change will be picked up.
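With made-up detections in the (label, confidence, bbox) shape that detect_image works with (the exact shape may vary by darknet version), the inserted line behaves like this:

```python
# Hypothetical detections: (label, confidence, bounding box).
predictions = [("raccoon", 92.1, (10, 20, 50, 60)),
               ("cat", 3.4, (5, 5, 8, 8))]

# key=lambda x: x[1] sorts by the confidence field, lowest first,
# so the most confident detection is printed last:
print(sorted(predictions, key=lambda x: x[1]))
```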
If you have not copied files from Drive in Colab before, some permissions need to be set up in your first cells. Here are the two cells I run at the beginning (Colab will ask you to manually verify that you are OK with access being granted when you run them):
`from google.colab import drive
drive.mount('/content/drive')`
`# Authenticate and create the PyDrive client.
from google.colab import auth
from oauth2client.client import GoogleCredentials
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)`
I tried to run a project (ipynb extension) from GitHub using Google Colab.
I have managed to run the program, but when compared with the author's output, mine is a little different.
For example, train_df.describe() does not print some of the columns (the 'target' column in particular, which is used to plot a graph).
Why is it that I run the same program but get a different result?
During a model training, I got this error:
First I thought it was maybe a problem with my path, but it wasn't. I discovered that a Colab session does not support the symbolic link (ln) operation for most of the ways you could run it (also with os.system).
My question is: are there any other ways that I did not explore? Or did someone get the same error with Colab? (Not specifically with ln, but any other unsupported operation.)
Google Drive does not support symbolic links, as they are a feature of Linux filesystems. However, if the files fit, you can copy them onto the Colab instance's local storage and use them there directly; symbolic links work fine on the local disk. If you need to keep the files in Drive, I would recommend compressing them for storage.
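For example, in a plain Python cell (this runs on the VM's local disk, not on the Drive mount):

```python
import os
import tempfile

# Symlinks work on the Colab VM's local disk (e.g. under /content);
# it is the Drive FUSE mount that rejects them.  Demonstrated here in
# a temporary local directory:
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.txt")
with open(target, "w") as f:
    f.write("hello")

link = os.path.join(workdir, "link.txt")
os.symlink(target, link)   # would fail inside /content/drive
print(open(link).read())   # -> hello
```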
You may have the wrong path.
Try changing from
/content/mydrive
To
/content/My Drive
Also check with
!ls "/content/My Drive/path/to/yourfile"
That the file really exists there.
I have many VRT files generated using gdal_translate originally for adjacent images.
Is there a way to merge all those VRT files into one VRT file, so that when I run gdal2tiles.py I only need to give it this one composite VRT file?
I first thought gdalwarp would do the trick, but it turns out that gdalwarp merges the images into one single image. However, I don't want to merge the images; I would like to merge the VRT files.
There is a gdalbuildvrt utility in GDAL since 1.6.1, which merges multiple input files into one VRT mosaic file. See the official documentation for usage details:
http://www.gdal.org/gdalbuildvrt.html
You most probably just need to give it the output filename followed by all the individual input files, for example:
gdalbuildvrt mosaic.vrt input1.vrt input2.vrt input3.vrt
You have tagged your question with the "maptiler" label, which refers to the http://www.maptiler.com/ product. MapTiler is able to render multiple files out of the box and does not use VRT at all internally. It is more efficient to supply the individual input files to MapTiler directly than to create a VRT and pass it to the software: a VRT introduces an artificial internal block size for reading the data, which slows down the tile rendering process, in some cases significantly.
Feel free to request a demo of MapTiler Pro and compare the speed, size and quality of the map tiles you receive - and post the results here.
I'm trying to do the training process, but I don't understand even how to start. I would like to train it to read numbers. My images are from the real world, so the reading process didn't go very well.
The documentation says that I have to have a ".tif" image with the examples... is that a separate image for every number (in this case), or one image with a lot of different examples of the numbers (same font, though)?
And what about makebox? The command didn't work here.
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Could someone explain this to me better, at least how to start?
I saw a few programs that do this more quickly; I tried one (SunnyPage 1.8), but it isn't free. Does anyone know any free software that does this, or a good tutorial?
Using Tesseract 3, Windows 8 (32-bit).
It is important to patiently follow the training wiki on the Google Code project site, multiple times if needed. Tesseract is an open source library and is constantly evolving.
You will have to create a training image (TIFF) with a lot of different examples of the numbers; it should probably contain all the numbers you wish the engine to recognize.
Please consider posting the exact error message you got with makebox.
I think Tesseract is the best free solution available. You have to keep working at it and seek help from the community.
There is a very good post from Cédric here explaining the training process for Tesseract.
A good free OCR program is PDF OCR X, which is also based on Tesseract. I tried it on my notes in German, which I had scanned at 1200 dpi, and the results were commendable but not perfect. I found that this website - http://onlineocr.net - is a lot more accurate. If you are not registered, it allows a maximum file size of 4 MB, from most image formats (BMP, PNG, JPEG, etc.) and PDF. It can output the results as a Word file, an Excel file, or a txt file.
Hope this helps.