While training a model, I got this error:
At first I thought it might be a problem with my path, but it wasn't; I then discovered that a Colab session does not support the symbolic link (ln) operation in most of the ways you could run it (including with os.system):
My question is: are there any other ways I haven't explored? Or has anyone else hit the same kind of error with Colab (not specifically with ln, but with any other unsupported operation)?
Google Drive does not support symbolic links, since they are a feature of Linux file systems. However, if the files fit on the Colab instance's local disk, you can most definitely use symbolic links there. If you need to keep the files in Drive, I would recommend compressing them, storing the archives in Drive, and extracting them locally when you need them.
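For example, a minimal sketch of the copy-then-symlink approach (assuming Drive is already mounted at /content/drive; the folder names below are placeholders):

import os
import shutil

src = "/content/drive/MyDrive/my_dataset"   # placeholder Drive folder
dst = "/content/my_dataset"                 # local Colab disk, which does support symlinks

shutil.copytree(src, dst)                   # copy the data off Drive first
os.symlink(dst, "/content/data_link")       # symlink the local copy instead of the Drive path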
You may have the wrong path.
Try changing from
/content/mydrive
To
/content/My Drive
Also check that the file really exists there:
!ls "/content/My Drive/path/to/yourfile"
I am trying to follow this tutorial for using Yolov4 with transfer learning for a new object to detect: https://sandipanweb.wordpress.com/2022/01/17/custom-object-detection-with-transfer-learning-with-pre-trained-yolo-v4-model/. In trying to set up the initial file structure on Colab, I'm having to guess where files and folders are needed, and could use some help in identifying if this is the right file/folder structure. Or, if there is a more detailed tutorial or a way to understand what Yolov4 is going to need, that would be great.
There are several references to build/darknet/x64/data/. From what I can see when I explore the two downloads, this is really /content/build/darknet/x64/data/ in Colab. Is this a correct understanding?
Under the data folder, is Yolov4 looking for the train folder and the valid folder? That is, is it looking for /content/build/darknet/x64/data/train/ and /content/build/darknet/x64/data/valid/? I'm guessing the answer is yes.
Does Yolov4 need the _darknet.labels file along with all of the image and image annotation files? I am guessing yes, because this is what's in the raccoon dataset.
The build/darknet/x64/data/train.txt and build/darknet/x64/data/valid.txt files are supposed to list the names of the images, so I'm guessing those names include the .jpg extension, because the tutorial specifically refers to images. The reason I question this is that Yolov4 should also need the annotation file names, but those are not referenced in the tutorial. If Yolov4 strips the .jpg and adds .txt to get the annotation file name, that's great, but if it needs the file name without the extension so that it can add either extension to access both files, then I didn't understand that from this tutorial.
Any guidance would really be appreciated!
After working on this, here are some notes that have helped me get it to work:
After executing !git clone https://github.com/AlexeyAB/darknet/, change directories to /content/darknet before executing !wget -P build/darknet/x64/ https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137.
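For reference, a single Colab cell running those steps in order might look like this (same repository and weights URL as above):

!git clone https://github.com/AlexeyAB/darknet/
%cd /content/darknet
!wget -P build/darknet/x64/ https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137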
Although it may be obvious, to "change the Makefile to enable GPU and opencv", click the icon labeled "Files" in the left margin of Colab. In the darknet folder, open the Makefile (double-click to open) and change the entries for GPU, CUDNN, and OPENCV from 0 to 1. Press Ctrl-S to save the file, then run make to compile the darknet executable.
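If you prefer to make the same edit from a cell instead of the file browser, a small sketch like this should work (it assumes the repo was cloned to /content/darknet and that the Makefile uses the usual GPU=0 / CUDNN=0 / OPENCV=0 lines; run make from /content/darknet afterwards as usual):

makefile_path = "/content/darknet/Makefile"
with open(makefile_path) as f:
    makefile = f.read()
for flag in ("GPU", "CUDNN", "OPENCV"):
    # flip e.g. "GPU=0" to "GPU=1"; only the first occurrence of each flag is changed
    makefile = makefile.replace(flag + "=0", flag + "=1", 1)
with open(makefile_path, "w") as f:
    f.write(makefile)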
To get around the path names referenced in the tutorial, I used full path names. Also, since files on the Colab instance disappear when the runtime is terminated, I copy the files from folders in Drive to local folders (some of which I needed to create) at the start of each session. Here is the code I used:
!mkdir /content/darknet/build/darknet/x64/data/train
!mkdir /content/darknet/build/darknet/x64/data/valid
!cp /content/drive/MyDrive/your_folder_name/obj.data /content/darknet/build/darknet/x64/data/obj.data
!cp /content/drive/MyDrive/your_folder_name/obj.names /content/darknet/build/darknet/x64/data/obj.names
!cp /content/drive/MyDrive/your_folder_name_with_train_images/* /content/darknet/build/darknet/x64/data/train/
!cp /content/drive/MyDrive/your_folder_name_with_valid_images/* /content/darknet/build/darknet/x64/data/valid/
!cp /content/drive/MyDrive/your_folder_name/train.txt /content/darknet/build/darknet/x64/data/train.txt
!cp /content/drive/MyDrive/your_folder_name/valid.txt /content/darknet/build/darknet/x64/data/valid.txt
!cp /content/drive/MyDrive/your_folder_name/yolov4_train.cfg /content/darknet/build/darknet/x64/cfg/yolov4_train.cfg
In the folders with the images, I also included a _darknet.labels file (which just contains the label for the single object you are detecting) and the annotation files (each of which has a zero as the first entry, then the bounding-box coordinates as center-x, center-y, width, and height, separated by spaces; see the tutorial for an example). This was the answer to my question 3.
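As a hypothetical example, an annotation file for an image containing one object of class 0 would be a single line with the class index followed by the normalized center-x, center-y, width, and height (the numbers here are made up):

0 0.512 0.430 0.250 0.310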
The train.txt and valid.txt files in my question 4 do work with just the .jpg file names.
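So a hypothetical train.txt would simply list one image path per line, including the .jpg extension; with the full-path approach above it might look like this (file names are placeholders):

/content/darknet/build/darknet/x64/data/train/image_001.jpg
/content/darknet/build/darknet/x64/data/train/image_002.jpg
/content/darknet/build/darknet/x64/data/train/image_003.jpg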
The tutorial mentions that you need to change the number of filters and classes in the .cfg file; note that each has to be changed in three locations: classes in each of the three [yolo] sections, and filters (set to (classes + 5) * 3) in the [convolutional] section immediately before each [yolo] section.
Last, if you just want the predictions for an image printed out in Colab, this can be done by modifying /content/darknet/darknet.py before running make. In the detect_image function, insert a print(sorted(predictions, key=lambda x: x[1])) before the return statement. When make is run, the change will be picked up.
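The edit is just one added line inside detect_image() in /content/darknet/darknet.py, placed immediately before the existing return statement (the surrounding code may differ slightly between repo versions):

print(sorted(predictions, key=lambda x: x[1]))  # print detections, sorted by confidence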
If you have not copied files from Drive in Colab before, some permissions need to be set up in your first cells. Here are the two cells I run at the beginning (Colab will ask you to manually confirm that you are OK with access being granted when you run them):
from google.colab import drive
drive.mount('/content/drive')

# Authenticate and create the PyDrive client.
from google.colab import auth
from oauth2client.client import GoogleCredentials
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
I started using Microsoft Hyper-V for testing before deployment to end users and/or for troubleshooting. For this purpose, I have a fairly fresh OS image from my employer; let's call that checkpoint root. If I mess this one up, I cannot easily recover it.
Now, for a complex test, I prepare the VM with prerequisites and create a checkpoint test-start. I can revert to this whenever I need to start a new run. So far everything is perfect.
The question: how can I simply discard this test-start checkpoint, without merging it into my root, once I have finished testing? If test-start has child checkpoints, I would like to discard them as well. In short, I'd like to do "Delete checkpoint subtree", but without the merging that implicitly happens when doing so.
I have been searching quite a bit on the web, including the MS docs of course, but couldn't find that kind of information. I hope this question is not too stup... ehm, trivial... yet simple to answer for somebody.
I believe that by deleting the test-start checkpoint and discarding the changes, you mean to restore the virtual machine to its original state, i.e. the root checkpoint. That's exactly what needs to be done: revert the virtual machine to the root checkpoint, then delete the test-start subtree. After reverting to the root checkpoint, test-start and its subtree belong to a point in time after the current running state, so the corresponding AVHDX files are deleted immediately, without merging.
I found some other MS documentation. It really is that simple: when deleting a checkpoint on another branch, or ahead in time of the currently running VM, no merging happens, and "deleting" the checkpoint is a pure delete operation.
I have an app hosted on Heroku. I seek to extract text from various PDFs. I'm currently using tesseract for this.
Since Heroku does not offer much storage space and the .traineddata files are big (and I need to use all of them), is it possible to somehow store the tessdata language data on S3? I have not been able to find any solution for this yet.
All I could find is that I can pass --tessdata-dir PATH, but that expects a local directory.
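For what it's worth, one rough sketch of that idea is to download the needed .traineddata files from S3 into the dyno's ephemeral storage at startup and point --tessdata-dir there. The bucket name, key names, and the use of boto3/pytesseract below are assumptions, not anything Heroku- or Tesseract-specific, and downloading many languages at every boot carries the performance cost the answer below warns about:

import os
import boto3
import pytesseract
from PIL import Image

TESSDATA_DIR = "/tmp/tessdata"      # ephemeral dyno storage
BUCKET = "my-tessdata-bucket"       # hypothetical bucket holding the .traineddata files
LANGS = ["eng", "deu"]              # whichever languages you actually need

os.makedirs(TESSDATA_DIR, exist_ok=True)
s3 = boto3.client("s3")
for lang in LANGS:
    target = os.path.join(TESSDATA_DIR, lang + ".traineddata")
    if not os.path.exists(target):  # download once per dyno boot
        s3.download_file(BUCKET, lang + ".traineddata", target)

text = pytesseract.image_to_string(
    Image.open("page.png"),         # a page rendered from the PDF
    lang="+".join(LANGS),
    config="--tessdata-dir " + TESSDATA_DIR,
)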
Sadly, I'm not sure Heroku is a good fit for your needs if you can't make all the data fit within the Heroku slug. Even if you could get it to work, it would be quite a performance hit.
You'd probably be better off setting up Tesseract as an API with its own server(s), then sending whatever you need to that API from Heroku (or moving the entire app over). Depending on the size of the rest of your app and how quickly Tesseract is growing in size, that might just mean Tesseract gets its own Heroku app with absolutely minimal dependencies, or it might mean moving that part of the app to AWS or something.
What is the best way to store NLP models? I have multiple NLP models which are about 800 MB in size in total. My code will load the models into memory at startup. However, I am wondering what the best way to store them is. Should I store them in the git repo so I can load them directly from the local file system, or should I store them in an external location like S3 and load them from there? What are the advantages/disadvantages of each? Or do people use some other method I haven't considered?
Do your NLP models need to be version controlled? Do you ever need to revert to a previous NLP model? If not, storing your artifacts in an S3 bucket is certainly sufficient. If you are planning on storing many NLP models for a long period of time, I also recommend AWS Glacier; it is extremely cost-effective for long-term storage.
Very good question, though very few people pay attention to it.
Here are a few factors to consider:
Cost: (1) storing the files, and (2) bandwidth, i.e. the cost of downloading/uploading resources (models, etc.).
Lazy download: not all resources are required for running an NLP system, and it's a headache for the end user to download many resources that are not useful for their purpose. In other words, the system should download (ideally by itself) a resource only when it is actually needed (see the sketch at the end of this answer).
Convenience.
And options are:
S3: the benefit is that once you have it working, it's convenient. The issue is that someone familiar with S3 and Amazon AWS has to monitor the system for failures/payments/etc., and it's often expensive. Not only do you pay for the storage space, but more importantly you also pay for bandwidth. If you have resources like word embeddings or dictionaries (in addition to your models), each taking a few GB, it's not hard to hit terabytes of bandwidth usage. AI2 uses S3 and has a simple Scala system for this; their system is "lazy", i.e. your program downloads (and caches) a given resource only when it's required.
Keep it in the repo: checking big binary files into the repo is certainly not a good idea, unless you use LFS to keep the big files out of your git history. Even then, I'm not sure how you'd make programmatic calls to your files; you'd probably need scripts and instructions for users to manually download the files, etc., which is ugly.
I'm adding these two options too:
Maven dependency: basically package everything in JAR files, deploy them, and add them as dependencies. We used to use this, and some people still do (e.g. the StanfordNLP folks ask you to add models as a Maven dependency). I personally do not recommend it, mainly because Maven is not designed to handle big resources (it sometimes hangs, etc.). This approach is also not lazy, meaning that Maven downloads everything at once at compile/run time (e.g. when trying StanfordCoreNLP for the first time, you have to download a few gigabytes of files that you might never use, which is a headache). Also, if you're a Java user, you know that working with the classpath is a big headache of its own.
Your own server: install a file-manager server (like Minio), store your files there, and whenever a file is required, send programmatic calls to the server in your desired language (their APIs are available for different languages on their GitHub page). We've written a convenient Java wrapper to access it that might come in handy for you. This gives you the lazy behavior (like S3) while not being expensive (unlike S3); basically you get all the benefits of S3.
Just to summarize my opinion: I've tried S3 in the past, and it was pretty convenient but expensive. Since we have a server that's often idle, we are using Minio and we're happy with it. I'd go with this option if you have a reliable remote server on which to store your files.
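To make the lazy-download-and-cache pattern concrete, here is a minimal Python sketch; the bucket name, key layout, cache location, and use of boto3 are all assumptions:

import os
import boto3

CACHE_DIR = os.path.expanduser("~/.cache/nlp_models")  # local cache directory
BUCKET = "my-nlp-models"                                # hypothetical S3 bucket

def fetch_resource(key):
    """Download a resource from S3 only if it is not already cached locally."""
    local_path = os.path.join(CACHE_DIR, key)
    if not os.path.exists(local_path):
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        boto3.client("s3").download_file(BUCKET, key, local_path)
    return local_path

# Only the resources this particular run needs are ever downloaded:
model_path = fetch_resource("ner/model.bin")

The same code works against a Minio server by passing endpoint_url to boto3.client, since Minio exposes an S3-compatible API.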
I have a few questions about TensorFlow. I'm following the "TensorFlow for Poets" tutorial (https://petewarden.com/2016/02/28/tensorflow-for-poets/), and I got the expected result.
However, I would like to know two things:
1. How do I classify more than one image at a time?
2. How do I extract the results in .txt format?
Thank you
I had the same issue, so I built the TensorPy GitHub repo to easily handle image classification of either individual or multiple images directly from web pages. How it works: for multiple images, it first scrapes all the image links directly from a given web page, then downloads those images to a temporary folder and converts them to JPEG format, and finally uses TensorFlow to classify all of them and print out the results, which you can easily write to a txt file by adding " > output.txt" to the end of your command-line run statement. See the video tutorial link in the repo, and swap out the individual image file from the example for a web page. Since you probably want your own customization, feel free to look at how the code works so that you can create your own version as needed.
After creating my solution, I saw that there are also other good solutions available online. Check out Siraj's image classification tutorial, which has a link to the associated GitHub repo in the video description.
UPDATE:
If you're just looking to run TensorFlow's classify_image.py on multiple image files in a folder, you can easily create a bash script for that:
for i in temp_image_folder/*.jpg; do
  python classify_image.py --image="$i"   # quote the path in case a filename contains spaces
done
I am currently using the find command:
find ./ -type f -iname "*.jpg" -exec python3 classify_image.py --image={} \;
But I'm also looking for a solution that does not have to reload the complete script for every image.
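One way around that is to load the graph once and loop over the images in Python instead of re-invoking the script. Here is a minimal sketch (TensorFlow 1.x API, assuming the retrained_graph.pb / retrained_labels.txt files produced by the tutorial's retraining step; adjust the file names and tensor names to your own setup):

import glob
import tensorflow as tf

labels = [line.rstrip() for line in tf.gfile.GFile("retrained_labels.txt")]

# Load the graph a single time instead of once per image.
with tf.gfile.FastGFile("retrained_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name("final_result:0")
    with open("output.txt", "w") as out:
        for image_path in sorted(glob.glob("temp_image_folder/*.jpg")):
            image_data = tf.gfile.FastGFile(image_path, "rb").read()
            predictions = sess.run(softmax_tensor,
                                   {"DecodeJpeg/contents:0": image_data})[0]
            top = predictions.argsort()[::-1][:5]
            out.write(image_path + "\n")
            for node_id in top:
                out.write("  %s (score = %.5f)\n"
                          % (labels[node_id], predictions[node_id]))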