I am conducting a suitability analysis using a road layer that I buffered. After creating the vector buffer layer, I converted it to a raster. I now want to use the raster calculator, in combination with additional raster layers, to produce an output raster that excludes the areas within the buffer (the entire 'buffer raster layer'). My issue is that the 'buffer raster layer' consists only of the buffered areas... Any thoughts/suggestions would be appreciated.
Best,
Eric
One solution to this is to make a new copy of the raster with a larger extent. In your geoprocessing environments, set the Extent to equal the extent of your other raster layers. Then, use the Copy Raster tool. The new raster should be the same size as your other data, and you can proceed with raster calculator.
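Once the extents match, the exclusion itself is conditional raster algebra (in ArcGIS Spatial Analyst terms, something along the lines of Con(IsNull("buffer_ras"), "suitability_ras")). A minimal numpy sketch of the idea, with toy arrays standing in for two aligned rasters and NaN standing in for NoData:

```python
import numpy as np

# Hypothetical aligned rasters: 1 inside the road buffer, NaN (NoData) elsewhere.
buffer_raster = np.array([
    [1.0,    1.0,    np.nan],
    [np.nan, 1.0,    np.nan],
    [np.nan, np.nan, np.nan],
])
suitability = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 9.0, 7.0],
    [6.0, 1.0, 8.0],
])

# Keep suitability values only where the buffer raster is NoData,
# i.e. exclude every cell that falls inside the buffer.
result = np.where(np.isnan(buffer_raster), suitability, np.nan)
```

The same conditional logic carries over to the raster calculator expression once both layers share the same extent and cell size.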
I will train my dataset with Faster R-CNN for one class. All my images are 1920x1080. Should I resize or crop the images, or can I train at this size?
Also my objects are really small (around 60x60).
In the config file the dimensions are written as min_dimension: 600 and max_dimension: 1024, which is why I am unsure about training the model on 1920x1080 images.
If your objects are small, resizing the images to a smaller size is not a good idea. You can raise max_dimension to 1920 or 2000, which will slow training somewhat. Before cropping, consider how the objects are placed in the images: if cropping would cut through many objects, you will get many truncated instances, which can hurt the model's performance.
If you want to stick with Faster R-CNN for this task, I personally recommend:
Change the input height/width and the minimum and maximum dimensions in the config file so that training on your dataset runs successfully.
Change the original region-proposal parameters (also in the config file) to a ratio and scale that match your objects, e.g. 1:1 and around 60 px.
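As a hedged sketch of what those two changes might look like in a TF Object Detection API pipeline config (the field names follow that config format; the concrete values are assumptions for 1920x1080 frames with ~60 px objects, so adjust them to your data):

```
model {
  faster_rcnn {
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 1080   # was 600
        max_dimension: 1920   # was 1024
      }
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        # The base anchor is typically 256 px, so a 0.25 scale
        # gives ~64 px anchors, close to the ~60 px objects.
        scales: [0.25, 0.5, 1.0]
        aspect_ratios: [1.0]
      }
    }
  }
}
```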
But if I were you, I would try:
Add some shortcut connections in the backbone, since small-object detection needs high-resolution features.
Cut off the Fast R-CNN head to improve performance: since you only need to decide whether a detection is the one class or background, the RPN-stage output should be enough to encode that information.
I have a working object detection model (a fine-tuned MobileNet SSD) that detects my custom small robot. I'll feed it webcam footage (the camera will be mounted on a drone) and use the real-time bounding-box information.
So, I am about to purchase the camera.
My questions: since SSD resizes the input images into 300x300, is the camera resolution very important? Does higher resolution mean better accuracy (even when it gets resized to 300x300 anyway)? Should I crop the camera footage into 1:1 aspect ratio at every frame before running my object detection model on it? Should I divide the image into MxN cropped segments and run inference one by one?
My robot is very small and the drone will fly at a 4-meter altitude, so I'll effectively be trying to detect a very tiny spot in the input image.
Any sort of wisdom is greatly appreciated, thank you.
Those are quite a few questions; I'll try to answer all of them.
The detection model resizes input images before feeding them to the network using some resizing method, e.g. bilinear interpolation. It is of course better if the input image is equal to or larger than the network's input size rather than smaller. A rule of thumb is that higher resolution does mean better accuracy, but it depends heavily on the setup and the task. Suppose you are trying to detect a small object and the original resolution is 1920x1080: after resizing, the small object becomes even smaller (pixel-wise) and may be too small to detect. Therefore it is better either to split the image into smaller tiles (ideally with some overlap, to avoid missed detections when an object straddles a tile boundary) and run detection on each, or to use a model with a higher input resolution. Be aware that while the first option works with your current model, for the latter you'll need to train a new model, possibly with architectural changes (e.g. adding SSD layers and modifying anchors, depending on the scales you want to detect).
Regarding aspect ratio, the main thing is to stay consistent. It doesn't matter if you don't keep the original aspect ratio, but if you don't, do the same in both training and evaluation/test/deployment.
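To make the tiling option concrete, here is a minimal sketch (pure numpy; the function name and parameters are mine) that splits a frame into fixed-size overlapping tiles and keeps each tile's offset, so detected boxes can be mapped back to full-image coordinates:

```python
import numpy as np

def tile_image(img, tile, overlap):
    """Split an H x W (x C) image into overlapping square tiles.

    Returns a list of (y, x, tile_array) so detections on each tile
    can be shifted back into full-image coordinates.
    Assumes the image is at least `tile` pixels in each dimension.
    """
    h, w = img.shape[:2]
    step = tile - overlap
    tiles = []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            # Clamp so edge tiles stay full-sized inside the image.
            y0, x0 = min(y, h - tile), min(x, w - tile)
            tiles.append((y0, x0, img[y0:y0 + tile, x0:x0 + tile]))
    return tiles

# A 1080p frame split into 300 px tiles with 60 px of overlap.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
tiles = tile_image(frame, tile=300, overlap=60)
```

Each tile here matches the SSD's 300x300 input, so no further resizing (and hence no further shrinking of the robot) happens at inference time.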
I trained an image classifier with TensorFlow using a bunch of JPG images.
Let's say I have 3 classifiers, ClassifierA, ClassifierB, ClassifierC.
When testing the classifiers, I have no issues at all in 90% of the images I use as a test. But in some cases, I have misclassifications due to the image quality.
For example, the image below is the same, saved as BMP and JPG. You'll see little differences due to the format quality.
When I test the BMP version using tf.image.decode_bmp I get misclassifications, let's say ClassifierA 70%
When I test the JPG version using tf.image.decode_jpeg I get the right one, ClassifierB 90%
When I test the JPG version using tf.image.decode_jpeg with dct_method="INTEGER_ACCURATE" I get the right one with a much better score, ClassifierB 99%
What could be the issue here? Such difference between BMP and JPG, and how can I solve this if there's a solution?
update1: I retrained my classifier, applying different effects and randomly varying the quality at which the dataset images are saved.
Now I get the right output, but the confidences still vary a lot, for example 44% with BMP and over 90% with JPG.
This is a fabulous question, and even more fabulous of an observation. I'm going to use this in my own work in the future!
I expect you have just identified a rather fascinating issue with the dataset. It appears that your model is overfitting to features specific to JPG compression. The solution is to increase data augmentation. In particular, convert your training samples between various formats randomly.
This issue also makes me think that sharpening and blurring operations would make good data augmentation techniques. It's common to alter color, contrast, rotation, scale, orientation, and translation to augment the training dataset, but I don't often see blur and sharpness used. I suspect these two augmentation techniques would go a long way toward resolving your issue by themselves.
In case the OP (or others reading this) are not terribly familiar with what "data augmentation" is, I'll define it. It is common to warp your training images in various ways to generate endlessly unique images from your (otherwise finite) dataset. For example, randomly flipping the image left/right is quite simple, common, and effectively doubles your dataset. Changing contrast and brightness settings further alter your images. Adding these and other data augmentation transformations to your pipeline creates a much richer dataset and trains a network that is more robust to these common variations in images.
It's important that the data augmentation techniques you use produce realistic variations. For example, rotating an image is quite a realistic augmentation technique. If your training image is a cat standing horizontally, it's realistically possible that a future sample might be a cat at a 25-degree angle.
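As a rough illustration of such a pipeline (pure numpy; everything here is a simplified stand-in — in practice you would use tf.image or an augmentation library, and real JPEG re-encoding rather than the crude box blur below):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Apply a few simple, realistic augmentations to a 2-D float image in [0, 1]."""
    if rng.random() < 0.5:                               # random horizontal flip
        img = img[:, ::-1]
    img = img * rng.uniform(0.8, 1.2)                    # brightness jitter
    mean = img.mean()
    img = (img - mean) * rng.uniform(0.8, 1.2) + mean    # contrast jitter
    if rng.random() < 0.5:
        # Mild 3x3 box blur as a crude stand-in for JPEG-like smoothing.
        k = np.ones((3, 3)) / 9.0
        pad = np.pad(img, 1, mode="edge")
        img = sum(pad[i:i + img.shape[0], j:j + img.shape[1]] * k[i, j]
                  for i in range(3) for j in range(3))
    return np.clip(img, 0.0, 1.0)

aug = augment(rng.random((32, 32)))
```

Applying a random subset of such transforms to every training sample is what makes the model stop keying on format-specific artifacts.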
My input image is huge and only a small, irregularly shaped region should be used as input. I cannot crop it, because the shape of the ROI means the cropped region would be almost the same size as the full input. I understand the receptive-field and background-context considerations, which affect accuracy. Still, I am curious whether any framework lets me specify a binary mask as the ROI.
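The kind of masking described here can be sketched in numpy independently of any framework, by zeroing out everything outside the ROI before the image reaches the network (toy shapes; the rectangular mask is a hypothetical stand-in for the irregular one):

```python
import numpy as np

# Hypothetical setup: zero out everything outside the ROI before
# feeding the image to the network, rather than cropping.
img = np.random.default_rng(0).random((8, 8, 3))
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 3:7] = True               # the irregular ROI would go here

masked = img * mask[:, :, None]     # broadcast the mask over channels
```

The network still sees the full spatial extent (so receptive fields are unchanged), but all background context is suppressed to zero.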
I am training my own image set using Tensorflow for Poets as an example,
https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/
What size do the images need to be? I have read that the script automatically resizes the images for you, but what size does it resize them to? Can you pre-resize your images to that size to save disk space (10,000 1 MB images)?
How does it crop the images: does it chop off part of the image, add white/black bars, or change the aspect ratio?
Also, I think Inception v3 uses 299x299 images. What if your image recognition requires more detailed accuracy: is it possible to increase the network's input size, say to 598x598?
I don't know what re-sizing option this implementation uses; if you haven't found that in the documentation, then I expect that we'd need to read the code.
The images can be of any size. Yes, you can shrink your images to save disk space. However, note that you lose image detail; there won't be a way to recover the lost information.
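A tiny numpy sketch of why the lost detail is unrecoverable: a checkerboard's fine structure averages away under a 2x downsample, and upsampling afterwards cannot bring it back (block-average resizing here stands in for whatever interpolation the script uses):

```python
import numpy as np

# A 4x4 "image" with fine detail: an alternating 0/1 checkerboard.
img = np.indices((4, 4)).sum(axis=0) % 2

# 2x downsample by averaging each 2x2 block, as a resize would do.
small = img.reshape(2, 2, 2, 2).mean(axis=(1, 3))

# Naive upsample back to 4x4: the checkerboard detail is gone,
# every pixel is now the flat block average.
restored = small.repeat(2, axis=0).repeat(2, axis=1)
```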
The good news is that you shouldn't need it; CNN models are built for an image size that contains enough detail to handle the problem at hand. Greater image detail generally does not translate to greater accuracy in classification. Doubling the image resolution is usually a waste of storage.
To use a larger input size, you'd have to edit the code to accept the larger "native" image size. Then you'd have to alter the model topology to account for the greater input: either a larger step-down factor somewhere (which could defeat the extra resolution) or another layer in the model to capture the larger size.
To get a more accurate model, you generally need a stronger network topology. 2x resolution does not give us much more information to differentiate a horse from a school bus.