Object Detection - Image Cropping

I am working on my first object detection project. My goal is to detect US postal mailing addresses on envelopes and letters.
My training data is a collection of images that vary in size and amount of content. Some are 1700 x 2800 px letters with lots of content (logos, sidebars, the postal address), while others are 100 x 300 px images that contain only the postal address. From my research, 300 px and smaller seems to be a good size for object detection. I tried resizing the large images to make them smaller, but the content became very distorted.
My question is: if my goal is to detect just the postal address, can I crop the 1700 x 2800 images down to the postal address to make them smaller and retain clarity? If this is not recommended, is there a recommended approach to downsizing large images without losing too much detail?
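Cropping is generally fine here, and a common trick is to crop with some padding so the detector still sees a little context, then downscale with an aspect-ratio-preserving filter. Below is a minimal sketch using Pillow; the (left, top, right, bottom) box format, the 15% padding, and the 300 px target are assumptions, not requirements:

    from PIL import Image

    def crop_to_address(image_path, box, pad=0.15):
        """Crop a letter scan down to the postal-address region.
        `box` is a hypothetical (left, top, right, bottom) annotation in
        pixels; `pad` keeps some context around the address."""
        img = Image.open(image_path)
        left, top, right, bottom = box
        w, h = right - left, bottom - top
        # Expand the box by `pad` on each side, clamped to the image edges.
        left = max(0, int(left - pad * w))
        top = max(0, int(top - pad * h))
        right = min(img.width, int(right + pad * w))
        bottom = min(img.height, int(bottom + pad * h))
        crop = img.crop((left, top, right, bottom))
        # thumbnail() scales both dimensions by the same factor, so nothing
        # gets distorted; it only shrinks, and only if still above 300 px.
        crop.thumbnail((300, 300), Image.LANCZOS)
        return crop

The distortion you saw most likely came from resizing without preserving the aspect ratio; thumbnail() scales width and height by the same factor, so content keeps its shape.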

Related

Optimise vector objects in PDF

I am working on optimizing a PDF so it renders on low-memory devices. One page with some vector graphics plus other objects (text, shading, forms) crashes on a low-memory device.
When I remove all the vector objects from the page, it renders fine. With them present it crashes, even though the page has only 113 vector objects (which I don't think is very many for a page) and the largest vector object has 9,800 points (which again I don't think is a huge number).
Suppose I convert these vector images to raster images when some condition is met. What should that condition be? Besides the number of vector objects and the number of points per vector, what else should I take into consideration when optimizing vector objects in a PDF?
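For what it's worth, one way to frame the condition is to compare an estimate of the vector rendering cost against the cost of the bitmap that would replace it, and flatten only when the vectors are the bigger burden. The sketch below uses made-up per-path and per-point constants; real memory use depends on the renderer, shading, and transparency, so you would want to calibrate these on the target device:

    def should_rasterize(num_paths, total_points, page_w_pt, page_h_pt, dpi=150):
        """Heuristic: flatten a page's vectors to a bitmap when the estimated
        vector rendering cost exceeds the cost of the bitmap itself.
        The per-path/per-point constants are illustrative guesses."""
        BYTES_PER_PATH = 200    # assumed per-path renderer overhead
        BYTES_PER_POINT = 32    # assumed per-coordinate overhead
        vector_bytes = num_paths * BYTES_PER_PATH + total_points * BYTES_PER_POINT

        # Bitmap cost at the target DPI (PDF user space is 72 points per inch).
        px_w = page_w_pt / 72 * dpi
        px_h = page_h_pt / 72 * dpi
        raster_bytes = px_w * px_h * 4    # RGBA

        return vector_bytes > raster_bytes

Plugging in your page's actual numbers would at least tell you whether flattening can plausibly help, though profiling on the device itself is the only reliable check.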

Measuring sizes of before/after images in Photoshop

Perhaps my mind isn't mathematically competent enough to do this, but here goes:
I am using Photoshop. I have 2 images taken from different heights. Both images contain the same object (so the physical size of this object stays constant), and I am trying to resize both images so that this object is the same pixel size in each. That way I can properly measure the differences between other objects in the images at the proper ratio.
My end goal is to measure the difference in scars healing (before and after) using a same-size object in both images as a baseline.
To measure the difference in the photos, I have been counting pixels using the histogram feature.
Even though I changed the pixel width and height to roughly the same size, the 2 images have drastically different numbers of pixels. So comparing the red or white from the before image to the after image won't make sense until I can get these to match.
Can anyone point me in the right direction here? How can I compare apples to apples?
In case anyone was wondering what I did: I went a different route. Rather than resizing the images, I just calculated the increase manually for each image.
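For anyone else landing here, the "calculate manually" route boils down to deriving a real-world scale from the baseline object in each photo and converting measurements before comparing. A small sketch with made-up numbers:

    # All values below are example numbers, not real measurements.
    ref_real_mm = 24.0      # known physical size of the baseline object

    ref_px_before = 310     # baseline object's length in the "before" photo
    ref_px_after = 188      # the same object's length in the "after" photo

    scale_before = ref_real_mm / ref_px_before    # mm per pixel (before)
    scale_after = ref_real_mm / ref_px_after      # mm per pixel (after)

    # Linear measurements convert with the scale directly.
    scar_px_before, scar_px_after = 540, 295
    print(f"scar length: {scar_px_before * scale_before:.1f} mm -> "
          f"{scar_px_after * scale_after:.1f} mm")

    # Histogram pixel counts are areas, so the scale enters squared.
    red_px_before, red_px_after = 12400, 6900
    red_mm2_before = red_px_before * scale_before ** 2
    red_mm2_after = red_px_after * scale_after ** 2
    print(f"red area: {red_mm2_before:.1f} mm^2 -> {red_mm2_after:.1f} mm^2")

The squared scale on the histogram counts is the step that makes the before/after red or white comparisons meaningful even when the two photos were taken, or resized, differently.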

Creating an ML-Model that detects card-values

This is a more generic question about training an ML model to detect cards.
The cards are from a kids' game: 4 different colors, with numbers and symbols. I don't need to detect the color, just the value (i.e., the symbol) of each card.
I took pictures of every card with my iPhone and used RectLabel to draw rectangles around the symbols in the upper-left corner (the cards have an upside-down symbol in the lower-right corner too; I didn't mark those, as they'll be hidden during detection).
I cropped the images so only the card is visible, no surroundings.
Then I uploaded my images to app.roboflow.ai and let them do their magic (using Auto-Orient, Resize to 416x416, Grayscale, Auto-Adjust Contrast, Rotation, Shear, Blur, and Noise).
That gave me another set of images, which I used to train my model with Apple's CreateML.
However, when I use that model in my app (I'm using Apple's Breakfast Finder demo), the card values aren't detected. Well, sometimes it works, but only at a certain distance from the phone, and the labels come out either upside down or sideways.
My guess is that this happens because my images weren't taken the way they should be?
Any hints on how I'd have to set this whole thing up so my model gets trained well?
My bet would be on this being the problem:
I cropped the images so only the card is visible, no surroundings
You want your training images to be as similar as possible to the images your model will see in the wild. If it's trained only on images of cards with no surroundings, and you then show it images of cards with things around them, it won't know what to do.
This UNO scoring example is extremely similar to your problem and might provide some ideas and guidance.
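One way to close that gap without reshooting everything is to composite the cropped cards onto varied backgrounds, so the training distribution looks more like the live camera feed. A rough sketch with Pillow; the paths, the 416x416 output, and the scale/rotation ranges are all placeholders:

    import random
    from PIL import Image

    def composite_card(card_path, background_path, out_size=(416, 416)):
        """Paste a cropped card onto a cluttered background at a random
        scale, rotation, and position. Returns the image and the card's
        new bounding box (the symbol's box must be transformed the same
        way and written back to the RectLabel annotations; omitted here)."""
        bg = Image.open(background_path).convert("RGB").resize(out_size)
        card = Image.open(card_path).convert("RGBA")

        # Scale the card relative to the output so it mimics the range of
        # distances the phone will see; rotate so labels aren't tied to
        # one orientation.
        target_w = int(out_size[0] * random.uniform(0.25, 0.5))
        card = card.resize((target_w, int(card.height * target_w / card.width)))
        card = card.rotate(random.uniform(-180, 180), expand=True)

        # Random placement; the alpha mask keeps rotated corners transparent.
        x = random.randint(0, max(0, out_size[0] - card.width))
        y = random.randint(0, max(0, out_size[1] - card.height))
        bg.paste(card, (x, y), card)
        return bg, (x, y, x + card.width, y + card.height)

Randomized rotation at training time is also the usual answer to labels coming out sideways or upside down at inference.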

How to solve an object detection problem containing some static objects with no variation?

I am working on an object detection problem, using a region-based convolutional neural network to detect and segment objects in an image.
I have 8 objects to be segmented (Android app icons); 6 of them have varying backgrounds, while the remaining 2 always appear on a white background (static).
I have taken 200 variations of each object and trained Mask R-CNN on them. My model picks up the patterns very well for the 6 objects with variation, but it struggles on the remaining 2 even on the training set, even though they are an exact match.
Q: If I have n objects with variations and m objects with no variation (static), do I need to oversample the static ones? Should I use some other technique in this case?
In the image, the icons in black bounding boxes are prone to change (depending on the background and their position within it), but the icon in the green bounding box will never vary (it's always on a white background).
I have tried adding more images containing the static objects, but no luck. Can anyone suggest how to approach such a problem? I don't want to use a sliding-window (static image processing) approach here.
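Oversampling is one lever, but for an object that always sits on a white background, synthesizing variation usually helps more than plain duplication. A minimal sketch of both, assuming a hypothetical list of (image_path, class_name) training samples; the repeat factor and jitter ranges are guesses to tune:

    import random
    from PIL import Image, ImageEnhance

    def oversample(samples, static_classes, factor=4):
        """Repeat samples of the struggling (static) classes so each
        epoch sees them more often. `samples` is a hypothetical list
        of (image_path, class_name) pairs."""
        out = list(samples)
        for path, cls in samples:
            if cls in static_classes:
                out.extend([(path, cls)] * (factor - 1))
        random.shuffle(out)
        return out

    def jitter(image_path):
        """Add photometric variation to an otherwise static crop, so
        the network cannot overfit to one exact pixel pattern."""
        img = Image.open(image_path).convert("RGB")
        img = ImageEnhance.Brightness(img).enhance(random.uniform(0.7, 1.3))
        img = ImageEnhance.Contrast(img).enhance(random.uniform(0.8, 1.2))
        return img

If the model still fails on an exact-match training image after this, it is worth double-checking how those two objects' masks and labels are loaded, since a Mask R-CNN that cannot fit its own training set usually points at a data problem rather than a sampling one.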

Is there any way to optimize image upload?

I have tried the following methods:
normal image upload
encoding and decoding
Both methods take a long time to upload the image.
Any suggestions?
There are some simple ways:
Reduce the dimensions of the image, e.g. from 1000x1000 to 500x500.
Reduce the bpp (bits per pixel) of the image. For example, instead of an RGBA representation (32 bits per pixel), use RGB_565 (16 bits per pixel) or even a gray-level image (8 bits).
Reduce the quality of the image. Save it as .jpg; this will make the image much smaller. You can play with the JPEG quality parameter: 100% means very high quality and large files, 1% means extremely tiny files (~40 times smaller) but all the details will be lost.
Save the image in JPEG 2000 format. It reduces the size even further. Not every browser supports this format, so you might need to convert it back to regular JPEG.
Use image pyramids (sketched below). For example, you have a 1000x1000 image. Reduce its size by 2 to get 500x500, then reduce again and again; now you have 4 images: 1000x1000, 500x500, 250x250, and 125x125. Upload all 4, starting from the smallest and ending with the largest. The smallest image uploads very quickly and can be displayed right away (though at lower resolution); each time a better image arrives, you update the display and the resolution improves. The effect is that a basic image loads extremely fast and is refined over time. Transferring all 4 images takes only about 30% more time than the original alone, but the first one arrives about 64 times faster.
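A minimal Pillow sketch of the pyramid idea, combined with the JPEG-quality suggestion above; the level count and quality=80 are assumptions to tune:

    from PIL import Image

    def build_upload_pyramid(path, levels=4, quality=80):
        """Write progressively smaller JPEG copies of an image, smallest
        first, so the client can upload/display coarse-to-fine."""
        img = Image.open(path).convert("RGB")   # drop alpha: 24 bpp, not 32
        files = []
        for level in range(levels - 1, -1, -1):             # e.g. 3, 2, 1, 0
            w = max(1, img.width >> level)                  # halve per level
            h = max(1, img.height >> level)
            out = f"{path}.level{level}.jpg"
            img.resize((w, h), Image.LANCZOS).save(out, "JPEG", quality=quality)
            files.append(out)                               # smallest first
        return files

Uploading the returned files in order gives the fast-first-display behavior described above.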
These are the basic solutions. If they are not what you need, please refine the question.