I'm reading about which factors can be changed to enhance the accuracy of text recognition in an image BEFORE the image is taken. Basically I'm interested in lighting and various camera settings like exposure and shutter speed.
I've mostly found papers on improving camera focus, like this one, but since autofocus seems common nowadays I'm not sure special effort needs to be taken to adjust focus. However, I could not find much on lighting and camera settings for OCR (I'm interested in handheld cameras).
So, how do I choose the ideal lighting and camera sensor settings on a handheld camera such that the image is suitable for OCR?
I want my model to be shown in a cool video in which a camera moves through the model.
Is there any guide to smooth tracking shots, and which functions can be used?
I could not find good information in the AnyLogic Library :(
You can design any camera movement you like.
The simplest way is to create a custom agent with a camera at its origin and make the agent move around. Or add a camera to existing agents to "view over their shoulder".
Or check the example models with search term "camera".
Or make your camera positions dynamic from code at runtime.
You can do anything you want :)
I want to process the captured video. I will be capturing video of handwriting/drawing on paper, but I do not want to show the hand or pen on the paper while live streaming via p5.js.
Can this be done using machine learning?
Any idea how to implement this?
If I understand you right, you want to detect where in the image the hand is and draw an overlay at that position, right?
If so, you can use YOLO (more information) to detect where the hand is.
There are some pre-trained networks that you can download; maybe they are good enough, or maybe you have to train your own just for hands.
There are also some libraries for YOLO in JS, e.g. https://github.com/ModelDepot/tfjs-yolo-tiny
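You are working in p5.js, but the logic is language-agnostic, so here is a rough Python/OpenCV sketch of the "detect the hand and cover it" idea. The detect_hand function is a hypothetical stand-in for whatever detector (YOLO or otherwise) you end up using, and the webcam index and key handling are assumptions.

```python
import cv2

def detect_hand(frame):
    # Stub: replace with a real hand detector (e.g. a YOLO model).
    # It should return (x, y, w, h) for the hand, or None when no hand is visible.
    return None

cap = cv2.VideoCapture(0)              # webcam pointed at the paper
ok, clean = cap.read()                 # assume the first frame shows only the paper

while ok:
    ok, frame = cap.read()
    if not ok:
        break

    box = detect_hand(frame)
    out = frame.copy()
    if box is not None:
        x, y, w, h = box
        # Cover the detected region with the same region from the last
        # hand-free frame, so the hand/pen is hidden in the output.
        out[y:y+h, x:x+w] = clean[y:y+h, x:x+w]
    else:
        clean = frame                  # no hand visible: remember this clean frame

    cv2.imshow("stream", out)
    if cv2.waitKey(1) == 27:           # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```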
You may not need to go the full ML object segmentation route.
If the paper's position and illumination are constant (or at least known), you could try a simple heuristic: compare the pixels in the current frame with a short history and keep the most constant pixel values. There might be some lag as new parts of your drawing 'become constant', so you could try modifying the accumulation, for example committing a pixel immediately if it was white and is going black.
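Very roughly, the idea could look like the following Python/OpenCV sketch (you would port the logic to p5.js). The history length and ink threshold are guesses to be tuned, and a per-pixel median stands in for "use the most constant pixel values".

```python
import cv2
import numpy as np
from collections import deque

HISTORY = 15        # number of past frames to compare against (assumption)
INK_THRESHOLD = 80  # grey level below which a pixel counts as "ink" (assumption)

cap = cv2.VideoCapture(0)
history = deque(maxlen=HISTORY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    history.append(grey)

    # Median over the history: a pixel only changes in the output once it
    # has held its new value for roughly half the history window, so a
    # moving hand never becomes "constant" enough to show up.
    stable = np.median(np.stack(history), axis=0).astype(np.uint8)

    # "Was white and is going black": commit new ink immediately to
    # reduce the lag while the median catches up.
    fresh_ink = (grey < INK_THRESHOLD) & (stable > INK_THRESHOLD)
    stable[fresh_ink] = grey[fresh_ink]

    cv2.imshow("paper only", stable)
    if cv2.waitKey(1) == 27:
        break

cap.release()
cv2.destroyAllWindows()
```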
1. Cocoa uses a drawing system (user coordinate space) measured in "points", which are resolution independent...sounds great
2. While we need to be concerned with our app running at many resolutions, Cocoa is going to take care of that for us as stated in (1) above...sounds too good to be true!
3. It does scale our controls as the resolution changes...this is good.
4. BUT the screen size increases as my resolution increases...this is not good, I thought we had a drawing canvas that was independent of the resolution!
5. What if the controls shrink to silly small sizes as the resolution increases - should I be concerned about this?
To summarize: is there a "standard" resolution I should design for, so that all of Apple's automatic scaling will look fine?
[Confused while reading the Apple Programmer Guide on the topic of Drawing]
You do not need to be concerned about this. The user is only allowed to select resolutions which make sense given the physical size of the display, so the standard controls will always be "large enough". You just need to test your app on Retina and non-Retina displays (and ideally both at the same time, with an external 1x monitor plugged into a 2x machine; move your windows between the two screens and check that your images update accordingly).
I am developing a cocos2d game. I need to make it universal. The problem is that I want to use the minimum number of images to keep the universal binary as small as possible. Is there any possibility that I can somehow use the same images for iPhone, Retina and iPad? If yes, how can I do that? What image size and quality should they be? Any suggestions?
Thanks and Best regards
As for suggestions: provide HD resolution images for Retina devices and iPad, provide SD resolution images for non-Retina devices. Don't think about an all-in-one solution - there isn't one that's acceptable.
Don't upscale SD images to HD resolution on Retina devices or iPad. It won't look any better.
Don't downscale HD images for non-Retina devices. Your textures will still use 4x the memory on devices that have half or even a quarter of the memory available. In addition, downscaling images is bad for performance because it has to be done by the CPU on older devices. While you could downscale the image and save the downscaled texture, it adds a lot more complexity to your code and will increase the loading time.
There's not a single right answer to this question. One way to do it is to create images that are larger than you need and then scale them down. If the images don't have a lot of fine detail, that should work pretty well. As an example, this is the reason that you submit a 512x512 pixel image of your app icon along with your app to the App Store. Apple never displays the image at that size, but uses it to create a variety of smaller sizes for display in the App Store.
Another approach is to use vector images, which you can draw perfectly at any size that you need. Unfortunately, the only vector format that I can think of that's supported in iOS is PDF.
I've had theft problems outside my house, so I set up a simple webcam to capture a frame every second with Dorgem (http://dorgem.sf.net).
Dorgem does offer a feature to use motion detection to only capture frames where something is moving on the screen. The problem is that the motion detection algorithm it uses is extremely sensitive. It goes off because of variations in color between successive shots on my cheap webcam, and it also goes off because the trees in front of the house are blowing in the wind. Additionally, the front of my house is a high traffic area so there is also a large number of legitimately captured frames.
I average capturing 2800 out of every 3600 frames per hour using Dorgem's motion detection. This is too many for me to search through to find the interesting activity.
I wish I could re-position the camera to a more optimal position where it would only capture the areas I'm interested in, so that motion detection would be simpler, however this is not an option for me.
I think that because my camera has a fixed position and each picture frames the same area in front of my house, I should be able to scan the images and figure out which ones have motion in some interesting region of the image, throwing out all other frames.
For example: if there's a change at pixel 320,240 then someone has stepped in front of my house and I want to see that frame, but if there's a change at pixel 1,1 then it's just the trees blowing in the wind and the frame can be discarded.
I've looked at pdiff, a tool for finding diffs in sets of pictures, but it seems to be focused on diffing the entire picture rather than a specific region of it:
http://pdiff.sourceforge.net/
I've also looked at phash, a tool for calculating a hash based on human perception of an image, but it seems too complex:
http://www.phash.org/
I suppose I could implement it as a shell script, using ImageMagick's mogrify -crop to cherry-pick the regions of the image I'm interested in, then running pdiff to find the interesting ones, and using that to pick out the interesting frames.
Any thoughts? ideas? existing tools?
Cropping and then using pdiff seems like the best choice to me.
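As a very rough illustration of that crop-then-diff idea, here is a small Python sketch (a stand-in for the mogrify/pdiff pipeline rather than pdiff itself). The region coordinates, threshold and file pattern are placeholders to adapt to your own capture setup.

```python
import glob
import numpy as np
from PIL import Image

REGION = (200, 150, 450, 400)   # (left, top, right, bottom) of the interesting area
THRESHOLD = 12.0                # mean absolute pixel difference that counts as motion

def roi(path):
    """Load a frame, convert to greyscale, and crop it to the region of interest."""
    return np.asarray(Image.open(path).convert("L").crop(REGION), dtype=np.float32)

frames = sorted(glob.glob("frames/*.jpg"))
previous = roi(frames[0])

for path in frames[1:]:
    current = roi(path)
    score = np.abs(current - previous).mean()
    if score > THRESHOLD:
        print(f"{path}: motion in region (score {score:.1f})")
    previous = current
```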