Pillow image grab compute time independent of bounding box area - screenshot

I want to do screen grabs in boxes smaller than the entire screen. This is simple to do with:
from PIL import ImageGrab
ImageGrab.grab(bbox=(0, 0, 1600, 1200))
I expected that when I halved the box area, the time for the operation would decrease by a factor of two. To my surprise, the compute time, just over a quarter of a second on my new MacBook, was almost completely independent of the box area.
Can anyone explain this to me? Is there any simple method to extract small rectangular regions from the screen very quickly? The long capture time is very detrimental to my real-time program.

Simply halving the image size is not going to result in much speedup, because the overhead of the PIL call itself is what is taking up the time. You may also have I/O constraints, depending on what you're doing with the images (e.g. saving them to disk). If you want speedups, you could look at multiprocessing libraries, or drop the image-capture parts of your code down to C with something like Cython.
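A quick way to confirm this is to time the grab for a few box sizes. Below is a minimal timing sketch, assuming Pillow is installed and a display is available; on macOS, Pillow shells out to the system screencapture tool for the whole screen and crops afterwards, which would explain why the bbox barely matters.
import time
from PIL import ImageGrab

for box in [(0, 0, 1600, 1200), (0, 0, 800, 600), (0, 0, 200, 150)]:
    start = time.perf_counter()
    ImageGrab.grab(bbox=box)
    # Expect roughly the same elapsed time regardless of the box size.
    print(box, round(time.perf_counter() - start, 3))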

Related

Objective-C, Core-Plot Real Time Graph vs CPU

I've been working on implementing a real-time Core Plot graph in my application on OS X. To my dismay I noticed a fairly significant issue: once the line reaches the end of the x-axis and the graph starts scrolling to keep up with it, the CPU load hits 30-35% non-stop.
I figured that before I proceeded any further I had better go back and see if I had made some kind of mistake in my code for the CPU to spike like that. There wasn't anything out of the ordinary that I noticed, and I tried adjusting the frame rate and update frequency, but without luck. I decided to go back to the real-time example project they include, and it has the same effect on the CPU.
Is there anything I can do about this, or is that just the nature of real-time graphing on OS X?
Everything is fine for the first 50 frames (indicated by the line with arrows in the screenshot), but once it reaches the end, that's where things take a turn for the worse.
Side Note:
I noticed Swift does graphing in the Playground, and even though it's apparently not real time (and I'm using Obj-C), it looks really sharp. Is the Swift graphing feature only available within playgrounds, or is there a way to implement it in a project? I'm only mentioning this because I'm looking for something efficient soon.
That's the expected behavior with Core Plot. Once the graph starts to scroll, it has to redraw the plot, both axes, and all of the grid lines for each animation frame. You could reduce the drawing load by decreasing the number of grid lines and/or axis tick marks.
The playground graphs are a private part of the playground environment.

Viola-Jones - what does the 24x24 window mean?

I'm learning about the Viola-Jones detection framework and I read that it uses a 24x24 base detection window [1][2]. I'm having problems understanding this base detection window.
Let's say I have an image of size 1280x960 pixels and 3 people in it. When I try to perform face detection on this image, will the algorithm:
Shrink the picture to 24x24 pixels,
Tile the picture with 24x24-pixel sections and then test each section,
Position the 24x24 window in the top left of the image and then move it by 1px over the whole image area?
Any help is appreciated, even a link to another explanation.
Source: https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf
[1] - page 2, last paragraph before Integral images
[2] - page 4, Results
Does this video help? It is 40 minutes long.
Adam Harvey Explains Viola-Jones Face Detection
The algorithm, also known as a Haar cascade, is very popular for face detection.
About half way down that page is another video which shows a super slow-mo scan in progress so you can see how the window starts small (although much larger than 24x24 for the purpose of demonstration) and shifts around the image pixel by pixel, then does it again and again on successively larger square portions. At each stage, it's still only looking at those windows as though they were resampled to the 24x24 size.
You can also see how it quickly rejects many of those windows and spends most of its time in areas that seem face-like while it computes more and more complex comparisons that become more stringent. This is where the term "cascade" comes into play.
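If you just want to see the scanning window in action, OpenCV ships a trained frontal-face Haar cascade you can run in a few lines. This is a minimal sketch, assuming opencv-python is installed; the file name people.jpg is only a placeholder.
import cv2

img = cv2.imread("people.jpg")                  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Load the frontal-face cascade bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# scaleFactor controls how much the search window grows between passes over
# the image; each candidate window is classified as a small fixed-size patch.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                 minSize=(24, 24))
print(faces)  # one (x, y, w, h) rectangle per detected face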
I found this video that perfectly explains how the detection window moves and scales over a picture. I wanted to draw a flowchart of how this works, but I think the video illustrates it better:
https://vimeo.com/12774628
Credits to the original author of the video.

Is there any way to optimize image upload?

I have tried the following methods:
normal image upload,
encoding and decoding.
Both of these methods take a long time to upload the image.
Any suggestions?
There are some simple ways:
Reduce the size of the image, e.g. from 1000x1000 to 500x500.
Reduce the bpp (bits per pixel) of the image. For example, instead of an RGBA representation (32 bits per pixel) use RGB_565 (16 bits per pixel) or even a grey-level image (8 bits).
Reduce the quality of the image. Save it as .jpg; this will make the image much smaller. You can play with the JPEG quality parameter: 100% means very high quality and large files, 1% means extremely tiny files (~40 times smaller) but all the details will be lost.
Save the image in JPEG 2000 format. It reduces the size even further, but not every browser supports this format, so you might need to convert it back to regular JPEG.
Use image pyramids. For example, you have a 1000x1000 image. Reduce its size by 2 to get 500x500, then reduce it again and again. Now you have 4 images: 1000x1000, 500x500, 250x250 and 125x125. Upload all 4 of them, starting from the smallest and ending with the largest. The smallest image will be uploaded very quickly and you will be able to display it (though at a lower resolution); each time a better image arrives you update the display and the resolution improves. The effect is that a basic image appears extremely fast and sharpens over time. Transferring all 4 images takes only about 30% longer than the original alone, but the first one arrives 64 times faster than the original.
These are the basic solutions. If they are not what you needed, please refine the question.
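For points 1, 3 and 5, here is a minimal Pillow sketch of what the resizing, re-encoding and pyramid could look like; the file names and quality values are just placeholders.
from PIL import Image

img = Image.open("photo.png")              # placeholder input file

# Point 3: re-encode as JPEG with a lower quality setting to shrink the file.
img.convert("RGB").save("photo_q60.jpg", quality=60)

# Points 1 and 5: build a pyramid by repeated halving, then save smallest-first
# so the lowest-resolution level can be uploaded and shown before the rest arrive.
levels = []
current = img
while min(current.size) >= 125:
    levels.append(current)
    current = current.resize((current.width // 2, current.height // 2))

for i, level in enumerate(reversed(levels)):   # smallest level first
    level.convert("RGB").save(f"pyramid_{i}.jpg", quality=80)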

Finding significant images in a set of surveillance camera images

I've had theft problems outside my house so I setup a simple webcam to capture every second with Dorgem (http://dorgem.sf.net).
Dorgem does offer a feature to use motion detection to only capture frames where something is moving on the screen. The problem is that the motion detection algorithm it uses is extremely sensitive. It goes off because of variations in color between successive shots on my cheap webcam, and it also goes off because the trees in front of the house are blowing in the wind. Additionally, the front of my house is a high traffic area so there is also a large number of legitimately captured frames.
I average capturing 2800 of the 3600 frames every hour using Dorgem's motion detection. This is too much for me to search through to find the interesting activity.
I wish I could re-position the camera to a more optimal position where it would only capture the areas I'm interested in, so that motion detection would be simpler, however this is not an option for me.
I think that because my camera has a fixed position and each picture frames the same area in front of my house, I should be able to scan the images and figure out which ones have motion in some interesting region of that image, throwing out all the other frames.
For example: if there's a change in pixel 320,240 then someone has stepped in front of my house and I want to see that frame, but if there's a change in pixel 1,1 then it's just the trees blowing in the wind and the frame can be discarded.
I've looked at pdiff, a tool for finding diffs in sets of pictures, but it seems to be focused on diffing the entire picture rather than a specific region of it:
http://pdiff.sourceforge.net/
I've also looked at phash, a tool for calculating a hash based on human perception of an image, but it seems too complex:
http://www.phash.org/
I suppose I could implement this in a shell script using ImageMagick's mogrify -crop to cherry-pick the regions of the image I'm interested in, then run pdiff to find the interesting ones and use that to pick out the interesting frames.
Any thoughts? Ideas? Existing tools?
Cropping and then using pdiff seems like the best choice to me.
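As a rough sketch of the same crop-then-diff idea, using Pillow instead of ImageMagick and pdiff, something like this could flag frames where only a chosen region changed; the region and threshold are made-up values you would tune for your camera.
from PIL import Image, ImageChops

REGION = (300, 200, 500, 400)   # hypothetical region of interest (left, top, right, bottom)
THRESHOLD = 10                  # mean per-pixel difference that counts as "motion"

def roi_changed(path_a, path_b):
    a = Image.open(path_a).convert("L").crop(REGION)
    b = Image.open(path_b).convert("L").crop(REGION)
    diff = ImageChops.difference(a, b)
    # Average the per-pixel absolute differences over the cropped region.
    hist = diff.histogram()
    mean = sum(value * count for value, count in enumerate(hist)) / (diff.width * diff.height)
    return mean > THRESHOLD

# Usage: compare each frame with the previous one and keep only the pairs
# where roi_changed(...) returns True.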

OpenGL - animation stuttering when in full screen

I'm currently running into a problem regarding animation in OpenGL. I have between 200 and 10000 gears on the screen at a time, all rotating. When the window is not maximized, my CPU runs at about 10-20% consistently. No spikes, no stuttering in the animation; it runs perfectly smoothly regardless of the number of gears on screen. When I maximize the window, though, everything falls apart. My CPU maxes out, I begin getting weird spikes in CPU usage, the animation starts stuttering as a result, and it just looks really ugly, even when I have only 200 gears on screen.
My animation technique looks like this:
While animating:
    Calculate the current rotation angle based on a running timer
    Draw the image
    Call glFlush()
End While
If it helps, I'm using the Tao framework in VB.NET. I'm not performing any calculations other than the ones that compute the rotation angle mentioned above, plus a few glRotated and glScaled calls in the method that draws the image.
In addition, I was under the impression that in an orthographic two-dimensional drawing that scales with the window on resize, drawing would always take the same amount of time regardless of the window size. Is this a correct assumption?
Any help is greatly appreciated =)
Edit
Note that I've seen the animation run perfectly smoothly at full screen before. Every once in a while, OpenGL will decide it's happy and run perfectly at full screen, using between 10-20% of the CPU (the same as when not maximized). I haven't pinpointed what causes this, though, because it will run perfectly one time, and then, without changing anything, I will run it again and encounter the choppiness. I simply want to pinpoint what causes the animation to slow down and eliminate it.
I've profiled my program with dotTrace and it says that the swapBuffers method is using 55% of my processing time, even though I'm never explicitly calling it. Is the method called by something else that I can eliminate, or is this simply OpenGL's "dead time" method of limiting the animation to 60 fps?
I was under the impression that regardless of the window size in an orthographic 2-dimensional drawing that is scaling on resizing of the window, the drawing time would always take the same amount of time. Is this a correct assumption?
If only :)
More pixels require more memory bandwidth, shader units, etc. Look into fill rate. Going from a 1280x720 window to a 2560x1440 full screen, for example, quadruples the number of pixels that have to be filled every frame.