Storing bitmap data in an integer array for speed? - objective-c

I am using Cocoa/Objective-C and I am using NSBitmapImageRep's getPixel:atX:y: to test whether R is 0 or 255. That is the only piece of data I need (the bitmap is only black and white).
I am noticing that this one method is the biggest draw on CPU power in my application, accounting for something like 95% of the overhead. Would it be faster for me to preload the bitmap into a two-dimensional integer array
NSUInteger pixels[1280][1024];
and read the values like so:
if(pixels[x][y]!=0){
//....do stuff
}
?

One thing that might be helpful could be converting the data into something more "dense". Since you're only interested in a single bit per pixel location, it doesn't make sense to store more than that. Storing more data than necessary means you get less usage out of your cache, which can really slow things down if the image is big and/or the accesses very random.
For instance, you could use the platform's largest "native" integer and pack in the pixels so that each one uses a single bit. That will make the access a bit more involved, since you need to do single-bit testing, but it might be a win.
You would do something like this:
uint32_t image[HEIGHT * ((WIDTH + 31) / 32)];
Then initialize this array by using the slow getter method, once per pixel. Then you can read out the value of a pixel using something like image[y * ((WIDTH + 31) / 32) + (x / 32)] & (1 << (x & 31)).
I'm being vague ("might", "can" and so on) since it really depends on your access pattern, the size of the image, and other things. You should probably test it.
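To make the packing concrete, here is a rough sketch (the dimensions, names, and the assumption that the red sample is the first element returned by getPixel:atX:y: are mine, so adjust them to your setup):
#import <Cocoa/Cocoa.h>

#define WIDTH  1280
#define HEIGHT 1024
#define WORDS_PER_ROW ((WIDTH + 31) / 32)

static uint32_t packed[HEIGHT * WORDS_PER_ROW];

// One-time initialization using the slow accessor.
static void packBitmap(NSBitmapImageRep *rep) {
    for (NSInteger y = 0; y < HEIGHT; y++) {
        for (NSInteger x = 0; x < WIDTH; x++) {
            NSUInteger p[4];
            [rep getPixel:p atX:x y:y];
            if (p[0] != 0) {                          // R is either 0 or 255
                packed[y * WORDS_PER_ROW + (x / 32)] |= (1u << (x & 31));
            }
        }
    }
}

// Fast single-bit test afterwards.
static BOOL pixelSet(NSInteger x, NSInteger y) {
    return (packed[y * WORDS_PER_ROW + (x / 32)] & (1u << (x & 31))) != 0;
}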

I'm not familiar with Objective-C or the NSBitmapImageRep object, but a reasonable guess is that the getPixel routine employs clipping to avoid reading outside of memory, which could be a possible source of slowdown (among other things).
Have a look inside it and see what it does.
(update)
Having learnt that this is Apple code, you probably can't take a look inside it.
However, the documentation for NSBitmapImageRep seems to indicate that getPixel:atX:y: performs at least some type-conversion magic. You could test whether the result is clipped by accessing a pixel outside the image boundary and observing the result.
The bitmapData method seems to be something you'd be interested in: get the pointer to the data, then read the array yourself, avoiding type conversion or clipping.
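A minimal sketch of that approach, assuming an interleaved 8-bit-per-sample bitmap (check bitmapFormat, samplesPerPixel and bytesPerRow against your actual rep before relying on this indexing):
#import <Cocoa/Cocoa.h>

// Read the red sample of (x, y) straight from the backing store.
static BOOL redIsSet(NSBitmapImageRep *rep, NSInteger x, NSInteger y) {
    unsigned char *data = [rep bitmapData];
    NSInteger bytesPerRow = [rep bytesPerRow];
    NSInteger samplesPerPixel = [rep samplesPerPixel];
    return data[y * bytesPerRow + x * samplesPerPixel] != 0;
}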

Related

Does vkCmdCopyImageToBuffer work when source image uses VK_IMAGE_TILING_OPTIMAL?

I have read (after running into the limitation myself) that for copying data from the host to a VK_IMAGE_TILING_OPTIMAL VkImage, you're better off using a VkBuffer rather than a VkImage for the staging image to avoid restrictions on mipmap and layer counts. (Here and Here)
So, when it came to implementing a glReadPixels-esque piece of functionality to read the results of a render-to-texture back to the host, I thought that reading to a staging VkBuffer with vkCmdCopyImageToBuffer instead of using a staging VkImage would be a good idea.
However, I haven't been able to get it to work yet: I'm seeing most of the intended image, but with rectangular blocks of the image in incorrect locations and even some bits duplicated.
There is a good chance that I've messed up my synchronization or layout transitions somewhere and I'll continue to investigate that possibility.
However, I couldn't figure out from the spec whether using vkCmdCopyImageToBuffer with an image source using VK_IMAGE_TILING_OPTIMAL is actually supposed to 'un-tile' the image, or whether I should actually expect to receive a garbled implementation-defined image layout if I attempt such a thing.
So my question is: Does vkCmdCopyImageToBuffer with a VK_IMAGE_TILING_OPTIMAL source image fill the buffer with linearly tiled data or optimally (implementation defined) tiled data?
Section 18.4 describes the layout of the data in the source/destination buffers, relative to the image being copied from/to. This is outlined in the description of the VkBufferImageCopy struct. There is no language in this section which would permit different behavior from tiled images.
The specification even has pseudo code for how copies work (this is for non-block compressed images):
rowLength = region->bufferRowLength;
if (rowLength == 0)
    rowLength = region->imageExtent.width;

imageHeight = region->bufferImageHeight;
if (imageHeight == 0)
    imageHeight = region->imageExtent.height;

texelSize = <texel size taken from the src/dstImage>;

address of (x,y,z) = region->bufferOffset + (((z * imageHeight) + y) * rowLength + x) * texelSize;
where (x,y,z) ranges from (0,0,0) to region->imageExtent.{width,height,depth}.
The x,y,z part is the location of the pixel in question from the image. Since this location is not dependent on the tiling of the image (as evidenced by the lack of anything stating that it would be), buffer/image copies will work equally on both kinds of tiling.
Also, do note that this specification is shared between vkCmdCopyImageToBuffer and vkCmdCopyBufferToImage. As such, if a copy works one way, it by necessity must work the other.
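As an illustration of that addressing rule, a sketch of locating one texel in the mapped staging buffer after vkCmdCopyImageToBuffer might look like this (it assumes a 4-byte texel such as VK_FORMAT_R8G8B8A8_UNORM; everything else comes straight from the quoted pseudo code):
#include <vulkan/vulkan.h>
#include <string.h>

// Locate texel (x, y, z) in the mapped buffer, per the VkBufferImageCopy rule above.
static void readTexel(const void *mapped, const VkBufferImageCopy *region,
                      uint32_t x, uint32_t y, uint32_t z, uint8_t out[4])
{
    const uint32_t texelSize = 4; // assumed RGBA8
    uint32_t rowLength   = region->bufferRowLength   ? region->bufferRowLength
                                                      : region->imageExtent.width;
    uint32_t imageHeight = region->bufferImageHeight ? region->bufferImageHeight
                                                      : region->imageExtent.height;
    VkDeviceSize offset = region->bufferOffset +
        ((VkDeviceSize)((z * imageHeight) + y) * rowLength + x) * texelSize;
    memcpy(out, (const uint8_t *)mapped + offset, texelSize);
}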

Fuzzy screenshot comparison with Selenium

I'm using Selenium to automate webpage functional testing. It's important for us to do a pixel-by-pixel comparison when we roll out new code, so we're using Selenium to take screenshots and comparing the base64 encoded strings to see if anything has changed.
We're finding that in practice, it's hard to get complete pixel consistency, especially with images. I would like minor blurriness / rendering artifacts to count as a "pass" instead of a "fail", so I'm wondering if there's a way of doing a fuzzy comparison to make our tests a bit less fragile.
I was thinking of maybe looking at the Levenshtein distance between the base64 strings as a starting point, but I don't really know if that's a good approach, or what the tolerances should be that distinguish "something moved on the page" from "rendering artifact". Any ideas / approaches?
So I ended up going with the ImageMagick command-line tool (because why re-invent image comparison). The "Peak Absolute Error" metric of the "compare" tool tells you how much you have to fuzz pixels before two images are identical. This seems to work well... for an image with slight graphical distortions, there might be a lot of pixels that don't match, but slight fuzzing is enough to make them match; but for two images that are actually different, even though most pixels might match, the ones that don't tend to be very different. Right now I'm checking for a PAE of less than 15% to see if the images should be counted as identical. Command line I'm using is:
compare -metric PAE original.png new.png comparison.png
Documentation on ImageMagick's compare tool is here: http://www.imagemagick.org/script/compare.php
I've been using perceptualdiff, which uses a model of the human visual system to try to avoid reporting unnoticeable changes (the authors used it for renderer regression testing). Usage is quite simple:
perceptualdiff -output diff.ppm baseline.png test.png
(where diff.ppm is a PPM format image highlighting the areas of difference)
The needle regression testing framework has support for using pdiff to compare screenshots:
http://needle.readthedocs.org/en/latest/#engines
Use an image format that does not create compression artifacts (like BMP or PNG); then you can do a per-pixel comparison.
I think you can check each pixel with a common Euclidean distance.
To improve performance a little, do not calculate the square root; instead, compare the squares of the distances:
// Maximum color distance allowed to define pixel consistency.
const float maxDistanceAllowed = 5.0f;
// Square of the distance, used in calculations.
float maxD = maxDistanceAllowed * maxDistanceAllowed;

public bool isPixelConsistent(Color pixel1, Color pixel2)
{
    // Squared Euclidean distance in 3 dimensions (R, G, B).
    float distanceSquared = (pixel1.R - pixel2.R) * (pixel1.R - pixel2.R)
                          + (pixel1.G - pixel2.G) * (pixel1.G - pixel2.G)
                          + (pixel1.B - pixel2.B) * (pixel1.B - pixel2.B);
    // If the actual distance is within the maximum allowed, the pixels are
    // consistent and the method returns true.
    return distanceSquared <= maxD;
}
I didn't test the C# code, but it should give you the idea. Give it a few tries and adjust maxDistanceAllowed to your needs.
If anyone else is looking for something similar, there is Depicted-dpxdt. It is designed to be used as part of a CI/CD process.
It combines perceptual diffing with a server, a command-line tool, and a wrapper for PhantomJS.
Its demonstrated functionality includes crawling an entire site and comparing the pages for differences.

Concurrent access to a single FFTSetup data structure in GCD

Is it okay to create a single FFTSetup data structure and use it to perform multiple FFT computations concurrently? Would something like the following work?
FFTSetup fftSetup = vDSP_create_fftsetup(
    16,        // vDSP_Length __vDSP_log2n,
    kFFTRadix2 // FFTRadix __vDSP_radix
);
NSAssert(fftSetup != NULL, @"vDSP_create_fftsetup() failed to allocate storage");

for (int i = 0; i < 100; i++)
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0),
    ^{
        vDSP_fft_zrip(
            fftSetup,             // FFTSetup __vDSP_setup,
            &(splitComplex[i]),   // DSPSplitComplex *__vDSP_ioData,
            1,                    // vDSP_Stride __vDSP_stride,
            16,                   // vDSP_Length __vDSP_log2n,
            kFFTDirection_Forward // FFTDirection __vDSP_direction
        );
    });
}
I suppose that the answer depends on the following considerations:
1) Does vDSP_fft_zrip() only access the data within fftSetup (or the data pointed to by it) in a "read-only" fashion? Or are there perhaps some temporary buffers (scratch space) within fftSetup that are written to by vDSP_fft_zrip() in performing its FFT computations?
2) If data like that in fftSetup is being accessed in a "read-only" fashion, is it okay for multiple processes/threads/tasks/blocks to access it simultaneously? (I am thinking of the case where it is possible for more than one process to open the same file for reading, though not necessarily for writing or appending. Is this analogy appropriate?)
On a related note, just how much memory is being taken up by the FFTSetup data structure? Is there any way to find out? (It is an opaque data type.)
You may create one FFT setup and use it repeatedly and concurrently. This is the intended use. (I am the author of the current implementations of vDSP_fft_zrip and other FFT implementations in vDSP.)
In Using Fourier Transforms we're told that the FFTSetup contains the FFT weight array which is a series of complex exponentials. The vDSP_create_fftsetup documentation says
Once prepared, the setup structure can be used repeatedly by FFT
functions (which read the data in the structure and do not alter it)
for any (power of two) length up to that specified when you created
the structure.
so
conceptually, vDSP_fft_zrip should not need to modify the weight array, so it would appear to be one of the FFT functions that do not alter the FFTSetup (I haven't seen any that do, apart from create/destroy); however, there are no guarantees about what the actual implementation does - it could do anything.
if vDSP_fft_zrip truly accesses its FFTSetup in a read-only fashion, then it's fine to do that from multiple threads.
As for memory usage, the FFT weight array is e^{i*k*2*M_PI/N} for k = [0..N-1], which is N complex float values, so that would be 2*N*sizeof(float) bytes.
But those complex exponentials are very symmetric so who knows, under the hood the implementation could require less memory. Or more!
In your case, N = 2^16, so it wouldn't be strange to see up to 256k being used.
Where does that leave you? I think it seems reasonable that the FFTSetup be accessible from multiple threads, but it appears to be undocumented. You could be lucky. Or unlucky and unpleasantly surprised now or in a future version of the framework.
So... do you feel lucky?
I wouldn't attempt any explicit concurrency with the vDSP functions, or any other function in the Accelerate framework (of which vDSP is a part) for that matter. Why? Because Accelerate is already designed to take advantage of multiple cores, as well as specific nuances of a given processor implementation, on your behalf - see http://developer.apple.com/library/mac/#DOCUMENTATION/Darwin/Reference/ManPages/man7/vecLib.7.html. You may end up essentially re-parallelizing already parallel computations that are internal to the implementation (if not now, then possibly in a later version). The best approach to the Accelerate framework is generally to assume that it's more clever than you are and just use it in the simplest way possible, then do your performance measurements. If those measurements reflect a level of performance that is somehow insufficient for your needs, then try your own optimizations (and/or file a bug report against the Accelerate framework at http://bugreport.apple.com since the authors of that framework are always interested in knowing where or if their efforts somehow fell short of developer requirements).

Do Arrays take up space even without values in them in .net?

I have a program in VB.net that uses a 3D array:
Private gridList(10, 900, 900) As GridElement
Now, I just used a memory profiler on it (because my application is having some major leak issues or something) and apparently this array (containing, at the moment of testing, only 0-30 elements at a time) is using 94% of the memory currently in use by my application. Even when it is empty it takes up huge amounts of memory.
My only assumption is that even empty arrays take up space! This deals a major blow to my plans!
My Question:
Is there any alternative to this that still gives me the same ability to map indices to elements,
e.g. I've been using it like this:
Dim cGE as GridElement = gridList(3, 5, 7)
but doesn't hog so much memory for elements that aren't actually being used?
Thanks!
Do Arrays take up space even without values in them in .net?
No. But your array has values in it, and hence takes up space: declaring Private gridList(10, 900, 900) allocates 11 × 901 × 901 slots up front (VB.NET bounds are inclusive), and each slot holds a reference even when it is only Nothing, which already amounts to tens of megabytes.
To avoid keeping a lot of elements in memory when you only access a few of all the possible elements, you need to use a so-called sparse array. In .NET, this is easiest implemented via a Dictionary, where the key in your case would be a three-element structure*, and the value would be a GridElement.
* If you’re using an up-to-date version of .NET, then you can model this via a Tuple(Of Integer, Integer, Integer)

How can I compare two NSImages for differences?

I'm attempting to gauge the percentage difference between two images.
Having done a lot of reading, I seem to have a number of options, but I'm not sure which method is best for:
Ease of coding
Performance.
The methods I've seen are:
Non-language-specific (academic): "Image comparison - fast algorithm"
Mac-specific: direct pixel access (http://www.markj.net/iphone-uiimage-pixel-color/)
Does anyone have any advice about what solutions make most sense for the above two cases and have code samples to show how to apply them?
I've had success calculating the difference between two images using the histogram technique mentioned here. redmoskito's answer in the SO question you linked to was actually my inspiration!
The following is an overview of the algorithm I used:
Convert the images to grayscale—compare one channel instead of three.
Divide each image into an n * n grid of "subimages". Then, for each subimage pair:
Calculate their colour composition histograms.
Calculate the absolute difference between the two histograms.
The maximum difference found between two subimages is a measure of the two images' difference. Other metrics could also be used (e.g. the average difference between subimages).
As tskuzzy noted in his answer, if your ultimate goal is a binary "yes, these two images are (roughly) the same" or "no, they're not", you need some meaningful threshold value. You could produce such a value by passing images into the algorithm and tweaking the threshold based on its output and how similar you think the images are. A form of machine learning, I suppose.
I recently wrote a blog post on this very topic, albeit as part of a larger goal. I also created a simple iPhone app to demonstrate the algorithm. You can find the source on GitHub; perhaps it will help?
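For what it's worth, a minimal sketch of the histogram-difference step described above (in C, assuming normalized 256-bin grayscale histograms that you have already computed per subimage):
#include <math.h>

// Absolute difference between two normalized 256-bin grayscale histograms.
static double histogramDifference(const double a[256], const double b[256]) {
    double diff = 0.0;
    for (int i = 0; i < 256; i++)
        diff += fabs(a[i] - b[i]);
    return diff;   // 0 for identical distributions, up to 2 when fully disjoint
}

// The image-level score is then the maximum (or average) of histogramDifference()
// over all n * n subimage pairs, compared against your chosen threshold.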
It is really difficult to suggest something when you don't tell us more about the images or the variations. Are they shapes? Are they the different objects and you want to know what class of objects? Are they the same object and you want to distinguish the object instance? Are they faces? Are they fingerprints? Are the objects in the same pose? Under the same illumination?
When you say performance, what exactly do you mean? How large are the images? All in all, it really depends. Given what you've said, if it is only about ease of coding and performance, I would suggest just taking the absolute value of the difference of pixels. That is super easy to code and about as fast as it gets, but it is really unlikely to work for anything other than the most synthetic examples.
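For reference, a minimal sketch of that per-pixel absolute difference (assuming two equally sized 8-bit grayscale buffers):
#include <stdint.h>
#include <stdlib.h>

// Sum of absolute per-pixel differences; dividing by (count * 255) gives a rough 0..1 score.
static double absoluteDifference(const uint8_t *a, const uint8_t *b, size_t count) {
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += abs((int)a[i] - (int)b[i]);
    return sum / ((double)count * 255.0);
}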
That being said I would like to point you to: DHOG, GLOH, SURF and SIFT.
You can use the fairly basic subtraction technique that the lads above suggested. @carlosdc has hit the nail on the head with regard to the type of image this basic technique can be used for. I have attached an example so you can see the results for yourself.
The first image shows a frame from a simulation at some time t. A second image, taken some (simulation) time later at t + dt, was subtracted from the first. The subtracted image (in black and white for clarity) then shows how the simulation has changed in that time. This was done as described above and is very powerful and easy to code.
Hope this aids you in some way
This is some old nasty FORTRAN, but it should give you the basic approach. It is not that difficult at all. Because I am doing it on a two-colour palette, you would do this operation for R, G and B. That is, compute the intensities or values in each cell/pixel and store them in an array. Do the same for the other image, and subtract one array from the other; this will leave you with a colourful subtraction image. My advice would be to do as the lads suggest above: compute the magnitude of the sum of the R, G and B components so you just get one value. Write that to an array, do the same for the other image, then subtract. Then create a new range for either R, G or B and map the resulting subtracted array to this, which will give a much clearer picture as a result.
* =============================================================
SUBROUTINE SUBTRACT(FNAME1,FNAME2,IOS)
* This routine subtracts two images read from files
* =============================================================
* Common :
INCLUDE 'CONST.CMN'
INCLUDE 'IO.CMN'
INCLUDE 'SYNCH.CMN'
INCLUDE 'PGP.CMN'
* Input :
CHARACTER fname1*(sznam),fname2*(sznam)
* Output :
integer IOS
* Variables:
logical glue
character fullname*(szlin)
character dir*(szlin),ftype*(3)
integer i,j,nxy1,nxy2
real si1(2*maxc,2*maxc),si2(2*maxc,2*maxc)
* =================================================================
IOS = 1
nomap=.true.
ftype='map'
dir='./pictures'
! reading first image
if(.not.glue(dir,fname2,ftype,fullname))then
write(*,31) fullname
return
endif
OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
read(unit2,err=11)nxy2
read(unit2,err=11)rad,dxy
do i=1,nxy2
do j=1,nxy2
read(unit2,err=11)si2(i,j)
enddo
enddo
CLOSE(unit2)
! reading second image
if(.not.glue(dir,fname1,ftype,fullname))then
write(*,31) fullname
return
endif
OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
read(unit2,err=11)nxy1
read(unit2,err=11)rad,dxy
do i=1,nxy1
do j=1,nxy1
read(unit2,err=11)si1(i,j)
enddo
enddo
CLOSE(unit2)
! subtracting images
if(nxy1.eq.nxy2)then
nxy=nxy1
do i=1,nxy1
do j=1,nxy1
si(i,j)=si2(i,j)-si1(i,j)
enddo
enddo
else
print *,'SUBTRACT: Different sizes of image arrays'
IOS=0
return
endif
* normal finishing
IOS=0
nomap=.false.
return
* exceptional finishing
10 write (*,30) fullname
return
11 write (*,32) fullname
return
30 format('Cannot open file ',72A)
31 format('Improper filename ',72A)
32 format('Error reading from file ',72A)
end
! =============================================================
Hope this is of some use. All the best.
Out of the methods described in your first link, the histogram comparison method is by far the simplest to code and the fastest. However, key point matching will provide far more accurate results, since you want a precise number describing the difference between two images.
To implement the histogram method, I would do the following:
Compute the red, green, and blue histograms of each image
Add up the differences between each bucket
If the difference is above a certain threshold, then the percentage is 0%
Otherwise the colors found in the images are similar. So then do a pixel by pixel comparison and convert the difference into a percentage.
I don't know of any precise algorithms for finding the key points of an image. However, once you find them for each image, you can do a pixel-by-pixel comparison for each of the key points.