How to perform non-maximum suppression of keypoints conditioned on another channel - numpy

The standard non-maximum suppression, or peak finding, in a 2D image involves two steps:
Max-pool the image with a given kernel size to get maxpooled_image.
Then select the pixels where pixel_value == maxpooled_image value.
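The two steps above can be sketched in plain numpy as follows. This is only a minimal sketch: the function name nms_peaks, the odd kernel size k, and the min_value floor (used to ignore flat background) are assumptions for illustration, not part of the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def nms_peaks(img, k=3, min_value=0.0):
    """Boolean mask of pixels that equal the maximum of their k x k window.

    k is assumed to be odd; min_value is a hypothetical floor that keeps
    flat background regions from being reported as peaks.
    """
    img = np.asarray(img, dtype=float)
    pad = k // 2
    # Pad with -inf so border pixels are only compared with real pixels.
    padded = np.pad(img, pad, mode="constant", constant_values=-np.inf)
    # Step 1: max pool (stride 1) over every k x k window.
    maxpooled_image = sliding_window_view(padded, (k, k)).max(axis=(-2, -1))
    # Step 2: keep pixels that equal the pooled value.
    return (img == maxpooled_image) & (img > min_value)
```

np.argwhere(nms_peaks(img, k=5)) would then give the (row, col) coordinates of the surviving peaks.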
However, let us say I have an additional channel, value2. Consider two strong pixels that fall in the same NMS window. In the standard case, only one of these pixels will be chosen. I'd like to add a condition: if the value2 values of pixel1 and pixel2 differ by more than some threshold (dth), select both pixels; if the difference is small, pick only the brighter pixel.
How do I achieve this in numpy?
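One possible reading of this requirement is: a pixel is suppressed only if some strictly brighter pixel in its window has a value2 within dth of its own. The sketch below implements that reading with shifted array views. The function name conditional_nms, the odd kernel size k, and the min_value floor are assumptions, and other interpretations of the condition are possible.

```python
import numpy as np

def conditional_nms(value, value2, k=3, dth=1.0, min_value=0.0):
    """Keep a pixel unless a brighter neighbour with similar value2 exists.

    Assumed rule: pixel p is dropped if any pixel q in its k x k window has
    value[q] > value[p] and |value2[q] - value2[p]| < dth. k must be odd.
    """
    value = np.asarray(value, dtype=float)
    value2 = np.asarray(value2, dtype=float)
    h, w = value.shape
    pad = k // 2
    # -inf padding: out-of-image neighbours can never be "brighter".
    v_pad = np.pad(value, pad, mode="constant", constant_values=-np.inf)
    v2_pad = np.pad(value2, pad, mode="constant", constant_values=0.0)

    suppressed = np.zeros((h, w), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            if dy == pad and dx == pad:
                continue  # skip the pixel itself
            nb_value = v_pad[dy:dy + h, dx:dx + w]
            nb_value2 = v2_pad[dy:dy + h, dx:dx + w]
            brighter = nb_value > value
            similar = np.abs(nb_value2 - value2) < dth
            suppressed |= brighter & similar
    return ~suppressed & (value > min_value)
```

The loop runs over the k*k window offsets only, so each iteration is a vectorized operation on the whole image. With this rule, a pixel that is itself suppressed can still suppress a dimmer neighbour; whether that is desirable depends on the application.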

Related

Difference between channel_shift_range and brightness_range in ImageDataGenerator (Keras)?

There are multiple pages (like this and this) that present examples about the effect of channel_shift_range in images. At first glance, it appears as if the images have only had a change in brightness applied.
This issue has multiple comments mentioning this observation. So, if channel_shift_range and brightness_range do the same, why do they both exist?
After long hours of reverse engineering, I found that:
channel_shift_range: applies the (R + i, G + i, B + i) operation to all pixels in an image, where i is an integer value within the range [0, 255].
brightness_range: applies the (R * f, G * f, B * f) operation to all pixels in an image, where f is a float value around 1.0.
Both parameters are related to brightness. However, I found a very interesting difference: the operation applied by channel_shift_range roughly preserves the contrast of an image, while the operation applied by brightness_range roughly multiplies the contrast of an image by f and roughly preserves its saturation. Note that these conclusions may not hold for large values of i and f, since the image becomes so bright that much of its information is lost.
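The contrast claim can be checked with a small numpy sketch of the two operations as described above. This is not the Keras implementation: channel_shift and brightness are hypothetical helper names, and the standard deviation of the pixel values is used here as a rough stand-in for contrast.

```python
import numpy as np

def channel_shift(img, i):
    """Additive operation (R + i, G + i, B + i), clipped to [0, 255]."""
    return np.clip(img.astype(np.float64) + i, 0, 255)

def brightness(img, f):
    """Multiplicative operation (R * f, G * f, B * f), clipped to [0, 255]."""
    return np.clip(img.astype(np.float64) * f, 0, 255)

rng = np.random.default_rng(0)
# Mid-range pixel values so that neither operation clips.
img = rng.integers(50, 101, size=(8, 8, 3)).astype(np.uint8)

shifted = channel_shift(img, 20)
scaled = brightness(img, 1.5)

print("contrast (std) original:", img.std())
print("contrast (std) shifted :", shifted.std())   # unchanged by the shift
print("contrast (std) scaled  :", scaled.std())    # multiplied by f
```

Once values start to clip at 0 or 255, both relations break down, which matches the caveat about large i and f.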
Channel shift and Brightness change are completely different.
Channel Shift: Channel shift changes the color saturation level (e.g. light red/dark red) of pixels by changing the [R,G,B] channels of the input image. Channel shift is used to introduce color augmentation in the dataset so that the model learns color-based features irrespective of their saturation value.
Below is an example of channel shift from the mentioned article:
In the above image, if you observe carefully, objects (especially the cloud region) are still clearly visible and distinguishable from their neighboring regions even after channel shift augmentation.
Brightness change: The brightness level of an image describes the light intensity throughout the image and is used to add under-exposure and over-exposure augmentation to the dataset.
Below is an example of brightness augmentation:
In the above image, at low brightness values objects (e.g. clouds) have lost their visibility due to the low light intensity level.

Simulate Camera in Numpy

I have the task to simulate a camera with a full well capacity of 10,000 photons per sensor element in numpy. My first idea was to do it like this:
camera = np.random.normal(0.0,1/10000,np.shape(img))
Imgwithnoise= img+camera
but it hardly shows an effect.
Does anyone have an idea how to do it?
From what I interpret from your question, if each physical pixel of the sensor has a 10,000 photon limit, this points to the brightest a digital pixel can be on your image. Similarly, 0 incident photons make the darkest pixels of the image.
You have to create a map from the physical sensor to the digital image. For the sake of simplicity, let's say we work with a grayscale image.
Your first task is to fix the colour bit-depth of the image. That is to say, is your image an 8-bit colour image (which is usually the case)? If so, the brightest pixel has a brightness value of 255 (= 2^8 - 1, for 8 bits). The darkest pixel is always chosen to have a value of 0.
So you'd have to map from the range 0 --> 10,000 (sensor) to 0 --> 255 (image). The most natural idea would be to do a linear map (i.e. every pixel of the image is obtained by the same multiplicative factor from every pixel of the sensor), but to correctly interpret (according to the human eye) the brightness produced by n incident photons, often different transfer functions are used.
A transfer function in a simplified version is just a mathematical function doing this map - logarithmic TFs are quite common.
Also, since it seems like you're generating noise, it is unwise and conceptually wrong to add camera itself to the image img. What you should do, is fix a noise threshold first - this can correspond to the maximum number of photons that can affect a pixel reading as the maximum noise value. Then you generate random numbers (according to some distribution, if so required) in the range 0 --> noise_threshold. Finally, you use the map created earlier to add this noise to the image array.
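A minimal sketch of this recipe, assuming an 8-bit grayscale image, a linear map between photons and pixel values, and uniformly distributed noise. The helper names and the default noise_threshold of 500 photons are made up for illustration; a logarithmic transfer function or a different noise distribution could be substituted.

```python
import numpy as np

FULL_WELL = 10_000   # photons per sensor element (from the question)
MAX_DN = 255         # brightest value of an 8-bit image

def dn_to_photons(img):
    """Linear map from 8-bit pixel values (0..255) to photons (0..10,000)."""
    return np.asarray(img, dtype=np.float64) * (FULL_WELL / MAX_DN)

def photons_to_dn(photons):
    """Linear map back to 8-bit values; the sensor saturates at FULL_WELL."""
    photons = np.clip(photons, 0, FULL_WELL)
    return np.round(photons * (MAX_DN / FULL_WELL)).astype(np.uint8)

def add_sensor_noise(img, noise_threshold=500, rng=None):
    """Add uniform noise in the range 0..noise_threshold photons (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    photons = dn_to_photons(img)
    noise = rng.uniform(0, noise_threshold, size=photons.shape)
    return photons_to_dn(photons + noise)

img = np.full((4, 4), 128, dtype=np.uint8)
noisy = add_sensor_noise(img, noise_threshold=500, rng=np.random.default_rng(0))
```

Working in photon units and converting back at the end keeps the noise level tied to the sensor's full well capacity rather than to the image's bit depth.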
Hope this helps and is in tune with what you wish to do. Cheers!

conv2d on non-rectangular image in Tensorflow

I have a dataset of images that are half black in an upper-triangular fashion, i.e. all pixels below the main diagonal are black.
Is there a way in Tensorflow to give such an image to a conv2d layer and mask or limit the convolution to only the relevant pixels?
If the black translates to 0, then you don't need to do anything. The convolution will multiply the 0 by whatever weight it has, so it's not going to contribute to the result. If it does not, you can multiply the data by a binary mask to make those pixels 0.
For all-black regions you will still get the bias term, if the layer has one.
You could multiply the result with a binary mask to 0 out the areas you don't want populated. This way you can also decide to drop results that have too many black cells, like around the diagonal.
You can also write your own custom operation that does what you want. I would recommend against it because you only get a speedup of at most 2 (the other operations will lower it). You probably get more performance by running on a GPU.
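The masking idea above can be illustrated without TensorFlow. The sketch below uses a small hand-written NumPy convolution (conv2d_same, a hypothetical stand-in for a conv2d layer with "same" padding) to show that black regions produce only the bias, and that multiplying the result by the same binary mask zeroes them out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same(img, kernel, bias=0.0):
    """Stand-in for a single-channel conv2d layer with 'same' zero padding."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = sliding_window_view(padded, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel) + bias

n = 6
img = np.random.default_rng(0).random((n, n))
mask = np.triu(np.ones((n, n)))      # 1 on and above the main diagonal, 0 below

masked_img = img * mask              # force the "black" pixels to exactly 0
out = conv2d_same(masked_img, np.ones((3, 3)) / 9.0, bias=0.5)
out_masked = out * mask              # zero out the unwanted output region

print(out[n - 1, 0])         # only the bias survives far below the diagonal
print(out_masked[n - 1, 0])  # 0.0 after masking the output
```

In TensorFlow the same two multiplications would be applied to the input and output tensors of the layer, with the mask broadcast over the batch and channel dimensions.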

Is there any available DM script that can compare two images and know the difference

Is there any available DM script that can compare two images and know the difference?
I mean a script that can compare two or more images and determine their similarity. For example, if 95% of the area of one image is the same as in another image, then the similarity of these two images is 95%.
The script should also be able to compare the brightness and contrast distributions of the images.
Thanks,
This question is a bit ill-defined, as "similarity" between images depends a lot on what you want.
If by "95% of the area is the same" you mean that 95% of the pixels are of identical value in images A & B, you can simply create a mask and sum() it to count the number of pixels, i.e.:
sum( abs(A-B)==0 ? 1 : 0 )
However, this will utterly fail if the images A & B are shifted with respect to each other even by a single pixel. It will also fail, if A & B are of same contrast but different absolute value.
I guess the intended question was how to find the similarity of two images in a fuzzy way.
For this, one way is to compute the cross-correlation. DM has a function for this:
image xcorr= CrossCorrelate(ref,img)
From xcorr, the peak position gives x- and y- shift between the two, the peak intensity gives "similarity" of the two.
If you know there is no shift between the two, you can just multiply and sum:
number similarity1=sum(img1*img2)
Another way to measure similarity is to calculate the Euclidean distance between the two:
number similarity2=sqrt(sum((img1-img2)**2))
"similarity2" calculates the "pure" similarity. "similarity1" is the pure similarity plus the mean intensity of img1 and img2. The difference is essentially this,
(a-b)**2=a**2+b**2-2*a*b.
Summed over all pixels, the left-hand side is "similarity2" squared, and the last term on the right is -2 times the "crosscorrelation" or "similarity1".
I think "similarity1" is called cross-correlation, "similarity2" is called correlation coefficient.
For example, when comparing two diffraction patterns, use "similarity2" if you want to compute the degree of similarity. If you want to compute the degree of similarity plus a certain character of the diffraction pattern, use "similarity1".
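The relation between the two measures can be checked numerically. The sketch below uses NumPy rather than DM script, purely to illustrate the arithmetic; the function names mirror the variables above, and fraction_identical is a hypothetical counterpart of the pixel-counting mask.

```python
import numpy as np

def similarity1(img1, img2):
    """Zero-shift cross-correlation: sum of the pixel-wise product."""
    return float(np.sum(img1 * img2))

def similarity2(img1, img2):
    """Euclidean distance between the two images (0 means identical)."""
    return float(np.sqrt(np.sum((img1 - img2) ** 2)))

def fraction_identical(a, b):
    """Fraction of pixels with exactly the same value in both images."""
    return float(np.mean(a == b))

rng = np.random.default_rng(0)
img1 = rng.random((16, 16))
img2 = rng.random((16, 16))

# (a-b)**2 = a**2 + b**2 - 2*a*b, summed over all pixels
lhs = similarity2(img1, img2) ** 2
rhs = np.sum(img1 ** 2) + np.sum(img2 ** 2) - 2 * similarity1(img1, img2)
print(lhs, rhs)
```

Note that similarity2 is a distance, so smaller values mean more similar images, while a larger similarity1 means a stronger match.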

Faster way to perform point-wise interpolation of numpy array?

I have a 3D datacube, with two spatial dimensions and the third being a multi-band spectrum at each point of the 2D image.
H[x, y, bands]
Given a wavelength (or band number), I would like to extract the 2D image corresponding to that wavelength. This would be simply an array slice like H[:,:,bnd]. Similarly, given a spatial location (i,j) the spectrum at that location is H[i,j].
I would also like to 'smooth' the image spectrally, to counter low-light noise in the spectra. That is, for band bnd, I choose a window of size wind and fit an n-degree polynomial to the spectrum in that window. With polyfit and polyval I can find the fitted spectral value at that point for band bnd.
Now, if I want the whole image of bnd from the fitted value, then I have to perform this windowed-fitting at each (i,j) of the image. I also want the 2nd-derivative image of bnd, that is, the value of the 2nd-derivative of the fitted spectrum at each point.
Running over the points, I could polyfit-polyval-polyder each of the x*y spectra. While this works, this is a point-wise operation. Is there some pytho-numponic way to do this faster?
If you do least-squares polynomial fitting to points (x+dx[i],y[i]) for a fixed set of dx and then evaluate the resulting polynomial at x, the result is a (fixed) linear combination of the y[i]. The same is true for the derivatives of the polynomial. So you just need a linear combination of the slices. Look up "Savitzky-Golay filters".
EDITED to add a brief example of how S-G filters work. I haven't checked any of the details and you should therefore not rely on it to be correct.
So, suppose you take a filter of width 5 and degree 2. That is, for each band (ignoring, for the moment, ones at the start and end) we'll take that one and the two on either side, fit a quadratic curve, and look at its value in the middle.
So, if f(x) ~= ax^2+bx+c and f(-2),f(-1),f(0),f(1),f(2) = p,q,r,s,t then we want 4a-2b+c ~= p, a-b+c ~= q, etc. Least-squares fitting means minimizing (4a-2b+c-p)^2 + (a-b+c-q)^2 + (c-r)^2 + (a+b+c-s)^2 + (4a+2b+c-t)^2, which means (taking partial derivatives w.r.t. a,b,c):
4(4a-2b+c-p)+(a-b+c-q)+(a+b+c-s)+4(4a+2b+c-t)=0
-2(4a-2b+c-p)-(a-b+c-q)+(a+b+c-s)+2(4a+2b+c-t)=0
(4a-2b+c-p)+(a-b+c-q)+(c-r)+(a+b+c-s)+(4a+2b+c-t)=0
or, simplifying,
34a+10c = 4p+q+s+4t
10b = -2p-q+s+2t
10a+5c = p+q+r+s+t
so a,b,c = (2p-q-2r-s+2t)/14, (2(t-p)+(s-q))/10, (-3p+12q+17r+12s-3t)/35.
And of course c is the value of the fitted polynomial at 0, and therefore is the smoothed value we want. So for each spatial position, we have a vector of input spectral data, from which we compute the smoothed spectral data by multiplying by a matrix whose rows (apart from the first and last couple) look like [0 ... 0 -3/35 12/35 17/35 12/35 -3/35 0 ... 0], with the central 17/35 on the main diagonal of the matrix.
So you could do a matrix multiplication for each spatial position; but since it's the same matrix everywhere you can do it with a single call to tensordot. So if S contains the matrix I just described (er, wait, no, the transpose of the matrix I just described) and A is your 3-dimensional data cube, your spectrally-smoothed data cube would be numpy.tensordot(A, S, axes=1). The axes=1 argument contracts the last (band) axis of A with the first axis of S; the default, axes=2, would try to contract two axes and fail for a 2-D S.
This would be a good point at which to repeat my warning: I haven't checked any of the details in the few paragraphs above, which are just meant to give an indication of how it all works and why you can do the whole thing in a single linear-algebra operation.
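A minimal sketch of the whole procedure, assuming a window of 5 bands and a quadratic fit. The helper names savgol_weights and filter_matrix are hypothetical; the weights are obtained numerically from the least-squares pseudo-inverse instead of by hand, and the first and last two bands are simply left at zero rather than treated specially.

```python
import math
import numpy as np

def savgol_weights(half_width=2, degree=2, deriv=0):
    """Weights giving the deriv-th derivative at the window centre of the
    least-squares polynomial fitted to 2*half_width + 1 equally spaced points."""
    x = np.arange(-half_width, half_width + 1)
    V = np.vander(x, degree + 1, increasing=True)   # columns: 1, x, x^2, ...
    coeffs = np.linalg.pinv(V)                      # row k maps samples to the x^k coefficient
    return math.factorial(deriv) * coeffs[deriv]

def filter_matrix(n_bands, weights):
    """Banded matrix with the weights centred on the main diagonal.
    Edge rows are left as zeros (edge bands are not handled in this sketch)."""
    hw = len(weights) // 2
    M = np.zeros((n_bands, n_bands))
    for j in range(hw, n_bands - hw):
        M[j, j - hw:j + hw + 1] = weights
    return M

# Toy cube H[x, y, bands] whose spectra are exactly quadratic in the band index.
bands = np.arange(10)
H = np.broadcast_to(1.0 + 2.0 * bands + 3.0 * bands ** 2, (4, 3, 10)).copy()

w_smooth = savgol_weights(deriv=0)   # value of the fitted polynomial at the centre
w_second = savgol_weights(deriv=2)   # its second derivative at the centre

# One tensordot per output cube; the transpose plays the role of S above.
smoothed = np.tensordot(H, filter_matrix(10, w_smooth).T, axes=1)
second_deriv = np.tensordot(H, filter_matrix(10, w_second).T, axes=1)
```

A single band image is then smoothed[:, :, bnd] or second_deriv[:, :, bnd]. If only one band is needed, the weights can be applied directly to the five neighbouring slices of H, which avoids building the full matrix.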