How to exclude pixels above a certain threshold? - numpy

I am a beginner in image processing.
Currently, I’m developing an image analysis tool.
One of its features is calculating Pearson’s correlation coefficient between two images after excluding the oversaturated pixels in each of them.
The specific method I’m using is to set all pixel intensities above a certain threshold to 0.
For example:
The intensities of image1:
[0,0,0,50,180,210,80,60,45,18,25,250,247,150]
The intensities of image 2:
[0,0,60,0,0,240,96,70,65,8,55,232,238,110]
Let’s say the custom thresholds are 170 and 220, respectively.
After excluding all oversaturated pixels:
The intensities of image1:
[0,0,0,50,0,0,80,60,45,18,25,0,0,150]
The intensities of image 2:
[0,0,60,0,0,0,96,70,65,8,55,0,0,110]
Then calculate Pearson’s correlation coefficient based on them.
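A minimal numpy sketch of what I mean, using the example arrays and thresholds above (np.corrcoef here simply stands in for however the final coefficient is computed):
import numpy as np
img1 = np.array([0, 0, 0, 50, 180, 210, 80, 60, 45, 18, 25, 250, 247, 150], dtype=float)
img2 = np.array([0, 0, 60, 0, 0, 240, 96, 70, 65, 8, 55, 232, 238, 110], dtype=float)
# set every pixel above its image's threshold to 0
img1_masked = np.where(img1 > 170, 0.0, img1)
img2_masked = np.where(img2 > 220, 0.0, img2)
# Pearson's correlation coefficient of the masked intensities
r = np.corrcoef(img1_masked, img2_masked)[0, 1]
print(r)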
Here are my questions:
Is the above approach feasible? I ask because I saw a similar practice described here: https://forum.image.sc/t/how-to-remove-bright-aspecific-speckles/50672/2 and I just want to make sure.
What exactly are oversaturated pixels? Is this generally a subjective judgment? As far as I know, for example, the maximum intensity value of an 8-bit image is 255, and the maximum intensity value of a 16-bit image is 65535. Can oversaturated pixels have intensities that exceed these values?
Are there any experts in this field who can answer my question? Thank you very much!

Related

Simulate Camera in Numpy

I have the task of simulating a camera with a full well capacity of 10,000 photons per sensor element in numpy. My first idea was to do it like this:
camera = np.random.normal(0.0, 1/10000, np.shape(img))
img_with_noise = img + camera
but it hardly shows any effect.
Does anyone have an idea how to do it?
From what I interpret from your question: if each physical pixel of the sensor has a 10,000-photon limit, that limit corresponds to the brightest a digital pixel can be in your image. Similarly, 0 incident photons make the darkest pixels of the image.
You have to create a map from the physical sensor to the digital image. For the sake of simplicity, let's say we work with a grayscale image.
Your first task is to fix the colour bit-depth of the image. That is to say, is your image an 8-bit colour image? (This is usually the case.) If so, the brightest pixel has a brightness value of 255 (= 2^8 - 1 for 8 bits). The darkest pixel is always chosen to have a value of 0.
So you'd have to map from the range 0 --> 10,000 (sensor) to 0 --> 255 (image). The most natural idea would be a linear map (i.e. every pixel of the image is obtained by the same multiplicative factor from the corresponding pixel of the sensor), but to correctly reproduce (for the human eye) the brightness produced by n incident photons, different transfer functions are often used.
A transfer function in a simplified version is just a mathematical function doing this map - logarithmic TFs are quite common.
Also, since it seems like you're generating noise, it is unwise and conceptually wrong to add camera itself to the image img. What you should do is fix a noise threshold first - this corresponds to the maximum number of photons of noise that can affect a pixel reading. Then you generate random numbers (according to some distribution, if so required) in the range 0 --> noise_threshold. Finally, you use the map created earlier to add this noise to the image array.
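A rough numpy sketch of that workflow, assuming a linear transfer function, an 8-bit output image and a uniform noise distribution (the noise_threshold value below is purely illustrative):
import numpy as np
FULL_WELL = 10000        # photons per sensor element, from the question
MAX_VALUE = 255          # brightest value of an 8-bit image
# hypothetical sensor reading in photons, standing in for the real data
photons = np.random.uniform(0, FULL_WELL, size=(480, 640))
def transfer(photon_counts):
    # linear transfer function: 0..FULL_WELL photons -> 0..255 digital values
    return np.clip(photon_counts / FULL_WELL * MAX_VALUE, 0, MAX_VALUE)
# noise in photon units, capped by an assumed noise threshold,
# then passed through the same sensor-to-image map
noise_threshold = 200    # illustrative maximum noise level in photons
noise = np.random.uniform(0, noise_threshold, size=photons.shape)
img_with_noise = transfer(photons + noise).astype(np.uint8)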
Hope this helps and is in tune with what you wish to do. Cheers!

Tuning first_stage_anchor_generator in faster rcnn model

I am trying to detect some very small objects (~25x25 pixels) in large images (~2040 x 1536 pixels) using the Faster RCNN model from the Object Detection API here: https://github.com/tensorflow/models/tree/master/research/object_detection
I am very confused about the following configuration parameters (I have read the proto file and also tried modifying them and testing):
first_stage_anchor_generator {
  grid_anchor_generator {
    scales: [0.25, 0.5, 1.0, 2.0]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}
I am very new to this area; if someone could explain these parameters a bit, it would be much appreciated.
My question is: how should I adjust the above (or other) parameters to account for the fact that I have very small, fixed-size objects to detect in large images?
Thanks
I don't know the actual answer, but I suspect that the way Faster RCNN works in Tensorflow object detection is as follows:
this article says:
"Anchors play an important role in Faster R-CNN. An anchor is a box. In the default configuration of Faster R-CNN, there are 9 anchors at a position of an image. The following graph shows 9 anchors at the position (320, 320) of an image with size (600, 800)."
and the author gives an image showing overlapping boxes; those are the proposed regions that contain the object, based on the "CNN" part of the "RCNN" model. Next comes the "R" part of the "RCNN" model, which is the region proposal. To do that, there is another neural network trained alongside the CNN to figure out the best-fitting box. There are a lot of "proposals" for where an object could be, based on all the boxes, but we still don't know where it is.
This "region proposal" neural net's job is to find the correct region and it is trained based on the labels you provide with the coordinates of each object in the image.
Looking at this file, I noticed:
line 174: heights = scales / ratio_sqrts * base_anchor_size[0]
line 175: widths = scales * ratio_sqrts * base_anchor_size[1]
which seems to be the final goal of the configurations found in the config file (to generate a list of sliding windows with known widths and heights), while base_anchor_size is created with a default of [256, 256]. In the comments the author of the code wrote:
"For example, setting scales=[.1, .2, .2]
and aspect ratios = [2,2,1/2] means that we create three boxes: one with scale
.1, aspect ratio 2, one with scale .2, aspect ratio 2, and one with scale .2
and aspect ratio 1/2. Each box is multiplied by "base_anchor_size" before
placing it over its respective center."
which gives insight into how these boxes are created. The code seems to be creating a list of boxes based on the scales = [stuff] and aspect_ratios = [stuff] parameters that will be used to slide over the image. The scale is fairly straightforward: it is how much the default square box of 256 by 256 should be scaled before it is used. The aspect ratio is what changes the original square box into a rectangle that is closer to the (scaled) shape of the objects you expect to encounter.
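To make the quoted comment concrete, here is a small numpy sketch of its own example, using the formulas from lines 174-175 and the default base_anchor_size of [256, 256]:
import numpy as np
base_anchor_size = np.array([256.0, 256.0])   # default in the code
scales = np.array([0.1, 0.2, 0.2])            # example from the quoted comment
aspect_ratios = np.array([2.0, 2.0, 0.5])
ratio_sqrts = np.sqrt(aspect_ratios)
heights = scales / ratio_sqrts * base_anchor_size[0]   # line 174
widths = scales * ratio_sqrts * base_anchor_size[1]    # line 175
for s, a, w, h in zip(scales, aspect_ratios, widths, heights):
    print("scale %.1f, aspect ratio %.1f: %.1f x %.1f (width x height)" % (s, a, w, h))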
Meaning, to optimally configure the scales and aspect ratios, you should find the "typical" sizes of the objects in the image, whatever they are (e.g. 20 by 30, 5 by 10, etc.), and figure out how much the default 256 by 256 square box should be scaled to optimally fit them. Then find the "typical" aspect ratios of your objects (according to Google, an aspect ratio is: the ratio of the width to the height of an image or screen) and set those as your aspect ratio parameters.
Note: it seems that the number of elements in the scales and aspect_ratios lists in the config file should be the same but I don't know for sure.
Also, I am not sure how to find the optimal stride, but if your objects are smaller than 16 by 16 pixels, the sliding window you create by setting the scales and aspect ratios to what you want might just skip your object altogether.
I believe proposal anchors are generated only for Faster RCNN model types. In this file you can see which parameters may be set for anchor generation, for the config lines you mentioned.
I tried setting base_anchor_size, however I failed. Though this FasterRCNNTutorial mentions that:
[...] you also need to configure the anchor sizes and aspect ratios in the .config file. The base anchor size is 255,255.
The anchor ratios will multiply the x dimension and divide the y dimension, so if you have an aspect ratio of 0.5 your 255x255 anchor becomes 128x510. Each aspect ratio in the list is applied, then the results are multiplied by the scales. So the first step is to resize your images to the training/testing size, then manually check what the smallest and largest objects you expect are, and what the most extreme aspect ratios will be. Set up the config file with values that will cover these cases when the base anchor size is adjusted by the aspect ratios and multiplied by the scales.
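Taking the tutorial's rule at face value (note that it applies the raw ratio, while the code quoted in the other answer uses its square root), the arithmetic for an aspect ratio of 0.5 looks like this:
base = 255.0
aspect_ratio = 0.5
scale = 1.0
width = base * aspect_ratio * scale    # 127.5, roughly the 128 in the tutorial
height = base / aspect_ratio * scale   # 510.0
print(width, height)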
I think it's pretty straightforward. I also used this 'workaround'.

Does input size affect mobilenet-ssd aspect ratios and real anchor ratios? (Tensorflow API)

I have recently been using the TensorFlow Object Detection API. The default SSD-MobileNet v1 uses 300 x 300 images as training input, but I am going to set the image width and height to different values, for instance 320 x 180. Do the aspect ratios in the .config represent the real width/height ratio of the anchors, or are they only for square images?
You can change the "size" to any different value; the general guidance is to preserve the aspect ratio of the original image, while the size itself can be a different value.
Aspect ratios represent the real ratio of the anchors. You can use them with different input ratios, but you will get the best results if your input ratio is close to square.

Finding the length of a line through every pixel

I have a raster image with multiple polyline feature classes over it. The lines are not overlapping but they are in multiple different orientations. For every pixel in the raster, I want to calculate the length of the line through that pixel so that the result would be a raster with cells assigned a float value of zero to 2^0.5 times the cell size. What's the best way to do this? I'm using ArcPro with an advanced license.
You can have a look at the answers to a similar question here (using R, but you have a license for that too):
https://gis.stackexchange.com/questions/119993/convert-line-shapefile-to-raster-value-total-length-of-lines-within-cell/120175
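If you would rather prototype the idea in Python, here is a minimal sketch of the same per-cell clipping using shapely (the line geometry, cell size and grid extent below are placeholders):
from shapely.geometry import LineString, box
line = LineString([(0.0, 0.0), (10.0, 7.0)])   # placeholder polyline
cell_size = 1.0
n_cols, n_rows = 10, 10
# length of the line clipped to each cell; cells the line misses get 0.0
lengths = []
for row in range(n_rows):
    row_vals = []
    for col in range(n_cols):
        x0, y0 = col * cell_size, row * cell_size
        cell = box(x0, y0, x0 + cell_size, y0 + cell_size)
        row_vals.append(line.intersection(cell).length)
    lengths.append(row_vals)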

Texture format for cellular automata in OpenGL ES 2.0

I need some quick advice.
I would like to simulate a cellular automaton (from A Simple, Efficient Method for Realistic Animation of Clouds) on the GPU. However, I am limited to OpenGL ES 2.0 shaders (in WebGL), which do not support any bitwise operations.
Since every cell in this cellular automaton represents a boolean value, storing 1 bit per cell would be ideal. So what is the most efficient way of representing this data in OpenGL's texture formats? Are there any tricks, or should I just stick with a straightforward RGBA texture?
EDIT: Here are my thoughts so far...
At the moment I'm thinking of going with either plain GL_RGBA8, GL_RGBA4 or GL_RGB5_A1:
Possibly I could pick GL_RGBA8 and try to extract the original bits using floating point ops. E.g. x*255.0 gives an approximate integer value. However, extracting the individual bits is a bit of a pain (i.e. dividing by 2 and rounding a couple of times; see the sketch after these options). Also I'm wary of precision problems.
If I pick GL_RGBA4, I could store 1.0 or 0.0 per component, but then I could probably also try the same trick as with GL_RGBA8, in this case only x*15.0. I'm not sure whether it would be faster, seeing as there would be fewer ops to extract the bits but less information per texture read.
Using GL_RGB5_A1 I could try and see if I can pack my cells together with some additional information like a color per voxel where the alpha channel stores the 1 bit cell state.
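To pin down the float-only bit extraction from the GL_RGBA8 option, here it is written out in plain Python (in the shader this would become divisions, floor() and rounding, since ES 2.0 GLSL has no bitwise operators):
def extract_bits(x):
    # x is a channel value in [0.0, 1.0] holding 8 packed cell states
    v = float(round(x * 255.0))     # approximate integer value
    bits = []
    for _ in range(8):
        half = v / 2.0
        whole = float(int(half))    # floor(v / 2), no bitwise ops needed
        bits.append(int(round((half - whole) * 2.0)))   # lowest remaining bit
        v = whole
    return bits                     # least significant bit first
print(extract_bits(5.0 / 255.0))    # [1, 0, 1, 0, 0, 0, 0, 0]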
Create a second texture and use it as a lookup table. In each 256x256 block of the texture you can represent one boolean operation where the inputs are represented by the row/column and the output is the texture value. Actually in each RGBA texture you can represent four boolean operations per 256x256 region. Beware texture compression and MIP maps, though!