PyTorch/NumPy: Create binary mask from RGB image

I have a tensor containing a batch of 4 RGB 128x128 images, so the tensor has the shape (4, 128, 128, 3). I need to create a binary mask from this tensor where each mask pixel is black if the corresponding image pixel is black and white if it is not black.
I tried masks = torch.where(image > 0, 1.0, 0.), but this way the resulting masks obviously still have three channels. So what's the best way to create a binary mask from an RGB tensor? Same question for NumPy.

I solved it by summing along the RGB channel axis before calling where:
images_sum = images.sum(dim=3)                 # collapse RGB channels: (4, 128, 128, 3) -> (4, 128, 128)
masks = torch.where(images_sum > 0, 1.0, 0.)   # 1.0 (white) wherever any channel is non-zero
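For the NumPy half of the question, the same approach works. A minimal sketch, with a dummy random batch standing in for the real images:
import numpy as np
images = np.random.randint(0, 256, size=(4, 128, 128, 3), dtype=np.uint8)  # dummy batch
images_sum = images.sum(axis=3)                # collapse RGB channels -> (4, 128, 128)
masks = np.where(images_sum > 0, 1.0, 0.0)     # 1.0 (white) wherever any channel is non-zero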

Related

Apply 3D mask to 3D image, background is white

I am trying to apply a binary mask to a 3D image by multiplying them. It, however, returns an image with a white background rather than the black one I would expect, since the binary mask is 0 and 1 and all of the background pixels equal 0.
First I load the 3D scan/image (.nii.gz) and mask (.nii.gz) with nibabel, using:
scan = nib.load(path_to_scan).get_fdata()
mask = nib.load(path_to_mask).get_fdata()
Then I use:
masked_scan = scan*mask
Below is a visualization showing that when applying another mask, the background is darker.
Below is what they look like as volumes in 3D Slicer.
What am I missing? The aim is to have a black background...
I also tried np.where(mask==1, scan, mask*scan)
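A hedged diagnostic sketch, assuming nibabel data as in the question: it checks that the mask really is binary, and, for scans whose background intensity is negative (e.g. CT air sits near -1000 HU, so a masked-out value of 0 displays brighter than the original background), fills the masked-out region with the scan minimum instead of 0:
import numpy as np
import nibabel as nib
scan = nib.load(path_to_scan).get_fdata()
mask = nib.load(path_to_mask).get_fdata()
print(np.unique(mask))           # a truly binary mask prints [0. 1.]
print(scan.min(), scan.max())    # CT volumes often have a background near -1000 HU
# Filling with scan.min() keeps the masked-out region as dark as the darkest voxel
masked_scan = np.where(mask == 1, scan, scan.min())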

How do I see the actual color of a single RGB value in Google Colab?

Very basic question. I have a single vector (e.g., [53, 21, 110]) and I want to print the RGB color it represents in a colab notebook. Like a color swatch. What's the simplest way to do this?
The simplest way would be using the Image module from PIL. According to the documentation, you can construct an image with:
PIL.Image.new(mode, size, color=0)
mode [required]: determines the mode used for the image, it can be RGB, RGBA, HSV, etc. You can find more modes in the docs
size [required]: this is a tuple (width, height) that represents the dimensions of your image in pixels.
color [optional]: this is the color of the image, it can receive a tuple to represent the RGB color in your case. The default color is black.
Then, to show the image within colab, you would use
display(img)
Given your question, the mode would need to be 'RGB', and if your vector is a list, you need to convert it into a tuple to use it.
To show a 300px by 300px image, the code would look like:
from PIL import Image
img = Image.new('RGB', (300,300), color = (53, 21, 110))
display(img)
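If the vector starts out as a Python list, the conversion mentioned above is a single call:
vec = [53, 21, 110]
img = Image.new('RGB', (300, 300), color=tuple(vec))
display(img)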

Subtract Blending Mode

I have been trying to implement some of the layer blending modes of GIMP (GEGL) in Python. Currently, I am stuck on the Subtract blending mode. As per the documentation, Subtract = max(Background - Foreground, 0). However, doing a simple test in GIMP, with Background image = (205,36,50) and Foreground image = (125,38,85), the resultant composite colour comes out as (170, 234, 0), which doesn't quite follow the math above.
As per my understanding, Subtract does not use alpha blending. So, could this be a compositing issue? Or does Subtract follow different math? More details and background can be found in a separate SO question.
EDIT [14/10/2021]:
I tried with this image as my Source and performed the following steps on images normalised to the range [0, 1]:
Applied a Colour Dodge (no prior conversion from sRGB -> linear RGB was done) and obtained this from my implementation, which matches the GIMP result.
sRGB -> linear RGB conversion on Colour Dodge and Source image. [Reference]
Apply Subtract blending with Background = Colour Dodge and Foreground = Source Image
Reconvert linear RGB -> sRGB
I obtain this from my POC: left RGB triplet (69, 60, 34); right RGB triplet (3, 0, 192). And the GIMP result: left RGB triplet (69, 60, 35); right RGB triplet (4, 255, 255).
If you are looking at channel values in the 0 ➞ 255 range, they are likely gamma-corrected. The operation is possibly done like this:
convert each layer to "linear light" in the 0.0 ➞ 1.0 range using something like
L = ((V/255) ** gamma) (*)
apply the "difference" formula
convert the result back to gamma-corrected:
V = (255 * (Diff ** (1/gamma)))
With gamma=2.2 you obtain 170 for the Red channel, but I don't see why you get 234 on the Green channel.
(*) The actual formula has a special case for the very low values IIRC.
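A minimal NumPy sketch of that round trip, using the plain power-law approximation (the real sRGB transfer function has a linear segment near zero, as the footnote notes), with the subtract clamp applied in linear light:
import numpy as np
def subtract_linear(bg, fg, gamma=2.2):
    # gamma-corrected 0-255 values -> linear light in 0.0-1.0
    bg_lin = (np.asarray(bg) / 255.0) ** gamma
    fg_lin = (np.asarray(fg) / 255.0) ** gamma
    diff = np.clip(bg_lin - fg_lin, 0.0, None)   # Subtract = max(B - F, 0) in linear light
    return np.rint(255 * diff ** (1.0 / gamma)).astype(int)  # back to gamma-corrected 0-255
print(subtract_linear((205, 36, 50), (125, 38, 85)))  # -> [170   0   0]
The Red channel reproduces the 170 mentioned above; Green and Blue clamp to 0, so GIMP's 234 on Green remains unexplained by this model.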

How to Zero Pad RGB Image?

I want to pad an RGB image of size 500x500x3 to 512x512x3. I understand that I need to add 6 pixels on each border, but I cannot figure out how. I have read the numpy.pad function docs but couldn't understand how to use it. Code snippets would be appreciated.
If you need to pad with zeros:
RGB = np.pad(RGB, pad_width=[(6, 6),(6, 6),(0, 0)], mode='constant')
Use the constant_values argument to pad with different values per axis (the default is 0):
RGB = np.pad(RGB, pad_width=[(6, 6), (6, 6), (0, 0)], mode='constant', constant_values=[(3, 3), (5, 5), (0, 0)])
We could build a solution with border padding, but it gets a bit complex, so I would like to suggest an alternative approach: first create a 512x512 canvas, then place your original image inside it. The following code illustrates this:
import numpy as np
# Create a larger black canvas (note the shape must be passed as a tuple)
canvas = np.zeros((512, 512, 3), dtype=your_500_500_img.dtype)
canvas[6:506, 6:506] = your_500_500_img
Obviously you can replace 6 and 506 with variables derived from the padding (padding, 512 - padding, etc.), but this code illustrates the concept; a generalized sketch follows below.
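A sketch of that generalization, with the padding computed from the shapes (pad_to is a hypothetical helper name):
import numpy as np
def pad_to(img, target_h, target_w):
    # Hypothetical helper: centre an H x W x C image on a black canvas
    pad_h = (target_h - img.shape[0]) // 2
    pad_w = (target_w - img.shape[1]) // 2
    canvas = np.zeros((target_h, target_w, img.shape[2]), dtype=img.dtype)
    canvas[pad_h:pad_h + img.shape[0], pad_w:pad_w + img.shape[1]] = img
    return canvas
padded = pad_to(np.zeros((500, 500, 3), np.uint8), 512, 512)   # -> shape (512, 512, 3)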

How to resize segmentation mask obtained from Deeplab v3?

Deeplab v3 returns a reduced/resized image and its corresponding mask. How can I resize the image, as well as its corresponding mask, to better fit my specification?
The cv2.resize method can be used, keeping the interpolation method set to cv2.INTER_NEAREST:
resized_image = cv2.resize(segmentation_mask, target_dims, interpolation=cv2.INTER_NEAREST)
This interpolation method will not change the RGB values of the labels present in the mask.
If you are saving the masks after resizing, keep the format '.png'. Other formats tend to change pixel values by a small amount, which is not desirable for segmentation masks.
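A self-contained sketch of both points, with a dummy label mask standing in for the Deeplab output (the 65x65 input size and 21 classes are assumptions):
import cv2
import numpy as np
segmentation_mask = np.random.randint(0, 21, size=(65, 65), dtype=np.uint8)  # dummy labels
# cv2.resize takes the target size as (width, height); nearest-neighbour
# interpolation guarantees only existing label values appear in the output.
resized = cv2.resize(segmentation_mask, (513, 513), interpolation=cv2.INTER_NEAREST)
cv2.imwrite('resized_mask.png', resized)   # PNG is lossless, so label values survive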