I have 64x64 NumPy images that look like this:
This image is labeled as 5. I have 5 different categories (classes). Is there any way I can store such an image as RGB, or as (64,64,5)?
For example, R and G as the image itself and B as the label mask? I am a little confused, and my supervisor has been very vague about it. Or should it maybe be (64,64,2), with the second slice as the label mask?
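For reference, a minimal sketch of the (64,64,5) one-hot idea described above, assuming img is the 64x64 image and mask is a 64x64 integer array of class ids 0..4 (both names are hypothetical):
import numpy as np
# Hypothetical inputs: img is the image, mask holds class ids 0..4
img = np.zeros((64, 64), dtype=np.uint8)
mask = np.random.randint(0, 5, size=(64, 64))
# One-hot encode the label mask into 5 channels -> shape (64, 64, 5)
one_hot = (mask[..., None] == np.arange(5)).astype(np.uint8)
# Or the (64,64,2) variant: image in slice 0, label mask in slice 1
stacked = np.stack([img, mask.astype(np.uint8)], axis=-1)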
You can use a different color for each class, as follows:
import cv2
# from google.colab.patches import cv2_imshow
import numpy as np
gray = cv2.imread('input.png', 0)  # flag 0 loads the image as grayscale
zero = np.zeros_like(gray)  # an all-black channel of the same shape
# 5 classes, different color channels
rgb_1 = np.stack([gray, zero, zero], axis=2)
rgb_2 = np.stack([zero, gray, zero], axis=2)
rgb_3 = np.stack([gray, gray, zero], axis=2)
rgb_4 = np.stack([zero, zero, gray], axis=2)
rgb_5 = np.stack([gray, zero, gray], axis=2)
result = np.vstack([np.hstack([rgb_1, rgb_2, rgb_3]), np.hstack([rgb_4, rgb_5, rgb_5])])
# cv2_imshow(result)
cv2.imshow('result', result)
cv2.waitKey(0)
The result will be like this:
I'm replicating the last class for visualization purposes only!
Otherwise, if you want to label the classes inside the image itself, you can use scikit-image's labeling functions: from skimage.measure import label and from skimage.color import label2rgb, as done in this answer.
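For illustration, a minimal sketch of that approach, assuming binary is a thresholded input image (hypothetical data):
import numpy as np
from skimage.measure import label
from skimage.color import label2rgb
# Hypothetical input: a binary image with a few blobs
binary = np.zeros((64, 64), dtype=bool)
binary[10:20, 10:20] = True
binary[40:55, 30:50] = True
# Label connected regions, then map each label to its own color
labels = label(binary)
overlay = label2rgb(labels, bg_label=0)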
I have a NumPy array, and I need to cut out a part of it based on an ROI like (x1,y1),(x2,y2). The background of the array is zero.
I need to crop that part from the first NumPy array and then resize the cropped array to (640,480) pixels.
I am new to NumPy and I don't have any clue how to do this.
#numpy1: the first numpy array
roi=[(1,2),(3,4)]
It kind of sounds like you want to do some image processing. Therefore, I suggest you have a look at the OpenCV library. In its Python implementation, images are basically NumPy arrays, so cropping and resizing become quite easy:
import cv2
import numpy as np
# OpenCV images are NumPy arrays
img = cv2.imread('path/to/your/image.png') # Just use your NumPy array
# instead of loading some image
# Set up ROI [(x1, y1), (x2, y2)]
roi = [(40, 40), (120, 150)]
# ROI cutout of image (the first axis is rows, i.e. y)
cutout = img[roi[0][1]:roi[1][1], roi[0][0]:roi[1][0], :]
# Generate new image from cutout with desired size
new_img = cv2.resize(cutout, (640, 480))
# Just some output for visualization
img = cv2.rectangle(img, roi[0], roi[1], (0, 255, 0), 2)
cv2.imshow('Original image with marked ROI', img)
cv2.imshow('Resized cutout of image', new_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.5
NumPy: 1.19.1
OpenCV: 4.4.0
----------------------------------------
You can crop an array like
array = array[start_row:stop_row, start_col:stop_col]
or, treating your roi as a (start, stop) pair per axis,
array = array[roi[0][0]:roi[0][1], roi[1][0]:roi[1][1]]
or one of
array = array[slice(*roi[0]), slice(*roi[1])]
array = array[tuple(slice(*r) for r in roi)]
depending on the amount of abstraction and over-engineering that you need.
I recommend using slicing and skimage. skimage.transform.resize is what you need.
import matplotlib.pyplot as plt
from skimage import data
from skimage.transform import resize
image = data.camera()
crop = image[10:100, 10:100]
# Note: skimage's resize takes the output shape as (rows, cols)
crop = resize(crop, (640, 480))
plt.imshow(crop)
plt.show()
For more about slicing, please see here.
For details on skimage, see here.
I'm working on an ROI pooling layer for Fast R-CNN, and I am used to using TensorFlow. I found that tf.image.crop_and_resize can act as the ROI pooling layer.
But I have tried many times and cannot get the result I expected. Or is the result I got actually correct?
Here is my code:
import cv2
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
img_path = r'F:\IMG_0016.JPG'
img = cv2.imread(img_path)
img = img.reshape([1,580,580,3])
img = img.astype(np.float32)
#img = np.concatenate([img,img],axis=0)
img_ = tf.Variable(img) # img shape is [1,580,580,3]
boxes = tf.Variable([[100,100,300,300],[0.5,0.1,0.9,0.5]])
box_ind = tf.Variable([0,0])
crop_size = tf.Variable([100,100])
#b = tf.image.crop_and_resize(img,[[0.5,0.1,0.9,0.5]],[0],[50,50])
c = tf.image.crop_and_resize(img_,boxes,box_ind,crop_size)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
a = c.eval(session=sess)
plt.imshow(a[0])
plt.imshow(a[1])
Here are my original image and the results: a0, a1.
If I am doing something wrong, can anyone show me how to use this function? Thanks.
Actually, there's no problem with Tensorflow here.
From the doc of tf.image.crop_and_resize (emphasis mine):
boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4].
The i-th row of the tensor specifies the coordinates of a box in the
box_ind[i] image and is specified in normalized coordinates [y1, x1,
y2, x2]. A normalized coordinate value of y is mapped to the image
coordinate at y * (image_height - 1), so as the [0, 1] interval of
normalized image height is mapped to [0, image_height - 1] in image
height coordinates. We do allow y1 > y2, in which case the sampled
crop is an up-down flipped version of the original image. The width
dimension is treated similarly. Normalized coordinates outside the [0,
1] range are allowed, in which case we use extrapolation_value to
extrapolate the input image values.
The boxes argument needs normalized coordinates. That's why you get a black box with your first set of coordinates [100,100,300,300] (not normalized, and no extrapolation value provided), and not with your second set [0.5,0.1,0.9,0.5].
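Concretely, here is a minimal sketch of normalizing your first box for the 580x580 image above, following the doc's y * (image_height - 1) mapping:
# Normalize pixel coordinates [y1, x1, y2, x2] for a 580x580 image
h = w = 580
y1, x1, y2, x2 = 100, 100, 300, 300
box = [y1 / (h - 1), x1 / (w - 1), y2 / (h - 1), x2 / (w - 1)]
# box is roughly [0.173, 0.173, 0.518, 0.518]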
As for why matplotlib shows you gibberish on your second attempt, it's just because you're using the wrong datatype.
Quoting the matplotlib documentation of plt.imshow (emphasis mine):
All values should be in the range [0 .. 1] for floats or [0 .. 255]
for integers. Out-of-range values will be clipped to these bounds.
As you're using floats outside the [0, 1] range, matplotlib clips your values to 1. That's why you get those colored pixels (either solid red, solid green, or solid blue, or a mix of these). Cast your array to uint8 to get an image that makes sense.
plt.imshow(a[1].astype(np.uint8))
Edit:
As requested, I will dive a bit more into
tf.image.crop_and_resize.
[When providing non-normalized coordinates and no extrapolation value], why do I just get a blank result?
Quoting the doc :
Normalized coordinates outside the [0, 1] range are allowed, in which
case we use extrapolation_value to extrapolate the input image values.
So, normalized coordinates outside [0,1] are allowed. But they still need to be normalized !
With your example, [100,100,300,300], the coordinates you provide make the red square. Your original image is the little green dot in the upper left corner! The default value of the extrapolation_value argument is 0, so the values outside the frame of the original image are inferred as [0,0,0], hence the black.
But if your use case needs another value, you can provide it. The pixels will take an RGB value of extrapolation_value % 256 on each channel. This option is useful if the zone you need to crop is not fully included in your original image. (A possible use case would be sliding windows, for example.)
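As an illustration, a minimal sketch of passing that argument (the box and fill value here are arbitrary):
# Crop with a box partly outside the image; out-of-frame pixels
# are filled with extrapolation_value instead of black
c = tf.image.crop_and_resize(img_, [[-0.2, -0.2, 0.5, 0.5]], [0],
                             [100, 100], extrapolation_value=128)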
It seems that tf.image.crop_and_resize expects pixel values in the range [0,1].
Changing your code to
test = tf.image.crop_and_resize(image=image_np_expanded/255., ...)
solved the problem for me.
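For completeness, a minimal sketch of that normalization, assuming image_np_expanded is a uint8 batch of shape [1, H, W, 3] (the variable name comes from the snippet above; the box is arbitrary):
# Scale uint8 pixels to floats in [0, 1] before cropping, so the
# result also displays correctly with plt.imshow
image_float = tf.cast(image_np_expanded, tf.float32) / 255.
crops = tf.image.crop_and_resize(image=image_float,
                                 boxes=[[0.1, 0.1, 0.9, 0.9]],
                                 box_ind=[0],
                                 crop_size=[100, 100])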
Yet another variant is to use the tf.image.central_crop function.
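For example, a minimal sketch (the fraction is arbitrary):
# Keep the central 50% of the image along each dimension
center = tf.image.central_crop(img_, central_fraction=0.5)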
Below is a concrete implementation of the tf.image.crop_and_resize API (TF version 1.14).
import tensorflow as tf
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
tf.enable_eager_execution()
def single_data_2(img_path):
    img = tf.read_file(img_path)
    img = tf.image.decode_bmp(img, channels=1)
    img_4d = tf.expand_dims(img, axis=0)
    processed_img = tf.image.crop_and_resize(img_4d,
                                             boxes=[[0.4529, 0.72, 0.4664, 0.7358]],
                                             crop_size=[64, 64],
                                             box_ind=[0])
    processed_img_2 = tf.squeeze(processed_img, 0)
    raw_img_3 = tf.squeeze(img_4d, 0)
    return raw_img_3, processed_img_2
def plot_two_image(raw, processed):
    fig = plt.figure(figsize=(35, 35))
    raw_ = fig.add_subplot(1, 2, 1)
    raw_.set_title('Raw Image')
    raw_.imshow(raw, cmap='gray')
    processed_ = fig.add_subplot(1, 2, 2)
    processed_.set_title('Processed Image')
    processed_.imshow(processed, cmap='gray')
img_path = 'D:/samples/your_bmp_image.bmp'
raw_img, process_img = single_data_2(img_path)
print(raw_img.dtype,process_img.dtype)
print(raw_img.shape,process_img.shape)
raw_img=tf.squeeze(raw_img,-1)
process_img=tf.squeeze(process_img,-1)
print(raw_img.dtype,process_img.dtype)
print(raw_img.shape,process_img.shape)
plot_two_image(raw_img,process_img)
Below is my working code; the output image is not black, so this may be of help to someone.
for idx in range(len(bboxes)):
    if bscores[idx] >= Threshold:
        # Region of interest
        y_min = int(bboxes[idx][0] * im_height)
        x_min = int(bboxes[idx][1] * im_width)
        y_max = int(bboxes[idx][2] * im_height)
        x_max = int(bboxes[idx][3] * im_width)
        class_label = category_index[int(bclasses[idx])]['name']
        class_labels.append(class_label)
        bbox.append([x_min, y_min, x_max, y_max, class_label, float(bscores[idx])])
        # Crop image - working code
        # Cast to uint8, since encode_jpeg expects a uint8 tensor
        cropped_image = tf.image.crop_to_bounding_box(image, y_min, x_min, y_max - y_min, x_max - x_min).numpy().astype(np.uint8)
        # encode_jpeg encodes a tensor of type uint8 to string
        output_image = tf.image.encode_jpeg(cropped_image)
        # decode_jpeg decodes the string tensor to a tensor of type uint8
        #output_image = tf.image.decode_jpeg(output_image)
        score = bscores[idx] * 100
        file_name = tf.constant(OUTPUT_PATH + image_name[:-4] + '_' + str(idx) + '_' + class_label + '_' + str(round(score)) + '%' + '_' + os.path.splitext(image_name)[1])
        writefile = tf.io.write_file(file_name, output_image)
I edited it to view the foreground image on a white background, but now none of the images are visible.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('91_photo.jpg')
mask = np.zeros(img.shape[:2],np.uint8)
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
rect = (10,10,360,480)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
mask2 = np.where((mask==2)|(mask==0),0,255).astype('uint8')
img = img*mask2[:,:,np.newaxis]
plt.imshow(img),plt.colorbar(),plt.show()
I was expecting the result to be a visible image on a white background. This is what I'm getting:
There are a number of small issues with your code that are adding up to that weird result.
OpenCV uses BGR ordering of the channels of an image, whereas matplotlib uses RGB. That means if you read an image with OpenCV but want to display it with matplotlib, you need to convert the image from BGR to RGB before displaying (that's the reason the colors are weird). Also, not that important, but color images are not displayed with a colormap, so showing the colorbar does not do anything for you.
In NumPy, it's best to keep masks boolean whenever you can, because you can use them to index your arrays. Your current code converts a boolean mask to a uint8 image with 0 and 255 values, and then you multiply that with your image. That means your image will be set to zero wherever the mask is zero, and your values will overflow and do weird things wherever it is 255. Instead, keep the mask boolean and use it to index your array. That way, anywhere the mask is True, you can just set the value in your image to something specific (like 255 for white).
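As a minimal sketch of the difference (toy values, not the actual photo):
import numpy as np
img = np.full((2, 2, 3), 200, dtype=np.uint8)  # a tiny "image"
mask = np.array([[True, False],
                 [False, True]])  # boolean mask
# Multiplying by a 0/255 uint8 mask overflows: 200 * 255 wraps around
bad = img * np.where(mask, 255, 0).astype(np.uint8)[..., np.newaxis]
# Boolean indexing instead: set masked pixels to white, leave the rest
img[mask] = 255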
This should fix you up:
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('91_photo.jpg')
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
rect = (10, 10, 360, 480)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)
mask2 = (mask==2) | (mask==0)
img[mask2] = 255
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()
I would like to create a matrix of subplots and display each BMP file from a directory in a different subplot, but I cannot find the appropriate solution for my problem. Could somebody help me?
This is the code that I have:
import os, sys
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
bmps = glob('*trace*.bmp')
fig, axes = plt.subplots(3, 3)
for arch in bmps:
    i = Image.open(arch)
    iar = np.array(i)
    for i in range(3):
        for j in range(3):
            axes[i, j].plot(iar)
plt.subplots_adjust(wspace=0, hspace=0)
plt.show()
I am having the following error after executing:
Natively, matplotlib only supports PNG images; see http://matplotlib.org/users/image_tutorial.html.
So the approach is always: read the image, then plot the image.
Read the images:
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
img1 = mpimg.imread('stinkbug1.png')
img2 = mpimg.imread('stinkbug2.png')
Plot the images (2 subplots):
plt.figure(1)
plt.subplot(211)
plt.imshow(img1)
plt.subplot(212)
plt.imshow(img2)
plt.show()
Follow the tutorial at http://matplotlib.org/users/image_tutorial.html (it covers the required imports).
Here is a thread on plotting BMPs with matplotlib: Why bmp image displayed as wrong color with plt.imshow of matplotlib on IPython-notebook?
The BMP has three color channels, plus the height and width, giving it a shape of (h, w, 3). I believe plotting the image gives you an error because plot only accepts two dimensions. You could convert the image to grayscale, which would produce a matrix of only two dimensions (h, w).
Without knowing the dimensions of the images, you could do something like this:
for idx, arch in enumerate(bmps):
    i = idx % 3   # Get subplot row
    j = idx // 3  # Get subplot column
    image = Image.open(arch)
    iar_shp = np.array(image).shape  # Get h, w dimensions
    image = image.convert('L')  # Convert to grayscale
    # Load grayscale matrix, reshape to dimensions of color bmp
    iar = np.array(image.getdata()).reshape(iar_shp[0], iar_shp[1])
    axes[i, j].plot(iar)
plt.subplots_adjust(wspace=0, hspace=0)
plt.show()
I'd like to segment my image using scipy's label and then, based on the number of indices found in each label, remove the regions which satisfy my criteria. For example, if an image with regions in it were created and segmented with scipy's label like this:
from numpy import ones, zeros
from numpy.random import random_integers
from scipy.ndimage import label
image = zeros((512, 512), dtype='int')
regionator = ones((11, 11), dtype='int')
xs = random_integers(5, 506, size=500)
ys = random_integers(5, 506, size=500)
for x, y in zip(xs, ys):
    image[x-5:x+6, y-5:y+6] = regionator
labels, n_labels = label(image)
Now I'd like to retrieve the indices of each region which has a size greater than 121 pixels (i.e. one regionator's size). I'd then like to take those indices and set them to zero so they are no longer part of the labeled image. What is the most efficient way to accomplish this?
Essentially something similar to MATLAB's regionprops, or using IDL's reverse_indices output from its histogram function.
I would use bincount and threshold the result to make a lookup table:
import numpy as np
threshold = 121
size = np.bincount(labels.ravel())  # pixel count per label
keep_labels = size <= threshold  # lookup table: keep the small regions
# Make sure the background is left as 0/False
keep_labels[0] = 0
filtered_labels = keep_labels[labels]  # boolean mask over the image
In the last line above, I index the array keep_labels with the array labels. This is called advanced indexing in NumPy, and it requires labels to be an integer array. NumPy then uses the elements of labels as indices into keep_labels and produces an array of the same shape as labels.
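A minimal sketch of that lookup-table pattern with toy values:
import numpy as np
# Toy lookup table and label image
keep = np.array([False, True, False, True])  # keep labels 1 and 3
labels = np.array([[0, 1],
                   [2, 3]])
mask = keep[labels]  # same shape as labels: [[False, True], [False, True]]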
Here's what I've found to work for me so far, with good performance even for large datasets.
Using the get-indices process taken from here, I've come to this:
from numpy import argsort, histogram, reshape, where
import bisect
h = histogram(labels, bins=n_labels)
h_inds = where(h[0] > 121)[0]
labels_f = labels.flatten()
sortedind = argsort(labels_f)
sorted_labels_f = labels_f[sortedind]
inds = []
for i in range(1, len(h_inds)):
    i1 = bisect.bisect_left(sorted_labels_f, h[1][h_inds[i]])
    i2 = bisect.bisect_right(sorted_labels_f, h[1][h_inds[i]])
    inds.extend(sortedind[i1:i2])
# Now get rid of all of those indices that were part of a label
# larger than 121 pixels
labels_f[inds] = 0
filtered_labels = reshape(labels_f, (512, 512))