I have an RGB image of shape (h, w, 3) and a corresponding depth map of shape (h, w).
Thus I know, for each pixel, its 3D coordinates.
I would like to rotate the image by some 3D rotation matrix.
I know how to apply the rotation to the input coordinates and get the coordinates in the target view, but how do I render the new view given the input image pixel values?
I tried using scipy's griddata, but that fills in the gaps for occluded regions: it performs interpolation rather than rendering of the new view.
Is there a better way to render the new rotated view in pytorch or numpy?
Here's some code that would do the association.
def get_colored_point_cloud(calib, rgb, depth):
    """
    Pass in rgb and the associated depth map.
    Return the point cloud and a color for each point.
    cloud.shape  -> (num_points, 3) for [x, y, z]
    colors.shape -> (num_points, 3) for [r, g, b]
    """
    rows, cols = depth.shape
    # create a pixel grid and stack it with depth and rgb
    c, r = np.meshgrid(np.arange(cols), np.arange(rows))  # c, r -> (rows, cols)
    points = np.stack([c, r, depth])                      # -> (3, rows, cols)
    colors = np.stack([c, r, rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]])
    points = points.reshape(3, -1)   # -> (3, num_points)
    colors = colors.reshape(5, -1)   # -> (5, num_points)
    points = points.T                # -> (num_points, 3)
    colors = colors.T                # -> (num_points, 5)
    # now transform [u, v, z] to [x, y, z] by camera unprojection
    cloud = unproject_image_to_point_cloud(points, calib.intrinsic_params)  # -> (num_points, 3)
    return cloud, colors[:, 2:5]     # (num_points, 3), (num_points, 3)
It is also possible to do this through open3d, but you will have to deal with the practical matter of setting up the view as desired for it to work in open3d.
See this post: Generate point cloud from depth image
The more direct way of doing this, instead of the somewhat ugly meshgrid approach (at least the way I have written it), is to build separate arrays for point (col_index, row_index, z) and color (col_index, row_index, R, G, B) one pixel at a time, transforming (col_index, row_index, z) to (x, y, z) per point. However, this is much slower because it does not use numpy's vectorization magic under the hood.
def get_colored_point_cloud(calib, rgb, depth):
    points = []
    colors = []
    rows, cols = depth.shape
    for i in range(rows):
        for j in range(cols):
            z = depth[i, j]
            r = rgb[i, j, 0]
            g = rgb[i, j, 1]
            b = rgb[i, j, 2]
            points.append([j, i, z])
            colors.append([r, g, b])
    points = np.asarray(points)
    colors = np.asarray(colors)
    cloud = unproject_image_to_point_cloud(points,
                                           calib.intrinsic_params)  # -> (num_points, 3)
    return cloud, colors  # return the colors too, matching the vectorized version
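To actually render the rotated view (rather than letting griddata interpolate across occlusions), one option is a forward splat with a z-buffer: rotate the cloud, project each point back to pixel coordinates, and keep only the nearest point per pixel, so occluded regions stay empty. Below is a minimal numpy sketch; the pinhole intrinsics fx, fy, cx, cy and the projection convention are assumptions for illustration, not taken from the calib object above.

import numpy as np

def render_rotated_view(cloud, colors, R, fx, fy, cx, cy, h, w):
    """Rotate a colored point cloud and splat it into an (h, w) image with a z-buffer.
    cloud: (N, 3) xyz points, colors: (N, 3) rgb, R: (3, 3) rotation matrix."""
    pts = cloud @ R.T                       # rotate points into the target view
    z = pts[:, 2]
    valid = z > 1e-6                        # keep only points in front of the camera
    pts, cols, z = pts[valid], colors[valid], z[valid]

    u = np.round(fx * pts[:, 0] / z + cx).astype(int)   # project to pixel coordinates
    v = np.round(fy * pts[:, 1] / z + cy).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z, cols = u[inside], v[inside], z[inside], cols[inside]

    flat = v * w + u                        # flattened pixel index for each point
    zbuf = np.full(h * w, np.inf)
    np.minimum.at(zbuf, flat, z)            # nearest depth per pixel (unbuffered minimum)
    keep = z <= zbuf[flat]                  # points that win the depth test

    image = np.zeros((h * w, 3), dtype=cols.dtype)
    image[flat[keep]] = cols[keep]          # pixels with no surviving point stay empty
    return image.reshape(h, w, 3)

Nearest-neighbor splatting like this leaves small holes where the rotated surface spreads over more pixels; splatting each point into a small neighborhood, or meshing the depth map and rasterizing, are common refinements.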
I am using the nibabel library to load data from a .nii file. I read the documentation at http://nipy.org/nibabel/gettingstarted.html and found that
This information is available without the need to load anything of the main image data into the memory. Of course there is also access to the image data as a NumPy array
This is my code to load the data and its shape:

import nibabel as nib
import numpy as np

img = nib.load('example.nii')
data = img.get_data()
data = np.squeeze(data)
data = np.copy(data, order="C")
print(data.shape)

I got the result:

(128, 128, 64)
What is the order of the data shape? Is it Width x Height x Depth? My input must be arranged as depth, height, width, so I will use input = data.transpose(2, 0, 1). Is that right? Thanks all.
Update: I found that numpy reads the image in the order Height x Width x Depth, as shown in this reference: http://www.python-course.eu/images/axis.jpeg
OK, here's my take:
Using scipy.ndimage.imread('img.jpg', mode='RGB'), the resulting array will always have this order: (H, W, D) i.e. (height, width, depth) because of the terminology that numpy uses for ndarrays (axis=0, axis=1, axis=2) or analogously (Y, X, Z) if one would like to visualize in 3 dimensions.
# read image
In [21]: img = scipy.ndimage.imread('suza.jpg', mode='RGB')
# image shape as (H, W, D)
In [22]: img.shape
Out[22]: (634, 1366, 3)
# transpose to shape as (D, H, W)
In [23]: tr_img = img.transpose((-1, 0, 1))
In [23]: tr_img.shape
Out[23]: (3, 634, 1366)
If you consider img.shape as a tuple, the axes can be indexed either as (0, 1, 2) or, counting from the end, as (-3, -2, -1):

img_shape = (634, 1366, 3)   # indices (0, 1, 2) or (-3, -2, -1)

Choose whichever way is more convenient for you to remember.
NOTE: The scipy.ndimage.imread() API has been removed since Scipy 1.2.0. So, it is now recommended to use imageio.imread(), which reads the image and returns Array, a subclass of numpy array, following the same conventions discussed above.
import imageio
import numpy as np

# read image
img = imageio.imread('suza.jpg', format='jpg')

# convert the image to a plain numpy array if needed
img_np = np.asarray(img)
PS: It should also be noted that libraries like TensorFlow (almost) follow the same convention as numpy.
tf.image.decode_jpeg() returns:
A Tensor of type uint8. 3-D with shape [height, width, channels]
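Coming back to the nibabel question: if the loaded shape really is (H, W, D) = (128, 128, 64), then data.transpose(2, 0, 1) does move depth to the front. A quick sanity check (the zero array here is just a stand-in for the loaded volume; for real NIfTI data it is also worth checking img.affine, since the header stores the true orientation):

import numpy as np

data = np.zeros((128, 128, 64))   # stand-in for the volume loaded with nibabel, (H, W, D)
dhw = data.transpose(2, 0, 1)     # move depth to the front -> (D, H, W)
print(dhw.shape)                  # (64, 128, 128)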
Hello, I am a beginner in OpenCV.
I have a maze image and I wrote a maze solver, but for that code to work I need an image like the one in the picture below.
I want to select the contours of the white area using an ROI, but I could not manage it.
When I try the ROI method, I get a smooth rectangle with a black area selected.
https://i.stack.imgur.com/Ty5BX.png -----> this is my code's result
https://i.stack.imgur.com/S7zuJ.png -----> this is the result I want
import cv2
import numpy as np

# import image
image = cv2.imread('rt4.png')

# grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
#cv2.imshow('gray', gray)
#cv2.waitKey(0)

# binary
#ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
threshold = 150
thresh = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)[1]
cv2.namedWindow('second', cv2.WINDOW_NORMAL)
cv2.imshow('second', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

# dilation
kernel = np.ones((1, 1), np.uint8)
img_dilation = cv2.dilate(thresh, kernel, iterations=1)
cv2.namedWindow('dilated', cv2.WINDOW_NORMAL)
cv2.imshow('dilated', img_dilation)
cv2.waitKey(0)
cv2.destroyAllWindows()

# find contours
im2, ctrs, hier = cv2.findContours(img_dilation.copy(),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# sort contours
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])

list = []
for i, ctr in enumerate(sorted_ctrs):
    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)

    # Getting ROI
    roi = image[y:y+h, x:x+w]
    a = w - x
    b = h - y
    list.append((a, b, x, y, w, h))

    # show ROI
    #cv2.imshow('segment no:' + str(i), roi)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    #cv2.waitKey(0)

    if w > 15 and h > 15:
        cv2.imwrite('home/Desktop/output/{}.png'.format(i), roi)

cv2.namedWindow('marked areas', cv2.WINDOW_NORMAL)
cv2.imshow('marked areas', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv2.cornerHarris(gray, 2, 3, 0.04)

# result is dilated for marking the corners, not important
dst = cv2.dilate(dst, None)
image[dst > 0.01 * dst.max()] = [0, 0, 255]
cv2.imshow('dst', image)
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()

list.sort()
print(list[len(list) - 1])
I misunderstood your question earlier. So, I'm rewriting.
As #Silencer has already stated, you could use the drawContours method. You can do it as follows:
import cv2
import numpy as np

# import image
im = cv2.imread('Maze2.png')
gaus = cv2.GaussianBlur(im, (5, 5), 1)
# mask1 = cv2.dilate(gaus, np.ones((15, 15), np.uint8))
mask2 = cv2.erode(gaus, np.ones((5, 5), np.uint8))
imgray = cv2.cvtColor(mask2, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, 0)
im2, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

maxArea1 = 0
maxI1 = 0
for i in range(len(contours)):
    area = cv2.contourArea(contours[i])
    epsilon = 0.01 * cv2.arcLength(contours[i], True)
    approx = cv2.approxPolyDP(contours[i], epsilon, True)
    if area > maxArea1:
        maxArea1 = area
        maxI1 = i          # remember the index of the largest contour

print(maxArea1)
print(maxI1)
cv2.drawContours(im, contours, maxI1, (0, 255, 255), 3)

cv2.imshow("yay", im)
cv2.imshow("gray", imgray)
cv2.waitKey(0)
cv2.destroyAllWindows()
I used it on the following image:
And I got the right answer. You can add additional filters, or you could decrease the area using an ROI, to decrease the discrepancy, but it wasn't required.
Hope it helps!
A simple solution to just draw a slanted rectangle would be to use cv2.polylines. Based on your result, I'm assuming you already have the coordinates of the vertices of the area; let's call them [x1,y1], [x2,y2], [x3,y3], [x4,y4]. The polylines function draws a line from vertex to vertex to create a closed polygon.
import cv2
import numpy as np
#List coordinates of vertices as an array
pts = np.array([[x1,y1],[x2,y2],[x3,y3],[x4,y4]], np.int32)
pts = pts.reshape((-1,1,2))
#Draw lines from vertex to vertex
cv2.polylines(image, [pts], True, (255,0,0))
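For a self-contained test, a sketch with a blank image and made-up vertex coordinates (both purely illustrative) might look like:

import cv2
import numpy as np

# blank image and arbitrary vertex coordinates for illustration
image = np.zeros((200, 200, 3), dtype=np.uint8)
pts = np.array([[30, 50], [160, 30], [180, 150], [50, 170]], np.int32)
pts = pts.reshape((-1, 1, 2))

cv2.polylines(image, [pts], True, (255, 0, 0), thickness=2)
cv2.imshow('slanted rectangle', image)
cv2.waitKey(0)
cv2.destroyAllWindows()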
I am able to make a plot of data points based on their Lat and Long, which looks like:
whereby the orange is made up of points like:
using the code:
m = Basemap(projection='merc',llcrnrlat=-0.5,urcrnrlat=0.5,\
llcrnrlon=9,urcrnrlon=10,lat_ts=0.25,resolution='i')
m.drawcoastlines()
m.drawcountries()
# draw parallels and meridians.
parallels = np.arange(-9.,10.,0.5)
# Label the meridians and parallels
m.drawparallels(parallels,labels=[False,True,True,False])
# Draw Meridians and Labels
meridians = np.arange(-1.,1.,0.5)
m.drawmeridians(meridians,labels=[True,False,False,True])
m.drawmapboundary(fill_color='white')
x,y = m(X, Y) # This is the step that transforms the data into the map's projection
scatter = plt.scatter(x,y)
m.scatter(x,y)
where X and Y are numpy arrays.
I want to get the X and Y co-ordinate of a point that I click on.
I can get the co-ord using:
coords = []

def onclick(event):
    if plt.get_current_fig_manager().toolbar.mode != '':
        return
    global coords
    ix, iy = event.x, event.y
    print('x = %d, y = %d' % (ix, iy))
    coords.append((ix, iy))
    return coords

cid = fig.canvas.mpl_connect('button_press_event', onclick)
plt.show()
but this seems to return the figure coordinates. Is there a way to convert these to their respective lat and long coordinates?
I then plan to use these to find the nearest point in the original X and Y arrays to where I click.
First of all, you would want to use the data coordinates:
ix, iy = event.xdata, event.ydata
Then, to get lon/lat coordinates, you need to apply the inverse map transform:
lon, lat = m(event.xdata, event.ydata, inverse=True)
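Putting both together with the nearest-point lookup mentioned at the end of the question, a sketch (assuming m, X, Y and fig are the Basemap instance, data arrays and figure from the question) could look like this:

import numpy as np

def onclick(event):
    if plt.get_current_fig_manager().toolbar.mode != '':
        return
    if event.xdata is None or event.ydata is None:
        return  # the click landed outside the axes
    lon, lat = m(event.xdata, event.ydata, inverse=True)
    print('lon = %.4f, lat = %.4f' % (lon, lat))
    # nearest original data point to the click (plain Euclidean distance in lon/lat)
    i = np.argmin((X - lon) ** 2 + (Y - lat) ** 2)
    print('nearest point:', X[i], Y[i])

cid = fig.canvas.mpl_connect('button_press_event', onclick)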
Consider the following image:
I'd like to print it as a grayscale image. I can do the conversion with scikit-image:
from skimage.io import imread
from matplotlib import pyplot as plt
from skimage.color import rgb2gray
img = imread('image.jpg')
plt.grid(which = 'both')
plt.imshow(rgb2gray(img), cmap=plt.cm.gray)
I get:
which is obviously not what I want.
My question is: is there a way, with scikit-image or with raw numpy and/or matplotlib, to digitize the image so that I get a 3D array (first dimension: X index, second dimension: Y index, third dimension: value according to the colormap)? Then I could easily change the colormap to something that turns out better when printing in grayscale.
The example below demonstrates a simple way to undo a colormap's value -> RGB mapping.
def unmap_nearest(img, rgb):
    """img is an image of shape [n, m, 3], and rgb is a colormap of shape [k, 3]."""
    d = np.sum(np.abs(img[np.newaxis, ...] - rgb[:, np.newaxis, np.newaxis, :]), axis=-1)
    i = np.argmin(d, axis=0)
    return i / (rgb.shape[0] - 1)
This function works by taking the RGB value of each pixel and looking up the index of the best matching color in the colormap. Some trickery with indexing and broadcasting allows for efficient vectorization (at the cost of memory spent on temporary arrays):
img[np.newaxis, ...] converts the image from shape [n, m, 3] to [1, n, m, 3]
rgb[:, np.newaxis, np.newaxis, :] converts the colormap from shape [k, 3] to [k, 1, 1, 3].
subtracting the resulting arrays leads to an array of shape [k, n, m, 3] that contains the difference between each colormap index k and pixel n, m for each color component.
sum(abs(..), axis=-1) takes the absolute value of the differences and sums over all color components (the last dimension) to get the total difference between all pixels and color map entries (array of shape [k, n, m]).
i = np.argmin(d, axis=0) finds the index of the minimum element along the first dimension. The result is the index of the best matching color map entry of each pixel [n, m].
return i / (rgb.shape[0] - 1) finally returns the indices normalized by the color map size so that the result is in range 0-1.
There are a few caveats with this approach:
It cannot reconstruct the original value range.
It will treat all pixels as part of the color map (i.e. continent contours will also be mapped).
If you use the wrong color map it will fail hilariously.
A complete example:
import numpy as np
import matplotlib.pyplot as plt
from skimage.color import rgb2gray


def unmap_nearest(img, rgb):
    """img is an image of shape [n, m, 3], and rgb is a colormap of shape [k, 3]."""
    d = np.sum(np.abs(img[np.newaxis, ...] - rgb[:, np.newaxis, np.newaxis, :]), axis=-1)
    i = np.argmin(d, axis=0)
    return i / (rgb.shape[0] - 1)


cmap = plt.cm.jet
rgb = cmap(np.linspace(0, 1, cmap.N))[:, :3]

original = (np.arange(10)[:, None] + np.arange(10)[None, :])

plt.subplot(2, 2, 1)
plt.imshow(original, cmap='gray')
plt.colorbar()
plt.title('original')

plt.subplot(2, 2, 2)
rgb_img = cmap(original / 18)[..., :-1]
plt.imshow(rgb_img)
plt.title('color-mapped')

plt.subplot(2, 2, 3)
wrong = rgb2gray(rgb_img)
plt.imshow(wrong, cmap='gray')
plt.title('rgb2gray')

plt.subplot(2, 2, 4)
reconstructed = unmap_nearest(rgb_img, rgb)
plt.imshow(reconstructed, cmap='gray')
plt.colorbar()
plt.title('reconstructed')

plt.show()
Building on #kazemakmakase's answer, if you're digitizing a figure, you probably are dealing with a copy of the original that's been converted, or maybe even printed and scanned at some point. Those things can distort colors from the "true" colormap that was originally used.
You can deal with this by using a slice through the figure's colorbar as the 'pattern' (rgb) to match against. Specifically, crop the figure down to just the color ramp (in landscape orientation in this example), then replace the rgb variable in #kazemakmakase's example with:
cmapimg = plt.imread('cropped_colorbar.png')
rgb = cmapimg[cmapimg.shape[0] // 2, :, :3]   # integer index for the middle row of the ramp
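A short usage sketch tying the cropped colorbar to the unmap_nearest function from the other answer (the file names here are hypothetical):

import matplotlib.pyplot as plt

fig_img = plt.imread('scanned_figure.png')[..., :3]   # drop any alpha channel
cmapimg = plt.imread('cropped_colorbar.png')
rgb = cmapimg[cmapimg.shape[0] // 2, :, :3]           # one row through the color ramp

gray = unmap_nearest(fig_img, rgb)                    # colormap indices normalized to 0-1
plt.imshow(gray, cmap='gray')
plt.show()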
Say that I have a RGB image:
from skimage import data
img = data.astronaut()
print(img.shape) # (512, 512, 3)
Is there a succinct numpy command to unpack it along the color channels:
R, G, B = np.unpack(img, 2) # ?
What I am doing now is using a comprehension:
R, G, B = (img[:, :, i] for i in range(3))
But is there no simpler command?
Alternatively, you can use np.rollaxis:
R,G,B = np.rollaxis(img,2)
You can transpose the length-3 dimension to the front and then unpack it:
R, G, B = img.transpose((2, 0, 1))
Alternatively, you can use np.split:
R, G, B = np.split(img, img.shape[-1], axis=-1)
If your array is of shape (height, width, channel), you can use np.dsplit to split along the depth dimension:
R, G, B = np.dsplit(img, img.shape[-1])
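Note that np.split and np.dsplit keep a singleton channel axis, so each piece has shape (512, 512, 1); a squeeze gives plain 2-D arrays, as in this small sketch:

import numpy as np
from skimage import data

img = data.astronaut()                                            # (512, 512, 3)
R, G, B = (c.squeeze(-1) for c in np.dsplit(img, img.shape[-1]))
print(R.shape)                                                    # (512, 512)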