I'm a new learner of Python. When a new vector is generated, is it a column or a row vector by default?
import numpy as np
theta = np.arange(3)
a = len(theta.T)
b = len(theta)
print('theta = {} \n theta.T = {}'.format(theta,theta.T))
c = theta.T.dot(theta)
d = theta.dot(theta.T)
It turns out a == b == 3, c == d, and both theta and theta.T are displayed as row vectors.
But this matters when I want to calculate the derivative of the symbolic function x · xᵀ, with x a row vector.
Neither; it is a 1D array:
>>> theta.shape
(3,)
A column vector would have shape (3, 1) and a row vector (1, 3). You can create either by changing the shape:
>>> theta.shape = (1,3)
>>> theta
array([[0, 1, 2]])
>>> theta.shape = (3,1)
>>> theta
array([[0],
[1],
[2]])
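If you need an explicit row or column vector, e.g. for the x · xᵀ derivative mentioned in the question, you can reshape to 2D first. A small sketch showing how the inner and outer products then differ:

import numpy as np

theta = np.arange(3).reshape(1, 3)   # explicit row vector, shape (1, 3)

inner = theta.dot(theta.T)           # (1, 3) @ (3, 1) -> (1, 1), scalar-like
outer = theta.T.dot(theta)           # (3, 1) @ (1, 3) -> (3, 3) matrix

print(inner.shape, outer.shape)      # (1, 1) (3, 3)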
I have an RGB image of shape (h, w, 3) and a corresponding depth map of shape (h, w).
Thus I know, for each pixel, its 3D coordinates.
I would like to rotate the image by some 3D rotation matrix.
I know how to apply the rotation to the input coordinates and get the coordinates in the target view, but how do I render the new view given the input image pixel values?
I tried using scipy's griddata, but this interpolation "fills" in gaps for occluded regions and overall performs interpolation, but not rendering of the new view.
Is there a better way to render the new rotated view in pytorch or numpy?
Here's some code that would do the association.
import numpy as np

def get_colored_point_cloud(calib, rgb, depth):
    """
    Pass in an rgb image and the associated depth map;
    return a point cloud and a color for each point.
    cloud.shape  -> (num_points, 3) for [x, y, z]
    colors.shape -> (num_points, 3) for [r, g, b]
    """
    rows, cols = depth.shape
    # create a pixel grid and stack it with depth and rgb
    c, r = np.meshgrid(np.arange(cols), np.arange(rows))  # each of shape (rows, cols)
    points = np.stack([c, r, depth])  # -> (3, rows, cols)
    colors = np.stack([c, r, rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]])
    points = points.reshape(3, -1)  # -> (3, num_points)
    colors = colors.reshape(5, -1)  # -> (5, num_points)
    points = points.T  # -> (num_points, 3)
    colors = colors.T  # -> (num_points, 5)
    # now transform pixel coordinates [u, v, z] to [x, y, z] by camera unprojection
    cloud = unproject_image_to_point_cloud(points, calib.intrinsic_params)  # -> (num_points, 3)
    return cloud, colors[:, 2:5]  # (num_points, 3), (num_points, 3)
It is also possible to do this through open3d, but you will have to deal with the practical matters of getting the desired view for it to work in open3d.
See this post: Generate point cloud from depth image
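For reference, a minimal open3d sketch, assuming cloud and colors come from the function above (with colors in the 0-255 range):

import open3d as o3d

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(cloud)           # (num_points, 3) xyz
pcd.colors = o3d.utility.Vector3dVector(colors / 255.0)  # open3d expects colors in [0, 1]
o3d.visualization.draw_geometries([pcd])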
The more direct way of doing this, instead of the somewhat ugly meshgrid process (at least the way I have written it), is to create separate arrays for point (col_index, row_index, z) and color (col_index, row_index, R, G, B), transforming (col_index, row_index, z) to (x, y, z) point by point in an unrolled loop. This is much slower, though, because it does not use numpy's vectorization under the hood:
def get_colored_point_cloud(calib, rgb, depth):
    points = []
    colors = []
    rows, cols = depth.shape
    for i in range(rows):
        for j in range(cols):
            z = depth[i, j]
            r = rgb[i, j, 0]
            g = rgb[i, j, 1]
            b = rgb[i, j, 2]
            points.append([j, i, z])
            colors.append([r, g, b])
    points = np.asarray(points)
    colors = np.asarray(colors)
    cloud = unproject_image_to_point_cloud(points,
                                           calib.intrinsic_params)  # -> (num_points, 3)
    return cloud, colors
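To actually render the rotated view while respecting occlusions (the part where griddata falls short), one standard technique is a z-buffer splat: after rotating the cloud and projecting it back to pixel coordinates, keep for each target pixel only the color of the nearest point. Here is a minimal sketch; zbuffer_splat is a hypothetical helper, and the projection to integer pixel coordinates uv is assumed done elsewhere:

import numpy as np

def zbuffer_splat(uv, z, colors, h, w):
    """Hypothetical helper: naive z-buffer rendering.
    uv:     (num_points, 2) integer pixel coords [u, v] in the target view
    z:      (num_points,)   depths after rotation
    colors: (num_points, 3) per-point colors
    Returns an (h, w, 3) image holding the nearest point's color per pixel."""
    image = np.zeros((h, w, 3), dtype=colors.dtype)
    zbuf = np.full((h, w), np.inf)
    # discard points that project outside the image bounds
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    for (u, v), depth, color in zip(uv[ok], z[ok], colors[ok]):
        if depth < zbuf[v, u]:  # the nearer point wins the pixel
            zbuf[v, u] = depth
            image[v, u] = color
    return image

Pixels that no point lands on stay black (holes); in practice you would either splat small patches or run an inpainting step afterwards.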
I tried to use a polynomial (3rd degree) to fit a data series, but it seems that it's still not the best fit (some points are off in the graph shown below). I also tried adding a log function to help the fit, but the result is not improved either.
What would be the best curve fitting here?
Here are the raw data points I have:
x_values = [0.51, 0.56444444, 0.61888889, 0.67333333, 0.72777778, 0.78222222, 0.83666667, 0.89111111, 0.94555556, 1.]
y_values = [0.67154591, 0.66657266, 0.65878351, 0.6488696, 0.63499979, 0.6202393, 0.59887225, 0.56689689, 0.51768976, 0.33029004]
Results with polynomial fit:
It would be better if your curve-fitting procedure were hypothesis-driven, i.e., if you already had an idea of what kind of relationship to expect. The shape looks to me more like an exponential function:
from matplotlib import pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
# the function that describes the data
def func(x, a, b, c, d):
    return a * np.exp(b * x + c) + d

x_values = [0.51, 0.56444444, 0.61888889, 0.67333333, 0.72777778, 0.78222222, 0.83666667, 0.89111111, 0.94555556, 1.]
y_values = [0.67154591, 0.66657266, 0.65878351, 0.6488696, 0.63499979, 0.6202393, 0.59887225, 0.56689689, 0.51768976, 0.33029004]

# start values [a, b, c, d]
start = [-.1, 1, 0, .1]

# curve fitting
popt, pcov = curve_fit(func, x_values, y_values, p0=start)

# output [a, b, c, d]
print(popt)

# calculate the fit curve at a higher resolution
x_fit = np.linspace(min(x_values), max(x_values), 1000)
y_fit = func(x_fit, *popt)

# plot data and fit
plt.scatter(x_values, y_values, label="data")
plt.plot(x_fit, y_fit, label="fit")
plt.legend()
plt.show()
This gives the following output:
This still does not look correct, the first part seems to have a linear offset. If we take this into consideration:
from matplotlib import pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
def func(x, a, b, c, d, e):
    return a * np.exp(b * x + c) + d * x + e

x_values = [0.51, 0.56444444, 0.61888889, 0.67333333, 0.72777778, 0.78222222, 0.83666667, 0.89111111, 0.94555556, 1.]
y_values = [0.67154591, 0.66657266, 0.65878351, 0.6488696, 0.63499979, 0.6202393, 0.59887225, 0.56689689, 0.51768976, 0.33029004]

start = [-.1, 1, 0, .1, 1]

popt, pcov = curve_fit(func, x_values, y_values, p0=start)
print(popt)

x_fit = np.linspace(min(x_values), max(x_values), 1000)
y_fit = func(x_fit, *popt)

plt.scatter(x_values, y_values, label="data")
plt.plot(x_fit, y_fit, label="fit")
plt.legend()
plt.show()
we have the following output:
This now is closer to your data points.
But you should go back to your data and think about which model is most likely to reflect reality, then implement that model. You can always construct more complicated functions that fit your data better, but they do not necessarily reflect reality better.
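If you want a rough numerical check of which model fits better (not a substitute for a hypothesis), you can compare residual sums of squares and look at the parameter uncertainties from pcov. A minimal sketch, appended after either curve_fit call above:

# rough goodness-of-fit check, using func, popt, pcov from the snippet above
residuals = np.array(y_values) - func(np.array(x_values), *popt)
print("residual sum of squares:", np.sum(residuals**2))

# one-sigma parameter uncertainties from the covariance matrix
print("parameter std devs:", np.sqrt(np.diag(pcov)))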
I'm starting off with a numpy array of an image.
In[1]:img = cv2.imread('test.jpg')
The shape is what you might expect for a 640x480 RGB image.
In[2]:img.shape
Out[2]: (480, 640, 3)
However, this image that I have is a frame of a video, which is 100 frames long. Ideally, I would like to have a single array that contains all the data from this video such that img.shape returns (480, 640, 3, 100).
What is the best way to add the next frame -- that is, the next set of image data, another 480 x 640 x 3 array -- to my initial array?
A dimension can be added to a numpy array as follows:
image = image[..., np.newaxis]
Alternatively to
image = image[..., np.newaxis]
in #dbliss' answer, you can also use numpy.expand_dims like
image = np.expand_dims(image, <your desired dimension>)
For example (taken from the link above):
x = np.array([1, 2])
print(x.shape) # prints (2,)
Then
y = np.expand_dims(x, axis=0)
yields
array([[1, 2]])
and
y.shape
gives
(1, 2)
You could just create an array of the correct size up-front and fill it:
nframes = 100
frames = np.empty((480, 640, 3, nframes), dtype=np.uint8)  # match the image dtype
for k in range(nframes):
    frames[:, :, :, k] = cv2.imread('frame_{}.jpg'.format(k))
if the frames were individual jpg files named in some particular way (in the example, frame_0.jpg, frame_1.jpg, etc.).
Just a note: you might consider using an (nframes, 480, 640, 3)-shaped array instead.
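A minimal sketch of that frames-first layout, using the same hypothetical frame_0.jpg, frame_1.jpg, ... naming:

import cv2
import numpy as np

nframes = 100
frames = np.stack([cv2.imread('frame_{}.jpg'.format(k)) for k in range(nframes)])
print(frames.shape)  # (100, 480, 640, 3)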
Pythonic
X = X[:, :, None]
which is equivalent to
X = X[:, :, numpy.newaxis]
and
X = numpy.expand_dims(X, axis=-1)
But as you are explicitly asking about stacking images,
I would recommend going for stacking the list of images np.stack([X1, X2, X3]) that you may have collected in a loop.
If you do not like the order of the dimensions, you can rearrange them with np.transpose():
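For instance, a small self-contained sketch stacking three dummy images and then moving the new axis to the end:

import numpy as np

# three dummy images of shape (480, 640, 3)
X1, X2, X3 = (np.zeros((480, 640, 3)) for _ in range(3))

imgs = np.stack([X1, X2, X3])      # (3, 480, 640, 3): the new axis comes first
imgs = imgs.transpose(1, 2, 3, 0)  # (480, 640, 3, 3): new axis moved to last
print(imgs.shape)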
You can use np.concatenate() with the axis parameter to specify the dimension along which to concatenate. If the arrays being concatenated do not have this dimension, you can use np.newaxis to indicate where the new dimension should be added:
import numpy as np
movie = np.concatenate((img1[..., np.newaxis], img2[..., np.newaxis]), axis=3)
If you are reading from many files:
import glob
movie = np.concatenate([cv2.imread(p)[..., np.newaxis] for p in glob.glob('*.jpg')], axis=3)
Consider Approach 1, using the reshape method, and Approach 2, using np.newaxis; both produce the same outcome:
# Let's suppose we have:
x = [1,2,3,4,5,6,7,8,9]
print('I. x',x)
xNpArr = np.array(x)
print('II. xNpArr',xNpArr)
print('III. xNpArr', xNpArr.shape)
xNpArr_3x3 = xNpArr.reshape((3,3))
print('IV. xNpArr_3x3.shape', xNpArr_3x3.shape)
print('V. xNpArr_3x3', xNpArr_3x3)
#Approach 1 with reshape method
xNpArrRs_1x3x3x1 = xNpArr_3x3.reshape((1,3,3,1))
print('VI. xNpArrRs_1x3x3x1.shape', xNpArrRs_1x3x3x1.shape)
print('VII. xNpArrRs_1x3x3x1', xNpArrRs_1x3x3x1)
#Approach 2 with np.newaxis method
xNpArrNa_1x3x3x1 = xNpArr_3x3[np.newaxis, ..., np.newaxis]
print('VIII. xNpArrNa_1x3x3x1.shape', xNpArrNa_1x3x3x1.shape)
print('IX. xNpArrNa_1x3x3x1', xNpArrNa_1x3x3x1)
We have as outcome:
I. x [1, 2, 3, 4, 5, 6, 7, 8, 9]
II. xNpArr [1 2 3 4 5 6 7 8 9]
III. xNpArr (9,)
IV. xNpArr_3x3.shape (3, 3)
V. xNpArr_3x3 [[1 2 3]
[4 5 6]
[7 8 9]]
VI. xNpArrRs_1x3x3x1.shape (1, 3, 3, 1)
VII. xNpArrRs_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
VIII. xNpArrNa_1x3x3x1.shape (1, 3, 3, 1)
IX. xNpArrNa_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
a = np.expand_dims(a, axis=-1)
or
a = a[:, np.newaxis]
or
a = a.reshape(a.shape + (1,))
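A quick sketch confirming all three forms are equivalent for a 1D array:

import numpy as np

a = np.arange(5)
assert np.expand_dims(a, axis=-1).shape == (5, 1)
assert a[:, np.newaxis].shape == (5, 1)
assert a.reshape(a.shape + (1,)).shape == (5, 1)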
There is no structure in numpy that allows you to append more data later.
Instead, numpy puts all of your data into a contiguous chunk of numbers (basically, a C array), and any resize requires allocating a new chunk of memory to hold it. Numpy's speed comes from being able to keep all the data in a numpy array in the same chunk of memory; e.g., mathematical operations can be parallelized for speed, and you get fewer cache misses.
So you will have two kinds of solutions:
Pre-allocate the memory for the numpy array and fill in the values, like in JoshAdel's answer, or
Keep your data in a normal python list until it's actually needed to put them all together (see below)
images = []
for i in range(100):
    new_image = ...  # pull image from somewhere
    images.append(new_image)
images = np.stack(images, axis=3)
Note that there is no need to expand the dimensions of the individual image arrays first, nor do you need to know how many images you expect ahead of time.
You can use stack with the axis parameter:
img.shape # h,w,3
imgs = np.stack([img1,img2,img3,img4], axis=-1) # -1 = new axis is last
imgs.shape # h,w,3,nimages
For example, to convert grayscale to color:
>>> d = np.zeros((5,4), dtype=int)  # 5x4
>>> d[2,3] = 1
>>> d3 = np.stack([d,d,d], axis=-1)  # 5x4x3, -1 = new axis is last
>>> d3.shape
(5, 4, 3)
>>> d3[2,3]
array([1, 1, 1])
I followed this approach:
import numpy as np
import cv2
ls = []
for image in image_paths:
    ls.append(cv2.imread(image))

img_np = np.array(ls)               # shape (100, 480, 640, 3)
img_np = np.rollaxis(img_np, 0, 4)  # shape (480, 640, 3, 100)
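As a side note, np.moveaxis expresses the same reordering a bit more explicitly; a sketch that could replace the rollaxis line:

img_np = np.moveaxis(img_np, 0, -1)  # (100, 480, 640, 3) -> (480, 640, 3, 100)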
This worked for me:
image = image[..., None]
This will help you add an axis anywhere you want:
import numpy as np
signal = np.array([[0.3394572666491664, 0.3089068053925853, 0.3516359279582483], [0.33932706934615525, 0.3094755563319447, 0.3511973743219001], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256]])
print(signal.shape)
#(4,3)
print(signal[..., np.newaxis].shape)   # equivalently signal[..., None]
# (4, 3, 1)
print(signal[:, np.newaxis, :].shape)  # equivalently signal[:, None, :]
# (4, 1, 3)
There are three ways of adding new dimensions to an ndarray.
First: using np.newaxis (something like #dbliss' answer)
np.newaxis is just an alias for None, provided to make the code easier to
understand. If you replace np.newaxis with None, it works the same
way, but it's better to use np.newaxis to be more explicit.
import numpy as np
my_arr = np.array([2, 3])
new_arr = my_arr[..., np.newaxis]
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
Second: using np.expand_dims()
Specify the original ndarray as the first argument and the position
at which to add the dimension as the second argument, axis.
my_arr = np.array([2, 3])
new_arr = np.expand_dims(my_arr, -1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
Third: using reshape()
my_arr = np.array([2, 3])
new_arr = my_arr.reshape(*my_arr.shape, 1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
Consider the following image:
I'd like to print it as a grayscale image. I can do the conversion with scikit-image:
from skimage.io import imread
from matplotlib import pyplot as plt
from skimage.color import rgb2gray
img = imread('image.jpg')
plt.grid(which = 'both')
plt.imshow(rgb2gray(img), cmap=plt.cm.gray)
I get:
which is obviously not what I want.
My question is: is there a way, with scikit-image or with raw numpy and/or matplotlib, to digitize the image so that I get a 3D array (first dimension: X index, second dimension: Y index, third dimension: value according to the colormap)? Then I could easily change the colormap to something that turns out to have better results when printing in grayscale.
The example below demonstrates a simple way to undo a colormap's value -> RGB mapping.
def unmap_nearest(img, rgb):
    """img is an image of shape [n, m, 3], and rgb is a colormap of shape [k, 3]."""
    d = np.sum(np.abs(img[np.newaxis, ...] - rgb[:, np.newaxis, np.newaxis, :]), axis=-1)
    i = np.argmin(d, axis=0)
    return i / (rgb.shape[0] - 1)
This function works by taking the RGB value of each pixel and looking up the index of the best matching color in the colormap. Some trickery with indexing and broadcasting allows for efficient vectorization (at the cost of memory spent on temporary arrays):
img[np.newaxis, ...] converts the image from shape [n, m, 3] to [1, n, m, 3]
rgb[:, np.newaxis, np.newaxis, :] converts the colormap from shape [k, 3] to [k, 1, 1, 3].
subtracting the resulting arrays leads to an array of shape [k, n, m, 3] that contains the difference between each colormap index k and each pixel n, m for each color component.
sum(abs(..), axis=-1) takes the absolute value of the differences and sums over all color components (the last dimension) to get the total difference between all pixels and color map entries (array of shape [k, n, m]).
i = np.argmin(d, axis=0) finds the index of the minimum element along the first dimension. The result is the index of the best matching color map entry of each pixel [n, m].
return i / (rgb.shape[0] - 1) finally returns the indices normalized by the color map size so that the result is in range 0-1.
There are a few caveats with this approach:
It cannot reconstruct the original value range.
It will treat all pixels as part of the color map (i.e. continent contours will also be mapped).
If you use the wrong color map it will fail hilariously.
The complete example:
import numpy as np
import matplotlib.pyplot as plt
from skimage.color import rgb2gray
def unmap_nearest(img, rgb):
    """img is an image of shape [n, m, 3], and rgb is a colormap of shape [k, 3]."""
    d = np.sum(np.abs(img[np.newaxis, ...] - rgb[:, np.newaxis, np.newaxis, :]), axis=-1)
    i = np.argmin(d, axis=0)
    return i / (rgb.shape[0] - 1)
cmap = plt.cm.jet
rgb = cmap(np.linspace(0, 1, cmap.N))[:, :3]
original = (np.arange(10)[:, None] + np.arange(10)[None, :])
plt.subplot(2, 2, 1)
plt.imshow(original, cmap='gray')
plt.colorbar()
plt.title('original')
plt.subplot(2, 2, 2)
rgb_img = cmap(original / 18)[..., :-1]
plt.imshow(rgb_img)
plt.title('color-mapped')
plt.subplot(2, 2, 3)
wrong = rgb2gray(rgb_img)
plt.imshow(wrong, cmap='gray')
plt.title('rgb2gray')
plt.subplot(2, 2, 4)
reconstructed = unmap_nearest(rgb_img, rgb)
plt.imshow(reconstructed, cmap='gray')
plt.colorbar()
plt.title('reconstructed')
plt.show()
Building on #kazemakmakase's answer, if you're digitizing a figure, you probably are dealing with a copy of the original that's been converted, or maybe even printed and scanned at some point. Those things can distort colors from the "true" colormap that was originally used.
You can deal with this by using a slice through the figure's colorbar as the 'pattern' (rgb) to match against. Specifically, crop the figure down to just the color ramp (in landscape orientation in this example), then replace the rgb variable in #kazemakmakase's example with:
cmapimg = plt.imread('cropped_colorbar.png')
rgb = cmapimg[cmapimg.shape[0] // 2, :, :3]