Iterating over multidimensional arrays(images) with numpy array - python - numpy

Hi!
I have two images of the same dimensions as numpy arrays, imgA and imgB.
I would like to iterate over each row and column and compute something like this:
for i in range(0, h-1):
    for j in range(0, w-1):
        final[i][j]= imgA[i,j] - imgB[i-k[i],j]
where h and w are the height and the width of the image and k is an array of dimension [h*w].
I have seen this topic:
Iterating over a numpy array
but it doesn't work with images; I get the error: too many values to unpack
Is there any way to do that with numpy and Python 2.7?
Thanks
Edit
Let me try to explain myself better.
I have 2 images in LAB color space.
These images have shape (288,384,3).
Now I would like to compute deltaE, so I could do it like this (splitting the 2 arrays):
imgLabL=np.dsplit(imgL,3)
imgLabR=np.dsplit(imgR,3)
imgLl=imgLabL[0]
imgLa=imgLabL[1]
imgLb=imgLabL[2]
imgRl=imgLabR[0]
imgRa=imgLabR[1]
imgRb=imgLabR[2]
delta=np.sqrt(((imgLl-imgRl)**2) + ((imgLa - imgRa)**2) + ((imgLb - imgRb)**2) )
So far everything is fine.
But now I have this array k of shape (288,384).
So now I need a new delta, but with a shift along the x axis: for the pixel in imgRl(0,0) I want to use the pixel in imgLl(0+k,0).
Is my problem clearer now?

I'm pretty sure that whatever it is you are trying to do can be vectorized and run without any loops in it. But the way your code is written, it is no surprise that it doesn't work...
If k is an array of shape (h, w), then k[i] is an array of shape (w,). When you do i-k[i], numpy will do its broadcasting magic, and you will get an array of shape (w,). So you are indexing imgB with an array of shape (w,) and a single integer. Because one of the items in the indexing is an array, fancy indexing kicks in. So assuming imgB has shape (h, w, 1), the return value of imgB[i-k[i], j] will not be an array of shape (1,), but an array of shape (w, 1). When you then try to subtract that from imgA[i, j], which is an array of shape (1,), broadcasting magic works again, and so you get an array of shape (w, 1).
We do not know what final is. But if it is an array of shape (h, w, 1), like imgA and imgB, then final[i][j] is an array of shape (1,), and you are trying to assign to it an array of shape (w, 1), which does not fit. Hence the "operand requires a reduction, but reduction is not enabled" error message.
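To make the shape bookkeeping above concrete, here is a minimal sketch that reproduces it on small dummy arrays. The sizes and the contents of k are made up for illustration; only the shapes matter.

```python
import numpy as np

h, w = 4, 5
imgA = np.zeros((h, w, 1))
imgB = np.ones((h, w, 1))
k = np.ones((h, w), dtype=int)   # hypothetical per-pixel offsets

i, j = 2, 3
row_index = i - k[i]             # shape (w,): an array, not a scalar
picked = imgB[row_index, j]      # fancy indexing gives shape (w, 1)
diff = imgA[i, j] - picked       # (1,) broadcast against (w, 1) gives (w, 1)
print(row_index.shape, picked.shape, diff.shape)

# Indexing k with both i and j gives a scalar offset and a single pixel.
single = imgA[i, j] - imgB[i - k[i, j], j]
print(single.shape)
```

If k is meant to hold one offset per pixel, k[i, j] rather than k[i] is probably what the loop needs.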
EDIT
You don't really need to split your arrays to compute DeltaE...
def deltaE(a, b):
    return np.sqrt(((a - b)**2).sum(axis=-1))
delta = deltaE(imgL, imgR)
I still don't understand what you want to do in the second case... If you want to compare the two images displaced along the x-axis, I would suggest using np.roll:
deltaE(imgL, np.roll(imgR, k, axis=0))
will have at position (r, c) the deltaE between the pixel (r, c) of imgL and the pixel (r - k, c) of imgR, for a single integer shift k (rows wrap around at the border). Is that what you want?
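If k is really a (h, w) array with a different shift for every pixel, np.roll will not do it, but integer array indexing can. The following is only a sketch under assumptions of mine: the helper name delta_e_shifted is made up, the shift is applied to the first index as in the question's loop, and out-of-range rows are clipped to the image border (the question does not say what should happen there).

```python
import numpy as np

def delta_e(a, b):
    # Euclidean distance over the last (colour) axis
    return np.sqrt(((a - b) ** 2).sum(axis=-1))

def delta_e_shifted(img_l, img_r, k):
    # Hypothetical helper: compare img_l[i, j] with img_r[i - k[i, j], j].
    h, w = k.shape
    rows = np.arange(h)[:, None] - k      # shape (h, w), one source row per pixel
    rows = np.clip(rows, 0, h - 1)        # assumption: clamp at the border
    cols = np.arange(w)[None, :]          # shape (1, w), broadcasts against rows
    return delta_e(img_l, img_r[rows, cols])

rng = np.random.default_rng(0)
img_l = rng.random((6, 8, 3))
img_r = rng.random((6, 8, 3))
k = rng.integers(0, 3, size=(6, 8))       # made-up offsets

delta = delta_e_shifted(img_l, img_r, k)
print(delta.shape)  # (6, 8)
```

Replacing np.clip with `rows % h` would give wrap-around behaviour like np.roll instead of clamping.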

I usually use numpy.nditer, the docs for which have many examples. Briefly:
import numpy as np
a = np.ones([4,4])
it = np.nditer(a)
for elem in it:
    pass  # do stuff
You can also use C-style iteration, which is useful if you need to access the indices of your arrays, i.e.
while not it.finished:
    # do stuff
    it.iternext()
In your situation, I would stack your two images together to create an array of shape [2,h,w] and then iterate over this, filling an empty array with the results of the computation.
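For the index-access case, here is a small sketch of what that could look like. It assumes the iterator is created with the multi_index flag, which is what makes it.multi_index available; the arrays are toy data, not the images from the question.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.ones((2, 3), dtype=a.dtype)
out = np.empty_like(a)

# The multi_index flag makes the current (row, col) available on the iterator.
it = np.nditer(a, flags=['multi_index'])
while not it.finished:
    idx = it.multi_index          # tuple such as (0, 0), (0, 1), ...
    out[idx] = it[0] - b[idx]     # it[0] is the current element of a
    it.iternext()

print(out)
```

For a plain elementwise difference like this one, `a - b` gives the same result without any Python-level loop; the iterator is mainly worth it when the per-element logic really depends on the index.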

Related

Visualize 1D numpy array as 2D array with matplotlib

I have a 2D array of all the numbers 1 to 100, split into rows of 10, and a boolean value for each number saying whether it is prime or not. I'm struggling to figure out how to visualize it like in the image below.
Here is my code, to help explain what I have.
I want to visualize it like this picture I found online.
# exercise
import numpy as np
import matplotlib.pyplot as plt
is_prime = np.ones(100, dtype=bool) # array is filled with True, since 1 converts to True
is_prime[:2] = False # 0 and 1 are not prime
# For each integer j starting from 2, cross out its higher multiples:
N_max = int(np.sqrt(len(is_prime) - 1))
for j in range(2, N_max + 1):
    is_prime[2*j::j] = False
# split an array up into multiple sub arrays
split_primes = np.split(is_prime, 10)
# create overlay for numbers
num_overlay = np.arange(100)
split_overlay = np.split(num_overlay, 10)
plt.plot(split_overlay)
Creating 2D array of the numbers
Check out the documentation for numpy's reshape function. Here you can turn your array into a 2D array by doing:
data = is_prime.reshape(10,10)
we can also make an array of the first 100 integers to use for labeling in a similar fashion:
integers = np.arange(100).reshape(10,10)
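As a quick sanity check of the reshape step, the sketch below rebuilds the sieve (with 0 and 1 explicitly marked as non-prime, which is my addition) and confirms that reshape(10, 10) gives the same rows as np.split, just as one real 2D array instead of a list of ten 1D arrays.

```python
import numpy as np

# Sieve of Eratosthenes over 0..99
is_prime = np.ones(100, dtype=bool)
is_prime[:2] = False                      # assumption: treat 0 and 1 as non-prime
n_max = int(np.sqrt(len(is_prime) - 1))
for j in range(2, n_max + 1):
    is_prime[2 * j::j] = False

data = is_prime.reshape(10, 10)
integers = np.arange(100).reshape(10, 10)

# reshape produces the same rows as np.split, stacked into one array
same = np.array_equal(data, np.array(np.split(is_prime, 10)))
print(same)

# Boolean-mask indexing picks out the labels of the prime cells
primes = integers[data]
print(primes)
```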
Plotting the 2D array
When plotting in 2D you need to use one of the 2D functions that matplotlib provides, e.g. imshow, matshow, pcolormesh. You can either call these functions directly on your array, in which case they will use a colormap and each pixel's color will correspond to the value in the associated spot in the array, or you can explicitly make an RGB image, which affords you a bit more control over the color of each box. For this case I think the latter is a bit easier, so the solution below uses that approach. However, if you want to annotate heatmaps, the matplotlib documentation has a great resource for that. For now we will create an array of RGB values (shape 10 by 10 by 3) and change the colors of only the prime numbers using numpy's indexing abilities.
# create RGB array that we will fill in
rgb = np.ones((10,10,3)) # start with an array of white
rgb[data]=[1,1,0] # color the places where the data is prime yellow
plt.figure(figsize=(10,10))
plt.imshow(rgb)
# add number annotations
integers = np.arange(100).reshape(10,10)
#add annotations based on: https://stackoverflow.com/questions/20998083/show-the-values-in-the-grid-using-matplotlib
for (i, j), z in np.ndenumerate(integers):
    plt.text(j, i, '{:d}'.format(z), ha='center', va='center',color='k',fontsize=15)
# remove axis and tick labels
plt.axis('off')
plt.show()
Resulting in this image:

How to check the presence of a given numpy array in a larger-shape numpy array?

I guess the title of my question might not be very clear..
I have a small array, say a = ([[0,0,0],[0,0,1],[0,1,1]]). Then I have a bigger array of a higher dimension, say b = ([[[2,2,2],[2,0,1],[2,1,1]],[[0,0,0],[3,3,1],[3,1,1]],[...]]).
I'd like to check if one of the elements of a can be found in b. In this case, I'd find that the first element of a [0,0,0] is indeed in b, and then I'd like to retrieve the corresponding index in b.
I'd like to do that avoiding looping, since from the very little I understood from numpy arrays, they are not meant to be iterated over in a classic way. In other words, I need it to be very fast, because my actual arrays are quite big.
Any idea?
Thanks a lot!
Arnaud.
I don't know of a direct way, but here's a function that works around the problem:
import numpy as np
def find_indices(val, arr):
    # first take a mean at the lowest level of each array,
    # then compare these to eliminate the majority of entries
    mb = np.mean(arr, axis=2)
    ma = np.mean(val)
    Y = np.argwhere(mb==ma)
    indices = []
    # Then run a quick loop on the remaining elements to
    # eliminate arrays that don't match the order
    for i in range(len(Y)):
        idx = (Y[i,0],Y[i,1])
        if np.array_equal(val, arr[idx]):
            indices.append(idx)
    return indices
# Sample arrays
a = np.array([[0,0,0],[0,0,1],[0,1,1]])
b = np.array([ [[6,5,4],[0,0,1],[2,3,3]], \
[[2,5,4],[6,5,4],[0,0,0]], \
[[2,0,2],[3,5,4],[5,4,6]], \
[[6,5,4],[0,0,0],[2,5,3]] ])
print(find_indices(a[0], b))
# [(1, 2), (3, 1)]
print(find_indices(a[1], b))
# [(0, 1)]
The idea is to use the mean of each array and compare this with the mean of the input. np.argwhere() is the key here. That way you remove most of the unwanted matches, but I did need to use a loop on the remainder to avoid the unsorted matches (this shouldn't be too memory-consuming). You'll probably want to customise it further, but I hope this helps.
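A loop-free variant is also possible by comparing along the last axis directly. This is an alternative sketch, not the function above: find_rows is a name I made up, and it relies on broadcasting val against the last dimension of arr, so it avoids both the mean comparison and the float equality test.

```python
import numpy as np

def find_rows(val, arr):
    # arr == val broadcasts val over the last axis; .all(axis=-1) is True
    # only where every component matches, in order.
    matches = (arr == val).all(axis=-1)
    return np.argwhere(matches)       # one row of indices per match

a = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 1]])
b = np.array([[[6, 5, 4], [0, 0, 1], [2, 3, 3]],
              [[2, 5, 4], [6, 5, 4], [0, 0, 0]],
              [[2, 0, 2], [3, 5, 4], [5, 4, 6]],
              [[6, 5, 4], [0, 0, 0], [2, 5, 3]]])

print(find_rows(a[0], b))
print(find_rows(a[1], b))
print(find_rows(a[2], b))   # no match: empty array of shape (0, 2)
```

The intermediate boolean array has the same shape as arr, so for very large arrays the memory cost is one byte per element.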

Numpy Array Shape Issue

I have initialized this empty 2d np.array
inputs = np.empty((300, 2), int)
And I am attempting to append a 2d row to it as such
inputs = np.append(inputs, np.array([1,2]), axis=0)
But I'm getting
ValueError: all the input arrays must have same number of dimensions
And NumPy seems to think it's a 2-row, 0-dimensional object (the transpose of a 2D row):
np.array([1, 2]).shape
(2,)
Where have I gone wrong?
To add a row to a (300,2) shape array, you need a (1,2) shape array. Note the matching 2nd dimension.
np.array([[1,2]]) works. So does np.array([1,2])[None, :] and np.atleast_2d([1,2]).
I encourage the use of np.concatenate. It forces you to think more carefully about the dimensions.
Do you really want to start with np.empty? Look at its values. They are uninitialized (whatever happened to be in memory), and probably large.
@Divakar suggests np.row_stack. That puzzled me a bit, until I checked and found that it is just another name for np.vstack. That function passes all inputs through np.atleast_2d before doing np.concatenate. So ultimately it is the same solution: turn the (2,) array into a (1,2) one.
np.array([1,2]) is a valid 1D array, but appending along axis 0 needs a 2D row, which takes double brackets, so
np.array([1,2])
needs to be
np.array([[1,2]])
If you intend to append that as the last row into inputs, you can just simply use np.row_stack -
np.row_stack((inputs,np.array([1,2])))
Please note this np.array([1,2]) is a 1D array.
You can even pass it a 2D row version for the same result -
np.row_stack((inputs,np.array([[1,2]])))
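To put the variants side by side, here is a short sketch. It starts from np.zeros rather than np.empty (my choice, so the contents are predictable) and uses np.vstack, since row_stack is just another name for it.

```python
import numpy as np

inputs = np.zeros((300, 2), dtype=int)
row = np.array([1, 2])                       # shape (2,), i.e. 1D

# Explicitly turn the 1D row into shape (1, 2) before concatenating
appended = np.concatenate([inputs, row[None, :]], axis=0)
print(appended.shape)                        # (301, 2)

# vstack applies atleast_2d for you, so the 1D row is accepted as-is
stacked = np.vstack([inputs, row])
print(np.array_equal(appended, stacked))     # True

# If rows arrive one at a time, collecting them in a list and converting
# once avoids copying the whole array on every append.
rows = [[1, 2], [3, 4], [5, 6]]
built = np.array(rows)
print(built.shape)                           # (3, 2)
```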

How to get a subarray in numpy

I have a 3D array and I want to get a sub-array of size (2n+1) centered around an index indx. Using slices I can use
y[slice(indx[0]-n,indx[0]+n+1),slice(indx[1]-n,indx[1]+n+1),slice(indx[2]-n,indx[2]+n+1)]
which will only get uglier if I want a different size for each dimension. Is there a nicer way to do this?
You don't need to use the slice constructor unless you want to store the slice object for later use. Instead, you can simply do:
y[indx[0]-n:indx[0]+n+1, indx[1]-n:indx[1]+n+1, indx[2]-n:indx[2]+n+1]
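If you want to do this without specifying each index separately, you can build a tuple of slices with a generator expression (recent NumPy versions no longer accept a plain list of slices as an index):
y[tuple(slice(i-n, i+n+1) for i in indx)]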
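The same idea extends to a different window size per dimension. Below is a sketch; the helper name window is made up, n may be a single int or one half-width per axis, and clamping the start at 0 is my own addition so that a window near the lower border does not wrap around through negative indices.

```python
import numpy as np

def window(y, indx, n):
    # Hypothetical helper: n is an int or a sequence with one half-width per axis.
    half = np.broadcast_to(n, (y.ndim,))
    slices = tuple(
        slice(max(int(i) - int(h), 0), int(i) + int(h) + 1)  # clamp start at 0
        for i, h in zip(indx, half)
    )
    return y[slices]

y = np.arange(17 * 18 * 16).reshape(17, 18, 16)

print(window(y, [5, 10, 8], 2).shape)          # (5, 5, 5)
print(window(y, [5, 10, 8], [4, 3, 2]).shape)  # (9, 7, 5)
```

A stop index past the end of an axis needs no special handling, because slicing already truncates it.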
You can create numpy arrays for indexing into the different dimensions of the 3D array and then use the ix_ function to create an indexing map and thus get the sliced output. The benefit of ix_ is that it allows for broadcasted indexing maps. Then, you can specify different window sizes for each dimension for a generic solution. Here's the implementation with sample input data -
import numpy as np
A = np.random.randint(0,9,(17,18,16)) # Input array
indx = np.array([5,10,8]) # Pivot indices for each dim
N = [4,3,2] # Window sizes
# Arrays of start & stop indices
start = indx - N
stop = indx + N + 1
# Create indexing arrays for each dimension
xc = np.arange(start[0],stop[0])
yc = np.arange(start[1],stop[1])
zc = np.arange(start[2],stop[2])
# Create mesh from multiple arrays for use as indexing map
# and thus get desired sliced output
Aout = A[np.ix_(xc,yc,zc)]
Thus, for the given data with window sizes array, N = [4,3,2], the whos info shows -
In [318]: whos
Variable   Type       Data/Info
-------------------------------
A          ndarray    17x18x16: 4896 elems, type `int32`, 19584 bytes
Aout       ndarray    9x7x5: 315 elems, type `int32`, 1260 bytes
The whos info for the output, Aout, is consistent with the intended output shape, which must be 2N+1 along each dimension.
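One practical difference between the ix_ approach and plain slices is worth checking: both select the same block, but basic slicing returns a view into the original array while indexing with index arrays returns a copy. A small sketch with deterministic sample data (np.arange instead of random integers, so the comparison is reproducible):

```python
import numpy as np

A = np.arange(17 * 18 * 16).reshape(17, 18, 16)
indx = np.array([5, 10, 8])
N = np.array([4, 3, 2])
start, stop = indx - N, indx + N + 1

via_ix = A[np.ix_(*[np.arange(s, e) for s, e in zip(start, stop)])]
via_slices = A[tuple(slice(s, e) for s, e in zip(start, stop))]

print(via_ix.shape)                          # (9, 7, 5)
print(np.array_equal(via_ix, via_slices))    # True
print(np.shares_memory(A, via_slices))       # True: slicing gives a view
print(np.shares_memory(A, via_ix))           # False: index arrays give a copy
```

So writing into the sliced result modifies A, whereas writing into the ix_ result does not.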

Should a pandas dataframe column be converted in some way before passing it to a scikit learn regressor?

I have a pandas dataframe and am passing df[list_of_columns] as X and df[[single_column]] as Y to a Random Forest regressor.
What does the following warning mean, and what should be done to resolve it?
DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
probas = cfr.fit(trainset_X, trainset_Y).predict(testset_X)
Simply check the shape of your Y variable, it should be a one-dimensional object, and you are probably passing something with more (possibly trivial) dimensions. Reshape it to the form of list/1d array.
You can use df.single_column.values or df['single_column'].values to get the underlying numpy array of your series (which, in this case, should also have the correct 1D-shape as mentioned by lejlot).
Actually the warning tells you exactly what the problem is:
You passed a 2D array which happens to have shape (X, 1), but the method expects a 1D array of shape (X, ).
Moreover, the warning tells you how to get the form you need: y.values.ravel().
Using Y = df[[single_column]].values.ravel() solves the DataConversionWarning for me.
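The shapes involved are easy to inspect directly. A minimal sketch with a made-up three-row dataframe (the column names are placeholders, and no regressor is fitted here):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"feature": [1.0, 2.0, 3.0],
                   "target": [10.0, 20.0, 30.0]})

y_2d = df[["target"]].values            # double brackets: DataFrame, shape (3, 1)
y_1d = df["target"].values              # single brackets: Series, shape (3,)
y_flat = df[["target"]].values.ravel()  # column vector flattened to shape (3,)

print(y_2d.shape, y_1d.shape, y_flat.shape)
```

Either of the two 1D forms should be accepted as y without the warning; the (3, 1) column vector is the shape the warning complains about.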