Converting flat 1D matrix to a square matrix - numpy

Is there a way/code to convert the following n x 1 matrix,
x1
x2
x3
x4
x5
x6
x7
x8
x9
x10
etc.
into a square matrix of the form,
x1 x2 x4 x7
x2 x3 x5 x8
x4 x5 x6 x9
x7 x8 x9 x10
etc.
I have a 903 x 1 matrix (in a .csv format) that I hope to convert into a 42 x 42 matrix with the form as shown. Thanks!

I suppose I should wait until you edit the question, but I went ahead and looked at the figure. It looks like a symmetric matrix built from upper and lower triangular matrices. In what discipline is that called a 'full matrix'?
Anyhow, here's one sequence that produces your figure:
In [93]: idx=np.tril_indices(4)
In [94]: idx
Out[94]: (array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3]), array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3]))
In [95]: arr = np.zeros((4,4),int)
In [96]: arr[idx] = np.arange(1,11)
In [97]: arr
Out[97]:
array([[ 1, 0, 0, 0],
[ 2, 3, 0, 0],
[ 4, 5, 6, 0],
[ 7, 8, 9, 10]])
In [98]: arr1 = arr + arr.T
In [99]: arr1
Out[99]:
array([[ 2, 2, 4, 7],
[ 2, 6, 5, 8],
[ 4, 5, 12, 9],
[ 7, 8, 9, 20]])
In [100]: dx = np.diag_indices(4)
In [101]: dx
Out[101]: (array([0, 1, 2, 3]), array([0, 1, 2, 3]))
In [102]: arr1[dx] = arr[dx]
In [103]: arr1
Out[103]:
array([[ 1, 2, 4, 7],
[ 2, 3, 5, 8],
[ 4, 5, 6, 9],
[ 7, 8, 9, 10]])
This is similar to what scipy.spatial calls a squareform for pairwise distances.
https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.spatial.distance.squareform.html#scipy.spatial.distance.squareform
In [106]: from scipy.spatial import distance
In [107]: distance.squareform(np.arange(1,11))
Out[107]:
array([[ 0, 1, 2, 3, 4],
[ 1, 0, 5, 6, 7],
[ 2, 5, 0, 8, 9],
[ 3, 6, 8, 0, 10],
[ 4, 7, 9, 10, 0]])
It appears that squareform uses compiled code, so I expect it will be quite a bit faster than my tril-based code. But the diagonal is filled with zeros and the order of elements isn't quite what you expect.
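The tril-based steps above can be folded into one helper. This is only a sketch: the name vec_to_symmetric is made up for illustration, and it assumes the input holds the lower triangle row by row (x1; x2 x3; x4 x5 x6; ...), as in the question.

```python
import math
import numpy as np

def vec_to_symmetric(v):
    """Build a symmetric n x n matrix from its n*(n+1)/2 lower-triangle values."""
    v = np.asarray(v).ravel()
    # Solve n*(n+1)/2 == v.size for n using exact integer arithmetic.
    n = (math.isqrt(8 * v.size + 1) - 1) // 2
    if n * (n + 1) // 2 != v.size:
        raise ValueError(f"{v.size} is not a triangular number")
    out = np.zeros((n, n), dtype=v.dtype)
    rows, cols = np.tril_indices(n)
    out[rows, cols] = v  # fill the lower triangle, row by row
    out[cols, rows] = v  # mirror the same values into the upper triangle
    return out

small = vec_to_symmetric(np.arange(1, 11))
big = vec_to_symmetric(np.arange(903))
print(small)
print(big.shape)  # (42, 42)
```

Writing the values twice (once at [rows, cols], once at [cols, rows]) avoids the add-then-fix-the-diagonal step. The 903 values from the .csv file could presumably be read with something like np.loadtxt first; the file layout is not given in the question, so that part is left out.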

Numpy has a function to reshape arrays -
https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
>>> np.reshape(a, (2, 3)) # C-like index ordering
array([[0, 1, 2],
[3, 4, 5]])
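One caveat, shown as a small sketch: reshape only rearranges existing elements, so the element count has to match the target shape. A 903-element vector cannot be reshaped to 42 x 42 (1764 elements), and reshape would not produce the symmetric layout from the question anyway. The variable names below are illustrative.

```python
import numpy as np

a = np.arange(6)
print(np.reshape(a, (2, 3)))  # works: 6 elements fit a 2 x 3 array

v = np.arange(903)
try:
    np.reshape(v, (42, 42))  # 903 != 42 * 42, so this raises
    reshaped = True
except ValueError:
    reshaped = False
print(reshaped)  # False
```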

Related

Can I select arbitrary windows from the last dimension of a numpy array?

I'd like to write a numpy function that takes an MxN array A, a window length L, and an MxP array idxs of starting indices into the M rows of A that selects P arbitrary slices of length L from each of the M rows of A. Except, I would love for this to work on the last dimension of A, and not necessarily care how many dimensions A has, so all dims of A and idxs match except the last one. Examples:
If A is just 1D:
A = np.array([1, 2, 3, 4, 5, 6])
window_len = 3
idxs = np.array([1, 3])
result = magical_routine(A, idxs, window_len)
Where result is a 2x3 array since I selected 2 slices of len 3:
np.array([[ 2, 3, 4],
[ 4, 5, 6]])
If A is 2D:
A = np.array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9,10,11,12],
[13,14,15,16,17,18]])
window_len = 3
idxs = np.array([[1, 3],
[0, 1],
[2, 2]])
result = magical_routine(A, idxs, window_len)
Where result is a 3x2x3 array since there are 3 rows of A, and I selected 2 slices of len 3 from each row:
np.array([[[ 2, 3, 4], [ 4, 5, 6]],
[[ 7, 8, 9], [ 8, 9,10]],
[[15,16,17], [15,16,17]]])
And so on.
I have discovered a number of inefficient ways to do this, along with ways that work for a specific number of dimensions of A. For 2D, the following is pretty tidy:
col_idxs = np.add.outer(idxs, np.arange(window_len))
np.take_along_axis(A[:, np.newaxis], col_idxs, axis=-1)
I can't see a nice way to generalize this for 1D and other D's though...
Is anyone aware of an efficient way that generalizes to any number of dims?
For your 1d case
In [271]: A=np.arange(1,7)
In [272]: idxs = np.array([1,3])
Using the kind of iteration that this kind of question usually gets:
In [273]: np.vstack([A[i:i+3] for i in idxs])
Out[273]:
array([[2, 3, 4],
[4, 5, 6]])
Alternatively, generate all the indices first and index once. linspace is handy for this (though it's not the only option):
In [278]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [279]: j
Out[279]:
array([[1., 3.],
[2., 4.],
[3., 5.]])
In [282]: A[j.T.astype(int)]
Out[282]:
array([[2, 3, 4],
[4, 5, 6]])
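As one of the other options, a broadcast addition builds the same index array directly as integers, so no float-to-int cast is needed. A minimal sketch for the 1d case, reusing the names from above:

```python
import numpy as np

A = np.arange(1, 7)
idxs = np.array([1, 3])
window_len = 3

# Each start index gets its own row of consecutive offsets: shape (2, 3).
j = idxs[..., None] + np.arange(window_len)
result = A[j]
print(j)
print(result)
```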
For the 2d case:
In [284]: B
Out[284]:
array([[ 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12],
[13, 14, 15, 16, 17, 18]])
In [285]: idxs = np.array([[1, 3],
...: [0, 1],
...: [2, 2]])
In [286]: j = np.linspace(idxs,idxs+3,3,endpoint=False)
In [287]: j
Out[287]:
array([[[1., 3.],
[0., 1.],
[2., 2.]],
[[2., 4.],
[1., 2.],
[3., 3.]],
[[3., 5.],
[2., 3.],
[4., 4.]]])
With a bit of trial and error, pair up the indices to get:
In [292]: B[np.arange(3)[:,None,None],j.astype(int).transpose(1,2,0)]
Out[292]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
Or iterate as in the first case, but with an extra layer:
In [294]: np.array([[B[j,i:i+3] for i in idxs[j]] for j in range(3)])
Out[294]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
With sliding windows:
In [295]: aa = np.lib.stride_tricks.sliding_window_view(A,3)
In [296]: aa.shape
Out[296]: (4, 3)
In [297]: aa
Out[297]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6]])
In [298]: aa[[1,3]]
Out[298]:
array([[2, 3, 4],
[4, 5, 6]])
and
In [300]: bb = np.lib.stride_tricks.sliding_window_view(B,(1,3))
In [301]: bb.shape
Out[301]: (3, 4, 1, 3)
In [302]: bb[np.arange(3)[:,None],idxs,0,:]
Out[302]:
array([[[ 2, 3, 4],
[ 4, 5, 6]],
[[ 7, 8, 9],
[ 8, 9, 10]],
[[15, 16, 17],
[15, 16, 17]]])
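The two sliding-window cases can probably be merged into one function that works for any number of leading dimensions. This is a sketch, not a tested general solution: select_windows is a made-up name, it assumes idxs has the same leading shape as A, and it needs a NumPy version that provides sliding_window_view.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def select_windows(A, idxs, window_len):
    # windows has shape (..., N - window_len + 1, window_len)
    windows = sliding_window_view(A, window_len, axis=-1)
    # Pick whole windows by start index along the second-to-last axis;
    # the trailing length-1 axis of the indices broadcasts over the window.
    return np.take_along_axis(windows, idxs[..., None], axis=-2)

A = np.arange(1, 7)
B = np.arange(1, 19).reshape(3, 6)
res1 = select_windows(A, np.array([1, 3]), 3)
res2 = select_windows(B, np.array([[1, 3], [0, 1], [2, 2]]), 3)
print(res1.shape, res2.shape)  # (2, 3) (3, 2, 3)
```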
I got it! I was almost there:
def magical_routine(A, idxs, window_len=2000):
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)
I just needed to always add the new axis to A's second to last dim, and then leave remaining axes alone.
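A quick check of this function against both examples from the question (the function is repeated so the snippet runs on its own; the names out_1d and out_2d are just for illustration):

```python
import numpy as np

def magical_routine(A, idxs, window_len=2000):
    col_idxs = np.add.outer(idxs, np.arange(window_len))
    return np.take_along_axis(A[..., np.newaxis, :], col_idxs, axis=-1)

# 1D example from the question
A1 = np.array([1, 2, 3, 4, 5, 6])
out_1d = magical_routine(A1, np.array([1, 3]), 3)

# 2D example from the question
A2 = np.arange(1, 19).reshape(3, 6)
idxs2 = np.array([[1, 3], [0, 1], [2, 2]])
out_2d = magical_routine(A2, idxs2, 3)

print(out_1d.shape, out_2d.shape)  # (2, 3) (3, 2, 3)
```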

How to iterate through slices at the last dimension

For example, you have array
a = np.array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]])
We want to iterate through slices at the last dimension, i.e. [0,1,2], [3,4,5], [6,7,8], [9,10,11]. Any way to achieve this without the for loop? Thanks!
Tried this but it does not work, because numpy does not interpret the tuple in the way we wanted - a[(0, 0),:] is not the same as a[0, 0, :]
[a[i,:] for i in zip(*product(*(range(ii) for ii in a.shape[:-1])))]
More generally, any way for the last k dimensions? Something equivalent to looping through a[i,j,k, ...].
In [26]: a = np.array([[[ 0, 1, 2],
...: [ 3, 4, 5]],
...:
...: [[ 6, 7, 8],
...: [ 9, 10, 11]]])
In [27]: [a[i,j,:] for i in range(2) for j in range(2)]
Out[27]: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10, 11])]
or
In [31]: list(np.ndindex(2,2))
Out[31]: [(0, 0), (0, 1), (1, 0), (1, 1)]
In [32]: [a[i,j] for i,j in np.ndindex(2,2)]
Another option:
list(a.reshape(-1,3))
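For the "last k dimensions" part of the question, both ideas above seem to generalize. A sketch, where iter_trailing_slices is a hypothetical helper name and k is the number of trailing dimensions to keep in each slice:

```python
import numpy as np

def iter_trailing_slices(a, k=1):
    """Yield a[i, j, ...] for every index over all but the last k dimensions."""
    lead_shape = a.shape[:a.ndim - k]
    for idx in np.ndindex(*lead_shape):
        yield a[idx]  # idx is a tuple, so this is a[i, j, ...]

a = np.arange(12).reshape(2, 2, 3)
rows = list(iter_trailing_slices(a, k=1))    # four slices of shape (3,)
blocks = list(iter_trailing_slices(a, k=2))  # two slices of shape (2, 3)

# Loop-free variant: collapse the leading dimensions with reshape.
k = 2
stacked = a.reshape(-1, *a.shape[-k:])
print(len(rows), len(blocks), stacked.shape)  # 4 2 (2, 2, 3)
```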

Swap values and indexes in numpy

I wonder if this is possible, so I have two 2D arrays:
X[7][9] = 10
Y[7][9] = 5
From above info I want to create following two 2D arrays:
X'[5][10] = 9
Y'[5][10] = 7
Is it possible to accomplish this? The values of X and Y are bounded and won't exceed the shape of X and Y. Also, X and Y have the same shape.
thanks in advance.
You should be able to use np.nditer to keep track of the multi-index and the corresponding values of the arrays.
import numpy as np
rng = np.random.RandomState(0)
X = rng.randint(low=0, high=10, size=(10, 10))
Y = rng.randint(low=0, high=10, size=(10, 10))
X_prime = X.copy()
Y_prime = Y.copy()
# Walk both arrays together while tracking the (row, column) position.
it = np.nditer([X, Y], flags=['multi_index'])
for x, y in it:
    i, j = it.multi_index
    X_prime[y, x] = j
    Y_prime[y, x] = i
I believe this is the result you were expecting:
>>> X[7, 9], Y[7, 9]
(3, 9)
>>> X_prime[9, 3], Y_prime[9, 3]
(9, 7)
>>> X[1, 2], Y[1, 2]
(8, 2)
>>> X_prime[2, 8], Y_prime[2, 8]
(2, 1)
In [147]: X = np.random.randint(0,5,(5,5))
In [148]: Y = np.random.randint(0,5,(5,5))
Similar to Matt's answer, but using ndindex to generate the indices. There are various ways of generating all such values. Internally I believe ndindex uses nditer:
In [149]: X_,Y_ = np.zeros_like(X)-1,np.zeros_like(Y)-1
In [150]: for i,j in np.ndindex(*X.shape):
...:     k,l = X[i,j], Y[i,j]
...:     X_[k,l] = i
...:     Y_[k,l] = j
...:
In [151]: X
Out[151]:
array([[2, 4, 3, 4, 2],
[0, 3, 0, 2, 3],
[1, 1, 4, 4, 4],
[2, 1, 2, 2, 0],
[0, 1, 0, 1, 4]])
In [152]: Y
Out[152]:
array([[1, 2, 1, 3, 0],
[4, 2, 4, 0, 4],
[4, 3, 3, 2, 1],
[0, 3, 0, 2, 2],
[1, 4, 2, 0, 0]])
In [153]: X_
Out[153]:
array([[-1, 4, 4, -1, 1],
[ 4, -1, -1, 3, 4],
[ 3, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
In [154]: Y_
Out[154]:
array([[-1, 0, 2, -1, 2],
[ 3, -1, -1, 1, 1],
[ 2, 0, 3, -1, -1],
[-1, 2, 1, -1, 4],
[ 4, 4, 3, 2, -1]])
Notice that with randomly generated arrays, the mapping is not full (the -1 values). And if there are duplicates, the last replaces previous values.
Handling duplicates - note the change in X_:
In [156]: for i,j in np.ndindex(*X.shape):
...:     k,l = X[i,j], Y[i,j]
...:     if X_[k,l]==-1:
...:         X_[k,l] = i
...:         Y_[k,l] = j
...:     else:
...:         X_[k,l] += i
...:         Y_[k,l] += j
...:
...:
In [157]: X_
Out[157]:
array([[-1, 4, 7, -1, 2],
[ 4, -1, -1, 5, 6],
[ 7, 0, 3, -1, -1],
[-1, 0, 1, -1, 1],
[ 4, 2, 2, 2, -1]])
If the mapping is complete and one to one, it might be possible to do this mapping in a whole-array non-iterative fashion, which would be faster than this.
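A sketch of what that non-iterative version might look like, assuming the mapping really is one to one. The test data is built from a permutation so that every (X[i,j], Y[i,j]) pair occurs exactly once; the loop is kept only as a reference to compare against, and the names X_loop and Y_loop are illustrative.

```python
import numpy as np

n = 5
rng = np.random.default_rng(0)
# Split a permutation of 0..n*n-1 into (quotient, remainder) pairs,
# which gives a one-to-one mapping onto the n x n grid.
perm = rng.permutation(n * n).reshape(n, n)
X, Y = np.divmod(perm, n)

# Whole-array version: row and column index grids assigned in one step.
rows, cols = np.indices(X.shape)
X_ = np.full_like(X, -1)
Y_ = np.full_like(Y, -1)
X_[X, Y] = rows
Y_[X, Y] = cols

# Reference: the ndindex loop from above.
X_loop = np.full_like(X, -1)
Y_loop = np.full_like(Y, -1)
for i, j in np.ndindex(*X.shape):
    k, l = X[i, j], Y[i, j]
    X_loop[k, l] = i
    Y_loop[k, l] = j

print(np.array_equal(X_, X_loop), np.array_equal(Y_, Y_loop))
```

With duplicate (X, Y) pairs the fancy-index assignment would silently keep only one of the candidates, so this shortcut is only a safe replacement when the mapping has no repeats.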

special tiling of matrix in tensorflow or numpy

Consider a 3D tensor T of shape (w x h x d).
The goal is to create a tensor R of shape (w x h x K), where K = d x k, by tiling along the 3rd dimension in a particular way.
The tensor should repeat each slice along the 3rd dimension k times, meaning:
T[:,:,0]=R[:,:,0:k] and T[:,:,1]=R[:,:,k:2*k]
There's a subtle difference from standard tiling, which gives T[:,:,0]=R[:,:,::d], i.e. each slice repeats at every dth position along the 3rd dimension.
Use np.repeat along that axis -
np.repeat(T,k,axis=2)
Sample run -
In [688]: # Setup
...: w,h,d = 2,3,4
...: k = 2
...: T = np.random.randint(0,9,(w,h,d))
...:
...: # Original approach
...: R = np.zeros((w,h,d*k),dtype=T.dtype)
...: for i in range(d):
...:     R[:,:,i*k:(i+1)*k] = T[:,:,i][...,None]
...:
In [692]: T
Out[692]:
array([[[4, 5, 6, 4],
[5, 4, 4, 3],
[8, 0, 0, 8]],
[[7, 3, 8, 0],
[8, 7, 0, 8],
[3, 6, 8, 5]]])
In [690]: R
Out[690]:
array([[[4, 4, 5, 5, 6, 6, 4, 4],
[5, 5, 4, 4, 4, 4, 3, 3],
[8, 8, 0, 0, 0, 0, 8, 8]],
[[7, 7, 3, 3, 8, 8, 0, 0],
[8, 8, 7, 7, 0, 0, 8, 8],
[3, 3, 6, 6, 8, 8, 5, 5]]])
In [691]: np.allclose(R, np.repeat(T,k,axis=2))
Out[691]: True
Alternatively with np.tile and reshape -
np.tile(T[...,None],k).reshape(w,h,-1)
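To make the difference from plain tiling concrete, here is a small side-by-side sketch. It uses a deterministic T (an assumption, instead of the random one above) so the ordering is easy to read off.

```python
import numpy as np

w, h, d, k = 2, 3, 4, 2
T = np.arange(w * h * d).reshape(w, h, d)

R_repeat = np.repeat(T, k, axis=2)  # each slice k times in a row
R_tile = np.tile(T, (1, 1, k))      # the whole depth axis repeated k times

print(T[0, 0])         # [0 1 2 3]
print(R_repeat[0, 0])  # [0 0 1 1 2 2 3 3]
print(R_tile[0, 0])    # [0 1 2 3 0 1 2 3]
```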

Matrix multiplication of two vectors

I'm trying to do a matrix multiplication of two vectors in numpy which would result in an array.
Example
In [108]: b = array([[1],[2],[3],[4]])
In [109]: a =array([1,2,3])
In [111]: b.shape
Out[111]: (4, 1)
In [112]: a.shape
Out[112]: (3,)
In [113]: b.dot(a)
ValueError: objects are not aligned
As can be seen from the shapes, the array a isn't actually a matrix. The catch is to define a like this.
In [114]: a =array([[1,2,3]])
In [115]: a.shape
Out[115]: (1, 3)
In [116]: b.dot(a)
Out[116]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
How to achieve the same result when acquiring the vectors as fields or columns of a matrix?
In [137]: mat = array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
In [138]: x = mat[:,0] #[1,2,3,4]
In [139]: y = mat[0,:] #[1,2,3]
In [140]: x.dot(y)
ValueError: objects are not aligned
You are computing the outer product of two vectors. You can use the function numpy.outer for this:
In [18]: a
Out[18]: array([1, 2, 3])
In [19]: b
Out[19]: array([10, 20, 30, 40])
In [20]: numpy.outer(b, a)
Out[20]:
array([[ 10, 20, 30],
[ 20, 40, 60],
[ 30, 60, 90],
[ 40, 80, 120]])
Use 2d arrays instead of 1d vectors and rely on broadcasting with the * operator:
In [8]: #your code from above
In [9]: y = mat[0:1,:]
In [10]: y
Out[10]: array([[1, 2, 3]])
In [11]: x = mat[:,0:1]
In [12]: x
Out[12]:
array([[1],
[2],
[3],
[4]])
In [13]: x*y
Out[13]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
It's a similar catch to the one in the basic example.
Neither x nor y is treated as a matrix; both are one-dimensional arrays.
In [143]: x.shape
Out[143]: (4,)
In [144]: y.shape
Out[144]: (3,)
We have to add a second dimension of length 1 to each of them.
In [171]: x = array([x]).transpose()
In [172]: x.shape
Out[172]: (4, 1)
In [173]: y = array([y])
In [174]: y.shape
Out[174]: (1, 3)
In [175]: x.dot(y)
Out[175]:
array([[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12]])
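For comparison, the approaches from the answers above should all give the same result when starting from the 1d slices. A short sketch using np.newaxis to add the missing dimension instead of wrapping in array([...]); the variable names are illustrative.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [2, 4, 6],
                [3, 6, 9],
                [4, 8, 12]])
x = mat[:, 0]  # shape (4,)
y = mat[0, :]  # shape (3,)

via_outer = np.outer(x, y)                         # outer product directly
via_broadcast = x[:, np.newaxis] * y               # (4, 1) * (3,) broadcasts to (4, 3)
via_dot = x[:, np.newaxis].dot(y[np.newaxis, :])   # (4, 1) dot (1, 3)

print(via_outer)
```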