I have an array with dimensions (2, 3, 4, 5).
When I do np.transpose(a, (0, 3, 2, 1)) I get back the expected result with shape (2, 5, 4, 3).
But when I do np.transpose(a, (0, 3, 1, 2)), I expect to get a result with shape (2, 4, 5, 3) but instead I get a shape of (2, 5, 3, 4)...
What is going on?
The dimensions:
0: 2
1: 3
2: 4
3: 5
First transpose (0,3,2,1) -> dims=[2,5,4,3]
Second transpose (0,3,1,2) -> dims=[2,5,3,4]
What's happening is that numpy is doing its job; you're just feeding it the wrong permutation. What you want is np.transpose(a, (0, 2, 3, 1)).
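The axes argument gives, for each output axis, the input axis it comes from, so result.shape[i] == a.shape[axes[i]]. A quick check, just a sketch with a dummy array:
import numpy as np
a = np.zeros((2, 3, 4, 5))
# axes[i] says which input axis becomes output axis i
print(np.transpose(a, (0, 3, 1, 2)).shape)  # (2, 5, 3, 4)
print(np.transpose(a, (0, 2, 3, 1)).shape)  # (2, 4, 5, 3)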
Related
I have a 2d tensor of shape [32,768] and a 3d tensor of shape [32,512,768]. I want to stack them and get an output with shape [32,512,1536].
If I expand dimensions at axis=1 for the 2d tensor and concat, I get [32,513,768]. So how do I get [32,512,1536] as my output tensor shape?
Short answer: you will have to repeat the 2D tensor 512 times along axis 1 to get a 3D tensor of shape [32, 512, 768]. This 3D tensor, when concatenated with the other 3D tensor along the last dimension, will give a tensor of shape [32, 512, 1536]. You need to make sure this repetition is desired.
Longer explanation:
Let's take a much simpler case:
Take a 1D tensor (1, 2, 3, 4, 5). Say you need to concatenate this to a 2D tensor of shape [2, 5], say ((6, 7, 8, 9, 10), (11, 12, 13, 14, 15)). Note that this is a simplified version of your problem, with smaller tensors and no batch dimension.
One way to combine these tensors is to get a tensor of shape [3, 5]. Here, you would expand the 1D tensor to a 2D tensor having shape [1, 5], and concatenate along axis 0. This will give the result ((1, 2, 3, 4, 5), (6, 7, 8, 9, 10), (11, 12, 13, 14, 15)). When applied to your problem, this gives the resulting [32, 513, 768] tensor you have.
The second way would give the tensor ((1, 2, 3, 4, 5, 6, 7, 8, 9, 10), (1, 2, 3, 4, 5, 11, 12, 13, 14, 15)), having shape [2, 10]. As you can see, this requires (1, 2, 3, 4, 5) to be repeated twice. So, you'll have to expand the 1D tensor to get the shape [1, 5], and repeat it to get a tensor of shape [2, 5]. This tensor can then be concatenated with the other 2D tensor. In your case, you will expand the 2D tensor to shape [32, 1, 768], then repeat it 512 times along axis 1 to get a tensor of shape [32, 512, 768], which will be concatenated with the other 3D tensor.
When going for the second method, ensure that you really want to repeat the smaller tensor across all entries of the second tensor.
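To make the two ways concrete, here is a minimal sketch of the simple case in torch, using the values from the tuples above:
import torch
a = torch.tensor([1, 2, 3, 4, 5])
b = torch.tensor([[6, 7, 8, 9, 10],
                  [11, 12, 13, 14, 15]])
# first way: expand a to shape [1, 5], concatenate along axis 0 -> [3, 5]
first = torch.cat([a.unsqueeze(0), b], dim=0)
# second way: repeat a to shape [2, 5], concatenate along axis 1 -> [2, 10]
second = torch.cat([a.unsqueeze(0).repeat(2, 1), b], dim=1)
print(first.shape, second.shape)  # torch.Size([3, 5]) torch.Size([2, 10])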
You can try this:
import torch
a = torch.ones([32, 768])
b = torch.ones([32, 512, 768])
# insert a new axis -> [32, 1, 768], repeat 512 times along axis 1
# -> [32, 512, 768], then concatenate with b along the last axis
result = torch.cat([a[:, None, :].repeat(1, 512, 1), b], dim=2)
print(result.shape)  # torch.Size([32, 512, 1536])
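If memory matters, expand should also work here: it creates a broadcasted view instead of copying the data 512 times, and torch.cat copies its inputs anyway. A sketch, same shapes assumed:
result = torch.cat([a.unsqueeze(1).expand(-1, 512, -1), b], dim=2)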
I'm trying to solve this algorithm problem.
Given a radius and a point.
Find every point in the 3D coordinate system that is within the sphere of that radius centered at the given point, and store them in a list.
You could do this with numpy, as below.
Note the code here will give you coordinates relative to a sphere centered at a point you choose, with a radius you choose. You need to make sure that your input dimension 'dim' below is set so that the sphere is fully contained within that volume. It also only works for non-negative indices: if your point has any negative coordinates, use their absolute values, and then flip the signs of those axis coordinates in the output yourself.
import numpy as np
dim = 15
# get 3 arrays representing indices along each axis
xx, yy, zz = np.ogrid[:dim, :dim, :dim]
# set the center point and radius you want
center = [7, 7, 7]
radius = 3
# create 3d array with values that are the distance from the
# center squared
d2 = (xx-center[0])**2 + (yy-center[1])**2 + (zz-center[2])**2
# create a boolean mask that is True wherever the squared distance is
# within radius squared; comparing squared distances avoids a slow sqrt()
#
# so this is what you want - all the values within "radius" of the center
# are now set to True
mask = d2 <= radius**2
# now get the indices where the mask is True. numpy.nonzero does that,
# and gives you 3 numpy 1d arrays of indices along each axis
s, t, u = np.nonzero(mask)
# finally, to get what you want, which is all those indices in a list, zip them together:
coords = list(zip(s, t, u))
print(coords)
>>>
[(2, 5, 6),
(3, 4, 5),
(3, 4, 6),
(3, 4, 7),
(3, 5, 5),
(3, 5, 6),
(3, 5, 7),
(3, 6, 5),
(3, 6, 6),
(3, 6, 7),
(4, 3, 6),
(4, 4, 5),
(4, 4, 6),
(4, 4, 7),
(4, 5, 4),
(4, 5, 5),
(4, 5, 6),
(4, 5, 7),
(4, 5, 8),
(4, 6, 5),
(4, 6, 6),
(4, 6, 7),
(4, 7, 6),
(5, 4, 5),
(5, 4, 6),
(5, 4, 7),
(5, 5, 5),
(5, 5, 6),
(5, 5, 7),
(5, 6, 5),
(5, 6, 6),
(5, 6, 7),
(6, 5, 6)]
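If you need the sphere centered at an arbitrary point (negative coordinates included) without embedding it in a larger volume, one alternative sketch is to build the open grids over just the bounding cube of the sphere; center and radius below are placeholders you'd set yourself:
import numpy as np
center = [-2, 7, 0]
radius = 3
# open grids spanning the bounding cube of the sphere
xx, yy, zz = np.ogrid[center[0]-radius:center[0]+radius+1,
                      center[1]-radius:center[1]+radius+1,
                      center[2]-radius:center[2]+radius+1]
d2 = (xx - center[0])**2 + (yy - center[1])**2 + (zz - center[2])**2
# nonzero returns positions within the cube, so shift them back to
# absolute coordinates
s, t, u = np.nonzero(d2 <= radius**2)
coords = list(zip(s + center[0] - radius,
                  t + center[1] - radius,
                  u + center[2] - radius))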
I'm trying to express the N-D behaviour of np.dot using only 2-D np.dot or np.tensordot.
To recap, np.dot does something like the following for N-D: It matches/broadcasts the arrays along all dimensions but the last two and performs dot products for all of them. For example, if x.shape is (2, 3, 4, 5) and y.shape is (2, 3, 5, 4), np.dot(x, y).shape is (2, 3, 4, 4) and np.dot(x, y)[i, j] is np.dot(x[i, j], y[i, j]).
Also, if x.shape is just (4, 5), it will first be converted to (2, 3, 5, 4) via np.broadcast.
I tried np.tensordot(x, y, axes=(-1, -2)) but it repeats along every dimension of x, y instead of matching them up.
I realise I could write a loop but I was looking for a vectorised solution.
You got the broadcasting behavior of np.dot wrong:
In [254]: x=np.ones((2,3,4,5)); y=np.ones((2,3,5,4))
In [255]: np.dot(x,y).shape
Out[255]: (2, 3, 4, 2, 3, 4)
In [256]: np.matmul(x,y).shape
Out[256]: (2, 3, 4, 4)
and for the (4,5) x:
In [257]: np.dot(x[0,0],y).shape
Out[257]: (4, 2, 3, 4)
In [258]: np.matmul(x[0,0],y).shape
Out[258]: (2, 3, 4, 4)
matmul was added precisely because np.dot does not act like it is performing np.dot(x[i,j,:,:], y[i,j,:,:]) for all i,j.
The shape in Out[255] is the shape of x minus the 5, plus the shape of y minus its 5. In effect it is an outer product of everything, with summing over the size-5 dimension.
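To see that claim directly: each output block is a plain 2-d dot of one block of x with one block of y, for every pair of leading indices. A quick numeric check:
import numpy as np
x = np.random.rand(2, 3, 4, 5)
y = np.random.rand(2, 3, 5, 4)
# np.dot pairs every (i, j) block of x with every (k, l) block of y
print(np.allclose(np.dot(x, y)[1, 2, :, 0, 1, :],
                  np.dot(x[1, 2], y[0, 1])))  # True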
tensordot uses np.dot. It just reshapes and transposes the inputs to reduce the problem to a 2d dot, then massages the result back to the desired shape and order.
In [259]: np.tensordot(x, y, axes=(-1,-2)).shape
Out[259]: (2, 3, 4, 2, 3, 4) # cf Out[255]
In [261]: np.einsum('ijkl,ijlm->ijkm',x,y).shape
Out[261]: (2, 3, 4, 4) # cf Out[256]
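To make that reduction concrete, here is a sketch of how the tensordot result above can be produced with only reshapes, an axis move, and one 2-d np.dot:
import numpy as np
x = np.ones((2, 3, 4, 5))
y = np.ones((2, 3, 5, 4))
# flatten the non-contracted axes of x into rows and of y into columns
x2 = x.reshape(-1, 5)                       # (24, 5)
y2 = np.moveaxis(y, -2, 0).reshape(5, -1)   # (5, 24)
out = np.dot(x2, y2).reshape(2, 3, 4, 2, 3, 4)
print(np.allclose(out, np.tensordot(x, y, axes=(-1, -2))))  # True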
Since sparse matrices are 2d to start with (and end with), I don't understand your question. If you have multiple sparse matrices, you'll have to work with them individually.
I want to cut the variable x into 3 groups:
new_var = pd.qcut(x, q=[0, .33, .66, 1.], labels=['low', 'medium', 'high'])
For x.quantile(q=0.33) I get the value 0.6.
My question is: is there some function that can cut the variable x into n (in my case 3) groups, but where we define the thresholds ourselves instead of using quantiles (like qcut does)? In my case, instead of 0.6 I want to get 0.59999...
Or alternatively: is there a way in the qcut function to have the values (starting from 0.6) assigned to 'medium' (not to 'low')? I mean, to use open intervals instead of closed ones.
I believe what you are looking for is pd.cut, which allows you to discretize data into defined bins using half-open intervals.
Example:
>>> pd.cut(range(1,10), [0,3,6,10], right=True)
[(0, 3], (0, 3], (0, 3], (3, 6], (3, 6], (3, 6], (6, 10], (6, 10], (6, 10]]
Categories (3, interval[int64]): [(0, 3] < (3, 6] < (6, 10]]
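Applied to your case, a sketch with right=False makes the bins half-open on the right, so 0.6 lands in 'medium'; the 0.8 threshold here is only a placeholder for your 66% quantile, which the question doesn't give:
import numpy as np
import pandas as pd
x = pd.Series(np.random.rand(100))
new_var = pd.cut(x, bins=[-np.inf, 0.6, 0.8, np.inf],
                 right=False, labels=['low', 'medium', 'high'])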
Say a.shape is (2, 3, 1) and b.shape is (2, 1, 4). They both represent arrays of matrices. How do I element-wise multiply these two arrays so that the result's shape is (2, 3, 4)?
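Assuming numpy, broadcasting does this directly: the size-1 axes stretch against each other, so a * b already has shape (2, 3, 4). A minimal sketch:
import numpy as np
a = np.arange(6).reshape(2, 3, 1)
b = np.arange(8).reshape(2, 1, 4)
# (2, 3, 1) * (2, 1, 4) broadcasts to (2, 3, 4)
print((a * b).shape)  # (2, 3, 4)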