Related
This question is similar to that already answered here, but that question does not address how to retrieve the indices of multiple elements.
I have a 2D tensor points with many rows and a small number of columns, and would like to get a tensor containing the row indices of all the elements in that tensor. I know beforehand which elements are present in points: it contains integer elements ranging from 0 to 999, and I can make a tensor using the range function to reflect the set of possible elements. The elements may be in any of the columns.
How can I retrieve the row indices where each element appears in my tensor in a way that avoids looping or using numpy, so I can do this quickly on a GPU?
I am looking for something like (points == elements).nonzero()[:,1]
Thanks!
Try torch.cat([(t == i).nonzero() for i in elements_to_compare]):
>>> import torch
>>> t = torch.empty((15,4)).random_(0, 999)
>>> t
tensor([[429., 833., 393., 828.],
[555., 893., 846., 909.],
[ 11., 861., 586., 222.],
[232., 92., 576., 452.],
[171., 341., 851., 953.],
[ 94., 46., 130., 413.],
[243., 251., 545., 331.],
[620., 29., 194., 176.],
[303., 905., 771., 149.],
[482., 225., 7., 315.],
[ 44., 547., 206., 299.],
[695., 7., 645., 385.],
[225., 898., 677., 693.],
[746., 21., 505., 875.],
[591., 254., 84., 888.]])
>>> torch.cat([(t == i).nonzero() for i in [7,385]])
tensor([[ 9, 2],
[11, 1],
[11, 3]])
>>> torch.cat([(t == i).nonzero()[:,1] for i in [7,385]])
tensor([2, 1, 3])
Numpy:
>>> np.nonzero(np.isin(t, [7,385]))
(array([ 9, 11, 11], dtype=int64), array([2, 1, 3], dtype=int64))
>>> np.nonzero(np.isin(t, [7,385]))[1]
array([2, 1, 3], dtype=int64)
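For what it's worth, the list comprehension can also be replaced by one broadcasted comparison. This is only a minimal sketch: the small points and elements tensors below are made-up stand-ins for the ones in the question. Note that nonzero() returns (row, column) pairs, so column 0 holds the row indices and column 1 the column indices.

```python
import torch

# Made-up stand-ins for the question's tensors.
points = torch.tensor([[7, 3], [1, 385], [7, 385], [2, 3]])
elements = torch.tensor([7, 385])  # values to look up

# Broadcast: (rows, cols, 1) == (n_elements,) -> (rows, cols, n_elements)
matches = points.unsqueeze(-1) == elements

# Each row of `hits` is (row index, column index, index into `elements`).
hits = matches.nonzero()
row_idx = hits[:, 0]    # row in `points` where a match occurred
elem_idx = hits[:, 2]   # which entry of `elements` was matched

print(row_idx)   # tensor([0, 1, 2, 2])
print(elem_idx)  # tensor([0, 1, 0, 1])
```

The intermediate boolean tensor has rows * cols * len(elements) entries, so with 1000 candidate values and many rows it may need to be processed in chunks.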
I'm not sure if I'm correctly understanding what you're looking for, but if you want the indices of a certain value you could try using where and the sparse representation of the result.
E.g. in the below tensor points the value 998 is present at indices [0,0] and [2,0]. To get those indices one could:
In [34]: points=torch.tensor([ [998, 6], [1, 3], [998, 999], [2, 3] ] )
In [35]: torch.where(points==998, points, torch.tensor(0)).to_sparse().indices()
Out[35]:
tensor([[0, 2],
[0, 0]])
I have two different arrays b0 and b1 where:
b0=[1,2]
b1=[3,4]
I want [1st element of b0, 1st element of b1] to be appended to a new array B,
and similarly [2nd element of b0, 2nd element of b1] to be appended to B,
and so on.
That is, my new array should be something like:
array([1,3],[2,4])
Below is my code:
b0=np.array([1,2])
b1=np.array([3,4])
for val in range(len(b1)):
    L=[b0[val],b1[val]]
    B=np.append(L,axis=0)
print(B)
I am getting an error saying that the positional argument 'values' is missing. Kindly help me fix it.
If you insist on using a numpy array, this is what I would do:
new = []
for x, y in zip(b0, b1):
    new.append([x, y])
new = np.array(new)
Or with a list comprehension:
new = np.array([[x,y] for x, y in zip(b0, b1)])
Result:
array([[1, 3],
[2, 4]])
Using np.append here isn't the most convenient way, in my opinion. You can always convert a Python list to an np.array, and it's much easier to just use zip in this case.
b0=np.array([1,2])
b1=np.array([3,4])
B=np.array(list(zip(b0,b1)))
output:
>>> B
array([[1, 3],
[2, 4]])
In [51]: b0=np.array([1,2])
...: b1=np.array([3,4])
Order's wrong:
In [56]: np.vstack((b0,b1))
Out[56]:
array([[1, 2],
[3, 4]])
but you can transpose it:
In [57]: np.vstack((b0,b1)).T
Out[57]:
array([[1, 3],
[2, 4]])
stack is a more general purpose concatenator
In [58]: np.stack((b0,b1), axis=1)
Out[58]:
array([[1, 3],
[2, 4]])
or with:
In [59]: np.column_stack((b0,b1))
Out[59]:
array([[1, 3],
[2, 4]])
More details on combining arrays in my other recent answer: https://stackoverflow.com/a/56159553/901925
All of these, including np.append, use np.concatenate, just tweaking the dimensions in different ways first. np.append is often misused. It isn't a list append clone. None of them should be used repeatedly in a loop: each call makes a new array, which isn't very efficient.
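To make the point about np.append concrete, here is a small sketch (reusing b0 and b1 from the question; the names flat and rows are just for illustration). It shows that np.append returns a new array rather than growing one in place, and that collecting rows in a list and converting once avoids the repeated copies.

```python
import numpy as np

b0 = np.array([1, 2])
b1 = np.array([3, 4])

# np.append takes both the array and the values, and returns a NEW array.
# Without an axis argument, both inputs are flattened first.
flat = np.append(b0, b1)
print(flat)   # [1 2 3 4]
print(b0)     # [1 2]  (b0 itself is unchanged)

# Collect the pieces in a Python list and build the array once at the end.
rows = [[x, y] for x, y in zip(b0, b1)]
B = np.array(rows)
print(B)
```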
Is there a numpy function that pads an array this way?
import numpy as np
def pad(x, length):
    tmp = np.zeros((length,))
    tmp[:x.shape[0]] = x
    return tmp
x = np.array([1,2,3])
print(pad(x, 5))
Output:
[ 1. 2. 3. 0. 0.]
I couldn't find a way to do it with numpy.pad()
You can use ndarray.resize():
>>> x = np.array([1,2,3])
>>> x.resize(5)
>>> x
array([1, 2, 3, 0, 0])
Note that this method behaves differently from numpy.resize(), which pads with repeated copies of the array itself. (Consistency is for people who can't remember everything.)
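A short side-by-side sketch of the two behaviours, in case it helps. The copy and refcheck=False are only there so the in-place resize does not complain about other references to the array in an interactive session; that is my addition, not part of the answer above.

```python
import numpy as np

x = np.array([1, 2, 3])

# numpy.resize returns a new array filled with repeated copies of x.
repeated = np.resize(x, 5)
print(repeated)   # [1 2 3 1 2]

# ndarray.resize works in place and fills the new entries with zeros.
y = x.copy()
y.resize(5, refcheck=False)
print(y)          # [1 2 3 0 0]
```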
Sven Marnach's suggestion to use ndarray.resize() is probably the simplest way to do it, but for completeness, here's how it can be done with numpy.pad:
In [13]: x
Out[13]: array([1, 2, 3])
In [14]: np.pad(x, [0, 5-x.size], mode='constant')
Out[14]: array([1, 2, 3, 0, 0])
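The np.pad call can be wrapped to match the signature of the pad function in the question. A minimal sketch; pad_to and its value parameter are hypothetical names, and it assumes length is at least x.size (np.pad rejects negative pad widths).

```python
import numpy as np

def pad_to(x, length, value=0):
    """Right-pad a 1-D array with `value` up to `length` entries."""
    # (0, n) means: add nothing before the data, n entries after it.
    return np.pad(x, (0, length - x.size), mode='constant',
                  constant_values=value)

x = np.array([1, 2, 3])
print(pad_to(x, 5))   # [1 2 3 0 0]
```

Unlike the original pad, this keeps the integer dtype of x instead of converting to float.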
I am having an issue with Ipython - Numpy. I want to do the following operation:
x^T.x
where x^T is the transpose of the vector x. x is read from a txt file with the instruction:
x = np.loadtxt('myfile.txt')
The problem is that if I use the transpose function
np.transpose(x)
and check the shape to see the size of x, I get the same dimensions for x and x^T. Numpy prints each dimension with an uppercase L suffix, e.g.
print x.shape
print np.transpose(x).shape
(3L, 5L)
(3L, 5L)
Does anybody know how to solve this, and compute x^T.x as a matrix product?
Thank you!
What np.transpose does is reverse the shape tuple, i.e. you feed it an array of shape (m, n), it returns an array of shape (n, m), you feed it an array of shape (n,)... and it returns you the same array with shape (n,).
What you are implicitly expecting is for numpy to take your 1D vector as a 2D array of shape (1, n), that will get transposed into a (n, 1) vector. Numpy will not do that on its own, but you can tell it that's what you want, e.g.:
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> a.T
array([0, 1, 2, 3])
>>> a[np.newaxis, :].T
array([[0],
[1],
[2],
[3]])
As explained by others, transposition won't "work" like you want it to for 1D arrays.
You might want to use np.atleast_2d to have a consistent scalar product definition:
def vprod(x):
    y = np.atleast_2d(x)
    return np.dot(y.T, y)
I had the same problem, I used numpy matrix to solve it:
# assuming x is a list or a numpy 1d-array
>>> x = [1,2,3,4,5]
# convert it to a numpy matrix
>>> x = np.matrix(x)
>>> x
matrix([[1, 2, 3, 4, 5]])
# take the transpose of x
>>> x.T
matrix([[1],
[2],
[3],
[4],
[5]])
# use * for the matrix product
>>> x*x.T
matrix([[55]])
>>> (x*x.T)[0,0]
55
>>> x.T*x
matrix([[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]])
While using numpy matrices may not be the best way to represent your data from a coding perspective, it's pretty good if you are going to do a lot of matrix operations!
For starters L just means that the type is a long int. This shouldn't be an issue. You'll have to give additional information about your problem though since I cannot reproduce it with a simple test case:
In [1]: import numpy as np
In [2]: a = np.arange(12).reshape((4,3))
In [3]: a
Out[3]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [4]: a.T #same as np.transpose(a)
Out[4]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
In [5]: a.shape
Out[5]: (4, 3)
In [6]: np.transpose(a).shape
Out[6]: (3, 4)
There is likely something subtle going on with your particular case which is causing problems. Can you post the contents of the file that you're reading into x?
This is either the inner or outer product of the two vectors, depending on the orientation you assign to them. Here is how to calculate either without changing x.
import numpy
x = numpy.array([1, 2, 3])
inner = x.dot(x)
outer = numpy.outer(x, x)
The file 'myfile.txt' contains lines such as
5.100000 3.500000 1.400000 0.200000 1
4.900000 3.000000 1.400000 0.200000 1
Here is the code I run:
import numpy as np
data = np.loadtxt('iris.txt')
x = data[1,:]
print x.shape
print np.transpose(x).shape
print x*np.transpose(x)
print np.transpose(x)*x
And I get as a result
(5L,)
(5L,)
[ 24.01 9. 1.96 0.04 1. ]
[ 24.01 9. 1.96 0.04 1. ]
I would expect one of the last two results to be a scalar instead of a vector, because x^T.x (or x.x^T) should give a scalar.
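For reference, here is a small sketch of why both results above are vectors and how to get the scalar. It uses the second sample row of the file as x; the variable names are only illustrative. On numpy arrays, * is always elementwise, and transposing a 1-D array is a no-op, so the matrix product has to be requested explicitly.

```python
import numpy as np

x = np.array([4.9, 3.0, 1.4, 0.2, 1.0])   # second sample row from the file

elementwise = x * np.transpose(x)   # same as x * x: shape (5,)
inner = np.dot(x, x)                # scalar x^T.x, about 36.01
outer = np.outer(x, x)              # x.x^T as a (5, 5) matrix

# With an explicit column vector, the usual matrix rules apply.
col = x[:, np.newaxis]              # shape (5, 1)
as_matrix = col.T @ col             # shape (1, 1), same value as `inner`

print(elementwise.shape, outer.shape, as_matrix.shape)
print(inner)
```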
b = np.array([1, 2, 2])
print(b)
print(np.transpose([b]))
print("rows, cols: ", b.shape)
print("rows, cols: ", np.transpose([b]).shape)
Results in
[1 2 2]
[[1]
[2]
[2]]
rows, cols: (3,)
rows, cols: (3, 1)
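Here (3,) is the shape of a 1-D array: a single axis of length 3, with no second axis to swap, which is why transposing b itself changes nothing.
However, if you want the transpose of a matrix A, np.transpose(A) is the solution. In short, wrapping in [] adds a leading axis: it turns a vector into a matrix, and a matrix into a higher-dimensional tensor.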
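The [] trick is one of several equivalent ways to get a column vector. A quick sketch comparing it with np.newaxis and reshape, using the same b as above (col_a, col_b and col_c are just illustrative names):

```python
import numpy as np

b = np.array([1, 2, 2])

col_a = np.transpose([b])    # [b] has shape (1, 3); transposing gives (3, 1)
col_b = b[:, np.newaxis]     # insert a new trailing axis directly
col_c = b.reshape(-1, 1)     # -1 lets numpy infer the number of rows

print(col_a.shape, col_b.shape, col_c.shape)   # (3, 1) (3, 1) (3, 1)
```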
I have a numpy array with xy co-ordinates for points. I have plotted each of these points and want a line connecting each point to every other point (a complete graph). The array is a 2x50 structure so I have transposed it and used a view to let me iterate through the rows. However, I am getting an 'index out of bounds' error with the following:
plt.plot(*zip(*v.T)) #to plot all the points
viewVX = (v[0]).T
viewVY = (v[1]).T
for i in range(0, 49):
    xPoints = viewVX[i], viewVX[i+1]
    print("xPoints is", xPoints)
    yPoints = viewVY[i+2], viewVY[i+3]
    print("yPoints is", yPoints)
    xy = xPoints, yPoints
    plt.plot(*zip(*xy), ls ='-')
I was hoping that the indexing would 'wrap-around' so that for the ypoints, it'd start with y0, y1 etc. Is there an easier way to accomplish what I'm trying to achieve?
import matplotlib.pyplot as plt
import numpy as np
import itertools
v=np.random.random((2,50))
plt.plot(
*zip(*itertools.chain.from_iterable(itertools.combinations(v.T,2))),
marker='o', markerfacecolor='red')
plt.show()
The advantage of doing it this way is that there are fewer calls to plt.plot. This should be significantly faster than methods that make O(N**2) calls to plt.plot.
Note also that you do not need to plot the points separately. Instead, you can use the marker='o' parameter.
Explanation: I think the easiest way to understand this code is to see how it operates on a simple v:
In [4]: import numpy as np
In [5]: import itertools
In [7]: v=np.arange(8).reshape(2,4)
In [8]: v
Out[8]:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
itertools.combinations(...,2) generates all possible pairs of points:
In [10]: list(itertools.combinations(v.T,2))
Out[10]:
[(array([0, 4]), array([1, 5])),
(array([0, 4]), array([2, 6])),
(array([0, 4]), array([3, 7])),
(array([1, 5]), array([2, 6])),
(array([1, 5]), array([3, 7])),
(array([2, 6]), array([3, 7]))]
Now we use itertools.chain.from_iterable to convert this list of pairs of points into a (flattened) list of points:
In [11]: list(itertools.chain.from_iterable(itertools.combinations(v.T,2)))
Out[11]:
[array([0, 4]),
array([1, 5]),
array([0, 4]),
array([2, 6]),
array([0, 4]),
array([3, 7]),
array([1, 5]),
array([2, 6]),
array([1, 5]),
array([3, 7]),
array([2, 6]),
array([3, 7])]
If we plot these points one after another, connected by lines, we get our complete graph. The only problem is that plt.plot(x,y) expects x to be a sequence of x-values, and y to be a sequence of y-values.
We can use zip to convert the list of points into a list of x-values and y-values:
In [12]: zip(*itertools.chain.from_iterable(itertools.combinations(v.T,2)))
Out[12]: [(0, 1, 0, 2, 0, 3, 1, 2, 1, 3, 2, 3), (4, 5, 4, 6, 4, 7, 5, 6, 5, 7, 6, 7)]
The use of the splat operator (*) in zip and plt.plot is explained here.
Thus we've managed to massage the data into the right form to be fed to plt.plot.
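The same two sequences can also be built without itertools. This is a numpy-only sketch that swaps in np.triu_indices for itertools.combinations (my substitution, not part of the answer above); it uses the same toy v, and xs and ys are illustrative names.

```python
import numpy as np

v = np.arange(8).reshape(2, 4)   # same toy array: row 0 = x, row 1 = y

# Index pairs (i, j) with i < j, in the same order as combinations().
i, j = np.triu_indices(v.shape[1], k=1)

# Interleave the two endpoints of every pair: p_i, p_j, p_i, p_j, ...
xs = np.stack((v[0, i], v[0, j]), axis=1).ravel()
ys = np.stack((v[1, i], v[1, j]), axis=1).ravel()

print(xs)   # [0 1 0 2 0 3 1 2 1 3 2 3]
print(ys)   # [4 5 4 6 4 7 5 6 5 7 6 7]
```

xs and ys can then be passed to plt.plot(xs, ys, marker='o') in a single call.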
With a 2 by 50 array,
for i in range(0, 49):
    xPoints = viewVX[i], viewVX[i+1]
    print("xPoints is", xPoints)
    yPoints = viewVY[i+2], viewVY[i+3]
would get out of bounds for i = 47 and i = 48 since you use i+2 and i+3 as indices into viewVY.
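Numpy indexing does not wrap around past the end on its own. If wrap-around is really what is wanted, it has to be asked for explicitly. A small sketch with a made-up y array standing in for viewVY:

```python
import numpy as np

y = np.arange(50) * 10   # made-up stand-in for viewVY, 50 values
n = y.size
i = 48

# y[i + 2] would raise IndexError: index 50 is out of bounds for size 50.
wrapped_mod = y[(i + 2) % n]                     # index 50 wraps to 0
wrapped_take = np.take(y, i + 3, mode='wrap')    # index 51 wraps to 1

print(wrapped_mod, wrapped_take)   # 0 10
```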
This is what I came up with, but I hope someone comes up with something better.
def plot_complete(v):
    for x1, y1 in v.T:
        for x2, y2 in v.T:
            plt.plot([x1, x2], [y1, y2], 'b')
    plt.plot(v[0], v[1], 'sr')
The 'b' makes the lines blue, and 'sr' marks the points with red squares.
Have figured it out. Basically used the simplified syntax provided by @Bago for plotting and considered @Daniel's indexing tip. Just have to iterate through each xy pair of points and construct a new set of xx', yy' points to send to plt.plot():
viewVX = (v[0]).T #this is if your matrix is 2x100 ie row [0] is x and row[1] is y
viewVY = (v[1]).T
for i in range(0, v.shape[1]): #v.shape[1] gives the number of columns
    for j in range(0, v.shape[1]):
        xPoints = viewVX[j], viewVX[i]
        yPoints = viewVY[j], viewVY[i]
        xy = [xPoints, yPoints] #tuple/array of xx, yy point
        #print("xy points are", xy)
        plt.plot(xy[0],xy[1], ls ='-')
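One observation on the double loop: it visits every ordered pair, so each edge is drawn twice and each point is also "connected" to itself. A sketch of enumerating each edge only once with itertools.combinations, using random data as a stand-in for v (pairs, edge_x and edge_y are illustrative names):

```python
import itertools
import numpy as np

v = np.random.random((2, 50))   # stand-in data: row 0 = x, row 1 = y
n = v.shape[1]

# Each unordered pair of column indices appears exactly once, with i != j.
pairs = list(itertools.combinations(range(n), 2))

# Endpoint coordinates of every edge, shape (number of edges, 2).
i, j = np.array(pairs).T
edge_x = np.stack((v[0, i], v[0, j]), axis=1)
edge_y = np.stack((v[1, i], v[1, j]), axis=1)

print(len(pairs), n * n)   # 1225 2500
```

Looping over zip(edge_x, edge_y) and calling plt.plot(xs, ys) for each pair then draws 1225 segments instead of 2500.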