How to use a numpy array with fromiter - numpy

I tried to use a numpy array with fromiter, but it gave this error:
import numpy
l=numpy.dtype([("Ad","S20"),("Yas","i4"),("Derecelendirme","f")])
a=numpy.array([("Dr.Wah",20,0.9)])
d=numpy.fromiter(a,dtype=l,count=3)
print(d)
ValueError: setting an array element with a sequence.

In [172]: dt=np.dtype([("Ad","S20"),("Yas","i4"),("Derecelendirme","f")])
...: alist = [("Dr.Wah",20,0.9)]
The normal way to define a structured array is to use a list of tuples for the data along with the dtype:
In [173]: np.array( alist, dtype=dt)
Out[173]:
array([(b'Dr.Wah', 20, 0.9)],
dtype=[('Ad', 'S20'), ('Yas', '<i4'), ('Derecelendirme', '<f4')])
fromiter works as well, but isn't as common:
In [174]: np.fromiter( alist, dtype=dt)
Out[174]:
array([(b'Dr.Wah', 20, 0.9)],
dtype=[('Ad', 'S20'), ('Yas', '<i4'), ('Derecelendirme', '<f4')])
If you create an array without the dtype:
In [175]: a = np.array(alist)
In [176]: a
Out[176]: array([['Dr.Wah', '20', '0.9']], dtype='<U6')
In [177]: _.shape
Out[177]: (1, 3)
a.astype(dt) does not work. You have to use a recfunction:
In [179]: import numpy.lib.recfunctions as rf
In [180]: rf.unstructured_to_structured(a, dtype=dt)
Out[180]:
array([(b'Dr.Wah', 20, 0.9)],
dtype=[('Ad', 'S20'), ('Yas', '<i4'), ('Derecelendirme', '<f4')])
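As a minimal sketch of the two constructions above, the following builds the same structured array from a list of tuples and from a generator via fromiter. The second record and the generator name records are hypothetical, added only for illustration; passing count lets fromiter preallocate the output.

```python
import numpy as np

dt = np.dtype([("Ad", "S20"), ("Yas", "i4"), ("Derecelendirme", "f4")])

def records():
    # Hypothetical sample data; only the first record comes from the question.
    yield ("Dr.Wah", 20, 0.9)
    yield ("Ms.Lee", 31, 0.7)

# Usual route: a list of tuples plus an explicit structured dtype.
from_list = np.array(list(records()), dtype=dt)

# fromiter consumes the iterator one record (tuple) at a time.
from_iter = np.fromiter(records(), dtype=dt, count=2)

print(from_list["Ad"], from_iter["Yas"])
```

Each field can then be read by name, for example from_iter["Yas"].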

Related

Split Pandas series or dataframes to individual elements

**I would like to convert mat8 into individual elements. Does Python have a specific method that I can use?**
mat8 is a numpy array of size 1
mat8 is a nested array (arrays within arrays)
In [1]: mat8
Out[1]:
array([(array([[(array([u'hg_press'], dtype='<U9'), array([[24]], dtype=uint8), array([[4.0040e+03, 8.0020e+00, 4.0000e+00, 5.2000e+01, 4.5000e+01,
5.1763e+01]]), array([[(array([[1.54348742]]),)]], dtype=[('Capacity', 'O')]))]],
dtype=[('type', 'O'), ('ambient_pressure', 'O'), ('time', 'O'), ('data', 'O')]),)],
dtype=[('cycle', 'O')])
In [2]: type(mat8)
Out[2]: numpy.ndarray
In [3]: mat8.size
Out[3]: 1

Set value of specific cell in pandas dataframe to sum of two other cells

I have a dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame(
    data={'X': [1.5, 6.777, 2.444, np.nan],
          'Y': [1.111, np.nan, 8.77, np.nan],
          'Z': [5.0, 2.333, 10, 6.6666]})
I think this should work, but I get the following error:
df.at[1,'Z'] =(df.loc[[2],'X'] +df.loc[[0],'Y'])
ValueError: setting an array element with a sequence.
How can I achieve this?
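This should work, because indexing with scalar labels (2 rather than [2]) returns scalars instead of one-element Series: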
df.loc[1, 'Z'] = df.loc[2,'X'] + df.loc[0,'Y']
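To illustrate the difference, here is a small sketch using the dataframe from the question. The variable names series_sum and scalar_sum are introduced here only for illustration. List-style labels return Series, and adding two Series aligns them on their index labels, so the result is a two-element Series rather than a single number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={'X': [1.5, 6.777, 2.444, np.nan],
          'Y': [1.111, np.nan, 8.77, np.nan],
          'Z': [5.0, 2.333, 10, 6.6666]})

# List labels give Series; the sum aligns on labels 0 and 2, and both entries are NaN.
series_sum = df.loc[[2], 'X'] + df.loc[[0], 'Y']
print(type(series_sum), len(series_sum))

# Scalar labels give plain numbers, which can be assigned to a single cell.
scalar_sum = df.loc[2, 'X'] + df.loc[0, 'Y']
df.loc[1, 'Z'] = scalar_sum
print(df.loc[1, 'Z'])
```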

How can I get or create the equivalent of numpy's amax function with rust's ndarray crate?

I want to get or create numpy's amax functionality in Rust with the ndarray crate. So basically, I want to be able to get this behaviour in Rust:
>>> import numpy as np
>>> arr = np.array(range(1, 28)).reshape((3, 3, 3))
>>> np.amax(arr, axis=1)
array([[ 7, 8, 9],
[16, 17, 18],
[25, 26, 27]])
As far as I can see, there's no amax function in ndarray. How can I get the same functionality as mentioned in the above example with this ndarray array:
let arr = Array3::from_shape_vec((3, 3, 3), (1..28).collect()).unwrap();
You can do this using map_axis and max. Use Axis(1) to match axis=1 in the numpy example:
let a_max = arr.map_axis(Axis(1), |view| *view.iter().max().unwrap());
You can see a working example in the playground

Transpose of a vector using numpy

I am having an issue with Ipython - Numpy. I want to do the following operation:
x^T.x
where x is a vector and x^T is its transpose. x is extracted from a txt file with the instruction:
x = np.loadtxt('myfile.txt')
The problem is that if I use the transpose function
np.transpose(x)
and use the shape attribute to get the size of x, I get the same dimensions for x and x^T. Numpy prints the size with an uppercase L suffix after each dimension, e.g.
print x.shape
print np.transpose(x).shape
(3L, 5L)
(3L, 5L)
Does anybody know how to solve this, and compute x^T.x as a matrix product?
Thank you!
What np.transpose does is reverse the shape tuple: if you feed it an array of shape (m, n), it returns an array of shape (n, m); if you feed it an array of shape (n,), it returns the same array with shape (n,).
What you are implicitly expecting is for numpy to take your 1D vector as a 2D array of shape (1, n), that will get transposed into a (n, 1) vector. Numpy will not do that on its own, but you can tell it that's what you want, e.g.:
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> a.T
array([0, 1, 2, 3])
>>> a[np.newaxis, :].T
array([[0],
[1],
[2],
[3]])
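Building on that, here is a minimal sketch of how the explicit row and column shapes give both products. The names row, col, scalar and outer are illustrative only.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Transposing a 1D array is a no-op: the shape stays (3,).
print(x.T.shape)

row = x[np.newaxis, :]   # shape (1, 3)
col = row.T              # shape (3, 1)

# (1, 3) @ (3, 1) gives a 1x1 matrix holding the scalar product.
scalar = (row @ col)[0, 0]

# (3, 1) @ (1, 3) gives the 3x3 outer product.
outer = col @ row

print(scalar, outer.shape)
```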
As explained by others, transposition won't "work" like you want it to for 1D arrays.
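You might want to use np.atleast_2d so that 1D and 2D inputs are handled consistently:
def vprod(x):
    # a 1D x of length n becomes a (1, n) row, so y.T is an (n, 1) column
    y = np.atleast_2d(x)
    # (n, 1) times (1, n) gives an (n, n) matrix; use np.dot(y, y.T) for the 1x1 scalar product
    return np.dot(y.T, y)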
I had the same problem, I used numpy matrix to solve it:
# assuming x is a list or a numpy 1d-array
>>> x = [1,2,3,4,5]
# convert it to a numpy matrix
>>> x = np.matrix(x)
>>> x
matrix([[1, 2, 3, 4, 5]])
# take the transpose of x
>>> x.T
matrix([[1],
[2],
[3],
[4],
[5]])
# use * for the matrix product
>>> x*x.T
matrix([[55]])
>>> (x*x.T)[0,0]
55
>>> x.T*x
matrix([[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]])
While using numpy matrices may not be the best way to represent your data from a coding perspective, it's pretty good if you are going to do a lot of matrix operations!
For starters, L just means that the type is a long int. This shouldn't be an issue. You'll have to give additional information about your problem, though, since I cannot reproduce it with a simple test case:
In [1]: import numpy as np
In [2]: a = np.arange(12).reshape((4,3))
In [3]: a
Out[3]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [4]: a.T #same as np.transpose(a)
Out[4]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
In [5]: a.shape
Out[5]: (4, 3)
In [6]: np.transpose(a).shape
Out[6]: (3, 4)
There is likely something subtle going on with your particular case which is causing problems. Can you post the contents of the file that you're reading into x?
This is either the inner or outer product of the two vectors, depending on the orientation you assign to them. Here is how to calculate either without changing x.
import numpy
x = numpy.array([1, 2, 3])
inner = x.dot(x)
outer = numpy.outer(x, x)
The file 'myfile.txt' contains lines such as
5.100000 3.500000 1.400000 0.200000 1
4.900000 3.000000 1.400000 0.200000 1
Here is the code I run:
import numpy as np
data = np.loadtxt('iris.txt')
x = data[1,:]
print x.shape
print np.transpose(x).shape
print x*np.transpose(x)
print np.transpose(x)*x
And I get as a result
(5L,)
(5L,)
[ 24.01 9. 1.96 0.04 1. ]
[ 24.01 9. 1.96 0.04 1. ]
I would expect one of the last two results to be a scalar instead of a vector, because x^T.x (or x.x^T) should give a scalar.
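The output above is consistent with numpy's elementwise semantics: on a 1D array, np.transpose changes nothing and * multiplies element by element. A small sketch using the second sample row shown above (the variable names are illustrative):

```python
import numpy as np

x = np.array([4.9, 3.0, 1.4, 0.2, 1.0])

# Elementwise product: same shape as x, matching the printed result.
elementwise = x * np.transpose(x)

# The scalar (inner) product needs np.dot, or equivalently x @ x.
inner = np.dot(x, x)

# The matrix (outer) product needs np.outer.
outer = np.outer(x, x)

print(elementwise, inner, outer.shape)
```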
b = np.array([1, 2, 2])
print(b)
print(np.transpose([b]))
print("rows, cols: ", b.shape)
print("rows, cols: ", np.transpose([b]).shape)
Results in
[1 2 2]
[[1]
[2]
[2]]
rows, cols: (3,)
rows, cols: (3, 1)
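Here (3,) is the shape of a 1D array with three elements. It has no second axis, which is why transposing it directly changes nothing.
If you want the transpose of a matrix A, np.transpose(A) is the solution. In short, wrapping in [] adds a leading axis: it turns a vector into a one-row matrix, and a matrix into a higher-dimensional tensor.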

connecting all numpy array plot points to each other using plt.plot() from matplotlib

I have a numpy array with xy co-ordinates for points. I have plotted each of these points and want a line connecting each point to every other point (a complete graph). The array is a 2x50 structure so I have transposed it and used a view to let me iterate through the rows. However, I am getting an 'index out of bounds' error with the following:
plt.plot(*zip(*v.T)) #to plot all the points
viewVX = (v[0]).T
viewVY = (v[1]).T
for i in range(0, 49):
    xPoints = viewVX[i], viewVX[i+1]
    print("xPoints is", xPoints)
    yPoints = viewVY[i+2], viewVY[i+3]
    print("yPoints is", yPoints)
    xy = xPoints, yPoints
    plt.plot(*zip(*xy), ls ='-')
I was hoping that the indexing would 'wrap-around' so that for the ypoints, it'd start with y0, y1 etc. Is there an easier way to accomplish what I'm trying to achieve?
import matplotlib.pyplot as plt
import numpy as np
import itertools
v=np.random.random((2,50))
plt.plot(
*zip(*itertools.chain.from_iterable(itertools.combinations(v.T,2))),
marker='o', markerfacecolor='red')
plt.show()
The advantage of doing it this way is that there are fewer calls to plt.plot. This should be significantly faster than methods that make O(N**2) calls to plt.plot.
Note also that you do not need to plot the points separately. Instead, you can use the marker='o' parameter.
Explanation: I think the easiest way to understand this code is to see how it operates on a simple v:
In [4]: import numpy as np
In [5]: import itertools
In [7]: v=np.arange(8).reshape(2,4)
In [8]: v
Out[8]:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
itertools.combinations(...,2) generates all possible pairs of points:
In [10]: list(itertools.combinations(v.T,2))
Out[10]:
[(array([0, 4]), array([1, 5])),
(array([0, 4]), array([2, 6])),
(array([0, 4]), array([3, 7])),
(array([1, 5]), array([2, 6])),
(array([1, 5]), array([3, 7])),
(array([2, 6]), array([3, 7]))]
Now we use itertools.chain.from_iterable to convert this list of pairs of points into a (flattened) list of points:
In [11]: list(itertools.chain.from_iterable(itertools.combinations(v.T,2)))
Out[11]:
[array([0, 4]),
array([1, 5]),
array([0, 4]),
array([2, 6]),
array([0, 4]),
array([3, 7]),
array([1, 5]),
array([2, 6]),
array([1, 5]),
array([3, 7]),
array([2, 6]),
array([3, 7])]
If we plot these points one after another, connected by lines, we get our complete graph. The only problem is that plt.plot(x,y) expects x to be a sequence of x-values, and y to be a sequence of y-values.
We can use zip to convert the list of points into a list of x-values and y-values:
In [12]: list(zip(*itertools.chain.from_iterable(itertools.combinations(v.T,2))))
Out[12]: [(0, 1, 0, 2, 0, 3, 1, 2, 1, 3, 2, 3), (4, 5, 4, 6, 4, 7, 5, 6, 5, 7, 6, 7)]
The splat operator (*) in zip and plt.plot unpacks a sequence into separate positional arguments.
Thus we've managed to massage the data into the right form to be fed to plt.plot.
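The whole pipeline can be checked without a display. This sketch reuses the small v from the walkthrough and only prepares the x and y sequences; the conversion to plain int is added here only to keep the printed output simple.

```python
import itertools
import numpy as np

v = np.arange(8).reshape(2, 4)

# All unordered pairs of points (columns of v).
pairs = itertools.combinations(v.T, 2)

# Flatten the pairs into one long sequence of points.
points = itertools.chain.from_iterable(pairs)

# Split the points into a sequence of x-values and a sequence of y-values.
xs, ys = zip(*points)
xs = [int(x) for x in xs]
ys = [int(y) for y in ys]

print(xs)
print(ys)
```

Passing xs and ys to plt.plot would then draw the twelve points in order, connected by lines.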
With a 2 by 50 array,
for i in range(0, 49):
    xPoints = viewVX[i], viewVX[i+1]
    print("xPoints is", xPoints)
    yPoints = viewVY[i+2], viewVY[i+3]
would get out of bounds for i = 47 and i = 48 since you use i+2 and i+3 as indices into viewVY.
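A quick way to confirm this, and to get the wrap-around the question hoped for, is sketched below. The array viewVY here is a hypothetical stand-in with 50 entries; numpy does not wrap positive indices on its own, so the modulo operator is used explicitly.

```python
import numpy as np

n = 50
viewVY = np.arange(n)  # hypothetical stand-in for the y row of v

# Loop values whose largest index, i + 3, falls outside 0..n-1.
bad = [i for i in range(0, 49) if i + 3 >= n]
print(bad)

# Wrap-around has to be requested explicitly, e.g. with the modulo operator.
wrapped = [(viewVY[(i + 2) % n], viewVY[(i + 3) % n]) for i in range(0, 49)]
print(wrapped[48])
```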
This is what I came up with, but I hope someone comes up with something better.
def plot_complete(v):
    for x1, y1 in v.T:
        for x2, y2 in v.T:
            plt.plot([x1, x2], [y1, y2], 'b')
    plt.plot(v[0], v[1], 'sr')
The 'b' makes the lines blue, and 'sr' marks the points with red squares.
I have figured it out. I used the simplified plotting syntax provided by @Bago and took @Daniel's indexing tip into account. You just have to iterate through each pair of points and build the (x, x') and (y, y') coordinate pairs to send to plt.plot():
viewVX = (v[0]).T #this is if your matrix is 2x100 ie row [0] is x and row[1] is y
viewVY = (v[1]).T
for i in range(0, v.shape[1]): #v.shape[1] gives the number of columns
    for j in range(0, v.shape[1]):
        xPoints = viewVX[j], viewVX[i]
        yPoints = viewVY[j], viewVY[i]
        xy = [xPoints, yPoints] #tuple/array of xx, yy point
        #print("xy points are", xy)
        plt.plot(xy[0],xy[1], ls ='-')
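The double loop above draws every segment twice (once as i, j and once as j, i) and also draws zero-length segments where i equals j. A possible refinement is sketched below, assuming v is a 2 x N array; the helper name complete_graph_segments is hypothetical and only prepares the coordinates.

```python
import itertools
import numpy as np

def complete_graph_segments(v):
    """Return one ((x1, x2), (y1, y2)) pair per unordered pair of columns of v."""
    n = v.shape[1]
    segments = []
    # combinations yields each index pair (i, j) with i < j exactly once.
    for i, j in itertools.combinations(range(n), 2):
        segments.append(((v[0, i], v[0, j]), (v[1, i], v[1, j])))
    return segments

v = np.arange(8).reshape(2, 4)
segments = complete_graph_segments(v)
print(len(segments))
```

Each returned pair can then be drawn with plt.plot(xs, ys, ls='-'), which needs N*(N-1)/2 calls instead of N*N.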