Position variances and covariances in matrix form - pandas

We are given the following sets of data: A, B, C, which represent variances, and D, E, F, which represent covariances. I would like to arrange these sets of data in matrix form:
matrix: Z Y X
Z A D F
Y D B E
X F E C
How can I arrange the sets of data in matrix form, considering that I don't know the number of variances/covariances in advance?
Then I would like to multiply the resulting matrix:
matrix* (G,H,I) * (G
H
I)
The second question is: how do I multiply a matrix of dimensions 3*3 by vectors of dimensions 1*3 and 3*1?

You can use numpy.matrix and numpy.array to create your own matrices and arrays:
In [1]: import numpy as np
matrix1 = np.matrix([[1, 4, 6], [4, 2, 5],[6, 5, 3]])
array1 = np.array([7,8,9])
Second question: now use numpy.transpose to build the square matrix (the outer product of array1 with itself):
In [2]: matrix2 = array1*np.transpose([array1])
In [3]: matrix2
Out[3]: array([[49, 56, 63],
[56, 64, 72],
[63, 72, 81]])
Finally, multiply both matrices with numpy.matmul:
In [4]: matrix3 = np.matmul(matrix1, matrix2)
In [5]: matrix3
Out[5]: matrix([[651, 744, 837],
[623, 712, 801],
[763, 872, 981]])
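For the first question (an unknown number of variances and covariances), one possible sketch is below. It assumes the covariances are supplied in row-major upper-triangle order, which for the 3x3 layout in the question is D, F, E. The helper name build_cov_matrix and the sample numbers are illustrative and not from the original post. It also treats (G, H, I) as a 1D weight vector w and computes the scalar w · M · w, which may or may not be the product the question intends.

```python
import numpy as np

def build_cov_matrix(variances, covariances):
    """Hypothetical helper: symmetric matrix from a diagonal and an upper triangle."""
    variances = np.asarray(variances, dtype=float)
    covariances = np.asarray(covariances, dtype=float)
    n = variances.size
    if covariances.size != n * (n - 1) // 2:
        raise ValueError("expected n*(n-1)/2 covariances for n variances")
    cov = np.diag(variances)               # variances on the diagonal
    rows, cols = np.triu_indices(n, k=1)   # upper-triangle positions, row-major
    cov[rows, cols] = covariances          # fill the upper triangle
    cov[cols, rows] = covariances          # mirror into the lower triangle
    return cov

# Sample values matching matrix1 above: A, B, C = 1, 2, 3 and D, F, E = 4, 6, 5
cov = build_cov_matrix([1, 2, 3], [4, 6, 5])
w = np.array([7.0, 8.0, 9.0])              # stands in for (G, H, I)
quad = w @ cov @ w                         # scalar quadratic form
print(cov)
print(quad)
```

Because the size check ties the number of covariances to the number of variances, the same function works for any n without hard-coding the 3x3 case.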

Related

Is a new transposed vector in NumPy a column or row vector by default?

I'm a new learner in the Python language. When a new vector is generated, is it a column or row vector by default?
import numpy as np
theta = np.arange(3)
a = len(theta.T)
b = len(theta)
print('theta = {} \n theta.T = {}'.format(theta,theta.T))
c = theta.T.dot(theta)
d = theta.dot(theta.T)
It turns out a == b == 3, c == d, and both theta and theta.T are displayed as a row vector.
But this matters when I want to calculate the derivative of the symbolic function x · x^T with x a row vector.
Neither: it is a 1D array, as its shape shows:
>>> theta.shape
(3,)
A column vector would have a shape equal to (3,1), and a row vector (1,3). You can create either one by changing the shape:
>>> theta.shape = (1,3)
>>> theta
array([[0, 1, 2]])
>>> theta.shape = (3,1)
>>> theta
array([[0],
[1],
[2]])
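As a small sketch of the same idea without modifying theta in place, reshape returns a new 2D view, and the two orientations then give different products (which is why c == d in the question, where everything was 1D). The variable names row, col, inner, and outer are illustrative.

```python
import numpy as np

theta = np.arange(3)          # 1D array, shape (3,)
row = theta.reshape(1, -1)    # row vector, shape (1, 3)
col = theta.reshape(-1, 1)    # column vector, shape (3, 1)

inner = row @ col             # (1, 3) @ (3, 1) -> shape (1, 1)
outer = col @ row             # (3, 1) @ (1, 3) -> shape (3, 3)

print(theta.shape, row.shape, col.shape)
print(inner)
print(outer)
```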

Extracting diagonals (sideways down) from a 5x5x5 array using einsum

I only managed to extract one diagonal using Numpy einsum. How do I get the other diagonals like [6, 37, 68, 99] with help of einsum?
x = np.arange(1, 26 ).reshape(5,5)
y = np.arange(26, 51).reshape(5,5)
z = np.arange(51, 76).reshape(5,5)
t = np.arange(76, 101).reshape(5,5)
p = np.arange(101, 126).reshape(5,5)
a4 = np.array([x, y, z, t, p])
Extracting one diagonal:
>>> print(np.einsum('iii->i', a4))
[  1  32  63  94 125]
I don't have any "easy" solution using einsum but it is quite simple with a for loop:
import numpy as np
# Generation of a 3x3x3 matrix
x = np.arange(1 , 10).reshape(3,3)
y = np.arange(11, 20).reshape(3,3)
z = np.arange(21, 30).reshape(3,3)
M = np.array([x, y, z])
# Generation of the index
I = np.arange(0,len(M))
# Generation of all the possible diagonals
for ii in [1,-1]:
    for jj in [1,-1]:
        print(M[I[::ii],I[::jj],I])
# OUTPUT:
# [ 1 15 29]
# [ 7 15 23]
# [21 15 9]
# [27 15 3]
We fix the index of the last dimension and we find all the possible combinations of backward and forward indexing for the other dimensions.
Do you realize that this einsum is the same as:
In [64]: a4=np.arange(1,126).reshape(5,5,5)
In [65]: i=np.arange(5)
In [66]: a4[i,i,i]
Out[66]: array([ 1, 32, 63, 94, 125])
It should be easy to tweak the indices to get other diagonals.
In [73]: a4[np.arange(4),np.arange(1,5),np.arange(4)]
Out[73]: array([ 6, 37, 68, 99])
That 'iii->i' producing the main diagonal is more of a happy accident than a designed feature. Don't try to push it.
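If einsum is still wanted, one option is to slice the array first so that the desired diagonal becomes the main diagonal of the slice. This is only a sketch of that idea; the names sub and diag are illustrative.

```python
import numpy as np

a4 = np.arange(1, 126).reshape(5, 5, 5)

# Offset the second axis by one so that sub[i, i, i] == a4[i, i + 1, i]
sub = a4[:4, 1:, :4]
diag = np.einsum('iii->i', sub)
print(diag)  # [ 6 37 68 99]
```

The slicing does the real work here; einsum only reads the main diagonal of the shifted sub-array.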

Remove duplicates with additional requirements

I have three columns (x, y, m), where x and y are coordinates and m is the measurement. There are some duplicates, defined as rows with the same (x, y). Among each group of duplicates, I want to keep only the one with the minimum m. Here is an example:
x = np.array([1,1,2,2,1,1,2])
y = np.array([1,2,1,2,1,1,1])
m = np.array([10,2,13,4,6,15,7])
There are three duplicates with the same coordinates (1,1); among the three, the minimum m is 6. There are two duplicates with the same coordinates (2,1); among the two, the minimum m is 7. So the final result I want is:
x = np.array([1,2,1,2])
y = np.array([2,2,1,1])
m = np.array([2,4,6,7])
numpy.unique alone cannot handle this situation. Any ideas?
We could use pandas here for a cleaner solution:
import pandas as pd
In [43]: df = pd.DataFrame({'x':x,'y':y,'m':m})
# idxmin returns index labels, so select rows with the label-based .loc
In [46]: out_df = df.loc[df.groupby(['x','y'])['m'].idxmin()]
# Format #1 : Final output as a 2D array
In [47]: out_df.values
Out[47]:
array([[1, 1, 6],
[1, 2, 2],
[2, 1, 7],
[2, 2, 4]])
# Format #2 : Final output as three separate 1D arrays
In [50]: X,Y,M = out_df.values.T
In [51]: X
Out[51]: array([1, 1, 2, 2])
In [52]: Y
Out[52]: array([1, 2, 1, 2])
In [53]: M
Out[53]: array([6, 2, 7, 4])
You can try something like this:
import collections
import numpy as np
x = np.array([1,1,2,2,1,1,2])
y = np.array([1,2,1,2,1,1,1])
m = np.array([10,2,13,4,6,15,7])
# Build an "x,y" string key for every point
coords = [str(x[i]) + ',' + str(y[i]) for i in range(len(x))]
results = collections.OrderedDict()
# Keep the smallest measurement seen so far for each coordinate key
for key, value in zip(coords, m):
    if key not in results or value < results[key]:
        results[key] = value
x = np.array([int(key.split(',')[0]) for key in results])
y = np.array([int(key.split(',')[1]) for key in results])
m = np.array(list(results.values()))
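A NumPy-only alternative can be sketched with np.lexsort: sort by x, then y, then m, and keep the first row of each (x, y) group, which is then the one with the smallest m. The variable names order, first, x_out, y_out, and m_out are illustrative, and the output comes out sorted by (x, y) rather than in the order shown in the question.

```python
import numpy as np

x = np.array([1, 1, 2, 2, 1, 1, 2])
y = np.array([1, 2, 1, 2, 1, 1, 1])
m = np.array([10, 2, 13, 4, 6, 15, 7])

# lexsort uses the LAST key as the primary one: sort by x, then y, then m
order = np.lexsort((m, y, x))
xs, ys, ms = x[order], y[order], m[order]

# A row starts a new group when its (x, y) differs from the previous row
first = np.ones(len(xs), dtype=bool)
first[1:] = (xs[1:] != xs[:-1]) | (ys[1:] != ys[:-1])

x_out, y_out, m_out = xs[first], ys[first], ms[first]
print(x_out, y_out, m_out)
```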

Multiply every row of a matrix with every row of another matrix

In NumPy / PyTorch, I have two matrices, e.g. X=[[1,2],[3,4],[5,6]], Y=[[1,1],[2,2]]. I would like to take the dot product of every row of X with every row of Y, and get the result
[[3, 6],[7, 14], [11,22]]
How do I achieve this? Thanks!
I think this is what you are looking for:
import numpy as np
x= [[1,2],[3,4],[5,6]]
y= [[1,1],[2,2]]
x = np.asarray(x) #convert list to numpy array
y = np.asarray(y) #convert list to numpy array
product = np.dot(x, y.T)
.T transposes the matrix, which is necessary here because a matrix product pairs the rows of the first operand with the columns of the second. print(product) will output:
[[ 3 6]
[ 7 14]
[11 22]]
Using einsum:
np.einsum('ij,kj->ik', X, Y)
array([[ 3, 6],
[ 7, 14],
[11, 22]])
In PyTorch, you can achieve this using torch.mm(a, b) or torch.matmul(a, b), as shown below:
import numpy as np
import torch
x = np.array([[1,2],[3,4],[5,6]])
y = np.array([[1,1],[2,2]])
x = torch.from_numpy(x)
y = torch.from_numpy(y)
# print(torch.matmul(x, torch.t(y)))
print(torch.mm(x, torch.t(y)))
output:
tensor([[ 3, 6],
[ 7, 14],
[11, 22]], dtype=torch.int32)
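To make the "every row with every row" pairing explicit, the same result can be sketched in NumPy with broadcasting, then compared against the matrix product and the einsum form. The name pairwise is illustrative.

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])
Y = np.array([[1, 1], [2, 2]])

# X[:, None, :] has shape (3, 1, 2) and Y[None, :, :] has shape (1, 2, 2);
# broadcasting pairs every row of X with every row of Y, and summing over
# the last axis turns each elementwise product into a dot product.
pairwise = (X[:, None, :] * Y[None, :, :]).sum(axis=-1)

print(pairwise)
print(np.array_equal(pairwise, X @ Y.T))
print(np.array_equal(pairwise, np.einsum('ij,kj->ik', X, Y)))
```

The broadcasting version builds an intermediate array of shape (3, 2, 2), so for large inputs the matrix product is usually the more economical choice.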

Transpose of a vector using numpy

I am having an issue with IPython and NumPy. I want to do the following operation:
x^T.x
where x^T is the transpose of the vector x. x is read from a text file with the instruction:
x = np.loadtxt('myfile.txt')
The problem is that if I use the transpose function
np.transpose(x)
and use the shape attribute to check the size of x, I get the same dimensions for x and x^T. NumPy prints the size with an uppercase L suffix after each dimension, e.g.
print x.shape
print np.transpose(x).shape
(3L, 5L)
(3L, 5L)
Does anybody know how to solve this, and compute x^T.x as a matrix product?
Thank you!
What np.transpose does is reverse the shape tuple: if you feed it an array of shape (m, n), it returns an array of shape (n, m); if you feed it an array of shape (n,), it returns the same array with shape (n,).
What you are implicitly expecting is for numpy to take your 1D vector as a 2D array of shape (1, n), that will get transposed into a (n, 1) vector. Numpy will not do that on its own, but you can tell it that's what you want, e.g.:
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> a.T
array([0, 1, 2, 3])
>>> a[np.newaxis, :].T
array([[0],
[1],
[2],
[3]])
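As explained by others, transposition won't "work" the way you want it to for 1D arrays.
You might want to use np.atleast_2d to get a consistent 2D product definition:
def vprod(x):
    # A 1D x of length n becomes y with shape (1, n)
    y = np.atleast_2d(x)
    # (n, 1) times (1, n): for 1D input this is the (n, n) outer product;
    # use np.dot(y, y.T) instead for the (1, 1) inner product
    return np.dot(y.T, y)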
I had the same problem, I used numpy matrix to solve it:
# assuming x is a list or a numpy 1d-array
>>> x = [1,2,3,4,5]
# convert it to a numpy matrix
>>> x = np.matrix(x)
>>> x
matrix([[1, 2, 3, 4, 5]])
# take the transpose of x
>>> x.T
matrix([[1],
[2],
[3],
[4],
[5]])
# use * for the matrix product
>>> x*x.T
matrix([[55]])
>>> (x*x.T)[0,0]
55
>>> x.T*x
matrix([[ 1, 2, 3, 4, 5],
[ 2, 4, 6, 8, 10],
[ 3, 6, 9, 12, 15],
[ 4, 8, 12, 16, 20],
[ 5, 10, 15, 20, 25]])
While using numpy matrices may not be the best way to represent your data from a coding perspective, it's pretty good if you are going to do a lot of matrix operations!
For starters, L just means that the type is a long int. This shouldn't be an issue. You'll have to give additional information about your problem, though, since I cannot reproduce it with a simple test case:
In [1]: import numpy as np
In [2]: a = np.arange(12).reshape((4,3))
In [3]: a
Out[3]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [4]: a.T #same as np.transpose(a)
Out[4]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
In [5]: a.shape
Out[5]: (4, 3)
In [6]: np.transpose(a).shape
Out[6]: (3, 4)
There is likely something subtle going on with your particular case which is causing problems. Can you post the contents of the file that you're reading into x?
This is either the inner or outer product of the two vectors, depending on the orientation you assign to them. Here is how to calculate either without changing x.
import numpy
x = numpy.array([1, 2, 3])
inner = x.dot(x)
outer = numpy.outer(x, x)
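A short sketch of what each expression yields for a 1D array may help, since the elementwise * is what produces a vector where a scalar was expected. The name elementwise is illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])

# For a 1D array x.T is x itself, so this is just elementwise squaring
elementwise = x * x.T
# Inner product: a scalar
inner = x @ x
# Outer product: a (3, 3) matrix
outer = np.outer(x, x)

print(elementwise)
print(inner)
print(outer)
```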
The file 'myfile.txt' contains lines such as
5.100000 3.500000 1.400000 0.200000 1
4.900000 3.000000 1.400000 0.200000 1
Here is the code I run:
import numpy as np
data = np.loadtxt('iris.txt')
x = data[1,:]
print x.shape
print np.transpose(x).shape
print x*np.transpose(x)
print np.transpose(x)*x
And I get as a result
(5L,)
(5L,)
[ 24.01 9. 1.96 0.04 1. ]
[ 24.01 9. 1.96 0.04 1. ]
I would expect one of the last two results to be a scalar instead of a vector, because x^T.x (or x.x^T) should give a scalar.
b = np.array([1, 2, 2])
print(b)
print(np.transpose([b]))
print("rows, cols: ", b.shape)
print("rows, cols: ", np.transpose([b]).shape)
Results in
[1 2 2]
[[1]
[2]
[2]]
rows, cols: (3,)
rows, cols: (3, 1)
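Here (3,) is the shape of a 1D array: it has a single axis of length 3 and no row or column orientation, unlike the 2D shapes (3, 1) and (1, 3).
However, if you want the transpose of a matrix A, np.transpose(A) is the solution. In short, wrapping an array in [] adds a leading axis: a vector becomes a matrix, and a matrix becomes a higher-dimensional tensor.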