Add x,y Values to numpy Matrix - numpy

So, what I have is a data file in the form of
1 , 1 , 2
2 , 5 , 8
3 , 9 , 10
...
...
In my case, every single triplet is in the form of: value , x-position , y-position.
What I want to achieve is to insert this data into a 2D matrix that I already created using the np.zeros function. However, I am stuck and can't figure out how to write a function that puts the given values at the right x and y positions in the matrix :/
My current Matrix (named matrix) looks like:
array([[0,0,0,...,0]
[0,0,0,...,0]
[... ]
[0,0,0,...,0]])
and if I used matrix[1,1]=2 (first line of data) I would get:
array([[0,0,0,...,0]
[0,2,0,...,0]
[... ]
[0,0,0,...,0]])
My goal is to insert all lines of data in this way.

You can use the np.genfromtxt function [numpy-doc] and pass the comma (',') as the delimiter=… parameter. So, given a file data.txt, you can load it into a numpy array with:
>>> import numpy as np
>>> np.genfromtxt('data.txt', delimiter=',')
array([[ 1., 1., 2.],
[ 2., 5., 8.],
[ 3., 9., 10.]])
Or if you are only interested in the x/y values, you can use the usecols=… parameter:
>>> np.genfromtxt('data.txt', delimiter=',', usecols=(1,2))
array([[ 1., 2.],
[ 5., 8.],
[ 9., 10.]])

You can load the data using genfromtxt():
import numpy as np
tmp = np.genfromtxt('data.txt', delimiter=',', dtype=int)
and then generate an empty data matrix a from the first two columns of tmp
a = np.zeros(np.max(tmp[:, :2], axis=0) + 1)
and populate it with values from tmp
a[tmp[:, 0], tmp[:, 1]] = tmp[:, 2]
a
# array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
# [ 0., 2., 0., 0., 0., 0., 0., 0., 0., 0.],
# [ 0., 0., 0., 0., 0., 8., 0., 0., 0., 0.],
# [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 10.]])
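The indexing above treats the three columns as row, column, value, which matches the matrix[1,1]=2 example in the question. If the file really is ordered value, x, y, as the question's text states, the columns only need to be reassigned before indexing. A minimal sketch, assuming x is the column index and y the row index (that mapping is an assumption, the question does not confirm it), with the sample data inlined instead of read from data.txt:

```python
import numpy as np

# Sample triplets inlined for illustration; in practice they would come
# from np.genfromtxt('data.txt', delimiter=',', dtype=int).
triplets = np.array([[1, 1, 2],
                     [2, 5, 8],
                     [3, 9, 10]])

# Assumed column order: value, x-position, y-position.
value, x, y = triplets.T

# Assumption: y selects the row and x selects the column.
matrix = np.zeros((y.max() + 1, x.max() + 1))
matrix[y, x] = value

print(matrix[2, 1], matrix[8, 5], matrix[10, 9])
```

If the same position appears more than once, plain assignment keeps only the last value; np.add.at(matrix, (y, x), value) accumulates them instead.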

Related

Defining a 2-d numpy array from values in 3-d numpy array

I have a 3-D numpy array representing a model domain of 39 layers, 279 rows, 153 columns. The values in the array are either 0 or 1 and signify if the cell in the domain is inactive or active, respectively. I am trying to create a 2-D array of shape 279 rows and 153 columns where the array values equal the layer number for the uppermost active layer in the grid. Essentially, at each row, col location I want to loop through the layers to find the first one that is a 1 and not a 0 and then put that layer number in the 2-D array at that row, col location. For example:
If a four layer (layers 0-3) array looks like this:
array([[[ 0., 1., 0., 0.],
[ 1., 0., 0., 0.],
[ 1., 0., 0., 0.]],
[[ 0., 1., 1., 0.],
[ 1., 1., 0., 0.],
[ 1., 1., 0., 0.]],
[[ 0., 0., 1., 1.],
[ 0., 1., 1., 0.],
[ 0., 1., 1., 0.]],
[[ 0., 0., 1., 1.],
[ 0., 1., 1., 1.],
[ 0., 1., 1., 1.]]])
The 2-D array should look like this:
array([[ 0., 0., 1., 2.],
[ 0., 1., 2., 3.],
[ 0., 1., 2., 3.]])
If the row-col location is not active (not equal to 1) in any layer , the value in the resulting array should be 0 (like at 1,1), the same as if it were active in layer 0.
I have tried modifying a couple of solutions where the z-axis values are summed, or averaged, but can't seem to figure out how to get exactly what I am looking for.
You could try numpy.argmax:
import numpy as np
a = np.array([[[ 0., 1., 0., 0.],
[ 1., 0., 0., 0.],
[ 1., 0., 0., 0.]],
[[ 0., 1., 1., 0.],
[ 1., 1., 0., 0.],
[ 1., 1., 0., 0.]],
[[ 0., 0., 1., 1.],
[ 0., 1., 1., 0.],
[ 0., 1., 1., 0.]],
[[ 0., 0., 1., 1.],
[ 0., 1., 1., 1.],
[ 0., 1., 1., 1.]]])
np.argmax(a, axis=0)  # index of the first maximum along axis 0 (the layer axis)
array([[0, 0, 1, 2],
[0, 1, 2, 3],
[0, 1, 2, 3]])
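This works because argmax returns the index of the first occurrence of the maximum value along the given axis (in this case axis 0, the layer axis). Where every layer is 0, the maximum is 0 and its first occurrence is at index 0, which gives the requested 0 for inactive locations.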
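One caveat with argmax: a location that is inactive in every layer comes back as 0, exactly like a location that is active in layer 0. That is what the question asks for, but if the two cases ever need to be told apart, a mask from any() can mark them. A small sketch, where the -1 sentinel and the tiny array are arbitrary choices for illustration:

```python
import numpy as np

# Tiny domain: 3 layers, 1 row, 3 columns (values made up for illustration).
active = np.array([[[0., 1., 0.]],
                   [[0., 1., 1.]],
                   [[0., 0., 1.]]])

top_layer = np.argmax(active, axis=0)   # first layer holding the maximum
has_active = active.any(axis=0)         # True where at least one layer is active

# -1 is a hypothetical sentinel meaning "never active".
top_or_missing = np.where(has_active, top_layer, -1)

print(top_layer)
print(top_or_missing)
```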

How can I convert a sparse representation in a .txt file to a dense matrix in scipy?

I have a .txt file from the Epinions data set in a sparse representation (i.e. 23 387 5 represents the fact "user 23 has rated item 387 as 5"). I want to convert this sparse format to its dense representation in scipy so I can do matrix factorization on it.
I loaded the file with loadtxt() from numpy and got a [664824, 3] array. I then passed it to scipy.sparse.csr_matrix and called todense(), hoping to get the dense format, but I always get the same [664824, 3] matrix back. How can I turn it into the original [40163,139738] dense representation?
import numpy as np
from scipy.sparse import csr_matrix
d = np.loadtxt("MFCode/Epinions_dataset.txt")
S = csr_matrix(d)  # d is taken as an already-dense [664824, 3] matrix
D = S.todense()
I expected a dense matrix with the shape of [40163,139738]
A small sample csv like text:
In [219]: txt = """0 1 3
...: 1 0 4
...: 2 2 5
...: 0 3 6""".splitlines()
In [220]: data = np.loadtxt(txt)
In [221]: data
Out[221]:
array([[0., 1., 3.],
[1., 0., 4.],
[2., 2., 5.],
[0., 3., 6.]])
Using sparse, using the (data, (row, col)) style of input:
In [222]: from scipy import sparse
In [223]: M = sparse.coo_matrix((data[:,2], (data[:,0], data[:,1])), shape=(5,4))
In [224]: M
Out[224]:
<5x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [225]: M.A
Out[225]:
array([[0., 3., 0., 6.],
[4., 0., 0., 0.],
[0., 0., 5., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
Alternatively fill in a zeros array directly:
In [226]: arr = np.zeros((5,4))
In [227]: arr[data[:,0].astype(int), data[:,1].astype(int)]=data[:,2]
In [228]: arr
Out[228]:
array([[0., 3., 0., 6.],
[4., 0., 0., 0.],
[0., 0., 5., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
But beware that np.zeros([40163,139738]) could raise a memory error. M.A (or M.toarray()) could also do that.
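To see why this matters for the Epinions shape from the question, the size of the dense array can be estimated before allocating anything. A back-of-the-envelope sketch, assuming float64 entries (the default dtype of np.zeros):

```python
import numpy as np

n_users, n_items = 40163, 139738           # shape quoted in the question
n_ratings = 664824                         # rows in the loaded triplet file

itemsize = np.dtype(np.float64).itemsize   # 8 bytes per entry
dense_bytes = n_users * n_items * itemsize

# Fraction of cells that actually hold a rating.
density = n_ratings / (n_users * n_items)

print(f"dense array: about {dense_bytes / 1e9:.1f} GB")
print(f"filled cells: {density:.4%}")
```

The dense array comes to roughly 45 GB while only a tiny fraction of the cells is filled, so keeping the data sparse is probably the more practical route if the factorization routine accepts sparse input.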

How is the IoU calculated for multiple bounding box predictions in Tensorflow Object Detection API?

How is the IoU metric calculated for multiple bounding box predictions in Tensorflow Object Detection API ?
Not sure exactly how TensorFlow does it but here is one way that I recently got it to work since I didn't find a good solution online. I used numpy matrices to get the IoU, & other metrics (TP, FP, TN, FN) for multi-object detection.
Let's say for this example that your image is 6x6.
import cv2
import numpy as np
empty = np.zeros((6, 6))
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
And you have the ground truth for 2 objects, one in the bottom left of the image and one smaller one in the top right.
bbox_actual_obj1 = [[0, 3], [2, 5]] # top left coord & bottom right coord
bbox_actual_obj2 = [[4, 0], [5, 1]]
Using OpenCV, you can add these objects to a copy of the empty image array.
actual = empty.copy()
actual = cv2.rectangle(
actual,
bbox_actual_obj1[0],
bbox_actual_obj1[1],
1,
-1
)
actual = cv2.rectangle(
actual,
bbox_actual_obj2[0],
bbox_actual_obj2[1],
1,
-1
)
array([[0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.]])
Now let's say that below are our predicted bounding boxes:
bbox_pred_obj1 = [[1, 3], [3, 5]] # top left coord & bottom right coord
bbox_pred_obj2 = [[3, 0], [5, 2]]
Now we do the same thing as above but change the value we assign within the array.
pred = empty.copy()
pred = cv2.rectangle(
pred,
bbox_pred_obj1[0],
bbox_pred_obj1[1],
2,
-1
)
pred = cv2.rectangle(
pred,
bbox_pred_obj2[0],
bbox_pred_obj2[1],
2,
-1
)
array([[0., 0., 0., 2., 2., 2.],
[0., 0., 0., 2., 2., 2.],
[0., 0., 0., 2., 2., 2.],
[0., 2., 2., 2., 0., 0.],
[0., 2., 2., 2., 0., 0.],
[0., 2., 2., 2., 0., 0.]])
If we convert these arrays to matrices and add them, we get the following result
actual_matrix = np.matrix(actual)
pred_matrix = np.matrix(pred)
combined = actual_matrix + pred_matrix
matrix([[0., 0., 0., 2., 3., 3.],
[0., 0., 0., 2., 3., 3.],
[0., 0., 0., 2., 2., 2.],
[1., 3., 3., 2., 0., 0.],
[1., 3., 3., 2., 0., 0.],
[1., 3., 3., 2., 0., 0.]])
Now all we need to do is count how often each value occurs in the combined matrix to get the TP, FP, TN and FN counts.
combined = np.squeeze(
np.asarray(
pred_matrix + actual_matrix
)
)
unique, counts = np.unique(combined, return_counts=True)
zipped = dict(zip(unique, counts))
{0.0: 15, 1.0: 3, 2.0: 8, 3.0: 10}
Legend:
True Negative: 0
False Negative: 1
False Positive: 2
True Positive/Intersection: 3
Union: 1 + 2 + 3
IoU: 10/(3 + 8 + 10) = 0.48
Precision: 10/(10 + 8) = 0.56
Recall: 10/(10 + 3) = 0.77
F1: 10/(10 + 0.5 * (3 + 8)) = 0.65
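The same counts can be obtained without OpenCV or np.matrix by rasterizing the boxes into boolean masks with plain slicing. This is only a sketch: the helper name boxes_to_mask is made up for illustration, and it assumes the box corners are inclusive pixel coordinates given as [x, y], as in the example arrays above.

```python
import numpy as np

def boxes_to_mask(boxes, shape):
    """Rasterize [[x1, y1], [x2, y2]] boxes (inclusive corners) into a boolean mask."""
    mask = np.zeros(shape, dtype=bool)
    for (x1, y1), (x2, y2) in boxes:
        mask[y1:y2 + 1, x1:x2 + 1] = True   # rows are y, columns are x
    return mask

actual = boxes_to_mask([[[0, 3], [2, 5]], [[4, 0], [5, 1]]], (6, 6))
pred = boxes_to_mask([[[1, 3], [3, 5]], [[3, 0], [5, 2]]], (6, 6))

tp = int(np.sum(actual & pred))     # predicted and actually object
fp = int(np.sum(~actual & pred))    # predicted but actually background
fn = int(np.sum(actual & ~pred))    # object pixels the prediction missed
tn = int(np.sum(~actual & ~pred))   # background in both

iou = tp / (tp + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, tn, round(iou, 2), round(precision, 2), round(recall, 2))
```

Note that this pools all boxes into one pixel-level score for the whole image; it does not match individual predictions to individual ground-truth objects.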
Each bounding box around an object has an IoU (intersection over union) with the ground-truth box of that object. It is calculated by dividing the area of overlap between the predicted bounding box and the ground-truth box by the area of their union (the two areas added together, minus the overlap). After calculating all the IoUs for the boxes around an object, the ones with the highest IoU are selected as the result. Here it is explained better.
Also you can print the IoU value after this line.
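As an illustration of the per-box definition, here is a minimal sketch of IoU for a single pair of axis-aligned boxes. The function name box_iou is hypothetical, and the inclusive-pixel convention (hence the + 1 terms) is an assumption chosen to match the 6x6 example earlier in this thread; it is not taken from the TensorFlow Object Detection API.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as [[x1, y1], [x2, y2]] with inclusive pixel corners."""
    (ax1, ay1), (ax2, ay2) = box_a
    (bx1, by1), (bx2, by2) = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1) + 1)
    inter_h = max(0, min(ay2, by2) - max(ay1, by1) + 1)
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1 + 1) * (ay2 - ay1 + 1)
    area_b = (bx2 - bx1 + 1) * (by2 - by1 + 1)
    union = area_a + area_b - intersection
    return intersection / union

# Ground truth and prediction for the first object in the 6x6 example.
print(box_iou([[0, 3], [2, 5]], [[1, 3], [3, 5]]))
```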

Filling multiple diagonal elements of a numpy 2D array

What is the best way to fill multiple diagonal elements (but not all) of a 2-dimensional numpy array?
I know numpy.fill_diagonal is the recommended way to fill all the diagonal elements.
Currently I am just using a loop:
for i in a_list_of_indices: a_2d_array[i,i] = num
If the array is large and the number of diagonal elements to be filled is also large, is there a better way than the above?
You can use this without looping:
a_2d_array[a_list_of_indices,a_list_of_indices] = num
Example:
import numpy as np
a_2d_array = np.zeros((5,5))
a_list_of_indices = [2, 3]
a_2d_array[a_list_of_indices, a_list_of_indices] = 1
returns:
array([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0.]])
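The same index-pair assignment also accepts one value per selected position, because the right-hand side is matched element-wise against the index arrays. A short sketch with made-up values, checked against the explicit loop from the question:

```python
import numpy as np

indices = np.array([0, 2, 3])        # diagonal positions to fill
values = np.array([7.0, 8.0, 9.0])   # one value per position (illustrative)

vectorized = np.zeros((5, 5))
vectorized[indices, indices] = values

looped = np.zeros((5, 5))
for i, v in zip(indices, values):
    looped[i, i] = v

print(np.array_equal(vectorized, looped))
```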

How to create a diagonal multi-dimensional (i.e. greater than 2) array in numpy

Is there a higher (than two) dimensional equivalent of diag?
L = [...] # some arbitrary list.
A = np.diag(L)
will create a diagonal 2-d matrix shape=(len(L), len(L)) with elements of L on the diagonal.
I'd like to do the equivalent of:
length = len(L)
A = np.zeros((length, length, length))
for i in range(length):
    A[i][i][i] = L[i]
Is there a slick way to do this?
Thanks!
You can use diag_indices to get the indices to be set. For example,
x = np.zeros((3,3,3))
L = np.arange(6,9)
x[np.diag_indices(3,ndim=3)] = L
gives
array([[[ 6., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 7., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 8.]]])
Under the hood diag_indices is just the code Jaime posted, so which to use depends on whether you want it spelled out in a numpy function, or DIY.
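For reference, np.diag_indices(n, ndim) returns a tuple of ndim identical index arrays, which is why it can be used directly as an index. A quick sketch that inspects the return value and confirms the result matches the explicit loop from the question:

```python
import numpy as np

L = np.arange(6, 9)
n = len(L)

idx = np.diag_indices(n, ndim=3)   # tuple of 3 arrays, each equal to arange(n)
print(len(idx), idx[0])

x = np.zeros((n, n, n))
x[idx] = L

# Reference result built with the loop from the question.
expected = np.zeros((n, n, n))
for i in range(n):
    expected[i, i, i] = L[i]

print(np.array_equal(x, expected))
```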
You can use fancy indexing:
In [2]: a = np.zeros((3,3,3))
In [3]: idx = np.arange(3)
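In [4]: a[(idx,)*3] = 1  # tuple of index arrays; recent NumPy no longer treats a list of arrays as a multi-dimensional index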
In [5]: a
Out[5]:
array([[[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 1.]]])
For a more general approach, you could set the diagonal of an arbitrarily sized array doing something like:
def set_diag(arr, values):
    # one index array per axis, limited by the shortest axis
    idx = np.arange(np.min(arr.shape))
    arr[(idx,) * arr.ndim] = values
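A quick check of this idea on a non-cubic array. The helper is repeated here with a tuple index so the snippet runs on its own with current NumPy; the shape and values are arbitrary:

```python
import numpy as np

def set_diag(arr, values):
    # The diagonal can only be as long as the shortest axis.
    idx = np.arange(min(arr.shape))
    # A tuple of index arrays addresses arr[i, i, ..., i].
    arr[(idx,) * arr.ndim] = values

a = np.zeros((2, 3, 4))
set_diag(a, [5, 6])
print(a[0, 0, 0], a[1, 1, 1], a.sum())
```

The function modifies the array in place and returns nothing, in the same way as np.fill_diagonal.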