Numpy: subtracting two arrays

I have a numpy array, say A1, of shape (1,1), and another, say A2, of shape (1,).
When I do A1-A2, I get another array of shape (1,1).
Shouldn't the arrays have the same dimensions for subtraction/addition operations?

If you take a look at the documentation, you can see that numpy uses broadcasting: the smaller array is virtually repeated along the missing or size-1 dimensions until the shapes are compatible, so that an elementwise operation is possible. No actual copy is made; the repetition is implicit.
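A minimal sketch of the case from the question: shapes (1,1) and (1,) are broadcast together, and the result takes the larger rank.

```python
import numpy as np

A1 = np.ones((1, 1))   # shape (1, 1)
A2 = np.ones((1,))     # shape (1,)

# Broadcasting aligns trailing dimensions: (1, 1) with (1,) gives (1, 1).
result = A1 - A2
print(result.shape)    # (1, 1)
```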

Related

Simple question about slicing a Numpy Tensor

I have a Numpy Tensor,
X = np.arange(64).reshape((4,4,4))
I wish to grab the 2,3,4 entries of the first dimension of this tensor, which you can do with,
Y = X[[1,2,3],:,:]
Is there a simpler way of writing this instead of explicitly writing out the indices [1,2,3]? I tried something like [1,:], which gave me an error.
Context: for my real application, the shape of the tensor is something like (30000,100,100). I would like to grab the last (10000, 100,100) to (30000,100,100) of this tensor.
The simplest way in your case is to use X[1:4]. This is the same as X[[1,2,3]], but notice that with X[1:4] you only need one pair of brackets because 1:4 already represents a range of values.
For an N-dimensional array in NumPy, if you specify indices for fewer than N dimensions you get all elements of the remaining dimensions. That is, for N equal to 3, X[1:4] is the same as X[1:4, :, :] or X[1:4, :]. You only need to pass : explicitly when you want to index a dimension while taking all elements of a dimension that comes before it, such as X[:, 2:4].
If you wish to select from some row to the end of array, simply use python slicing notation as below:
X[10000:,:,:]
This will select all rows from 10000 to the end of array and all columns and depths for them.
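The equivalences described above can be checked directly on the small example from the question:

```python
import numpy as np

X = np.arange(64).reshape((4, 4, 4))

# Basic slicing: trailing dimensions are taken in full when omitted.
assert np.array_equal(X[1:4], X[1:4, :, :])

# The slice 1:4 selects the same entries as the explicit index list [1, 2, 3].
assert np.array_equal(X[1:4], X[[1, 2, 3], :, :])

# For the (30000, 100, 100) case in the question, the analogous slice
# from row 10000 to the end would simply be X[10000:].
```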

How to apply numpy matrix operations on first two dimensions of 3D array

I have a 3D numpy array which I am using to represent a tuple of (square) matrices, and I'd like to perform a matrix operation on each of those matrices, corresponding to the first two dimensions of the array. For instance, if my list of matrices is [A,B,C] I would like to compute [A'A,B'B,C'C] where ' denotes the conjugate transpose.
The following code kinda sorta does what I'm looking for:
foo=np.array([[[1,1],[0,1]],[[0,1],[0,0]],[[3,0],[0,-2]]])
[np.matrix(j).H*np.matrix(j) for j in foo]
But I'd like to do this using vectorized operations instead of list comprehension.
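One vectorized spelling (a sketch, not an accepted answer from this thread): the conjugate transpose of each matrix in the stack can be taken by conjugating and swapping the last two axes, and @ (matmul) broadcasts over the leading dimension.

```python
import numpy as np

foo = np.array([[[1, 1], [0, 1]],
                [[0, 1], [0, 0]],
                [[3, 0], [0, -2]]])

# Conjugate-transpose each matrix by conjugating and swapping the last two
# axes, then let matmul broadcast over the leading (stack) dimension.
result = np.conj(foo).swapaxes(-2, -1) @ foo

# An equivalent einsum spelling of the same contraction:
result_einsum = np.einsum('kji,kjl->kil', np.conj(foo), foo)
```

Both forms avoid the Python-level loop of the list comprehension, and neither relies on the deprecated np.matrix class.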

Splitting up tensor

Let T be a tensor of shape [n,f], which represents a batch. Now I want to slice T into m tensors along axis=0. The value of m depends on the current batch. I have another tensor I of shape [m,2] which stores pairs of indices which indicate where the slices should occur.
I am not really sure how to "iterate" over the indices to apply tf.slice. Any ideas?
Can this somehow be achieved using tf.scan?
I suppose you are looking for the split function.
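The idea behind split can be sketched in plain NumPy (hypothetical data; the TensorFlow version would use the analogous tf.split, with the per-slice sizes derived from the index pairs in I):

```python
import numpy as np

n, f = 10, 3
T = np.arange(n * f).reshape(n, f)          # the batch, shape [n, f]
I = np.array([[0, 4], [4, 7], [7, 10]])     # m start/end index pairs

# Option 1: slice T along axis 0 at each (start, end) pair.
slices = [T[start:end] for start, end in I]

# Option 2: split at the interior boundaries; gives the same pieces
# when the pairs tile [0, n) without gaps.
slices2 = np.split(T, I[1:, 0], axis=0)
```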

How to output the shape in CNTK?

I write this code:
matrix = C.softmax(model).eval(data)
But matrix.shape and matrix.size give me errors. So I'm wondering, how can I output the shape of a CNTK variable?
First note that eval() will not give you a CNTK variable, it will give you a numpy array (or a list of numpy arrays, see the next point).
Second, depending on the nature of the model it is possible that what comes out of eval() is not a numpy array but a list. The reason for this is that if the output is a sequence then CNTK cannot guarantee that all sequences will be of the same length, and it therefore returns a list of arrays, each array being one sequence.
Finally, if you truly have a CNTK variable, you can get the dimensions with .shape

Is it possible to build coo and csr matrices with numpy WITHOUT using scipy?

I have to operate on matrices using an equivalent of scipy's sparse.coo_matrix and sparse.csr_matrix. However, I cannot use scipy (it is incompatible with the image analysis software I want to use this in). I can, however, use numpy.
Is there an easy way to accomplish what scipy.sparse.coo_matrix and scipy.sparse.csr_matrix do, with numpy only?
Thanks!
The attributes of a sparse.coo_matrix are:
dtype : dtype
Data type of the matrix
shape : 2-tuple
Shape of the matrix
ndim : int
Number of dimensions (this is always 2)
nnz
Number of nonzero elements
data
COO format data array of the matrix
row
COO format row index array of the matrix
col
COO format column index array of the matrix
The data, row, col arrays are essentially the data, i, j parameters when defined with coo_matrix((data, (i, j)), [shape=(M, N)]). shape also comes from the definition, and dtype from the data array. nnz is, as a first approximation, the length of data (not accounting for explicit zeros and duplicate coordinates).
So it is easy to construct a coo-like object. Similarly, a lil matrix has 2 lists of lists, and a dok matrix is a dictionary (see its .__class__.__mro__).
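A minimal coo-like container can be sketched with numpy alone (a hypothetical class for illustration, not a scipy drop-in):

```python
import numpy as np

class CooLike:
    """Minimal COO-style container: parallel data/row/col arrays plus a shape."""
    def __init__(self, data, row, col, shape):
        self.data = np.asarray(data)
        self.row = np.asarray(row)
        self.col = np.asarray(col)
        self.shape = shape
        self.dtype = self.data.dtype
        self.ndim = 2
        # First approximation: ignores explicit zeros and duplicate coordinates.
        self.nnz = len(self.data)

    def toarray(self):
        out = np.zeros(self.shape, dtype=self.dtype)
        # np.add.at accumulates duplicate (row, col) entries,
        # matching scipy's COO semantics.
        np.add.at(out, (self.row, self.col), self.data)
        return out

m = CooLike([1, 2, 3], row=[0, 1, 1], col=[2, 0, 0], shape=(2, 3))
```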
The data structure of a csr matrix is a bit more obscure:
data
CSR format data array of the matrix
indices
CSR format index array of the matrix
indptr
CSR format index pointer array of the matrix
It still has 3 arrays. And they can be derived from the coo arrays. But doing so with pure Python code won't be nearly as fast as the compiled scipy functions.
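Deriving the three CSR arrays from COO triplets can be sketched like this (a rough illustration with made-up data, without scipy's duplicate summing or canonical column sorting):

```python
import numpy as np

# COO triplets for a 3x4 matrix
row = np.array([0, 2, 1, 0])
col = np.array([1, 3, 0, 2])
data = np.array([10, 40, 30, 20])
n_rows = 3

# Sort entries by row (stable sort preserves within-row order).
order = np.argsort(row, kind='stable')
csr_data = data[order]
csr_indices = col[order]

# indptr[i]:indptr[i+1] delimits the entries of row i.
counts = np.bincount(row, minlength=n_rows)
csr_indptr = np.concatenate(([0], np.cumsum(counts)))
```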
But these classes have a lot of functionality that would require a lot of work to duplicate. Some is pure Python, but critical pieces are compiled for speed. Particularly important are the mathematical operations that the csr_matrix implements, such as matrix multiplication.
Replicating the data structures for temporary storage is one thing; replicating the functionality is quite another.