I have a tensor [a, b, c, d, e, f, g, h, i] with shape 9 x 1536. I need to create a new tensor like [(a,b), (a,c), (a,d), (a,e), (a,f), (a,g), (a,h), (a,i)] with shape [8 x 2 x 1536]. How can I do this with TensorFlow?
I tried like this
x = tf.zeros((9, 1536))
x_new = tf.stack([(x[0], x[1]),
                  (x[0], x[2]),
                  (x[0], x[3]),
                  (x[0], x[4]),
                  (x[0], x[5]),
                  (x[0], x[6]),
                  (x[0], x[7]),
                  (x[0], x[8])])
This seems to work, but I would like to know if there is a better solution or approach that could be used instead.
You can obtain the desired output with a combination of tf.concat, tf.tile and tf.expand_dims:
import tensorflow as tf
import numpy as np
_in = tf.constant(np.random.randint(0,10,(9,1536)))
# Tile only along the first dimension; int() handles both Dimension objects and plain ints.
tile_shape = [int(_in.shape[0]) - 1] + [1] * (len(_in.shape) - 1)
_out = tf.concat([
    tf.expand_dims(
        tf.tile(
            [_in[0]],
            tile_shape
        ),
        1),
    tf.expand_dims(_in[1:], 1)
    ],
    1
)
tf.tile repeats the first element of _in creating a tensor of length len(_in)-1 (I compute separately the shape of the tile because we want to tile only on the first dimension).
tf.expand_dims adds a dimension we can then concat on
Finally, tf.concat stitches together the two tensors giving the desired result.
EDIT: Rewrote to fit the OP's actual use-case with multidimensional tensors.
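The shape logic described above (repeat the first row, then pair it with every remaining row) can be checked on a small array. The following is a minimal sketch in NumPy rather than TensorFlow, using a 9 x 4 stand-in for the 9 x 1536 tensor; it assumes that tf.broadcast_to and tf.stack can be used the same way on tensors.

```python
import numpy as np

# Small stand-in for the 9 x 1536 tensor from the question.
x = np.arange(9 * 4).reshape(9, 4)

# Repeat the first row once for every remaining row: shape (8, 4).
first = np.broadcast_to(x[0], (x.shape[0] - 1, x.shape[1]))

# Pair the repeated first row with rows 1..8 along a new axis: shape (8, 2, 4).
pairs = np.stack([first, x[1:]], axis=1)

print(pairs.shape)
```

Here pairs[j, 0] is always row a and pairs[j, 1] is the (j+1)-th row, which matches the [(a,b), (a,c), ...] layout asked for.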
I have a 3D numpy array with the probabilities of each category in the last dimension. Something like:
import numpy as np
from scipy.special import softmax
array = np.random.normal(size=(10, 100, 5))
probabilities = softmax(array, axis=2)
How can I sample from a categorical distribution with those probabilities?
EDIT:
Right now I'm doing it like this:
def categorical(x):
    return np.random.multinomial(1, pvals=x)
samples = np.apply_along_axis(categorical, axis=2, arr=probabilities)
But it's very slow so I want to know if there's a way to vectorize this operation.
Drawing samples from a given probability distribution is done by evaluating the inverse cumulative distribution function at a random number drawn uniformly from the range 0 to 1. For a small number of discrete categories, as in the question, you can find the inverse using a linear search:
## Alternative test dataset
probabilities[:, :, :] = np.array([0.1, 0.5, 0.15, 0.15, 0.1])
n1, n2, m = probabilities.shape
cum_prob = np.cumsum(probabilities, axis=-1) # shape (n1, n2, m)
r = np.random.uniform(size=(n1, n2, 1))
# argmax finds the index of the first True value in the last axis.
samples = np.argmax(cum_prob > r, axis=-1)
print('Statistics:')
print(np.histogram(samples, bins=np.arange(m+1)-0.5)[0]/(n1*n2))
For the test dataset, a typical test output was:
Statistics:
[0.0998 0.4967 0.1513 0.1498 0.1024]
which looks OK.
If you have many, many categories (thousands), it's probably better to do a bisection search using a numba compiled function.
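The answer above returns category indices, while np.random.multinomial(1, ...) in the question returns one-hot vectors. If the one-hot form is needed, the indices can be converted afterwards. A minimal sketch, assuming the same test probabilities and a seeded generator chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Same test distribution as above, broadcast to shape (10, 100, 5).
p = np.array([0.1, 0.5, 0.15, 0.15, 0.1])
probabilities = np.broadcast_to(p, (10, 100, 5))
n1, n2, m = probabilities.shape

# Inverse-CDF sampling: first index where the cumulative sum exceeds r.
cum_prob = np.cumsum(probabilities, axis=-1)
r = rng.uniform(size=(n1, n2, 1))
samples = np.argmax(cum_prob > r, axis=-1)   # shape (n1, n2), values in 0..m-1

# Row k of the identity matrix is the one-hot vector for category k.
one_hot = np.eye(m, dtype=int)[samples]      # shape (n1, n2, m)
print(one_hot.shape)
```

Indexing an identity matrix keeps the whole conversion vectorized, so no Python-level loop over the first two axes is needed.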
Suppose I have a matrix A and two vectors x,y, of appropriate dimensions. I want to compute the dot product x' * A * y, where x' denotes the transpose. This should result in a scalar.
Is there a convenient API function in Tensorflow to do this?
(Note that I am using Tensorflow 2).
Use tf.linalg.tensordot(). See the documentation
Since you want a dot product, tf.matmul() will not work directly on the vectors: it computes matrix products and requires inputs of rank 2 or higher, so rank-1 tensors would first need to be reshaped.
Demo code snippet
import tensorflow as tf
A = tf.constant([[1,4,6],[2,1,5],[3,2,4]])
x = tf.constant([3,2,7])
result = tf.linalg.tensordot(tf.transpose(x), A, axes=1)
result = tf.linalg.tensordot(result, x, axes=1)
print(result)
And the result will be
>>>tf.Tensor(532, shape=(), dtype=int32)
A few points I want to mention here:
Don't forget the axes argument inside tf.linalg.tensordot()
When you create tf.zeros(5), you get a rank-1 tensor of shape (5,), like [0,0,0,0,0]; transposing it returns the same tensor. If you instead create tf.zeros((5,1)), you get a column vector of shape (5,1):
[
[0],[0],[0],[0],[0]
]
Now you can transpose this and the result will be different, but I recommend you do the code snippet I have mentioned. In case of dot product you don't have to bother much about this.
If you are still facing issues, will be very happy to help you.
Just do the following,
import tensorflow as tf
x = tf.constant([1,2])
a = tf.constant([[2,3],[3,4]])
y = tf.constant([2,3])
z = tf.reshape(tf.matmul(tf.matmul(x[tf.newaxis,:], a), y[:, tf.newaxis]),[])
print(z.numpy())
Returns
>>> 49
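The * operator is element-wise in TensorFlow, so tf.transpose(x) * A * y does not by itself produce the scalar. Broadcast x along the rows of A and y along the columns, then sum the element-wise product:
tf.reduce_sum(x[:, tf.newaxis] * A * y)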
Based on your example:
x = tf.zeros(5)
A = tf.zeros((5,5))
How about
x = tf.expand_dims(x, -1)
tf.matmul(tf.matmul(x, A, transpose_a=True), x)
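The numbers quoted in the answers above can be cross-checked outside TensorFlow. A minimal NumPy sketch follows; quadratic_form is a hypothetical helper name, and the einsum subscript string is only assumed to carry over to tf.einsum.

```python
import numpy as np

def quadratic_form(x, A, y):
    """Return the scalar x' A y for x of shape (n,), A of shape (n, p), y of shape (p,)."""
    return np.einsum('i,ij,j->', x, A, y)

# Data from the tensordot answer (y is taken equal to x there).
A1 = np.array([[1, 4, 6], [2, 1, 5], [3, 2, 4]])
x1 = np.array([3, 2, 7])
print(quadratic_form(x1, A1, x1))   # 532

# Data from the matmul answer.
x2 = np.array([1, 2])
A2 = np.array([[2, 3], [3, 4]])
y2 = np.array([2, 3])
print(quadratic_form(x2, A2, y2))   # 49
```

For rank-1 inputs, x @ A @ y gives the same values, because NumPy's @ treats a rank-1 operand as a row or column vector as needed.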
Suppose we have two tensors:
tensor A whose shape is (d,m,n)
tensor B whose shape is (d,n,l).
If we want to get the pairwise matrix product of the right-most matrix of A and B, I think we can use np.einsum('dmn,...nl->d...ml',A,B) whose size is (d,d,m,l). However, I would like to get the pairwise product of not all the pairs.
Given a parameter k with 1<=k<=d, I want to get the following pairwise matrix products:
from A(0,...)@B(0,...) to A(0,...)@B(k-1,...);
from A(1,...)@B(1,...) to A(1,...)@B(k,...);
....;
from A(d-2,...)@B(d-2,...), A(d-2,...)@B(d-1,...) to A(d-2,...)@B(k-3,...);
from A(d-1,...)@B(d-1,...) to A(d-1,...)@B(k-2,...).
Note that here we use a rolling way to deal with tensor B (like numpy.roll).
Finally, we actually get a tensor whose shape is (d,k,m,l).
What's the most efficient way to do this?
I know several ways like:
First get np.einsum('dmn,...nl->d...ml',A,B), then use a mask to extract the (d,k) pairs.
tile B first, then use einsum in some way.
But I think there exists a better way.
I doubt you can do much better than a for loop. Here is, for example, a vectorized version using einsum and stride_tricks compared to a double for loop:
Code:
from simple_benchmark import BenchmarkBuilder, MultiArgument
import numpy as np
from numpy.lib.stride_tricks import as_strided

B = BenchmarkBuilder()

@B.add_function()
def loopy(A,B,k):
    d,m,n = A.shape
    l = B.shape[-1]
    out = np.empty((d,k,m,l),int)
    for i in range(d):
        for j in range(k):
            # roll through B: indices wrap around modulo d
            out[i,j] = A[i]@B[(i+j)%d]
    return out

@B.add_function()
def vectory(A,B,k):
    d,m,n = A.shape
    l = B.shape[-1]
    # append the first k-1 slices so every window of length k is contiguous
    BB = np.concatenate([B,B[:k-1]],0)
    # overlapping windows of length k along the first axis, as a view
    BB = as_strided(BB,(d,k,n,l),np.repeat(BB.strides,(2,1,1)))
    return np.einsum("ikl,ijln->ijkn",A,BB)

@B.add_arguments('d x k x m x n x l')
def argument_provider():
    for exp in range(10):
        d,k,m,n,l = (np.r_[1.6,1.5,1.5,1.5,1.5]**exp*(4,2,2,2,2)).astype(int)
        print(d,k,m,n,l)
        A = np.random.randint(0,10,(d,m,n))
        B = np.random.randint(0,10,(d,n,l))
        yield k*d*m*n*l,MultiArgument([A,B,k])

r = B.run()
r.plot()

import pylab
pylab.savefig('diagwa.png')
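As a cross-check that does not need simple_benchmark, the rolling product can also be written with an explicit index array instead of strides. This is only a sketch under the question's shape conventions; rolling_pairs_loop and rolling_pairs_index are hypothetical names, and unlike the as_strided version this one copies the selected slices of B, so it uses memory for a (d,k,n,l) array.

```python
import numpy as np

def rolling_pairs_loop(A, B, k):
    """Reference version: out[i, j] = A[i] @ B[(i + j) % d]."""
    d, m, _ = A.shape
    l = B.shape[-1]
    out = np.empty((d, k, m, l), dtype=np.result_type(A, B))
    for i in range(d):
        for j in range(k):
            out[i, j] = A[i] @ B[(i + j) % d]
    return out

def rolling_pairs_index(A, B, k):
    """Same result via integer indexing: gather B[(i + j) % d] for all i, j at once."""
    d = A.shape[0]
    idx = (np.arange(d)[:, None] + np.arange(k)[None, :]) % d   # shape (d, k)
    # B[idx] has shape (d, k, n, l); contract over n for every (i, j) pair.
    return np.einsum('imn,ijnl->ijml', A, B[idx])

rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(4, 2, 3))
B = rng.integers(0, 10, size=(4, 3, 5))
print(np.array_equal(rolling_pairs_loop(A, B, 3), rolling_pairs_index(A, B, 3)))
```

The index array makes the wrap-around explicit, which may be easier to verify than the stride arithmetic, at the cost of the extra copy.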
I am wondering if it is possible to apply a mask before performing theano.tensor.nnet.softmax?
This is the behavior I am looking for:
>>>a = np.array([[1,2,3,4]])
>>>m = np.array([[1,0,1,0]]) # ignore index 1 and 3
>>>theano.tensor.nnet.softmax(a,m)
array([[ 0.11920292, 0. , 0.88079708, 0. ]])
Note that a and m are matrices, so I would like the softmax to work on an entire matrix and perform a row-wise masked softmax.
Also the output should be the same shape as a, so the solution can not do advanced indexing e.g. theano.tensor.softmax(a[0,[0,2]])
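import theano.tensor as T

def masked_softmax(a, m, axis):
    # Exponentiate, zero out the masked entries, then normalise over what remains.
    e_a = T.exp(a)
    masked_e = e_a * m
    sum_masked_e = T.sum(masked_e, axis, keepdims=True)
    return masked_e / sum_masked_e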
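The expected output from the question can be reproduced with a plain NumPy version of the same idea. This is a sketch rather than Theano code: masked_softmax_np is a hypothetical name, masked entries are set to -inf before exponentiating (so large masked values cannot overflow), and it assumes every row has at least one unmasked entry.

```python
import numpy as np

def masked_softmax_np(a, m, axis=-1):
    """Softmax over the entries where m is nonzero; masked entries get probability 0."""
    a = np.asarray(a, dtype=float)
    keep = np.asarray(m).astype(bool)
    # Masked positions become -inf, so exp() maps them to exactly 0.
    masked = np.where(keep, a, -np.inf)
    # Subtract the row maximum for numerical stability.
    shifted = masked - masked.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

a = np.array([[1, 2, 3, 4]])
m = np.array([[1, 0, 1, 0]])   # ignore index 1 and 3
print(masked_softmax_np(a, m))
```

The output has the same shape as a, with the two unmasked entries sharing all of the probability mass.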
theano.tensor.switch is one way to do this.
In the computational graph you can do the following:
a_mask = theano.tensor.switch(m, a, -np.inf)
sm = theano.tensor.nnet.softmax(a_mask)
hope it helps others.
My apologies if this has been answered many times, but I just can't find a solution.
Assume the following code:
import numpy as np
A,_,_ = np.meshgrid(np.arange(5),np.arange(7),np.arange(10))
B = (np.random.rand(7,10)*5).astype(int)
How can I slice A using B so that B represents the indices in the first and last dimensions of A (i.e. A[magic] = B)?
I have tried
A[:,B,:] which doesn't work due to peculiarities of advanced indexing.
A[:,B,np.arange(10)] generates 7 copies of the matrix I'm after
A[np.arange(7),B,np.arange(10)] gives the error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Any other suggestions?
These both work:
A[0, B, 0]
A[B, B, B]
Really, only the B in axis 1 matters; the other indices can be anything that broadcasts to B.shape, with values bounded by A.shape[0] (for axis 0) and A.shape[2] (for axis 2). For a ridiculous example (in Python 3 the range objects must be converted to lists before they can be concatenated):
A[list(range(7)) + list(range(3)), B, list(range(9, -1, -1))]
But you don't want to use : because then you'll get, as you said, 7 or 10 (or both!) "copies" of the array you want.
import numpy as np
A, _, _ = np.meshgrid(np.arange(5), np.arange(7), np.arange(10))
B = (np.random.rand(7, 10) * A.shape[1]).astype(int)
np.allclose(B, A[0, B, 0])
# True
np.allclose(B, A[B, B, B])
# True
np.allclose(B, A[list(range(7)) + list(range(3)), B, list(range(9, -1, -1))])
# True
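The shape-mismatch error from the question's third attempt comes from index arrays of shapes (7,), (7,10) and (10,), which cannot be broadcast together. Giving the first index a trailing axis of length 1 fixes that. A minimal sketch, assuming the same A as above and a seeded generator chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# With the default 'xy' indexing, A has shape (7, 5, 10) and A[i, j, k] == j.
A, _, _ = np.meshgrid(np.arange(5), np.arange(7), np.arange(10))
B = rng.integers(0, A.shape[1], size=(7, 10))

rows = np.arange(7)[:, None]   # shape (7, 1), broadcasts against B's shape (7, 10)
cols = np.arange(10)           # shape (10,), broadcasts along the last axis

# picked[i, k] == A[i, B[i, k], k]
picked = A[rows, B, cols]
print(picked.shape, np.array_equal(picked, B))
```

Unlike A[0, B, 0], this form reads from position (i, k) of A itself, so it also applies when A varies along its first and last axes.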