I have a tensor T of shape Batch_Size x Num_Items x Item_Dimension and another tensor P of shape Batch_Size x Num_Items, where the Num_Items values in each batch of P sum to 1 (a probability distribution of items for each batch). I want to sample without replacement N items from T according to probability distribution P. The resulting tensor should be of shape Batch_Size x N x Item_Dimension. How would I do this?

Though note I believe you need logits instead of probs for Gumbel max sampling.
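A minimal sketch of the Gumbel approach mentioned above, written in NumPy so it runs without TensorFlow. It assumes the "Gumbel top-k" variant: add independent Gumbel noise to the log-probabilities and keep the N largest scores per batch row, which yields N distinct indices. The function name sample_without_replacement and the toy sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def sample_without_replacement(T, P, n, rng=None):
    """Gumbel top-k sketch: returns (samples, indices)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel sampling works on log-probabilities (logits), not raw probabilities.
    with np.errstate(divide="ignore"):
        logits = np.log(P)
    # Perturb every logit with independent Gumbel(0, 1) noise.
    scores = logits + rng.gumbel(size=P.shape)
    # The n largest perturbed scores per row give n distinct item indices.
    idx = np.argsort(-scores, axis=1)[:, :n]  # shape (batch, n)
    # Gather the chosen items; idx broadcasts over the item dimension.
    samples = np.take_along_axis(T, idx[:, :, None], axis=1)
    return samples, idx

# Toy sizes (assumptions for the example only)
rng = np.random.default_rng(0)
batch_size, num_items, item_dim, n = 4, 10, 3, 5
T = rng.standard_normal((batch_size, num_items, item_dim))
P = rng.random((batch_size, num_items))
P /= P.sum(axis=1, keepdims=True)  # each row sums to 1

samples, idx = sample_without_replacement(T, P, n, rng)
```

The same steps should carry over to TensorFlow with a top-k operation followed by a batched gather, but that translation is not shown or tested here.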


Tabular data: Implementing a custom tensor layer without resorting to iteration

I have an idea for a tensor operation that would not be difficult to implement via iteration, with batch size one. However I would like to parallelize it as much as possible.
I have two tensors with shape (n, 5) called X and Y. X is actually supposed to represent 5 one-dimensional tensors with shape (n, 1): (x_1, ..., x_5). Ditto for Y.
I would like to compute a tensor with shape (n, 25) where each column represents the output of the tensor operation f(x_i, y_j), where f is fixed for all 1 <= i, j <= 5. The operation f has output shape (n, 1), just like x_i and y_j.
I feel it is important to clarify that f is essentially a fully-connected layer from the concatenated [...x_i, ...y_j] tensor with shape (1, 10), to an output layer with shape (1,5).
Again, it is easy to see how to do this manually with iteration and slicing. However this is probably very slow. Performing this operation in batches, where the tensors X, Y now have shape (n, 5, batch_size) is also desirable, particularly for mini-batch gradient descent.
It is difficult to really articulate here why I desire to create this network; I feel it is suited for my domain of 'itemized tabular data' and cuts down significantly on the number of weights per operation, compared to a fully connected network.
Is this possible using tensorflow? Certainly not using just keras.
Below is an example in numpy per AloneTogether's request
import numpy as np
features = 16
batch_size = 256
X_batch = np.random.random((features, 5, batch_size))
Y_batch = np.random.random((features, 5, batch_size))
# one tensor operation to reduce weights in this custom 'layer'
f = np.random.random((features, 2 * features))
for b in range(batch_size):
    X = X_batch[:, :, b]
    Y = Y_batch[:, :, b]
    for i in range(5):
        x_i = X[:, i:i+1]
        for j in range(5):
            y_j = Y[:, j:j+1]
            x_i_y_j = np.concatenate([x_i, y_j], axis=0)
            # f(x_i, y_j)
            # implemented by a fully-connected layer
            f_i_j = np.matmul(f, x_i_y_j)
All the operations you need (concatenation and matrix multiplication) can be batched.
The difficult part is that you want to concatenate the features of every item in X with the features of every item in Y (all combinations).
My recommended solution is to expand the dimensions of X to [batch, features, 5, 1] and the dimensions of Y to [batch, features, 1, 5].
Then tf.repeat() both tensors so that their shapes become [batch, features, 5, 5].
Now you can concatenate X and Y along the feature axis. You will have a tensor of shape [batch, 2*features, 5, 5]. Observe that all combinations are built this way.
The next step is the matrix multiplication. tf.matmul() can also do batch matrix multiplication, but I use tf.einsum() here because I want more control over which dimensions are treated as batch dimensions.
Full code:
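import tensorflow as tf
import numpy as np
# sizes taken from the example in the question
batch_size = 256
features = 16
items = 5
x = np.random.uniform(size=[batch_size,features,items])
y = np.random.uniform(size=[batch_size,features,items])
f = np.random.uniform(size=[2*features,features])
# repeat x along a new last axis and y along a new third axis to build all (i, j) pairs
x_reps= tf.repeat(x[:,:,:,tf.newaxis], items, axis=3)
y_reps= tf.repeat(y[:,:,tf.newaxis,:], items, axis=2)
xy_conc = tf.concat([x_reps,y_reps], axis=1)
# apply the same fully-connected weights f to every (i, j) pair
f_i_j = tf.einsum("bfij, fg->bgij", xy_conc,f)
# flatten the (i, j) grid; pair (i, j) ends up at index i*items + j
f_i_j = tf.reshape(f_i_j , [batch_size,features,items*items])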
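As a sanity check, the batched construction above can be compared with the explicit triple loop. The following is a NumPy sketch rather than TensorFlow code: np.broadcast_to stands in for tf.repeat, and the small sizes are assumptions made to keep the loop fast.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, features, items = 4, 3, 5  # small toy sizes (assumption)
x = rng.uniform(size=(batch_size, features, items))
y = rng.uniform(size=(batch_size, features, items))
f = rng.uniform(size=(2 * features, features))

# Batched version: build every (i, j) pair by broadcasting, then concatenate.
pair_shape = (batch_size, features, items, items)
x_reps = np.broadcast_to(x[:, :, :, None], pair_shape)
y_reps = np.broadcast_to(y[:, :, None, :], pair_shape)
xy_conc = np.concatenate([x_reps, y_reps], axis=1)  # (batch, 2*features, items, items)
batched = np.einsum("bfij,fg->bgij", xy_conc, f)

# Reference version: one small matrix product per (batch, i, j).
looped = np.empty_like(batched)
for b in range(batch_size):
    for i in range(items):
        for j in range(items):
            pair = np.concatenate([x[b, :, i], y[b, :, j]])
            looped[b, :, i, j] = f.T @ pair

max_abs_diff = np.abs(batched - looped).max()
```

If the two versions agree, max_abs_diff stays at floating-point rounding level.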

Given a batch of n images, how to scalar multiply each image by a different scalar in tensorflow?

Assume we have two TensorFlow tensors:
input and weights.
input is a tensor of n images, say. So its shape is [n, H, W, C].
weights is a simple list of n scalar weights: [w1 w2 ... wn]
The aim is to scalar-multiply each image by its corresponding weight.
How would one do that?
I tried to use tf.nn.conv2d with 1x1 kernels, but I do not know how to reshape our rank-1 weight tensor into the required rank-4 kernel tensor.
Any help would be appreciated.
Thanks to user zihaozhihao:
The answer is to change the shape of weights to (-1, 1, 1, 1) and then multiply it with input.
weights = tf.reshape(weights, (-1, 1, 1, 1))
weighted_input = input * weights
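The reshape works because of broadcasting: a tensor of shape (n, 1, 1, 1) is stretched across the H, W and C axes of each image. A small NumPy sketch of the same idea, with made-up sizes and values chosen only for illustration:

```python
import numpy as np

# Three 2x2 single-channel "images" filled with ones (toy data, an assumption)
images = np.ones((3, 2, 2, 1))
weights = np.array([1.0, 2.0, 3.0])

# Reshape to (n, 1, 1, 1) so each weight broadcasts over its own image.
weighted = images * weights.reshape(-1, 1, 1, 1)
```

Every entry of weighted[k] now equals weights[k].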

Calculate the derivative of the prediction in a custom loss function

In addition to the MSE of y_true and y_predicted, I would like to use the second derivative of y_predicted in the cost function, because my model's output is currently very dynamic. Suppose I have y_predicted with shape (256, 100, 1). The first dimension corresponds to the samples (delta_t between each sample is 0.1s). Now I would like to differentiate via the first dimension, i.e.
diff(diff(y_predicted[1, :, 1]))/delta_t**2
for each row (0-dim) in y_predicted.
Note, I only want to use y_predicted and delta_t to differentiate.
Thank you very much,
To calculate the second-order derivative you could use tf.hessians as follows:
x = tf.Variable([7])
x2 = x * x
d2x2 = tf.hessians(x2, x)
Evaluating d2x2 yields:
[array([[2]], dtype=int32)]
In your case, you could do
loss += lam_l1 * tf.hessians(y_pred, xs)
where xs are the tensors with respect to which you would like to differentiate.
If you wish to use Keras directly, you can chain keras.backend.gradients(loss, variables) twice; there is no Keras equivalent of tf.hessians.
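Note that the expression in the question, diff(diff(y))/delta_t**2, is a finite difference along the time axis of the prediction, which is a different quantity from a Hessian with respect to input tensors. A NumPy sketch of that finite-difference penalty follows. It assumes time runs along axis 1 of a (batch, steps, 1) array, and the names second_diff and penalty are hypothetical.

```python
import numpy as np

delta_t = 0.1
t = np.arange(100) * delta_t

# Two toy "predictions" with known second derivatives 2 and 6 (assumption for the demo)
y_predicted = np.stack([t**2, 3.0 * t**2])[:, :, None]  # shape (2, 100, 1)

# Second-order finite difference along the time axis.
second_diff = np.diff(y_predicted, n=2, axis=1) / delta_t**2  # shape (2, 98, 1)

# One possible smoothness penalty: the mean squared second derivative.
penalty = np.mean(second_diff**2)
```

In a TensorFlow loss the same differences could be formed by slicing, for example y[:, 2:] - 2*y[:, 1:-1] + y[:, :-2], so the term remains differentiable with respect to the model weights.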

Tensorflow batch sparse multiply

I would like to multiply a sparse tensor by a dense tensor but do so within a batch.
For example I have a sparse tensor with the corresponding dense shape of (20,65536,65536) where 20 is the batch size. I would like to multiply each (65536,65536) in the batch with the corresponding (65536x1) from a tensor shape (20,65536) which has a dense representation. tf.sparse_tensor_dense_matmul only accepts a rank 2 sparse tensor. Is there a way to perform this over a batch?
I would like to avoid converting the sparse matrix to a dense matrix if possible due to memory constraints.
Assuming that a is a sparse tensor with shape (20, 65536, 65536) and b a dense tensor with shape (20, 65536), you could perform the batch sparse-dense matrix multiplication as follows:
y_sparse = tf.sparse.reduce_sum_sparse(a * b[:, None, :], axis=-1)
This solution expands the second dimension of tensor b to enable implicit broadcasting. Then, the batch matrix multiplication takes place by performing a sparse-dense multiplication and a sparse sum along the last axis.
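The identity behind this trick can be checked on small dense arrays. The NumPy sketch below uses dense stand-ins for the sparse tensor, so it only illustrates the arithmetic and not the memory behaviour; the sizes and the 20% density are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n = 3, 6  # toy sizes instead of (20, 65536)

# Mostly-zero matrices standing in for the sparse tensor a
a = rng.random((batch, n, n)) * (rng.random((batch, n, n)) < 0.2)
b = rng.random((batch, n))

# Broadcast b over the rows of each matrix, multiply elementwise, sum over columns.
y_broadcast = (a * b[:, None, :]).sum(axis=-1)

# Reference: an explicit batched matrix-vector product.
y_matvec = np.einsum("bij,bj->bi", a, b)
```

Both results have shape (batch, n) and agree up to rounding.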
If b has a third dimension, so that it is a batch of matrices, you can multiply its columns individually and concatenate the results afterwards:
multiplied_dims = []
for i in range(b.shape[-1]):
    multiplied_dims.append(tf.expand_dims(tf.sparse.reduce_sum(a * b[:, :, i][:, None, :], axis=-1), -1))
result = tf.concat(multiplied_dims, -1)
The answer is simple - you reshape the sparse tensor first and then multiply it by the dense matrix. Something like this would work:
sparse_tensor_rank2 = tf.sparse_reshape(sparse_tensor, [-1, 65536])

In Tensorflow (in general deep learning), can I apply "local FC layer" with "CONV layer"?

Could anyone check my reasoning?
Let's say I have a (pre-trained) fully connected layer fc that takes bx20x20x10 as input and produces bx64 as output, where b is the batch size.
Now, I have an input of cx100x60x10. The height and width 100x60 can be subdivided into 5x3 blocks of 20x20. I would like to have a 5x3 grid of local responses (outputs) from the fc layer, i.e., cx5x3x64.
Now I am thinking: this is the same as having a convolution layer with the fc weights and a stride of 20 in both width and height. Is that correct? Could there be any difference?
Yes, it will be the same if appropriate reshaping of the dense layer weight matrix is performed.
Let us first look at the dense layer. You input a 20 x 20 x 10 tensor to the dense layer. It will first be flattened to produce a 4000 x 1 vector. You want the output to be a 64 x 1 vector. So the weight matrix required is 4000 x 64, plus 64 bias parameters. Then y = w^T * x + b = [4000 x 64]^T * [4000 x 1] + [64 x 1] will yield a [64 x 1] vector. Therefore, y[i] = w[0][i]*x[0] + ... + w[3999][i]*x[3999] + b[i] for i = 0, ..., 63. Note that b here denotes the bias parameter, not the batch size.
Let us turn to convolution. To produce a 5 x 3 x 64 output from an input of size 100 x 60 x 10, you need 64 filters, each of size (20,20) and strides (20,20) with no zero-padding. Each 20 x 20 filter however has local connectivity extending along the entire depth i.e. a neuron is connected to all the 10 dimensions along the depth of input. Please read this for more information on local connectivity of convolutional layer.
Your convolutional layer has a receptive field of 20 x 20. Each neuron in the convolutional layer is connected to a 20 x 20 x 10 block, giving 4000 weights in total (and one bias parameter). You have 64 such filters. Therefore, the total number of learnable weights for this layer is 4000 x 64 + 64. Convolution between one 20 x 20 x 10 block of x and w (size = 64 x 20 x 20 x 10) can be performed as:
convResult = np.sum(np.sum(np.sum(x*w[:,:,::-1,::-1], axis=-1), axis=-1),axis=-1)
There are some fine points here. I did w[:,:,::-1,::-1] because theano convolution flips the convolution kernel (well, not that simple!). If you are interested in who flips and who does not, read this.
Finally, dense layer and convolution layer (in this context) essentially do the same operation. They first element-wise multiply and then sum up two sets of vectors/matrices of 4000 elements. This procedure is repeated 64 times to produce a 64 x 1 vector. So, it is possible to achieve exactly the same result with dense and convolution layer by proper reshaping of the dense layer weight matrix. However, you need to take care of kernel flipping to match the results.
Below I give a code snippet to compute convolution manually (using numpy) and using Theano.
import theano
from theano import tensor as T
import numpy as np
X = T.ftensor4('X')
W = T.ftensor4('W')
out = T.nnet.conv2d(X,W)
f = theano.function([X, W], out, allow_input_downcast=True)
x = np.random.random((1,10,20,20))
w = np.random.random((64,10,20,20))
# convolution using Theano
c1 = np.squeeze(f(x,w)[0])
# convolution using Numpy
c2 = np.sum(np.sum(np.sum(x*w[:,:,::-1,::-1],axis=-1),axis=-1),axis=-1)
# check that both are almost identical
print(np.amax(np.abs(c2 - c1)))
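For the shapes in the question, the equivalence can also be sketched in plain NumPy. This is an illustration under stated assumptions, not the Theano code above: it uses a channels-last layout (N, H, W, C), assumes the dense layer flattens each block in row-major (H, W, C) order, and uses cross-correlation without kernel flipping. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
c, H, W, C, k, out = 2, 100, 60, 10, 20, 64
x = rng.standard_normal((c, H, W, C))
w_dense = rng.standard_normal((k * k * C, out))  # dense weights, 4000 x 64
bias = rng.standard_normal(out)

# (1) Apply the dense layer to each 20x20x10 block separately.
dense_out = np.empty((c, H // k, W // k, out))
for i in range(H // k):
    for j in range(W // k):
        block = x[:, i * k:(i + 1) * k, j * k:(j + 1) * k, :].reshape(c, -1)
        dense_out[:, i, j, :] = block @ w_dense + bias

# (2) The same weights viewed as a (k, k, C, out) kernel, applied with stride k.
kernel = w_dense.reshape(k, k, C, out)
patches = x.reshape(c, H // k, k, W // k, k, C)  # split H and W into blocks
conv_out = np.einsum("nipjqc,pqco->nijo", patches, kernel) + bias
```

Both outputs have shape (c, 5, 3, 64). A framework that flips kernels, or that flattens in a different axis order, would need the kernel rearranged accordingly.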