I'm trying to extract all the possible permutations from a Tensor along a specific axis. My input is a [B, S, L] tensor (B batches of S vectors of length L) and I want to extract all the possible permutations among these vectors (the S! permutations) namely a [B, S!, S, L] Tensor as output.
This is what I have tried so far, but I am struggling to get the right output shape. I think my mistake is that I create a batch_range but not a permutation_range as well.
import tensorflow as tf
import numpy as np
from itertools import permutations
S = 3
B = 5
L = 10
input = tf.constant(np.random.randn(B, S, L))
perms = list(permutations(range(S))) # e.g. for S = 3: (0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)
length_perm = len(perms)
perms = tf.reshape(tf.constant(perms), [1, length_perm, S, 1])
perms = tf.tile(perms, [B, 1, 1, 1])
batch_range = tf.tile(tf.reshape(tf.range(B, dtype=tf.int32), shape=[B, 1, 1, 1]), [1, length_perm, S, 1])
indices = tf.concat([batch_range, perms], axis=3)
permutations = tf.gather_nd(tf.tile(tf.reshape(input, [B, 1, S, L]), [1, length_perm, 1, 1]), indices)
# I get a [B, P, S, S, L] tensor instead of the desired [B, P, S, L] (P = S!)
I posted one possible solution below, but I think it may still have a problem: when I tested it with B > 1, the results did not look right.
I think I just found an answer. Please correct me if you think I'm wrong or if there is an easier way to do this:
import tensorflow as tf
import numpy as np
from itertools import permutations
S = 3
B = 5
L = 10
input = tf.constant(np.random.randn(B, S, L))
perms = list(permutations(range(S))) # e.g. for S = 3: (0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)
length_perm = len(perms)
perms = tf.reshape(tf.constant(perms), [1, length_perm, S, 1])
perms = tf.tile(perms, [B, 1, 1, 1])
batch_range = tf.tile(tf.reshape(tf.range(B, dtype=tf.int32), shape=[B, 1, 1, 1]), [1, length_perm, S, 1])
perm_range = tf.tile(tf.reshape(tf.range(length_perm, dtype=tf.int32), shape=[1, length_perm, 1, 1]), [B, 1, S, 1])
indices = tf.concat([batch_range, perm_range, perms], axis=3)
permutations = tf.gather_nd(tf.tile(tf.reshape(input, [B, 1, S, L]), [1, length_perm, 1, 1]), indices)
print(permutations)
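There may be an easier way: index axis 1 with the whole [S!, S] permutation table in one go. Below is a minimal sketch in NumPy (so the result is easy to inspect without a session); the name all_perms is made up for illustration. In TensorFlow, tf.gather(input, perms, axis=1) should give the same [B, S!, S, L] shape, but treat that as an assumption, since it was not run against this exact snippet.

```python
import numpy as np
from itertools import permutations

B, S, L = 5, 3, 10
x = np.random.randn(B, S, L)

# All S! orderings of the S row indices, shape [S!, S]
perms = np.array(list(permutations(range(S))))

# Indexing axis 1 with a 2-D integer array replaces that axis with the
# shape of the index array: [B, S, L] -> [B, S!, S, L]
all_perms = x[:, perms]

print(all_perms.shape)  # (5, 6, 3, 10)
```

Here all_perms[b, p, s] is x[b, perms[p][s]], so no batch_range or perm_range has to be built by hand.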
I know this is late, but I came across the same problem and wanted to share my solution.
I also generate the permutation list. Then I build a permutation tensor from it.
Then I multiply it with the tensor. It doesn't use tf.gather_nd(), just a clean matrix multiplication.
import tensorflow as tf
import numpy as np
from itertools import permutations
B = 5 # batch size
S = 3 # number of vectors to permute
L = 10 # length of the S vectors
data = tf.constant(np.random.randn(B, S, L))
perms = list(permutations(range(S))) # e.g. for S = 3: (0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)
N = len(perms)
# new code starts here:
# Indexing the [S, S] identity matrix with the permutation table gives an [N, S, S] stack of permutation matrices: row i of matrix n is one-hot at column perms[n][i]
perm_mat = tf.constant(np.eye(S)[np.array(perms)], dtype=tf.float64)
# Multiplying these with the tensor gives the permuted output. We just need a new axis so that the permutation dimension broadcasts against the batch dimension
res = tf.linalg.matmul(perm_mat, data[:,tf.newaxis,...])
print(res)
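To check that the matrix-multiplication trick really yields the permutations, here is a small NumPy sketch that compares it against plain integer indexing. The names via_matmul and via_indexing are only illustrative, and NumPy stands in for TensorFlow here.

```python
import numpy as np
from itertools import permutations

B, S, L = 5, 3, 10
data = np.random.randn(B, S, L)
perms = np.array(list(permutations(range(S))))   # [N, S]

# Rows of the identity picked in permuted order: N permutation matrices
perm_mat = np.eye(S)[perms]                      # [N, S, S]

# [N, S, S] @ [B, 1, S, L] broadcasts to [B, N, S, L]
via_matmul = perm_mat @ data[:, np.newaxis]

# Same result by indexing axis 1 with the permutation table
via_indexing = data[:, perms]

print(via_matmul.shape)
print(np.allclose(via_matmul, via_indexing))
```

The matmul version does S times more arithmetic than indexing, which hardly matters for small S but grows quickly since N = S!.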
I am doing semantic image segmentation with a U-Net. I am confused about the last layers used for pixel classification. The U-Net code looks like this:
...
reshape = Reshape((n_classes,self.img_rows * self.img_cols))(conv9)
permute = Permute((2,1))(reshape)
activation = Activation('softmax')(permute)
model = Model(input = inputs, output = activation)
return model
...
Can I just reshape without using Permute like this?
reshape = Reshape((self.img_rows * self.img_cols, n_classes))(conv9)
Updated:
I found that training does not work correctly when I reshape directly:
reshape = Reshape((self.img_rows * self.img_cols, n_classes))(conv9) # the loss does not converge
My groundtruth is generated like this:
X = []
Y = []
im = cv2.imread(impath)
X.append(im)
seg_labels = np.zeros((height, width, n_classes))
for c, spath in enumerate(segpaths): # assuming one mask file per class, in class order
    mask = cv2.imread(spath, 0)
    seg_labels[:, :, c] += mask
Y.append(seg_labels.reshape(width*height, n_classes))
Why does reshaping directly not work?
You clearly misunderstand the meaning of each operation and the final goal:
final goal: classification for each pixel, i.e. softmax along the semantic class axis
how to achieve this goal in the original code? Let's see the code line by line:
reshape = Reshape((n_classes,self.img_rows * self.img_cols))(conv9) # L1
permute = Permute((2,1))(reshape) # L2
activation = Activation('softmax')(permute) # L3
L1's output dim = n_class-by-n_pixs, (n_pixs=img_rows x img_cols)
L2's output dim = n_pixs-by-n_class
L3's output dim = n_pixs-by-n_class
Note the default softmax activation is applied to the last axis, i.e. the axis that n_class stands for, which is the semantic class axis.
Therefore, this original code fulfills the final goal of semantic segmentation.
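The three steps can be traced in NumPy. This is only a sketch: conv9 is a random stand-in for one channels-first sample, softmax is a hand-written helper, and .T plays the role of Permute((2,1)).

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtracting the max improves numerical stability and does not change the result
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

n_classes, img_rows, img_cols = 3, 2, 2
conv9 = np.random.randn(n_classes, img_rows, img_cols)  # hypothetical network output

step1 = conv9.reshape(n_classes, img_rows * img_cols)   # L1: n_class-by-n_pixs
step2 = step1.T                                         # L2: n_pixs-by-n_class
step3 = softmax(step2, axis=-1)                         # L3: softmax over the class axis

print(step3.shape)         # (4, 3)
print(step3.sum(axis=-1))  # every pixel's class probabilities sum to 1
```

Row 0 of step3 is the softmax of conv9[:, 0, 0], i.e. of the class scores belonging to the first pixel.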
Let's revisit the code that you want to change, which is
reshape = Reshape((self.img_rows * self.img_cols, n_classes))(conv9) # L4
L4's output dim = n_pixs-by-n_class
My guess is that you think L4's output dim matches L2's, and thus L4 is a short-cut that is equivalent to executing L1 and L2.
However, matching the shape does not necessarily mean matching the physical meaning of axes. Why? A simple example will explain.
Say you have 2 semantic classes and 3 pixels. To see the difference assume all three pixels belong to the same class.
In other words, a ground truth tensor will look like this
# cls#1 cls#2
[ [0, 1], # pixel #1
[0, 1], # pixel #2
[0, 1], # pixel #3
]
Assume you have a perfect network that generates the exact response for each pixel. Your direct reshape will nevertheless create a tensor like the one below
# cls#1 cls#2
[ [0, 0], # pixel #1
[0, 1], # pixel #2
[1, 1], # pixel #3
]
whose shape is the same as the ground truth's, but fails to match the physical meaning of axes.
This further makes the softmax operation meaningless, because it is supposed to apply to the class dimension, but this dimension does not physically exist. As a result, it leads to the following erroneous output after applying softmax,
# cls#1 cls#2
[ [0.5, 0.5], # pixel #1
[0.27, 0.73], # pixel #2
[0.5, 0.5], # pixel #3
]
which completely messes up the training, even under this ideal assumption.
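The 2-class, 3-pixel example can be reproduced in a few lines of NumPy. This is a sketch under the assumption that the network output is laid out class-by-pixel (as in L1); net_out, direct and permuted are illustrative names.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Perfect response, laid out class-by-pixel: every pixel belongs to cls#2
net_out = np.array([[0, 0, 0],    # cls#1 scores for pixels 1..3
                    [1, 1, 1]])   # cls#2 scores for pixels 1..3

direct = net_out.reshape(3, 2)    # right shape, scrambled meaning
permuted = net_out.T              # what Permute((2,1)) does: pixels-by-classes

print(direct.tolist())    # [[0, 0], [0, 1], [1, 1]]
print(permuted.tolist())  # [[0, 1], [0, 1], [0, 1]]

# Softmax over the last axis: only the permuted version favours cls#2 everywhere
print(np.round(softmax(direct), 2))
print(np.round(softmax(permuted), 2))
```

Both arrays have shape (3, 2), but only in permuted does each row hold the two class scores of one pixel.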
Therefore, it is a good habit to write down the physical meaning of each axis of a tensor. When you do any tensor reshape operation, ask yourself whether the physical meaning of an axis is changed in your expected way.
For example, if you have a tensor T of shape batch_dim x img_rows x img_cols x feat_dim, you can do many things and not all of them make sense (due to the problematic physical meaning of axes)
(Wrong) reshape it to whatever x feat_dim, because whatever dimension is meaningless in testing where the batch_size might be different.
(Wrong) reshape it to batch_dim x feat_dim x img_rows x img_cols, because after the reshape the 2nd dimension is NOT the feature dimension, and the 3rd and 4th dimensions are not the row and column dimensions either.
(Correct) permute axes (3,1,2). This gives you a tensor of shape batch_dim x feat_dim x img_rows x img_cols while keeping the physical meaning of each axis.
(Correct) reshape it to batch_dim x whatever x feat_dim. This is also valid, because the whatever=img_rows x img_cols is equivalent to the pixel location dimension, and both the meanings of batch_dim and feat_dim are unchanged.
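A quick way to test these cases is to pick one pixel's feature vector and see where it ends up. The sketch below uses NumPy with made-up sizes; note that np.transpose counts the batch axis, so Keras-style Permute((3,1,2)) corresponds to transpose(0, 3, 1, 2) here.

```python
import numpy as np

batch_dim, img_rows, img_cols, feat_dim = 2, 3, 4, 5
T = np.arange(batch_dim * img_rows * img_cols * feat_dim).reshape(
    batch_dim, img_rows, img_cols, feat_dim)

# Feature vector of the pixel at row 1, column 2 in sample 0
feat = T[0, 1, 2]

permuted = T.transpose(0, 3, 1, 2)                              # correct
reshaped = T.reshape(batch_dim, feat_dim, img_rows, img_cols)   # wrong
flat = T.reshape(batch_dim, img_rows * img_cols, feat_dim)      # correct

print(np.array_equal(permuted[0, :, 1, 2], feat))        # True
print(np.array_equal(reshaped[0, :, 1, 2], feat))        # False
print(np.array_equal(flat[0, 1 * img_cols + 2], feat))   # True
```

The wrong reshape has exactly the same shape as the permuted tensor, which is why the mistake does not raise any error.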
Your code will still be runnable since the shape will be the same, but the result (backprops) will be different since the values of tensors will be different. For example:
arr = np.array([[[1,1,1],[1,1,1]],[[2,2,2],[2,2,2]],[[3,3,3],[3,3,3]],[[4,4,4],[4,4,4]]])
arr.shape
>>>(4, 2, 3)
# reshape, then permute
reshape_1 = arr.reshape((4, 2*3))
np.swapaxes(reshape_1, 1, 0)
>>>array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
#do reshape directly
reshape_2 = arr.reshape(2*3, 4)
reshape_2
>>>array([[1, 1, 1, 1],
[1, 1, 2, 2],
[2, 2, 2, 2],
[3, 3, 3, 3],
[3, 3, 4, 4],
[4, 4, 4, 4]])
Reshape and Permute are used so that the softmax is taken at each pixel location. Adding to @meowongac's answer: Reshape preserves the order of the elements. In this case the channel axis has to be moved, so Reshape followed by Permute is appropriate.
Considering the case of (2,2) image with 3 values at each location,
arr = np.array([[[1,1],[1,1]],[[2,2],[2,2]],[[3,3],[3,3]]])
>>> arr.shape
(3, 2, 2)
>>> arr
array([[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]],
[[3, 3],
[3, 3]]])
>>> arr[:,0,0]
array([1, 2, 3])
The channel values at each location are [1,2,3]. The goal is to move the channel axis (length 3) to the end.
>>> arr.reshape((2,2,3))[0,0]
array([1, 1, 1]) # incorrect
>>> arr.transpose((1,2,0))[0,0] # similar to what permute does.
array([1, 2, 3]) # correct
More examples at this link: https://discuss.pytorch.org/t/how-to-change-shape-of-a-matrix-without-dispositioning-the-elements/30708
I would like to shift the indices of a SparseTensor in Tensorflow. Is there a way to alter a SparseTensor's indices?
import numpy as np
import tensorflow as tf
# build graph
x = tf.sparse_placeholder(tf.float32, [None, 10], name='x')
sparse_indices = x.indices
# This line does not work:
sparse_indices[:, 1] = sparse_indices[:, 1] + 1
shifted_x = tf.SparseTensor(indices=sparse_indices,
values=x.values,
dense_shape=[2,20])
# start session
session = tf.InteractiveSession()
indices = np.array([[1, 0], [2, 1]], dtype=np.int64)
values = np.array([1, 1], dtype=np.float32)
shape = np.array([2, 10], dtype=np.int64)
shifted = session.run(shifted_x,
{x:tf.SparseTensorValue(indices, values, shape)})
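The in-place assignment fails because tensors cannot be modified item by item; the usual way around it is to build a new index array by adding an offset. Below is a NumPy sketch of that idea with made-up, in-bounds indices; to_dense is a hypothetical helper used only to show the result. In TensorFlow the analogous expression would presumably be x.indices + tf.constant([[0, 1]], dtype=tf.int64), passed as indices to a new tf.SparseTensor, but that line is an untested assumption.

```python
import numpy as np

indices = np.array([[0, 0], [1, 1]], dtype=np.int64)  # (row, col) of each non-zero entry
values = np.array([1, 1], dtype=np.float32)

# Broadcasting adds 0 to every row index and 1 to every column index,
# producing a new array instead of assigning into the old one
shifted_indices = indices + np.array([[0, 1]], dtype=np.int64)

def to_dense(idx, vals, shape):
    # Hypothetical helper: scatter the sparse entries into a dense array
    dense = np.zeros(shape, dtype=vals.dtype)
    dense[idx[:, 0], idx[:, 1]] = vals
    return dense

print(shifted_indices.tolist())  # [[0, 1], [1, 2]]
print(to_dense(shifted_indices, values, (2, 20)))
```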
I am new to tensorflow and wondering if it is possible to resize a single dimension within a tensor.
Let's say I have a given tensor t:
t = [[1, 10], [2, 20]]
shape(t) = [2, 2]
now I want to modify the shape of this tensor, so that:
shape(t) = [2, 3]
So far I just found the functions:
reshape --> this function is able to reshape the tensor in such a way that the total number of elements stays the same (as far as I understood)
shape(t) = [1, 4] | [4, 1] | [4]
expand_dims --> this function is able to add a new 1-dimensional dimension
shape(t) = [1, 2, 2] | [2, 1, 2] | [2, 2, 1]
Is there a function for the purpose I described? If not, why not? (Maybe it doesn't make sense to have such a function?)
Kind regards
You can do this with tf.concat. Here is an example.
import tensorflow as tf
t = tf.constant([[1, 10], [2, 20]], dtype=tf.int32)
# the new tensor w/ the shape of [2]
TBA_a = tf.constant([3,30], dtype=tf.int32)
# reshape TBA_a to [2,1], then concat it to t on axis 1 (column)
new_t = tf.concat([t, tf.reshape(TBA_a, [2,1])], axis=1)
sess = tf.InteractiveSession()
print(new_t.eval())
It will give us
[[ 1 10 3]
[ 2 20 30]]
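Growing a dimension always needs values for the new slots, which is probably why there is no plain "resize" call: either you supply them (concatenation) or you pick a fill value (padding). Here is the same idea as a NumPy sketch; grown and padded are illustrative names, and tf.pad is the TensorFlow counterpart of the padding option.

```python
import numpy as np

t = np.array([[1, 10], [2, 20]])

# Option 1: append an explicit new column, as with tf.concat above
new_col = np.array([3, 30]).reshape(2, 1)
grown = np.concatenate([t, new_col], axis=1)

# Option 2: no data to add, so pad axis 1 with one zero column on the right
padded = np.pad(t, ((0, 0), (0, 1)), mode="constant", constant_values=0)

print(grown.tolist())   # [[1, 10, 3], [2, 20, 30]]
print(padded.tolist())  # [[1, 10, 0], [2, 20, 0]]
```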
I am creating a DNNclassifier with sparse columns. The training data looks like this,
samples  col1                         col2              price  label
eg1      [0,1,0,0,0,2,0,1,0,3,...]    [0,0,4,5,0,...]   5.2    0
eg2      [0,0,...]                    [0,0,...]         0      1
eg3      [0,0,...]                    [0,0,...]         0      1
The following snippet can run successfully,
import tensorflow as tf
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
sparse_feature_b = tf.contrib.layers.sparse_column_with_hash_bucket('col2', 1000, dtype=tf.int32)
sparse_feature_a_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_a, dimension=2)
sparse_feature_b_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_b, dimension=2)
feature_c = tf.contrib.layers.real_valued_column('price')
estimator = tf.contrib.learn.DNNClassifier(
feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb, feature_c],
hidden_units=[5, 3],
n_classes=2,
model_dir='./tfTmp/tfTmp0')
# Input builders
def input_fn_train(): # returns x, y (where y represents label's class index).
    features = {'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
                                        values=[1, 2, 1, 3],
                                        dense_shape=[3, int(250e6)]),
                'col2': tf.SparseTensor(indices=[[0, 2], [0, 3]],
                                        values=[4, 5],
                                        dense_shape=[3, int(100e6)]),
                'price': tf.constant([5.2, 0, 0])}
    labels = tf.constant([0, 1, 1])
    return features, labels
estimator.fit(input_fn=input_fn_train, steps=100)
However, I have a question about this line,
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
where 3 means hash_bucket_size=3, but this sparse tensor includes 4 non-zero values,
'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
values=[1, 2, 1, 3],
dense_shape=[3, int(250e6)])
It seems hash_bucket_size does nothing here. No matter how many non-zero values you have in your sparse tensor, you just need to set it to an integer > 1 and it works correctly.
I know my understanding may not be right. Could anyone explain how hash_bucket_size works? Thanks a lot!
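hash_bucket_size works by taking the original feature values (the ids stored in the sparse tensor), hashing them into a space of the specified size, and using the hashed ids as features. It therefore bounds how many distinct ids the model can tell apart; it is unrelated to how many non-zero values a given sparse tensor holds.
This means you can specify your model before knowing the full range of possible ids, at the cost of some ids possibly colliding.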
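The bucketing idea can be sketched in plain Python. Everything here is illustrative: hash_bucket is a made-up helper and zlib.crc32 is just a convenient deterministic hash, not the hash function TensorFlow actually uses.

```python
import zlib

def hash_bucket(value, hash_bucket_size):
    # Stand-in hash for this sketch only; TensorFlow uses its own hash function
    return zlib.crc32(str(value).encode("utf-8")) % hash_bucket_size

hash_bucket_size = 3
values = [1, 2, 1, 3]  # the four non-zero values of 'col1' from the question

buckets = [hash_bucket(v, hash_bucket_size) for v in values]
print(buckets)  # four bucket ids, each in range(3); equal values share a bucket

# Many more distinct ids than buckets: collisions are unavoidable
many = {hash_bucket(v, hash_bucket_size) for v in range(1000)}
print(len(many))  # never more than hash_bucket_size
```

Four non-zero values fit into three buckets without any error because the bucket count limits distinct hashed ids, not the number of entries; the price is that unrelated ids may end up sharing an embedding.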
I have a 2-D numpy matrix, an example
M = np.matrix([[1,2],[3,4],[5,6]])
I would like, starting from M, to have a matrix like:
M = np.matrix([[[1,2],[1,2],[1,2]],[[3,4],[3,4],[3,4]],[[5,6],[5,6],[5,6]]])
Thus, the new matrix has 3 dimensions. How can I do this?
NumPy matrix class can't hold 3D data. So, assuming you are okay with NumPy array as output, we can extend the array version of it to 3D with None/np.newaxis and then use np.repeat -
np.repeat(np.asarray(M)[:,None],3,axis=1)
Sample run -
In [233]: M = np.matrix([[1,2],[3,4],[5,6]])
In [234]: np.repeat(np.asarray(M)[:,None],3,axis=1)
Out[234]:
array([[[1, 2],
[1, 2],
[1, 2]],
[[3, 4],
[3, 4],
[3, 4]],
[[5, 6],
[5, 6],
[5, 6]]])
Alternatively, with np.tile -
np.tile(np.asarray(M),3).reshape(-1,3,M.shape[-1])
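A small sketch to confirm that both expressions give the same (3, 3, 2) array. It also shows np.broadcast_to as a third option, which returns a read-only view instead of copying; the variable names are illustrative.

```python
import numpy as np

M = np.matrix([[1, 2], [3, 4], [5, 6]])
A = np.asarray(M)  # plain 2-D array, shape (3, 2)

via_repeat = np.repeat(A[:, None], 3, axis=1)
via_tile = np.tile(A, 3).reshape(-1, 3, A.shape[-1])

# Read-only view with the same contents; no data is copied
via_broadcast = np.broadcast_to(A[:, None, :], (3, 3, 2))

print(via_repeat.shape)                             # (3, 3, 2)
print(np.array_equal(via_repeat, via_tile))         # True
print(np.array_equal(via_repeat, via_broadcast))    # True
```

Call .copy() on the broadcast view if the result has to be written to.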
This should work for you:
np.array([list(np.array(i)) * 3 for i in M])
As another answerer already said, a matrix can't be three-dimensional.
Instead, you can build a 3-dimensional np.array like below.
import numpy as np
M = np.matrix([[1,2],[3,4],[5,6]])
M = np.array(M)
M = np.array([ [x, x, x] for x in M])
M