How to create a Keras layer from tf.math.segment_sum - tensorflow

I would like to use the tf.math.segment_sum function in a Keras layer but I don't get the dimensions right.
As an example, I would like to sum the values of x_1 grouped by id in the dataframe df:
df = pd.DataFrame({'id': [1, 1, 2, 2, 3, 3, 4, 4],
'x_1': [1, 0, 0, 0, 0, 1, 1, 1],
'target': [1, 1, 0, 0, 1, 1, 2, 2]})
The 'model' I created looks as follows:
input_ = tf.keras.Input((1,), name='X')
cid = tf.keras.Input(shape=(1,), dtype='int64', name='id')
summed = tf.keras.layers.Lambda(lambda x: tf.math.segment_sum(x[0], x[1]), name='segment_sum')([input_, cid])
model = tf.keras.Model(inputs=[input_, cid], outputs=[summed])
I get an error about the rank:
ValueError: Shape must be rank 1 but is rank 2 for 'segment_sum/SegmentSum' (op: 'SegmentSum') with input shapes: [?,1], [?,1].
What do I do wrong here?

I solved it using tf.gather. The working code is as follows:
input_ = tf.keras.Input((1,), name='X')
cid = tf.keras.Input(shape=(1,), dtype='int64', name='id')
summed = tf.keras.layers.Lambda(lambda x: tf.gather(tf.math.segment_sum(x[0], tf.reshape(x[1], (-1,))), x[1]), output_shape=(None,1), name='segment_sum')([input_, cid])
model = tf.keras.Model(inputs=[input_, cid], outputs=[summed])

Related

How to create a Tensor with specific values?

I have a tensor with shape NxM.
I'd like to create another tensor with the same shape, filled with ones up until a certain column (might be different for each row) and the rest of it filled with another value (let's say 10 for the example).
How I do that?
Something like this can help you:
input = tf.Variable([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]], dtype = tf.float32)
indices = tf.constant([1, 4, 2])
X = tf.ones_like(input)
Y = tf.constant(10, dtype = tf.float32, shape = input.shape)
result = tf.where(tf.sequence_mask(indices, tf.shape(input)[1]), X, Y)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
print(sess.run(input))
print(sess.run(indices))
print(sess.run(result))

Element-wise assignment in tensorflow

In numpy, it could be easily done as
>>> img
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], dtype=int32)
>>> img[img>5] = [1,2,3,4]
>>> img
array([[1, 2, 3],
[4, 5, 1],
[2, 3, 4]], dtype=int32)
However, there seems not exist similar operation in tensorflow.
You can never assign a value to a tensor in tensorflow as the change in tensor value is not traceable by backpropagation, but you can still get another tensor from origin tensor, here is a solution
import tensorflow as tf
tf.enable_eager_execution()
img = tf.constant(list(range(1, 10)), shape=[3, 3])
replace_mask = img > 5
keep_mask = tf.logical_not(replace_mask)
keep = tf.boolean_mask(img, keep_mask)
keep_index = tf.where(keep_mask)
replace_index = tf.where(replace_mask)
replace = tf.random_uniform((tf.shape(replace_index)[0],), 0, 10, tf.int32)
updates = tf.concat([keep, replace], axis=0)
indices = tf.concat([keep_index, replace_index], axis=0)
result = tf.scatter_nd(tf.cast(indices, tf.int32), updates, shape=tf.shape(img))
Actually there is a way to achieve this. Very similar to #Jie.Zhou's answer, you can replace tf.constant with tf.Variable, then replace tf.scatter_nd with tf.scatter_nd_update

Tensorflow confusion matrix using one-hot code

I have multi-class classification using RNN and here is my main code for RNN:
def RNN(x, weights, biases):
x = tf.unstack(x, input_size, 1)
lstm_cell = rnn.BasicLSTMCell(num_unit, forget_bias=1.0, state_is_tuple=True)
stacked_lstm = rnn.MultiRNNCell([lstm_cell]*lstm_size, state_is_tuple=True)
outputs, states = tf.nn.static_rnn(stacked_lstm, x, dtype=tf.float32)
return tf.matmul(outputs[-1], weights) + biases
logits = RNN(X, weights, biases)
prediction = tf.nn.softmax(logits)
cost =tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(cost)
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
I have to classify all inputs to 6 classes and each of classes is composed of one-hot code label as the follow:
happy = [1, 0, 0, 0, 0, 0]
angry = [0, 1, 0, 0, 0, 0]
neutral = [0, 0, 1, 0, 0, 0]
excited = [0, 0, 0, 1, 0, 0]
embarrassed = [0, 0, 0, 0, 1, 0]
sad = [0, 0, 0, 0, 0, 1]
The problem is I cannot print confusion matrix using tf.confusion_matrix() function.
Is there any way to print confusion matrix using those labels?
If not, how can I convert one-hot code to integer indices only when I need to print confusion matrix?
You cannot generate confusion matrix using one-hot vectors as input parameters of labels and predictions. You will have to supply it a 1D tensor containing your labels directly.
To convert your one hot vector to normal label, make use of argmax function:
label = tf.argmax(one_hot_tensor, axis = 1)
After that you can print your confusion_matrix like this:
import tensorflow as tf
num_classes = 2
prediction_arr = tf.constant([1, 1, 1, 1, 0, 0, 0, 0, 1, 1])
labels_arr = tf.constant([0, 1, 1, 1, 1, 1, 1, 1, 0, 0])
confusion_matrix = tf.confusion_matrix(labels_arr, prediction_arr, num_classes)
with tf.Session() as sess:
print(confusion_matrix.eval())
Output:
[[0 3]
[4 3]]

what TensorFlow hash_bucket_size matters

I am creating a DNNclassifier with sparse columns. The training data looks like this,
samples col1 col2 price label
eg1 [[0,1,0,0,0,2,0,1,0,3,...] [[0,0,4,5,0,...] 5.2 0
eg2 [0,0,...] [0,0,...] 0 1
eg3 [0,0,...]] [0,0,...] 0 1
The following snippet can run successfully,
import tensorflow as tf
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
sparse_feature_b = tf.contrib.layers.sparse_column_with_hash_bucket('col2', 1000, dtype=tf.int32)
sparse_feature_a_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_a, dimension=2)
sparse_feature_b_emb = tf.contrib.layers.embedding_column(sparse_id_column=sparse_feature_b, dimension=2)
feature_c = tf.contrib.layers.real_valued_column('price')
estimator = tf.contrib.learn.DNNClassifier(
feature_columns=[sparse_feature_a_emb, sparse_feature_b_emb, feature_c],
hidden_units=[5, 3],
n_classes=2,
model_dir='./tfTmp/tfTmp0')
# Input builders
def input_fn_train(): # returns x, y (where y represents label's class index).
features = {'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
values=[1, 2, 1, 3],
dense_shape=[3, int(250e6)]),
'col2': tf.SparseTensor(indices=[[0, 2], [0, 3]],
values=[4, 5],
dense_shape=[3, int(100e6)]),
'price': tf.constant([5.2, 0, 0])}
labels = tf.constant([0, 1, 1])
return features, labels
estimator.fit(input_fn=input_fn_train, steps=100)
However, I have a question from this sentence,
sparse_feature_a = tf.contrib.layers.sparse_column_with_hash_bucket('col1', 3, dtype=tf.int32)
where 3 means hash_bucket_size=3, but this sparse tensor includes 4 non-zero values,
'col1': tf.SparseTensor(indices=[[0, 1], [0, 5], [0, 7], [0, 9]],
values=[1, 2, 1, 3],
dense_shape=[3, int(250e6)])
It seems has_bucket_size does nothing here. No matter how many non-zero values you have in your sparse tensor, you just need to set it with an integer > 1 and it works correctly.
I know my understanding may not be right. Could anyone explain how has_bucket_size works? Thanks a lot!
hash_bucket_size works by taking the original indices, hashing them into a space of the specified size, and using the hashed indices as features.
This means you can specify your model before knowing the full range of possible indices, at the cost of some indices maybe colliding.

simple Recurrent Neural Net from scratch using tensorflow

I've build a simple recurrent neural net with one hidden layer with 4 nodes in it. This is my code:
import tensorflow as tf
# hyper parameters
learning_rate = 0.0001
number_of_epochs = 10000
# Computation Graph
W1 = tf.Variable([[1.0, 1.0, 1.0, 1.0]], dtype=tf.float32, name = 'W1')
W2 = tf.Variable([[1.0], [1.0], [1.0], [1.0]], dtype=tf.float32, name = 'W2')
WR = tf.Variable([[1.0, 1.0, 1.0, 1.0]], dtype=tf.float32, name = 'WR')
# b = tf.Variable([[0], [0], [0], [0]], dtype=tf.float32)
prev_val = [[0.0]]
X = tf.placeholder(tf.float32, [None, None], name = 'X')
labels = tf.placeholder(tf.float32, [None, 1], name = 'labels')
sess = tf.Session()
sess.run(tf.initialize_all_variables())
z = tf.matmul(X, W1) + tf.matmul(prev_val, WR)# - b
prev_val = z
predict = tf.matmul(z, W2)
error = tf.reduce_mean((labels - predict)**2)
train = tf.train.GradientDescentOptimizer(learning_rate).minimize(error)
time_series = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
lbsx = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
for i in range(number_of_epochs):
for j in range(len(time_series)):
curr_X = time_series[j]
lbs = lbsx[j]
sess.run(train, feed_dict={X: [[curr_X]], labels: [[lbs]]})
print(sess.run(predict, feed_dict={X: [[0]]}))
print(sess.run(predict, feed_dict={X: [[1]]}))
I'm getting output:
[[ 0.]]
[[ 3.12420416e-05]]
With input 1, it should output 0 and vice versa. I'm also confused regarding the 'previous value'. Should it be a placeholder? I'd really appreciate your efforts to fix the code.