tensorflow assert element tf.where

I have 2 matrices with these shapes:
pob.shape = (2,49,20)
rob = np.zeros((2,49,20))
and I want to get the indices of pob's elements whose value is != 0. In numpy I can do this:
x, y, z = np.where(pob != 0)
e.g.:
x = [2,4,7]
y = [3,5,5]
z = [3,5,6]
Then I want to change the values of rob:
rob[x, y, :] = np.ones((20))
How can I do this with tensorflow objects?
I tried to use tf.where but I can't get the index values out of the tensor object.

You could use tf.range() and tf.meshgrid() to create index matrices, then use tf.where() with your condition on them to obtain the indices which meet it. However, the tricky part would come next: you can't easily assign values to a tensor based on indices in TF (my_tensor[my_indices] = my_values).
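For the index-extraction part alone, a minimal sketch (my own example, using a small hypothetical tensor) of getting the equivalent of np.where(pob != 0) with tf.where() could look like this:
import tensorflow as tf
pob = tf.constant([[[0., 1.], [2., 0.]]])  # small hypothetical example, shape (1, 2, 2)
indices = tf.where(tf.not_equal(pob, 0))   # shape (N, 3): one [x, y, z] row per non-zero entry
x, y, z = indices[:, 0], indices[:, 1], indices[:, 2]
with tf.Session() as sess:
    print(sess.run(indices))  # [[0 0 1], [0 1 0]]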
A workaround for your problem ("for all (i,j,k), if pob[i,j,k] != 0 then rob[i,j] = 1") could be as follows:
import tensorflow as tf
# Example values for demonstration:
pob_val = [[[0, 0, 0], [1, 0, 0], [1, 0, 1]], [[1, 1, 1], [0, 0, 0], [0, 0, 0]]]
pob = tf.constant(pob_val)
pob_shape = tf.shape(pob)
rob = tf.zeros(pob_shape)
# Get the mask:
mask = tf.cast(tf.not_equal(pob, 0), tf.uint8)
# If there's at least one "True" in mask[i, j, :], make all mask[i, j, :] = True:
mask = tf.cast(tf.reduce_max(mask, axis=-1, keepdims=True), tf.bool)
mask = tf.tile(mask, [1, 1, pob_shape[-1]])
# Apply mask:
rob = tf.where(mask, tf.ones(pob_shape), rob)
with tf.Session() as sess:
    rob_eval = sess.run(rob)
    print(rob_eval)
# [[[0. 0. 0.]
# [1. 1. 1.]
# [1. 1. 1.]]
#
# [[1. 1. 1.]
# [0. 0. 0.]
# [0. 0. 0.]]]
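If you are on a newer TF version with eager execution enabled, a possible alternative sketch (not from the original answer) performs the index-based assignment directly with tf.tensor_scatter_nd_update:
import tensorflow as tf
pob = tf.constant([[[0, 0, 0], [1, 0, 0], [1, 0, 1]],
                   [[1, 1, 1], [0, 0, 0], [0, 0, 0]]], dtype=tf.float32)
rob = tf.zeros_like(pob)
# (i, j) pairs where pob[i, j, :] has at least one non-zero entry:
ij = tf.where(tf.reduce_any(tf.not_equal(pob, 0), axis=-1))
# one row of ones per selected (i, j):
updates = tf.ones([tf.shape(ij)[0], tf.shape(pob)[-1]])
rob = tf.tensor_scatter_nd_update(rob, ij, updates)
print(rob)  # same values as rob_eval above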

Related

How do I mask for 2-D MultiHeadAttention in Tensorflow?

Can anyone help me understand masking a 3D input (technically 4D) in MultiHeadAttention?
My original dataset consists of timeseries in the form of:
Inputs: (samples, horizon, features) ~> (8, 4, 2) ~> K, V, Q during inference
Targets: (samples, horizon, features) ~> (8, 4, 2) ~> Q during training
Labels: (sample, horizon, features) ~> (1, 4, 2)
Essentially I'm taking 8 samples of timeseries data and ultimately outputting 1 sample in the same format. Targets are horizon-shifted values of Inputs and fed into an encoder-only Transformer model (Q, K, V as shown above).
In order to best approximate the single output sample (which is identical to the last sample in Targets), I need to run full attention on the horizons of each sample and causal attention between samples. Once the data has been run through the encoder, it is sent to an EinsumDense layer which reduces the (8, 4, 2) encoder output into (1, 4, 2). In order for all this to work, I need to inject a 4th dimension on my data, so Inputs and Targets are formatted as (1, 8, 4, 2).
So getting to my actual question, how do I generate the masking for the encoder? After some digging around through errors I noticed that the shape of the tensor that MHA uses for masking the softmax is formatted (1, 1, 8, 4, 8, 4) which makes me believe it's (B, H, TS, TH, SS, SH) where:
B=batch
H=heads
TS=target samples
TH=target horizon
SS=source samples
SH=source horizon
I gather this notion from the docs only because of the attention_output description:
...where T is for target sequence shapes
Assuming this to be the case, is the following a reasonable mask, or is there a more appropriate method:
sample_mask = tf.linalg.band_part(tf.ones((samples, samples)), -1, 0)
horizon_mask = tf.ones((horizon, horizon))
encoder_mask = (
    sample_mask[:, tf.newaxis, :, tf.newaxis]
    * horizon_mask[tf.newaxis, :, tf.newaxis, :]
)
It is masking, and you can build it however you like, since the data can be arranged in many fashions; there is nothing wrong with your approach. Here I try to use TensorFlow's built-in methods instead, and the results have the same dimensions: see the TensorFlow Masking layer.
Sample: the code below builds masking values with the target shapes, so you can compare the two results side by side.
import tensorflow as tf
import matplotlib.pyplot as plt
start = 3
limit = 25
delta = 3
sample = tf.range(start, limit, delta)
sample = tf.cast( sample, dtype=tf.int64 )
sample = tf.constant( sample, shape=( 8, 1 ) )
horizon = tf.random.uniform(shape=[1, 4], minval=5, maxval=10, dtype=tf.int64)
features = tf.random.uniform(shape=[1, 1, 2], minval=-5, maxval=+5, dtype=tf.int64)
temp = tf.math.multiply(sample, horizon)
temp = tf.expand_dims(temp, axis=2)
input = tf.math.multiply( temp, features )
print( "input: " )
print( input )
n_samples = 8
n_horizon = 4
n_features = 2
sample_mask = tf.linalg.band_part(tf.ones((n_samples, n_samples)), -1, 0)
horizon_mask = tf.ones((n_horizon, n_horizon))
encoder_mask = (
    sample_mask[:, tf.newaxis, :, tf.newaxis]
    * horizon_mask[tf.newaxis, :, tf.newaxis, :]
)
print( encoder_mask )
masking_layer = tf.keras.layers.Masking(mask_value=50, input_shape=(n_horizon, n_features))
print( masking_layer(input) )
img_1 = tf.keras.preprocessing.image.array_to_img(
    tf.constant(input[:, :, 1], shape=(8, 4, 1)),
    data_format=None,
    scale=True
)
img_2 = tf.keras.preprocessing.image.array_to_img(
    tf.constant(masking_layer(input)[:, :, 0], shape=(8, 4, 1)),
    data_format=None,
    scale=True
)
plt.figure(figsize=(1, 2))
plt.title("🧸")
plt.subplot(1, 2, 1)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(img_1)
plt.xlabel("Input (8, 4, 2), left")
plt.subplot(1, 2, 2)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(img_2)
plt.xlabel("Masks (8, 4, 2), left")
plt.show()
Output: the input tensor built above (truncated).
[[ -960 0]
[-1080 0]
[ -960 0]
[ -960 0]]], shape=(8, 4, 2), dtype=int64)
Output: the encoder_mask from the question (truncated).
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
...
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]]], shape=(8, 4, 8, 4), dtype=float32)
Output: the result of masking_layer(input) with mask_value=50 (truncated).
[[ -840 0]
[ -945 0]
[ -840 0]
[ -840 0]]
[[ -960 0]
[-1080 0]
[ -960 0]
[ -960 0]]], shape=(8, 4, 2), dtype=int64)
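For what it's worth, a minimal sketch (my own example, with hypothetical values) of what tf.keras.layers.Masking actually computes: a timestep is masked only when all of its features equal mask_value, and the resulting boolean mask has shape (batch, timesteps):
import tensorflow as tf
x = tf.constant([[[50., 50.], [1., 2.], [50., 50.], [3., 4.]]])  # (1, 4, 2), hypothetical values
masking_layer = tf.keras.layers.Masking(mask_value=50)
print(masking_layer(x))               # masked timesteps are zeroed out
print(masking_layer.compute_mask(x))  # [[False  True False  True]]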

how does the weight update in BasicRnnCell for char level

When I used TensorFlow to build a character-level RNN, I was confused by how the weights change in the model. I thought the weights wouldn't update within one batch.
But the kernel (which I think is Wk) was changing. And there were 6 rows in it. So I am confused about why it changed and why there are 6. Can I get W and U directly using TensorFlow? Here is my code. Thanks.
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]
with tf.variable_scope('two_sequances') as scope:
    # One cell RNN input_dim (4) -> output_dim (2). sequence: 5
    hidden_size = 2
    cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
    x_data = np.array([[h, e, l, l, o]], dtype=np.float32)
    print(x_data.shape)
    print(x_data)
    outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(tf.global_variables_initializer())
results, state = sess.run([outputs, _states])
variable_names = [v.name for v in tf.global_variables()]
values = sess.run(variable_names)
for k, v in zip(variable_names, values):
    print(k, v)
And the output of the variables is as follows.
two_sequances/rnn/basic_rnn_cell/kernel:0
[[ 0.6147509 0.6268855 ]
[ 0.34818882 0.8140872 ]
[ 0.4074654 0.011693 ]
[-0.5032909 -0.69920516]
[ 0.62231725 0.18967694]
[ 0.6888749 0.77280706]]
two_sequances/rnn/basic_rnn_cell/bias:0
[0. 0.]
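For reference (not an answer from the original thread): BasicRNNCell stores a single kernel of shape [input_dim + hidden_size, hidden_size], so the six rows above are the input-to-hidden weights W stacked on top of the hidden-to-hidden weights U, not six separate kernels. A sketch of splitting them, assuming that layout:
input_dim, hidden_size = 4, 2
kernel_value, bias_value = values   # the two variables fetched above (kernel, then bias)
W = kernel_value[:input_dim, :]     # (4, 2): multiplies the one-hot input x_t
U = kernel_value[input_dim:, :]     # (2, 2): multiplies the previous hidden state h_{t-1}
print(W.shape, U.shape)             # (4, 2) (2, 2)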

tensorflow: shift zeros to the end

Given a tensor (with numbers >= 0) in tensorflow, I need to shift all zeros to the end of each row and remove columns that only contain 0's.
E.g.
0 2 3 4
0 1 0 5
2 3 1 0
should be transformed to
2 3 4
1 5 0
2 3 1
Is there any nice way to do this in tensorflow? Btw, the order of the non-zero elements should be the same (no sorting).
Ragged tensor method
The best way
def rm_zeros(pred):
    pred = tf.cast(pred, tf.float32)
    # number of non-zero elements in every row
    num_non_zero = tf.count_nonzero(pred, -1)  # [3 2 3]
    # flatten the input and remove all zeros
    flat_pred = tf.reshape(pred, [-1])
    mask = tf.math.logical_not(tf.equal(flat_pred, tf.zeros_like(flat_pred)))
    flat_pred_without_zero = tf.boolean_mask(flat_pred, mask)  # [2. 3. 4. 1. 5. 2. 3. 1.]
    # create a ragged tensor and convert it back to a dense tensor; rows are padded to the max length
    ragged_pred = tf.RaggedTensor.from_row_lengths(values=flat_pred_without_zero, row_lengths=num_non_zero)
    padded_pred = ragged_pred.to_tensor(default_value=0.)
    return padded_pred
a = tf.constant([[0, 2, 3, 4],[0, 1, 0, 5],[2, 3, 1, 0]])
print(rm_zeros(a))
output
tf.Tensor(
[[2. 3. 4.]
 [1. 5. 0.]
 [2. 3. 1.]], shape=(3, 3), dtype=float32)
Sorted method
If you don't mind the original data getting sorted, the code below might be helpful, although it's not the best solution.
The idea here is
1. change all zeros to infinity
2. sort the tensor
3. change all infinity back to zeros
4. slice the tensor to get minimal padding
import numpy as np
def rm_zeros_sorted(input):
    input = tf.cast(input, tf.float32)
    # 1. change all zeros to infinity
    zero_to_inf = tf.where(tf.equal(input, tf.zeros_like(input)), np.inf * tf.ones_like(input), input)
    # 2. sort the tensor
    input_sorted = tf.sort(zero_to_inf, axis=-1, direction='ASCENDING')
    # 3. change all infinity back to zeros
    inf_to_zero = tf.where(tf.math.is_inf(input_sorted), tf.zeros_like(input_sorted), input_sorted)
    # 4. slice the tensor to get minimal padding
    num_non_zero = tf.count_nonzero(inf_to_zero, -1)
    max_non_zero = tf.reduce_max(num_non_zero)
    remove_useless_zero = inf_to_zero[..., 0:max_non_zero]
    return remove_useless_zero
a = tf.constant([[0, 2, 3, 4],[0, 1, 0, 5],[2, 3, 1, 0]])
print(rm_zeros_sorted(a))
output
tf.Tensor(
[[2. 3. 4.]
[1. 5. 0.]
[1. 2. 3.]], shape=(3, 3), dtype=float32)
The code below gets the trick done, although I'm sure that there are more elegant solutions possible and I'm curious to see those. The annoying part is that you have different amounts of zeros for each row.
a = tf.constant([[0, 2, 3, 4],[0, 1, 0, 5],[2, 3, 1, 0]])
boolean_mask = tf.logical_not(tf.equal(a, tf.zeros_like(a)))
# all the non-zero values in a flat tensor
non_zero_values = tf.gather_nd(a, tf.where(boolean_mask))
# number of non-zero values in each row
n_non_zero = tf.reduce_sum(tf.cast(boolean_mask, tf.int64), axis=-1)
# max number of non-zeros -> this will be the padding length
max_non_zero = tf.reduce_max(n_non_zero).numpy()
(Here it gets ugly)
# Split the tensor into flat tensors with the non-zero values of each row
rows = tf.split(non_zero_values, n_non_zero)
# Pad with zeros wherever necessary and recombine into a single tensor
tf.stack([tf.pad(r, paddings=[[0, max_non_zero - r.get_shape().as_list()[0]]]) for r in rows])
Produces the desired result:
<tf.Tensor: id=49, shape=(3, 3), dtype=int32, numpy=
array([[2, 3, 4],
[1, 5, 0],
[2, 3, 1]], dtype=int32)>
def shift_zeros(data, mask):
    data_flat = tf.boolean_mask(data, mask)
    nonzero_lens = tf.reduce_sum(tf.cast(mask, dtype=tf.int32), axis=-1)
    nonzero_mask = tf.sequence_mask(nonzero_lens, maxlen=tf.shape(mask)[-1])
    nonzero_data = tf.scatter_nd(tf.cast(tf.where(nonzero_mask), dtype=tf.int32), data_flat, shape=tf.shape(data))
    return nonzero_data
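A usage sketch for this helper (my own example call, run eagerly), where the mask marks the non-zero entries; note that this variant keeps the original number of columns instead of trimming the all-zero ones:
a = tf.constant([[0, 2, 3, 4], [0, 1, 0, 5], [2, 3, 1, 0]])
print(shift_zeros(a, tf.not_equal(a, 0)))
# [[2 3 4 0]
#  [1 5 0 0]
#  [2 3 1 0]]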

How to get tensorflow to do a convolution on a 2 x 2 matrix with a 1 x 2 kernel?

I have the following matrix:
[[0., 1.],
 [2., 3.]]
and the following kernel:
[[1., 2.]]
If I do a convolution with no padding and slide by 1 row, I should get the following answer:
[[2.],
 [8.]]
because:
0*1 + 1*2 = 2
2*1 + 3*2 = 8
Based on the documentation of tf.nn.conv2d, I thought this code expressed what I just described above:
import tensorflow as tf
input_batch = tf.constant([
[
[[.0], [1.0]],
[[2.], [3.]]
]
])
kernel = tf.constant([
[
[[1.0, 2.0]]
]
])
conv2d = tf.nn.conv2d(input_batch, kernel, strides=[1, 1, 1, 1], padding='VALID')
sess = tf.Session()
print(sess.run(conv2d))
But it produces this output:
[[[[ 0. 0.]
[ 1. 2.]]
[[ 2. 4.]
[ 3. 6.]]]]
And I have no clue how that is computed. I've tried experimenting with different values for the strides and padding parameters, but I still can't produce the result I expected.
It seems you have not read my explanation in the tutorial you linked carefully enough. After a straightforward modification for no padding and strides=1, you should end up with the following code.
import tensorflow as tf
k = tf.constant([
[1, 2],
], dtype=tf.float32, name='k')
i = tf.constant([
[0, 1],
[2, 3],
], dtype=tf.float32, name='i')
kernel = tf.reshape(k, [1, 2, 1, 1], name='kernel')
image = tf.reshape(i, [1, 2, 2, 1], name='image')
res = tf.squeeze(tf.nn.conv2d(image, kernel, [1, 1, 1, 1], "VALID"))
# VALID means no padding
with tf.Session() as sess:
    print(sess.run(res))
This gives you the result you expected: [2., 8.]. You get a vector instead of a column because of the squeeze operator.
One problem I see with your code (there might be others) is that your kernel has shape (1, 1, 1, 2), but it is supposed to be (1, 2, 1, 1).
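Assuming that is indeed the only issue, a minimal sketch of the original snippet with just the kernel reshaped to (filter_height=1, filter_width=2, in_channels=1, out_channels=1) could look like this:
import tensorflow as tf
input_batch = tf.constant([[[[0.], [1.]],
                            [[2.], [3.]]]])  # shape (1, 2, 2, 1)
kernel = tf.constant([[[[1.0]], [[2.0]]]])   # shape (1, 2, 1, 1)
conv2d = tf.nn.conv2d(input_batch, kernel, strides=[1, 1, 1, 1], padding='VALID')
with tf.Session() as sess:
    print(sess.run(conv2d))  # [[[[2.]] [[8.]]]]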

How to split a 1-D tensor by a certain index tensor?

There's a 1-D tensor of int32. I'd like to replace the elements before the first occurrence of 1 with 0.
#This is a numpy equivalent.
import numpy as np
a = np.array([5, 4, 1, 3, 1, 2, 3, 3, 1, 5], np.int32)
first_ind = np.where(a == 1)[0][0] # => 2
result = np.concatenate((np.zeros((first_ind,)), a[first_ind:]))
# =>[ 0. 0. 1. 3. 1. 2. 3. 3. 1. 5.]
import tensorflow as tf
_a = tf.convert_to_tensor(a)
_first_ind = tf.where(tf.equal(_a, 1))[0][0]
# But I don't know what to do next.
I found the answer myself.
import numpy as np
a = np.array([5, 4, 1, 3, 1, 2, 3, 3, 1, 5], np.int32)
first_ind = np.where(a == 1)[0][0] # => 2
result = np.concatenate((np.zeros((first_ind,)), a[first_ind:]))
# =>[ 0. 0. 1. 3. 1. 2. 3. 3. 1. 5.]
import tensorflow as tf
_a = tf.convert_to_tensor(a)
_first_ind = tf.where(tf.equal(_a, 1))[0]
zero_padding = tf.zeros(tf.to_int32(_first_ind), tf.int32)
_a_back = tf.slice(_a, _first_ind, [-1])
out = tf.concat(0, (zero_padding, _a_back))
with tf.Session() as sess:
    print(out.eval())
    # => [0 0 1 3 1 2 3 3 1 5]
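On newer TF versions the tf.concat argument order is (values, axis) and tf.to_int32 is deprecated, so a possible eager-mode sketch of the same idea (my own adaptation) would be:
import tensorflow as tf
a = tf.constant([5, 4, 1, 3, 1, 2, 3, 3, 1, 5], tf.int32)
first_ind = tf.where(tf.equal(a, 1))[0]        # shape (1,): position of the first 1
zero_padding = tf.zeros(first_ind, tf.int32)   # that many leading zeros
out = tf.concat([zero_padding, a[first_ind[0]:]], axis=0)
print(out)  # [0 0 1 3 1 2 3 3 1 5]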