TensorFlow dataset.shuffle does not seem to shuffle without repeat()

My code follows a pattern similar to the TensorFlow 2.0 tutorial.
I want my dataset object to reshuffle at every epoch.
dataset = tf.data.Dataset.from_tensor_slices(['a','b','c','d'])
dataset = dataset.shuffle(100)
for epoch in range(10):
    for d in dataset:
        print(d)
Result:
tf.Tensor(b'c', shape=(), dtype=string)
tf.Tensor(b'a', shape=(), dtype=string)
tf.Tensor(b'b', shape=(), dtype=string)
tf.Tensor(b'd', shape=(), dtype=string)
tf.Tensor(b'c', shape=(), dtype=string)
tf.Tensor(b'a', shape=(), dtype=string)
tf.Tensor(b'b', shape=(), dtype=string)
tf.Tensor(b'd', shape=(), dtype=string)
...
It seems the dataset doesn't shuffle for each epoch.
Should I call .shuffle() for each epoch?

Yes, you should call .shuffle in the inner loop. Moreover, it is better not to mix Python code and TensorFlow code when pure tf.* equivalents of the Python statements are available.
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices(["a", "b", "c", "d"])
# dataset = dataset.shuffle(2)
#tf.function
def loop():
    for epoch in tf.range(10):
        for d in dataset.shuffle(2):
            tf.print(d)
loop()
The loop call produces different values every time (and tf.print prints the content of the tf.Tensor, unlike print, which prints the object).
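In TensorFlow 2.x releases newer than the 2.0 used in the question, a single shuffle call outside the loop should also give a new order on each pass, because shuffle takes a reshuffle_each_iteration argument. The following is a minimal sketch under that assumption, not a statement about what 2.0 itself does. The buffer size of 4 is chosen here to cover the whole dataset, since a buffer smaller than the dataset (such as the 2 above) only shuffles locally.

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(["a", "b", "c", "d"])
# A buffer at least as large as the dataset gives a full shuffle;
# reshuffle_each_iteration=True asks for a new order on every pass.
dataset = dataset.shuffle(buffer_size=4, reshuffle_each_iteration=True)

orders = []
for epoch in range(3):
    # Each `for` over the dataset creates a fresh iterator.
    order = [d.numpy().decode() for d in dataset]
    orders.append(order)
    print(epoch, order)
```

Every epoch should contain all four elements exactly once; the order itself is random, so no particular output is shown.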

Related

Access Tensorflow tensor value

I have a tensor with the following properties. I want to save its numpy value (1), but I don't know how to access it. How do I do this?
test_labels[1]
<tf.Tensor: shape=(), dtype=int32, numpy=1>
You can use tf.print() to get the values of the tensor.
For Example:
a=tf.constant(1) #Output:<tf.Tensor: shape=(), dtype=int32, numpy=1>
tf.print(a) #Output:1
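tf.print only displays the value. To store it in a Python variable, as the question asks, an eager tensor also exposes .numpy(). A minimal sketch, assuming eager execution (the TensorFlow 2 default); the names value and as_int are only illustrative:

```python
import tensorflow as tf

a = tf.constant(1)
value = a.numpy()    # NumPy scalar (numpy.int32) holding 1
as_int = int(value)  # plain Python int, if that is more convenient
print(value, as_int)
```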

How to multiply outputs of the neurons in a layer? (preferably in Keras or Tensorflow)

Consider I have a dense layer and I want to multiply the outputs of the units in my next layer:
dense_layer = Dense(n_unit)(input_layer)
next_layer = {Multiply outputs of the n_unit in the dense layer}
Is there any simple way to achieve this functionality?
If not, is it even possible to define a new layer for this purpose, or is there a fundamental limitation?
You can use a Lambda layer together with the tf.reduce_prod function.
For example:
import tensorflow as tf
dense_layer = [1,2,3,4,5]
next_layer = tf.keras.layers.Lambda(lambda x: tf.reduce_prod(x))(dense_layer)
print(next_layer) # <tf.Tensor: shape=(), dtype=int32, numpy=120>
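Note that tf.reduce_prod without an axis multiplies every element, including across the batch dimension. For the output of a real Dense layer you probably want one product per sample. The sketch below assumes that goal and uses a small constant tensor as a stand-in for the Dense output; dense_out and multiply_units are illustrative names.

```python
import tensorflow as tf

# Stand-in for a batch of Dense outputs: 2 samples, 3 units each.
dense_out = tf.constant([[1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])

# Multiply along the last (unit) axis only, keeping the batch axis.
multiply_units = tf.keras.layers.Lambda(
    lambda x: tf.reduce_prod(x, axis=-1, keepdims=True))

product = multiply_units(dense_out)
print(product.numpy())  # one product per sample: 6 and 120
```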

Calculate losses and computing gradients for multiple layers at once in tensorflow with tf.GradientTape()

If my understanding of layers is correct, layers use tf.Variable for their weights. So if a Dense() layer has 3 units, it uses something like w = tf.Variable([0.2,5,0.9]) for a single instance, and if the input_shape is 2, the variable would be something like w = tf.Variable([[0.2,5,0.9],[2,3,0.4]])?
Please correct me if I am wrong.
I am learning the basics of TensorFlow and found some code, which I modified as follows:
import tensorflow as tf
weight = tf.Variable([3.2])
def get_lost_loss(w):
    '''
    A very hypothetical function since the name
    '''
    return (w**1.3)/3.1 # just felt like doing it
def calculate_gradient(w):
    with tf.GradientTape() as tape:
        loss = get_lost_loss(w) # calculate loss WITHIN tf.GradientTape()
    grad = tape.gradient(loss,w) # gradient of loss wrt. w
    return grad
# train and apply the things here
opt = tf.keras.optimizers.Adam(learning_rate=0.01)
losses = []
for i in range(50):
    grad = calculate_gradient(weight)
    opt.apply_gradients(zip([grad],[weight]))
    losses.append(get_lost_loss(weight))
Could someone please give me an intuition of what is happening inside tf.GradientTape()? Also, what I most want to know is: if I have to do this for weight1 and weight2, whose shapes are [2,3], instead of weight, how should the code be modified?
Please make any assumptions that need to be made. You all are far more skilled than me in this field.
Yes, you are right. A Dense layer has two variables. The one you mentioned is called the kernel, and the other one is called the bias. The example below explains this in detail:
import tensorflow as tf
w=tf.Variable([[3.2,5,6,7,5]],dtype=tf.float32)
d=tf.keras.layers.Dense(3,input_shape=(5,)) # Layer d gets inputs with shape (*,5) and generates outputs with shape (*,3)
# It has kernel variable with shape (5,3) and bias variable with shape (3)
print("Output of applying d on w:", d(w))
print("\nLayer d trainable variables:\n", d.trainable_weights)
The output will be something like:
Output of applying d on w: tf.Tensor([[ -0.9845681 -10.321521 7.506028 ]], shape=(1, 3), dtype=float32)
Layer d trainable variables:
[<tf.Variable 'dense_18/kernel:0' shape=(5, 3) dtype=float32, numpy=
array([[-0.8144073 , -0.8408185 , -0.2504158 ],
[ 0.6073988 , 0.09965736, -0.32579994],
[ 0.04219657, -0.33530533, 0.71029276],
[ 0.33406 , -0.673926 , 0.77048916],
[-0.8014116 , -0.27997494, 0.05623555]], dtype=float32)>, <tf.Variable 'dense_18/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
tf.GradientTape() records the operations performed on the trainable weights (variables) within its context for automatic differentiation, so that we can later get the derivative of a result, such as the loss, with respect to those variables.
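As a minimal illustration of that recording step, here is a one-variable sketch with a function whose derivative is easy to check by hand (y = x squared, so dy/dx = 2x). The names x, y and grad are illustrative only.

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    # Operations on trainable variables are recorded while the tape is open.
    y = x ** 2
# After the block, the tape replays the recorded operations backwards.
grad = tape.gradient(y, x)  # dy/dx = 2 * x
print(grad.numpy())  # 6.0
```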
Suppose you have two weight variables, weight1 and weight2. First, you need to change your loss function to use both variables. Then, in each step, you need to get the derivative of the loss function with respect to the variables and update them to minimize the loss. Please see the code below.
import tensorflow as tf
weight1 = tf.Variable([[3.2,5,6],[2,5,4]],dtype=tf.float32) #modified
weight2 = tf.Variable([[1,2,3],[1,4,3]],dtype=tf.float32) #modified
def get_lost_loss(w1, w2): #modified
    '''
    A very hypothetical function since the name
    '''
    return tf.reduce_sum(tf.math.add(w1**1.2/2,w2**2)) # just felt like doing it
def calculate_gradient(w1,w2):
    with tf.GradientTape() as tape:
        loss = get_lost_loss(w1,w2) # calculate loss WITHIN tf.GradientTape()
    dw1,dw2 = tape.gradient(loss,[w1,w2]) # gradient of loss wrt. w1,w2
    return dw1,dw2
# train and apply the things here
opt = tf.keras.optimizers.Adam(learning_rate=0.01)
losses = []
for i in range(500):
    grad_weight1, grad_weight2 = calculate_gradient(weight1,weight2)
    opt.apply_gradients(zip([grad_weight1, grad_weight2],[weight1,weight2]))
    losses.append(get_lost_loss(weight1,weight2))
    print("loss: "+str(get_lost_loss(weight1,weight2).numpy()))

Loss added to custom layer in tensorflow 2 is cleared when compiling

I am trying to port the implementation of concrete dropout in keras in https://github.com/yaringal/ConcreteDropout/blob/master/concrete-dropout-keras.ipynb to tensorflow 2. This is mostly straightforward, as tf 2 has most of the keras API built into it. However, the custom losses are being cleared before fitting.
After the model is defined, and before compiling it, I can see that the losses for each concrete dropout layer have been added to the model losses by the line self.layer.add_loss(regularizer) run when the layers are built:
>>> print(model.losses)
[<tf.Tensor: id=64, shape=(), dtype=float32, numpy=-8.4521576e-05>, <tf.Tensor: id=168, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=272, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=376, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=479, shape=(), dtype=float32, numpy=-0.000650166>]
After compilation, however, model.losses becomes an empty list, and the assertion assert len(model.losses) == 5 fails. If I choose to ignore the assertion, the neglected layer losses show up when training the model as the warning WARNING:tensorflow:Gradients do not exist for variables ['concrete_dropout/p_logit:0', 'concrete_dropout_1/p_logit:0', 'concrete_dropout_2/p_logit:0', 'concrete_dropout_3/p_logit:0', 'concrete_dropout_4/p_logit:0'] when minimizing the loss.
After digging into the compilation code in https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/training.py#L184 I believe the problematic lines are
# Clear any `_eager_losses` that was added.
self._clear_losses()
Why is this being done at model compilation? And how can I add a loss per layer in tensorflow 2, if this is not the way to do it?
Since that custom loss is not dependent on the model's inputs, you should add it using a zero-argument callable like this:
self.layer.add_loss(lambda: regularizer)

Using seed to sample in tensorflow-probability

I am trying to use tensorflow-probability and started off with something really simple:
import tensorflow as tf
import tensorflow_probability as tfp
tf.enable_eager_execution()
tfd = tfp.distributions
poiss = tfd.Poisson(0.8)
poiss.sample(2, seed=1)
#> Out: <tf.Tensor: id=3569, shape=(2,), dtype=float32, numpy=array([0., 0.], dtype=float32)>
poiss.sample(2, seed=1)
#> Out: <tf.Tensor: id=3695, shape=(2,), dtype=float32, numpy=array([1., 0.], dtype=float32)>
poiss.sample(2, seed=1)
#> Out: <tf.Tensor: id=3824, shape=(2,), dtype=float32, numpy=array([2., 2.], dtype=float32)>
poiss.sample(2, seed=1)
#> Out: <tf.Tensor: id=3956, shape=(2,), dtype=float32, numpy=array([0., 1.], dtype=float32)>
I was thinking I would get the same results when re-using the same seed, but somehow that's not true.
I also tried without eager execution, but the results still weren't reproducible. Same story if I add something like tf.set_random_seed(12).
I suppose there is something basic I am missing?
For those interested, I am running Python 3.5.2 on Ubuntu 16.04 with
tensorflow-probability==0.5.0
tensorflow==1.12.0
For deterministic output in graph mode, you need to set both the graph random seed (tf.set_random_seed) and the op random seed (seed= in your sample call).
The workings of random samplers in TFv2 are still being sorted out. For now, my best understanding is that you can call tf.set_random_seed prior to each call to a sampler, and pass the sampler a seed=, if you want deterministic output in eager.
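A minimal sketch of that "set the global seed before each call, and pass an op seed" pattern follows. It makes two assumptions that are not in the answer above: it uses the TensorFlow 2 name tf.random.set_seed (the successor of tf.set_random_seed), and it uses the core tf.random.uniform op rather than a TFP sampler, so it illustrates the idea without guaranteeing the same behavior for every TFP version. The helper name draw is illustrative.

```python
import tensorflow as tf

def draw():
    # Resetting the global seed also resets the per-op counters,
    # so the op-level seed starts from the same state each time.
    tf.random.set_seed(12)
    return tf.random.uniform([2], seed=1)

first = draw()
second = draw()
print(first.numpy(), second.numpy())
```

Both calls should print the same pair of numbers; without the tf.random.set_seed call inside draw, the second call would continue the sequence instead of restarting it.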
This is now cleaner; we support fully deterministic randomness in TFP. You can pass a tuple of two ints as the seed, or a Tensor of shape (2,), to trigger the deterministic behavior. tfp.random.split_seed is also relevant here.
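The two-integer seed works like the seeds of TensorFlow's stateless random ops, where the output depends only on the seed and the shape. The sketch below shows that idea with core tf.random.stateless_uniform rather than TFP itself, so treat it as an illustration of the seed convention and not as TFP's own code path.

```python
import tensorflow as tf

# A seed of shape (2,), as described above.
seed = tf.constant([1, 2], dtype=tf.int32)

# Stateless ops keep no hidden counter: same seed, same output.
a = tf.random.stateless_uniform([2], seed=seed)
b = tf.random.stateless_uniform([2], seed=seed)
print(a.numpy(), b.numpy())
```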
Besides setting the seed for sample or sample_chain in mcmc, you might need to set the following as well:
import os
import random
import numpy as np
import tensorflow as tf
seed = 24
os.environ['TF_DETERMINISTIC_OPS'] = 'true'
os.environ['PYTHONHASHSEED'] = f'{seed}'
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)