Loss added to custom layer in tensorflow 2 is cleared when compiling - tensorflow

I am trying to port the implementation of concrete dropout in keras in https://github.com/yaringal/ConcreteDropout/blob/master/concrete-dropout-keras.ipynb to tensorflow 2. This is mostly straightforward, as tf 2 has most of the keras API built into it. However, the custom losses are being cleared before fitting.
After the model is defined, and before compiling it, I can see that the losses for each concrete dropout layer have been added to the model losses by the line self.layer.add_loss(regularizer) run when the layers are built:
>>> print(model.losses)
[<tf.Tensor: id=64, shape=(), dtype=float32, numpy=-8.4521576e-05>, <tf.Tensor: id=168, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=272, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=376, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=479, shape=(), dtype=float32, numpy=-0.000650166>]
After the compilation, however, model.losses becomes an empty list, and the assertion assert len(model.losses) == 5 fails. If I choose to ignore the assertion, the fact that the layer losses are being neglected shows up in the warning WARNING:tensorflow:Gradients do not exist for variables ['concrete_dropout/p_logit:0', 'concrete_dropout_1/p_logit:0', 'concrete_dropout_2/p_logit:0', 'concrete_dropout_3/p_logit:0', 'concrete_dropout_4/p_logit:0'] when minimizing the loss. when training the model.
After digging into the compilation code in https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/training.py#L184 I believe the problematic lines are
# Clear any `_eager_losses` that was added.
self._clear_losses()
Why is this being done at model compilation? And how can I add a loss per layer in tensorflow 2, if this is not the way to do it?

Since that custom loss is not dependent on the model's inputs, you should add it using a zero-argument callable like this:
self.layer.add_loss(lambda: regularizer)

Related

'Tensor' vs 'tf.Tensor' tensorflow

I am exploring tensorflow internals, and sometimes when I print the value of a tensor, I will see data like the following: Tensor("x/PlaceholderWithDefault:0", shape=(), dtype=int32)
and other times I will see tf.Tensor(0, shape=(), dtype=int32).
What is the difference between these two expressions? Are Tensor and tf.Tensor different? And if not, why are they displayed differently (and seem to have different behavior)?

Access Tensorflow tensor value

I have a tensor with the following properties. I want to save the numpy = 1 but I don't know how to access this value. How do I do this?
test_labels[1]
<tf.Tensor: shape=(), dtype=int32, numpy=1>
You can use tf.print() to get the values of the tensor.
For Example:
a=tf.constant(1) #Output:<tf.Tensor: shape=(), dtype=int32, numpy=1>
tf.print(a) #Output:1

Is it possible to send different subset of weights to different clients?

I'm trying to use tensorflow-federated to select different subset of weights at the server and send them to the clients. The clients then would train and send back the trained weights. The server aggregates the results and starts a new communication round.
The main problem is that I cannot access the numpy version of the weights and therefore I don't know how to access a subset of them for each layer. I tried using tf.gather_nd and tf.tensor_scatter_nd_update to perform selection and update, but they only work for tensors, and not lists of tensors (as the server_state is in tensorflow-federated).
Does anyone have any hint to solve this problem? Is it even possible to send different weights to each client?
If I follow correctly, a way to write the high-level computation being described in the TFF type shorthand would be:
#tff.federated_computation(...)
def run_one_round(server_state, client_datasets):
weights_subset = tff.federated_map(subset_fn, server_state)
clients_weights_subset = tff.federated_broadcast(weights_subset)
client_models = tff.federated_map(client_training_fn,
(clients_weights_subset, client_datasets))
aggregated_update = tff.federated_aggregate(client_models, ...)
new_server_state = tff.federated_map(apply_aggregated_update_fn, server_state)
return new_server_state
If this is true, it seems like the majority of the work needs to happen in subset_fn which takes the server state and returns a subset of the global mode weights. Generally a model is a structure (list or dict, possibly nested) of tf.Tensor, which as you observed cannot be used as an argument to tf.gather_nd or tf.tensor_scatter_nd_update. However, they can be be applied pointwise to the structure of tensors uses tf.nest.map_structure. For example, selecting the value at [0, 0] from a nested structure of three tensors:
import tensorflow as tf
import pprint
struct_of_tensors = {
'trainable': [tf.constant([[2.0, 4.0, 6.0]]), tf.constant([[5.0]])],
'non_trainable': [tf.constant([[1.0]])],
}
pprint.pprint(tf.nest.map_structure(
lambda tensor: tf.gather_nd(params=tensor, indices=[[0, 0]]),
struct_of_tensors))
>>> {'non_trainable': [<tf.Tensor: shape=(1,), dtype=float32, numpy=array([1.], dtype=float32)>],
'trainable': [<tf.Tensor: shape=(1,), dtype=float32, numpy=array([2.], dtype=float32)>,
<tf.Tensor: shape=(1,), dtype=float32, numpy=array([5.], dtype=float32)>]}

Tensorflow dataset.shuffle seems not shuffle without repeat()

My code has similar pattern with tensorflow 2.0 tutorial.
I want my dataset object to reshuffle in every epochs.
dataset = tf.data.Dataset.from_tensor_slices(['a','b','c','d'])
dataset = dataset.shuffle(100)
for epoch in range(10):
for d in dataset:
print(d)
Result:
tf.Tensor(b'c', shape=(), dtype=string)
tf.Tensor(b'a', shape=(), dtype=string)
tf.Tensor(b'b', shape=(), dtype=string)
tf.Tensor(b'd', shape=(), dtype=string)
tf.Tensor(b'c', shape=(), dtype=string)
tf.Tensor(b'a', shape=(), dtype=string)
tf.Tensor(b'b', shape=(), dtype=string)
tf.Tensor(b'd', shape=(), dtype=string)
...
It seems the dataset doesn't shuffle for each epoch.
Should I call .shuffle() for each epoch?
Yes, you should call .shuffle during the inner loop. Moreover, it is better to do not mix python code and TensorFlow code when pure tf.* method equivalent to the Python statements are available.
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices(["a", "b", "c", "d"])
# dataset = dataset.shuffle(2)
#tf.function
def loop():
for epoch in tf.range(10):
for d in dataset.shuffle(2):
tf.print(d)
loop()
The loop call produces the different values every time (and tf.print prints the content of the tf.Tensor, differently from print that prints the object).

How to debug Keras in TensorFlow 2.0?

Actually, I find the problem already in TensorFlow 1.13.0. (tensorflow1.12.0 works well).
My code is listed as a simple example:
def Lambda layer(temp):
print(temp)
return temp
which is used as a lambda layer in my Keras model.
In tensorflow1.12.0, the print(temp) can output the detail data like following
[<tf.Tensor: id=250, shape=(1024, 2, 32), dtype=complex64, numpy=
array([[[ 7.68014073e-01+0.95353246j, 7.01403618e-01+0.64385843j,
8.30483198e-01+1.0340731j , ..., -8.88018191e-01+0.4751519j ,
-1.20197642e+00+0.6313924j , -1.03787208e+00+0.22964947j],
[-7.94382274e-01+0.56390345j, -4.73938555e-01+0.55901265j,
-8.73749971e-01+0.67095983j, ..., -5.81580341e-01-0.91620034j,
-7.04443693e-01-1.2709806j , -3.23135853e-01-1.0887597j ]],
It is because I use the 1024 as batch_size. But when I update to tensorflow1.13.0 or TensorFlow 2.0, the same code's output
Tensor("lambda_1/truediv:0", shape=(None, 1), dtype=float32)
This is terrible since I can not know the exact mistakes.
So, any idea about how to solve it?
You see that output because the Keras model is being converted to its graph representation, and thus print printes the tf.Tensor graph description.
To see the content of a tf.Tensor when using Tensorflow 2.0 you should use tf.print instead of print since the former gets converted to its graph representation while the latter doesn't.