For loop in tensorflow/ keras - tensorflow

I am trying to use a for loop within a model definition (and attempting to recreate TabNet in keras).
class TabNet(keras.Model):
    def __init__(self, input_dim, output_dim, steps, n_d, n_a, gamma=1.3):
        super().__init__()
        self.n_d, self.n_a, self.steps = n_d, n_a, steps
        self.shared = SharedBlock(n_d+n_a)
        self.first_block = SharedBlock(n_a)
        self.decision_blocks = [DecisionBlock(n_d+n_a)] * steps
        self.prior_scale = Prior(input_dim, gamma)
        self.bn = layers.BatchNormalization()
        self.attention = [AttentiveTransformer(input_dim)] * steps
        self.final = layers.Dense(output_dim)
        self.eps = 1e-8

    @tf.function
    def call(self, x):
        self.prior_scale.reset()
        final_out = 0
        M_loss = 0
        x = self.bn(x)
        attention = self.first_block(self.shared(x))
        for i in range(self.steps):
            mask = self.attention[i](attention, self.prior_scale.P)
            M_loss += tf.reduce_sum(mask * tf.math.log(mask + self.eps), axis=-1) / self.steps
            prior = self.prior_scale(mask)
            out = self.decision_blocks[i](self.shared(x * prior))
            attention, output = out[:,:self.n_a], out[:,self.n_a:]
            final_out += tf.nn.relu(output)
        return self.final(final_out), M_loss
If you're unaware of what those individual blocks are, simply assume that they are linear layers. I have a colab notebook with the full code if you wish to see what they actually are.
However, I cannot train it because I get the error "iterating over tf.Tensor is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function." I have decorated it, and that still does not help.
I am fairly certain it is the for loop that is causing me the error when I do model.fit(train_x, train_y). Would appreciate any thoughts on how to implement the above for loop in the tensorflow way. tf.while_loop is all I have seen so far and the examples given are fairly simplistic compared to what I want to do.

This is my proposal.
I don't know exactly what your network does, but from what I can see you want to produce two outputs and combine them inside your loss. One of your outputs is also the result of some hidden operation inside the network (M_loss).
If you want to return two outputs, Keras needs two targets in order to fit. In the code I provide below, the first target is the real labels and the other is a dummy target (an array of zeros).
As said before, you are trying to build a combined loss of the form sparse_entropy(y_true, y_pred) - reg_sparse * M_loss. To make this possible I split the loss into two pieces (one for each output): the sparse part and the M_loss part. The sparse loss is simply SparseCategoricalCrossentropy(from_logits=True) from Keras, while for M_loss I wrote this function following your code:
def m_loss(y_true, y_pred):
    m = tf.reduce_mean(y_pred, keepdims=True)
    return m
m_loss uses only y_pred, which holds the hidden quantities computed inside your network. y_true does not matter for this operation, which is why we pass an array of zeros when fitting.
At this point we have to combine the two losses, which is possible in Keras in this way:
reg_sparse = 0.1
model.compile('Adam', loss=[sce, m_loss], loss_weights=[1,-reg_sparse])
model.fit(train_x, [train_y, np.zeros(train_y.shape[0])], epochs=3)
In this case, the final loss is the combination 1*sce + (-reg_sparse)*m_loss.
This is the full running code: https://colab.research.google.com/drive/152q1rmqTJ0dWLbFN8PqzCBhWkVKirkU5?usp=sharing
I also made some small changes in TabNet, for example in the way final_out and M_loss are created.
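To make the weighting concrete, here is a minimal plain-Python sketch of how two per-output loss values are combined through loss_weights. The numeric values are made up for illustration, and combine_losses is a hypothetical helper, not a Keras function.

```python
def combine_losses(losses, loss_weights):
    """Return the weighted sum of per-output loss values."""
    if len(losses) != len(loss_weights):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(loss_weights, losses))


reg_sparse = 0.1
sce_value = 0.70      # made-up cross-entropy value for the first output
m_loss_value = -2.0   # made-up value of m_loss for the second output

# Mirrors loss_weights=[1, -reg_sparse] from the compile call above.
total = combine_losses([sce_value, m_loss_value], [1, -reg_sparse])
print(round(total, 6))
```

With these made-up numbers the second term contributes (-0.1) * (-2.0) = 0.2, so the sign of the weight is what turns M_loss into a penalty or a bonus.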

No, actually it is not a problem with the for loop. I checked your code; the problem was that you forgot to call the superclass constructor in your SharedBlock, DecisionBlock and Prior.
For example, your code should look like this:
class SharedBlock(layers.Layer):
    def __init__(self, units, mult=tf.sqrt(0.5)):
        super().__init__()
        self.layer1 = FCBlock(units)
        self.layer2 = FCBlock(units)
        self.mult = mult
After making these changes you will not see that error again, but something else comes up.
TypeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1147 predict_function *
outputs = self.distribute_strategy.run(
<ipython-input-46-f609cb1acdfa>:15 call *
self.prior_scale.reset()
TypeError: tf__reset() missing 1 required positional argument: 'len_x'
To resolve this issue you will need to make the following change in the class Prior(layers.Layer):
def reset(self, len_x=1.0):
    self.P = 1.0
Then you will get another issue.
AttributeError: in user code:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:1147 predict_function *
outputs = self.distribute_strategy.run(
<ipython-input-46-f609cb1acdfa>:26 call *
out = self.decision[i](self.shared(x * prior))
AttributeError: 'TabNet' object has no attribute 'decision'
For this issue I would ask you to open another question, as I think your main issue is resolved.
UPDATE:
You can look in the comment section of this answer, where a solution has been provided for the issue AttributeError: 'TabNet' object has no attribute 'decision'.
UPDATE: 21/07
I have to disappoint you again: the issue is not with the for loop.
If you look closely at the error log you will see that the issue is due to the full_loss function.
<ipython-input-10-07e59f23d230>:7 full_loss *
logits, M_loss = y_pred
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:561 __iter__
self._disallow_iteration()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:554 _disallow_iteration
self._disallow_when_autograph_enabled("iterating over `tf.Tensor`")
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:532 _disallow_when_autograph_enabled
" decorating it directly with @tf.function.".format(task))
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function.
The exact problem is caused by the statement below.
logits, M_loss = y_pred
If you use the code below, which does not use your loss function, you will see a different result.
model.compile('Adam', loss='sparse_categorical_crossentropy')
model.fit(train_x, train_y, batch_size=1)
Received a label value of 1 which is outside the valid range of [0, 1). Label values: 1
[[node sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-26-d39f533b7a69>:2) ]] [Op:__inference_train_function_18003]
I do not understand the model code completely, and model.summary() is not that helpful in your case. There is some problem with your last layer; at least the error message suggests that it does not have enough neurons (one for each class).
I suggest looking into the last layer and the loss function.
The reason I am sure it is not due to the for loop is that even if you comment out the for loop you will still receive the same error.
I hope I have helped you further; it took me a few hours to figure it out.
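The message "outside the valid range of [0, 1)" says that the cross-entropy op saw a single output unit, so the only label it accepts is 0. A sanity check along those lines can be sketched in plain Python; labels_fit_output is a hypothetical helper and the label values are made up.

```python
def labels_fit_output(labels, num_units):
    """Return True if every integer label lies in [0, num_units)."""
    return all(0 <= int(label) < num_units for label in labels)


labels = [0, 1, 1, 0]  # made-up binary labels

fits_one_unit = labels_fit_output(labels, num_units=1)   # label 1 is outside [0, 1)
fits_two_units = labels_fit_output(labels, num_units=2)  # two units cover labels 0 and 1
print(fits_one_unit, fits_two_units)
```

Running a check like this against the size of the final Dense layer before calling fit would point at the output dimension rather than at the loop.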

Related

Using Tensorflow Dataset from_generator() to create multi Input/Output with Custom Generator and ImageDataGenerator

I am trying to scale up my model, which uses a "cluster loss" extension. The implementation works so far on MNIST, but I would like to benefit from data augmentation and multi-processing for the real dataset.
In short, the network follows work done with the "centre loss", which somewhat resembles a Siamese network. The important part of the architecture is that the model has 2 inputs and 2 outputs. Therefore, I implemented a custom generator to feed the model as follows:
def my_generator(stop):
    i = 0
    while i < stop:
        batch = train_gen.next()
        img = batch[0]
        labels = batch[1]
        labels_size = np.shape(labels)
        cluster = np.zeros(labels_size)
        x = [img, labels]
        y = [labels, cluster]
        yield x, y
        i += 1
which calls the generator ("train_gen") defined as follows:
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, horizontal_flip=True)
train_gen = generator.flow_from_dataframe(df, x_col='img_path', y_col='label',
                                          class_mode='categorical',
                                          target_size=(32, 32),
                                          batch_size=batch_size)
The generator works if I set only one worker in the fit function. But obviously it's painfully slow... So I tried to use the recommended tf.data API (tf.data.Dataset.from_generator) to fit my model, but when I set it up as follows,
ds = tf.data.Dataset.from_generator(my_generator,
                                    args=[num_iter],
                                    output_types=([tf.float32, tf.float32], [tf.float32, tf.float32]))
I got the following error:
TypeError: Cannot convert value [tf.float32, tf.float32] to a Tensorflow DType.
From there, I tried multiple things, following this post
For example, trying to return tuples instead of arrays:
x = (img, labels)
y = (labels, cluster)
But I got:
ValueError: as_list() is not defined on an unknown TensorShape
Does anyone have experience with this? I am not sure I understand the error, and I am thinking that I could perhaps change the "output_types" argument, but TensorFlow has no "list" or "tuple" DType.
Here is a link to my code, which constructs a small image dataset from cifar10 to feed a toy model.
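One direction that may address both errors quoted above is to describe the nested structure with tuples (tf.data treats lists as values to convert, not as nesting) and to declare the shapes explicitly. The sketch below is an assumption-laden illustration, not the asker's code: toy_generator is a hypothetical stand-in for my_generator, the shapes (32, 32, 3) and 10 classes are made up, and output_signature requires a reasonably recent TensorFlow (2.4 or later). On older versions I believe output_types and output_shapes accept the same nested tuples.

```python
import numpy as np
import tensorflow as tf


def toy_generator(num_batches, batch_size=4, num_classes=10):
    """Hypothetical stand-in: yields ((img, labels), (labels, cluster))."""
    for _ in range(num_batches):
        img = np.random.rand(batch_size, 32, 32, 3).astype("float32")
        idx = np.random.randint(0, num_classes, size=batch_size)
        labels = np.eye(num_classes, dtype="float32")[idx]  # one-hot labels
        cluster = np.zeros_like(labels)                      # dummy second target
        # Tuples, not lists: the structure must match the signature below.
        yield (img, labels), (labels, cluster)


# Nested tuples of TensorSpec mirror the two-input / two-output structure.
signature = (
    (tf.TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32),
     tf.TensorSpec(shape=(None, 10), dtype=tf.float32)),
    (tf.TensorSpec(shape=(None, 10), dtype=tf.float32),
     tf.TensorSpec(shape=(None, 10), dtype=tf.float32)),
)

ds = tf.data.Dataset.from_generator(lambda: toy_generator(3),
                                    output_signature=signature)

(x_img, x_lab), (y_lab, y_clu) = next(iter(ds))
print(x_img.shape, x_lab.shape, y_lab.shape, y_clu.shape)
```

Declaring the shapes is what gives Keras a known TensorShape to work with; leaving them out is a plausible cause of the as_list() error.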
I do not think your generator works as you expect. Each call to my_generator(stop) creates a new generator object that starts again at i = 0. Also, in the code
yield x, y
i += 1
the i += 1 line does not run until the generator is resumed after the yield. Put a print statement as below
yield x, y
i += 1
print('the value of i is', i)
and you will see it does not execute when you run
x, y = next(my_generator(2))
because that generator is suspended at the yield and then discarded. However, if you execute
x, y = my_generator(2)
then the i += 1 statement does execute, because unpacking iterates the generator to the end (note that x and y are then the first and second yielded pairs). Normally you create the generator once and call next() on that same object repeatedly. I believe model.fit gets the next batch by calling next() on the generator you specify.
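The suspend-and-resume behaviour can be checked with a small plain-Python sketch. counting_generator and its log argument are hypothetical names used only for this illustration.

```python
def counting_generator(stop, log):
    """Yield 0..stop-1 and record every time the code after `yield` runs."""
    i = 0
    while i < stop:
        yield i
        i += 1
        log.append(i)  # runs only when the generator is resumed


log = []
gen = counting_generator(3, log)

first = next(gen)             # runs up to the first yield, then suspends
log_after_first = list(log)   # nothing after the yield has run yet

second = next(gen)            # resumes: i += 1 and the append run, then yields again
log_after_second = list(log)

# Unpacking drives a fresh generator to exhaustion, so the code after
# each yield runs for every item.
unpack_log = []
a, b = counting_generator(2, unpack_log)

print(first, log_after_first, second, log_after_second, (a, b), unpack_log)
```

The same object has to be reused across next() calls; creating a new generator each time always restarts at zero.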

Keras Model - Get input in custom loss function

I am having trouble with Keras Custom loss function. I want to be able to access truth as a numpy array.
Because it is a callback function, I think I am not in eager execution, which means I can't access it using the backend.get_value() function. I also tried different methods, but it always comes back to the fact that this 'Tensor' object doesn't exist.
Do I need to create a session inside the custom loss function?
I am using Tensorflow 2.2, which is up to date.
def custom_loss(y_true, y_pred):
    # 4D array that has the label (0) and a multiplier input dependant
    truth = backend.get_value(y_true)
    loss = backend.square((y_pred - truth[:,:,0]) * truth[:,:,1])
    loss = backend.mean(loss, axis=-1)
    return loss
model.compile(loss=custom_loss, optimizer='Adam')
model.fit(X, np.stack(labels, X[:, 0], axis=3), batch_size = 16)
I want to be able to access truth. It has two components (the label and a multiplier that is different for each item). I saw a solution that is input dependent, but I am not sure how to access the value: Custom loss function in Keras based on the input data
I think you can do this by enabling run_eagerly=True in model.compile as shown below.
model.compile(loss=custom_loss(weight_building, weight_space),optimizer=keras.optimizers.Adam(), metrics=['accuracy'],run_eagerly=True)
I think you also need to update custom_loss as shown below.
def custom_loss(weight_building, weight_space):
    def loss(y_true, y_pred):
        truth = backend.get_value(y_true)
        error = backend.square((y_pred - y_true))
        mse_error = backend.mean(error, axis=-1)
        return mse_error
    return loss
I am demonstrating the idea with simple MNIST data. Please take a look at the code here.
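As an alternative to run_eagerly=True, the label and multiplier can usually be read by slicing y_true with tensor operations, which needs no NumPy conversion. This is a minimal sketch under assumptions: the layout (label in channel 0, multiplier in channel 1 of the last axis) is taken from the question's comment, and weighted_squared_error and the sample values are hypothetical.

```python
import tensorflow as tf


def weighted_squared_error(y_true, y_pred):
    """Squared error scaled by a per-item multiplier packed into y_true."""
    label = y_true[..., 0]        # assumed: channel 0 holds the label
    multiplier = y_true[..., 1]   # assumed: channel 1 holds the multiplier
    return tf.reduce_mean(tf.square((y_pred - label) * multiplier), axis=-1)


# Made-up batch with one sample and two items.
y_true = tf.constant([[[1.0, 2.0], [0.0, 1.0]]])  # shape (1, 2, 2)
y_pred = tf.constant([[0.5, 1.0]])                # shape (1, 2)

loss = weighted_squared_error(y_true, y_pred)
print(loss.numpy())
```

Because it only uses tensor slicing and TensorFlow ops, a function like this should behave the same in graph mode and in eager mode.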

Error when using tf.keras.applications.resnet50.preprocess_input inside tf.data.Dataset.map

I'm having problems with the function resnet50.preprocess_input() from tensorflow.compat.v1.keras.applications.resnet50
In particular, after some trial and error, I can say the problem arises when, inside a dataset generator function, there is a call:
dataset.map(pre_processing_image)
where
def pre_processing_image(image):
image = resnet50.preprocess_input(image)
return image
and the dataset is split into batches. When I reach the last batch, no matter if it is complete or smaller, I get an error similar to
Tensor("Const:0", shape=(3,), dtype=float32) must be from the same graph as Tensor("BatchDatasetV2:0", shape=(), dtype=variant)
I can't really understand what is going on because
If I use another preprocess_input, such as the one for mobilenet, without changing anything else, then there is no problem. Digging into the code, I found that those functions all call this one, but mobilenet uses "mode='tf'" while resnet should use 'caffe'.
The error isn't related to the last batch being smaller than the others. I tried to make them all equal, but the error keeps happening at the last step of the first epoch of training.
If I don't use map but instead call pre_processing_image directly inside tf.data.Dataset.from_generator, there is no problem; only the code becomes a lot slower.
To give you the full code:
def image_gen(ds_path, ds_scores=None):
    for i, path in enumerate(ds_path):
        img = im.load_img(path,
                          color_mode='rgb',
                          target_size=(NETWORK_INFO.value[1],NETWORK_INFO.value[1]),
                          interpolation='bilinear')
        img_to_numpy = np.array(img)
        if (ds_scores is not None):
            yield img_to_numpy, ds_scores[i]
        else:
            yield img_to_numpy

def pre_processing_image(image, score=None):
    image = resnet50.preprocess_input(image)
    if score is None:
        return image
    else:
        return image, score

def generator(batchsize, train=False, val=False, test=False, shuffle=False):
    with tf.Session() as sess:
        if (train):
            dataset = tf.data.Dataset.from_generator(lambda: image_gen(train_paths, train_scores),
                                                     output_types=(tf.float32, tf.float32))
        elif(val):
            dataset = tf.data.Dataset.from_generator(lambda: image_gen(val_paths, val_scores),
                                                     output_types=(tf.float32, tf.float32))
        else:
            dataset = tf.data.Dataset.from_generator(lambda: image_gen(test_paths),
                                                     output_types=(tf.float32))
        if (shuffle):
            dataset = dataset.shuffle(buffer_size=10*batchsize)
        dataset = dataset.batch(batchsize)
        dataset = dataset.map(pre_processing_image,
                              num_parallel_calls=tf.data.experimental.AUTOTUNE)
        dataset = dataset.prefetch(buffer_size=2)
        dataset = dataset.repeat(count = -1)
        iterable = tf.data.make_initializable_iterator(dataset)
        batch = iterable.get_next()
        sess.run(iterable.initializer)
        # yield all the time it is required
        while True:
            try:
                yield sess.run(batch)
            except tf.errors.OutOfRangeError:
                pass
I tried changing the position of the map function and the shuffle/prefetch parameters, but nothing solved the issue. Finally, as you can see, I use the same function for both the training and validation generators; I just change the input parameter to select which dataset the function should use.
Solved the issue.
I searched for something similar regarding other networks that share the same image preprocessing (such as VGG16), and it turns out those related issues were Keras bugs.
I updated the keras-applications module to the latest commit (commit, not release!) and the code now works without problems.

How to create a custom layer in Keras with 'stateful' variables/tensors?

I would like to ask you some help for creating my custom layer.
What I am trying to do is actually quite simple: generating an output layer with 'stateful' variables, i.e. tensors whose value is updated at each batch.
In order to make everything more clear, here is a snippet of what I would like to do:
def call(self, inputs):
    c = self.constant
    m = self.extra_constant
    update = inputs*m + c
    X_new = self.X_old + update
    outputs = X_new
    self.X_old = X_new
    return outputs
The idea here is quite simple:
X_old is initialized to 0 in __init__(self, ...)
update is computed as a function of the inputs to the layer
the output of the layer is computed (i.e. X_new)
the value of X_old is set equal to X_new so that, at the next batch, X_old is no longer equal to zero but equal to X_new from the previous batch.
I have found out that K.update does the job, as shown in the example:
X_new = K.update(self.X_old, self.X_old + update)
The problem here is that, if I then try to define the outputs of the layer as:
outputs = X_new
return outputs
I receive the following error when I try model.fit():
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have
gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
And I keep having this error even though I imposed layer.trainable = False and I did not define any bias or weights for the layer. On the other hand, if I just do self.X_old = X_new, the value of X_old does not get updated.
Do you guys have a solution to implement this? I believe it should not be that hard, since stateful RNNs also work in a 'similar' way.
Thanks in advance for your help!
Defining a custom layer can become confusing sometimes. Some of the methods that you override are going to be called once, but you get the impression that, just like in many other OO libraries/frameworks, they are going to be called many times.
Here is what I mean: when you define a layer and use it in a model, the Python code that you write when overriding the call method is not run directly in every forward or backward pass. Instead, in graph mode it is typically run only once, when the model's graph is built. The Python code is compiled into a computational graph, and that graph, through which the tensors flow, is what does the computations during training and prediction.
That's why if you want to debug your model by putting a print statement it won't work; you need to use tf.print to add a print command to the graph.
It is the same situation with the state variable you want to have. Instead of simply assigning old + update to new you need to call a Keras function that adds that operation to the graph.
And note that tensors are immutable so you need to define the state as tf.Variable in the __init__ method.
So I believe this code is more like what you're looking for:
class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.state = tf.Variable(tf.zeros((3,3), 'float32'))
        self.constant = tf.constant([[1,1,1],[1,0,-1],[-1,0,1]], 'float32')
        self.extra_constant = tf.constant([[1,1,1],[1,0,-1],[-1,0,1]], 'float32')
        self.trainable = False

    def call(self, X):
        m = self.constant
        c = self.extra_constant
        outputs = self.state + tf.matmul(X, m) + c
        tf.keras.backend.update(self.state, tf.reduce_sum(outputs, axis=0))
        return outputs
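The same idea can also be sketched with a non-trainable weight and its assign method instead of the backend update function. This is a simplified illustration under assumptions, not the layer above: RunningSum is a hypothetical layer that accumulates the per-feature sum of its inputs across calls, and it is called eagerly here only to show the state carrying over from one batch to the next.

```python
import tensorflow as tf


class RunningSum(tf.keras.layers.Layer):
    """Hypothetical layer: adds a running per-feature sum to its inputs."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        # Non-trainable state that persists between calls.
        self.state = self.add_weight(name="state", shape=(units,),
                                     initializer="zeros", trainable=False)

    def call(self, inputs):
        # Fold the current batch into the stored state.
        new_state = self.state + tf.reduce_sum(inputs, axis=0)
        self.state.assign(new_state)   # in-place update of the variable
        return inputs + new_state


layer = RunningSum(3)
batch = tf.ones((2, 3))

out1 = layer(batch)   # state goes from 0 to 2 per feature
out2 = layer(batch)   # state goes from 2 to 4 per feature
print(out1.numpy()[0], out2.numpy()[0], layer.state.numpy())
```

Keeping the state in a variable rather than in a plain Python attribute is what lets the update be recorded as an operation instead of being lost after the first trace.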

Variable rnn/gru_cell/gates/weights already exists, disallowed

I want to implement the code in the book TensorFlow for Machine Intelligence. The code runs well the first time, but when I run it again, the error
"Variable rnn/gru_cell/gates/weights already exists, disallowed" occurs. When I restart the console the error disappears, and it occurs again after the first run or debug session. The code is below:
def prediction(self):
    output, _ = tf.nn.dynamic_rnn(tf.contrib.rnn.GRUCell(300),
                                  self.data,
                                  dtype = tf.float32,
                                  sequence_length = self.length)
    last = self._last_relevant(output, self.length)
    # softmax layer
    num_classes =int(self.target.get_shape()[1])
    weight = tf.Variable(tf.truncated_normal([self.params.rnn_hidden, num_classes], stddev = 0.01))
    bias = tf.Variable(tf.constant(0.1, shape = [num_classes]))
    prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)
    return prediction
Can anyone help me with this problem?
Code that adds things to your graph (which includes pretty much everything in the function you posted) should usually only be run once. When you want to train your model or have it make a prediction, you would use something like sess.run with a feed_dict and the ops you want output from.
If you actually want to delete your graph without restarting the console, you can use tf.reset_default_graph().
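The "already exists" error can be reproduced in isolation. The sketch below is an illustration under assumptions: it uses the tf.compat.v1 names available in TensorFlow 2 as a stand-in for the TF1 API in the question, build_weights is a hypothetical stand-in for graph-building code such as prediction(), and explicit tf.Graph objects play the role that tf.reset_default_graph() plays for the default graph.

```python
import tensorflow as tf

tf1 = tf.compat.v1


def build_weights():
    """Hypothetical stand-in for code that adds variables to the graph."""
    with tf1.variable_scope("rnn"):
        return tf1.get_variable("weights", shape=[2, 2])


graph = tf.Graph()
with graph.as_default():
    build_weights()                  # first build: variable is created
    try:
        build_weights()              # second build in the same graph
        second_build_failed = False
    except ValueError:
        second_build_failed = True   # "Variable rnn/weights already exists"

# Starting from an empty graph lets the same code run again.
fresh_graph = tf.Graph()
with fresh_graph.as_default():
    fresh_name = build_weights().name

print(second_build_failed, fresh_name)
```

Re-running a script in the same console keeps the old default graph alive, which is why the second run behaves like the second build_weights() call here.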