I have a model that I train as an autoencoder. To start with, I used binary cross-entropy as my loss function, and everything worked well. I then wanted to try a different loss function, sigmoid focal cross-entropy, but when I use it I get an "incompatible shapes" error.
Here is the relevant code; as can be seen from the printed output, both losses have the same shape. Nothing else in the code has changed:
loss0 = binary_crossentropy(y_true, y_pred, from_logits=from_logits)
loss1 = tf.reduce_mean(alpha_factor * modulating_factor * loss0, axis=-1)  # focal weighting of loss0
print('^&^&^&^&^&^&^ loss0',loss0)
print('^&^&^&^&^&^&^ loss1',loss1)
...
autoencoder.compile(optimizer=optim, loss=loss1, metrics=['accuracy','AUC'])
Output:
^&^&^&^&^&^&^ loss0 Tensor("ww_loss/Mean_2:0", shape=(None, 300, 300), dtype=float32)
^&^&^&^&^&^&^ loss1 Tensor("ww_loss/Mean_3:0", shape=(None, 300, 300), dtype=float32)
InvalidArgumentError: Incompatible shapes: [32,300,300,1] vs. [32,300,300]
[[node gradient_tape/ww_loss/mul_7/BroadcastGradientArgs
When loss0 is used, there is no error. I am a bit stymied!
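Judging from the [32,300,300,1] vs. [32,300,300] shapes in the error, one plausible cause is a rank mismatch: binary_crossentropy reduces away the trailing channel axis, so loss0 has shape (batch, 300, 300), while alpha_factor and modulating_factor, if they are built from y_true/y_pred, still carry shape (batch, 300, 300, 1), and the multiplication then broadcasts in an unintended way. A minimal sketch of one way to align the ranks, under that assumption:

    # Sketch only, assuming alpha_factor and modulating_factor have shape
    # (batch, 300, 300, 1) while loss0 has shape (batch, 300, 300).
    loss0 = binary_crossentropy(y_true, y_pred, from_logits=from_logits)
    loss0 = tf.expand_dims(loss0, axis=-1)  # restore the channel axis: (batch, 300, 300, 1)
    loss1 = tf.reduce_mean(alpha_factor * modulating_factor * loss0, axis=-1)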
I have some code that creates a CTC layer, which no longer works in TensorFlow 2.7.0 but works in 2.6.1. The code causing the problem is:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, labels, label_length, predictions):
        # Build per-sample input/label length tensors of shape (batch, 1),
        # as required by ctc_batch_cost.
        batch_len = tf.cast(tf.shape(labels)[0], dtype="int64")
        input_length = tf.cast(tf.shape(predictions)[1], dtype="int64")
        label_length = tf.cast(label_length, dtype="int64")
        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")

        loss = self.loss_fn(y_true=labels, y_pred=predictions,
                            input_length=input_length, label_length=label_length)
        self.add_loss(loss)
        return predictions
and crashes when calling the ctc_batch_cost function during model building with the following error:
ValueError: Exception encountered when calling layer "CTC_LOSS" (type CTCLayer).
Traceback:
File "<ipython-input-10-0b2cf7d5ab7d>", line 16, in call *
loss = self.loss_fn(y_true=labels, y_pred=predictions, input_length=input_length, label_length=label_length)#, logits_time_major=False)
File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 6388, in ctc_batch_cost
ctc_label_dense_to_sparse(y_true, label_length), tf.int32)
File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 6340, in ctc_label_dense_to_sparse
range_less_than, label_lengths, initializer=init, parallel_iterations=1)
ValueError: Input tensor `CTC_LOSS/Cast_5:0` enters the loop with shape (1, 1), but has shape (1, None) after one iteration. To allow the shape to vary across iterations, use the `shape_invariants` argument of tf.while_loop to specify a less-specific shape.
Call arguments received:
• labels=tf.Tensor(shape=(None, 1), dtype=int32)
• label_length=tf.Tensor(shape=(None, 1), dtype=int32)
• predictions=tf.Tensor(shape=(None, 509, 30), dtype=float32)
I suspect the problem is easy to fix and has something to do with the fact that TensorFlow no longer performs upranking as described in the 2.7.0 release notes:
The methods Model.fit(), Model.predict(), and Model.evaluate() will no longer uprank input data of shape (batch_size,) to become (batch_size, 1). This enables Model subclasses to process scalar data in their train_step()/test_step()/predict_step() methods.
Note that this change may break certain subclassed models. You can revert back to the previous behavior by adding upranking yourself in the train_step()/test_step()/predict_step() methods, e.g. if x.shape.rank == 1: x = tf.expand_dims(x, axis=-1). Functional models as well as Sequential models built with an explicit input shape are not affected.
Any ideas would be appreciated. Thanks!
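If the missing upranking is indeed the culprit, here is a minimal sketch of the workaround the release notes describe, re-applied inside the layer's call() (the placement is an assumption on my part, and note the traceback above shows labels already arriving as (None, 1), so this may not be the whole story):

    def call(self, labels, label_length, predictions):
        # Re-apply the upranking that TF < 2.7 performed automatically
        # (per the 2.7.0 release notes quoted above).
        if labels.shape.rank == 1:
            labels = tf.expand_dims(labels, axis=-1)
        if label_length.shape.rank == 1:
            label_length = tf.expand_dims(label_length, axis=-1)
        # ... rest of call() unchanged ...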
I want to solve a classification problem with a Keras model, but after running model.fit I get a dimension error. I have run the following code:
print(X_train.shape)
print(y_train.shape)
Output:
(2588, 39436)
(2588, 6)
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 6  # matches y_train.shape[1]

model = keras.Sequential(
    [
        keras.Input(shape=(39436, 1)),
        layers.Conv1D(32, kernel_size=3, strides=5, activation="relu"),
        layers.MaxPooling1D(pool_size=10),
        layers.Conv1D(64, kernel_size=3, strides=5, activation="relu"),
        layers.MaxPooling1D(pool_size=10),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
After running the following code,
model.fit(X_train, y_train, batch_size=128, epochs=15, validation_split=0.3)
I get this error:
ValueError: in user code:
ValueError: Input 0 of layer sequential_1 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: [None, 39436]
It would be appreciated if you could guide me on what the issue might be.
Your input array, as per the error message, has shape [None, 39436]. However, in your Input layer you pass a shape of [39436, 1], which corresponds to [None, 39436, 1], where None represents the samples dimension. This mismatch is what throws the error.
You need to match the shapes, either by:
1. Reshaping your input data to have a shape of [samples, 39436, 1], leaving the model architecture unchanged.
This can be done as follows (assuming train_X holds your input features):
train_X = np.expand_dims(train_X, axis=2)
np.expand_dims inserts a new axis at index 2 of the array's shape, so it reshapes [samples, 39436] to [samples, 39436, 1].
Refer: NumPy docs for expand_dims
OR
2. Changing the shape argument of the Input layer to [39436,], so as to match your data. Note that the downstream Conv1D/MaxPooling1D layers would then also need rethinking, since they expect 3-D input.
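For completeness, a minimal sketch of option 1 applied to the code in the question (assuming X_train and y_train are the arrays whose shapes are printed above):

    import numpy as np

    # Option 1: add a trailing channel axis so the data matches Input(shape=(39436, 1)).
    X_train = np.expand_dims(X_train, axis=2)   # (2588, 39436) -> (2588, 39436, 1)
    print(X_train.shape)                        # (2588, 39436, 1)

    model.fit(X_train, y_train, batch_size=128, epochs=15, validation_split=0.3)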
I have been struggling for the last hour to understand what I am doing wrong. I am a novice with neural networks, but this is not my first piece of code.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def simple_model(lr=0.1):
    X = Input(shape=(6144,))
    out = Dense(1)(X)
    model = Model(inputs=X, outputs=out)
    opt = tf.keras.optimizers.SGD(learning_rate=lr)
    model.compile(optimizer=opt, loss='mean_squared_error')
    model.summary()
    return model
mod = simple_model()
a = np.zeros(6144)
v = mod.predict(a)
Running this, I get the following error:
WARNING:tensorflow:Model was constructed with shape (None, 6144) for input Tensor("input_1:0", shape=(None, 6144), dtype=float32), but it was called on an input with incompatible shape (32, 1).
......
ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 6144 but received input with shape [32, 1]
Where does this [32, 1] come from?!
I am sure there is some silly mistake in my code, but I can't see it :(
P.S. It does compile the model and print the summary before throwing the error.
mod = simple_model()
a = np.zeros(6144)
# Add this line:
a = np.expand_dims(a, axis=0)
v = mod.predict(a)
The reason the error appears is that Keras and TensorFlow expect batched input: a 1-D array of shape (6144,) is treated as 6144 samples with a single feature each, which predict then feeds to the model in batches of the default size 32, hence the [32, 1]. By using np.expand_dims, we instead create a batch of dimension 1, i.e. a single sample of shape (1, 6144).
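For reference, two interchangeable ways to add that batch dimension (same model mod and array a as above):

    import numpy as np

    a = np.zeros(6144)
    v = mod.predict(a.reshape(1, -1))  # reshape to (1, 6144): a batch of one sample
    v = mod.predict(a[None, :])        # same effect via None/np.newaxis indexing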
When running the model.fit function, an error is thrown. The main question is: what does this error mean? The code runs on a TPU v3-8 and uses Google Cloud for data retrieval. I tried to look up the error on the web, but could not find a single case of someone else getting this error.
model.fit(
    dataset,
    steps_per_epoch=N_IMGS // BATCH_SIZE,
    epochs=EPOCHS,
)
This throws the error:
InvalidArgumentError: {{function_node __inference_train_function_528542}} Compilation failure: Depth of output must be a multiple of the number of groups: 3 vs 2
[[{{node sequential/conv2d/Conv2D}}]]
TPU compilation failed
[[tpu_compile_succeeded_assert/_15965336225898828069/_5]]
The error message is not clear to me; what exactly is going wrong? The following model is used:
def get_model():
    # Reset the session to free memory and training variables.
    tf.keras.backend.clear_session()
    with strategy.scope():
        net = efn.EfficientNetB0(include_top=False, weights='noisy-student',
                                 input_shape=(HEIGHT, WIDTH, 3))
        model = tf.keras.Sequential([
            # Map the single-channel input to 3 channels for EfficientNet.
            tf.keras.layers.Conv2D(3, (3, 3), padding='same', input_shape=(HEIGHT, WIDTH, 1)),
            net,
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dropout(0.25),
            tf.keras.layers.Dense(N_LABELS, activation='softmax', dtype='float32'),
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(),
                      loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
model = get_model()
tf.keras.utils.plot_model(model, 'model.png', show_shapes=True)
The dataset gives the following output:
for images, labels in dataset.take(1): # only take first element of dataset
print(f'images.shape: {images.shape}, images.dtype: {images.dtype}, labels.shape: {labels.shape}, labels.dtype: {labels.dtype}')
images.shape: (64, 224, 400, 1), images.dtype: <dtype: 'float32'>, labels.shape: (64,), labels.dtype: <dtype: 'int32'>
I have a variable batch size, so all of my inputs are of the form
tf.placeholder(tf.float32, shape=(None, ...))
to accept the variable batch sizes. However, how might you create a constant value with a variable batch size? The issue is with this line:
log_probs = tf.constant(0.0, dtype=tf.float32, shape=[None, 1])
It is giving me an error:
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
I'm sure it is possible to initialize a constant tensor with a variable batch size; how might I do so?
I've also tried the following:
tf.constant(0.0, dtype=tf.float32, shape=[-1, 1])
I get this error:
ValueError: Too many elements provided. Needed at most -1, but received 1
A tf.constant() has fixed size and value at graph construction time, so it probably isn't the right op for your application.
If you are trying to create a tensor with a dynamic size and the same (constant) value for every element, you can use tf.fill() and tf.shape() to create an appropriately-shaped tensor. For example, to create a tensor t that has the same shape as input and the value 0.5 everywhere:
input = tf.placeholder(tf.float32, shape=(None, ...))
# `tf.shape(input)` takes the dynamic shape of `input`.
t = tf.fill(tf.shape(input), 0.5)
As Yaroslav mentions in his comment, you may also be able to use (NumPy-style) broadcasting to avoid materializing a tensor with dynamic shape. For example, if input has shape (None, 32) and t has shape (1, 32) then computing tf.mul(input, t) will broadcast t on the first dimension to match the shape of input.
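A minimal sketch of that broadcasting approach (TF1-style, matching the placeholders in the question; tf.multiply is the current name of the op referred to above as tf.mul):

    input = tf.placeholder(tf.float32, shape=(None, 32))
    t = tf.constant(0.5, dtype=tf.float32, shape=(1, 32))
    # t's leading dimension of 1 broadcasts against input's dynamic batch
    # dimension, so no (None, 32)-shaped constant is ever materialized.
    result = tf.multiply(input, t)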
Suppose you want to do something with log_probs: for example, a power operation between a tensor v and a constant log_probs, where the shape of log_probs should vary with the shape of v.
v = tf.placeholder(tf.float32, shape=(None, 1))
log_probs = tf.constant(0.0, dtype=tf.float32, shape=[None, 1])  # fails: shape cannot contain None
result = tf.pow(v, log_probs)
However, you cannot construct the constant log_probs this way. Instead, you can construct the constant with shape=[1], i.e. log_prob = tf.constant(0.0, dtype=tf.float32, shape=[1]), and then use tf.map_fn() to apply the pow operation to each element of v:
v = tf.placeholder(tf.float32, shape=(None, 1))
log_prob = tf.constant(0.0, dtype=tf.float32, shape=[1])
result = tf.map_fn(lambda ele: tf.pow(ele, log_prob), v)