The code below is taken from the TensorFlow in Practice course by deeplearning.ai on Coursera (computer vision example, week 2).
import tensorflow as tf
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
training_images = training_images / 255.0
test_images = test_images / 255.0
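model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
# The model must be compiled before fit(); the optimizer, loss and metric below are assumed.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])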
print("Executing Training:")
model.fit(training_images, training_labels, epochs=5)
print("Executing inference:")
model.evaluate(test_images, test_labels)
The question: How does TensorFlow deduce the shape of the input layer? Which shape is being flattened here? I would expect the input shape to be derived from the shape of the input data. Am I missing something here?

When you call model.fit(training_images, training_labels), the Keras API passes training_images to the first layer, and the model is built from the shape of that data on the first call.
training_images is a NumPy array of shape (m, 28, 28):
m - total number of training images (60000 here),
28x28 - image size.
There is no channel axis, because the images are grayscale.
tf.keras.layers.Flatten() reshapes each 28x28 image into a vector of shape (784,).
See the source code of the Flatten layer: https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/keras/layers/core.py#L598-L684
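To make the shape handling concrete, here is a minimal NumPy sketch of what the flattening step does to a batch. It is not the Keras implementation: the helper flatten_batch and the random array batch are made up for illustration, and the shapes assume 28x28 images without a channel axis.

```python
import numpy as np

def flatten_batch(batch):
    """Collapse every axis except the first (batch) axis, as a Flatten layer does."""
    return batch.reshape(batch.shape[0], -1)

# Stand-in for a mini-batch of 32 grayscale 28x28 images (random values, not real data).
batch = np.random.rand(32, 28, 28)

flat = flatten_batch(batch)
print(batch.shape, "->", flat.shape)  # (32, 28, 28) -> (32, 784)
```

The same helper also handles images that do carry a channel axis, since 28 * 28 * 1 is still 784.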


Why are the weights of my QAT tf_model floats and not 8-bit integers?

I performed a simple Quantization Aware Training with Tensorflow on MNIST as follows:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0
# Define the model architecture.
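model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3)),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # The snippet is cut off here; a Flatten + Dense(10) head is assumed.
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])
# Train the digit classification model
# (the compile and fit arguments are assumed; the calls are missing from the snippet)
model.compile(optimizer='adam', loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=1, validation_split=0.1)
import tensorflow_model_optimization as tfmot
quantize_model = tfmot.quantization.keras.quantize_model
# q_aware stands for quantization aware.
q_aware_model = quantize_model(model)
# `quantize_model` requires a recompile.
q_aware_model.compile(optimizer='adam', loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])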
train_images_subset = train_images[0:1000] # out of 60000
train_labels_subset = train_labels[0:1000]
q_aware_model.fit(train_images_subset, train_labels_subset,
batch_size=500, epochs=5, validation_split=0.1)
However, when I inspect the weights of the quantized model, for instance with q_aware_model.get_weights()[5], I get an array of type float32. I expected 8-bit integers; what am I doing wrong?
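A likely explanation is that quantization-aware training only simulates 8-bit arithmetic. During training the weights are rounded to an 8-bit grid and immediately mapped back to floats ("fake quantization"), so the stored tensors stay float32. Integer weights normally appear only after the trained model is converted, for example with the TensorFlow Lite converter. The NumPy sketch below illustrates the idea with a simplified min/max scheme; the function fake_quantize is hypothetical and is not how tfmot implements it.

```python
import numpy as np

def fake_quantize(w, num_bits=8):
    """Round w onto a 2**num_bits-level grid, then map it back to float32.

    Returns the simulated (float32) weights and the underlying integer codes.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin)
    codes = np.clip(np.round((w - w_min) / scale), qmin, qmax)
    simulated = (codes * scale + w_min).astype(np.float32)
    return simulated, codes.astype(np.uint8)

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3)).astype(np.float32)

simulated, codes = fake_quantize(weights)
print(simulated.dtype)  # float32: what training keeps and get_weights() would show
print(codes.dtype)      # uint8: what a converted model would store
```

The float array takes at most 256 distinct values, which is why the trained model already behaves like an 8-bit one even though its dtype has not changed.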

'flatten_input' required when building an ANN

Hello, I need to build an ANN using binary_alpha_digits from TensorFlow Datasets, but I am unable to pass in the training data: the model requires 'flatten_input', while I am passing in a dictionary with the keys ['image', 'label']. How do I solve this problem? I'd appreciate any help, thanks.
from matplotlib import pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_datasets as tfds
train_ds, test_ds = tfds.load('BinaryAlphaDigits',
                              split=['train[:60%]', 'train[60%:]'])
model = tf.keras.Sequential()
model.add(layers.Flatten(input_shape=(28, 28)))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer=tf.optimizers.Adam(),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
epochs = 10
model.fit(train_ds, epochs=epochs)
Two things need fixing. First, because you feed images into the model, the input shape must be given as (height, width, channels), which describes the image dimensions and colour mode. Second, you should preprocess the dataset before fitting the model on it.
Also note that the number of units in the output layer is not set correctly for multi-class classification on this dataset, which has more than 10 labels. It contains 36 classes (the digits 0-9 and the letters A-Z, with 39 examples of each), so the last layer should have 36 units.
Below is code that works, with a preprocessing function for images and labels. Note that the images in this dataset have shape (20, 16, 1), so you can either resize them to (28, 28, 1) or feed them to the model at their original size.
After preprocessing, the images are grouped into batches (mini-batches), and the training set is shuffled to reduce variance on the test set; that is what the operations below do.
from matplotlib import pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_datasets as tfds
train_ds, test_ds = tfds.load('BinaryAlphaDigits', split=['train[:60%]', 'train[60%:]'])
def preprocess(data):
    # Each example is a dict; return an (image, label) pair that Keras can consume
    image = data['image']
    image = tf.image.resize(image, (28, 28))
    label = data['label']
    return image, label
train_ds = train_ds.map(preprocess)
train_ds = train_ds.shuffle(1024)
train_ds = train_ds.batch(batch_size=32)
test_ds = test_ds.map(preprocess)
test_ds = test_ds.batch(batch_size=32)
model = tf.keras.Sequential()
model.add(layers.Flatten(input_shape=(28, 28, 1)))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(36, activation=tf.nn.softmax))
model.compile(optimizer=tf.optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
epochs = 10
model.fit(train_ds, epochs=epochs)
tfds.load by default gives a dictionary with image and label as the keys.
train_ds, test_ds = tfds.load('BinaryAlphaDigits',
split=['train[:60%]', 'train[60%:]'])
train_ds = train_ds.shuffle(1024).batch(4)
for x in train_ds.take(1):
    print(type(x))
    print(x['image'].shape, x['label'])
<class 'dict'>
(4, 20, 16, 1) tf.Tensor([ 6 32 6 12], shape=(4,), dtype=int64)
There is a setting called as_supervised that returns each example as an (image, label) tuple instead; see the tfds.load documentation.
If you use that setting together with the proper input and output sizes, your model works:
train_ds, test_ds = tfds.load('BinaryAlphaDigits',
split=['train[:60%]', 'train[60%:]'],as_supervised=True)
train_ds = train_ds.shuffle(1024).batch(4)
for x in train_ds.take(1):
    print(type(x))
    print(x[0].shape, x[1])
<class 'tuple'>
(4, 20, 16, 1) tf.Tensor([13 13 22 31], shape=(4,), dtype=int64)
model = tf.keras.Sequential()
model.add(layers.Flatten(input_shape=(20, 16,1)))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(10, activation=tf.nn.relu))
model.add(layers.Dense(36, activation=tf.nn.softmax))
model.compile(optimizer=tf.optimizers.Adam(),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
epochs = 10
model.fit(train_ds, epochs=epochs)
Epoch 1/10
211/211 [==============================] - 1s 3ms/step - loss: 3.5428 - accuracy: 0.0629
Epoch 2/10
211/211 [==============================] - 0s 2ms/step - loss: 3.2828 - accuracy: 0.1105

Tensorflow Keras Shape mismatch

While trying to implement a standard MNIST digit recognizer that many tutorials use to introduce you to neural networks, I'm encountering the error
ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (28, 10)).
I would like to use from_tensor_slices to process the data, since I want to apply the code to another problem where the data comes from a CSV file. Anyway, here is the code producing the error in the line model.fit(...)
import tensorflow as tf
train_dataset, test_dataset = tf.keras.datasets.mnist.load_data()
train_images, train_labels = train_dataset
train_images = train_images/255.0
train_dataset_tensor = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
num_of_validation_data = 10000
validation_data = train_dataset_tensor.take(num_of_validation_data)
train_data = train_dataset_tensor.skip(num_of_validation_data)
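model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, batch_size=50, epochs=5)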
performance = model.evaluate(validation_data)
I don't understand where the shape (28, 10) of the logits comes from. I thought I was flattening the image, essentially making a 1D vector out of the 2D image. How can I prevent the error?
You can use the following code, which batches the dataset before it is passed to the model:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]
train_ds = tf.data.Dataset.from_tensor_slices(
(x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
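model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(100, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_ds, epochs=5)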
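The (28, 10) in the error message can be reproduced without TensorFlow. The sketch below assumes the cause is that the dataset in the question is never batched, so each element is a single 28x28 image and its first axis is read as the batch axis. The helpers flatten and forward are made up for illustration; they mimic a Flatten layer followed by two linear layers (biases and activations omitted) whose weights are sized from the incoming data.

```python
import numpy as np

def flatten(x):
    # Keep the first axis (assumed to be the batch axis) and merge the rest.
    return x.reshape(x.shape[0], -1)

def forward(x, units=(100, 10), seed=0):
    """Flatten, then apply one weight matrix per layer, sized from the input width."""
    rng = np.random.default_rng(seed)
    h = flatten(x)
    for n in units:
        w = rng.normal(size=(h.shape[-1], n))
        h = h @ w
    return h

single_image = np.zeros((28, 28))        # what an unbatched dataset yields
batched_images = np.zeros((32, 28, 28))  # what dataset.batch(32) yields

print(forward(single_image).shape)    # (28, 10): each image row is mistaken for a sample
print(forward(batched_images).shape)  # (32, 10): one prediction per image
```

With batching in place the first axis really is the batch, so the logits and the labels agree in their leading dimension.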

Simple machine learning example with handwritten digits does not work with Conv2D and MaxPooling2D

I built a simple AI model with TensorFlow 2 using this code, and everything works fine.
# Import TensorFlow
import tensorflow as tf
# Import the matplotlib library
import matplotlib.pyplot as plt
# Import NumPy
import numpy as np
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
# The model must be compiled before fit(); these arguments are assumed.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
plt.imshow(x_train[6], cmap="gray")  # Draw the image
plt.show()  # Show the plot
predictions = model.predict(x_train)  # Make predictions
print("Prediction: ", np.argmax(predictions[6]))  # Print the predicted digit
print("Correct is: ", y_train[6])
My problem is how to add detection layers like Conv2D and MaxPooling2D. Where do I have to add these layers, and does this influence my plotting and my predictions?
Before input is passed to Conv2D and MaxPooling2D, it must have 4 dimensions.
x_train and x_test have shape [BatchSize, 28, 28], but it should be [BatchSize, 28, 28, 1].
So we add a channel dimension as the last axis using np.expand_dims():
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
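model = tf.keras.models.Sequential([
    # input_shape describes one example, so it excludes the batch dimension
    tf.keras.layers.Conv2D(32, (3, 3), padding="same", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Flatten the feature maps so the Dense layers give one prediction per image
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])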
Yes, it is going to influence your plotting and predictions.
A convolution layer uses fewer weights than a dense layer, and max pooling then keeps only the maximum value in each window. This reduces the number of features, which may lower your accuracy.
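As a rough illustration of "keeping only the maximum value in each window", here is a small NumPy sketch of non-overlapping 2x2 max pooling on a single 2-D feature map. The function max_pool_2x2 is a hypothetical helper, not the Keras layer, and it assumes an even height and width.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on a 2-D array with even height and width."""
    h, w = feature_map.shape
    # Split the map into (h/2, w/2) blocks of size 2x2, then take each block's maximum.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[ 5  7]
#  [13 15]]
```

Sixteen values go in and four come out, so the layers after the pooling step see a quarter of the features.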
However, when the images are large, for example 500*500, we have to apply convolution and max-pooling layers to reduce the number of features by keeping only the important ones.
If we apply Flatten and Dense directly to a 500*500 input, the program has to initialize a very large number of weights, and you can get an out-of-memory error.
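The difference in weight counts is easy to check by hand. The sketch below uses the standard parameter formulas for a dense layer and a 2-D convolution; the helper names and the single grayscale channel are assumptions made for illustration.

```python
def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """One kernel per filter, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

# Flatten + Dense(128) on a 500x500 grayscale image
print(dense_params(500 * 500, 128))  # 32000128
# Conv2D(32, (3, 3)) on the same image
print(conv2d_params(3, 3, 1, 32))    # 320
# Flatten + Dense(128) on a 28x28 MNIST image, for comparison
print(dense_params(28 * 28, 128))    # 100480
```

The convolution's weight count does not depend on the image size at all, which is why it scales to large inputs where a dense layer does not.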

keras.model.predict raises ValueError: Error when checking input

I trained a basic neural network model on the MNIST dataset. Here's the training code: (imports omitted)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data(path='mnist.npz')
x_train, x_test = x_train/255.0, x_test/255.0
#1st Define the model
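model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),  # input layer
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  # main computation layer
    tf.keras.layers.Dropout(0.2),  # dropout layer to reduce overfitting
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)  # output layer; softmax turns scores into class probabilities
])
#2nd Compile the model (arguments assumed; the call is missing from the snippet)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
#3rd Fit the model
model.fit(x_train, y_train, epochs=5)
#4th Save the model (path taken from the prediction script)
model.save('models/mnistCNN.h5')
#5th Evaluate the model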
model.evaluate(x_test, y_test)
I wanted to see how this model works with my own inputs, so I wrote a prediction script with help from this post. My prediction code is: (imports omitted)
model = load_model('models/mnistCNN.h5')
for i in range(3):
    img = Image.open(str(i+1) + '.png').convert("L")
    img = img.resize((28,28))
    im2arr = np.array(img)
    im2arr = im2arr/255
    im2arr = im2arr.reshape(1, 28, 28, 1)
    y_pred = model.predict(im2arr)
    print('For Image',i+1,'Prediction = ',y_pred)
First, I don't understand the purpose of this line:
im2arr = im2arr.reshape(1, 28, 28, 1)
If someone could shed light on why this line is necessary, that would be a great help.
Second, this very line throws the following error:
ValueError: Error when checking input: expected flatten_input to have 3 dimensions, but got array with shape (1, 28, 28, 1)
What am I missing here?
The first dimension is the batch size. Keras adds it to the model's input shape internally, so the data you pass to predict must include it. This line adds it to the image array, together with a trailing channel dimension:
im2arr = im2arr.reshape(1, 28, 28, 1)
The error occurs because a single example from the MNIST dataset you trained on has shape (28, 28), and so does your input layer. The model therefore expects 3-dimensional input of shape (batch, 28, 28). To get rid of the error, change this line to
im2arr = im2arr.reshape((1, 28, 28))
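The dimension counts can be checked with NumPy alone. In this sketch the zero array stands in for one resized, scaled image, and expected_ndim is simply the number from the error message; neither comes from the original script.

```python
import numpy as np

# Stand-in for one grayscale image after resize((28, 28)) and scaling.
im2arr = np.zeros((28, 28))

expected_ndim = 3  # batch axis + the (28, 28) given to Flatten(input_shape=(28, 28))

with_channel = im2arr.reshape(1, 28, 28, 1)  # the original line: 4 dimensions
batch_only = im2arr.reshape(1, 28, 28)       # the corrected line: 3 dimensions
also_batch_only = im2arr[np.newaxis, ...]    # equivalent way to add the batch axis

print(with_channel.ndim, batch_only.ndim, also_batch_only.shape)  # 4 3 (1, 28, 28)
```

The trailing 1 would only be needed if the model had been trained on inputs of shape (28, 28, 1), as a Conv2D model would be.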