Is it possible to print the output of a specific layer of a tensorflow model?

I constructed this dataset for binary classification that contains digit 0 vs. digit 6.
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
import numpy as np
mnist = datasets.load_digits()
# generate the indices
idx_digit = np.argwhere((mnist.target == 0) | (mnist.target == 6)).flatten()
X_train, X_test, y_train, y_test = train_test_split(
    mnist.data[idx_digit].reshape((-1, 8, 8, 1)),
    mnist.target[idx_digit], test_size=0.33, random_state=42)
# relabel digit 6 as class 1
y_train[y_train == 6] = 1
y_test[y_test == 6] = 1
I built a convolutional neural network with Keras.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=1, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
I compiled and trained the model and it works well.
model.compile(loss=tf.keras.losses.binary_crossentropy,
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, verbose=1)
I'd just like to know if it is possible to print the output of a specific layer, e.g.
model.layers[0].output
In other words, how do I get the output of the convolutional layer for a given input x?

To get the output of an intermediate layer after training the model, you can build a second model that returns the convolutional layer's output alongside the last layer's output:
feature_extractor = tf.keras.Model(
    inputs=model.inputs,
    outputs=[
        model.output,            # last layer output
        model.layers[0].output,  # your convolution layer output
    ]
)
x = tf.ones((1, 8, 8, 1))
y, conv_y = feature_extractor(x)
y.shape, conv_y.shape
(TensorShape([1, 1]), TensorShape([1, 6, 6, 1]))
Also, if we want the outputs of all layers, we can do
feature_extractor = tf.keras.Model(
    inputs=model.inputs,
    outputs=[layer.output for layer in model.layers],
)
features = feature_extractor(x); print(len(features))
4
for i in range(len(features)):
    print(features[i].shape)
(1, 6, 6, 1)  # first layer output (conv layer)
(1, 3, 3, 1)  # second layer output (pooling)
(1, 9)        # third layer output (flatten)
(1, 1)        # fourth layer output (last layer)
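Once the model has been built (e.g., by the fit() call above), you can also select the layer by index or name with model.get_layer() and run real samples through the extractor. A minimal sketch, assuming the variables from the question:
conv_extractor = tf.keras.Model(
    inputs=model.inputs,
    outputs=model.get_layer(index=0).output  # same layer as model.layers[0]
)
conv_y = conv_extractor(X_test[:1].astype("float32"))
print(conv_y.shape)  # expected: (1, 6, 6, 1)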

Related

Why does a model based on Dense layers give better results than one based on Conv2D?

In TensorFlow, training a model based on Dense layers gives better results than training a model based on equivalent Conv2D layers.
Results:
Using Dense: loss: 16.1930 - mae: 2.5369 - mse: 16.1930
Using Conv2D: loss: 83.7851 - mae: 6.5585 - mse: 83.7851
Should this be expected or are we doing something wrong?
The code we are using is the following (adapted from here):
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import pandas as pd
import sys
model_type = int(sys.argv[1]) # 0: Dense, Else: Conv2D
verbose = 0
# load data & normalize
(train_features, train_labels), (test_features, test_labels) = keras.datasets.boston_housing.load_data()
train_mean = np.mean(train_features, axis=0)
train_std = np.std(train_features, axis=0)
train_features_norm = (train_features - train_mean) / train_std
test_features_norm = (test_features - train_mean) / train_std
train_labels_norm = train_labels
test_labels_norm = test_labels
input_height = train_features_norm.shape[1]
# model
if model_type == 0:
    model = keras.Sequential([
        layers.InputLayer(input_shape=(input_height,)),
        layers.Dense(20, activation='relu'),
        layers.Dense(1)])
else:
    train_features_norm = np.reshape(train_features_norm, (-1, input_height, 1))
    test_features_norm = np.reshape(test_features_norm, (-1, input_height, 1))
    model = keras.Sequential([
        layers.InputLayer(input_shape=(input_height, 1, 1)),
        layers.Conv2D(20, (input_height, 1), activation='relu'),
        layers.Conv2D(1, (1, 1))])  # replacing this layer with Dense(1) gives the same results
model.compile(
    optimizer=tf.optimizers.Adam(),
    loss='mse',
    metrics=['mae', 'mse'])
model.summary()
# training
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=50)
history = model.fit(
    train_features_norm,
    train_labels_norm,
    epochs=1000,
    verbose=verbose,
    validation_split=0.1)
# results
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
print(hist)
rmse_final = np.sqrt(float(hist['val_mse'].tail(1)))
print('Final Root Mean Square Error on validation set: {}'.format(round(rmse_final, 3)))
# compare how the model performs on the test dataset
mse, _, _ = model.evaluate(test_features_norm, test_labels_norm)
rmse = np.sqrt(mse)
print('Root Mean Square Error on test set: {}'.format(round(rmse, 3)))
NOTE: model_type can be used to select a model based on Dense layers (= 0), or a model based on Conv2D (any other value).
Background
We have a system (BeagleBone AI using TIDL) which doesn't support Dense layers. It does, however, support Conv2D layers and, as far as we know, a Conv2D can be configured to be equivalent to a Dense layer.
For example, in a Dense layer with two units/outputs, no bias, and two inputs, the output is:
O1 = W11 * I1 + W12 * I2
O2 = W21 * I1 + W22 * I2
O - output, I - input, W - weight
Similarly, in a Conv2D layer with two 1x1 output channels, no bias, one 1x2 input channel, and a 1x2 kernel, the output is:
O1 = K11 * I11 + K12 * I12
O2 = K21 * I11 + K22 * I12
O - output channel, I - input channel, K - kernel weights
This means that mathematically they are equivalent. But training works better when the Dense layer is used.
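To sanity-check this equivalence numerically, here is a minimal standalone sketch (variable names are illustrative, not from the original code): it copies the weights of a two-unit Dense layer into a 1x2 Conv2D kernel and compares the outputs.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

x = np.random.rand(1, 2).astype('float32')  # two inputs
dense = layers.Dense(2, use_bias=False)
dense.build((None, 2))  # Dense kernel shape: (2, 2)

conv = layers.Conv2D(2, (1, 2), use_bias=False)
conv.build((None, 1, 2, 1))  # Conv2D kernel shape: (1, 2, 1, 2)
# copy the Dense weights into the Conv2D kernel
conv.set_weights([dense.get_weights()[0].reshape(1, 2, 1, 2)])

out_dense = dense(x)                    # shape (1, 2)
out_conv = conv(x.reshape(1, 1, 2, 1))  # shape (1, 1, 1, 2)
print(np.allclose(out_dense, tf.reshape(out_conv, (1, 2))))  # True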
I got it! You have to reshape the output tensor so that it has only two dimensions, (batch_size, 1).
I get this test data evaluation: loss: 17.9552 - mae: 2.7125 - mse: 17.9552
It is slightly higher than your results with Dense layers, but seems comparable at least.
Here is my model:
filters = 20
model = keras.Sequential([
    layers.InputLayer(input_shape=(input_height,)),
    # first conv layer
    layers.Reshape((input_height, 1, 1)),
    layers.Conv2D(filters, (input_height, 1), data_format='channels_last', padding='valid'),
    layers.Activation('relu'),
    # second conv layer
    layers.Reshape((filters, 1, 1)),
    layers.Conv2D(1, (filters, 1)),
    # reshape the final result !!!
    layers.Reshape((1,)),
])
There are two issues here:
1. The shape of the features (None, input_height, 1) doesn't match the shape of the model's input (None, input_height, 1, 1).
2. The shape of the labels (None, 1) doesn't match the shape of the model's output (None, 1, 1, 1).
Each of these has an impact on the performance of the model, and both fixes are needed to reach the performance level of the model based on Dense layers.
Fix (add an extra dim to the features, reshape the labels):
if model_type == 0:
    ...
else:
    train_features_norm = np.reshape(train_features_norm, (-1, input_height, 1, 1))
    test_features_norm = np.reshape(test_features_norm, (-1, input_height, 1, 1))
    train_labels_norm = np.reshape(train_labels_norm, (-1, 1, 1, 1))
    test_labels_norm = np.reshape(test_labels_norm, (-1, 1, 1, 1))
    ...
Should this be expected or are we doing something wrong?
No, this is not expected. I am not sure the original code can be considered wrong, though. My expectation (especially since Keras didn't complain about mismatched shapes, as it usually does) was that because the "missing" dimensions were of size 1, they didn't really matter. Well, they do.
Thank you @elbe. Your answer was key for me to realize the issues above.

Why can't this model overfit one example?

I am practicing Conv1D on TensorFlow 2.7, and I am testing a decoder I developed by checking whether it can overfit a single example. The model doesn't learn when trained on only one example and can't overfit it. I want to understand this strange behavior, please. This is the link to the notebook on Colab: Notebook.
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, Dense, BatchNormalization
from tensorflow.keras.layers import ReLU, MaxPool1D, GlobalMaxPool1D
from tensorflow.keras import Model
import numpy as np
def Decoder():
    inputs = Input(shape=(68, 3), name='Input_Tensor')
    # First hidden layer
    conv1 = Conv1D(filters=64, kernel_size=1, name='Conv1D_1')(inputs)
    bn1 = BatchNormalization(name='BN_1')(conv1)
    relu1 = ReLU(name='ReLU_1')(bn1)
    # Second hidden layer
    conv2 = Conv1D(filters=64, kernel_size=1, name='Conv1D_2')(relu1)
    bn2 = BatchNormalization(name='BN_2')(conv2)
    relu2 = ReLU(name='ReLU_2')(bn2)
    # Third hidden layer
    conv3 = Conv1D(filters=64, kernel_size=1, name='Conv1D_3')(relu2)
    bn3 = BatchNormalization(name='BN_3')(conv3)
    relu3 = ReLU(name='ReLU_3')(bn3)
    # Fourth hidden layer
    conv4 = Conv1D(filters=128, kernel_size=1, name='Conv1D_4')(relu3)
    bn4 = BatchNormalization(name='BN_4')(conv4)
    relu4 = ReLU(name='ReLU_4')(bn4)
    # Fifth hidden layer
    conv5 = Conv1D(filters=1024, kernel_size=1, name='Conv1D_5')(relu4)
    bn5 = BatchNormalization(name='BN_5')(conv5)
    relu5 = ReLU(name='ReLU_5')(bn5)
    global_features = GlobalMaxPool1D(name='GlobalMaxPool1D')(relu5)
    global_features = tf.keras.layers.Reshape((1, -1))(global_features)
    conv6 = Conv1D(filters=12, kernel_size=1, name='Conv1D_6')(global_features)
    bn6 = BatchNormalization(name='BN_6')(conv6)
    outputs = ReLU(name='ReLU_6')(bn6)
    model = Model(inputs=[inputs], outputs=[outputs], name='Decoder')
    return model
model = Decoder()
model.summary()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
losses = tf.keras.losses.MeanSquaredError()
model.compile(optimizer=optimizer, loss=losses)
n = 1
X = np.random.rand(n, 68, 3)
y = np.random.rand(n, 1, 12)
model.fit(x=X,y=y, verbose=1, epochs=30)
I think the problem here is that you have too little data to learn from, so you can't overfit. In every epoch just one example is used to adapt the weights of the network, so there is not enough opportunity to adapt the weights for overfitting.
To get the overfitting result you want, put the same data multiple times inside your training dataset, so the weights can change enough to overfit, because they only move one small step per epoch.
A deeper look into backpropagation might help you get a better understanding of the concept: Click
I took the liberty of adapting your notebook and enhanced the dataset as follows:
n = 1
X = np.random.rand(n, 68, 3)
y = np.random.rand(n, 1, 12)
# doubling the arrays ten times yields 2**10 = 1024 copies of the same example
for i in range(0, 10):
    X = np.append(X, X, axis=0)
    y = np.append(y, y, axis=0)
And the output would be: [training output screenshot omitted]

Tensorflow input shape incompatible with layer

I'm trying to build a Sequential model with tensorflow.
import tensorflow as tf
import keras
from tensorflow.keras import layers
from keras import optimizers
import numpy as np
model = keras.Sequential(name="model")
model.add(keras.Input(shape=(786,)))
model.add(layers.Dense(2048, activation="relu", name="layer1"))
model.add(layers.Dense(786, activation="relu", name="layer2"))
model.add(layers.Dense(786, activation="relu", name="layer3"))
output = model.add(layers.Dense(786, activation="relu", name="output"))
model.summary()
model.compile(
    optimizer=tf.optimizers.Adam(),  # Optimizer
    loss=keras.losses.CategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
history = model.fit(
    x_train,
    y_train,
    batch_size=1,
    epochs=5,
)
The input is a vector of length 768 (so the input shape is (768,), right?), representing a chess board:
def get_dataset():
    container = np.load('/content/drive/MyDrive/test_data_vector.npz')
    b, v = container['arr_0'], container['arr_1']
    v = np.asarray(v / abs(v).max() / 2 + 0.5, dtype=np.float32)  # normalization (0 - 1)
    return b, v
xtrain, ytrain = get_dataset()
print(xtrain.shape)
print(ytrain.shape)
>> (37, 786) #there are 37 samples
>> (37, 786)
But I always get the error:
ValueError: Input 0 of layer model is incompatible with the layer: expected axis -1 of input shape to have value 786 but received input with shape (1, 1, 768)
I tried with np.expand_dims(), which ended in the same Error.
The error is just a typo: as the user mentioned, the issue is resolved by changing the shape from 786 to 768 in the model definition.
One suggestion based on the model structure:
The number of units is not related to your input shape; you don't have to match that number.
Unit counts like 2048 and 786 in a Dense layer are quite large, and this may not help the model learn better.
Try smaller numbers like 32 or 64; you can refer to some of the examples in the TensorFlow documentation.
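For illustration, a minimal sketch that applies both the typo fix (768 instead of 786) and smaller hidden layers; the unit counts here are only suggestions, not from the original post:
model = keras.Sequential(name="model")
model.add(keras.Input(shape=(768,)))
model.add(layers.Dense(64, activation="relu", name="layer1"))
model.add(layers.Dense(64, activation="relu", name="layer2"))
model.add(layers.Dense(768, name="output"))  # output width kept from the question
model.summary()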

Dimensions must be equal, but are 2 and 3 for node binary_crossentropy/mul

I was checking the code I found here, specifically the example at Multivariate Multi-Step LSTM Models -> Multiple Input Multi-Step Output.
I altered the code and used binary_crossentropy and sigmoid activation for the last layer.
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out - 1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 3
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
n_features = X.shape[2]
# define model
model = Sequential()
model.add((LSTM(5, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features))))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# fit model
model.fit(X, y, epochs=20, verbose=0, batch_size=1)
The above code runs fine. But when I try to change n_steps_in, n_steps_out and use, for example, n_steps_in, n_steps_out = 3, 2, it gives:
ValueError: Dimensions must be equal, but are 2 and 3 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [1,2], [1,3].
Why does this error come up, and how can I overcome it?
This is because your network is built to output 3D sequences of shape (None, 3, 1), while your targets have shape (None, 2, 1).
The best and most automatic way to handle this situation correctly is to build an encoder-decoder structure. Below is an example:
from keras.layers import RepeatVector  # needed for the decoder below

model = Sequential()
model.add(LSTM(5, activation='relu', return_sequences=False,
               input_shape=(n_steps_in, n_features)))  # ENCODER
model.add(RepeatVector(n_steps_out))
model.add(LSTM(5, activation='relu', return_sequences=True))  # DECODER
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=20, batch_size=1)
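As a quick check (a sketch, run after building the model above), the decoder now emits exactly n_steps_out steps, so the output shape matches the targets:
print(model.output_shape)  # (None, 2, 1) for n_steps_in, n_steps_out = 3, 2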

Keras fit_generator with images from directory and a constant tensor

I have a simple CNN with input images of shape (5,5,3). As a first step I want to add a constant tensor to the input.
According to the answer to my previous SO question, I have to define the constant tensor as an input layer (const_input), so that I can Add() it to the image data (raw_input). The model compiles without errors:
from __future__ import print_function
import tensorflow as tf
import numpy as np
import keras
from keras import backend as K
from keras.models import Model
from keras.layers import Input, Add
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
# Python 2.7.10 - keras version 2.2.0 - tf.VERSION '1.8.0'
cnn_layer1 = Conv2D(32, (4, 4), activation='relu')
cnn_layer2 = MaxPooling2D(pool_size=(2, 2))
cnn_layer3 = Dense(64, activation='relu')
cnn_layer4 = Dropout(0.1)
cnn_output = Dense(2, activation='softmax')
raw_input = Input(shape=(5, 5, 3))
const_input = Input(shape=(5, 5, 3))
pre_proc = Add()([raw_input, const_input])
lay1 = cnn_layer1(pre_proc)
lay2 = cnn_layer2(lay1)
lay3 = Flatten()(lay2)
lay4 = cnn_layer3(lay3)
lay5 = cnn_layer4(lay4)
lay_out = cnn_output(lay5)
model = Model(inputs=[raw_input, const_input], outputs=lay_out)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Now I try to provide the constant tensor as an input along with the images that are read from directory:
batch_size = 10
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'cd_data/train',
    target_size=(5, 5),
    classes=['cat', 'dog'],
    batch_size=batch_size)
validation_generator = validation_datagen.flow_from_directory(
    'cd_data/validation',
    target_size=(5, 5),
    classes=['cat', 'dog'],
    batch_size=batch_size)
const_array = np.array(
    [[[5.0,0.0,0.0],[0.0,0.0,-3.0],[-10.0,0.0,0.0],[0.0,0.0,4.0],[-20.0,0.0,0.0]],
     [[-15.0,0.0,12.0],[0.0,4.0,0.0],[-3.0,0.0,10.0],[-18.0,0.0,0.0],[20.0,0.0,-6.0]],
     [[0.0,0.0,6.0],[0.0,-2.0,-6.0],[0.0,0.0,2.0],[0.0,0.0,-9.0],[7.0,-6.0,0.0]],
     [[-3.0,4.0,0.0],[11.0,-12.0,0.0],[0.0,0.0,0.0],[0.0,0.0,7.0],[0.0,0.0,2.0]],
     [[0.0,0.0,0.0],[0.0,1.0,-2.0],[4.0,0.0,3.0],[0.0,0.0,0.0],[0.0,0.0,0.0]]])
def merge_generator():
    while True:
        next_image = train_generator.next()
        yield [next_image[0], const_array], next_image[1]

train_gen_with_const = merge_generator()
Executing fit_generator leads to the error below:
model.fit_generator(
    train_gen_with_const,
    steps_per_epoch=2,
    epochs=1,
    verbose=2,  # one line per epoch
    validation_data=validation_generator,
    validation_steps=2)
ValueError: Error when checking input: expected input_2 to have 4 dimensions, but got array with shape (5, 5, 3)
I tried to provide the missing dimension like this:
const_batch = np.broadcast_to(const_array, (batch_size, 5, 5, 3))
def merge_generator():
    while True:
        next_image = train_generator.next()
        yield [next_image[0], const_batch], next_image[1]
but this leads to
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(2, 5, 5, 3), (10, 5, 5, 3)]
What is the right way to provide this constant tensor input?
Any help is highly appreciated!
The problem lies with your validation_data= argument: your model expects two input arrays, whereas validation_generator supplies only one. You fixed this with train_gen_with_const - just extend the same approach to validation:
def merge_generator():  # const_batch built inside the function to match each batch
    while True:
        next_image = train_generator.next()
        const_batch = np.broadcast_to(const_array, (len(next_image[0]), 5, 5, 3))
        yield [next_image[0], const_batch], next_image[1]

def val_merge_generator():
    while True:
        next_image = validation_generator.next()
        const_batch = np.broadcast_to(const_array, (len(next_image[0]), 5, 5, 3))
        yield [next_image[0], const_batch], next_image[1]
Remember, internally, fit_generator calls train_on_batch(x, y) and evaluate(x, y) - so each must receive the same dimensionality for x and y from both generators.
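Putting it together, the fit_generator call then takes the two wrapped generators (a sketch based on the call from the question):
val_gen_with_const = val_merge_generator()
model.fit_generator(
    train_gen_with_const,
    steps_per_epoch=2,
    epochs=1,
    verbose=2,
    validation_data=val_gen_with_const,
    validation_steps=2)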