Group convolution in keras - tensorflow

I created a simple neural network to understand how group convolutions reduce the number of parameters. When I use the groups parameter in the second convolution layer, I get an UnimplementedError. Without the groups parameter, everything works fine. Why does using the groups parameter raise this error? Does it mean group convolution is not available in the Keras API?
import tensorflow
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D,Reshape,MaxPooling2D
from tensorflow.keras.utils import to_categorical
import numpy as np
num_classes = 10
a = np.random.randint(low=0,high=255,size=(100,28,28,1))
b = np.random.randint(low=0,high=10,size=(100,7,7))
a = a.astype('float32')
a = a/255
X_train, Y_train = a[:80], b[:80]
X_test, Y_test = a[80:], b[80:]
num_classes=10
Y_train = to_categorical(Y_train, num_classes)
Y_test = to_categorical(Y_test, num_classes)
# Create the model
model = Sequential()
model.add(Conv2D(8, kernel_size=(3,3),input_shape=(28,28,1),padding='same'))
model.add(Conv2D(8, kernel_size=(3,3),groups=4,input_shape=(28,28,1),padding='same'))
# model.add(Dense(10, input_shape=input_shape, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.add(MaxPooling2D())
model.add(MaxPooling2D())
# model.add(Reshape(target_shape=(10,)))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=10, batch_size=250, verbose=1, validation_split=0.2)
# model.save_weights("model.h5")
# # Test the model after training
# test_results = model.evaluate(X_test, Y_test, verbose=1)
# print(f'Test results - Loss: {test_results[0]} - Accuracy: {test_results[1]}%')
Error
UnimplementedError: Fused conv implementation does not support grouped convolutions for now.
[[node sequential_38/conv2d_37/BiasAdd (defined at <ipython-input-42-e7c1c931a421>:50) ]] [Op:__inference_train_function_8596]
Function call stack:
train_function

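I ran your code in a Colab notebook. According to the documentation, the groups argument is:
A positive integer specifying the number of groups in which the input is split along the channel axis. Each group is convolved separately with filters / groups filters. The output is the concatenation of all the groups results along the channel axis. Input channels and filters must both be divisible by groups.
Your code does not conflict with this: the second layer receives 8 input channels and has 8 filters, and both are divisible by groups=4. So the layer definition itself is valid and should work. The error most likely comes from something else, possibly the convolution kernel of the device you are running on, since the message refers to the fused conv implementation rather than to the layer arguments.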
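To make the parameter saving from the question concrete, here is a small sketch of the arithmetic in plain Python. The helper name `conv2d_param_count` is hypothetical (it is not a Keras function), and it assumes a standard 2D convolution with a bias term, where each filter only sees `in_channels / groups` input channels, as the quoted documentation describes.

```python
def conv2d_param_count(in_channels, filters, kernel_size, groups=1, use_bias=True):
    """Count trainable parameters of a (grouped) 2D convolution.

    Hypothetical helper for illustration only; not part of Keras.
    """
    if in_channels % groups or filters % groups:
        raise ValueError("in_channels and filters must both be divisible by groups")
    kernel_h, kernel_w = kernel_size
    # Each filter only convolves over the input channels of its own group.
    weights = kernel_h * kernel_w * (in_channels // groups) * filters
    bias = filters if use_bias else 0
    return weights + bias


# Second layer from the question: 8 input channels, 8 filters, 3x3 kernel.
standard = conv2d_param_count(8, 8, (3, 3), groups=1)
grouped = conv2d_param_count(8, 8, (3, 3), groups=4)
print(standard, grouped)  # 584 152
```

With groups=4 the layer needs roughly a quarter of the kernel weights, while the number of bias terms stays the same.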

Related

How many nodes should I have in the last layer of neural network for binary classification?

I believed that, if I have a binary-classification problem then I should always have only 1 node in the last layer, since the last layer has to decide about the classification. However, in the following code it is not true.
Let's download the pizza/steak datasets (image dataset) and prepare the data using the ImageDataGenerator:
import zipfile
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.applications import EfficientNetB0, resnet50
from tensorflow.keras.models import Sequential
import numpy as np
import pandas as pd
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip
zip_ref = zipfile.ZipFile("pizza_steak.zip", "r")
zip_ref.extractall()
zip_ref.close()
train_directory = './pizza_steak/train/'
test_directory = './pizza_steak/test/'
IMAGE_SIZE = (224, 224)
image_data_generator = ImageDataGenerator(rescale=1. / 255,
                                          zoom_range=0.2,
                                          shear_range=0.2,
                                          rotation_range=0.2)
train_dt = image_data_generator.flow_from_directory(directory=train_directory,
                                                    class_mode='categorical',
                                                    batch_size=32,
                                                    target_size=IMAGE_SIZE)
test_dt = image_data_generator.flow_from_directory(directory=test_directory,
                                                   class_mode='categorical',
                                                   batch_size=32,
                                                   target_size=IMAGE_SIZE)
and then build, compile a neural-network and fit the data on it:
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=3, activation='relu'))
model.add(Conv2D(filters=16, kernel_size=3, activation='relu'))
model.add(MaxPooling2D())
model.add(Conv2D(filters=16, kernel_size=3, activation='relu'))
model.add(Conv2D(filters=16, kernel_size=3, activation='relu'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_dt,
          epochs=5,
          validation_data=test_dt,
          validation_steps=len(test_dt))
As you can see, the val_accuracy is no better than 0.5000, which is very bad!
Now, if you just change the last layer to model.add(Dense(2, activation='sigmoid')) and run the same model with 2 nodes in the last layer, you end up with a far better result, such as val_accuracy: 0.8680.
How do I know how many nodes I should have in the last layer of a binary-classification model?
Thanks to @Dr.Snoopy; I am adding an answer here just to complete the question.
The point is how the data is labelled by image_data_generator.flow_from_directory().
If we set class_mode='categorical', the targets are one-hot encoded and the number of nodes in the last layer must equal the number of classes in the target. In my case the target has two classes, so I need 2 nodes in the last layer.
However, if we use class_mode='binary', the targets are class indices (0 or 1) and a single node in the last layer is enough.
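The difference between the two label layouts can be sketched without Keras. This is only an illustration: `to_one_hot` is a hypothetical pure-Python stand-in for what class_mode='categorical' produces, and the mapping 0 = pizza, 1 = steak is an assumption made for the example.

```python
def to_one_hot(class_indices, num_classes):
    """Turn integer class indices into one-hot rows (illustrative stand-in)."""
    return [[1.0 if k == idx else 0.0 for k in range(num_classes)]
            for idx in class_indices]


# Assumed class indices for four images: 0 = pizza, 1 = steak.
labels = [0, 1, 1, 0]

# class_mode='binary': one number per sample, so the model needs 1 output node.
binary_targets = [float(idx) for idx in labels]

# class_mode='categorical': one value per class, so the model needs 2 output nodes.
categorical_targets = to_one_hot(labels, num_classes=2)

units_for_binary = 1
units_for_categorical = len(categorical_targets[0])

print(binary_targets)       # [0.0, 1.0, 1.0, 0.0]
print(categorical_targets)  # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
print(units_for_binary, units_for_categorical)  # 1 2
```

The output layer width has to match the width of one target row, which is why Dense(1) performs poorly against one-hot targets with two columns.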

how to replace keras embedding with pre-trained word embedding to CNN

I am currently studying how CNNs can be used for text classification and found some code on Stack Overflow that works with a Keras embedding layer.
I ran the code with the Keras embedding, but now I want to test what happens with a pre-trained embedding. I have downloaded the word2vec API from gensim but don't know how to adapt the code from there.
My question is: how can I replace the Keras embedding layer with a pre-trained embedding such as the word2vec model or GloVe?
Here is the code:
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM, Convolution1D, Flatten, Dropout
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.callbacks import TensorBoard
# Using keras to load the dataset with the top_words
top_words = 10000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# Pad the sequence to the same length
max_review_length = 1600
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# Using embedding from Keras
embedding_vecor_length = 300
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
# Convolutional model (3x conv, flatten, 2x dense)
model.add(Convolution1D(64, 3, padding='same'))
model.add(Convolution1D(32, 3, padding='same'))
model.add(Convolution1D(16, 3, padding='same'))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(180,activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(1,activation='sigmoid'))
# Log to tensorboard
tensorBoardCallback = TensorBoard(log_dir='./logs', write_graph=True)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, callbacks=[tensorBoardCallback], batch_size=64)
# Evaluation on the test set
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
The code below reads the text file containing the pre-trained weights, stores the words and their vectors in a dictionary, then maps them into a new matrix using the vocabulary of your fitted tokenizer (here, the IMDB word index).
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM, Convolution1D, Flatten, Dropout
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.callbacks import TensorBoard
from tensorflow import keras
import itertools
import numpy as np
# Using keras to load the dataset with the top_words
top_words = 10000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
word_index = keras.datasets.imdb.get_word_index()
embedding_vecor_length = 300 # same as the embeds to be loaded below
embeddings_dictionary = dict()
# Read the GloVe file: each line holds a word followed by its vector components
with open('./embeds/glove.6B.300d.txt', 'rb') as glove_file:
    for line in glove_file:
        records = line.split()  # separates each line by whitespace
        word = records[0]  # the first element is the word (as bytes)
        vector_dimensions = np.asarray(
            records[1:], dtype='float32')  # the rest are the weights
        # storing in dictionary
        embeddings_dictionary[word] = vector_dimensions
# len_of_vocab = len(word_index)
embeddings_matrix = np.zeros((top_words, embedding_vecor_length))
# imdb.load_data() shifts word ranks by index_from=3 by default
# (0 = padding, 1 = start, 2 = unknown), so a word with rank `index`
# in word_index appears as `index + 3` in X_train / X_test
index_from = 3
# mapping to a new matrix, using only the words in your tokenizer's vocabulary
for word, index in word_index.items():
    row = index + index_from
    if row >= top_words:
        continue
    # the weights of the individual words in your vocabulary
    embedding_vector = embeddings_dictionary.get(bytes(word, 'utf-8'))
    if embedding_vector is not None:
        embeddings_matrix[row] = embedding_vector
# Pad the sequence to the same length
max_review_length = 1600
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# Using embedding from Keras
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length,
                    input_length=max_review_length, name="embeddinglayer",
                    weights=[embeddings_matrix], trainable=True))
# Convolutional model (3x conv, flatten, 2x dense)
model.add(Convolution1D(64, 3, padding='same'))
model.add(Convolution1D(32, 3, padding='same'))
model.add(Convolution1D(16, 3, padding='same'))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(180, activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
# Log to tensorboard
tensorBoardCallback = TensorBoard(log_dir='./logs', write_graph=True)
model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3,
          callbacks=[tensorBoardCallback], batch_size=64)
# Evaluation on the test set
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
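The core of the approach above (parse a GloVe-style text file, then copy vectors into matrix rows by word index) can be sketched on toy data with only the standard library. The function names `load_glove` and `build_embedding_matrix`, the `offset` parameter, and the three-dimensional toy vectors are all assumptions made for this illustration; with the IMDB data loaded through its default settings, an offset of 3 would correspond to the index shift applied by imdb.load_data.

```python
import io


def load_glove(handle):
    """Parse GloVe-style lines ('word v1 v2 ...') into a dict of float lists."""
    vectors = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, vocab_size, dim, offset=0):
    """Place each known word's vector in row (index + offset); others stay zero."""
    matrix = [[0.0] * dim for _ in range(vocab_size)]
    for word, index in word_index.items():
        row = index + offset
        if row >= vocab_size:
            continue  # word falls outside the vocabulary kept by the model
        vector = vectors.get(word)
        if vector is not None:
            matrix[row] = list(vector)
    return matrix


# Toy stand-in for a GloVe file and a tokenizer vocabulary.
fake_glove = io.StringIO("the 0.1 0.2 0.3\nfilm 0.4 0.5 0.6\n")
word_index = {"the": 1, "film": 2, "zzz": 3}

matrix = build_embedding_matrix(word_index, load_glove(fake_glove),
                                vocab_size=4, dim=3)
for row in matrix:
    print(row)
```

Rows for words without a pre-trained vector (here "zzz" and the padding row 0) stay at zero; with trainable=True in the Embedding layer they can still be learned during training.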

when we should use tf.function decorator

I'm trying to boost the performance of a simple 2NN. Here is the code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.datasets import mnist
from tensorflow import keras
import tensorflow as tf
# load Mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data(path='mnist.npz')
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
# configure the model
model = Sequential()
model.add(Dense(200, activation='relu', input_shape=(784,)))
model.add(Dense(200, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=SGD(learning_rate=0.1), metrics=['accuracy'])
# train and evaluate the model
model.fit(X_train, y_train, batch_size=128, epochs=20, verbose=1, validation_data=(X_test, y_test))
model.evaluate(X_test, y_test)
Now I wonder whether there is any reason to use the @tf.function decorator here, and if so, how?
Your code only uses built-in Keras functions and classes, so there is no need for the @tf.function decorator. @tf.function converts a regular Python function into a TensorFlow graph. The built-in Keras methods you call (fit, evaluate) already run their training and evaluation steps as graphs by default, so decorating anything here would not make it faster. The decorator becomes useful when you write your own functions, such as a custom training step.

Simple Machine Learning example with handwritten digits does not work with conv2d and MaxPooling2D

I built a simple machine learning model with TensorFlow 2 using this code, and everything works fine.
# Import TensorFlow
import tensorflow as tf
print(tf.__version__)
# Import matplotlib library
import matplotlib.pyplot as plt
#Import numpy
import numpy as np
#Dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
print("Evaluation")
model.evaluate(x_test, y_test)
plt.imshow(x_train[6], cmap="gray") # Draw the image
plt.show() # Display the plot
predictions = model.predict(x_train) # Make predictions
print("Prediction: ", np.argmax(predictions[6])) # Print the predicted digit
print("Correct is: ", y_train[6])
My problem is how to add feature-detecting layers like Conv2D and MaxPooling2D. Where do I have to add these layers, and does this influence my plotting and my predictions?
Before the input is passed to Conv2D and MaxPooling2D, it must have 4 dimensions.
x_train and x_test have shape
[BatchSize, 28, 28], but it should be [BatchSize, 28, 28, 1].
So we add a channel dimension at the end using np.expand_dims():
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
model = tf.keras.models.Sequential([
    # input_shape describes a single sample, so the batch dimension is left out
    tf.keras.layers.Conv2D(32, (3, 3), padding="same", input_shape=(28, 28, 1)),
    tf.keras.layers.Activation("relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
Yes, it is going to influence your plotting and predictions. For plotting, you can drop the extra channel axis again, e.g. plt.imshow(x_train[6].squeeze(), cmap="gray").
A convolution layer uses far fewer weights than a dense layer, and max pooling then keeps only the maximum value in each window as the feature used for prediction. Because this reduces the number of features, your accuracy may decrease.
However, when the images are large, e.g. 500*500, we have to apply convolution and max-pooling layers to reduce the features by selecting only the important ones.
If we apply Flatten and Dense directly to an input of 500*500, the program has to initialize a very large number of weights and you can get an Out Of Memory error.
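The weight counts behind this argument can be checked with a few lines of arithmetic. This is a rough sketch: the helper names are hypothetical, a single grayscale channel is assumed for the 500*500 image, and the layer sizes (128 dense units, 32 filters of size 3x3) are taken from the models above. It only counts the weights of the layers themselves, not the activations they produce.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


def conv2d_params(in_channels, filters, kernel_size):
    """Weights plus biases of a standard 2D convolution."""
    kernel_h, kernel_w = kernel_size
    return kernel_h * kernel_w * in_channels * filters + filters


# Flatten + Dense(128) applied directly to a 500x500 grayscale image.
flat_inputs = 500 * 500 * 1
dense_on_raw = dense_params(flat_inputs, 128)

# Conv2D(32, (3, 3)) on the same image: independent of the image size.
conv_layer = conv2d_params(1, 32, (3, 3))

print(dense_on_raw)  # 32000128
print(conv_layer)    # 320
```

The dense layer's weight count grows with the number of pixels, while the convolution's weight count depends only on the kernel size and the channel counts.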

keras custom loss got wrong output when using Bayesian layer

A custom loss function in Keras gives a wrong loss value:
When I use a Bayesian layer (tensorflow_probability.layers.DenseFlipout) together with my custom loss function, the reported loss is wrong. But if I replace the Bayesian layer with a standard tf.keras.layers.Dense layer, the loss is correct. Can anybody help me?
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data as mnist_data
train, valid, test = mnist_data.read_data_sets('~/code/Python')
num_classes = 10
from tensorflow import keras
import tensorflow_probability as tfp
model = keras.Sequential()
#model.add(keras.layers.Dense(10, activation = 'softmax', input_shape=(784,)))
model.add(tfp.layers.DenseFlipout(10, activation = 'softmax', input_shape=(784,)))
sgd = keras.optimizers.SGD(lr=.1, momentum=0.9, nesterov=True)
def my_loss(y_true, y_pred):
    return tf.reduce_mean((y_true - y_pred)**2)
model.compile(loss=my_loss, optimizer=sgd, metrics=['accuracy'])
x_train, y_train = train.images, train.labels
x_test, y_test = test.images, test.labels
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          validation_data=(x_test, y_test),
          shuffle=True)