Alternative to tf.keras.layers.experimental.preprocessing.Resizing that works on an embedded system (ARMv6)

I use the following model for audio classification:
from tensorflow.keras import layers, models
from tensorflow.keras.layers.experimental import preprocessing

# input_shape is the shape of the 624x129x1 spectrogram described below
model = models.Sequential([
    layers.Input(shape=input_shape),
    preprocessing.Resizing(64, 64),
    layers.Conv2D(64, 3, activation='relu', padding="same"),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.Conv2D(32, 3, activation='relu'),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(2),
])
This works fine on a CPU. Now I would like to run the trained model on a TPU (Coral USB Accelerator attached to a Raspberry Pi Zero, ARMv6). The Edge TPU compiler, which must be used to prepare a model for the Coral, does not accept tf.keras.layers.experimental.preprocessing.Resizing (the second layer in the snippet above), since it only supports a subset of TensorFlow Lite operations. I am therefore looking for an alternative way to resize my 624x129x1 spectrogram to 64x64x1 as a preprocessing step outside of models.Sequential(). Unfortunately, TensorFlow and PyTorch do not run on the RPi Zero, so I cannot use those libraries and need another way to bring the spectrogram to the right size.
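For reference, the closest I have come so far is resizing the spectrogram with plain NumPy before feeding it to the model. This is only a rough sketch using nearest-neighbour sampling (assuming the spectrogram is a 2-D NumPy array), and I am not sure it matches what Resizing(64, 64) does internally, which I believe defaults to bilinear interpolation:

import numpy as np

def resize_nearest(spec, out_h=64, out_w=64):
    # spec: 2-D array of shape (height, width), e.g. the raw spectrogram
    in_h, in_w = spec.shape
    # nearest source row/column for every output row/column
    rows = (np.arange(out_h) * in_h / out_h).astype(int)
    cols = (np.arange(out_w) * in_w / out_w).astype(int)
    resized = spec[rows][:, cols]
    # add the channel dimension the model expects: (64, 64, 1)
    return resized[..., np.newaxis]

If Pillow runs on the Pi Zero, something like PIL.Image.fromarray(spec).resize((64, 64)) with bilinear resampling would probably be closer to the original layer, but I have not tested that on the device.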
Any help or advice is highly appreciated!

Related

How to force distributed training in tensorflow to use more than 1 server?

I am following the official distributed training tutorial (https://www.tensorflow.org/tutorials/distribute/keras), but the Ganglia metrics in Databricks show that only one server is used.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

This outputs 2.
Why is only one server used when multiple (two) servers are available?
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10)
    ])
    model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=['accuracy'])
Is there a way I can force the training to be split across more than one server?
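From what I have read so far, MirroredStrategy only mirrors the model across the devices of a single worker, while MultiWorkerMirroredStrategy is the variant that spreads training across machines, with each machine described by a TF_CONFIG environment variable. A sketch of what I understand that to look like (host addresses are placeholders, and I have not verified this on Databricks):

import json
import os
import tensorflow as tf

# The same script runs on every server; only the task index differs.
# 'host1'/'host2' are placeholder addresses.
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {'worker': ['host1:12345', 'host2:12345']},
    'task': {'type': 'worker', 'index': 0}  # use index 1 on the second server
})

strategy = tf.distribute.MultiWorkerMirroredStrategy()
print('Number of replicas: {}'.format(strategy.num_replicas_in_sync))

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(10)
    ])
    model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=['accuracy'])

Is this the right direction, or can MirroredStrategy itself be made to use both servers?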

Why are encoded representations bad for classification?

Given a pre-trained, well-performing auto-encoder: when I train a classifier on the encodings it produces, the classifier does very poorly. In particular, it does much worse than training a classifier on the raw inputs (i.e. unencoded inputs).
However, when I fine-tune the encoder based on classification loss, the classifier does quite well.
Why are encoded representations bad for classification?
Details: I’m working on CIFAR-100 and trying to classify coarse image labels, i.e. 20 classes (but I think I had the same problem when doing classification on CIFAR-10). The classifier has 5 layers and I’m using dropout:
import tensorflow as tf

num_classes = 20  # coarse CIFAR-100 labels

classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', name='classifier_hidden_1'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(256, activation='relu', name='classifier_hidden_2'),
    tf.keras.layers.Dense(128, activation='relu', name='classifier_hidden_3'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation='relu', name='classifier_hidden_4'),
    tf.keras.layers.Dense(num_classes, activation=None, name='classifier_out'),
], name='classifier')
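For clarity, this is roughly how the two setups I compare differ (a sketch with names of my choosing; encoder stands for the encoder half of the pre-trained auto-encoder, and in practice each setup gets its own fresh copy of the classifier):

# Setup 1: classifier trained on frozen encodings -- performs poorly for me
encoder.trainable = False
frozen_model = tf.keras.Sequential([encoder, classifier])
frozen_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# Setup 2: encoder fine-tuned with the classification loss -- performs well
encoder.trainable = True
finetuned_model = tf.keras.Sequential([encoder, classifier])
finetuned_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])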

Can a convolutional neural network (CNN) be used for feature extraction in unsupervised learning?

I'm doing a side project to learn AI with ANNs. I thought of making an unsupervised model that extracts features from each frame of a video so I can compare them later and detect repeated images.
My idea is to use a CNN to extract the features of each frame, but I can't seem to make it work. As I am still learning, my intuition tells me there is something I am just not understanding.
How can I create an unsupervised model that extracts features from an array of images?
This is what I got:
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

img = load_image_func(???)  # this loads a video and returns a reshaped, ordered list of frames
input_shape = (150, 150, 3)
# The model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', name='conv_1', input_shape=input_shape))
model.add(MaxPooling2D((2, 2), name='maxpool_1'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='conv_2'))
model.add(MaxPooling2D((2, 2), name='maxpool_2'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='conv_3'))
model.add(MaxPooling2D((2, 2), name='maxpool_3'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='conv_4'))
model.add(MaxPooling2D((2, 2), name='maxpool_4'))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu', name='dense_1'))
model.add(Dense(128, activation='relu', name='dense_2'))
model.add(Dense(67500, activation='sigmoid', name='output'))
optimizer=keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss = 'sparse_categorical_crossentropy', optimizer= optimizer, metrics=['accuracy'])
#model.summary()
model.fit(vidcap, vidcap, batch_size=64, epochs=20)
I have the feeling I should be training the model, but as it is unsupervised I don't have training data.
Also, how many units should I put in the output layer, since I don't know how many features will be detected?
Thanks for your time.
Indeed, a CNN will extract several features of an image (e.g. colors, shapes, edges, patterns, etc.).
However, what are you defining as an image repetition? Are you looking for an algorithm that finds similar images? If so, you might want to look into Siamese networks, which do exactly that:
https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
The main idea is that two neural networks are trained together. Then, after training is done, you use both networks to extract features from the two images separately and compare the results to find their similarity.
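To make the comparison step concrete, here is a minimal sketch (not taken from the paper above) that uses a single shared CNN as a feature extractor and compares two frames by cosine similarity; the architecture, embedding size, and threshold are placeholders:

import numpy as np
import tensorflow as tf

# Shared feature extractor: the same weights are applied to both frames.
extractor = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128),  # one 128-dimensional embedding per frame
])

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def frames_look_repeated(frame_a, frame_b, threshold=0.95):
    # frame_a, frame_b: arrays of shape (150, 150, 3), scaled to [0, 1]
    emb_a = extractor(frame_a[np.newaxis])[0].numpy()
    emb_b = extractor(frame_b[np.newaxis])[0].numpy()
    return cosine_similarity(emb_a, emb_b) > threshold

Note that the extractor still has to be trained (for example with a contrastive or triplet objective, as in the Siamese setup) before the similarity scores become meaningful.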

Why is my CNN/Image Classifier model accuracy so low?

I'm currently trying to build a CNN that can detect whether a patient has pneumonia caused by COVID or not, and no matter what parameters I change the model accuracy stays at 49-50%, so it's basically useless because it's the same as a coin flip. Here is my code; I thought I would try using the VGG-16 architecture.
from tensorflow import keras
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, GlobalAveragePooling2D
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Loading in the dataset
traindata = ImageDataGenerator(rescale=1/255)
trainingdata = traindata.flow_from_directory(
    directory="Covid-19CT/TrainingData",
    target_size=(224, 224),
    batch_size=100,
    class_mode="binary")

testdata = ImageDataGenerator(rescale=1/255)
testingdata = testdata.flow_from_directory(
    directory="Covid-19CT/TestingData",
    target_size=(224, 224),
    batch_size=100,
    class_mode="binary")
# Initialize the model w/ Sequential & add layers + input and output <- will refer to the VGG 16 model architecture
model = Sequential()
model.add(Conv2D(input_shape=(224,224,3),filters=64,kernel_size=(2,2),padding="same", activation="relu"))
model.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation ="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=2))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=2))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=2))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=2))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=2))
model.add(GlobalAveragePooling2D())
model.add(Dense(units=4096, activation="relu"))
model.add(Dense(units=4096, activation="relu"))
model.add(Dense(units=1000, activation="relu"))
model.add(Dense(units=1, activation="softmax"))
# Compile the model
model_optimizer = Adam(lr=0.001)
model.compile(optimizer=model_optimizer, loss=keras.losses.binary_crossentropy, metrics=['accuracy'])
# Add the callbacks
checkpoint = ModelCheckpoint(filepath="Covid-19.hdf5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto')
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=50, verbose=1, mode='auto')
fit = model.fit_generator(steps_per_epoch=25, generator=trainingdata, validation_data=testingdata, validation_steps=10,epochs=10,callbacks=[checkpoint,early])
This always gives:
Epoch 1/10
 6/25 [======>.......................] - ETA: 1:22:37 - loss: 7.5388 - accuracy: 0.5083
Well, it just always gives a really poor accuracy...
Additional info:
Some of the images in the data set are JPGs, others are PNGs (not sure if this is the culprit).
The dataset has 2072 images for training Covid CTs and 2098 images for training NonCovid CTs.
The dataset has 576 images for testing Covid CTs and 532 images for testing NonCovid CTs.
The file structure looks like this: Covid19ModelImages -> Training Data & Testing Data; Training Data has 2 subfolders, Covid19CT and noncovid19CT, and Testing Data also has 2 subfolders, Covid19CT and noncovid19CT.
Also: am I just being too impatient? I never let it run past the 1st epoch because I just assume it's never going to get better than 50%. Could it be that the model improves in later epochs?
If anyone would be willing to help out, or if you need any other additional info to maybe help you gain a better understanding of the problem, please let me know!
Since you are using binary cross-entropy, the activation function in the dense layer with 1 unit should be "sigmoid", not "softmax".

Since you are not using a GPU, you have very long training times per epoch. To see whether the model is working correctly, you may want to reduce this time. There are a few things you could do. Try reducing the image size, say to 128 x 128: with 224 x 224 you have 50176 pixels to process versus 16384 for a 128 x 128 image, so you reduce the computation by roughly a factor of 3. You also have two dense layers with 4096 units each; this is computationally expensive and may also lead to overfitting. Try your model without these layers initially and see how it performs.

I am not a fan of early stopping because it is a crutch for avoiding the overfitting issue; if you encounter overfitting, add a dropout layer to help avoid it.

Finally, I recommend you use an adjustable learning rate. The ReduceLROnPlateau callback makes this easy to do. Set it to monitor the validation loss; you can set its parameters to reduce the learning rate by a factor < 1 if the loss fails to decrease after "patience" consecutive epochs. I usually use factor=.5 and patience=1. This also lets you use a larger initial learning rate for faster convergence. See the Keras documentation for ReduceLROnPlateau for details. You need to let your model run for several epochs to see whether the training loss and validation loss are decreasing.
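A minimal sketch of that callback setup, using the factor and patience values mentioned above (the remaining arguments are just reasonable defaults, not something from the question):

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever val_loss has not improved for 1 epoch.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=1, verbose=1)

# Then pass it alongside the existing callbacks, e.g.:
# model.fit(trainingdata, validation_data=testingdata, epochs=30,
#           callbacks=[checkpoint, reduce_lr])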

How to customize AlexNet for different uses

So I've been working on learning some machine learning this past week, and I have been messing around with my own regression CNN, with 128x128 color images as input and a rating as output. Although my dataset is small (around 400 examples in total), I got alright results with a little overfitting (mean absolute error of 0.5 for training and 0.9 for testing, on a 1-10 scale) with the model shown below:
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=(128, 128, 3)),
    keras.layers.Dropout(0.15),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    keras.layers.Conv2D(64, kernel_size=(5, 5), activation='relu'),
    keras.layers.Dropout(0.15),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1)
])
However, not being satisfied with the results, I wanted to try out tried and true models. So I used AlexNet:
model = keras.Sequential([
    keras.layers.Conv2D(filters=96, kernel_size=(11, 11), strides=(4, 4), activation='relu', input_shape=(128, 128, 3), padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(filters=256, kernel_size=(11, 11), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation='relu', padding='same'),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(4096, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(1)
])
However, it converged much more slowly and pretty much plateaued at an MAE of 1.2 for train and 0.9 for test. Although this shows less overfitting, I thought it was strange that I still got the same test results. Is my implementation of AlexNet flawed, or is this just not the right application for AlexNet? I understand it is usually used for classification, but I figured it might be worth trying for regression. Any info/suggestions/criticisms help, thanks!
I don't see anything clearly wrong with your AlexNet implementation, but I'd like to point out a few things.
The way dropout is used in the first model
It is not standard to apply Dropout like that directly after a convolution output. When you apply dropout this way, random units of the convolution output get switched off. But unlike a fully connected layer, convolution outputs have a "spatial" structure, so it makes more sense to me to switch off full channels than random individual neurons. Think of one channel's output as corresponding to a single neuron of a fully connected layer (not the best analogy, but it helps to understand my proposal).
The other option is to get rid of Dropout after the convolution outputs and only have Dropout after the fully connected layers.
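For illustration, dropping whole channels can be done with Keras' SpatialDropout2D; here is a sketch of the first two blocks of your model rewritten that way (the rate 0.15 is simply carried over from your code):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=(128, 128, 3)),
    keras.layers.SpatialDropout2D(0.15),  # drops entire feature maps instead of single units
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    keras.layers.Conv2D(64, kernel_size=(5, 5), activation='relu'),
    keras.layers.SpatialDropout2D(0.15),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # ... the rest of the model stays unchanged
])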
Time taken to converge for AlexNet
AlexNet is significantly larger than model 1, meaning it has far more parameters than your first model, so it makes sense that it takes longer to converge.
Why is the accuracy low?
One thing I can think of is the size of the output just before the Flatten() layer. In model 1 it is 29x29, whereas with AlexNet it is 4x4, which is very small. So your fully connected layers receive very little information from the convolution layers. This might be causing AlexNet to underperform (just a speculation).
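A quick way to check these numbers is to build only the convolutional part and print the shape that feeds into Flatten(); a sketch for the AlexNet variant (the same can be done for model 1):

from tensorflow import keras

conv_part = keras.Sequential([
    keras.layers.Conv2D(96, (11, 11), strides=(4, 4), activation='relu',
                        input_shape=(128, 128, 3), padding='same'),
    keras.layers.MaxPooling2D((2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(256, (11, 11), activation='relu', padding='same'),
    keras.layers.MaxPooling2D((2, 2), strides=(2, 2), padding='same'),
    keras.layers.Conv2D(384, (3, 3), activation='relu', padding='same'),
    keras.layers.Conv2D(384, (3, 3), activation='relu', padding='same'),
    keras.layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
    keras.layers.MaxPooling2D((2, 2), strides=(2, 2), padding='same'),
])
print(conv_part.output_shape)  # (None, 4, 4, 256) for 128x128 inputs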