I am new to Deep Learning. I have learnt the basics through reading and am now trying to implement a real network to see how, and whether, it really works. I chose TensorFlow in DIGITS and the following network because they give out the exact architecture along with training material: Steganalysis with DL.
I wrote the following code for the architecture in Steganalysis with DL by looking at existing networks in DIGITS and the TensorFlow documentation.
from model import Tower
from utils import model_property
import tensorflow as tf
import tensorflow.contrib.slim as slim
import utils as digits

class UserModel(Tower):

    @model_property
    def inference(self):
        x = tf.reshape(self.x, shape=[-1, self.input_shape[0], self.input_shape[1], self.input_shape[2]])
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_initializer=tf.contrib.layers.xavier_initializer(),
                            weights_regularizer=slim.l2_regularizer(0.0001)):
            conv1 = tf.layers.conv2d(inputs=x, filters=64, kernel_size=7, padding='same', strides=2, activation=tf.nn.relu)
            rnorm1 = tf.nn.local_response_normalization(input=conv1)
            conv2 = tf.layers.conv2d(inputs=rnorm1, filters=16, kernel_size=5, padding='same', strides=1, activation=tf.nn.relu)
            rnorm2 = tf.nn.local_response_normalization(input=conv2)
            flatten = tf.contrib.layers.flatten(rnorm2)
            fc1 = tf.contrib.layers.fully_connected(inputs=flatten, num_outputs=1000, activation_fn=tf.nn.relu)
            fc2 = tf.contrib.layers.fully_connected(inputs=fc1, num_outputs=1000, activation_fn=tf.nn.relu)
            fc3 = tf.contrib.layers.fully_connected(inputs=fc2, num_outputs=2)
            sm = tf.nn.softmax(fc3)
            return sm

    @model_property
    def loss(self):
        model = self.inference
        loss = digits.classification_loss(model, self.y)
        accuracy = digits.classification_accuracy(model, self.y)
        self.summaries.append(tf.summary.scalar(accuracy.op.name, accuracy))
        return loss
I tried running it, but the accuracy is pretty low. Could someone tell me whether I've done it completely wrong, what is wrong with it, and how to code it properly?
UPDATE: Thank you Nessuno! With the fix you mentioned I came up with this code:
from model import Tower
from utils import model_property
import tensorflow as tf
import tensorflow.contrib.slim as slim
import utils as digits

class UserModel(Tower):

    @model_property
    def inference(self):
        x = tf.reshape(self.x, shape=[-1, self.input_shape[0], self.input_shape[1], self.input_shape[2]])
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_initializer=tf.contrib.layers.xavier_initializer(),
                            weights_regularizer=slim.l2_regularizer(0.00001)):
            conv1 = tf.layers.conv2d(inputs=x, filters=64, kernel_size=7, padding='valid', strides=2, activation=tf.nn.relu)
            rnorm1 = tf.nn.local_response_normalization(input=conv1)
            conv2 = tf.layers.conv2d(inputs=rnorm1, filters=16, kernel_size=5, padding='valid', strides=1, activation=tf.nn.relu)
            rnorm2 = tf.nn.local_response_normalization(input=conv2)
            flatten = tf.contrib.layers.flatten(rnorm2)
            fc1 = tf.contrib.layers.fully_connected(inputs=flatten, num_outputs=1000, activation_fn=tf.nn.relu)
            fc2 = tf.contrib.layers.fully_connected(inputs=fc1, num_outputs=1000, activation_fn=tf.nn.relu)
            fc3 = tf.contrib.layers.fully_connected(inputs=fc2, num_outputs=2, activation_fn=None)
            return fc3

    @model_property
    def loss(self):
        model = self.inference
        loss = digits.classification_loss(model, self.y)
        accuracy = digits.classification_accuracy(model, self.y)
        self.summaries.append(tf.summary.scalar(accuracy.op.name, accuracy))
        return loss
Solver type is SGD. Learning rate is 0.001. I am shuffling the training data. I have increased the training data to 6000 images (3000 per category; 20% of that is reserved for validation). I downloaded the training data from this link. But I am only getting the following graph. I think this is overfitting. Do you have any suggestions to improve the validation accuracy?
In NVIDIA DIGITS, classification_loss, exactly like tf.nn.softmax_cross_entropy_with_logits in TensorFlow, expects as input a linear layer of neurons (logits).
Instead, you're passing it sm = tf.nn.softmax(fc3), hence you're applying the softmax operation twice, and this is the reason for your low accuracy.
In order to solve this issue, just change the model output layer to
fc3 = slim.fully_connected(fc2, 2, activation_fn=None, scope='fc3')
return fc3
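To see concretely why the double softmax hurts (a small NumPy illustration of my own, not part of the original answer): a confident prediction becomes nearly uniform after a second softmax, so the loss can barely tell the two classes apart.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([4.0, -4.0])       # a confident two-class prediction (like fc3)
print(softmax(logits))               # ~[0.9997, 0.0003]
print(softmax(softmax(logits)))      # ~[0.73, 0.27] -- squashed back towards uniform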
Here are the accuracy and loss plots for the class-weighted version:
Here are the accuracy and loss plots for the unweighted version:
Here is the code. The only difference between the above two versions is that one uses the class-weights dictionary and one doesn't. (General advice about how this is set up is also welcome -- as you can see, I am very new to this!)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import optimizers
from tensorflow.keras.applications.resnet_v2 import ResNet50V2
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Rescaling, Conv2D, MaxPool2D, Flatten
#Create datasets
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
'/content/drive/MyDrive/Colab Notebooks/train/All classes/',
labels="inferred",
label_mode="int",
validation_split=0.2,
seed=1337,
subset="training",
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
'/content/drive/MyDrive/Colab Notebooks/train/All classes/',
labels="inferred",
label_mode="int",
validation_split=0.2,
seed=1337,
subset="validation",
)
test_ds = tf.keras.preprocessing.image_dataset_from_directory(
'/content/drive/MyDrive/Colab Notebooks/test/All classes/',
labels="inferred",
label_mode="int",
)
#Import ResNet
base_model = ResNet50V2(weights='imagenet', include_top=False)
#Create basic network to append to ResNet above
x = base_model.output
x = Rescaling(1.0 / 255)(x)
x = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(256,256,3), padding="same")(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2), strides=2)(x)
x = GlobalAveragePooling2D()(x)
predictions = Dense(units=5, activation='softmax')(x)
#Merge the models
model = Model(inputs=base_model.input, outputs=predictions)
#Freeze ResNet layers
for layer in base_model.layers:
layer.trainable = False
#Compile
model.compile(optimizer=keras.optimizers.Adam(1e-3), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
#These are the weights. They are derived here from their numbers in the train dataset -- there are 25,811
#files in class 0, 2444 files in class 1, etc. This dictionary was not called for the unweighted version.
class_weight = {0: 1.0,
1: 25811.0/2444.0,
2: 25811.0/5293.0,
3: 25811.0/874.0,
4: 25811.0/709.0}
#Training the model
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
filepath='/content/drive/MyDrive/Colab Notebooks/ResNet/',
save_weights_only=False,
mode='auto',
save_best_only=True,
save_freq= 'epoch')
history = model.fit(
x=train_ds,
epochs=30,
class_weight=class_weight,
validation_data=val_ds,
callbacks=[model_checkpoint_callback]
)
#Evaluating
loss, acc = model.evaluate(test_ds)
print("Accuracy", acc)
Also, some other questions:
Should the metrics=['accuracy'] actually be metrics=['sparse_categorical_accuracy']?
Should class_weight=class_weight actually be sample_weight=sample_weight? I couldn't tell the difference in the documentation, although most examples seem to use class_weight.
I only used padding in one Conv2D layer, and this was a bodge to force the whole thing to actually compile. Should I have been more consistent and used it for the other one too?
On that note, are there other ways my simple appended CNN model (it was called 'predictions') could be laid out to make better sense?
Ah yes, before I forget: as you can see from the above code, I didn't preprocess the data in accordance with the Keras guidance for ResNet. I figured it probably wouldn't make that much of a difference (and also I was having trouble implementing it). Would that be worth looking into? I suppose the unweighted model shows a very high accuracy... probably too high, now that I'm looking at it... oh, dear.
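For reference, here is a sketch of what that guidance amounts to (my own addition, not part of the original post, and it assumes the datasets yield raw [0, 255] pixel values, as image_dataset_from_directory does by default): tf.keras.applications.resnet_v2.preprocess_input scales pixels to [-1, 1] and would take the place of the Rescaling(1./255) layer used above.
from tensorflow.keras.applications.resnet_v2 import preprocess_input

# Scale pixels to [-1, 1] the way ResNetV2 expects, applied to the tf.data pipelines.
train_ds = train_ds.map(lambda images, labels: (preprocess_input(images), labels))
val_ds = val_ds.map(lambda images, labels: (preprocess_input(images), labels))
test_ds = test_ds.map(lambda images, labels: (preprocess_input(images), labels))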
I shall be so very thankful for any advice!
I am practicing Conv1D on TensorFlow 2.7, and I am sanity-checking a decoder I developed by seeing whether it can overfit a single example. The model doesn't learn when trained on only one example and can't overfit it. I want to understand this strange behaviour, please. This is the link to the notebook on Colab: Notebook.
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, Dense, BatchNormalization
from tensorflow.keras.layers import ReLU, MaxPool1D, GlobalMaxPool1D
from tensorflow.keras import Model
import numpy as np
def Decoder():
    inputs = Input(shape=(68, 3), name='Input_Tensor')
    # First hidden layer
    conv1 = Conv1D(filters=64, kernel_size=1, name='Conv1D_1')(inputs)
    bn1 = BatchNormalization(name='BN_1')(conv1)
    relu1 = ReLU(name='ReLU_1')(bn1)
    # Second hidden layer
    conv2 = Conv1D(filters=64, kernel_size=1, name='Conv1D_2')(relu1)
    bn2 = BatchNormalization(name='BN_2')(conv2)
    relu2 = ReLU(name='ReLU_2')(bn2)
    # Third hidden layer
    conv3 = Conv1D(filters=64, kernel_size=1, name='Conv1D_3')(relu2)
    bn3 = BatchNormalization(name='BN_3')(conv3)
    relu3 = ReLU(name='ReLU_3')(bn3)
    # Fourth hidden layer
    conv4 = Conv1D(filters=128, kernel_size=1, name='Conv1D_4')(relu3)
    bn4 = BatchNormalization(name='BN_4')(conv4)
    relu4 = ReLU(name='ReLU_4')(bn4)
    # Fifth hidden layer
    conv5 = Conv1D(filters=1024, kernel_size=1, name='Conv1D_5')(relu4)
    bn5 = BatchNormalization(name='BN_5')(conv5)
    relu5 = ReLU(name='ReLU_5')(bn5)
    global_features = GlobalMaxPool1D(name='GlobalMaxPool1D')(relu5)
    global_features = tf.keras.layers.Reshape((1, -1))(global_features)
    conv6 = Conv1D(filters=12, kernel_size=1, name='Conv1D_6')(global_features)
    bn6 = BatchNormalization(name='BN_6')(conv6)
    outputs = ReLU(name='ReLU_6')(bn6)
    model = Model(inputs=[inputs], outputs=[outputs], name='Decoder')
    return model
model = Decoder()
model.summary()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
losses = tf.keras.losses.MeanSquaredError()
model.compile(optimizer=optimizer, loss=losses)
n = 1
X = np.random.rand(n, 68, 3)
y = np.random.rand(n, 1, 12)
model.fit(x=X,y=y, verbose=1, epochs=30)
I think the problem here is that you have no basis to learn from, so you can't overfit. In every epoch you have just one example that is used to adapt the weights of the network, so there is not enough opportunity for the weights to adapt and overfit.
To get the overfitting you want, have the same data appear multiple times inside your training dataset, so the weights can change enough to overfit, because you only change them by one small step per epoch.
A deeper look into back propagation might help you get a better understanding of the concept. Click
I took the liberty of adapting your notebook and enhanced the dataset as follows:
n = 1
X = np.random.rand(n, 68, 3)
y = np.random.rand(n, 1, 12)
for i in range(0, 10):
    X = np.append(X, X, axis=0)
    y = np.append(y, y, axis=0)
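(As a side note of mine, not part of the original answer: each pass of the loop doubles the arrays, so it ends up with 2**10 = 1024 identical copies of the single example; np.repeat applied to the original one-example arrays would achieve the same thing in one call.)
X = np.repeat(X, 1024, axis=0)   # same effect as doubling X ten times
y = np.repeat(y, 1024, axis=0)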
And the output would be:
I am trying to use a CNN to classify cats/dogs and noticed something strange.
When I define the model compile statement as below -
cat_dog_model.compile(optimizer=optimizers.Adam(),
metrics=[metrics.Accuracy()], loss=losses.binary_crossentropy)
my accuracy is very bad - something like 0.15% after 25 epochs.
When I define the same as
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
my accuracy shoots up to 55% in the first epoch and is almost 80% by epoch 25.
When I read the Keras doc - https://keras.io/api/optimizers/ they mention explicitly that
You can either instantiate an optimizer before passing it to model.compile(), as in the above example, or you can pass it by its string identifier. In the latter case, the default parameters for the optimizer will be used.
The metrics parameter is also as per the API - Keras Metrics API
So as per my understanding I am using default parameters in both cases. Also, when I change only the metrics parameter to the hard-coded string, I get the same (good) accuracy. So somehow the accuracy metric is causing this issue, but I can't figure out why - any help is appreciated.
My question is: why is hard-coding the metric better than defining it as a parameter object?
Some more details: I am using about 8k images for training and about 2k images for validation.
Sample code (you can change line number 32 to get different results):
from keras import models, layers, losses, metrics, optimizers
import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator, load_img,img_to_array
train_datagen = ImageDataGenerator(rescale = 1./255,shear_range = 0.2,zoom_range = 0.2,horizontal_flip = True)
train_set = train_datagen.flow_from_directory('/content/drive/MyDrive/....../training_set/',
target_size = (64, 64),batch_size = 32,class_mode = 'binary')
test_datagen = ImageDataGenerator(rescale = 1./255)
test_set = test_datagen.flow_from_directory(
'/content/drive/MyDrive/........./test_set/',
target_size = (64, 64),batch_size = 32,class_mode = 'binary')
cat_dog_model = models.Sequential()
cat_dog_model.add(layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))
cat_dog_model.add(layers.MaxPool2D(pool_size=2, strides=2))
cat_dog_model.add(layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
cat_dog_model.add(layers.MaxPool2D(pool_size=2, strides=2) )
cat_dog_model.add(layers.Flatten())
cat_dog_model.add(layers.Dense(units=128, activation='relu'))
cat_dog_model.add(layers.Dense(units=1, activation='sigmoid'))
cat_dog_model.compile(optimizer =optimizers.Adam(), metrics= [metrics.Accuracy()], loss=losses.binary_crossentropy)
cat_dog_model.summary()
cat_dog_model.fit(x=train_set,validation_data=test_set, epochs=25)
One of the two pieces of code I took from a site just returns noise to me, while the other one works perfectly fine. I tried to make them very similar (one of them was written with functions and the other one wasn't, so I took them out of functions, for example), and now the only difference is that one of them randomly selects each batch using NumPy's random, while the other one uses tf.data to get the batches, I believe.
Can someone explain to me why the NumPy one doesn't work? In the end, they should be accomplishing similar things. Why isn't this the case?
Here's the working code:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2DTranspose, Conv2D, BatchNormalization, Reshape, LeakyReLU, Flatten, Dropout
from IPython import display
(x_train, _), (_,_) = keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.0
batch_size = 128
dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(1000)
dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)
generator = Sequential()
generator.add(Dense(7*7*128, input_dim=100))
generator.add(Reshape((7,7,128)))
generator.add(BatchNormalization())
generator.add(Conv2DTranspose(64, kernel_size=5, strides=2, padding='same', activation='selu'))
generator.add(BatchNormalization())
generator.add(Conv2DTranspose(1, kernel_size=5, strides=2, padding='same', activation='tanh'))
discriminator = Sequential()
discriminator.add(Conv2D(64, kernel_size=5, strides=2, padding='same', input_shape=(28,28,1)))
discriminator.add(LeakyReLU(0.2))
discriminator.add(Dropout(0.3))
discriminator.add(Conv2D(128, kernel_size=5, strides=2, padding='same'))
discriminator.add(LeakyReLU(0.2))
discriminator.add(Dropout(0.3))
discriminator.add(Flatten())
discriminator.add(Dense(1, activation='sigmoid'))
discriminator.compile(loss='binary_crossentropy', optimizer='rmsprop')
discriminator.trainable = False
gan = Sequential([generator, discriminator])
gan.compile(loss='binary_crossentropy', optimizer='rmsprop')
def train(gan, dataset, batch_size, epochs=5):
    gen, disc = gan.layers
    for epoch in range(epochs):
        print('epoch:{0}/{1}'.format(epoch + 1, epochs))
        for x_batch in dataset:
            # noise = tf.random.normal(shape=[batch_size, 100])
            noise = np.random.normal(0, 1, size=(batch_size, 100))
            gen_img = generator(noise)
            x_fake_and_real = tf.concat([gen_img, tf.reshape(x_batch, (128, 28, 28, 1))], axis=0)
            y1 = np.zeros(2 * batch_size)
            y1[batch_size:] = 1
            discriminator.trainable = True
            discriminator.train_on_batch(x_fake_and_real, y1)
            noise = np.random.normal(0, 1, size=(batch_size, 100))
            y2 = np.ones(batch_size)
            discriminator.trainable = False
            gan.train_on_batch(noise, y2)
        noise = np.random.normal(0, 1, size=(1, 100))
        pred = generator.predict(noise)
        plt.imshow(pred.reshape(28, 28))
train(gan, dataset, 128)
#This is the test:
noise = np.random.normal(0,1,size=(1, 100))
pred = generator.predict(noise)
plt.imshow(pred.reshape(28,28), cmap='gray')
outputs this
Here's the code that refuses to work and returns a noisy image back:
from tensorflow.keras.datasets import mnist
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Conv2DTranspose, Conv2D, BatchNormalization, Reshape, LeakyReLU, Flatten, Dropout
(x_train, _), (_,_) = mnist.load_data()
x_train = x_train.astype('float32') / 255
x_train = x_train.reshape(-1, 28,28,1)
generator = Sequential()
generator.add(Dense(7*7*128, input_dim=100))
generator.add(Reshape((7,7,128)))
generator.add(BatchNormalization())
generator.add(Conv2DTranspose(64, kernel_size=5, strides=2, padding='same', activation='selu'))
generator.add(BatchNormalization())
generator.add(Conv2DTranspose(1, kernel_size=5, strides=2, padding='same', activation='tanh'))
discriminator = Sequential()
discriminator.add(Conv2D(64, kernel_size=5, strides=2, padding='same', input_shape=(28,28,1)))
discriminator.add(LeakyReLU(0.2))
discriminator.add(Dropout(0.3))
discriminator.add(Conv2D(128, kernel_size=5, strides=2, padding='same'))
discriminator.add(LeakyReLU(0.2))
discriminator.add(Dropout(0.3))
discriminator.add(Flatten())
discriminator.add(Dense(1, activation='sigmoid'))
discriminator.trainable = False
gan = Sequential([generator, discriminator])
discriminator.compile(loss='binary_crossentropy', optimizer='rmsprop')
gan.compile(loss='binary_crossentropy', optimizer='rmsprop')
dLosses = []
gLosses = []
def train(gan, epochs=10, batch_size=128):
    generator, discriminator = gan.layers
    batch_count = int(x_train.shape[0] / batch_size)
    print('Epochs: ', epochs)
    print('Batch Size: ', batch_size)
    print('Batch Count: ', batch_count)
    for epoch in range(1, epochs + 1):
        print('-' * 15, 'Epoch: {0}'.format(epoch), '-' * 15)
        for _ in range(batch_count):
            noise = np.random.normal(0, 1, size=[batch_size, 100])
            img_batch = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]
            gen_img = generator.predict(noise)
            X = np.concatenate([gen_img, img_batch])
            dis_y = np.zeros(2 * batch_size)
            dis_y[batch_size:] = 1
            discriminator.trainable = True
            dloss = discriminator.train_on_batch(X, dis_y)
            # new batch
            noise = np.random.normal(0, 1, size=(batch_size, 100))
            yGen = np.ones(batch_size)
            discriminator.trainable = False
            gloss = gan.train_on_batch(noise, yGen)
        dLosses.append(dloss)
        gLosses.append(gloss)
train(gan)
#Test code:
noise = np.random.normal(0,1,size=(1, 100))
pred = generator(noise)
plt.imshow(pred.reshape(28,28), cmap='gray')
outputs this
I still don't know why one of them worked and the other didn't. But I finally built a GAN that works consistently using TF 2.0. It uses the WGAN-GP architecture. Even though there were a lot of code examples of this architecture for PyTorch, I couldn't find anything for Keras, and most of what I did find was outdated (coded in TF 1.0).
So I built it myself! Here's the link to my GitHub. I know this is not a direct answer to my question, but from what I've researched, WGAN-GP is simply a better way to build a GAN than DCGAN. It also didn't feel right to just abandon the question even though I found an answer, even if it's indirect.
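For anyone wondering what WGAN-GP adds over a DCGAN: the core extra piece is a gradient penalty on the critic. Below is a minimal sketch of that term (my own illustration for 28x28x1 images, not code taken from the linked repository):
import tensorflow as tf

def gradient_penalty(critic, real_images, fake_images):
    # WGAN-GP: sample points on straight lines between real and fake images
    # and push the critic's gradient norm at those points towards 1.
    batch_size = tf.shape(real_images)[0]
    eps = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = eps * real_images + (1.0 - eps) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        critic_out = critic(interpolated, training=True)
    grads = tape.gradient(critic_out, interpolated)
    norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean((norm - 1.0) ** 2)

# The critic loss then becomes: mean(critic(fake)) - mean(critic(real)) + lambda * gradient_penalty(...)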
Anyways, good luck to everyone :)
I have translated a PyTorch program into Keras.
A working PyTorch program:
import numpy as np
import cv2
import torch
import torch.nn as nn
from skimage import segmentation
np.random.seed(1)
torch.manual_seed(1)
fi = "in.jpg"
class MyNet(nn.Module):
    def __init__(self, n_inChannel, n_outChannel):
        super(MyNet, self).__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(n_inChannel, n_outChannel, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(n_outChannel),
            nn.Conv2d(n_outChannel, n_outChannel, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(n_outChannel),
            nn.Conv2d(n_outChannel, n_outChannel, kernel_size=1, stride=1, padding=0),
            nn.BatchNorm2d(n_outChannel)
        )

    def forward(self, x):
        return self.seq(x)
im = cv2.imread(fi)
data = torch.from_numpy(np.array([im.transpose((2, 0, 1)).astype('float32')/255.]))
data = data.cuda()
labels = segmentation.slic(im, compactness=100, n_segments=10000)
labels = labels.flatten()
u_labels = np.unique(labels)
label_indexes = np.array([np.where(labels == u_label)[0] for u_label in u_labels])
n_inChannel = 3
n_outChannel = 100
model = MyNet(n_inChannel, n_outChannel)
model.cuda()
model.train()
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
label_colours = np.random.randint(255,size=(100,3))
for batch_idx in range(100):
    optimizer.zero_grad()
    output = model(data)[0]
    output = output.permute(1, 2, 0).view(-1, n_outChannel)
    ignore, target = torch.max(output, 1)
    im_target = target.data.cpu().numpy()
    nLabels = len(np.unique(im_target))
    im_target_rgb = np.array([label_colours[c % 100] for c in im_target])  # correct position of "im_target"
    im_target_rgb = im_target_rgb.reshape(im.shape).astype(np.uint8)
    for inds in label_indexes:
        u_labels_, hist = np.unique(im_target[inds], return_counts=True)
        im_target[inds] = u_labels_[np.argmax(hist, 0)]
    target = torch.from_numpy(im_target)
    target = target.cuda()
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
    print(batch_idx, '/', 100, ':', nLabels, loss.item())
    if nLabels <= 3:
        break
fo = "out.jpg"
cv2.imwrite(fo, im_target_rgb)
(source: https://github.com/kanezaki/pytorch-unsupervised-segmentation/blob/master/demo.py)
My translation into Keras:
import cv2
import numpy as np
from skimage import segmentation
from keras.layers import Conv2D, BatchNormalization, Input, Reshape
from keras.models import Model
import keras.backend as k
from keras.optimizers import SGD, Adam
from skimage.util import img_as_float
from skimage import io
from keras.models import Sequential
np.random.seed(0)
fi = "in.jpg"
im = cv2.imread(fi).astype(float)/255.
labels = segmentation.slic(im, compactness=100, n_segments=10000)
labels = labels.flatten()
print (labels.shape)
u_labels = np.unique(labels)
label_indexes = [np.where(labels == u_label)[0] for u_label in np.unique(labels)]
n_channels = 100
model = Sequential()
model.add ( Conv2D(n_channels, kernel_size=3, activation='relu', input_shape=im.shape, padding='same'))
model.add( BatchNormalization())
model.add( Conv2D(n_channels, kernel_size=3, activation='relu', padding='same'))
model.add( BatchNormalization())
model.add( Conv2D(n_channels, kernel_size=1, padding='same'))
model.add( BatchNormalization())
model.add( Reshape((im.shape[0] * im.shape[1], n_channels)))
img = np.expand_dims(im,0)
print (img.shape)
output = model.predict(img)
print (output.shape)
im_target = np.argmax(output[0], 1)
print (im_target.shape)
for inds in label_indexes:
    u_labels_, hist = np.unique(im_target[inds], return_counts=True)
    im_target[inds] = u_labels_[np.argmax(hist, 0)]

def custom_loss(loss_target, loss_output):
    return k.categorical_crossentropy(target=k.stack(loss_target), output=k.stack(loss_output), from_logits=True)
model.compile(optimizer=SGD(lr=0.1, momentum=0.9), loss=custom_loss)
model.fit(img, output, epochs=100, batch_size=1, verbose=1)
pred_result = model.predict(x=[img])[0]
print (pred_result.shape)
target = np.argmax(pred_result, 1)
print (target.shape)
nLabels = len(np.unique(target))
label_colours = np.random.randint(255, size=(100, 3))
im_target_rgb = np.array([label_colours[c % 100] for c in im_target])
im_target_rgb = im_target_rgb.reshape(im.shape).astype(np.uint8)
cv2.imwrite("out.jpg", im_target_rgb)
However, the Keras output is really different from that of PyTorch.
Input image:
Pytorch result:
Keras result:
Could someone help me with this translation?
Edit 1:
I corrected two errors as advised by @sebrockm:
1. removed `relu` from the last conv layer
2. added `from_logits=True` in the loss function
Also, I changed the number of conv layers from 4 to 3 to match the original code.
However, the output image did not improve, and the `loss` turned negative:
Epoch 99/100
1/1 [==============================] - 0s 92ms/step - loss: -22.8380
Epoch 100/100
1/1 [==============================] - 0s 99ms/step - loss: -23.039
I think the Keras code lacks a connection between the model and the output, but I could not figure out how to make this connection.
Two major mistakes that I see (likely related):
The last convolutional layer in the original model does not have an activation function, while your translation uses relu.
The original model uses CrossEntropyLoss as the loss function, while your model uses categorical_crossentropy with from_logits=False (the default argument). Without the mathematical background the difference is tricky to explain, but in short: CrossEntropyLoss has a softmax built in, which is why the model doesn't have one on the last layer. To do the same in Keras, use k.categorical_crossentropy(..., from_logits=True). "from_logits" means the input values are expected not to be "softmaxed", i.e. all values can be arbitrary. Currently, your loss function expects the output values to be "softmaxed", i.e. all values must be between 0 and 1 (and sum to 1).
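As a small illustration of that point (my own sketch, using the tf.keras losses rather than the backend call mentioned above):
import tensorflow as tf

logits = tf.constant([[2.0, -1.0, 0.5]])      # raw, un-softmaxed model output
target = tf.constant([[1.0, 0.0, 0.0]])

# Raw logits with from_logits=True ...
a = tf.keras.losses.categorical_crossentropy(target, logits, from_logits=True)
# ... give the same loss as softmaxing first and using the default from_logits=False.
b = tf.keras.losses.categorical_crossentropy(target, tf.nn.softmax(logits))
print(a.numpy(), b.numpy())   # identical values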
Update:
One other mistake, likely a huge one: in Keras, you calculate the output once in the beginning and never change it from there on. Then you train your model to fit this initially generated output.
In the original PyTorch code, target (the variable being trained against) gets updated in every training iteration.
So you cannot use Keras' fit method, which is designed to do the entire training for you given fixed training data. You will have to replicate the training loop manually, just as is done in the PyTorch code. I'm not sure how easily this can be done with the API Keras provides; train_on_batch is one method you will surely need in your manual loop. You will have to do some more work, I'm afraid...
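A rough, untested sketch of what such a manual loop could look like, reusing the names from the question's Keras code (the one-hot conversion and the stopping criterion are my own assumptions, mirroring the PyTorch loop, not something given in this answer):
for batch_idx in range(100):
    output = model.predict(img)[0]                      # (H*W, n_channels) logits
    im_target = np.argmax(output, axis=1)               # current per-pixel labels
    for inds in label_indexes:                          # majority vote inside each superpixel
        u_labels_, hist = np.unique(im_target[inds], return_counts=True)
        im_target[inds] = u_labels_[np.argmax(hist)]
    target = np.eye(n_channels)[im_target][np.newaxis]  # one-hot targets, shape (1, H*W, n_channels)
    loss = model.train_on_batch(img, target)            # one gradient step, like optimizer.step()
    print(batch_idx, loss, len(np.unique(im_target)))
    if len(np.unique(im_target)) <= 3:
        break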