Keras: difference of InputLayer and Input - tensorflow

I built a model using Keras with TensorFlow. I use InputLayer with these lines of code:
img1 = tf.placeholder(tf.float32, shape=(None, img_width, img_heigh, img_ch))
first_input = InputLayer(input_tensor=img1, input_shape=(img_width, img_heigh, img_ch))
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
But I get this error:
ValueError: Layer 1st_conv1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.engine.topology.InputLayer'>. Full input: [<keras.engine.topology.InputLayer object at 0x00000000112170F0>]. All inputs to the layer should be tensors.
When I use Input like this, it works fine:
first_input = Input(tensor=img1, shape=(224, 224, 3), name='1st_input')
first_dense = Conv2D(16, 3, 3, activation='relu', border_mode='same', name='1st_conv1')(first_input)
What is the difference between InputLayer and Input?

InputLayer is a layer.
Input is a tensor.
You can only call layers passing tensors to them.
The idea is:
outputTensor = SomeLayer(inputTensor)
So, only Input can be passed because it's a tensor.
Honestly, I have no idea about the reason for the existence of InputLayer. Maybe it's supposed to be used internally. I never used it, and it seems I'll never need it.
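To make the layer/tensor distinction concrete, here is a minimal sketch, assuming a TensorFlow 2.x install with tf.keras. The shapes (4 inputs, 8 units) are arbitrary choices for illustration and do not come from the question.

```python
import tensorflow as tf

# Input() returns a symbolic tensor, so it can be passed to a layer call.
inp = tf.keras.Input(shape=(4,))
out = tf.keras.layers.Dense(8)(inp)
model = tf.keras.Model(inp, out)

# Keras created an InputLayer behind the scenes to produce `inp`;
# in a functional model it shows up as the first entry of model.layers.
first_layer = model.layers[0]
print(type(first_layer).__name__)
print(tuple(out.shape))
```

So the InputLayer object still exists in the model, but the thing you pass to the next layer is the tensor returned by Input.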

According to the TensorFlow website, "It is generally recommend to use the functional layer API via Input, (which creates an InputLayer) without directly using InputLayer."
You can read more on that documentation page.

Input: Used for creating a functional model
inp=tf.keras.Input(shape=[?,?,?])
x=layers.Conv2D(.....)(inp)
Input Layer: used for creating a sequential model
x=tf.keras.Sequential()
x.add(tf.keras.layers.InputLayer(shape=[?,?,?]))
And the other difference is that
When using the Keras Sequential model, the InputLayer can be skipped by moving the input_shape parameter to the first layer after it.
That is, you go from this
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(4,)),
    tf.keras.layers.Dense(8)])
to this
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, input_shape=(4,))])

To define it in simple words:
keras.layers.Input is used to instantiate a Keras Tensor. In this case, your data is probably not a tf tensor, maybe an np array.
On the other hand, keras.layers.InputLayer is a layer where your data is already defined as one of the tf tensor types, i.e., can be a ragged tensor or constant or other types.
I hope this helps!

Related

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.
Here is how I create the model:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?
Moreover, if from the original 64, I lower the number (shapes) of X_train and y_train, say, to: (19,3,2) and (19,3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even more weird for me is that if I specify a single unit for the output (last) layer, instead of 3 like this:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,)
I am very confused.
Whenever you are unsure how the data flows through your model, use model.summary() to see the details and what happens to the shape of the data in each layer.
In this case, each sample is a 2D array of shape (3, 2) and each label is a 1D array of shape (3,), but the model contains only Dense layers. A Dense layer acts on the last axis only, so it does not collapse the extra dimension: with input (3, 2), Dense(50) outputs (3, 50) and Dense(3) outputs (3, 3), which is the [32,3,3] in the error. For the same reason you cannot feed an image directly to a Dense layer and expect a flat output. Instead, use a layer suited to the data, such as Conv2D, or Flatten your input (make it 1D) before the first Dense layer. Otherwise the extra dimension stays in the output.
Inference: if the input and output dimensions differ, the shape has to change somewhere in the model. The most common ways to do that are a Flatten layer, GlobalAveragePooling, and so on.
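To see where the [32,3,3] comes from, here is a small sketch in plain Python. The helper dense_output_shape is hypothetical (it is not a Keras function); it only mimics the rule that a Dense layer replaces the last axis with its number of units and leaves the other axes alone.

```python
def dense_output_shape(input_shape, units):
    """Shape after a Dense layer: only the last axis is replaced by `units`."""
    return tuple(input_shape[:-1]) + (units,)

# Without Flatten: one batch of 32 samples, each of shape (3, 2).
shape = (32, 3, 2)
shape = dense_output_shape(shape, 50)  # Dense(50) keeps the middle axis
shape = dense_output_shape(shape, 3)   # Dense(3) still keeps it
print(shape)

# With Flatten first: each sample becomes 3 * 2 = 6 values.
flat = (32, 3 * 2)
flat = dense_output_shape(dense_output_shape(flat, 50), 3)
print(flat)
```

The first result has a middle axis that the (32, 3) labels lack, which is why the loss cannot line the two up; the flattened version matches the labels.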
When you pass an input to a dense layer, the input should be flattened first. There are 2 ways to deal with this:
Way 1: Adding a Flatten layer as the first layer of your model:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
model = Sequential()
model.add(Flatten(input_shape=(3,2)))
model.add(Dense(50, 'relu'))
model.add(Dense(3))
Way 2: Flattening each 3x2 sample to 6 values before passing the inputs to your model, keeping the batch dimension:
X_train = tf.reshape(X_train, shape=(-1, 6))
Then change the input shape of the first layer to:
model.add(Dense(50, 'relu', input_shape=(6,)))

Keras remove activation function of last layer

I want to use ResNet50 with Imagenet weights.
The last layer of ResNet50 is (from here)
x = layers.Dense(1000, activation='softmax', name='fc1000')(x)
I need to keep the weights of this layer but remove the softmax function.
I want to manually change it so my last layer looks like this
x = layers.Dense(1000, name='fc1000')(x)
but the weights stay the same.
Currently I call my net like this
resnet = Sequential([
    Input(shape=(224,224,3)),
    ResNet50(weights='imagenet', input_shape=(224,224,3))
])
I need the Input layer because otherwise the model.compile says that placeholders aren't filled.
Generally, there are two ways of achieving this:
Quick way - supported functions:
To change the final layer's activation function, you can pass an argument classifier_activation.
So, to get rid of the activation altogether, your model can be built like this:
import tensorflow as tf
resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg",
        classifier_activation=None
    )
])
This, however, is not going to work if you want a different function that is not supported by the Keras classifier_activation parameter (e.g. a custom activation function).
To achieve this you can use the workaround solution:
Long way - copy the model's weights
This solution copies the original model's weights onto your custom one. This approach works because, apart from the activation function, you are not changing the model's architecture.
You need to:
1. Download original model.
2. Save its weights.
3. Declare your modified version of the model (in your case, without the activation function).
4. Set the weights of the new model.
The snippet below illustrates this concept in more detail:
import tensorflow as tf
# 1. Download original resnet
resnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    )
])
# 2. Hold weights in memory:
imagenet_weights = resnet.get_weights()
# 3. Declare the model, but without softmax
resnet_no_softmax = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224,224,3)),
    tf.keras.applications.ResNet50(
        include_top=False,
        weights='imagenet',
        input_shape=(224,224,3),
        pooling="avg"
    ),
    tf.keras.layers.Dense(1000, name='fc1000')
])
# 4. Pass the imagenet weights onto the second resnet
resnet_no_softmax.set_weights(imagenet_weights)
Hope this helps!
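The weight-copying idea can be checked on a much smaller model without downloading anything. The sketch below is only an illustration under assumed sizes (5 inputs, 3 classes, random weights): a single Dense layer stands in for the pretrained network, and the check is that applying softmax to the copy's raw outputs reproduces the original model's outputs.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the pretrained network: one Dense layer ending in softmax.
with_softmax = tf.keras.Sequential([
    tf.keras.Input(shape=(5,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Same architecture, but the last layer has no activation.
logits_only = tf.keras.Sequential([
    tf.keras.Input(shape=(5,)),
    tf.keras.layers.Dense(3),
])

# An activation has no weights, so the weight lists line up one to one.
logits_only.set_weights(with_softmax.get_weights())

x = np.random.rand(2, 5).astype("float32")
probs = with_softmax(x).numpy()
logits = logits_only(x).numpy()
recovered = tf.nn.softmax(logits).numpy()
print(np.allclose(probs, recovered, atol=1e-6))
```

If the two models differed in anything other than the activation, set_weights would either raise on a shape mismatch or this comparison would fail.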

How to expand output of embedding layer in keras

I have the following network:
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
model.add(Conv2D(256,(2,2),activation='relu'))
The output from the embedding layer has shape (batchSize, 12, 100). The Conv2D layer requires an input of shape (batchSize, filter, 12, 100), and I get the following error:
Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=3
So, how can I expand the output from the embedding layer to make it proper for the Conv2D layer?
I'm using Keras with Tensorflow as the back end.
Adding a Reshape layer should be the way to go: https://keras.io/layers/core/#reshape
Depending on the concrete situation, Conv1D could also work.
I managed to add another dimension with the following piece of code:
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
model.add(Lambda(lambda x: expand_dims(x, 3)))
model.add(Conv2D(256,(2,2),activation='relu'))
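For comparison, here is a sketch of the Reshape-layer variant, assuming TensorFlow 2.x with tf.keras. The vocabulary size of 1000 and the randomly initialised embeddings are stand-ins chosen to keep the example small; they replace the question's 400000-word vocabulary and pretrained emb matrix.

```python
import tensorflow as tf

# Assumed sizes for illustration: 1000-word vocabulary, random embeddings.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(12,), dtype="int32"),
    tf.keras.layers.Embedding(1000, 100),    # (batch, 12, 100)
    tf.keras.layers.Reshape((12, 100, 1)),   # add a trailing channel axis
    tf.keras.layers.Conv2D(256, (2, 2), activation="relu"),
])
print(model.output_shape)
```

The Reshape target excludes the batch axis, and the added axis of size 1 plays the role of the channel dimension in the default channels-last layout.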

Converting Theano-based Keras model definition to TensorFlow

When converting Theano-based Keras model definition to TensorFlow, is it enough to change the order of input_shape on the input layer?
For example, the following layer
Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height))
will be replaced by
Convolution2D(32, 3, 3, input_shape=(img_width, img_height, 3))
Note: I don't want to use dim_ordering='th'.
Answer from Francois Chollet:
I think the question means "what input_shape should I pass to my
first layer given that I'm using TensorFlow and that my default
setting for dim_ordering is "tf"". The answer is yep, that's how you
do it, (img_width, img_height, 3).
Important to note that if you want to load saved models that were
trained with Theano with dim_ordering="th", into a model definition
for TF with dim_ordering="tf", you will need to convert the convolution
kernels. Keras has utils for that.
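The reordering of input_shape can be written down as a tiny helper. This is only a sketch in plain Python; th_to_tf_shape is a hypothetical name, not a Keras utility, and it covers the shape tuple only, not the kernel conversion mentioned above.

```python
def th_to_tf_shape(shape):
    """Move the leading channel axis of a Theano-style shape to the end."""
    channels, *spatial = shape
    return (*spatial, channels)

img_width, img_height = 150, 200  # example values, not from the question
print(th_to_tf_shape((3, img_width, img_height)))
```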

Tensorflow: Convert constant tensor from pre-trained Vgg model to variable

How can I convert a constant tensor loaded from a pre-trained Vgg16 model into a tf.Variable? The motivation is that I need to compute the gradient of a specific loss with respect to the conv4_3 layer's kernel. However, the kernel seems to be a constant tensor, which is not accepted by the tf.Optimizer.compute_gradients method.
F = vgg.graph.get_tensor_by_name('pretrained_vgg16/conv4_3/filter:0')
G = optimizer.compute_gradients(losses, var_list=[F])
# TypeError: Argument is not a tf.Variable: Tensor("pretrained_vgg16/conv4_3/filter:0", shape=(3, 3, 512, 512), dtype=float32)
What I have tried is to use tf.assign method to update the kernel to a variable type tensor with initial value set to be the original kernel, but it gives a TypeError: Input 'ref' of 'Assign' Op requires l-value input
F = tf.assign(F, tf.Variable(F, trainable=False))
So, how can I achieve that? Many thanks in advance!
Update: I downloaded the pretrained model according to Pretrained Vgg16 Tensorflow model and then loaded the model with:
with open('vgg16.tfmodel', mode='rb') as f:
    fileContent = f.read()
graph_def = tf.GraphDef()
graph_def.ParseFromString(fileContent)
# Map input tensor
inputs = tf.placeholder("float", [1, 224, 224, 3], name='inputs')
tf.import_graph_def(graph_def, input_map={ "images": inputs }, name='pretrained_vgg16')
graph = tf.get_default_graph()
All the code above is defined in a class named vgg.
The reason you did not get variables from the pre-trained model is explained in this answer. Briefly, tf.import_graph_def only restores the structure of the graph, without the variables.
One solution is to build the model yourself, using the same variable names as the pre-trained model, then load the pre-trained model and assign each variable its corresponding parameter values.
I recommend this vgg model.