Make Keras model output a constant of certain shape - tensorflow

I want a Keras model which always outputs a constant value of a desired output shape.
import tensorflow as tf
from tensorflow.keras import Model
def build_model(input_shape, output_shape):
    input = tf.keras.layers.Input(shape=input_shape)
    x = tf.keras.backend.constant(1, shape=output_shape)
    output = tf.keras.layers.Lambda(lambda x: x)(x)
    model = Model(inputs=input, outputs=output)
    return model
model = build_model((512,512,3), (512,512,32))
I get the following error:
Output tensors to a Model must be the output of a TensorFlow Layer (thus holding past layer metadata). Found: Tensor("Const_3:0", shape=(512, 512, 32), dtype=float32)
How can I fix it?
Update
Input and output are indeed not connected. I want to test the performance of my processing pipeline with the lowest GPU load possible. I think that always outputting the same value without doing any computations won't use the GPU much. But I still make sure that my data is properly loaded (input layer).

The issue was indeed that output and input need to be connected. I couldn't use an activation layer because the output should have a different shape than the input. So I ended up concatenating the input 11 times along the channel axis (3 x 11 = 33 channels) and slicing the result down to 32 channels, which gives an output of the correct shape with 0 trainable parameters.
The final model building function looks like this:
def build_model(input_shape=(512,512,3)):
    input = tf.keras.layers.Input(shape=input_shape)
    # Repeat the 3 input channels 11 times (33 channels), then keep the first 32
    lamb = tf.keras.layers.Lambda(lambda x: tf.slice(tf.concat([x]*11, axis=3), begin=(0,0,0,0), size=(-1,512,512,32)))
    output = lamb(input)
    model = Model(inputs=input, outputs=output)
    return model

Related

How to perform mathematical operation on regression output layer

I have a simple regression neural network like this:
from scipy.spatial.transform import Rotation as R
def nn_model():
    inp = tf.keras.layers.Input(shape=[80, 80, 3])
    x = tf.keras.layers.Dense(64, activation='relu')(inp)
    x = tf.keras.layers.Dense(64, activation='relu')(x)
    out = tf.keras.layers.Dense(1, activation="linear")(x)
    ####### Perform math operation here
    r = R.from_euler('z', out.numpy(), degrees=True)
    rMat = r.as_matrix()
    #######
    return tf.keras.Model(inputs=inp, outputs=rMat)
I want to perform a mathematical operation on the output regression layer 'out' inside the network. Is it possible to access its value from inside the NN? Running the code above gives this error:
AttributeError: 'KerasTensor' object has no attribute 'numpy'
While the model is being built, Keras layers do not output tensor values; they output a symbolic tensor specification (a KerasTensor) that only carries the shape, dtype and other attributes of the layer's output.
So no, it's not possible to access the value of a layer at that point, as it has no value yet.
What you can do instead is use a Lambda layer, which lets you apply your own Python function to the "real" output of the layer when the model runs:
r = tf.keras.layers.Lambda(lambda x: R.from_euler('z', x))(out)
Note that I'm not sure this will work as written, as the function inside the Lambda should preferably use TensorFlow operations and should be differentiable.
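One way to meet that last requirement is to build the rotation matrix from TensorFlow operations rather than calling SciPy. The following is only a sketch under stated assumptions: z_rotation_matrix is a hypothetical helper (not part of TensorFlow or SciPy), the angle tensor is assumed to have shape (batch, 1) and to be in degrees, and the matrix layout follows the usual rotation about the z axis.

```python
import math
import tensorflow as tf

def z_rotation_matrix(angle_deg):
    """Hypothetical helper: angles in degrees, shape (batch, 1) -> matrices, shape (batch, 3, 3)."""
    theta = angle_deg * (math.pi / 180.0)              # degrees to radians
    c, s = tf.cos(theta), tf.sin(theta)
    zero, one = tf.zeros_like(theta), tf.ones_like(theta)
    # Row-major entries of [[c, -s, 0], [s, c, 0], [0, 0, 1]] for each sample
    flat = tf.concat([c, -s, zero, s, c, zero, zero, zero, one], axis=-1)  # (batch, 9)
    return tf.reshape(flat, (-1, 3, 3))

angles = tf.constant([[0.0], [90.0]])                  # made-up example angles
mats = z_rotation_matrix(angles)
```

Because only TensorFlow operations are involved, gradients can flow through the function, and it could in principle be wrapped as tf.keras.layers.Lambda(z_rotation_matrix)(out), provided out really has shape (batch, 1).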

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.
Here is how I create the model:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3,2)),
    tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?
Moreover, if from the original 64, I lower the number (shapes) of X_train and y_train, say, to: (19,3,2) and (19,3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even weirder to me is that if I specify a single unit for the output (last) layer instead of 3, like this:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3,2)),
    tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,)
I am very confused.
Whenever you are unsure how data flows through your model, call model.summary() to see what happens to the shape of the data in each layer.
In this case, each input sample is a 2D array and each label is a 1D array, and you only used Dense layers. A Dense layer does not combine 2D features on its own: it is applied along the last axis only, so the other dimension is carried through to the output. For example, you cannot feed an image directly to a Dense layer and expect one vector per image. Instead, use other layers such as Conv2D, or flatten the input (make it 1D) before feeding it to the Dense layer.
Takeaway: if your input and output dimensions differ, the shape needs to be changed somewhere in your model. The most common ways to do so are a Flatten layer, GlobalAveragePooling, and so on.
When you pass an input like this to a Dense layer, it should be flattened first. There are two ways to deal with this:
Way 1: Add a Flatten layer as the first layer of your model:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
model = Sequential()
model.add(Flatten(input_shape=(3,2)))
model.add(Dense(50, 'relu'))
model.add(Dense(3))
Way 2: Reshape each 2D sample to 1D before passing the inputs to your model, keeping the batch dimension:
X_train = tf.reshape(X_train, shape=(-1, 6))
Then change the input shape of the first layer to:
model.add(Dense(50, 'relu', input_shape=(6,)))
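To see where the [32,3,3] in the error comes from, the shape arithmetic can be reproduced without Keras. This is a rough NumPy sketch, not the Keras implementation: the weights are random stand-ins and biases are omitted. It relies only on the fact that a Dense kernel multiplies the last axis of its input.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 32
x = rng.normal(size=(batch, 3, 2))       # features shaped like X_train batches
y = rng.normal(size=(batch, 3))          # labels shaped like y_train batches

w1 = rng.normal(size=(2, 50))            # stand-in kernel of Dense(50): acts on the last axis
w2 = rng.normal(size=(50, 3))            # stand-in kernel of Dense(3)

hidden = np.maximum(x @ w1, 0.0)         # ReLU, shape (32, 3, 50)
pred = hidden @ w2                       # shape (32, 3, 3), which does not match y

# Flattening each sample first yields one prediction vector per sample
w1_flat = rng.normal(size=(6, 50))       # stand-in kernel for the flattened input
pred_flat = np.maximum(x.reshape(batch, -1) @ w1_flat, 0.0) @ w2   # shape (32, 3)

print(pred.shape, pred_flat.shape, y.shape)
```

The unflattened prediction keeps the extra axis of size 3, so the loss is asked to compare a (32, 3, 3) tensor with (32, 3) labels; after flattening, both shapes agree.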

Issue with feeding value into placeholder tensor for sess.run()

I want to get the value of an intermediate tensor in a convolutional neural network for a specific input. I know how to do this in keras and even though I have trained a model using keras, I'm going to move towards constructing and training the model using only tensorflow. Therefore, I want to move away from something like K.function(input_layer, output_layer) which is fairly simple, and instead use tensorflow. I believe I should use placeholder values, like the following approach:
with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    loaded_model = tf.keras.models.load_model(filepath)
    graph = tf.compat.v1.get_default_graph()
    images = tf.compat.v1.placeholder(tf.float32, shape=(None, 28, 28, 1)) # To specify input at MNIST images
    output_tensor = graph.get_tensor_by_name(tensor_name) # tensor_name is 'dense_1/MatMul:0'
    output = sess.run([output_tensor], feed_dict={images: x_test[0:1]}) # x_test[0:1] is of shape (1, 28, 28, 1)
    print(output)
However, I get the following error message for the sess.run() line: Invalid argument: You must feed a value for placeholder tensor 'conv2d_2_input' with dtype float and shape [?,28,28,1]. I am unsure why I get this message, because the image used for feed_dict is of type float and has what I believe to be the correct shape. Any help would be appreciated.
You must use the input tensor from the Keras model, not make your own new placeholder, which would be disconnected from the rest of the model:
with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
    # Load model
    loaded_model = tf.keras.models.load_model(filepath)
    # Take model input tensor
    images = loaded_model.input
    # Take output of the second layer (index 1)
    output_tensor = loaded_model.layers[1].output
    # Evaluate
    output = sess.run(output_tensor, feed_dict={images: x_test[0:1]})
    print(output)

What would be the output from tensorflow dense layer if we assign itself as input and output while making a neural network?

I have been going through the implementation of neural network in openAI code for any Vanilla Policy Gradient (As a matter of fact, this part is used nearly everywhere). The code looks something like this :
def mlp_categorical_policy(x, a, hidden_sizes, activation, output_activation, action_space):
    act_dim = action_space.n
    logits = mlp(x, list(hidden_sizes) + [act_dim], activation, None)
    logp_all = tf.nn.log_softmax(logits)
    pi = tf.squeeze(tf.random.categorical(logits, 1), axis=1)
    logp = tf.reduce_sum(tf.one_hot(a, depth=act_dim) * logp_all, axis=1)
    logp_pi = tf.reduce_sum(tf.one_hot(pi, depth=act_dim) * logp_all, axis=1)
    return pi, logp, logp_pi
and this multi-layered perceptron network is defined as follows :
def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(inputs=x, units=h, activation=activation)
    return tf.layers.dense(inputs=x, units=hidden_sizes[-1], activation=output_activation)
My question is: what does this mlp function return? I mean its structure or shape. Is it an N-dimensional tensor? If so, how can it be given as an input to tf.random.categorical? If not, and it just has the shape [hidden_layer2, output], then what happened to the other layers? As per the website description, tf.random.categorical only takes a 2-D input. The complete code of OpenAI's VPG algorithm can be found here. The mlp is implemented here. I would be highly grateful if someone could tell me what this mlp_categorical_policy() is doing.
Note: The hidden size is [64, 64], the action dimension is 3
Thanks and cheers
Note that this is a discrete action space - there are action_space.n different possible actions at every step, and the agent chooses one.
To do this, the MLP returns the logits (unnormalized log-probabilities) of the different actions. This is specified in the code by + [act_dim], which appends the size of the action space as the final MLP layer. Note that the last layer of an MLP is the output layer. The input layer is not specified in TensorFlow; it is inferred from the inputs.
tf.random.categorical takes the logits and samples a policy action pi from them, which is returned as a number.
mlp_categorical_policy also returns logp, the log probability of the action a (used to assign credit), and logp_pi, the log probability of the policy action pi.
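The one-hot trick used for logp and logp_pi can be illustrated in NumPy. This is a small sketch with made-up logits and actions, not the OpenAI code: log_softmax here is a hand-written stand-in for tf.nn.log_softmax, and np.eye(...)[a] plays the role of tf.one_hot.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

act_dim = 3
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])      # made-up values: batch of 2, act_dim = 3
a = np.array([0, 2])                      # made-up actions, one per batch row

logp_all = log_softmax(logits)            # shape (2, 3): log-probability of every action
one_hot = np.eye(act_dim)[a]              # shape (2, 3): 1 at the chosen action, 0 elsewhere
logp = (one_hot * logp_all).sum(axis=1)   # shape (2,): log-probability of the chosen action
```

Multiplying by the one-hot mask zeroes out every entry except the chosen action, so the sum over axis 1 simply picks out logp_all[i, a[i]] for each row i.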
It seems your question is more about the return from the mlp.
The mlp creates a series of fully connected layers in a loop. In each iteration of the loop, the mlp creates a new layer using the previous layer x as an input and assigns its output to overwrite x, with this line: x = tf.layers.dense(inputs=x, units=h, activation=activation).
So the output is not the same as the input; on each iteration x is overwritten with the value of the new layer. This is the same kind of coding trick as x = x + 1, which increments x by 1. This effectively chains the layers together.
The output of tf.layers.dense is a tensor of size [:,h], where : is the batch dimension (and can usually be ignored). The last layer is created outside the loop, and its number of nodes is act_dim (so its shape is [:,3]). You can check the shape by doing this:
import tensorflow.compat.v1 as tf
import numpy as np
def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(x, units=h, activation=activation)
    return tf.layers.dense(x, units=hidden_sizes[-1], activation=output_activation)
obs = np.array([[1.0,2.0]])
logits = mlp(obs, [64, 64, 3], tf.nn.relu, None)
print(logits.shape)
result: TensorShape([1, 3])
Note that the observation in this case is [1.,2.], it is nested inside a batch of size 1.
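The chaining can also be traced layer by layer without TensorFlow. The sketch below is an illustrative NumPy stand-in for mlp(), not the original implementation: mlp_numpy is a hypothetical name, the weights are random, and biases are left out. It records the shape of x each time it is overwritten.

```python
import numpy as np

def mlp_numpy(x, hidden_sizes, rng):
    """Hypothetical NumPy stand-in for mlp(): random weights, no biases, tanh on hidden layers."""
    shapes = []
    last = len(hidden_sizes) - 1
    for i, units in enumerate(hidden_sizes):
        w = rng.normal(size=(x.shape[-1], units))   # weight shape is inferred from the current x
        x = x @ w                                   # x is overwritten by the new layer's output
        if i < last:
            x = np.tanh(x)                          # the final layer stays linear (logits)
        shapes.append(x.shape)
    return x, shapes

rng = np.random.default_rng(0)
obs = np.array([[1.0, 2.0]])                        # batch of 1, observation of size 2
logits, shapes = mlp_numpy(obs, [64, 64, 3], rng)
print(shapes)
```

The recorded shapes are (1, 64), (1, 64) and (1, 3): every layer keeps the batch dimension and replaces only the last one, so the value returned is always a 2-D tensor of shape [batch, act_dim].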

How can I get a tensor output by a tensorflow.layer

I created a CNN model using higher level tensorflow layers, like
conv1 = tf.layers.conv2d(...)
maxpooling1 = tf.layers.max_pooling2d(...)
conv2 = tf.layers.conv2d(...)
maxpooling2 = tf.layers.max_pooling2d(...)
flatten = tf.layers.flatten(...)
logits = tf.layers.dense(...)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(...))
optimizer = tf.train.AdadeltaOptimizer(init_lr).minimize(loss)
acc = tf.reduce_mean(...)
The model trains well and is saved; everything is good so far. Next, I want to load the saved model, change the learning rate, and continue training. (I know TensorFlow provides the exponential_decay() function for a decaying learning rate, but here I just want to be in full control of the learning rate and change it manually.) My idea is something like this:
saver = tf.train.import_meta_graph(...)
saver.restore(sess, tf.train.latest_checkpoint(...))
graph = tf.get_default_graph()
inputImg_ = graph.get_tensor_by_name(...) # this is place_holder in model
labels_ = graph.get_tensor_by_name(...) # place_holder in model
logits = graph.get_tensor_by_name(...) # output of dense layer
loss = graph.get_tensor_by_name(...) # loss
optimizer = tf.train.AdadeltaOptimizer(new_lr).minimize(loss) # I give it a new learning rate
acc = tf.reduce_mean(...)
Now I have a problem. The code above can successfully obtain inputImg_ and labels_, because I named them when I defined them. But I cannot obtain logits: with logits = tf.layers.dense(name='logits'), the name is given to the dense layer rather than to its output tensor logits. That means I cannot obtain the tensors conv1 and conv2 either. It seems TensorFlow cannot name a tensor output by a layer. In this case, is there a way to obtain these tensors, like logits, conv1 and maxpooling1? I've searched for an answer for a while without success.
I had the same problem and solved it using tf.identity.
Since the dense layer has bias and weight parameters, when you name it you are naming the layer, not the output tensor.
tf.identity returns a tensor with the same shape and contents as its input.
So just leave the dense layer unnamed and use its output as the input to tf.identity:
self.output = tf.layers.dense(hidden_layer3, 2)
self.output = tf.identity(self.output, name='output')
Now you can load the output
output = graph.get_tensor_by_name('output:0')
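For a self-contained check of the idea, here is a minimal graph-mode sketch. It is not the code from the question: a plain tf.matmul with a made-up constant weight matrix stands in for the dense layer, and the names "x" and "output" are chosen only for illustration.

```python
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=(None, 4), name="x")
    # Made-up constant weights standing in for a layer's parameters
    w = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
    hidden = tf.matmul(x, w)                  # unnamed: TensorFlow auto-generates its name
    tf.identity(hidden, name="output")        # named pass-through of the same values

# Later (for example after restoring a graph), look the tensor up by its name
fetched = graph.get_tensor_by_name("output:0")
with tf.compat.v1.Session(graph=graph) as sess:
    result = sess.run(fetched, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]})
print(result)
```

The ":0" suffix refers to the first output of the operation named "output", which is why the lookup string is 'output:0' rather than just 'output'.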