I have a simple regression neural network like this:
import tensorflow as tf
from scipy.spatial.transform import Rotation as R

def nn_model():
    inp = tf.keras.layers.Input(shape=[80, 80, 3])
    x = tf.keras.layers.Dense(64, activation='relu')(inp)
    x = tf.keras.layers.Dense(64, activation='relu')(x)
    out = tf.keras.layers.Dense(1, activation="linear")(x)
    ####### Perform math operation here
    r = R.from_euler('z', out.numpy(), degrees=True)
    rMat = r.as_matrix()
    #######
    return tf.keras.Model(inputs=inp, outputs=rMat)
I want to perform a mathematical operation on the output regression layer 'out' inside the network. Is it possible to access its value from inside the NN? Running the code above gives this error:
AttributeError: 'KerasTensor' object has no attribute 'numpy'
Keras layers do not output tensor values; they instead output a tensor specification (KerasTensor) used to get the shape, dtype, and other attributes of the previous layer.
So no, it's not possible to access the value of a layer, as it has no value.
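A quick check (in TF 2.4+, where symbolic Keras tensors are KerasTensor objects) makes this concrete; the tensor carries only metadata:
import tensorflow as tf

inp = tf.keras.layers.Input(shape=(80, 80, 3))
print(type(inp).__name__)  # KerasTensor
print(inp.shape)           # (None, 80, 80, 3)
# inp.numpy()  # raises AttributeError: 'KerasTensor' object has no attribute 'numpy'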
What you can do instead is use a Lambda layer, which lets you apply arbitrary Python code to the "real" output of the layer.
r = tf.keras.layers.Lambda(lambda x: R.from_euler('z', x))(out)
Note that I'm not sure this will work, as the function inside the Lambda should preferably use TensorFlow operations and should be differentiable.
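If what you need is specifically a z-axis rotation matrix, one option is to build it from TensorFlow ops inside the Lambda, so the result stays in the graph and remains differentiable. A minimal sketch replacing the Lambda above, assuming out holds the angle in degrees and has shape (batch, 1) (i.e. the Dense layers sit on a flattened input):
import numpy as np
import tensorflow as tf

def z_rotation_matrix(angle_deg):
    # angle_deg: tensor of shape (batch, 1), rotation about z in degrees
    theta = angle_deg[:, 0] * np.pi / 180.0           # (batch,)
    c, s = tf.cos(theta), tf.sin(theta)
    zero, one = tf.zeros_like(c), tf.ones_like(c)
    row0 = tf.stack([c, -s, zero], axis=-1)           # (batch, 3)
    row1 = tf.stack([s,  c, zero], axis=-1)
    row2 = tf.stack([zero, zero, one], axis=-1)
    return tf.stack([row0, row1, row2], axis=-2)      # (batch, 3, 3)

rMat = tf.keras.layers.Lambda(z_rotation_matrix)(out)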
Related
So I'm working on writing a GAN neural network, and I want to set my network's output to 0 if it is less than 0, to 1 if it is greater than 1, and leave it unchanged otherwise. I'm pretty new to tensorflow and I don't know of any tensorflow function or activation that does this without unwanted side effects. So I wrote my loss function so that it calculates the loss as if the output were clamped, with this code:
def discriminator_loss(real_output, fake_output):
    real_output_clipped = min(max(real_output.numpy()[0], 0), 1)
    fake_output_clipped = min(max(fake_output.numpy()[0], 0), 1)

    real_clipped_tensor = tf.Variable([[real_output_clipped]], dtype="float32")
    fake_clipped_tensor = tf.Variable([[fake_output_clipped]], dtype="float32")

    real_loss = cross_entropy(tf.ones_like(real_output), real_clipped_tensor)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_clipped_tensor)

    total_loss = real_loss + fake_loss
    return total_loss
but I get this error:
ValueError: No gradients provided for any variable: ['dense_50/kernel:0', 'dense_50/bias:0', 'dense_51/kernel:0', 'dense_51/bias:0', 'dense_52/kernel:0', 'dense_52/bias:0', 'dense_53/kernel:0', 'dense_53/bias:0'].
Does anyone know a better way to do this, or a way to fix this error?
Thanks!
You can apply a ReLU layer from Keras as your final layer and set max_value=1.0. For example:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32, input_shape=(16,)))
model.add(tf.keras.layers.Dense(32))
model.add(tf.keras.layers.ReLU(max_value=1.0))
You can read more about it here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/ReLU
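A quick eager-mode check of the clamping behavior (values below 0 go to 0, values above 1 go to 1, everything in between passes through unchanged):
import numpy as np
import tensorflow as tf

relu = tf.keras.layers.ReLU(max_value=1.0)
print(relu(np.array([[-0.5, 0.3, 1.7]])).numpy())  # [[0.  0.3 1. ]]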
TF probably does not know how to update your network weights based on this loss. The inputs to the cross entropy are tensors (variables) that are directly assigned from numpy arrays and are not connected to your actual network outputs.
If you want to perform operations on tensors that will remain within the graph and (hopefully) be differentiable, use the available TF operations. There's a "clip_by_value" operation described here: https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/clip_by_value.
For example:
real_output_clipped = tf.clip_by_value(real_output, clip_value_min=0, clip_value_max=1)
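Putting this together, here is a minimal sketch of the loss above with the clipping kept inside the graph (it assumes cross_entropy is tf.keras.losses.BinaryCrossentropy, as in the standard GAN tutorial; that name is an assumption):
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy()

def discriminator_loss(real_output, fake_output):
    # tf.clip_by_value is differentiable (gradient 1 inside the range, 0 outside),
    # so the loss stays connected to the network outputs.
    real_clipped = tf.clip_by_value(real_output, 0.0, 1.0)
    fake_clipped = tf.clip_by_value(fake_output, 0.0, 1.0)
    real_loss = cross_entropy(tf.ones_like(real_output), real_clipped)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_clipped)
    return real_loss + fake_loss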
I have been going through the neural network implementation in OpenAI's code for the Vanilla Policy Gradient (as a matter of fact, this part is used nearly everywhere). The code looks something like this:
def mlp_categorical_policy(x, a, hidden_sizes, activation, output_activation, action_space):
    act_dim = action_space.n
    logits = mlp(x, list(hidden_sizes) + [act_dim], activation, None)
    logp_all = tf.nn.log_softmax(logits)
    pi = tf.squeeze(tf.random.categorical(logits, 1), axis=1)
    logp = tf.reduce_sum(tf.one_hot(a, depth=act_dim) * logp_all, axis=1)
    logp_pi = tf.reduce_sum(tf.one_hot(pi, depth=act_dim) * logp_all, axis=1)
    return pi, logp, logp_pi
and this multi-layer perceptron network is defined as follows:
def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(inputs=x, units=h, activation=activation)
    return tf.layers.dense(inputs=x, units=hidden_sizes[-1], activation=output_activation)
My question is: what is the return from this mlp function? I mean the structure or shape. Is it an N-dimensional tensor? If so, how is it given as an input to tf.random.categorical? If not, and it just has the shape [hidden_layer2, output], then what happened to the other layers? As per the documentation, tf.random.categorical only takes a 2-D input. The complete code of OpenAI's VPG algorithm can be found here. The mlp is implemented here. I would be highly grateful if someone would just tell me what this mlp_categorical_policy() is doing.
Note: The hidden size is [64, 64], the action dimension is 3
Thanks and cheers
Note that this is a discrete action space - there are action_space.n different possible actions at every step, and the agent chooses one.
To do this, the MLP is returning the logits (which are a function of the probabilities) of the different actions. This is specified in the code by + [act_dim], which appends the action count of the action_space as the final MLP layer size. Note that the last layer of an MLP is the output layer. The input layer is not specified in tensorflow; it is inferred from the inputs.
tf.random.categorical takes the logits and samples a policy action pi from them, which is returned as a number.
mlp_categorical_policy also returns logp, the log probability of the action a (used to assign credit), and logp_pi, the log probability of the policy action pi.
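Here is a minimal TF2 sketch (not the OpenAI code itself) of the sampling step described above, with act_dim = 3 and a batch of one:
import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0]])                   # batch of 1, act_dim = 3
logp_all = tf.nn.log_softmax(logits)                       # log-probabilities of all actions
pi = tf.squeeze(tf.random.categorical(logits, 1), axis=1)  # sample one action per batch row
logp_pi = tf.reduce_sum(tf.one_hot(pi, depth=3) * logp_all, axis=1)
print(pi.numpy(), logp_pi.numpy())                         # e.g. [0] [-0.24]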
It seems your question is more about the return from the mlp.
The mlp creates a series of fully connected layers in a loop. In each iteration of the loop, the mlp creates a new layer using the previous layer x as an input and assigns its output to overwrite x, with this line: x = tf.layers.dense(inputs=x, units=h, activation=activation).
So the output is not the same as the input; on each iteration, x is overwritten with the value of the new layer. This is the same kind of coding trick as x = x + 1, which increments x by 1. This effectively chains the layers together.
The output of tf.layers.dense is a tensor of size [:, h] where : is the batch dimension (and can usually be ignored). The creation of the last layer happens outside the loop; it can be seen that the number of nodes in this layer is act_dim (so its shape is [:, 3]). You can check the shape by doing this:
import tensorflow.compat.v1 as tf
import numpy as np

def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(x, units=h, activation=activation)
    return tf.layers.dense(x, units=hidden_sizes[-1], activation=output_activation)

obs = np.array([[1.0, 2.0]])
logits = mlp(obs, [64, 64, 3], tf.nn.relu, None)
print(logits.shape)
result: TensorShape([1, 3])
Note that the observation in this case is [1., 2.]; it is nested inside a batch of size 1.
I want a Keras model which always outputs a constant value of a desired output shape.
def build_model(input_shape, output_shape):
    input = tf.keras.layers.Input(shape=(512,512,3))
    x = tf.keras.backend.constant(1, shape=output_shape)
    output = tf.keras.layers.Lambda(lambda x: x)(x)
    model = Model(inputs=input, outputs=output)
    return model
model = build_model((512,512,3), (512,512,32))
I get the following error:
Output tensors to a Model must be the output of a TensorFlow Layer (thus holding past layer metadata). Found: Tensor("Const_3:0", shape=(512, 512, 32), dtype=float32)
How can I fix it?
Update
Input and output are indeed not connected. I want to test the performance of my processing pipeline with the lowest GPU load possible. I think that always outputting the same value without doing any computations won't use the GPU much, but I still want to make sure that my data is properly loaded (input layer).
The issue was indeed that output and input need to be connected. I couldn't use an activation layer because the output has a different shape than the input. Thus I ended up concatenating the input 11 times along the channel axis (11 × 3 = 33 channels) and slicing it back down to 32 channels, which gives an output of the correct shape with 0 trainable parameters.
The final model building function looks like this:
def build_model(input_shape=(512,512,3)):
    inp = tf.keras.layers.Input(shape=input_shape)
    lamb = tf.keras.layers.Lambda(lambda x: tf.slice(tf.concat([x]*11, axis=3), begin=(0,0,0,0), size=(-1,512,512,32)))
    output = lamb(inp)
    model = tf.keras.Model(inputs=inp, outputs=output)
    return model
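A quick sanity check of the resulting model (the output here is just the input channels tiled and sliced, so zeros in give zeros out):
import numpy as np

model = build_model()
y = model.predict(np.zeros((1, 512, 512, 3), dtype="float32"))
print(y.shape)               # (1, 512, 512, 32)
print(model.count_params())  # 0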
I want to plot the weights of tf.layers.dense in a TensorBoard histogram, but they don't show up in the parameters. How can I do that?
The weights are added as a variable named kernel, so you could use
x = tf.layers.dense(...)
weights = tf.get_default_graph().get_tensor_by_name(
    os.path.split(x.name)[0] + '/kernel:0')
You can obviously replace tf.get_default_graph() by any other graph you are working in.
I came across this problem and just solved it. The name of a tf.layers.dense layer does not necessarily share a prefix with its kernel's name. My tensor was "dense_2/xxx" but its kernel was "dense_1/kernel:0". To make sure tf.get_variable works, set name=xxx explicitly in the tf.layers.dense call so the two names share the same prefix. It works as in the demo below:
l = tf.layers.dense(input_tf_xxx, 300, name='ip1')
with tf.variable_scope('ip1', reuse=True):
    w = tf.get_variable('kernel')
By the way, my tf version is 1.3.
The latest tensorflow layers api creates all the variables using the tf.get_variable call. This ensures that if you wish to use the variable again, you can just use the tf.get_variable function and provide the name of the variable that you wish to obtain.
In the case of a tf.layers.dense, the variable is created as: layer_name/kernel. So, you can obtain the variable by saying:
with tf.variable_scope("layer_name", reuse=True):
    weights = tf.get_variable("kernel")  # do not specify the shape here,
                                         # or it will confuse tensorflow into creating a new one
[Edit]: Newer versions of Tensorflow have both functional and object-oriented interfaces to the layers api. If you need the layers only for computational purposes, the functional api is a good choice; its function names start with lowercase letters, e.g. tf.layers.dense(...). Layer objects are created with capitalized names, e.g. tf.layers.Dense(...). Once you have a handle to a layer object, you can use all of its functionality. For obtaining the weights, just use obj.trainable_weights; this returns a list of all the trainable variables found in that layer's scope.
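For example, a minimal sketch of the object-oriented route (TF1-style graph mode; the shapes here are made up for illustration):
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, shape=(None, 16))
layer = tf.layers.Dense(300, name='ip1')  # object-oriented interface
y = layer(x)                              # variables are created on the first call
print(layer.trainable_weights)            # [<kernel (16, 300)>, <bias (300,)>]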
I am going crazy with tensorflow.
I run this:
sess.run(x.kernel)
after training, and I get the weights.
Comes from the properties described here.
I am saying that I am going crazy because it seems that there are a million slightly different ways to do something in tf, and that fragments the tutorials.
Is there anything wrong with
model.get_weights()
After I create a model, compile it, and run fit, this function returns a list of numpy arrays with the weights for me.
In TF2, if you're inside a @tf.function (graph mode):
weights = optimizer.weights
If you're in eager mode (the default in TF2, except inside @tf.function-decorated functions):
weights = optimizer.get_weights()
In TF2, weights returns a list of length 2:
weights[0] = kernel weight
weights[1] = bias weight
For example, the weights of the second layer (layers[0] is the input layer, which has no weights) in a model with layer size 50 and input size 784:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(50, activation="relu", name="dense_1")(inputs)
x = layers.Dense(50, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(...)
model.fit(...)
kernel_weight = model.layers[1].weights[0]
bias_weight = model.layers[1].weights[1]
all_weight = model.layers[1].weights
print(len(all_weight)) # 2
print(kernel_weight.shape) # (784,50)
print(bias_weight.shape) # (50,)
To get the weights of each layer in your sequential network, first print the layer names, which you can get from:
model.summary()
Then you can get the weights of each layer by running this code:
for layer in model.layers:
    print(layer.name)
    print(layer.get_weights())
I recently started using Keras to build neural networks. I built a simple CNN to classify the MNIST dataset. Before training the model I used K.set_image_dim_ordering('th') in order to plot the convolutional layer weights. Right now I am trying to visualize the convolutional layer output with the K.function method, but I keep getting an error.
Here is what I want to do for now:
input_image = X_train[2:3,:,:,:]
output_layer = model.layers[1].output
input_layer = model.layers[0].input
output_fn = K.function(input_layer, output_layer)
output_image = output_fn.predict(input_image)
print(output_image.shape)
output_image = np.rollaxis(np.rollaxis(output_image, 3, 1), 3, 1)
print(output_image.shape)
fig = plt.figure()
for i in range(32):
    ax = fig.add_subplot(4, 8, i+1)
    im = ax.imshow(output_image[0,:,:,i], cmap="Greys")
    plt.xticks(np.array([]))
    plt.yticks(np.array([]))
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([1, 0.1, 0.05 ,0.8])
fig.colorbar(im, cax = cbar_ax)
plt.tight_layout()
plt.show()
And this is what I get:
File "/home/kinshiryuu/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1621, in function
return Function(inputs, outputs, updates=updates)
File "/home/kinshiryuu/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1569, in __init__
raise TypeError('`inputs` to a TensorFlow backend function '
TypeError: `inputs` to a TensorFlow backend function should be a list or tuple.
You should do the following changes:
output_fn = K.function([input_layer], [output_layer])
output_image = output_fn([input_image])[0]  # the result is a list; take its first element
K.function takes the input and output tensors as lists so that you can create a function from many inputs to many outputs. In your case it's one input to one output, but you need to pass them as lists nonetheless.
Next, K.function returns a backend function, not a model object, so you can't use predict() on it. The correct way to use it is simply to call it as a function; note that it returns a list of outputs, which is why the first element is indexed above.
I think you can also use K.function to get gradients.
self.action_gradients = K.gradients(Q_values, actions)
self.get_action_gradients = K.function(
    inputs=[*self.model.input, K.learning_phase()],
    outputs=self.action_gradients)
which basically runs the graph to obtain the gradient of the Q-value w.r.t. the action vector in DDPG. Source code here (lines 64 to 70): https://github.com/nyck33/autonomous_quadcopter/blob/master/criticSolution.py#L65
In light of the accepted answer and this usage (originally from project 5, an autonomous quadcopter, in the Udacity Deep Learning nanodegree), one question remains in my mind: can K.function() be used fairly flexibly to run the graph, designating as its outputs, for example, the outputs of a particular layer, gradients, or even the weights themselves?
Lines 64 to 67 here: https://github.com/nyck33/autonomous_quadcopter/blob/master/actorSolution.py
It is being used as a custom training function for the actor network in DDPG:
# caller
self.actor_local.train_fn([states, action_gradients, 1])

# called
self.train_fn = K.function(
    inputs=[self.model.input, action_gradients, K.learning_phase()],
    outputs=[], updates=updates_op)
outputs is given an empty list because we merely want to train the actor network using the action_gradients from the critic network.
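For completeness, here is a hedged sketch of how such a train_fn can be wired up end to end (TF1-style Keras backend; the model, sizes, and optimizer choice are made up for illustration, loosely following the linked DDPG actor):
import tensorflow.compat.v1 as tf
from tensorflow.compat.v1.keras import backend as K, layers, models, optimizers
tf.disable_eager_execution()  # K.function with updates needs graph mode

state_size, action_size = 8, 2
states = layers.Input(shape=(state_size,))
actions = layers.Dense(action_size, activation='tanh')(states)
model = models.Model(inputs=states, outputs=actions)

# Gradient of the critic's Q-value w.r.t. the actions, fed in from outside.
action_gradients = layers.Input(shape=(action_size,))
loss = K.mean(-action_gradients * actions)  # push the actions up the critic's gradient

updates_op = optimizers.Adam(lr=1e-4).get_updates(
    params=model.trainable_weights, loss=loss)
train_fn = K.function(
    inputs=[model.input, action_gradients, K.learning_phase()],
    outputs=[], updates=updates_op)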