I currently have multiple RNN layers set up like this:
stack = tf.nn.rnn_cell.MultiRNNCell([
    tf.nn.rnn_cell.GRUCell(num_hidden, activation=clipped_relu)
    for _ in range(num_rnn_layers)
])
However, I am trying to add layer normalization to the RNN layers using https://www.tensorflow.org/api_docs/python/tf/contrib/layers/layer_norm. I've tried a number of different setups but can't get the model to compile.
Has anyone done this yet? And if so, how did you implement it?
I think you need to define your own layer class that normalizes inside the call function. Did you try that?
There is a layer normalization implementation here:
tf.contrib.rnn.LayerNormBasicLSTMCell
which can be used in the MultiRNNCell function.
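For anyone wondering what the normalization itself does, here is a minimal pure-Python sketch. It is not TensorFlow code: layer_norm and LayerNormCell are hypothetical names, and the wrapper normalizes the cell output for simplicity, whereas a cell such as LayerNormBasicLSTMCell applies the normalization inside the cell.

```python
import math


def layer_norm(values, gain=1.0, shift=0.0, eps=1e-5):
    """Normalize one example's activations to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gain * (v - mean) / math.sqrt(var + eps) + shift for v in values]


class LayerNormCell:
    """Hypothetical wrapper: run an inner cell, then normalize its output."""

    def __init__(self, inner_cell):
        # inner_cell is assumed to be a callable (inputs, state) -> (output, state)
        self.inner_cell = inner_cell

    def __call__(self, inputs, state):
        output, new_state = self.inner_cell(inputs, state)
        return layer_norm(output), new_state


# Toy inner cell (an assumption for illustration): adds the state to the inputs
toy_cell = LayerNormCell(lambda x, s: ([a + b for a, b in zip(x, s)], s))
normalized, state = toy_cell([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The statistics are computed over the features of a single example, so the result does not depend on the batch size.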
VBN is discussed in this paper and implemented here, here and here. I do not want to go through the full code. I just want to know how to use VBN as a Keras layer, as I am not an expert TensorFlow/Keras coder. I generally use simple batch normalization (BN) as follows:
model.add(BatchNormalization(momentum=0.8))
Similarly, how can I use VBN instead of BN in the following Keras code?
model.add(Dense(256, input_dim=self.input_dim))
model.add(LeakyReLU(alpha=.2))
model.add(BatchNormalization(momentum=0.8))  # I want to replace this with VBN
model.add(Dense(512))
......
.......
In the first link they say:
"The __init__ API is intended to mimic tf.compat.v1.layers.batch_normalization as closely as possible."
If you take a look at https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization, it says you use this function as follows:
x_norm = tf.layers.batch_normalization(x, training=training)
So, if I understand correctly, using the functional API (https://keras.io/getting-started/functional-api-guide/), you should probably do something like:
layer_n = VBN(**kwargs)(layer_n_minus_1)
I hope this helps.
I have a question. If I want to use maxout as an activation function, how should I write the code in TensorFlow? The slim.maxout() function requires an extra input parameter, so it cannot be used directly as in slim.arg_scope([slim.conv], activation_fn=slim.maxout). What should I do?
You may have to define maxout in a separate function. For example:
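def maxout(inputs):
    # activation_fn is called with the tensor only, so num_inputs must already be defined in the enclosing scope
    return slim.maxout(inputs, num_inputs)
slim.arg_scope([slim.conv], activation_fn=maxout)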
(I may have the arguments defined incorrectly in the maxout function.)
In any case, I recommend that you switch to tf.layers (the TensorFlow core API), because tf.slim seems to be in the process of being phased out.
https://github.com/tensorflow/tensorflow/issues/16182#issuecomment-372397483
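The same wrapping idea can be expressed with functools.partial, which fixes the extra argument so that the result is a one-argument callable. The sketch below is plain Python on lists rather than slim code: maxout here is a hypothetical stand-in that keeps the maximum of each contiguous group, and apply_layer is a hypothetical stand-in for a layer that calls activation_fn with a single argument.

```python
from functools import partial


def maxout(inputs, num_units):
    """Split inputs into num_units equal groups and keep each group's maximum."""
    if len(inputs) % num_units != 0:
        raise ValueError("len(inputs) must be a multiple of num_units")
    group = len(inputs) // num_units
    return [max(inputs[i * group:(i + 1) * group]) for i in range(num_units)]


def apply_layer(inputs, activation_fn):
    """Stand-in for a layer that calls activation_fn with one argument only."""
    return activation_fn(inputs)


# Bind num_units up front so the activation takes just the inputs
maxout_2 = partial(maxout, num_units=2)
result = apply_layer([1.0, 5.0, 3.0, 2.0], activation_fn=maxout_2)
```

With the real library the same pattern would presumably be partial(slim.maxout, num_units=...), but I have not tested that.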
I have looked around everywhere but could not find the way to do this.
Basically, I want to feed input to an intermediate layer in a Keras model and run backpropagation through the full graph (i.e. including the layers before the intermediate layer). To illustrate, I refer you to the figure in the paper "Multi-view Convolutional Neural Networks for 3D Shape Recognition".
From the figure you can see that the features are max-pooled in the view pooling layer and the resulting vector is then passed to the rest of the network.
According to the paper, they then did the backpropagation using the view-pooled features.
To achieve this I am trying a simple approach. There will not be any view pooling layer in my model. I will do this pooling offline by taking the features for multiple views and then taking their max. Finally, the aggregated feature will be passed to the rest of the network. However, I am not able to figure out how to backpropagate through the full network when passing input directly to an intermediate layer.
Any help would be appreciated. Thanks
If you have the code of the tensorflow model, then this will be quite simple. The model would probably look like
def model(cnns):
    viewpool_output = f(cnns)
    cnn2_output = cnn2(viewpool_output)
    ...
You would just need to change the model to
def model(viewpool_output):
    cnn2_output = cnn2(viewpool_output)
    ...
and instead of passing a "real" view pool output, you just pass whatever image you want. But you haven't given any code, so we can only guess at what it looks like.
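One caveat, as far as I understand it: if the pooling is done offline, the layers before it are no longer part of the graph, so gradients stop at the fed-in vector. To reach the earlier layers the max has to be computed inside the model. Below is a rough pure-Python sketch of how an element-wise max over views passes gradients back. The names view_pool and view_pool_backward are hypothetical and the lists stand in for tensors.

```python
def view_pool(view_features):
    """Element-wise max over views; also record which view won each element."""
    num_features = len(view_features[0])
    pooled, winners = [], []
    for j in range(num_features):
        column = [view[j] for view in view_features]
        best = max(range(len(column)), key=column.__getitem__)
        pooled.append(column[best])
        winners.append(best)
    return pooled, winners


def view_pool_backward(grad_pooled, winners, num_views):
    """Route each upstream gradient to the view that supplied the maximum."""
    grads = [[0.0] * len(grad_pooled) for _ in range(num_views)]
    for j, (g, w) in enumerate(zip(grad_pooled, winners)):
        grads[w][j] = g
    return grads


# Two views with three features each (made-up numbers)
views = [[1.0, 4.0, 2.0], [3.0, 0.0, 5.0]]
pooled, winners = view_pool(views)
view_grads = view_pool_backward([0.1, 0.2, 0.3], winners, num_views=len(views))
```

Each view only receives a gradient for the elements where it supplied the maximum. The other entries get zero.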
I need to write a simple initializer for my convolutional layer biases. I am using tf.slim so I can specify the initializer when calling the convolutional layer, like so.
I want to replace the biases_initializer=init_ops.zeros_initializer() with my own custom function that just initializes the bias to a given constant, for example:
biases_initializer=custom_initializer(value)
where I can specify the value, for example value = -5.
Can anyone show me how this is done? I've spent about an hour reading through the existing initializers, but still don't know how to implement this simple function.
I finally found that it is not necessary to define that function since there already is a tf.constant_initializer. The above would just be achieved with:
biases_initializer = tf.constant_initializer(value)
I am looking for a proper or best way to get variable importance in a Neural Network created with Keras. The way I currently do it is I just take the weights (not the biases) of the variables in the first layer with the assumption that more important variables will have higher weights in the first layer. Is there another/better way of doing it?
Since everything will be mixed up along the network, the first layer alone can't tell you about the importance of each variable. The following layers can also increase or decrease their importance, and even make one variable affect the importance of another variable. Every single neuron in the first layer itself will give each variable a different importance too, so it's not something that straightforward.
I suggest you do model.predict(inputs) using inputs containing arrays of zeros, making only the variable you want to study be 1 in the input.
That way, you see the result for each variable alone. Even so, this will still not help you with cases where one variable increases the importance of another variable.
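A rough sketch of that probing loop, in plain Python. The predict argument is assumed to be any callable that maps one input vector to a single number (for a Keras model you would wrap model.predict). Subtracting the all-zeros baseline is my own addition, and toy_predict is a made-up stand-in model.

```python
def probe_single_inputs(predict, num_features):
    """Predict on inputs that are all zeros except for one feature set to 1."""
    baseline = predict([0.0] * num_features)
    effects = []
    for i in range(num_features):
        x = [0.0] * num_features
        x[i] = 1.0
        # Difference from the all-zeros prediction (an assumption, not required)
        effects.append(predict(x) - baseline)
    return effects


def toy_predict(x):
    """Made-up stand-in for a trained model."""
    return 2.0 * x[0] - 0.5 * x[1] + 0.0 * x[2] + 1.0


effects = probe_single_inputs(toy_predict, num_features=3)
```

For a linear toy model like this one the effects simply recover the coefficients. For a real network they only describe behaviour around the all-zeros input.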
*Edited to include relevant code to implement permutation importance.
I answered a similar question at Feature Importance Chart in neural network using Keras in Python. It does implement what Teque5 mentioned above, namely shuffling the variable among your sample or permutation importance using the ELI5 package.
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def base_model():
    model = Sequential()
    ...
    return model

X = ...
y = ...

# sk_params: keyword arguments for the wrapper, e.g. epochs and batch_size
my_model = KerasRegressor(build_fn=base_model, **sk_params)
my_model.fit(X, y)

# Shuffle each feature in turn and measure how much the score degrades
perm = PermutationImportance(my_model, random_state=1).fit(X, y)
eli5.show_weights(perm, feature_names=X.columns.tolist())
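If you want to see what permutation importance does without the ELI5 dependency, here is a minimal pure-Python sketch of the idea. It is not ELI5's implementation: the function names, the use of mean squared error, and the toy model are all assumptions for illustration.

```python
import random


def mean_squared_error(predict, X, y):
    """Average squared difference between predictions and targets."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in error after shuffling each column in turn."""
    rng = random.Random(seed)
    base_error = mean_squared_error(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            increases.append(mean_squared_error(predict, X_perm, y) - base_error)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Made-up model that depends only on the first feature."""
    return 3.0 * row[0] + 0.0 * row[1]


X = [[float(i), float((i * 7) % 5)] for i in range(8)]
y = [toy_predict(row) for row in X]
importances = permutation_importance(toy_predict, X, y)
```

A feature the model ignores gets an importance of zero, because shuffling it does not change any prediction.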
It is not that simple. For example, in later stages the variable could be reduced to 0.
I'd have a look at LIME (Local Interpretable Model-Agnostic Explanations). The basic idea is to set some inputs to zero, pass it through the model and see if the result is similar. If yes, then that variable might not be that important. But there is more about it and if you want to know it, then you should read the paper.
See marcotcr/lime on GitHub.
This is a relatively old post with relatively old answers, so I would like to offer another suggestion: using SHAP to determine feature importance for your Keras models. SHAP also lets you process Keras models with layers that require 3D input, such as LSTM and GRU, while eli5 cannot.
To avoid double-posting, I would like to point to my answer to a similar question on Stack Overflow about using SHAP.