Keras How to use max_value in Relu activation function - tensorflow

Relu function as defined in keras/activation.py is:
def relu(x, alpha=0., max_value=None):
    return K.relu(x, alpha=alpha, max_value=max_value)
It has a max_value which can be used to clip the value. Now how can this be used/called in the code?
I have tried the following:
(a)
model.add(Dense(512,input_dim=1))
model.add(Activation('relu',max_value=250))
assert kwarg in allowed_kwargs, 'Keyword argument not understood: ' + kwarg
AssertionError: Keyword argument not understood: max_value
(b)
Rel = Activation('relu',max_value=250)
same error
(c)
from keras.layers import activations
uu = activations.relu(??,max_value=250)
The problem with this is that relu() expects the input tensor as its first argument. The error is 'relu() takes at least 1 argument (1 given)'
So how do I make this a layer?
model.add(activations.relu(max_value=250))
has the same issue 'relu() takes at least 1 argument (1 given)'
If this file cannot be used as layer, then there seems to be no way of specifying a clip value to Relu. This implies that the comment here https://github.com/fchollet/keras/issues/2119 closing a proposed change is wrong...
Any thoughts? Thanks!

You can use the ReLU function of the Keras backend. Therefore, first import the backend:
from keras import backend as K
Then, you can pass your own function as activation using backend functionality.
This would look like
def relu_advanced(x):
    return K.relu(x, max_value=250)
Then you can use it like
model.add(Dense(512, input_dim=1, activation=relu_advanced))
or
model.add(Activation(relu_advanced))
Unfortunately, this hard-codes the additional arguments.
It is therefore better to use a function that returns your activation function and passes in your custom values:
def create_relu_advanced(max_value=1.):
    def relu_advanced(x):
        return K.relu(x, max_value=K.cast_to_floatx(max_value))
    return relu_advanced
Then you can pass your arguments by either
model.add(Dense(512, input_dim=1, activation=create_relu_advanced(max_value=250)))
or
model.add(Activation(create_relu_advanced(max_value=250)))
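To make the effect of max_value concrete, here is a minimal NumPy sketch of the arithmetic a clipped ReLU performs, written with the same factory pattern. It only illustrates the computation min(max(x, 0), max_value) and is not the Keras implementation; the name make_clipped_relu is hypothetical.

```python
import numpy as np

def make_clipped_relu(max_value=1.0):
    """Return a function computing min(max(x, 0), max_value) element-wise."""
    def clipped_relu(x):
        # Negative inputs become 0, inputs above max_value are capped.
        return np.minimum(np.maximum(x, 0.0), max_value)
    return clipped_relu

relu250 = make_clipped_relu(max_value=250)
out = relu250(np.array([-5.0, 0.0, 100.0, 300.0]))
print(out)
```

Only the last input exceeds the cap, so it is the only positive value that changes.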

This is as easy as a single lambda:
from keras.activations import relu
clipped_relu = lambda x: relu(x, max_value=3.14)
Then use it like this:
model.add(Conv2D(64, (3, 3)))
model.add(Activation(clipped_relu))
When loading a model saved in HDF5 format, pass the lambda through the custom_objects dictionary:
model = load_model(model_file, custom_objects={'<lambda>': clipped_relu})
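The '&lt;lambda&gt;' key looks odd, but it is simply the name Python gives every lambda. The sketch below uses plain Python stand-ins rather than Keras objects, on the assumption that the saved model refers to the activation by its function name; named_clipped_relu is a hypothetical alternative that gets a distinct name.

```python
# Stand-in for the Keras lambda: every lambda reports the same __name__.
clipped_relu = lambda x: min(max(x, 0.0), 3.14)

def named_clipped_relu(x):
    # A regular def carries its own name, which is easier to look up later.
    return min(max(x, 0.0), 3.14)

print(clipped_relu.__name__)
print(named_clipped_relu.__name__)
```

If a model uses more than one lambda activation, they would all share that one name, so a named function is likely the safer choice.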

I tested the following and it works:
import keras
def clip_relu(x):
    return keras.activations.relu(x, max_value=1.)
predictions=Dense(num_classes,activation=clip_relu,name='output')

This is what I did using Lambda layer to implement clip relu:
Step 1: define a function to do reluclip:
def reluclip(x, max_value = 20):
    return K.relu(x, max_value = max_value)
Step 2: add Lambda layer into model:
y = Lambda(function = reluclip)(y)

Related

Getting an error trying to invoke the predict() function to calculate model predictions using created variables

Here's my beginning code:
import numpy as np
# Create predict() function accepting a prediction probabilities list and threshold value
def predict(predict_prob,thresh):
    pred = []
    for i in range(len(predict_prob)):
        if predict_prob[i] >= thresh:
            pred.append(1)
        else:
            pred.append(0)
    return pred
Here's what I'm trying to do:
invoke the predict() function to calculate the model predictions using those variables. Save this output as preds and print it out.
# prediction probabilities
probs = [0.886,0.375,0.174,0.817,0.574,0.319,0.812,0.314,0.098,0.741,
         0.847,0.202,0.31,0.073,0.179,0.917,0.64,0.388,0.116,0.72]
# threshold value
thresh = 0.5
# prediction values
preds = predict(predict_prob, thresh) #calling predict ERROR
Here's the error:
NameError: name 'predict_prob' is not defined
But it is defined earlier when I created the function, so not sure why it's giving an error when I ran the earlier code and it didn't error on me.
You are calling the function with the wrong argument. In this case, it should be:
preds = predict(probs, thresh)
You can't use 'predict_prob' outside of the function. It is a parameter of predict(), so it is a local name that exists only inside the function body. At the call site you pass the variables you actually defined, 'probs' and 'thresh'.
I hope that helps.
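For reference, a compact sketch of the corrected call, using the data from the question. The list comprehension is an optional rewrite of the loop, not something the original code requires.

```python
def predict(predict_prob, thresh):
    # predict_prob is only a parameter name; callers pass their own list.
    return [1 if p >= thresh else 0 for p in predict_prob]

probs = [0.886, 0.375, 0.174, 0.817, 0.574, 0.319, 0.812, 0.314, 0.098, 0.741,
         0.847, 0.202, 0.31, 0.073, 0.179, 0.917, 0.64, 0.388, 0.116, 0.72]
thresh = 0.5

preds = predict(probs, thresh)  # pass the variables that exist at the call site
print(preds)
```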

Converting a fully connected neural network with variable number of hidden layers from tensorflow to pytorch

I recently started learning pytorch and I am trying to convert a part of a large script including coding a MLP with variable number of hidden layers from Tensorflow to pytorch.
import tensorflow as tf
### Base neural network
def init_mlp(layer_sizes, std=.01, bias_init=0.):
    params = {'w':[], 'b':[]}
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        params['w'].append(tf.Variable(tf.random_normal([n_in, n_out], stddev=std)))
        params['b'].append(tf.Variable(tf.mul(bias_init, tf.ones([n_out,]))))
    return params
def mlp(X, params):
    h = [X]
    for w,b in zip(params['w'][:-1], params['b'][:-1]):
        h.append( tf.nn.relu( tf.matmul(h[-1], w) + b ) )
        #h.append( tf.nn.tanh( tf.matmul(h[-1], w) + b ) )
    return tf.matmul(h[-1], params['w'][-1]) + params['b'][-1]
def compute_nll(x, x_recon_linear):
    return tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(x_recon_linear, x), reduction_indices=1, keep_dims=True)
def gauss_cross_entropy(mean_post, std_post, mean_prior, std_prior):
    d = (mean_post - mean_prior)
    d = tf.mul(d,d)
    return tf.reduce_sum(-tf.div(d + tf.mul(std_post,std_post),(2.*std_prior*std_prior)) - tf.log(std_prior*2.506628), reduction_indices=1, keep_dims=True)
How could I write down similar weight and bias variables and attach them to each hidden layer in PyTorch?
How could I convert the gauss_cross_entropy and compute_nll functions as well (finding equivalent syntax)?
Are these two pieces of code compatible?
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as func
from torch.distributions import Normal, Categorical, Independent
from copy import
device = "cpu"
if torch.cuda.is_available():
    device = "cuda:0"
    if torch.cuda.device_count() > 1:
        net = nn.DataParallel(net)
net.to(device)
def init_mlp(layer_sizes, std=.01, bias_init=0.):
    params = {'w':[], 'b':[]}
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        params['w'].append(torch.tensor(Normal([n_in, n_out], torch.tensor([std])) ,requires_grad=True))
        params['b'].append(torch.tensor(torch.mul(bias_init, torch.ones([n_out,])),requires_grad=True))
    return params
def mlp(X, params):
    h = [X]
    for w,b in zip(params['w'][:-1], params['b'][:-1]):
        h.append( torch.nn.ReLU( tf.matmul(h[-1], w) + b ) )
    return torch.matmul(h[-1], params['w'][-1]) + params['b'][-1]
def compute_nll(x, x_recon_linear):
    return torch.sum(func.binary_cross_entropy_with_logits(x_recon_linear, x), reduction_indices=1, keep_dims=True)
def gauss_cross_entropy(mu_post, sigma_post, mu_prior, sigma_prior):
    d = (mu_post - mu_prior)
    d = torch.mul(d,d)
    return torch.sum(-torch.div(d + torch.mul(sigma_post,sigma_post),(2.*sigma_prior*sigma_prior)) - torch.log(sigma_prior*2.506628), reduction_indices=1, keep_dims=True)
What is the substitute function for tf.placeholder in pytorch? For instance here:
class VAE(object):
    def __init__(self, hyperParams):
        self.X = tf.placeholder("float", [None, hyperParams['input_d']])
        self.prior = hyperParams['prior']
        self.K = hyperParams['K']
        self.encoder_params = self.init_encoder(hyperParams)
        self.decoder_params = self.init_decoder(hyperParams)
and also how should I change tf.shape in this line: tf.random_normal(tf.shape(self.sigma[-1]))
How could I write down similar weights and bias variables and attach them in each hidden layer in PyTorch?
An easier way to define those is to create a list containing the params as (weight, bias) tuples:
def init_mlp(layer_sizes, std=.01, bias_init=0.):
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        params.append([
            # pass std through so the weights match the requested spread
            nn.init.normal_(torch.empty(n_in, n_out), std=std).requires_grad_(True),
            torch.empty(n_out).fill_(bias_init).requires_grad_(True)])
    return params
Above I define my parameters as 'empty' (created with uninitialized data) tensors with torch.empty. I have used in-place functions such as nn.init.normal_ (there are many others available) and torch.Tensor.fill_ to fill the tensor with an arbitrary value (maybe it is .mul_(bias_init) you are looking for, based on your TensorFlow sample?).
For the inference code, you don't actually need to store the intermediate layer results:
def mlp(x, params):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = torch.relu(x)
    return x
How could I convert gauss_cross_entropy and compute_nll functions as well (finding equivalent syntax)?
You can use PyTorch functions and mathematical operators to define your logic. In compute_nll you were summing the output of the built-in binary_cross_entropy_with_logits. It does not need a summation after it: by default it already averages the losses over all elements (pass reduction='none' if you need the per-element losses instead).
import torch.nn.functional as F
def compute_loss(y_pred, y_true):
    return F.binary_cross_entropy_with_logits(y_pred, y_true)
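If the per-sample sums of the original TensorFlow functions are needed rather than a batch mean, a more literal port could look like the sketch below. It assumes x and the logits have shape (batch, features) and that std_prior may be a plain float; the sample tensors at the bottom are made up for illustration.

```python
import torch
import torch.nn.functional as F

def compute_nll(x, x_recon_linear):
    # Per-element BCE on logits, then sum over the feature axis (dim=1).
    bce = F.binary_cross_entropy_with_logits(x_recon_linear, x, reduction="none")
    return bce.sum(dim=1, keepdim=True)

def gauss_cross_entropy(mean_post, std_post, mean_prior, std_prior):
    # torch.log needs a tensor, so wrap a float prior std if one is passed.
    std_prior = torch.as_tensor(std_prior, dtype=mean_post.dtype)
    d = (mean_post - mean_prior) ** 2
    terms = -(d + std_post ** 2) / (2.0 * std_prior ** 2) - torch.log(std_prior * 2.506628)
    return terms.sum(dim=1, keepdim=True)

# Illustrative inputs only.
x = torch.tensor([[0.0, 1.0], [1.0, 1.0]])
logits = torch.zeros(2, 2)
nll = compute_nll(x, logits)  # shape (2, 1)
ce = gauss_cross_entropy(torch.zeros(2, 3), torch.ones(2, 3), torch.zeros(2, 3), 1.0)
print(nll.shape, ce.shape)
```

Here reduction_indices=1 becomes dim=1 and keep_dims=True becomes keepdim=True.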
What is the substitute function for tf.placeholder in Pytorch?
You don't have placeholders in PyTorch. You compute your outputs explicitly by calling PyTorch operators on actual tensors, and autograd can then backpropagate through those operators to get the gradient for each parameter.
How should I change tf.shape in this line: tf.random_normal(tf.shape(self.sigma[-1]))
The function tf.shape returns the shape of a tensor. In PyTorch you read the torch.Tensor.shape attribute or call torch.Tensor.size(), i.e. self.sigma[-1].shape or self.sigma[-1].size().
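For the specific pattern of drawing standard normal noise with the shape of an existing tensor, a short sketch follows. The tensors mu and sigma are placeholders standing in for self.sigma[-1] and its matching mean.

```python
import torch

sigma = torch.ones(4, 3)   # stand-in for self.sigma[-1]
mu = torch.zeros(4, 3)     # stand-in for the matching mean

# Noise with the same shape, dtype and device as sigma.
eps = torch.randn_like(sigma)
# Equivalent when only the shape matters.
eps_from_shape = torch.randn(sigma.shape)

z = mu + sigma * eps
print(z.shape)
```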

Get logits of a trained Keras model [duplicate]

I am building a deconvolution network. I would like to add a layer to it that is the inverse of a softmax. I wrote a basic Python function that returns the inverse of a softmax for a given matrix, wrapped it in a TensorFlow Lambda layer, and added it to my model.
I get no error, but when I call predict the output contains only zeros. Without this layer the network outputs something other than zeros, so the zeros must come from my inv_softmax function.
Can you tell me how to proceed?
I define my function like this:
def inv_softmax(x):
    C=0
    S = np.zeros((1,1,10)) #(1,1,10) is the shape of the datas that my layer will receive
    try:
        for j in range(np.max(np.shape(x))):
            C+=np.exp(x[0,0,j])
        for i in range(np.max(np.shape(x))):
            S[0,0,i] = np.log(x[0,0,i]+C
    except ValueError:
        print("ValueError in inv_softmax")
        pass
    S = tf.convert_to_tensor(S,dtype=tf.float32)
    return S
I add it to my network as :
x = ...
x = layers.Lambda(lambda x : inv_softmax(x),name='inv_softmax',output_shape=[1,1,10])(x)
x = ...
If you need more of my code or any other information, please ask.
Try this:
import math
import tensorflow as tf
def inv_softmax(x, C):
    return tf.math.log(x) + C
input = tf.keras.layers.Input(shape=(1,10))
x = tf.keras.layers.Lambda(lambda x : inv_softmax(x, math.log(10.)),name='inv_softmax')(input)
model = tf.keras.Model(inputs=input, outputs=x)
a = tf.zeros([1, 1, 10])
a = tf.nn.softmax(a)
a = model(a)
print(a.numpy())
Thanks, it works!
I used:
import keras.backend as K
def inv_softmax(x,C):
    return K.log(x)+K.log(C)
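The constant C is arbitrary because softmax is unchanged when the same value is added to every logit, so logits can only be recovered up to an additive constant. A small NumPy check of that property, with made-up logits and an arbitrary C:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; this does not change the result.
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def inv_softmax(p, C=0.0):
    return np.log(p) + C

logits = np.array([[[2.0, -1.0, 0.5]]])  # illustrative values
p = softmax(logits)
recovered = inv_softmax(p, C=7.0)        # any constant works

# The recovered values differ from the original logits by one shared offset,
# and applying softmax again reproduces the same probabilities.
offset = recovered - logits
print(offset)
print(np.allclose(softmax(recovered), p))
```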

How to apply a function to the value of a tensor and then assigning the output to the same tensor

I want to project the updated weights of my network (after an optimization step) onto a special space, which requires passing the value of the weight tensor to a projection function. That function takes a NumPy array as input. Is there a way I can do this?
I tried tf.assign() as a solution, but it failed because my function accepts arrays, not tensors.
Here is a sketch of what I want to do:
W = tf.Variable(...)
...
opt = tf.train.AdamOptimizer(learning_rate).minimize(loss, var_list=['W'])
W = my_function(W)
It seems that tf.control_dependencies is what you need.
A simple example:
import tensorflow as tf
var = tf.get_variable('var', initializer=0.0)
# replace `tf.add` with your custom function
addop = tf.add(var, 1)
with tf.control_dependencies([addop]):
    updateop = tf.assign(var, addop)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # pylint: disable=no-member
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    updateop.eval()
    print(var.eval())
    updateop.eval()
    print(var.eval())
    updateop.eval()
    print(var.eval())
output:
1.0
2.0
3.0
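The example above uses a TensorFlow op, whereas the question's function takes a NumPy array. Under the assumption that TensorFlow 2.x with eager execution is available (not the 1.x graph API used above), one possible sketch is to read the variable out as an array, apply the function, and assign the result back. The projection project_to_unit_ball is a hypothetical stand-in for my_function.

```python
import numpy as np
import tensorflow as tf

def project_to_unit_ball(w):
    # Hypothetical NumPy-only projection: rescale if the norm exceeds 1.
    norm = np.linalg.norm(w)
    projected = w if norm <= 1.0 else w / norm
    return projected.astype(np.float32)

W = tf.Variable([[3.0, 4.0]], dtype=tf.float32)

# ... an optimizer step that updates W would run here ...

# Eager mode: .numpy() yields an array, assign() writes the result back in place.
W.assign(project_to_unit_ball(W.numpy()))
print(W.numpy())
```

In graph mode the same idea would need the NumPy function wrapped as an op instead of being called directly.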

Why TensorFlow eager execution API give a wrong answer for this function?

I have this function and want to compute its derivative.
def dp1_f1(x):
    return 64*x*(1-x)*(math.pow((1-2*x),2) )*math.pow((1-8*x+8*x*x), 2)
I want to get dy/dx value.
I can get this value by numeric method just as below:
def dp_numeric_diff(x):
    delta_x = 0.0001
    return (dp1_f1(x+delta_x)-dp1_f1(x))/delta_x
I use TensorFlow eager execution API to calculate this value:
def dp_ad_tfe(x):
    tf.enable_eager_execution()
    tfe = tf.contrib.eager
    grad_lx = tfe.gradients_function(dp1_f1)
    x = 3.0
    y = dp1_f1(x)
    rst = grad_lx(x)
    return y, rst[0]
I call this function with code below:
numeric_diff = dp_numeric_diff(x)
print('Numeric method:{0}'.format(numeric_diff))
v, d = dp_ad_tfe(x)
print('TFE:{0}'.format(d))
It will display something like this:
Numeric method:-75290405.66440672
TFE:-19208000.0
I am sure that the numeric method is right. What's wrong with my TensorFlow eager execution code? By the way the same TensorFlow eager execution code can get correct answer for simple function like x^2.
I found that the TensorFlow eager execution API can't differentiate through functions like math.pow. I have to provide a function that tells the eager execution API how to compute the derivative. To solve this I replaced math.pow with my own function:
@tf.custom_gradient
def f3(x, n):
    v = tf.pow(x, n)
    def grad(dy):
        return (dy* (n*tf.pow(x, n-1)) ).numpy()
    return v.numpy(), grad
And have to modify the original function as below:
def dp1_f1(x):
    return 64*x*(1-x)*f3(1-2*x,2)*f3(1-8*x+8*x*x, 2)
Now TensorFlow eager execution API will give the right answer just as the numeric method.
TensorFlow's automatic differentiation APIs can only differentiate through compositions of TensorFlow operations, not through functions like math.pow() or other libraries. If you replace math.pow() with tf.pow() it should work out just fine.
Something like:
import tensorflow as tf
tf.enable_eager_execution()
def dp1_f1(x):
    return 64*x*(1-x)*(tf.pow((1-2*x),2) )*tf.pow((1-8*x+8*x*x), 2)
def dp_numeric_diff(x):
    delta_x = 0.0001
    return (dp1_f1(x+delta_x)-dp1_f1(x))/delta_x
grad = tf.contrib.eager.gradients_function(dp1_f1)
print(dp_numeric_diff(3.0).numpy()) # Prints -75300000.0
print(grad(3.0)[0].numpy()) # Prints -75279680.0
Hope that helps.
(Seems this was also asked on GitHub)
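As a cross-check that needs no TensorFlow, the derivative can be worked out by hand with the product rule and compared against finite differences. The sketch below is plain Python; the helper names are illustrative. It suggests that the tf.pow gradient printed above matches the exact value, while the forward difference from the question is only an approximation.

```python
def f(x):
    return 64 * x * (1 - x) * (1 - 2 * x) ** 2 * (1 - 8 * x + 8 * x * x) ** 2

def f_prime(x):
    # Product rule over the four factors a * b * c * d.
    a, da = x, 1.0
    b, db = 1 - x, -1.0
    c, dc = (1 - 2 * x) ** 2, -4 * (1 - 2 * x)
    q = 1 - 8 * x + 8 * x * x
    d, dd = q ** 2, 2 * q * (16 * x - 8)
    return 64 * (da * b * c * d + a * db * c * d + a * b * dc * d + a * b * c * dd)

def forward_diff(x, h=1e-4):
    # Same scheme as dp_numeric_diff: error shrinks linearly with h.
    return (f(x + h) - f(x)) / h

def central_diff(x, h=1e-4):
    # Symmetric scheme: error shrinks quadratically with h.
    return (f(x + h) - f(x - h)) / (2 * h)

exact = f_prime(3.0)
print(exact, forward_diff(3.0), central_diff(3.0))
```

The central difference lands much closer to the analytic value than the forward difference does for the same step size.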