I'm building a NN that supports complex numbers, and I'm currently working on the complex activation. According to a Bengio paper, a good choice is modReLU, which applies ReLU(|z| + b) to the magnitude of a complex input z while keeping its phase,
where b is a trainable parameter to be learnt. So I'm building a special layer to do this activation. I'm new to Keras and stuck already. I created the code below, but it gives an error with the build function. I have no idea what's happening; I just tried to copy the template. Please help.
class modrelu(Layer):
    def __init__(self, **kwargs):
        super(modrelu, self).__init__(**kwargs)

    def build(self):
        self.b = K.variable(value=np.random.rand()-0.5, dtype='float64')
        super(modrelu, self).build()  # Be sure to call this at the end

    def call(self, x):
        assert isinstance(x, list)
        ip_r, ip_i = x
        comp = tf.complex(ip_r, ip_i)
        ABS = tf.math.abs(comp)
        ANG = tf.math.angle(comp)
        ABS = K.relu(self.b + ABS)
        op_r = ABS * K.sin(angle)  # K.dot ??
        op_i = ABS * K.cos(angle)
        return [op_r, op_i]

    def compute_output_shape(self, input_shape):
        assert isinstance(input_shape, list)
        shape_a, shape_b = input_shape
        return [shape_a, shape_b]
Comments on my code:
In __init__ I didn't add anything, because it is an activation layer that takes no arguments when instantiated.
In the build method, I tried to add the b's. I'm not sure if I should use the self.add_weight method. Ideally, I want as many b's as the dimension of the input.
In the call method, I'm fairly confident about what I'm doing. It is easy: I just implemented the function.
For the last one, compute_output_shape, I just copy-pasted the template. The output should have the same shape as the input, because it is just an activation layer.
Finally, here is the error, for what it's worth; it looks like nonsense to me:
TypeError Traceback (most recent call last)
<ipython-input-5-3101a9226da5> in <module>
1 a=K.variable(np.array([1,2]))
2 b=K.variable(np.array([3,4]))
----> 3 act([a,b])
~\AppData\Local\conda\conda\envs\python36\lib\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
429 'You can build it manually via: '
430 '`layer.build(batch_input_shape)`')
--> 431 self.build(unpack_singleton(input_shapes))
432 self.built = True
433
TypeError: build() takes 1 positional argument but 2 were given
There are several issues with your code.
First of all, I should address the error you get from the interpreter:
TypeError: build() takes 1 positional argument but 2 were given
The build method should take an input_shape argument, so you should declare it as build(self, input_shape).
The second issue is the undefined shape of the variable in the build method. You should explicitly declare the shape of the variable; in your case, the np.random.rand array should have shape input_shape.
Another issue is that you are trying to return two results ([op_r, op_i]) from the call method. I'm not a specialist in Keras, but as far as I know you can't do that: every Keras layer should have one and only one output. See here for the details: https://github.com/keras-team/keras/issues/3061
However, if you use the TensorFlow backend you can use complex numbers (tf.complex) to return both the real (op_r) and imaginary (op_i) parts as a single complex tensor.
Here is a working implementation of the modrelu layer with a simple usage example. It is written for TensorFlow 1.12.0, which is distributed with its own implementation of the Keras API, but I think you can easily adapt it for the original Keras:
import tensorflow as tf
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.engine import Layer
import numpy as np


class modrelu(Layer):
    def __init__(self, **kwargs):
        super(modrelu, self).__init__(**kwargs)

    # provide input_shape argument in the build method
    def build(self, input_shape):
        # You should pass shape for your variable
        self.b = K.variable(value=np.random.rand(*input_shape) - 0.5,
                            dtype='float32')
        super(modrelu, self).build(input_shape)  # Be sure to call this at the end

    def call(self, inputs, **kwargs):
        assert inputs.dtype == tf.complex64
        ip_r = tf.math.real(inputs)
        ip_i = tf.math.imag(inputs)
        comp = tf.complex(ip_r, ip_i)
        ABS = tf.math.abs(comp)
        ANG = tf.math.angle(comp)
        ABS = K.relu(self.b + ABS)
        op_r = ABS * K.sin(ANG)  # K.dot ??
        op_i = ABS * K.cos(ANG)
        # return single tensor in the call method
        return tf.complex(op_r, op_i)


real = tf.constant([2.25, 3.25])
imag = tf.constant([4.75, 5.75])
x = tf.complex(real, imag)

y = modrelu()(x)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y))
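If, as the question mentions, you want to use self.add_weight and have as many b's as the input dimension, a rough, unverified sketch of the build method (my own variant, with one b per feature along the last axis) could look like this:

def build(self, input_shape):
    # register b as a trainable weight, one b per input feature (names are illustrative)
    self.b = self.add_weight(name='b',
                             shape=(input_shape[-1],),
                             initializer='random_uniform',
                             trainable=True)
    super(modrelu, self).build(input_shape)  # Be sure to call this at the end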
P.S.: I didn't check the math, so you should verify it yourself.
You are not coding the layer correctly: the build function takes an input_shape parameter, which you can use to initialize the weights/parameters of your layer.
You can see an example in Keras' source code.
Related
I defined a simple custom model:
import tensorflow as tf


class CustomModule(tf.keras.layers.Layer):
    def __init__(self):
        super(CustomModule, self).__init__()
        self.v = tf.Variable(1.)

    def call(self, x):
        print('Tracing with', x)
        return x * self.v

    def mutate(self, new_v):
        self.v.assign(new_v)
I want to save it for serving and that is why I need to provide a function for “serving_default”. I’ve tried to do it like this:
module = CustomModule()
module_with_signature_path = './tmp/1'
call = tf.function(module.mutate, input_signature=[tf.TensorSpec([], tf.float32)])
tf.saved_model.save(module, module_with_signature_path, signatures=call)
I got an error:
ValueError: Got a non-Tensor value <tf.Operation 'StatefulPartitionedCall' type=StatefulPartitionedCall> for key 'output_0' in the output of the function __inference_mutate_8 used to generate the SavedModel signature 'serving_default'. Outputs for functions used as signatures must be a single Tensor, a sequence of Tensors, or a dictionary from string to Tensor.
How can I properly define the signature while saving the model? Thank you!
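A minimal sketch that avoids this error, assuming the layer's forward pass x * v is what should be served (as the error says, the signature function must return a Tensor, and mutate does not return one):

module = CustomModule()
module_with_signature_path = './tmp/1'

# Wrap a function whose output is a Tensor (here the forward pass) instead of mutate.
call = tf.function(module.__call__,
                   input_signature=[tf.TensorSpec([], tf.float32)])
tf.saved_model.save(module, module_with_signature_path, signatures=call)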
I have implemented a custom version of Batch Normalization, adding a self.skip variable that is meant to act as a switch I can flip at run time. Here is the minimal code:
from tensorflow.keras.layers import BatchNormalization
import tensorflow as tf


# class CustomBN(tf.keras.layers.Layer):
class CustomBN(BatchNormalization):
    def __init__(self, **kwargs):
        super(CustomBN, self).__init__(**kwargs)
        self.skip = False

    def call(self, inputs, training=None):
        if self.skip:
            tf.print("I'm skipping")
        else:
            tf.print("I'm not skipping")
        return super(CustomBN, self).call(inputs, training)

    def build(self, input_shape):
        super(CustomBN, self).build(input_shape)
To be crystal clear, all I have done so far is:
subclassing BatchNormalization (should I subclass tf.keras.layers.Layer instead?),
defining self.skip to change the behavior of the CustomBN layer at run time,
checking the state of self.skip in the call method and acting accordingly.
Now, to change the behavior of the 'CustomBN' layer, I use
self.model.layers[ind].skip = state
where state is either True or False, and ind is the index number of CustomBN layer in the model.
The evident problem is that the value of self.skip never seems to change.
If you notice any mistakes please notify me.
By default, the call function of your layer is invoked when the graph is built, not on a per-batch basis. The Keras model compile method has a run_eagerly option that would cause your model to run (more slowly) in eager mode, which would invoke your call function without building a graph. This is most likely not what you want to do, however.
Ideally you want the flag that changes the behavior to be an input to the call method. For instance, you can add an extra input to your graph that is simply this state flag and pass it to your layer.
The following is an example of how you can make the graph conditional on an extra parameter.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class MyLayerWithFlag(keras.layers.Layer):
    def call(self, inputs, flag=None):
        c_one = tf.constant([1], dtype=tf.float32)
        if flag is not None:
            x = tf.cond(
                flag, lambda: tf.math.add(inputs, c_one),
                lambda: inputs)
            return x
        return inputs


inputs = layers.Input(shape=(2,))
state = layers.Input(shape=(1,), dtype=tf.bool)

x = MyLayerWithFlag()(inputs, flag=state)
out = layers.Lambda(tf.reduce_sum)(x)
model = keras.Model([inputs, state], out)

data = np.array([[1., 2.]])
state = np.array([[True]])
model.predict((data, state))
I'm following the section "Losses and Metrics Based on Model Internals" in chapter 12 of "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition" by Aurélien Géron, in which he shows how to add custom losses and metrics that do not depend on labels and predictions.
To illustrate this, we add a custom "reconstruction loss" by adding a layer on top of the upper hidden layer which should reproduce the input. The loss is the mean squared difference between the reconstruction and the inputs.
He shows the code for adding the custom loss, which works nicely, but even following his description I cannot add the metric, since it raises a ValueError. He says:
Similarly, you can add a custom metric based on model internals by computing it in any way you want, as long as the result is the output of a metric object. For example, you can create a keras.metrics.Mean object in the constructor, then call it in the call() method, passing it the recon_loss, and finally add it to the model by calling the model's add_metric() method.
This is the code (I have added #MINE to the lines I have added myself):
import tensorflow as tf
from tensorflow import keras
import numpy as np  # needed below for the dummy data


class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [keras.layers.Dense(30, activation="selu",
                                          kernel_initializer="lecun_normal")
                       for _ in range(5)]
        self.out = keras.layers.Dense(output_dim)
        self.reconstruction_mean = keras.metrics.Mean(name="reconstruction_error")  #MINE

    def build(self, batch_input_shape):
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs)
        super().build(batch_input_shape)

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
        reconstruction = self.reconstruct(Z)
        recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
        self.add_loss(0.05 * recon_loss)
        if training:  #MINE
            result = self.reconstruction_mean(recon_loss)  #MINE
        else:  #MINE
            result = 0.  #MINE, I have also tried different things here,
                         #but the help showed a similar sample to this.
        self.add_metric(result, name="foo")  #MINE
        return self.out(Z)
Then compiling and fitting the model:
training_set_size=10
X_dummy = np.random.randn(training_set_size, 8)
y_dummy = np.random.randn(training_set_size, 1)
model = ReconstructingRegressor(1)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2)
Which throws:
ValueError: in converted code:
<ipython-input-296-878bdeb30546>:26 call *
self.add_metric(result, name="foo") #MINE
C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1147 add_metric
self._symbolic_add_metric(value, aggregation, name)
C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1867 _symbolic_add_metric
'We do not support adding an aggregated metric result tensor that '
ValueError: We do not support adding an aggregated metric result tensor that is not the output of a `tf.keras.metrics.Metric` metric instance. Without having access to the metric instance we cannot reset the state of a metric after every epoch during training. You can create a `tf.keras.metrics.Metric` instance and pass the result here or pass an un-aggregated result with `aggregation` parameter set as `mean`. For example: `self.add_metric(tf.reduce_sum(inputs), name='mean_activation', aggregation='mean')`
Having read that, I tried similar things to solve the issue, but they just led to different errors. How can I solve this? What is the "correct" way to do this?
I'm using conda on Windows, with tensorflow-gpu 2.1.0 installed.
The problem is right here:
def call(self, inputs, training=None):
    Z = inputs
    for layer in self.hidden:
        Z = layer(Z)
    reconstruction = self.reconstruct(Z)
    recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
    self.add_loss(0.05 * recon_loss)
    if training:
        result = self.reconstruction_mean(recon_loss)
    else:
        result = 0.  # <--- Here!
    self.add_metric(result, name="foo")
    return self.out(Z)
The error says that add_metric only accepts a result produced by a tf.keras.metrics.Metric instance, but 0. is a plain scalar, not a metric result.
My proposed solution is simply to do this:
def call(self, inputs, training=None):
    Z = inputs
    for layer in self.hidden:
        Z = layer(Z)
    reconstruction = self.reconstruct(Z)
    recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
    self.add_loss(0.05 * recon_loss)
    if training:
        result = self.reconstruction_mean(recon_loss)
        self.add_metric(result, name="foo")
    return self.out(Z)
This way, your mean reconstruction_error will be shown only at training time.
Since you work in eager mode, you should create your model with dynamic=True, as below:
model = ReconstructingRegressor(1,dynamic=True)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2, batch_size=10)
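As an aside, the error message itself points at another possible approach that skips the Metric instance entirely: pass the un-aggregated scalar with aggregation='mean' and let Keras do the averaging. A rough sketch of the relevant line inside call would be:

# let add_metric aggregate the raw scalar itself (per the hint in the error message)
self.add_metric(recon_loss, name="reconstruction_error", aggregation="mean")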
P.S.: pay attention that when calling model.fit or model.evaluate you should also make sure that the batch size divides your training set size (since this is a stateful network). So call those functions like this: model.fit(X_dummy, y_dummy, epochs=2, batch_size=10) or model.evaluate(X_dummy, y_dummy, batch_size=10).
Good Luck!
I am looking at a very simple custom layer which outputs a trainable variable, but ignores the input:
import tensorflow as tf

inputs = tf.random.normal(shape=[500, 1])


class MyLayer(tf.keras.Model):
    def __init__(self):
        super(MyLayer, self).__init__()

    def build(self, input_shape):
        print("Input shape", input_shape)
        self.kernel = tf.Variable([-1.0])

    def call(self, input):
        print(" Call", input.shape)
        return self.kernel


model = MyLayer()
model.predict(inputs[:33])

model = MyLayer()
model.predict(inputs[:31])
The first model.predict call succeeds and returns an array of length 33, but the second call crashes with the error message:
ValueError: Mismatch between expected batch size and model output batch size. Output shape = (1,), expected output shape = shape (31,)
Why would the first call succeed, but the second fail?
Incidentally, if I replace "return self.kernel" in the call definition with "return input*0.0 + self.kernel" then everything works as expected, but I think it should be possible to do this without the multiply by zero which seems unnecessary.
I am trying to implement a layer where the output is trainable, but doesn't depend on input.
Any insight would be greatly appreciated.
Thanks.
One way I found to solve this issue so that the batch dimension is handled is to use tf.map_fn; that way the value is mapped along the batch axis. It can be managed by changing the return statement:
return tf.map_fn(lambda x: self.kernel, input, dtype=tf.float32)
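Put together with the layer from the question, a minimal sketch (mine, assuming the rest of the layer stays unchanged) would be:

import tensorflow as tf

class MyLayer(tf.keras.Model):
    def build(self, input_shape):
        self.kernel = tf.Variable([-1.0])

    def call(self, input):
        # map the kernel over the batch axis so the output batch size
        # matches the input batch size
        return tf.map_fn(lambda x: self.kernel, input, dtype=tf.float32)

inputs = tf.random.normal(shape=[500, 1])
model = MyLayer()
print(model.predict(inputs[:31]).shape)  # (31, 1)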
I found that it is easy to use Lasagne to make a graph like this:
import lasagne.layers as L


class A:
    def __init__(self):
        self.x = L.InputLayer(shape=(None, 3), name='x')
        self.y = x + 1

    def get_y_sym(self, x_var, **kwargs):
        y = L.get_output(self.y, {self.x: x_var}, **kwargs)
        return y
Through the method get_y_sym, we can get a tensor, not a value; then I can use this tensor as the input of another graph.
But if I use tensorflow, how could I implement this?
I'm not familiar with Lasagne, but you should know that ALL of TensorFlow uses graph-based computation (unless you use tf.Eager, but that's another story). So by default something like:
net = tf.nn.conv2d(...)
returns a reference to a Tensor object. In other words, net is NOT a value, it is a reference to the output of the convolution node created by tf.nn.conv2d(...).
These can then be chained:
net2 = tf.nn.conv2d(net, ...) and so on.
To get "values" one has to open a tf.Session:
with tf.Session() as sess:
    net2_eval = sess.run(net2)
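To mirror the Lasagne class from the question, a rough TF 1.x sketch (my own illustration, using a placeholder as the symbolic input) could look like this:

import tensorflow as tf

class A:
    def __init__(self):
        # x is a symbolic placeholder, analogous to L.InputLayer
        self.x = tf.placeholder(tf.float32, shape=(None, 3), name='x')
        # y is a Tensor (a graph node), not a value
        self.y = self.x + 1

    def get_y_sym(self):
        return self.y

a = A()
z = a.get_y_sym() * 2  # the tensor can be chained into further graph ops
with tf.Session() as sess:
    print(sess.run(z, feed_dict={a.x: [[1., 2., 3.]]}))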