I found that it is easy to use lasagne to make a graph like this.
import lasagne.layers as L
class A:
def __init__(self):
self.x = L.InputLayer(shape=(None, 3), name='x')
self.y = x + 1
def get_y_sym(self, x_var, **kwargs):
y = L.get_output(self.y, {self.x: x_var}, **kwargs)
return y
through the method get_y_sym, we could get a tensor not a value, then I could use this tensor as the input of another graph.
But if I use tensorflow, how could I implement this?
I'm not familiar with lasagne but you should know that ALL of TensorFlow uses graph based computation (unless you use tf.Eager, but that's another story). So by default something like:
net = tf.nn.conv2d(...)
returns a reference to a Tensor object. In other words, net is NOT a value, it is a reference to the output of the convolution node created by tf.nn.conv2d(...).
These can then be chained:
net2 = tf.nn.conv2d(net, ...) and so on.
To get "values" one has to open a tf.Session:
with tf.Session() as sess:
net2_eval = sess.run(net2)
Related
I am trying to create something similar to Word2Vec with the following:
class Word2Vec(keras.Model):
def __init__(self, vocab_size, embedding_dim):
super().__init__()
self.embedding = keras.layers.Embedding(
vocab_size,
embedding_dim,
input_length=1,
name="w2v_embedding"
)
self.dot = keras.layers.Dot(axes=(-1, -1))
def call(self, data):
target, context = data
we = self.embedding(target)
ce = self.embedding(context)
return self.dot([we, ce])
and suppose the loss is the following:
def loss(similarity):
log_prob = tf.math.log(tf.sigmoid(similarity))
return -tf.math.reduce_mean(log_prob)
I am trying to fit to the above model with words and its contexts but running into the error: OperatorNotAllowedInGraphError: iterating over tf.Tensor is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature..
Supposing I have a dummy dataset that looks like the following:
N = 10000
V = 100
word = np.random.randint(0, V, N)
context = np.random.randint(0, V, (N, 4))
What I tried to do was:
word2vec = Word2Vec(V, 32)
word2vec.compile(loss=loss, optimizer="adam")
word2vec.fit(tf.data.Dataset.from_tensor_slices((word, context)), batch_size=128, epochs=1)
when I got the above error. Any thoughts on how to fix this?
I understand that this is not the exact word2vec model, but I'm more concerned about understanding the tensorflow/ keras API and getting this to work, than the actual paper implementation.
Edit 1
An editable kaggle notebook with full code is available here: https://www.kaggle.com/sachin/word-vectors
I think this is causing the issue:
target, context = data
Try this instead:
target = data[0]
context = data[1]
I'm following the section "Losses and Metrics Based on Model Internals" on chapter 12 of "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition - Aurélien Geron", in which he shows how to add custom losses and metrics that do not depend on labels and predictions.
To illustrate this, we add a custom "reconstruction loss" by adding a layer on top of the upper hidden layer which should reproduce the input. The loss is the mean squared difference betweeen the reconstruction loss and the inputs.
He shows the code for adding the custom loss, which works nicely, but even following his description I cannot make add the metric, since it raises `ValueError". He says:
Similarly, you can add a custom metric based on model internals by
computing it in any way you want, as long as the result is the output of a
metric object. For example, you can create a keras.metrics.Mean object
in the constructor, then call it in the call() method, passing it the
recon_loss, and finally add it to the model by calling the model’s
add_metric() method.
This is the code(I have added #MINE for the lines I have added myself)
import tensorflow as tf
from tensorflow import keras
class ReconstructingRegressor(keras.models.Model):
def __init__(self, output_dim, **kwargs):
super().__init__(**kwargs)
self.hidden = [keras.layers.Dense(30, activation="selu",
kernel_initializer="lecun_normal")
for _ in range(5)]
self.out = keras.layers.Dense(output_dim)
self.reconstruction_mean = keras.metrics.Mean(name="reconstruction_error") #MINE
def build(self, batch_input_shape):
n_inputs = batch_input_shape[-1]
self.reconstruct = keras.layers.Dense(n_inputs)
super().build(batch_input_shape)
def call(self, inputs, training=None):
Z = inputs
for layer in self.hidden:
Z = layer(Z)
reconstruction = self.reconstruct(Z)
recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
self.add_loss(0.05 * recon_loss)
if training: #MINE
result = self.reconstruction_mean(recon_loss) #MINE
else: #MINE
result = 0. #MINE, I have also tried different things here,
#but the help showed a similar sample to this.
self.add_metric(result, name="foo") #MINE
return self.out(Z)
Then compiling and fitting the model:
training_set_size=10
X_dummy = np.random.randn(training_set_size, 8)
y_dummy = np.random.randn(training_set_size, 1)
model = ReconstructingRegressor(1)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2)
Which throws:
ValueError: in converted code:
<ipython-input-296-878bdeb30546>:26 call *
self.add_metric(result, name="foo") #MINE
C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1147 add_metric
self._symbolic_add_metric(value, aggregation, name)
C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1867 _symbolic_add_metric
'We do not support adding an aggregated metric result tensor that '
ValueError: We do not support adding an aggregated metric result tensor that is not the output of a `tf.keras.metrics.Metric` metric instance. Without having access to the metric instance we cannot reset the state of a metric after every epoch during training. You can create a `tf.keras.metrics.Metric` instance and pass the result here or pass an un-aggregated result with `aggregation` parameter set as `mean`. For example: `self.add_metric(tf.reduce_sum(inputs), name='mean_activation', aggregation='mean')`
Having read that, I tried similar things to solve that issue but it just led to different errors. How can I solve this? What is the "correct" way to do this?
I'm using conda on Windows, with tensorflow-gpu 2.1.0 installed.
The problem is just right here:
def call(self, inputs, training=None):
Z = inputs
for layer in self.hidden:
Z = layer(Z)
reconstruction = self.reconstruct(Z)
recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
self.add_loss(0.05 * recon_loss)
if training:
result = self.reconstruction_mean(recon_loss)
else:
result = 0.#<---Here!
self.add_metric(result, name="foo")
return self.out(Z)
The error says that add_metric only gets a metric derived from tf.keras.metrics.Metric but 0 is a scalar, not a metric type.
My proposed solution is to simply do that:
def call(self, inputs, training=None):
Z = inputs
for layer in self.hidden:
Z = layer(Z)
reconstruction = self.reconstruct(Z)
recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
self.add_loss(0.05 * recon_loss)
if training:
result = self.reconstruction_mean(recon_loss)
self.add_metric(result, name="foo")
return self.out(Z)
This way, your mean reconstruction_error will be shown only in training time.
Since you work with eager mode, you should create your layer with dynamic=True as below:
model = ReconstructingRegressor(1,dynamic=True)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2, batch_size=10)
P.S - pay attention, that when calling model.fit or model.evaluate you should also make sure that the batch size divides your train set (since this is a stateful network). So, call those function like this: model.fit(X_dummy, y_dummy, epochs=2, batch_size=10) or model.evaluate(X_dummy,y_dummy, batch_size=10).
Good Luck!
I'm building a NN that supports complex numbers. Currently working on complex activation. According to a Benjio paper, this is a good one:
Where b is a trainable parameter to be learnt. So I'm building a special layer to do this activation. I'm new to Keras and stuck already. I created this code below, but it gives an error with the build function. I have no idea what's happening, I just tried to copy the template. Please help.
class modrelu(Layer):
def __init__(self, **kwargs):
super(modrelu, self).__init__(**kwargs)
def build(self):
self.b= K.variable(value=np.random.rand()-0.5, dtype='float64')
super(modrelu, self).build() # Be sure to call this at the end
def call(self, x):
assert isinstance(x, list)
ip_r, ip_i = x
comp= tf.complex(ip_r, ip_i )
ABS= tf.math.abs(comp)
ANG= tf.math.angle(comp)
ABS= K.relu( self.b + ABS)
op_r= ABS * K.sin(angle) #K.dot ??
op_i= ABS * K.cos(angle)
return [op_r, op_i]
def compute_output_shape(self, input_shape):
assert isinstance(input_shape, list)
shape_a, shape_b = input_shape
return [shape_a, shape_b]
Comments on my code:
In the init I didn't add anything, cause it is an activation layer that takes no input when instantiated.
In the build method, I tried to add the b's. Not sure if I should use the self.add_weight method. Ideally, I want to have as many b's as the dimension of input.
In the call method, this one, I'm pretty sure what I'm doing. It is easy, I just implemented the function.
The last one, compute_output_shape, I just copied-pasted the template. The output should be the same as the input, cause it is just an activation layer.
Finally, the error for what its worth, I know it is nonsense
TypeError Traceback (most recent call last)
<ipython-input-5-3101a9226da5> in <module>
1 a=K.variable(np.array([1,2]))
2 b=K.variable(np.array([3,4]))
----> 3 act([a,b])
~\AppData\Local\conda\conda\envs\python36\lib\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
429 'You can build it manually via: '
430 '`layer.build(batch_input_shape)`')
--> 431 self.build(unpack_singleton(input_shapes))
432 self.built = True
433
TypeError: build() takes 1 positional argument but 2 were given
There are several issues with your code.
First of all I should address the error you get from interpreter:
TypeError: build() takes 1 positional argument but 2 were given
The build method should take input_shape argument. Therefore you should declare build method as build(self, input_shape)
The second issue is undefined shape of the variables in the build method. You should explicitly declare shape of the variables. In your case the np.random.rand array should be of input_shape shape.
Another issue is that you are trying to return 2 results ([op_r, op_i]) in the call method. I'm not specialist in Keras but as far as I know you can't do it. Every Keras layer should have one and only one output. See here for the details: https://github.com/keras-team/keras/issues/3061
However if you use tensorflow backend you may use complex numbers (tf.complex) to return both real (op_r) and imagenary (op_i) parts of the complex number.
Here is the working implementation of modrelu layer with simple usage example. It is writtern for TensorFlow 1.12.0 which is distributed with it's own implementation of Keras API but I think you can easily adopt it for original Keras:
import tensorflow as tf
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.engine import Layer
import numpy as np
class modrelu(Layer):
def __init__(self, **kwargs):
super(modrelu, self).__init__(**kwargs)
# provide input_shape argument in the build method
def build(self, input_shape):
# You should pass shape for your variable
self.b= K.variable(value=np.random.rand(*input_shape)-0.5,
dtype='float32')
super(modrelu, self).build(input_shape) # Be sure to call this at the end
def call(self, inputs, **kwargs):
assert inputs.dtype == tf.complex64
ip_r = tf.math.real(inputs)
ip_i = tf.math.imag(inputs)
comp = tf.complex(ip_r, ip_i )
ABS = tf.math.abs(comp)
ANG = tf.math.angle(comp)
ABS = K.relu(self.b + ABS)
op_r = ABS * K.sin(ANG) #K.dot ??
op_i = ABS * K.cos(ANG)
# return single tensor in the call method
return tf.complex(op_r, op_i)
real = tf.constant([2.25, 3.25])
imag = tf.constant([4.75, 5.75])
x = tf.complex(real, imag)
y = modrelu()(x)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
print(sess.run(y))
P.S.: I didn't check the math so you should check it by yourself.
You are not coding the layer correctly, the build function takes a input_shape parameter, which you can use to initialize the weights/parameters of your layer.
You can see an example in Keras' source code.
I have two packages I'd like to use, one is written in Keras1.2, and the other one in tensorflow. I'd like to use a part of the architecture that is built in tensorflow into a Keras model.
A partial solution is suggested here, but it's for a sequential model. The suggestion regarding functional models - wrapping the pre-processing in a Lambda layer - didn't work.
The following code worked:
inp = Input(shape=input_shape)
def ID(x):
return x
lam = Lambda(ID)
flatten = Flatten(name='flatten')
output = flatten(lam(inp))
Model(input=[inp], output=output)
But, when replacing flatten(lam(inp)) with a pre-processed output tensor flatten(lam(TF_processed_layer)), I got: "Output tensors to a Model must be Keras tensors. Found: Tensor("Reshape:0", shape=(?, ?), dtype=float32)"
You could try wrapping your input tensor into the Keras Input layer and carry on building your model from there. Like so:
inp = Input(tensor=tftensor,shape=input_shape)
def ID(x):
return x
lam = Lambda(ID)
flatten = Flatten(name='flatten')
output = flatten(lam(inp))
Model(input=inp, output=output)
You are not defining your lamba correctly for Keras.
Try something like this
def your_lambda_layer(x):
x -= K.mean(x, axis=1, keepdims=True)
x = K.l2_normalize(x, axis=1)
return x
....
model.add(Lambda(your_lambda_layer))
of seeing you are using the Functional API like this
def your_lambda_layer(x):
x -= K.mean(x, axis=1, keepdims=True)
x = K.l2_normalize(x, axis=1)
return x
....
x = SomeLayerBeforeLambda(options...)(x)
x = (Lambda(your_lambda_layer))(x)
But even so, the lambda layer may not be able to be flattened so printout the shape of the lambda and take a look at it and see what it is.
I want to use a function that creates weights for a normal dense layer, it basically behaves like an initialization function, only that it "initializes" before every new forward pass.
The flow for my augmented linear layer looks like this:
input = (x, W)
W_new = g(x,W)
output = tf.matmul(x,W_new)
However, g(x,W) is not differentiable, as it involves some sampling. Luckily it also doesn't have any parameters I want to learn so I just try to do the forward and backward pass, as if I would have never replaced W.
Now I need to tell the automatic differentiation to not backpropagate through g(). I do this with:
W_new = tf.stop_gradient(g(x,W))
Unfortunately this does not work, as it complains about non-matching shapes.
What does work is the following:
input = (x, W)
W_new = W + tf.stop_gradient(g(x,W) - W)
output = tf.matmul(x,W_new)
as suggested here: https://stackoverflow.com/a/36480182
Now the forward pass seems to be OK, but I don't know how to override the gradient for the backward pass. I know, that I have to use: gradient_override_map for this, but could not transfer applications I have seen to my particular usecase (I am still quite new to TF).
However, I am not sure how to do this and if there isn't an easier way. I assume something similar has to be done in the first forward pass in a given model, where all weights are initialized while we don't have to backpropagate through the init functions as well.
Any help would be very much appreciated!
Hey #jhj I too faced the same problem fortunately I found this gist. Hope this helps :)
Sample working -
import tensorflow as tf
from tensorflow.python.framework import ops
import numpy as np
Define custom py_func which takes also a grad op as argument:
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
# Need to generate a unique name to avoid duplicates:
rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
tf.RegisterGradient(rnd_name)(grad) # see _MySquareGrad for grad example
g = tf.get_default_graph()
with g.gradient_override_map({"PyFunc": rnd_name, "PyFuncStateless": rnd_name}):
return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
Def custom square function using np.square instead of tf.square:
def mysquare(x, name=None):
with ops.name_scope(name, "Mysquare", [x]) as name:
sqr_x = py_func(np.square,
[x],
[tf.float32],
name=name,
grad=_MySquareGrad) # <-- here's the call to the gradient
return sqr_x[0]
Actual gradient:
def _MySquareGrad(op, grad):
x = op.inputs[0]
return grad * 20 * x # add a "small" error just to see the difference:
with tf.Session() as sess:
x = tf.constant([1., 2.])
y = mysquare(x)
tf.global_variables_initializer().run()
print(x.eval(), y.eval(), tf.gradients(y, x)[0].eval())