How to backpropagate with complex valued weights - tensorflow

We are currently trying to replicate the results of the following paper: https://openreview.net/forum?id=H1S8UE-Rb
To do so, we need to run backpropagation on a neural network which contains complex valued weights.
When we try to do so (with code [0]), we get an error (at [1]). We cannot find the source code for any project that trains a neural network containing complex valued weights.
We were wondering if we would need to implement the paper's backpropagation adjustments ourselves or if this is already part of some neural network libraries. If it needs to be implemented in Tensorflow, what would be the proper steps to achieve that?
[0]:
def define_neuron(x):
    """
    x is input tensor
    """
    x = tf.cast(x, tf.complex64)
    mnist_x = mnist_y = 28
    n = mnist_x * mnist_y
    c = 10
    m = 10  # m needs to be calculated
    with tf.name_scope("linear_combination"):
        complex_weight = weight_complex_variable([n, m])
        complex_bias = bias_complex_variable([m])
        h_1 = x @ complex_weight + complex_bias
    return h_1
def main(_):
    mnist = input_data.read_data_sets(
        FLAGS.data_dir,
        one_hot=True,
    )
    # `None` for the first dimension in this shape means that it is variable.
    x_shape = [None, 784]
    x = tf.placeholder(tf.float32, x_shape)
    y_ = tf.placeholder(tf.float32, [None, 10])
    yz = h_1 = define_neuron(x)
    y = tf.nn.softmax(tf.abs(yz))
    with tf.name_scope('loss'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
            labels=y_,
            logits=y,
        )
    cross_entropy = tf.reduce_mean(cross_entropy)
    with tf.name_scope('adam_optimizer'):
        optimizer = tf.train.AdamOptimizer(1e-4)
        optimizer = tf.train.GradientDescentOptimizer(1e-4)
        train_step = optimizer.minimize(cross_entropy)
[1]:
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
Traceback (most recent call last):
File "complex.py", line 156, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "complex.py", line 58, in main
train_step = optimizer.minimize(cross_entropy)
File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 343, in minimize
grad_loss=grad_loss)
File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 419, in compute_gradients
[v for g, v in grads_and_vars
File "/Users/kevin/wdev/learn_tensor/env/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 547, in _assert_valid_dtypes
dtype, t.name, [v for v in valid_dtypes]))
ValueError: Invalid type tf.complex64 for linear_combination/Variable:0, expected: [tf.float32, tf.float64, tf.float16].

I have also tried to implement a similar network in TensorFlow and saw that the optimizer cannot backpropagate through complex-valued variables. The workaround is to keep separate real tensors for the real and imaginary parts. You will have to write a function that computes the amplitude of the "complex" output of the network, which is simply sqrt(Re^2 + Im^2) (or Re^2 + Im^2 if you prefer the squared magnitude). This real output value is what you use to compute the loss.
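To make the real/imaginary split concrete, here is a minimal plain-Python sketch of the arithmetic involved. It is not TensorFlow code; the function names (complex_matvec, magnitude) and the toy numbers are made up for illustration. The same formulas would apply element-wise to two real weight tensors. Note the plus sign in the magnitude.

```python
import math

def complex_matvec(x_re, x_im, w_re, w_im, b_re, b_im):
    """Compute z = x @ W + b using only real arithmetic.

    x_re, x_im: input vector (length n), real and imaginary parts.
    w_re, w_im: weight matrix as n rows of m columns.
    b_re, b_im: bias vector (length m).
    Returns (z_re, z_im), each a list of length m.
    """
    n, m = len(w_re), len(w_re[0])
    z_re, z_im = [], []
    for j in range(m):
        # (a + ib)(c + id) = (ac - bd) + i(ad + bc), summed over the inputs
        re = sum(x_re[i] * w_re[i][j] - x_im[i] * w_im[i][j] for i in range(n))
        im = sum(x_re[i] * w_im[i][j] + x_im[i] * w_re[i][j] for i in range(n))
        z_re.append(re + b_re[j])
        z_im.append(im + b_im[j])
    return z_re, z_im

def magnitude(z_re, z_im):
    """Real-valued amplitude |z| = sqrt(Re^2 + Im^2) for each output unit."""
    return [math.sqrt(r * r + i * i) for r, i in zip(z_re, z_im)]

# Toy example (hypothetical values): 2 inputs, 1 output unit
x_re, x_im = [1.0, 2.0], [0.0, 0.0]
w_re, w_im = [[3.0], [0.0]], [[4.0], [1.0]]
b_re, b_im = [0.0], [0.0]

z_re, z_im = complex_matvec(x_re, x_im, w_re, w_im, b_re, b_im)
amp = magnitude(z_re, z_im)  # real values that can go into a real-valued loss
```

Because every quantity here is real, an ordinary optimizer has nothing complex to reject; the "complex" structure lives only in how the two halves are combined.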

Using the optimizer won't work. It is a reported issue, and I don't think TF 2 supports it yet. You can, however, do the update by hand, for example:
[...]
gradients = tf.gradients(mse, [weights])[0]
training_op = tf.assign(weights, weights - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    sess.run(training_op)
Here tf.gradients behaves as expected and computes the gradient as it should. Here is the discussion on what the gradient computes for complex variables.
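As a rough illustration of what such a hand-written update does, here is a plain-Python sketch for a single complex weight and the loss |w*x - y|^2. It assumes the gradient is taken as dL/dRe(w) + i*dL/dIm(w); whether tf.gradients returns exactly this quantity (or its conjugate) for a complex variable is something to check against the discussion mentioned above. The functions loss and grad and all numbers are made up for the example.

```python
def loss(w, x, y):
    """Squared error |w*x - y|^2 for complex scalars; the result is real."""
    r = w * x - y
    return (r * r.conjugate()).real

def grad(w, x, y):
    """dL/dRe(w) + 1j * dL/dIm(w), the direction of steepest increase."""
    r = w * x - y
    return 2 * r * x.conjugate()

# Hypothetical training pair and starting weight
x, y = complex(1.0, 2.0), complex(3.0, -1.0)
w = complex(0.0, 0.0)
learning_rate = 0.05

losses = []
for _ in range(200):
    losses.append(loss(w, x, y))
    # Same form as: tf.assign(weights, weights - learning_rate * gradients)
    w = w - learning_rate * grad(w, x, y)
```

With this convention the update moves both the real and the imaginary part downhill at once, and w converges towards y / x.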

Related

Trouble with TensorFlow and MNIST recognition

Beforehand, I thank you for analyzing my post and helping out. I've recently gotten interested in ML with Tensorflow,
but I've encountered a problem with my code. I'm reading a book called Learning TensorFlow, and I've written out the whole thing
from the first example. They are analyzing MNIST images, and I've also added my own comments with my perspective on how things work
in the code. When I run the code, however, I get an error. Here's my code, and the error.
#Import tensorflow under the name of tf
import tensorflow as tf
#Import MNIST tutorial data from tensorflow
from tensorflow.examples.tutorials.mnist import input_data
#Declare constants
#Data path
DATA_DIR = 'C:/tmp/data'
#Number of steps
NUM_STEPS = 1000
#Number of examples per step
MINIBATCH_SIZE = 100
#When we read the data-set it saves it locally under our data path, or under c:/tmp/data
data = input_data.read_data_sets(DATA_DIR, one_hot = True)
#Our placeholder X is the image. Placeholders are supplied when running the computation graph
x = tf.placeholder(tf.float32, [None, 784])
#Create a variable representing the weights. Variables are manipulated by the computation graph
W = tf.Variable(tf.zeros([784, 10]))
y_true = tf.placeholder(tf.float32, [None, 784])
y_pred = tf.matmul(x, W)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=y_pred, labels=y_true))
gd_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
correct_mask = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
with tf.Session() as sess:
    #Initialize global variables
    sess.run(tf.global_variables_initializer())
    for _ in range(NUM_STEPS):
        batch_xs, batch_ys = data.train.next_batch(MINIBATCH_SIZE)
        sess.run(gd_step, feed_dict={x: batch_xs, y_true: batch_ys})
    ans = sess.run(accuracy, feed_dict={x: data.test.images,
                                        y_true: data.test.labels})
print("Accuracy: {:.4}%".format(ans*100))
Now here's the error.
runfile('C:/Users/user/.spyder-py3/temp.py', wdir='C:/Users/user/.spyder-py3')
Extracting C:/tmp/data\train-images-idx3-ubyte.gz
Extracting C:/tmp/data\train-labels-idx1-ubyte.gz
Extracting C:/tmp/data\t10k-images-idx3-ubyte.gz
Extracting C:/tmp/data\t10k-labels-idx1-ubyte.gz
Traceback (most recent call last):
File "<ipython-input-11-bf503334b166>", line 1, in <module>
runfile('C:/Users/user/.spyder-py3/temp.py', wdir='C:/Users/CwWJc/.spyder-py3')
File "C:\Users\user\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:\Users\user\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/user/.spyder-py3/temp.py", line 38, in <module>
sess.run(gd_step, feed_dict={x: batch_xs, y_true: batch_ys})
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1149, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (100, 10) for Tensor
'Placeholder_15:0', which has shape '(?, 784)'
Any help is greatly appreciated. Sorry if I'm making a stupid mistake. I find that I often do, though. Thanks in advance! Also, sorry for garbage formatting. :)
Hahaha! I got y_true mixed up. Sorry for the hassle everyone.
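For anyone landing here with the same error: the mix-up is in the shape of the y_true placeholder. The labels are one-hot vectors over 10 classes, so the placeholder needs shape [None, 10], not [None, 784]. The sketch below mimics the shape check in plain Python; is_feedable is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def is_feedable(value_shape, placeholder_shape):
    """Hypothetical helper mimicking the placeholder shape check.

    A `None` dimension in the placeholder accepts any size.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))

batch_labels_shape = (100, 10)  # one-hot labels from next_batch(100)

wrong = is_feedable(batch_labels_shape, (None, 784))  # y_true as posted
right = is_feedable(batch_labels_shape, (None, 10))   # y_true sized for 10 classes
```

The (100, 10) batch from the error message only fits the second placeholder shape.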

Evaluating TF model inside a TF op throws error

I am using TensorFlow 2. I am trying to optimize a function which uses the loss of a trained tensorflow model (poison).
@tf.function
def totalloss(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel*np.ones(xt.shape[0])
    loss1 = poison.evaluate(xt, label, steps=1)
    loss2 = tf.linalg.norm(m, 1)
    return loss1 + loss2
I am not able to execute this function; however, when I comment out the @tf.function line, the function works!
I need to use this function as a tensorflow op so as to optimize 'm' & 'd'.
Value Error: Unknown graph. Aborting.
This is how I am defining the model and variables:
# mask
m = tf.Variable(tf.zeros(shape=(1, 784)), name="m")
d = tf.Variable(tf.zeros(shape=(1, 784)), name="d")
# target
targetlabel = 6
poison = fcn()
poison.load_weights("MNISTP.h5")
adam = tf.keras.optimizers.Adam(lr=.002, decay=1e-6)
poison.compile(optimizer=adam, loss=tf.losses.sparse_categorical_crossentropy)
This is how I am calling the function later (executing this line results in the error listed below; however, if I comment out the @tf.function line, this command works!):
loss = totalloss(ptestdata)
This is the entire traceback call:
ValueError: in converted code:
<ipython-input-52-4841ad87022f>:5 totalloss *
loss1 = poison.evaluate(xt, label, steps=1)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:746 evaluate
use_multiprocessing=use_multiprocessing)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:693 evaluate
callbacks=callbacks)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:187 model_iteration
f = _make_execution_function(model, mode)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_arrays.py:555 _make_execution_function
return model._make_execution_function(mode)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2034 _make_execution_function
self._make_test_function()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2010 _make_test_function
**self._function_kwargs)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:3544 function
return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py:3429 __init__
raise ValueError('Unknown graph. Aborting.')
ValueError: Unknown graph. Aborting.
The purpose of the @tf.function decorator is to convert TensorFlow operations written in Python into a TensorFlow graph to achieve better performance. The error might occur because you are using a pre-trained model with a serialized graph, so the decorator cannot make the graph-to-graph conversion.
I've reported this error here: https://github.com/tensorflow/tensorflow/issues/33997
A (temporary) solution is to split your loss function into two smaller functions. The decorator should only be applied to the function that does not include the pre-trained model. This way, you still get better performance for the other operations, just not for the part that uses the pre-trained model.
For example:
@tf.function
def _other_ops(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel * np.ones(xt.shape[0])
    loss2 = tf.linalg.norm(m, 1)
    return xt, label, loss2
def total_loss(x):
    xt, label, loss2 = _other_ops(x)
    loss1 = poison.evaluate(xt, label, steps=1)
    return loss1 + loss2
Update:
According to the discussion in the above TF issue link, an elegant solution is to manually pass the input through each layer of the model. You could get a list of layers in your model by calling your_model.layers
In your case, you might calculate the loss from the prediction of your output with the label in the last layer. Thus, I think you should skip the last layer and calculate the loss outside of the loop:
@tf.function
def totalloss(x):
    xt = tf.multiply(x, (1.0 - m)) + tf.multiply(m, d)
    label = targetlabel*np.ones(xt.shape[0])
    feat = xt
    # Skip the last layer which calculates loss1
    for i in range(len(poison.layers) - 1):
        layer = poison.layers[i]
        feat = layer(feat)
    # Now, calculate loss by yourself (the argument order is y_true, y_pred)
    loss1 = tf.keras.losses.sparse_categorical_crossentropy(label, feat)
    loss2 = tf.linalg.norm(m, 1)
    return loss1 + loss2
The explanation the TF engineers give for this issue is that a model might wrap high-level processing that @tf.function is not guaranteed to handle. So, putting a model inside a function decorated with @tf.function is not recommended. Thus, we need to break the model into smaller pieces to bypass it.
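To spell out what "calculate the loss by yourself" amounts to numerically, here is a small plain-Python sketch of the two terms. It assumes the model output for one sample is a list of class probabilities; the values of probs, target_label and mask are made up, and the functions only mirror the math, not the TensorFlow API.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """-log of the probability assigned to the true class (an integer index)."""
    return -math.log(probs[label])

def l1_norm(values):
    """Sum of absolute values, i.e. the vector 1-norm."""
    return sum(abs(v) for v in values)

probs = [0.1, 0.2, 0.7]     # assumed softmax output for one sample
target_label = 2            # integer class index, like targetlabel
mask = [0.5, -0.25, 0.0]    # stand-in for the mask variable m

loss1 = sparse_categorical_crossentropy(target_label, probs)
loss2 = l1_norm(mask)
total = loss1 + loss2
```

Note the order of arguments in the first function: true label first, prediction second.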

Confusion Matrix with Tensorflow

I am using the finetune AlexNet architecture written by @kratzert on my own dataset, which works properly (I got the code from here: https://github.com/kratzert/finetune_alexnet_with_tensorflow), and I want to figure out how to build a confusion matrix from his code. I have tried to use tf.confusion_matrix(labels, predictions, num_classes) to build the confusion matrix, but I can't. I am confused about what the values for labels and predictions should be. I mean, I know what they should be, but each time I feed these values I get an error. Can anyone help me with this, or have a look at the code (link above) and guide me?
I added these two lines in finetune.py exactly after calculating accuracy to make the labels and the predictions as the number of the class.
with tf.name_scope("accuracy"):
    correct_pred = tf.equal(tf.argmax(score, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
    # the two added lines:
    true_class = tf.argmax(y, 1)
    predicted_class = tf.argmax(score, 1)
and I have added tf.confusion_matrix() inside my session at the very bottom before saving checkpoint of the model
for _ in range(val_batches_per_epoch):
    img_batch, label_batch = sess.run(next_batch)
    acc, cost = sess.run([accuracy, loss], feed_dict={x: img_batch,
                                                      y: label_batch,
                                                      keep_prob: 1.})
    test_acc += acc
    test_count += 1
test_acc /= test_count
print("{} Validation Accuracy = {:.4f} -- Validation Loss = {:.4f}".format(datetime.now(),test_acc, cost))
print("{} Saving checkpoint of model...".format(datetime.now()))
# the added line:
print(sess.run(tf.confusion_matrix(true_class, predicted_class, num_classes)))
# save checkpoint of the model
checkpoint_name = os.path.join(checkpoint_path,
                               'model_epoch'+str(epoch+1)+'.ckpt')
save_path = saver.save(sess, checkpoint_name)
print("{} Model checkpoint saved at {}".format(datetime.now(),
                                               checkpoint_name))
I have tried other places as well but each time I will get an error:
Caused by op 'Placeholder_1', defined at:
File "/home/armin/Desktop/Alexnet_DataPipeline/finetune.py", line 85, in <module>
y = tf.placeholder(tf.float32, [batch_size, num_classes])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1777, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 4521, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_1' with dtype float and shape [128,3]
any help will be appreciated, Thanks.
It's a fairly long piece of code you're referring to, and you did not specify where you put your confusion matrix line.
Just by experience, the most frequent problem with confusion matrices is that tf.confusion_matrix() requires both the labels and the predictions as the number of the class, not as one-hot vectors. In other words, the label and the prediction should be in the form of the number 5 instead of [ 0, 0, 0, 0, 0, 1, 0, 0, 0, 0 ].
In the code you refer to, y is in the one-hot format. The output of the network, score is a vector, giving the probability of each class. That is also not the required format. You need to do something like
true_class = tf.argmax( y, 1 )
predicted_class = tf.argmax( score, 1 )
and use those with the confusion matrix like
tf.confusion_matrix( true_class, predicted_class, num_classes )
(Basically, if you take a look at line 123 of finetune.py, that has both of those elements for determining accuracy, but they are not saved in separate tensors.)
If you want to keep a running total of confusion matrices of all batches, you just have to add them up - since each cell of the matrix counts the number of examples falling into that category, an element-wise addition creates the confusion matrix for the whole set:
# Initialize once, before looping over the batches
cm_running_total = None
# For each batch:
cm_numpy_array = sess.run(tf.confusion_matrix(true_class, predicted_class, num_classes), feed_dict={x: img_batch, y: label_batch, keep_prob: 1.} )
if cm_running_total is None:
    cm_running_total = cm_numpy_array
else:
    cm_running_total += cm_numpy_array
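The two ideas above (class indices instead of one-hot vectors, and summing per-batch matrices) can be checked without TensorFlow. The following is a plain-Python sketch with made-up batches; argmax, confusion_matrix and add_matrices are small stand-ins written for this example, with rows indexing the true class and columns the predicted class.

```python
def argmax(row):
    """Index of the largest entry, i.e. the class number for one example."""
    return max(range(len(row)), key=lambda i: row[i])

def confusion_matrix(true_classes, predicted_classes, num_classes):
    """cm[t][p] counts examples of true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_classes, predicted_classes):
        cm[t][p] += 1
    return cm

def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

num_classes = 3
batches = [
    # (one-hot labels, network scores) -- hypothetical data
    ([[1, 0, 0], [0, 1, 0]], [[0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]),
    ([[0, 0, 1], [0, 1, 0]], [[0.2, 0.2, 0.6], [0.1, 0.7, 0.2]]),
]

cm_running_total = None
for labels, scores in batches:
    true_class = [argmax(row) for row in labels]        # one-hot -> class number
    predicted_class = [argmax(row) for row in scores]   # scores -> class number
    cm = confusion_matrix(true_class, predicted_class, num_classes)
    if cm_running_total is None:
        cm_running_total = cm
    else:
        cm_running_total = add_matrices(cm_running_total, cm)
```

Since every cell is just a count, the element-wise sum over batches is the confusion matrix of the whole set.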

Applying gradients with feed_dict for gradients

I want to do some non-tensorflow processing on the computed gradients, before applying them on the variables.
My plan was to run the gradient ops that I get from the compute_gradients function, do my processing (in Python, without TensorFlow), and then run the apply operation I get from the apply_gradients function, feeding the processed gradients in the feed_dict. Unfortunately, this doesn't work in my scenario.
I managed to narrow it down to some issue with tf.nn.embedding_lookup (same happens with tf.gather), and the error can be reproduced as follows (using tf1.4):
import tensorflow as tf
x = tf.placeholder(dtype=tf.float32, shape=[])
z = tf.placeholder(dtype=tf.int32, shape=[])
emb_mat = tf.get_variable('w', [100, 5], initializer=tf.truncated_normal_initializer(stddev=0.1))
emb = tf.nn.embedding_lookup(emb_mat, z)
loss = x - tf.reduce_sum(emb) # Just some silly loss
opt = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = opt.compute_gradients(loss, tf.trainable_variables())
train_op = opt.apply_gradients(grads_and_vars)
grads = [g for g,v in grads_and_vars]
tsess = tf.Session()
tsess.run(tf.global_variables_initializer())
gradsres = tsess.run(grads, {x: 1.0, z: 1})
tsess.run(train_op, {g:r for g,r in zip(grads, gradsres)})
which results in the error
Traceback (most recent call last):
File "/home/cruvadom/.p2/pool/plugins/org.python.pydev_6.0.0.201709191431/pysrc/_pydevd_bundle/pydevd_exec.py", line 3, in Exec
exec exp in global_vars, local_vars
File "<console>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1098, in _run
raise ValueError('Tensor %s may not be fed.' % subfeed_t)
ValueError: Tensor Tensor("gradients/Gather_1_grad/ToInt32:0", shape=(2,), dtype=int32, device=/device:GPU:0) may not be fed.
It seems there is some additional tensor I need to feed to the graph for the computation. What is the right way to do what I want to do?
Thanks!
If you run the training operation, it will automatically calculate the gradients. You can retrieve the gradients from the session by fetching them in the same run call:
import numpy as np
tsess = tf.Session()
tsess.run(tf.global_variables_initializer())
# Fetch under new names so the graph tensors `grads` and `loss` are not overwritten
_, grad_values, loss_value = tsess.run([train_op, grads, loss], {x: 1.0, z: 1})
assert not np.isnan(loss_value), 'Something wrong! loss is nan...'
# Get the gradients: build summaries from the symbolic (gradient, variable) pairs
for g, v in grads_and_vars:
    if g is not None:
        grad_hist_summary = tf.summary.histogram("{}/grad_histogram".format(v.name), g)
        sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name), tf.nn.zero_fraction(g))
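For the original goal (compute the gradients, process them in Python, then apply them), it may help to look at what the gradient of an embedding lookup looks like. Judging by its name and shape, the tensor in the error message appears to belong to a sparse gradient (row indices plus values), which is an assumption on my part. The plain-Python sketch below turns such a sparse gradient into a dense one, processes it, and applies it; densify, process and apply_gradient are hypothetical helpers, and the clipping is only a stand-in for whatever processing is needed.

```python
def densify(indices, values, num_rows, num_cols):
    """Scatter a sparse gradient (row indices + row values) into a dense matrix."""
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for idx, row in zip(indices, values):
        for j, v in enumerate(row):
            dense[idx][j] += v
    return dense

def process(grad, clip=0.5):
    """Stand-in for the Python-side processing: clip every entry to [-clip, clip]."""
    return [[max(-clip, min(clip, g)) for g in row] for row in grad]

def apply_gradient(weights, grad, learning_rate):
    """Plain gradient descent step: w <- w - learning_rate * g."""
    return [[w - learning_rate * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grad)]

# Toy embedding matrix: 4 rows, 2 columns, all zeros (hypothetical sizes)
emb_mat = [[0.0, 0.0] for _ in range(4)]

# loss = x - sum(emb_mat[z]) with z = 1: only row 1 receives a gradient of -1
sparse_indices = [1]
sparse_values = [[-1.0, -1.0]]

dense_grad = densify(sparse_indices, sparse_values, num_rows=4, num_cols=2)
processed = process(dense_grad)
updated = apply_gradient(emb_mat, processed, learning_rate=0.1)
```

Working with a dense matrix of the same shape as the variable keeps the feed on the apply side to a single array per variable.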

Custom loss function: perform a model.predict on the data in y_pred

I am training a network to denoise images, for this I am using the CIFAR10 dataset. I am trying to generate a custom loss function so that the loss is mse / classification_accuracy.
Given that my network receives 32x32 (noisy) images as input and predicts 32x32 (denoised) images, I am assuming that y_pred and y_true would be arrays of 32x32 images. Thus my custom loss function looks like this:
def custom_loss():
    def joint_optimized_loss(y_true, y_pred):
        mse = K.mean(K.square(y_pred - y_true), axis=-1)
        preds = classif_model.predict(y_pred)
        correctPreds = 0
        totPreds = 0
        for pred in preds:
            predictedClass = pred.index(max(pred))
            totPreds += 1
            if predictedClass == currentClass:
                correctPreds += 1
        classifAccuracy = correctPreds / totPreds
        loss = mse / classifAccuracy
        return loss
    return joint_optimized_loss
myModel.compile(optimizer='adadelta', loss=custom_loss())
classif_model is a pre-trained model that classifies CIFAR10 images into one of the 10 classes. It receives an array of 32x32 images.
However when I run my code I get the following error:
Traceback (most recent call last):
File "myCode.py", line 94, in
myModel.compile(optimizer='adadelta', loss=custom_loss())
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/engine/training.py",
line 850, in compile
sample_weight, mask)
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/engine/training.py",
line 450, in weighted
score_array = fn(y_true, y_pred)
File "myCode.py", line 57, in joint_optimized_loss
preds = classif_model.predict(y_pred)
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/models.py",
line 913, in predict
return self.model.predict(x, batch_size=batch_size, verbose=verbose)
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/engine/training.py",
line 1713, in predict
verbose=verbose, steps=steps)
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/engine/training.py",
line 1260, in _predict_loop
batches = _make_batches(num_samples, batch_size)
File "/home/rvidalma/anaconda2/envs/tensorUpdated/lib/python2.7/site-packages/keras/engine/training.py",
line 374, in _make_batches
num_batches = int(np.ceil(size / float(batch_size)))
AttributeError: 'Dimension' object has no attribute 'ceil'
I think this has something to do with the fact that y_true and y_pred are both tensors that, before training, are empty thus classif_model.predict fails as it is expecting an array. However I am not sure on how to fix this...
I tried getting instead the value of y_pred using K.get_value(y_pred), but that gives me the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape
[-1,32,32,3] has negative dimensions [[Node: input_1 =
Placeholderdtype=DT_FLOAT, shape=[?,32,32,3],
_device="/job:localhost/replica:0/task:0/cpu:0"]]
You cannot use accuracy as a loss function, as it is not differentiable. This is why differentiable surrogate losses such as the cross-entropy (which, up to scaling, upper-bounds the misclassification error) are used instead.
Additionally, the way you implemented accuracy is non-symbolic. You should use only functions from keras.backend to implement a loss for it to work properly.
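A quick way to see the differentiability problem is to nudge a logit and watch both quantities. The plain-Python sketch below uses made-up logits and a simple forward finite difference (finite_diff is a helper written for this illustration): accuracy does not move at all, so it gives the optimizer no direction, while the cross-entropy changes smoothly.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def accuracy(logits, label):
    """1.0 if the top-scoring class is the label, else 0.0."""
    predicted = max(range(len(logits)), key=lambda i: logits[i])
    return 1.0 if predicted == label else 0.0

def cross_entropy(logits, label):
    """-log of the softmax probability of the true class."""
    return -math.log(softmax(logits)[label])

def finite_diff(fn, logits, label, index, eps=1e-4):
    """Forward-difference slope of fn with respect to one logit."""
    bumped = list(logits)
    bumped[index] += eps
    return (fn(bumped, label) - fn(logits, label)) / eps

logits, label = [2.0, 1.0, 0.1], 0   # hypothetical classifier output
acc_slope = finite_diff(accuracy, logits, label, index=0)
ce_slope = finite_diff(cross_entropy, logits, label, index=0)
```

Here acc_slope is exactly zero, whereas ce_slope is close to softmax(logits)[0] - 1, the usual cross-entropy gradient with respect to the true-class logit.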
I had almost the same problem, and I tried this and it worked for me.
Instead of:
preds = classif_model.predict(y_pred)
try:
preds = classif_model(y_pred)
I am not sure about the reason, but I think it is because model.predict(y) needs a batch size, and at compile time we don't have one, so we cannot use model.predict(y).
Please correct me if this is wrong.