I've made a new op and I'd like to use it with AdamOptimizer. I've created a gradient for it following the instructions here and added it to my optimizer's var_list, but TensorFlow says that my variable doesn't have a processor.
Is there support for TensorFlow custom ops in optimizers?
Does the optimizer class let me create a new processor, or would I have to rewrite part of compute_gradients?
Also, what does automatic differentiation mean, as stated in the TF docs:
To make automatic differentiation work for new ops, you must register a gradient function which computes gradients with respect to the ops' inputs given gradients with respect to the ops' outputs.
Thanks!
So I found out that what I was doing is not supported by the TensorFlow optimizers.
I was trying to create an op that would act like a TensorFlow variable (i.e. get updated by the functions within Optimizer::minimize()). However, I believe that TF does something with processors and Eigen::Tensors that I don't fully understand in order to apply gradient updates in minimize(), and naturally this doesn't work with Op classes.
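As a concrete illustration of what the optimizer expects, here is a minimal TF 1.x sketch. Everything in var_list must be a tf.Variable; an op output is a plain tensor, has no registered processor, and triggers the error above.

import tensorflow as tf

# Optimizer.minimize() only updates tf.Variable objects; each variable type
# has a registered "processor" that applies the update.
w = tf.Variable(tf.zeros([3]), name="w")
x = tf.placeholder(tf.float32, [3])
loss = tf.reduce_sum(tf.square(w - x))

opt = tf.train.AdamOptimizer(0.01)
grads_and_vars = opt.compute_gradients(loss, var_list=[w])  # OK: w is a Variable
train_op = opt.apply_gradients(grads_and_vars)

# Passing an op output instead fails, e.g.:
# opt.compute_gradients(loss, var_list=[tf.square(w)])  # plain tensor, no processor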
Related
I want to add a constraint term to my loss function. Computing this constraint term requires a numpy array as input, so I cannot define it as a tensor-type graph node in TensorFlow. How can I define this part in the graph so that it takes part in the network optimization?
Operations done on numpy arrays cannot be automatically differentiated in TensorFlow. Since you are using this computation as part of the loss computation, I assume you want to differentiate it. In this case, your best option is probably to reimplement the constraint in TensorFlow. The only other approach I can think of is to use autograd in conjunction with TF. This seems possible - something along the lines of: evaluate part of the graph with TF, get numpy arrays out, call your function under autograd, get the gradients, and feed them back into TF - but it will likely be harder and slower.
If you are reimplementing it in TF, most numpy operations have direct one-to-one counterparts in TF. If the implementation uses a lot of control flow (which can be painful in classic TF), you can use eager execution or py_func. A sketch of both routes follows.
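Here is a minimal sketch of the two options; np_constraint is a hypothetical stand-in for your numpy-based term. Note that py_func ops have no gradient registered by default, so the wrapped version will not drive the optimization.

import numpy as np
import tensorflow as tf

def np_constraint(x):
    # hypothetical numpy logic
    return np.float32(np.sum(np.abs(x)))

x = tf.placeholder(tf.float32, [None, 4])
wrapped = tf.py_func(np_constraint, [x], tf.float32)  # in-graph, but no gradient

# Reimplemented with TF ops - differentiable end to end:
constraint = tf.reduce_sum(tf.abs(x))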
I am confused about which types of operations are supported for automatic differentiation in TF. Concretely, is a tensor indexing operation like the following supported?
...
# feat is the output of some conv layer and its shape is B*H*W*C
# case one
loss = feat[:, 1:, 1:, :] - feat[:, :-1, :-1, :]
# case two
feat[:, 1:, 1:, :] = feat[:, 1:, 1:, :] / 2.  # assign and replace part of the original values
loss = tf.reduce_sum(feat)
This isn't a direct answer, but as a clue: the automatic differentiation library autograd lists operations that are not supported; see Non-differentiable functions - for example, floor() and round() are not auto-differentiable.
One can also define their own operations, provided you can code the gradients yourself; see extend-autograd-by-defining-your-own.
I would guess TF is very similar to this.
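A quick empirical check in TF itself (a minimal sketch): build the op and ask tf.gradients for a gradient. Slicing (case one) is differentiable; item assignment on a tensor (case two) is not even valid graph code - you would rebuild the tensor with ops like tf.concat or tf.where instead.

import tensorflow as tf

feat = tf.placeholder(tf.float32, [None, 8, 8, 3])
loss = tf.reduce_sum(feat[:, 1:, 1:, :] - feat[:, :-1, :-1, :])
grad = tf.gradients(loss, feat)[0]  # succeeds, so case one is supported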
Maybe I'm missing the forest for the trees, but I think the description of the Keras backend function gradients() is wrong (see here).
In my opinion it should be the other way around, i.e.
Returns the gradients of loss w.r.t. variables.
(instead of Returns the gradients of variables w.r.t. loss.)
This would also match the TensorFlow description of tf.gradients() (see here), which is used inside Keras' gradients().
Do you agree?
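A minimal sketch supporting this reading: K.gradients(loss, variables) delegates to tf.gradients(loss, variables), i.e. it returns d(loss)/d(variable) for each variable - the gradients of loss w.r.t. variables.

from keras import backend as K

x = K.placeholder(shape=(None, 2))
w = K.variable([[1.0], [2.0]])
loss = K.sum(K.dot(x, w))
grads = K.gradients(loss, [w])  # a list with one tensor of shape (2, 1)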
In order to use the contrib.learn.Estimator for multi-GPU training, I am attempting to specify GPU assignments in my model_fn.
In pseudo-code:
def model_fn(X, y):
    with tf.device('/gpu:1'):
        ... various TensorFlow ops for the model ...
    return predictions, loss, train_op
Everything works fine without the tf.device('/gpu:1') call, but with it I encounter the following error:
InvalidArgumentError (see above for traceback): Cannot assign a device to
node 'save/ShardedFilename_1': Could not satisfy explicit device
specification '/device:GPU:1' because no supported kernel
for GPU devices is available.
I do not believe that I am adding the offending op to the graph myself, but rather that it is injected through the Estimator's snapshot functionality.
I believe that the solution is to set allow_soft_placement=True so that ops without GPU kernels fall back to the CPU, but it's not obvious to me how that is exposed when dealing with contrib.learn.Estimator.
I see that the option is usually set in a ConfigProto and passed to the session, but I've been using the Estimator's functionality to manage the session for me. Should I be taking control of session creation, or am I missing a parameter somewhere to accomplish this?
Many thanks in advance for any advice.
With Estimator moving out of contrib in TensorFlow 1.0, this is fixed.
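A sketch of the post-contrib pattern (assuming a TF 1.x version where tf.estimator.RunConfig accepts session_config): pass a ConfigProto with allow_soft_placement=True so that ops without a GPU kernel, such as the checkpoint ops, fall back to the CPU.

import tensorflow as tf

session_config = tf.ConfigProto(allow_soft_placement=True)
run_config = tf.estimator.RunConfig(session_config=session_config)

estimator = tf.estimator.Estimator(
    model_fn=model_fn,  # your model_fn containing tf.device('/gpu:1')
    config=run_config)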
For my application, I was able to create a new function using only predefined ops. Is there any need to define a new op in this case?
The pseudocode for my function is:
def myGauss(arg, arg2):
    # Here I only used predefined TensorFlow operations
    ...

z1 = myGauss(arg, arg2)
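For illustration, a hypothetical filling-in of the sketch (the actual body of myGauss is not shown above; this particular Gaussian and its arguments are assumptions):

import tensorflow as tf

def myGauss(arg, arg2):
    # an unnormalized Gaussian of arg with standard deviation arg2,
    # built purely from predefined ops
    return tf.exp(-tf.square(arg) / (2.0 * tf.square(arg2)))

arg = tf.placeholder(tf.float32, [None])
arg2 = tf.constant(1.0)
z1 = myGauss(arg, arg2)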
If you can achieve what you set out to do with a composition of existing ops, then that's great! You don't need to create a new op.
There are circumstances when we've found it necessary to create a new op, however:
Sometimes you can gain performance by fusing ops together into a single op. For example, many of the "training" ops have fused implementations, even though they were initially implemented using simple ops.
Another example is when you want to define a gradient for a composition of ops, because it's more efficient or numerically stable to consider the expression as a whole. This is the rationale for ops like tf.nn.softmax_cross_entropy_with_logits() (see the sketch below).
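A minimal sketch of the stability rationale: the naive composition can produce log(0) or overflow for large logits, while the fused op computes the same quantity (and its gradient) in a numerically stable way.

import tensorflow as tf

logits = tf.placeholder(tf.float32, [None, 10])
labels = tf.placeholder(tf.float32, [None, 10])

# naive composition of existing ops - unstable for large logits:
naive = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=1)

# fused op with its own registered gradient:
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)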