I'm working on a network with 4 class-specific autoencoders (3-layer feedforward), and within the training iteration there is a case check to decide which autoencoder has to be updated:
def f(k):
    return (tf.train.AdamOptimizer(learning_rate=lernrate).minimize(Cost_List[k]),
            n_List[k].assign_add(1.0),
            Cost_List[k])
def g(): ???
nothing = g()
min_index = tf.argmin(Cost_List, 0)
Case_0 = (tf.equal(min_index,0), lambda: f(0))
Case_1 = (tf.equal(min_index,1), lambda: f(1))
Case_2 = (tf.equal(min_index,2), lambda: f(2))
Case_3 = (tf.equal(min_index,3), lambda: f(3))
Case_List = [Case_0, Case_1, Case_2, Case_3]
[optimizer, update, cost] = tf.case(Case_List, nothing)
If no condition is fulfilled, nothing should be done. In this scenario one of the four cases will always be realized, so it's not a practical problem here yet, but I want to modify the code, and then it will be important that the training sample is skipped in the default case. The problem is that the return type(s) of f_default have to match the return types of all the other branches, because sess.run([optimizer, update, cost]) expects a certain type. How can I make it so that really nothing happens in the default case? I have already tried tf.no_op(), but that's not working...
Thanks,
Meridius
To make the signatures match, you could define g() as follows:
def g():
    return tf.no_op(), tf.no_op(), tf.constant(0.0)
Note that it would be slightly more efficient to pass g directly as f_default (as opposed to passing g(), as the current code does), but the behaviour should be the same.
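To make that concrete, here is a minimal sketch of the full wiring with g passed directly as the default (the list comprehension and the lambda k=k idiom are just a compact way to build the pairs; everything else follows the question's code):
min_index = tf.argmin(Cost_List, 0)
# lambda k=k avoids the late-binding pitfall when creating lambdas in a loop.
Case_List = [(tf.equal(min_index, k), lambda k=k: f(k)) for k in range(4)]
# Pass g itself rather than g(), so tf.case constructs the default branch.
optimizer, update, cost = tf.case(Case_List, default=g)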
Goal: I want to train a PPO agent on a problem and determine its optimal value function for a range of observations. Later I plan to work with this value function (economic inequality research). The problem is sufficiently complex that dynamic programming techniques no longer work.
Approach: In order to check whether I get correct outputs for the value function, I have trained PPO on a simple problem whose analytical solution is known. However, the results for the value function are rubbish, which is why I suspect that I have done something wrong.
The code:
from keras import backend as k_util
...
parser = argparse.ArgumentParser()
# Define framework to use
parser.add_argument(
    "--framework",
    choices=["tf", "tf2", "tfe", "torch"],
    default="tf",
    help="The DL framework specifier.",
)
...
def get_rllib_config(seeds, debug=False, framework="tf") -> Dict:
...
def get_value_function(agent, min_state, max_state):
    policy = agent.get_policy()
    value_function = []
    for i in np.arange(min_state, max_state, 1):
        model_out, _ = policy.model({"obs": np.array([[i]], dtype=np.float32)})
        value = k_util.eval(policy.model.value_function())[0]
        value_function.append(value)
        print(i, value)
    return value_function
def train_schedule(config, reporter):
    rllib_config = config["config"]
    iterations = rllib_config.pop("training_iteration", 10)
    agent = PPOTrainer(env=rllib_config["env"], config=rllib_config)
    for _ in range(iterations):
        result = agent.train()
        reporter(**result)
    values = get_value_function(agent, 0, 100)
    print(values)
    agent.stop()
...
resources = PPO.default_resource_request(exp_config)
tune_analysis = tune.Tuner(tune.with_resources(train_schedule, resources=resources), param_space=exp_config).fit()
ray.shutdown()
So first I get the policy (policy = agent.get_policy()) and run a forward pass with each of the 100 values (model_out, _ = policy.model({"obs": np.array([[i]], dtype=np.float32)})). Then, after each forward pass I use the value_function() method to get the output of the critic network and evaluate the tensor via keras backend.
The results:
[Figures: the true VF (analytical solution) vs. the VF output of RLlib]
Unfortunately you can see that the results are not that promising. Maybe I have missed a pre- or postprocessing step? Does the value_function() method even return the last layer of the critic network?
I am very grateful for any help!
It's not part of your script, but I assume that you have trained the policy before you attempt to get useful values out of it.
You are correct in assuming that the value_function() returns the output of the last layer of the critic network in RLlib's implementations.
Have a look at the value function metrics to see if it's actually learning anything (RLlib logs .../learner_stats/vf_loss and .../learner_stats/vf_explained_var)!
After training the model, I'd also try to query the model directly. If that looks better, something is likely off with the code you posted here.
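For example, inside train_schedule you could print those two metrics after every agent.train() call. This is only a sketch: the nesting of the result dict (info/learner/default_policy/learner_stats) is an assumption about your RLlib version, so print(result) once to confirm the keys:
for _ in range(iterations):
    result = agent.train()
    # Assumed result-dict layout; verify against print(result).
    stats = result["info"]["learner"]["default_policy"]["learner_stats"]
    print("vf_loss:", stats.get("vf_loss"),
          "vf_explained_var:", stats.get("vf_explained_var"))
    reporter(**result)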
I am trying to create a custom gradient in tensorflow to implement the exponentially smoothed (unbiased) gradient of a logarithm that is suggested in this paper (https://arxiv.org/pdf/1801.04062.pdf). What I need to do is create a new variable that stores an exponentially smoothed value, which is updated and used in a custom gradient function. Additionally, I need a flag which tells me when the first gradient calculation is being done, so I can initialize the exponentially smoothed value to the appropriate (data-dependent) value. Furthermore, the output of the custom gradient function must be just the gradient, so it will be a pain in the butt to access the output of a tf.assign from inside the custom gradient. Lastly, I do not want to create a second operation that 'manually' initializes the exponential smoothing by running it separately in my training loop. Anyway, this is all too complicated, so I have an abstract but simple problem outlined below, the solution to which would solve my problem:
What I need to be able to do is update one variable in a manner which is conditional upon a second, and furthermore I need to update the second variable without providing it as explicit output by my function. Example code demonstrating my problem is below:
import tensorflow as tf
a = tf.get_variable(name = "test",initializer=True)
b = tf.get_variable(name = "testval",initializer = 10.)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
def make_function(inp):
    with tf.variable_scope("", reuse=True):
        a = tf.get_variable(name="test", dtype=tf.bool)
        b = tf.get_variable(name="testval")
        iftrue = lambda: [tf.assign(b, inp), tf.assign(a, False)]
        iffalse = lambda: [tf.assign(b, (b + inp) / 2), tf.assign(a, False)]
        acond, bcond = tf.cond(a, iftrue, iffalse)
        return acond
I = tf.placeholder(tf.float32)
tcond = make_function(I)
print("{}\tThe initial values of a and b".format(sess.run([a,b])))
print("{}\t\tRun, tcond1. output is the updated value of b.".format(sess.run(tcond,{I:1})))
print("{}\tNow we see that b has been updated, but a has not.".format(sess.run([a,b])))
print("{}\t\tSo now the value is 2 instead of 1.5 like it should be.".format(sess.run(tcond,{I:2})))
The output is:
[True, 10.0] The initial values of a and b
1.0 Run, tcond1. output is the updated value of b.
[True, 1.0] Now we see that b has been updated, but a has not.
2.0 So now the value is 2 instead of 1.5 like it should be.
Now, I understand that I need to have a line like sess.run(acond) where acond is the output of the conditional within make_function, but I can't return that because my function needs to only return the value of b (not a), and I don't want to have to carry around an extra op that I need to remember to run on the first training iteration, but not on the others.
So, is there a way to add the assignment op acond to the computational graph without explicitly returning it and running it with sess.run?
Add this operation to a custom collection and then create a dependency between your final op (e.g. the train_op) and acond.
Inside the method:
tf.add_to_collection("to_run", acond)
In the definition of the final op:
to_run = tf.get_collection("to_run")
with tf.control_dependencies(to_run):
    final_op = <something>
When you run final_op, you are assured that acond has already been executed.
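Applied to the question's make_function, a minimal sketch (here bcond, the flag assignment that is not returned, is the op added to the collection):
# Inside make_function, right after the tf.cond line:
#     tf.add_to_collection("to_run", bcond)

tcond = make_function(I)
to_run = tf.get_collection("to_run")
with tf.control_dependencies(to_run):
    result_op = tf.identity(tcond)  # depends on the collected assigns

print(sess.run(result_op, {I: 1}))  # updates b and the flag a together
print(sess.run([a, b]))             # a is False after the first run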
The weights retrieved from the restored model don't change, and the input is also constant. But the output of the 'Relu:0' operation gives different results each time.
Below is my code:
sess=tf.Session()
saver = tf.train.import_meta_graph('checkpoints/checkpoints_otherapproach_1/cameranetwork_RAID_CNN-3100.meta')
saver.restore(sess,tf.train.latest_checkpoint(checkpoint_dir='checkpoints/checkpoints_otherapproach_1/'))
images = tf.get_default_graph().get_tensor_by_name('images:0')
phase = tf.get_default_graph().get_tensor_by_name('phase:0')
Activ = tf.get_default_graph().get_tensor_by_name('network/siamese_model/convolution_1/conv_1/Relu:0')
image_array = np.zeros(shape = [1,3,128,64,3]) #*******
imagepath = 'RAiD_Dataset' + '/images_afterremoving_persons_notinallcameras/'+'test'+'/camera_'+str(1)
fullfile_name = imagepath+"/"+ 'camera_1_person_23_index_1.jpg'
image_array[0][0] = cv2.imread(fullfile_name)
image_array[0][1] = image_array[0][0]
image_array[0][2] = image_array[0][0]
image_array = image_array.astype(np.float32)
feed_dict_values ={images: image_array, phase:False}
temp2 = sess.run(Activ, feed_dict =feed_dict_values)
temp1 = sess.run(Activ, feed_dict =feed_dict_values)
print((temp1 == temp2).all())  # output is False
There are two possible reasons for this:
Some of the tensorflow ops inherit non-deterministic behavior from CUDA. This results in small numerical errors (which might be amplified by non-linearities). See this answer on how to try running your model on a single CPU thread; a sketch follows below. If the two arrays turn out to be identical under these conditions, then this is the cause.
I'm assuming that you know the graph you are loading, but the graph itself might produce inconsistent results 'by design' due to operations that deliberately introduce either randomness or non-constant data. For example, consider operations that use the random number generator, or operations that update variables (e.g., tf.assign) each time Activ is evaluated.
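To test the first possibility, here is a sketch of a single-threaded, CPU-only session configuration (these are standard tf.ConfigProto fields):
config = tf.ConfigProto(
    intra_op_parallelism_threads=1,  # no parallelism inside individual ops
    inter_op_parallelism_threads=1,  # run ops one at a time
    device_count={"GPU": 0},         # keep everything off CUDA
)
sess = tf.Session(config=config)
# Restore the checkpoint and evaluate Activ twice as before; if temp1 and
# temp2 now match, the non-determinism came from the first reason.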
Two parts to this question:
(1) What is the best way to update a subset of a tensor in tensorflow? I've seen several related questions:
Adjust Single Value within Tensor -- TensorFlow
and
How to update a subset of 2D tensor in Tensorflow?
and I'm aware that Variable objects can be assigned using Variable.assign() (and/or scatter_update, etc.), but it seems very strange to me that tensorflow does not have a more intuitive way to update a part of a Tensor object. I have searched through the tensorflow api docs and stackoverflow for quite some time now and can't seem to find a simpler solution than what is presented in the links above. This seems particularly odd, especially given that Theano has an equivalent version with Tensor.set_subtensor(). Am I missing something or is there no simple way to do this through the tensorflow api at this point?
(2) If there is a simpler way, is it differentiable?
Thanks!
I suppose the immutability of Tensors is required for the construction of a computation graph; you can't have a Tensor update some of its values without becoming another Tensor, or there would be nothing to put in the graph before it. The same issue comes up in Autograd.
It's possible to do this (but ugly) using boolean masks (make them variables and use assign, or even define them beforehand in numpy). That would be differentiable, but in practice I'd avoid having to update subtensors.
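As an illustration, a differentiable masked update via tf.where (a sketch with made-up values; a, new_vals, and mask are hypothetical names):
a = tf.constant([1.0, 2.0, 3.0, 4.0])
new_vals = tf.constant([10.0, 20.0, 30.0, 40.0])
mask = tf.constant([False, True, True, False])
# Builds a new tensor rather than mutating `a`; gradients flow to `a`
# where the mask is False and to `new_vals` where it is True.
updated = tf.where(mask, new_vals, a)  # -> [1., 20., 30., 4.]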
If you really have to (and I really hope there is a better way), here is a way to do it in 1D using tf.dynamic_stitch and tf.setdiff1d:
def set_subtensor1d(a, b, slice_a, slice_b):
    # a[slice_a] = b[slice_b]
    a_range = tf.range(a.shape[0])
    _, a_from = tf.setdiff1d(a_range, a_range[slice_a])
    a_to = a_from
    b_from, b_to = tf.range(b.shape[0])[slice_b], a_range[slice_a]
    return tf.dynamic_stitch([a_to, b_to],
                             [tf.gather(a, a_from), tf.gather(b, b_from)])
For higher dimensions this could be generalised by abusing reshape (where nd_slice could be implemented like this but there is probably a better way):
def set_subtensornd(a, b, slice_tuple_a, slice_tuple_b):
    # a[*slice_tuple_a] = b[*slice_tuple_b]
    a_range = tf.range(tf.reduce_prod(tf.shape(a)))
    a_idxed = tf.reshape(a_range, tf.shape(a))
    a_dropped = tf.reshape(nd_slice(a_idxed, slice_tuple_a), [-1])
    _, a_from = tf.setdiff1d(a_range, a_dropped)
    a_to = a_from
    b_range = tf.range(tf.reduce_prod(tf.shape(b)))
    b_idxed = tf.reshape(b_range, tf.shape(b))
    b_from = tf.reshape(nd_slice(b_idxed, slice_tuple_b), [-1])
    b_to = a_dropped
    a_flat, b_flat = tf.reshape(a, [-1]), tf.reshape(b, [-1])
    stitched = tf.dynamic_stitch([a_to, b_to],
                                 [tf.gather(a_flat, a_from), tf.gather(b_flat, b_from)])
    return tf.reshape(stitched, tf.shape(a))
I have no idea how slow this will be. I'd guess quite slow. And I haven't tested it much beyond running it on a couple of tensors.
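For reference, a quick sanity check of the 1D version (hypothetical values):
a = tf.constant([1., 2., 3., 4., 5.])
b = tf.constant([10., 20.])
# a[1:3] = b[0:2]
result = set_subtensor1d(a, b, slice(1, 3), slice(0, 2))
with tf.Session() as sess:
    print(sess.run(result))  # [ 1. 10. 20.  4.  5.]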
The (source code) documentation for tf.cond is unclear on whether the functions to be performed when the predicate is evaluated can have side effects or not. I've done some tests but I'm getting conflicting results. For example the code below does not work:
import tensorflow as tf
from tensorflow.python.ops import control_flow_ops
pred = tf.placeholder(tf.bool, [])
count = tf.Variable(0)
adder = count.assign_add(1)
subtractor = count.assign_sub(2)
my_op = control_flow_ops.cond(pred, lambda: adder, lambda: subtractor)
sess = tf.InteractiveSession()
tf.initialize_all_variables().run()
my_op.eval(feed_dict={pred: True})
count.eval() # returns -1
my_op.eval(feed_dict={pred: False})
count.eval() # returns -2
I.e. no matter what value the predicate evaluates to, both functions are getting run, and so the net result is a subtraction of 1. On the other hand, this code snippet does work, where the only difference is that I add new ops to the graph every time my_op is called:
pred = tf.placeholder(tf.bool, [])
count = tf.Variable(0)
my_op = control_flow_ops.cond(pred, lambda:count.assign_add(1), lambda:count.assign_sub(2))
sess = tf.InteractiveSession()
tf.initialize_all_variables().run()
my_op.eval(feed_dict={pred: False})
count.eval() # returns -2
my_op.eval(feed_dict={pred: True})
count.eval() # returns -1
Not sure why creating new ops every time works while the other case doesn't, but I'd obviously rather not be adding nodes as the graph will eventually become too big.
Your second version, where the assign_add() and assign_sub() ops are created inside the lambdas passed to cond(), is the correct way to do this. Fortunately, each of the two lambdas is only evaluated once, during the call to cond(), so your graph will not grow without bound.
Essentially what cond() does is the following:
Create a Switch node, which forwards its input to only one of two outputs, depending on the value of pred. Let's call the outputs pred_true and pred_false. (They have the same value as pred but that's unimportant since this is never directly evaluated.)
Build the subgraph corresponding to the if_true lambda, where all of the nodes have a control dependency on pred_true.
Build the subgraph corresponding to the if_false lambda, where all of the nodes have a control dependency on pred_false.
Zip together the lists of return values from the two lambdas, and create a Merge node for each of these. A Merge node takes two inputs, of which only one is expected to be produced, and forwards it to its output.
Return the tensors that are the outputs of the Merge nodes.
This means you can run your second version, and be content that the graph remains a fixed size, regardless of how many steps you run.
The reason your first version doesn't work is that, when a Tensor is captured (like adder or subtractor in your example), an additional Switch node is added to enforce the logic that the value of the tensor is only forwarded to the branch that actually executes. This is an artifact of how TensorFlow combines feed-forward dataflow and control flow in its execution model. The result is that the captured tensors (in this case the results of the assign_add and assign_sub) will always be evaluated, even if they aren't used, and you'll see their side effects. This is something we need to document better, and as Michael says, we're going to make this more usable in future.
The second case works because you have added the ops within the cond: this causes them to conditionally execute.
The first case is analogous to saying:
adder = (count += 1)
subtractor = (count -= 2)
if (cond) { adder } else { subtractor }
Since adder and subtractor are outside the conditional, they are always executed.
The second case is more like saying
if (cond) { adder = (count += 1) } else { subtractor = (count -= 2) }
which in this case does what you expected.
We realize that the interaction between side effects and (somewhat) lazy evaluation is confusing, and we have a medium-term goal to make things more uniform. But the important thing to understand for now is that we do not do true lazy evaluation: the conditional acquires a dependency on every quantity defined outside the conditional that is used within either branch.