Resetting graph in Tensorflow 2 - tensorflow

In my use case, I have time series data where, at each time t, I train a new model over a rolling window. In tensorflow 1, I had to do the following; otherwise models would accumulate in the default graph and essentially leak memory.
import tensorflow as tf
import keras.backend as K
...
tf.reset_default_graph()
K.clear_session()
In tensorflow 2, I've found the equivalent functions tf.compat.v1.reset_default_graph() and tf.keras.backend.clear_session(). However, according to the documentation, TF2 ties graph variables to python variables, so in theory, when a python variable is destroyed, the graph variable should be destroyed too. Is this interpretation correct? I've tried putting the model creation code in a loop. Memory usage still grows, but it is not the sort of explosion I witnessed in TF1.
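For reference, a minimal sketch of the kind of loop I mean is below. The synthetic series, the window size, and the tiny build_model helper are made up for illustration; only tf.keras.backend.clear_session() comes from the text above. Calling gc.collect() after dropping the reference is my own assumption about what might help, not something I have seen guaranteed in the documentation.

```python
import gc

import numpy as np
import tensorflow as tf

# Synthetic series and window size: placeholders for illustration only.
rng = np.random.default_rng(0)
series = rng.normal(size=(60, 3)).astype("float32")
targets = series.sum(axis=1, keepdims=True)
window = 20


def build_model(n_features):
    # Hypothetical tiny model; substitute the real architecture here.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


predictions = []
for t in range(window, len(series), 10):
    # Train a fresh model on the most recent `window` observations.
    x_win, y_win = series[t - window:t], targets[t - window:t]
    model = build_model(series.shape[1])
    model.fit(x_win, y_win, epochs=1, verbose=0)
    predictions.append(float(model.predict(series[t:t + 1], verbose=0)[0, 0]))

    # Drop the python reference, reset Keras' global state, then collect
    # any reference cycles that might still keep the old model alive.
    del model
    tf.keras.backend.clear_session()
    gc.collect()
```

Watching the process memory across iterations of a loop like this is how I would check whether anything still accumulates.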

Related

Display pytorch or tensorflow graph with a value at each node

In tensorboard, it is possible to plot the computational graph of a deep learning model.
Is it possible to display a value for each node (for example, the norm of the output)?
Is it possible to do it in both pytorch and tensorflow?
Example (display computational graph with torch.norm of output of each computational graph's node in vgg11):
import torch
import torchvision
vgg11 = torchvision.models.vgg11(pretrained=True)
image = torch.randn(8, 3, 224, 224)
out = vgg11(image)
So in the output node, we want the value in the computational graph to be
torch.norm(out)
One issue on the pytorch side is that there is no explicit computational graph to visualize (e.g. in pydot).
This is a poor question, but:
Yes, it is possible.
Access the underlying internal members via the _ or __ prefix and calculate the norm...
So yes it is possible, but the same code will not work across frameworks.
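On the pytorch side, one alternative to reaching into private members is to use forward hooks, which is a different technique from the one described above. The sketch below records torch.norm of each submodule's output. The small nn.Sequential model stands in for vgg11 so that no pretrained weights are needed, and the norms dictionary and make_hook helper are names invented for this illustration. It only collects the values; attaching them to a tensorboard graph is not covered here.

```python
import torch
import torch.nn as nn

# Small stand-in for vgg11 (assumption: any nn.Module works the same way).
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 2),
)

norms = {}  # submodule name -> norm of its output


def make_hook(name):
    # Build a hook that stores the output norm under the submodule's name.
    def hook(module, inputs, output):
        norms[name] = torch.norm(output.detach()).item()
    return hook


handles = [
    module.register_forward_hook(make_hook(name))
    for name, module in model.named_modules()
    if name  # skip the root container, whose name is ""
]

image = torch.randn(8, 3, 8, 8)
out = model(image)

# Remove the hooks once the values have been collected.
for handle in handles:
    handle.remove()
```

After the forward pass, norms holds one entry per submodule, and the entry for the last layer matches torch.norm(out).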

Scope of variables replicated by multiprocessing

I'm trying to speed up training multiple models by using Python's multiprocessing.Pool.apply_async. In order to save memory, I've converted my pandas dataframe into 32-bit floats. I can see the pickle is ~2 GB on disk. Once I've loaded the pickle into memory, I can see in task manager that python is using ~4-5 GB of memory.
I then create my tensorflow models in the main thread like so:
for i in range(n):
    self.estimators.append(DNNEstimator())
with Pool(processes=4) as pool:
    for i in range(n):
        # DNNEstimator is a wrapper over a keras neural network with scikit-learn
        # compatible interface (i.e., it has methods fit(x, y) and predict(x))
        self.results.append(pool.apply_async(self.estimators[i].fit, (x, y)))
    for i in range(n):
        self.results[i] = self.results[i].get()
My understanding of how multiprocessing works is that it pickles the data that needs to be processed and runs it in a new python process/instance. I noticed that each of the python processes was taking ~6 GB of memory during training. I suspect the subprocesses are taking this much memory because they recreate all the variables in the main thread. So my question is, how much of the main thread's scope is recreated in each subprocess?
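To make the pickling part concrete, here is a small sketch using only the standard library. It mimics what apply_async has to send to a worker for one task: the callable plus its arguments. The Estimator class is a hypothetical stand-in for DNNEstimator, and the sizes are arbitrary. Because the callable is a bound method, the whole instance travels with it, along with x and y.

```python
import pickle


class Estimator:
    """Hypothetical stand-in for DNNEstimator; it just holds some state."""

    def __init__(self, n_bytes):
        self.state = bytes(n_bytes)

    def fit(self, x, y):
        return len(x) + len(y)


small = Estimator(10)
large = Estimator(1_000_000)
x, y = [0.0] * 1000, [1.0] * 1000

# Roughly what has to be serialized per task: (callable, args).
small_task = pickle.dumps((small.fit, (x, y)))
large_task = pickle.dumps((large.fit, (x, y)))

# The worker side rebuilds its own copy of the instance and the arguments.
restored_fit, restored_args = pickle.loads(large_task)
```

The payload for large is about a megabyte bigger than for small, so the instance state is copied once per task. Anything beyond the task payload depends on the start method: with fork the child begins as a copy-on-write copy of the parent, while with spawn (the only option on Windows) each child starts a fresh interpreter and re-imports the main module, so module-level code that is not guarded by if __name__ == "__main__" runs again in every worker. Whether that explains the ~6 GB per process here is a guess on my part.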

How to specify variables in tensorflow simple save

I am trying unsuccessfully to save my tensorflow model using the simple save method.
I have built a model using keras and trained it successfully, with an accuracy of 88%. I am now trying to save this model so we can serve it, but the function I need, simple save, isn't clear about how to specify the variables that get passed in.
The session and the export directory are clear enough, but the inputs and outputs are mysterious. I believe that because I've used Keras, these variables are hidden by the Keras abstraction, and the Tensorflow documentation on simple save offers no explanation.
As a Hail Mary, I set z equal to an empty array just to put something in there, but obviously that is wrong. Do I need to set up an output variable z, and if so, what type is it?
Not sure if this is enough code to get to the bottom of this. Even getting pointed at the right docs would be a big boost.
import numpy as np
import tensorflow as tf
session = tf.keras.backend.get_session()
export_dir = "/Users/somedir/"
z = np.array([])
tf.saved_model.simple_save(session,
                           export_dir,
                           inputs={"x": X, "y": y},
                           outputs={"z": z})
X is my dataset, an array of all independent variables. y is the outcome (dependent variable). I don't have another candidate for z, so I set it to an empty array.
I get AttributeError: 'numpy.ndarray' object has no attribute 'get_shape'
Turns out that you can query the model itself for its inputs and outputs.
Don't forget to import the right libs:
import time
import tensorflow as tf
import tensorflow.python.saved_model
Then set an export path variable. For convenience it is timestamped, so you can run this again and again:
export_path = "/somedirectory/{}".format(time.strftime("%Y%m%d_%H%M%S"))
Then, inside a get_session() block, the following will do the trick:
with tf.keras.backend.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={t.name: t for t in model.inputs},
        outputs={t.name: t for t in model.outputs})

How to define a loss function that needs to input numpy array(not tensor) when build a tensorflow graph?

I want to add a constraint term to my loss function. The definition of this constraint takes a numpy array as input, so I cannot define it as a tensor-typed graph node in tensorflow. How can I define this part in the graph so that it takes part in the network optimization?
Operations done on numpy arrays cannot be automatically differentiated in TensorFlow. Since you are using this computation as part of the loss computation, I assume you want to differentiate it. In this case, your best option is probably to reimplement the constraint in TensorFlow. The only other approach I can think of is to use autograd in conjunction with TF. This seems possible (something along the lines of: evaluate part of the graph with TF, get numpy arrays out, call your function under autograd, get gradients, feed them back into TF) but will likely be harder and slower.
If you are reimplementing it in TF, most numpy operations have straightforward one-to-one counterparts in TF. If the implementation uses a lot of control flow (which can be painful in classic TF), you can use eager execution or py_func.
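As a rough illustration of the reimplementation route, the sketch below assumes TF 2.x with eager execution and a made-up constraint (penalize the amount by which the weights sum above 1). constraint_np and constraint_tf are hypothetical names; your actual constraint will differ, but the pattern of swapping each numpy call for its TF counterpart should carry over.

```python
import numpy as np
import tensorflow as tf


def constraint_np(w):
    # Hypothetical numpy version: squared excess of sum(w) over 1.
    return np.maximum(0.0, np.sum(w) - 1.0) ** 2


def constraint_tf(w):
    # Same computation with TF ops, so gradients can flow through it.
    return tf.square(tf.nn.relu(tf.reduce_sum(w) - 1.0))


w = tf.Variable([0.8, 0.7], dtype=tf.float32)

with tf.GradientTape() as tape:
    penalty = constraint_tf(w)

# Gradient of the penalty with respect to the weights.
grad = tape.gradient(penalty, w)
```

The TF version returns the same value as the numpy one for the same input, and grad is a regular tensor that an optimizer can apply, which is what the numpy version cannot provide.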

Does tensorflow create a new numpy array each time it calls compute_gradients()?

A typical training loop in tensorflow may be as follows:
cg = opt.compute_gradients(loss)
grads = [None] * len(cg)
for i, gv in enumerate(cg):
    grads[i] = gv[0]
# ... do some process to grads ...
apply_gradients = opt.apply_gradients(cg)
while (...):
    gradients = sess.run(grads)
    feed = dict()
    for i, grad_var in enumerate(cg):
        feed[grad_var[0]] = gradients[i]
    sess.run(apply_gradients, feed_dict=feed)
Each time it calls sess.run(grads), a new list of numpy arrays, gradients (with newly allocated memory), is generated. I want to use a fixed numpy array for all the training iterations. How could I do that?
The tf.train.Optimizer.compute_gradients() method does not create any new NumPy arrays: instead it builds a graph of TensorFlow operations for computing the gradients of the loss with respect to some or all of the variables in your model. The return value is not a NumPy array; it is a list of pairs, each containing a gradient tf.Tensor and the corresponding tf.Variable to which that gradient should be applied.
Nevertheless, it is usually wasteful of memory to call opt.compute_gradients() inside a loop. It's hard to say whether this will work exactly without seeing more of your code, but you should be able to move the call to opt.compute_gradients() before the loop, since it does not seem to depend on anything computed inside the loop. This will avoid building a new segment of TensorFlow graph in each loop iteration, and should reduce the memory cost.
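To illustrate building the gradient ops once, here is a minimal sketch using the tf.compat.v1 API under TF 2.x. The toy one-weight linear model, the learning rate, and the data are assumptions made up for this example. The graph's operation count is recorded before and after the loop to check that running the training op adds nothing to the graph.

```python
import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, 1])
    y = tf.compat.v1.placeholder(tf.float32, shape=[None, 1])
    w = tf.compat.v1.Variable([[0.0]], dtype=tf.float32)
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

    opt = tf.compat.v1.train.GradientDescentOptimizer(learning_rate=0.1)
    # Built once, before the loop: these calls only add graph operations.
    grads_and_vars = opt.compute_gradients(loss)
    train_op = opt.apply_gradients(grads_and_vars)
    init = tf.compat.v1.global_variables_initializer()

# Toy data for y = 2 * x.
xs = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
ys = 2.0 * xs

with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(init)
    ops_before = len(graph.get_operations())
    for _ in range(50):
        # Only executes existing ops; nothing new is added to the graph.
        sess.run(train_op, feed_dict={x: xs, y: ys})
    ops_after = len(graph.get_operations())
    learned_w = sess.run(w)[0, 0]
```

If compute_gradients() were called inside the loop instead, ops_after would keep growing with the number of iterations, which is the memory cost described above.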