Add other metrics to compute performance - tensorflow

I use TFF version 0.12.0
In order to compute the performance of the model, I would like to add sensitivity and specificity metrics alongside accuracy:
def specificity(...):
    ...

def create_compiled_keras_model():
    ...
    model.compile(optimizer=tf.keras.optimizers.SGD(lr=0.001, momentum=0.9),
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=([tf.keras.metrics.BinaryAccuracy()], sensitivity, specificity))
    return model
I get the following error:
TypeError: Type of `metrics` argument not understood. Expected a list or dictionary, found: ([<tensorflow.python.keras.metrics.BinaryAccuracy object at 0x7fb5b0711748>], <function sensitivity at 0x7fb6adf45e18>, <function specificity at 0x7fb5fdaf5f28>)
So how can I add these metrics in TensorFlow Federated?
Thanks

TFF requires metrics to be implemented using the tf.keras.metrics.Metric interface, and can't wrap arbitrary Python functions.
An example of making a custom metric based off of the tf.keras.metrics.Sum subclass can be found at https://github.com/tensorflow/federated/blob/3ed93c8036501fe327ede249a4b0f20d02c6f476/tensorflow_federated/python/learning/keras_utils_test.py#L33. The key is the implementation of the update_state method.
For sensitivity and specificity metrics, looking at the implementation of tf.keras.metrics.SensitivityAtSpecificity and its base class tf.keras.metrics.SensitivitySpecificityBase might provide useful examples.
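As a minimal sketch (the class name, threshold, and epsilon smoothing below are my assumptions, not code from TFF or the question), a specificity metric built on the tf.keras.metrics.Metric interface could accumulate true negatives and false positives in update_state and combine them in result:

import tensorflow as tf

class Specificity(tf.keras.metrics.Metric):
    """Specificity = TN / (TN + FP), accumulated across batches (illustrative sketch)."""

    def __init__(self, name='specificity', threshold=0.5, **kwargs):
        super().__init__(name=name, **kwargs)
        self.threshold = threshold
        self.true_negatives = self.add_weight(name='tn', initializer='zeros')
        self.false_positives = self.add_weight(name='fp', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, tf.bool)
        y_pred = tf.cast(y_pred, tf.float32) >= self.threshold
        tn = tf.logical_and(tf.logical_not(y_true), tf.logical_not(y_pred))
        fp = tf.logical_and(tf.logical_not(y_true), y_pred)
        self.true_negatives.assign_add(tf.reduce_sum(tf.cast(tn, tf.float32)))
        self.false_positives.assign_add(tf.reduce_sum(tf.cast(fp, tf.float32)))

    def result(self):
        return self.true_negatives / (
            self.true_negatives + self.false_positives + tf.keras.backend.epsilon())

# It would then be passed as an instance inside a single metrics list, e.g.:
# metrics=[tf.keras.metrics.BinaryAccuracy(), Specificity()]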

Related

Is there a linter for model(inputs) of PyTorch like model.predict(inputs) of TensorFlow?

My goal is to do object detection. However, YOLOv7 and the tutorial (a hack to create bounding boxes from feature maps) use PyTorch.
The problem is that model(inputs) has no type annotations.
The code at lines 148-150:
out = model(inputs)
probs, class_preds = torch.max(out[0], dim=-1)
feature_maps = out[1].to("cpu")
This forced me to debug the helper.py file to understand what out[0] and out[1] are. Currently, I assume that out[0] is the softmax probability and out[1] is the feature maps.
I think the answer is no; in general it is non-trivial to automatically infer the semantic meaning of the outputs of a neural network, since this is a product of the semantic meaning of the inputs and the model structure itself. You could reference the Yolo model architecture provided in model.py (though, as an aside, you should not link to external code but rather provide the relevant code in your question itself) and investigate the structure of the outputs, then reference the structure of the labeled inputs (as the model, by definition, is learning to replicate the structure of the labels).
That being said, in your case the output is quite obviously per-class probabilities and class indices, as shown in line 149:
probs, class_preds = torch.max(out[0], dim=-1)
since the outputs from torch.max, per the PyTorch documentation, are (maximum value, maximum index).
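As a small self-contained illustration (the values are made up), torch.max along a dimension returns a (values, indices) pair, which is exactly what the probs, class_preds unpacking above relies on:

import torch

logits = torch.tensor([[0.1, 0.7, 0.2],
                       [0.6, 0.3, 0.1]])
probs, class_preds = torch.max(logits, dim=-1)  # returns (max values, argmax indices)
print(probs)        # tensor([0.7000, 0.6000])
print(class_preds)  # tensor([1, 0])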

How to initialize the model with certain weights?

I am using the example "stateful_clients" in tensorflow-federated examples. I want to use my pretrained model weights to initialize the model. I use the function model.load_weights(init_weight). But it seems that it doesn't work. The validation accuracy in the first round is still low. How can I solve the problem?
def tff_model_fn():
    """Constructs a fully initialized model for use in federated averaging."""
    keras_model = get_five_layers_cnn([28, 28, 1])
    keras_model.load_weights(init_weight)
    loss = tf.keras.losses.SparseCategoricalCrossentropy()
    return stateful_fedavg_tf.KerasModelWrapper(keras_model,
                                                test_data.element_spec, loss)
A quick primer on state and model weights in TFF
TFF takes a distinct perspective on state in machine learning, generally a consequence of its desire to be purely functional.
Usually in machine learning, a model is conceptually a function which takes data and produces a prediction. However, this notion is a little overloaded at times; does 'model' refer to a trained model (fitting the specification above), or an architecture which is parameterized by its parameters, and therefore needs to accept these parameters as an argument to be considered truly a 'function'? A conception somewhat in the middle is that of a 'stateful function', which I think tends to be what people intend to refer to when they use the term 'model'.
TFF standardizes on the latter understanding. For TFF, a 'model' is a function which accepts parameters along with data as an argument, producing a prediction. This is generally to avoid the notion of a stateful function, which is disallowed by a purely functional perspective (f(x) == f(x) should always be true, so f cannot have any state which affects its output).
On the code in question
I'm not super familiar with this portion of the TFF codebase; in particular I'm a little surprised at the behavior of the keras model wrapper, as usually TFF wants to serialize all logic into TFF-defined data structures as soon as possible (at least, this is how I think about it). Glancing at the code, it looks to me like it could work--but there have been exciting interactions between TFF and Keras in the past.
Briefly, here is how this path should be working:
The model function you define above is invoked while building the initialize computation, in a graph context; the logic to load weights (or assignment of the weights themselves, baked into the graph as a constant) would hopefully be serialized into the graph that TFF generates to represent initialize.
Upon calling iterative_process.initialize, you would find your desired weights populated in the appropriate attributes of the returned data structure. This would serve as your initial starting point for your iterative process, and you would be off to the races.
What I am suspicious of in the above is that TFF will silently invoke your model_fn in a TensorFlow graph context, resulting in non-program-order semantics; if there is no control dependency between the assignment and the return value of your function (there isn't one in the code above, and in fact it is not obvious how to force one), the assignment may be skipped at initialize time. Therefore the state returned from initialize won't have your specified weights.
If this suspicion is true, the appropriate solution is to run the weight-loading logic directly in Python. TFF provides some utilities to help with this kind of thing, like tff.learning.state_with_new_model_weights. This would be used like:
state = iterative_process.initialize()
weights = tf.keras.load_weights(...) # No idea if this call is correct, probably not.
state_with_loaded_weights = tff.learning.state_with_new_model_weights(state, weights)
...
# continue on using state in the iterative process

How are function metrics aggregated over batches in tensorflow model validation?

In TensorFlow's tf.keras.Model.compile, you can pass a lambda y_true, y_pred: val function as a metric (though it seems to be undocumented), but I asked myself: "How does it aggregate it over the batches?"
I searched the documentation, but found nothing on how this is done.
By the way, I don't even know whether doing so is undefined behaviour and one should instead subclass the Metric class (or at least provide the required methods).
Also, is it pertinent to pass a loss as a metric (and in this case, same question: how is it aggregated over the batches)?
To understand "How does it aggregate (I'm assuming for display in the progress bar)", I suggest you check tf.keras.utils.Progbar. Aggregation over batches is done when you use model.fit, not model.compile.
Is using a lambda as a loss or metric undefined behaviour? No, if defined properly. If you do not write the lambda expression properly, TensorFlow will throw an exception.
Is using a lambda as a loss or metric recommended? Nope. There is a reason TensorFlow provides separate classes for these. Extending inbuilt classes simplifies other parts of the pipeline, such as saving or loading models. It also makes the code much more readable.
It should just take the average over batches. I don't think it's undefined behavior.
Check out the "Creating Custom Metrics" section here. The metric you use (the lambda) is stateless, and therefore, during training, it's
the average of the per-batch metric values for all batches seen during a given epoch.
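For a concrete toy comparison (the model and data below are arbitrary, my own sketch rather than anything from the question), a stateless lambda is wrapped by Keras and reported as the running mean of the per-batch values, while a Metric subclass such as tf.keras.metrics.MeanAbsoluteError manages its own accumulation through update_state and result:

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(
    optimizer='sgd',
    loss='mse',
    metrics=[
        lambda y_true, y_pred: tf.reduce_mean(tf.abs(y_true - y_pred)),  # stateless: averaged per batch by Keras
        tf.keras.metrics.MeanAbsoluteError(),                            # stateful: aggregates via update_state()/result()
    ])

x = tf.random.normal([64, 3])
y = tf.random.normal([64, 1])
# With equal-sized batches the two reported values should be very close.
model.fit(x, y, batch_size=16, epochs=1)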

Should I use @tf.function for all functions?

An official tutorial on @tf.function says:
To get peak performance and to make your model deployable anywhere,
use tf.function to make graphs out of your programs. Thanks to
AutoGraph, a surprising amount of Python code just works with
tf.function, but there are still pitfalls to be wary of.
The main takeaways and recommendations are:
Don't rely on Python side effects like object mutation or list appends.
tf.function works best with TensorFlow ops, rather than NumPy ops or Python primitives.
When in doubt, use the for x in y idiom.
It only mentions how to implement @tf.function-annotated functions but not when to use it.
Is there a heuristic on how to decide whether I should at least try to annotate a function with tf.function? It seems that there are no reasons not to do it, unless I am too lazy to remove side effects or change some things like range() -> tf.range(). But if I am willing to do this...
Is there any reason not to use @tf.function for all functions?
TLDR: It depends on your function and whether you are in production or development. Don't use tf.function if you want to be able to debug your function easily, or if it falls under the limitations of AutoGraph or tf.v1 code compatibility.
I would highly recommend watching the Inside TensorFlow talks about AutoGraph and Functions, not Sessions.
In the following I'll break down the reasons, which are all taken from information made available online by Google.
In general, the tf.function decorator causes a function to be compiled as a callable that executes a TensorFlow graph. This entails:
Conversion of the code through AutoGraph if required (including any functions called from an annotated function)
Tracing and executing the generated graph code
There is detailed information available on the design ideas behind this.
Benefits of decorating a function with tf.function
General benefits
Faster execution, especially if the function consists of many small ops (Source)
For functions with Python code / Using AutoGraph via tf.function decoration
If you want to use AutoGraph, using tf.function is highly recommended over calling AutoGraph directly.
Reasons for this include: Automatic control dependencies, it is required for some APIs, more caching, and exception helpers (Source).
Drawbacks of decorating a function with tf.function
General drawbacks
If the function only consists of few expensive ops, there will not be much speedup (Source)
For functions with Python code / Using AutoGraph via tf.function decoration
No exception catching (should be done in eager mode; outside of the decorated function) (Source)
Debugging is much harder
Limitations due to hidden side effects and TF control flow
Detailed information on AutoGraph limitations is available.
For functions with tf.v1 code
It is not allowed to create variables more than once in tf.function, but this is subject to change as tf.v1 code is phased out (Source)
For functions with tf.v2 code
No specific drawbacks
Examples of limitations
Creating variables more than once
It is not allowed to create variables more than once, such as v in the following example:
@tf.function
def f(x):
    v = tf.Variable(1)
    return tf.add(x, v)

f(tf.constant(2))
# => ValueError: tf.function-decorated function tried to create variables on non-first call.
In the following code, this is mitigated by making sure that self.v is only created once:
class C(object):
    def __init__(self):
        self.v = None

    @tf.function
    def f(self, x):
        if self.v is None:
            self.v = tf.Variable(1)
        return tf.add(x, self.v)

c = C()
print(c.f(tf.constant(2)))
# => tf.Tensor(3, shape=(), dtype=int32)
Hidden side effects not captured by AutoGraph
Changes such as the one to self.a in this example are hidden from AutoGraph (they happen inside another function), which leads to an error since cross-function analysis is not done (yet) (Source):
class C(object):
    def change_state(self):
        self.a += 1

    @tf.function
    def f(self):
        self.a = tf.constant(0)
        if tf.constant(True):
            self.change_state()  # Mutation of self.a is hidden
        tf.print(self.a)

x = C()
x.f()
# => InaccessibleTensorError: The tensor 'Tensor("add:0", shape=(), dtype=int32)' cannot be accessed here: it is defined in another function or code block. Use return values, explicit Python locals or TensorFlow collections to access it. Defined in: FuncGraph(name=cond_true_5, id=5477800528); accessed from: FuncGraph(name=f, id=5476093776).
Changes in plain sight are no problem:
class C(object):
    @tf.function
    def f(self):
        self.a = tf.constant(0)
        if tf.constant(True):
            self.a += 1  # Mutation of self.a is in plain sight
        tf.print(self.a)

x = C()
x.f()
# => 1
Example of limitation due to TF control flow
This if statement leads to an error because the value for else needs to be defined for TF control flow:
@tf.function
def f(a, b):
    if tf.greater(a, b):
        return tf.constant(1)
    # If a <= b, the function would return None

x = f(tf.constant(3), tf.constant(2))
# => ValueError: A value must also be returned from the else branch. If a value is returned from one branch of a conditional a value must be returned from all branches.
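A possible fix (my own addition, not part of the quoted example) is to make sure every branch returns a value, so that TF control flow can build the conditional:

@tf.function
def f(a, b):
    if tf.greater(a, b):
        return tf.constant(1)
    else:
        return tf.constant(0)

print(f(tf.constant(3), tf.constant(2)))
# => tf.Tensor(1, shape=(), dtype=int32)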
tf.function is useful for creating and using computational graphs; it should be used in training and in deployment, but it isn't needed for most of your functions.
Let's say that we are building a special layer that will be part of a larger model. We would not want the tf.function decorator above the function that constructs that layer, because it is merely a definition of what the layer will look like.
On the other hand, let's say that we are going to make a prediction or continue our training using some function. We would want the tf.function decorator because we are actually using the computational graph to get some value.
A great example would be constructing an encoder-decoder model.
DON'T put the decorator around the function that creates the encoder or decoder or any layer; that is only a definition of what it will do.
DO put the decorator around the "train" or "predict" method, because those are actually going to use the computational graph for computation. A sketch of this split is shown below.
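A rough sketch of that split (the model, shapes, and names here are illustrative, not taken from the question): plain Python for the function that builds layers, and @tf.function only on the step that actually runs the computation.

import tensorflow as tf

def build_encoder(units=32):
    # No @tf.function: this only constructs layers, it does not run the graph.
    return tf.keras.Sequential([tf.keras.layers.Dense(units, activation='relu')])

encoder = build_encoder()
encoder.build((None, 16))  # create variables eagerly so the traced function never creates them twice
optimizer = tf.keras.optimizers.SGD(0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

@tf.function  # graph-compiled: this is the part that does the actual computation
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, encoder(x))
    grads = tape.gradient(loss, encoder.trainable_variables)
    optimizer.apply_gradients(zip(grads, encoder.trainable_variables))
    return loss

x = tf.random.normal([8, 16])
y = tf.random.normal([8, 32])
print(train_step(x, y))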
Per my understanding and according to the documentation, using tf.function is highly recommended, mainly for speeding up your code, since code wrapped by tf.function is converted to a graph and therefore there is room for some optimizations (e.g. op pruning, folding, etc.) to be done, which may not be performed when the same code is run eagerly.
However, there are also a few cases where using tf.function might incur additional overhead or does not result in noticeable speedups. One notable case is when the wrapped function is small and only used a few times in your code and therefore the overhead of calling the graph might be relatively large. Another case is when most of the computations are already done on an accelerator device (e.g. GPU, TPU), and therefore the speedups gained by graph computation might not be significant.
There is also a section in the documentation where the speedups are discussed in various scenarios, and at the beginning of this section the two cases above have been mentioned:
Just wrapping a tensor-using function in tf.function does not automatically speed up your code. For small functions called a few times on a single machine, the overhead of calling a graph or graph fragment may dominate runtime. Also, if most of the computation was already happening on an accelerator, such as stacks of GPU-heavy convolutions, the graph speedup won't be large.
For complicated computations, graphs can provide a significant speedup. This is because graphs reduce the Python-to-device communication and perform some speedups.
But at the end of the day, if it's applicable to your workflow, I think the best way to determine this for your specific use case and environment is to profile your code when it gets executed in eager mode (i.e. without using tf.function) vs. when it gets executed in graph mode (i.e. using tf.function extensively).
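A minimal profiling sketch along those lines (the op choice and shapes are arbitrary, and the numbers will depend heavily on hardware) is to time the same function run eagerly and wrapped in tf.function:

import timeit
import tensorflow as tf

def many_small_ops(x):
    # Lots of tiny ops: the case where graph execution tends to help most.
    for _ in range(100):
        x = x * 1.0001 + 0.0001
    return x

graph_fn = tf.function(many_small_ops)
x = tf.random.normal([100, 100])
graph_fn(x)  # trace once so tracing time is excluded from the measurement

print("eager:", timeit.timeit(lambda: many_small_ops(x), number=100))
print("graph:", timeit.timeit(lambda: graph_fn(x), number=100))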

What operations are supported for automatic differentiation in tensorflow

I am confused about what types of operations are supported for automatic differentiation in TensorFlow. Concretely, are tensor indexing operations such as the following supported?
...
# feat is output from some conv layer and the shape is B*H*W*C

# case one
loss = feat[:, 1:, 1:, :] - feat[:, :-1, :-1, :]

# case two
feat[:, 1:, 1:, :] = feat[:, 1:, 1:, :] / 2.  # assign and replace part of the original value
loss = tf.reduce_sum(feat)
This isn't a direct answer, but as a clue, the automatic differentiation library autograd lists operations that are not supported (see "Non-differentiable functions"); for example, floor() and round() are not auto-differentiable.
One can also define their own operations, provided you can code the gradients yourself; see extend-autograd-by-defining-your-own.
I would guess tf is very similar to this.
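For the TensorFlow side of that guess, a hedged sketch of "defining your own gradient" would use tf.custom_gradient, e.g. attaching a straight-through gradient to the otherwise non-differentiable round() (this is my own illustration, not from the question):

import tensorflow as tf

@tf.custom_gradient
def differentiable_round(x):
    def grad(upstream):
        # Straight-through estimator: pretend round() is the identity for gradient purposes.
        return upstream
    return tf.round(x), grad

x = tf.constant([0.4, 1.6])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.reduce_sum(differentiable_round(x))
print(tape.gradient(y, x))  # => tf.Tensor([1. 1.], shape=(2,), dtype=float32)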