Applying FedSGD in Tensorflow Federated framework - tensorflow

In federated learning, the state-of-the-art and most known method is the federated averaging algorithm or (FedAvg)
and it can be easily applied in TFF using the function
tff.learning.build_federated_averaging_process
I want to understand how it differs from the function
tff.learning.algorithms.build_fed_sgd
and is there any other aggregation method I can apply using TFF? such as (FedSGD)

Related

Federated Learning in Tensorflow Federated, is there any way to apply Early stopping on the client side?

I am using Tensorflow Federated to train a text classification model with the federated learning approach.
Is there any way to apply Early Stopping on the client-side? Is there an option for cross-validation in the API?
The only thing I was able to find is the evaluation:
evaluation = tff.learning.build_federated_evaluation(model_fn)
Which is applied to the model by the end of a federated training round.
Am I missing something?
One straightforward way to control the number of steps a client takes when using tff.learning.build_federated_averaging_process is by setting up each clients tf.data.Dataset with different parameters. For example limiting the number of steps with tf.data.Dataset.take. The guide tf.data: Build TensorFlow input pipelines has many more details.
Alternatively stopping based on a measurement of learning progress would require modifying some internals of the algorithm currently. Rather than using the APIs in tff.learning, it maybe simpler to poke around federated/tensorflow_federated/python/examples/simple_fedavg/ particularly the client training loop is here and could be modified to stop based on some criteria other than "end of dataset" (as currently used).

Trouble with implementing local response normalization in TensorFlow

I'm trying to implement a local response normalization layer in Tensorflow to be used in a Keras model:
Here is an image of the operation I am trying to implement:
Here is the Paper link, please refer to section 3.3 to see the description of this layer
I have a working NumPy implementation, however, this implementation uses for loops and inbuilt python min and max operators to compute the summation. However, these pythonic operations will cause errors when defining a custom keras layer, so I can't use this implementation.
The issue here lies in the fact that I need to iterate over all the elements in the feature map and generate a normalized value for each of them. Additionally, the upper and lower bound on the summation change depending on which value I am currently normalizing. I can't really think of a way to handle this without nested for loops, but this will not work in a Keras custom layer as it isn't a native TensorFlow function.
Could anyone point me towards tensorflow/keras backend functions that could help me in implementing this layer?
EDIT: I know that this layer is implemented as a keras layer, but I want to build intuition about custom layers, so I want to implement this layer using tensor ops.

How to implement different ranking algorithms in tf-ranking framework?

What is the algorithm behind tf-ranking? and can we use Lambdamart algorithm in tf-ranking .Can anyone suggest some good sources to learn these
TF Ranking provides a framework for you to implement your own algorithm. It helps you set up the components around the core of your model (input_fn, metrics and loss functions, etc.). But the scoring logic (aka scoring function) should be provided by the user.
You probably have already seen these, but just in case:
TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank
and
Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks

Complete Beta (or Gamma) function in Tensorflow

I've read some article about using other distribution to modeling a stochastic policy in Reinforcement Learning. Usually we use a Gaussian distribution but some used Beta distribution : https://en.wikipedia.org/wiki/Beta_distribution
There is already a Beta distribution class inside Tensorflow, allow people to use it as Tensors.
But for some policy gradient methods, they are using constraint on the optimization process, using the Kullback Leiber Divergence.
In the formula, there is the digamma function, already implemented in Tensorflow. But I can't find the beta function (nor the gamma function since they're linked) in Tensorflow. Only log gamma or incomplete gamma. And I cannot use the scipy.special.beta function because it cannot manipulate tensors (since my alpha and beta parameters are produced by a neural network)
I'm not specialist enough in this field, perhaps my question is foolish, but I'd really like an explanation there.
Thanks a lot

Tensorflow - Why are there so many similar or even duplicate functions in tf.nn and tf.layers / tf.losses / tf.contrib.layers etc?

In Tensorflow (as of v1.2.1), it seems that there are (at least) two parallel APIs to construct computational graphs. There are functions in tf.nn, like conv2d, avg_pool, relu, dropout and then there are similar functions in tf.layers, tf.losses and elsewhere, like tf.layers.conv2d, tf.layers.dense, tf.layers.dropout.
Superficially, it seems that this situation only serves to confuse: for example, tf.nn.dropout uses a 'keep rate' while tf.layers.dropout uses a 'drop rate' as an argument.
Does this distinction have any practical purpose for the end-user / developer?
If not, is there any plan to cleanup the API?
Tensorflow proposes on the one hand a low level API (tf., tf.nn....), and on the other hand, a higher level API (tf.layers., tf.losses.,...).
The goal of the higher level API is to provide functions that greatly simplify the design of the most common neural nets. The lower level API is there for people with special needs, or who wishes to keep a finer control of what is going on.
The situation is a bit confused though, because some functions have the same or similar names, and also, there is no clear way to distinguish at first sight which namespace correspond to which level of the API.
Now, let's look at conv2d for example. A striking difference between tf.nn.conv2d and tf.layers.conv2d is that the later takes care of all the variables needed for weights and biases. A single line of code, and voilĂ , you just created a convolutional layer. With tf.nn.conv2d, you have to take declare the weights variable yourself before passing it to the function. And as for the biases, well, they are actually not even handled: you need to add them yourself later.
Add to that that tf.layers.conv2d also proposes to add regularization and activation in the same function call, you can imagine how this can reduce code size when one's need is covered by the higher-level API.
The higher level also makes some decisions by default that could be considered as best practices. For example, losses in tf.losses are added to the tf.GraphKeys.LOSSES collection by default, which makes recovery and summation of the various component easy and somewhat standardized. If you use the lower level API, you would need to do all of that yourself. Obviously, you would need to be careful when you start mixing low and high level API functions there.
The higher-level API is also an answer to a great need from people that have been otherwise used to similarly high-level function in other frameworks, Theano aside. This is rather obvious when one ponders the number of alternative higher level APIs built on top of tensorflow, such as keras 2 (now part of the official tensorflow API), slim (in tf.contrib.slim), tflearn, tensorlayer, and the likes.
Finally, if I may add an advice: if you are beginning with tensorflow and do not have a preference towards a particular API, I would personnally encourage you to stick to the tf.keras.* API:
Its API is friendly and at least as good as the other high-level APIs built on top of the low-level tensorflow API
It has a clear namespace within tensorflow (although it can -- and sometimes should -- be used with parts from other namespaces, such as tf.data)
It is now a first-class citizen of tensorflow (it used to be in tf.contrib.keras), and care is taken to make new tensorflow features (such as eager) compatible with keras.
Its generic implementation can use other toolkits such as CNTK, and so does not lock you to tensorflow.