I want to use the CNTK Evaluation API to score my data. Currently I only see a way of evaluating samples one by one. Is minibatch evaluation supported in CNTK? Where could I find some sample code for it?
Evaluation with batch data is supported by the new version of the C# Eval API. This page describes how to use the API, and you can find examples here, with build instructions.
I know this was posted long ago, but in case someone else ends up on this question:
In Python there is a direct parallel between using a minibatch in training and in evaluation: in training, the .train_minibatch method can be used on a cntk.Trainer object, and for evaluation the .test_minibatch method can be used on a cntk.Evaluator object.
See the section "3. Feeding Data Via an Explicit Minibatch Loop" in CNTK_200_GuidedTour for an example.
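For illustration, here is a minimal, hypothetical sketch of how the two calls line up (toy two-class model and random data; all names are made up for the example, not taken from the tutorial):

import cntk as C
import numpy as np

# Toy model: 2 inputs -> 2 classes
x = C.input_variable(2)
y = C.input_variable(2)
z = C.layers.Dense(2)(x)
loss = C.cross_entropy_with_softmax(z, y)
metric = C.classification_error(z, y)

lr = C.learning_rate_schedule(0.1, C.UnitType.minibatch)
trainer = C.Trainer(z, (loss, metric), [C.sgd(z.parameters, lr)])
evaluator = C.Evaluator(metric)

# One minibatch of 32 samples
features = np.random.rand(32, 2).astype(np.float32)
labels = np.eye(2)[np.random.randint(0, 2, 32)].astype(np.float32)

trainer.train_minibatch({x: features, y: labels})               # train on the whole minibatch
avg_error = evaluator.test_minibatch({x: features, y: labels})  # evaluate the whole minibatch at once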
I am using Tensorflow Federated to train a text classification model with the federated learning approach.
Is there any way to apply Early Stopping on the client-side? Is there an option for cross-validation in the API?
The only thing I was able to find is the evaluation:
evaluation = tff.learning.build_federated_evaluation(model_fn)
which is applied to the model at the end of a federated training round.
Am I missing something?
One straightforward way to control the number of steps a client takes when using tff.learning.build_federated_averaging_process is to set up each client's tf.data.Dataset with different parameters, for example limiting the number of steps with tf.data.Dataset.take. The guide tf.data: Build TensorFlow input pipelines has many more details.
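A rough sketch of that idea (MAX_CLIENT_STEPS is an illustrative name, not a TFF parameter):

import tensorflow as tf

MAX_CLIENT_STEPS = 10  # assumption: stop each client after 10 local batches

def preprocess(client_dataset):
    return (client_dataset
            .shuffle(100)
            .batch(20)
            .take(MAX_CLIENT_STEPS))  # client training ends after this many batches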
Alternatively, stopping based on a measurement of learning progress would currently require modifying some internals of the algorithm. Rather than using the APIs in tff.learning, it may be simpler to poke around federated/tensorflow_federated/python/examples/simple_fedavg/; in particular, the client training loop is here and could be modified to stop based on some criterion other than "end of dataset" (as currently used).
I'm trying to incorporate image normalization in my keras model to run on Google's cloud TPU. Therefore I inserted a line into my code:
with strategy.scope():
    input_shape = (128, 128, 3)
    image_0 = Input(shape=input_shape)
    image_1 = tf.image.per_image_standardization(image_0)  # <-- the inserted line
    ...
No error was thrown, but according to Google's documentation, tf.image.per_image_standardization is not a supported function. Does anybody know whether it works anyway, or have an idea how to check whether it works?
In the TensorFlow Model Garden reference for ResNet, the mean and standard deviation of the dataset are calculated beforehand, and each batch is standardized by subtracting the mean and dividing by the standard deviation. See here for a reference (it uses ImageNet statistics).
I would suggest creating a separate script that calculates the mean and standard deviation and doing the same. Could you also point to the documentation where tf.image.per_image_standardization is listed as unsupported? I don't see why it wouldn't work, but you shouldn't apply it as a layer as in the provided code snippet; it belongs in the data preprocessing pipeline, as in the reference above.
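For instance, a minimal sketch of standardizing in the tf.data pipeline with precomputed statistics (the ImageNet values below are only an example; compute your own for your dataset):

import tensorflow as tf

MEAN_RGB = [0.485, 0.456, 0.406]    # assumed precomputed dataset mean
STDDEV_RGB = [0.229, 0.224, 0.225]  # assumed precomputed dataset std

def standardize(image, label):
    image = tf.cast(image, tf.float32) / 255.0
    image = (image - MEAN_RGB) / STDDEV_RGB
    return image, label

# dataset = dataset.map(standardize).batch(64)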
How can I evaluate my object detection model in a simple and understandable way? I used TensorFlow's Object Detection API, but I didn't understand the Tensorboard graphs. Can I evaluate it manually?
Any help? :(
Welcome to StackOverflow!
In short, yes you can. Yet it could be quite time-consuming to achieve your goal.
Here are the steps you might want to follow (assuming you have some basic understanding of TensorFlow graphs and sessions; otherwise please update your question):
Export your model to a frozen graph (*.pb file) via HERE. This step gives you an out-of-the-box model that you can load without any dependencies on the Object Detection API.
Write a script to load your model (frozen graph) and perform the evaluation. Some instructions can be found HERE. Make sure you use a tool such as Netron to check the input and output node names of your frozen graph.
Once you can run inference, you can compute metrics such as mAP on your own dataset and loop through all images to get the desired evaluation.
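As a rough sketch of step 2 (TF1-style; the node names below are the ones the exporter typically produces, so verify yours with Netron):

import numpy as np
import tensorflow.compat.v1 as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # replace with a real image batch
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': image})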
You could use the confusion matrix to evaluate your model on the test dataset.
After training the model on your dataset, export the inference graph for evaluation.
The attached link below walks you through the evaluation step by step.
Best of luck!
confusion_matrix
I've been following this tutorial on the Tensorflow Object Detection API, and I've successfully trained my own object detection model using Google's Cloud TPUs.
However, the problem is that on Tensorboard, the plots I'm seeing only have 2 data points each (so it just plots a straight line), like this:
...whereas I want to see more "granular" plots like these below, which are much more detailed:
The tutorial I've been following acknowledges that this issue is caused by the fact that TPU training requires very few steps to train:
"Note that these graphs only have 2 points plotted since the model trains quickly in very few steps (if you've used TensorBoard before you may be used to seeing more of a curve here)"
I tried adding save_checkpoints_steps=50 in the file model_tpu_main.py (see code fragment below), and when I re-ran training, I was able to get a more granular plot, with 1 data point every 300 steps or so.
config = tf.contrib.tpu.RunConfig(
    # I added this line below:
    save_checkpoints_steps=50,
    master=tpu_grpc_url,
    evaluation_master=tpu_grpc_url,
    model_dir=FLAGS.model_dir,
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=FLAGS.iterations_per_loop,
        num_shards=FLAGS.num_shards))
However, my training job is actually saving a checkpoint every 100 steps, rather than every 300 steps. Looking at the logs, my evaluation job is running every 300 steps. Is there a way I can make my evaluation job run every 100 steps (whenever there's a new checkpoint) so that I can get more granular plots on Tensorboard?
Code that addresses this issue is explained by a technical lead for Google Cloud Platform in a Medium blog post. Alternatively, go directly to the GitHub code.
The 81-line train_and_evaluate function defines a TPUEstimator, train_input_fn and eval_input_fn, then iterates over the training steps, calling estimator.train and estimator.evaluate in each iteration (a rough sketch of this loop appears at the end of this answer). The metrics can be defined in the model_fn, which is called image_classifier. Note that adding tf.summary calls in the model function currently has no effect, since the TPU does not support them:
"TensorBoard summaries are a great way see inside your model. A minimal set of basic summaries are automatically recorded by the TPUEstimator, to event files in the model_dir. Custom summaries, however, are currently unsupported when training on a Cloud TPU. So while the TPUEstimator will still run locally with summaries, it will fail if used on a TPU." (source)
If summaries are important it might be more convenient to switch to training on GPU.
Personally I think writing this code is quite a hassle for something which should be handled by the API. Please update this answer if better solutions exist! I'm looking forward to it.
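For reference, a minimal sketch of the explicit train/evaluate loop described above (assuming a TF 1.x TPUEstimator and input functions defined elsewhere; all names here are illustrative, not the blog post's exact code):

import tensorflow.compat.v1 as tf

def train_and_evaluate(estimator, train_input_fn, eval_input_fn,
                       total_steps, steps_per_eval):
    # Train in chunks and evaluate after each chunk so Tensorboard gets
    # a data point every steps_per_eval steps.
    for step in range(0, total_steps, steps_per_eval):
        estimator.train(input_fn=train_input_fn, steps=steps_per_eval)
        metrics = estimator.evaluate(input_fn=eval_input_fn)
        tf.logging.info('step %d: %s', step + steps_per_eval, metrics)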
Set save_summary_steps in RunConfig to 100, so you get the statistics you want.
Also set iterations_per_loop to 100 so that training doesn't run more steps than that per loop.
p.s. I hope you realize that checkpointing is very slow. You are probably raising the cost of your job just for the sake of a pretty graph :)
You can try adding throttle_secs=100 to the EvalSpec constructor here. The default is 600 seconds.
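A tiny sketch of what that could look like (eval_input_fn is assumed to be defined elsewhere):

import tensorflow.compat.v1 as tf

# Evaluate whenever a new checkpoint appears, but no more often than every 100 seconds
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn, throttle_secs=100)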
Using beam search in a seq2seq model gives better results. There are several TensorFlow implementations.
But with the softmax function in each cell, beam search can't be used in the training process. So is there a modified loss or optimization function for training with beam search?
As Oliver mentioned, in order to use beam search in the training procedure we have to use beam-search optimization, which is clearly described in the paper Sequence-to-Sequence Learning as Beam-Search Optimization.
We can't use beam search in the training procedure with the current loss function, because that loss is a log loss taken at each time step, which is a greedy approach. This is also clearly discussed in the paper Sequence to Sequence Learning with Neural Networks; section 3.2 describes the above case neatly:
"where
S
is the training set. Once training is complete, we produce tr
anslations by finding the most
likely translation according to the LSTM:"
So the original seq2seq architecture uses beam search only at test time. If we want to use beam search at training time, we have to use another loss and optimization method, as in the paper above.
Sequence-to-Sequence Learning as Beam-Search Optimization is a paper that describes the steps necessary to use beam search in the training process.
https://arxiv.org/abs/1606.02960
The following issue contains a script that can perform the beam search; however, it does not contain any of the training logic:
https://github.com/tensorflow/tensorflow/issues/654
What I understand is that if the loss is calculated at the individual word level, there is no sense of sequence. A bad sequence (with mostly random words) can have a loss similar to a better sequence (with mostly connected words), as the loss can be spread in different ways over the vocabulary.
No, we do not need to use beam search in the training stage. When training modern-day seq2seq models such as Transformers, we use the teacher forcing mechanism, where the right-shifted target sequence is fed to the decoder side. Beam search can improve generalization, but it is not practical to use in the training stage. There are alternatives, though, such as a label-smoothed cross-entropy loss.
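As a small illustration of those two ideas (teacher forcing via a right-shifted target, and label smoothing in the loss), here is a hypothetical TensorFlow sketch with a toy vocabulary of 100 tokens and token id 0 used as the start symbol:

import tensorflow as tf

targets = tf.constant([[5, 9, 3, 2]])                       # toy target token ids (batch of 1)
decoder_inputs = tf.pad(targets, [[0, 0], [1, 0]])[:, :-1]  # right shift: prepend start token 0

logits = tf.random.normal([1, 4, 100])                      # stand-in for decoder outputs (batch, time, vocab)
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True, label_smoothing=0.1)
loss = loss_fn(tf.one_hot(targets, depth=100), logits)      # label-smoothed cross-entropy per time step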