Different evaluation accuracy when loading BERT from checkpoint - tensorflow

For some reason, I get wildly different loss and accuracy when I evaluate my BERT test set right after training vs. when I load from a saved checkpoint. I thought it might have been my adaptation of BERT, so I tried modifying the run_classifier.py script as little as possible to fit my use case, and I am still seeing this problem.
The only reason I can think of is that the model isn't loading correctly, but I don't know how to fix it. I believe I'm loading it as originally intended. For the init_checkpoint parameter, I pass path/to/classifier/model.ckpt-{last_step}. There are three model files (meta, index, data), but there are also the checkpoint, events, and graph files. Do I need to be doing something with those other three files as well? I'm used to using Keras, and this pure TensorFlow saving/loading process seems unnecessarily convoluted to me.
Thank you in advance for any help/insight regarding BERT or pure tf saving/loading! If you're unfamiliar with BERT, here's the github link: BERT GitHub
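For reference, a minimal sketch of how the files in such a directory are usually grouped. In a TF1-style checkpoint directory, the path you pass is a prefix rather than a file: the weights live in the .index and .data-* files, the .meta file stores the graph definition, the checkpoint file is a small text file that records the most recent prefixes, and the events and graph files hold TensorBoard logs and a dump of the graph definition. The helper names below are hypothetical, and the model.ckpt-N naming is assumed from the question.

```python
import os
import re

# Matches e.g. "model.ckpt-1000.index" or "model.ckpt-1000.data-00000-of-00001".
# The "model.ckpt-N" naming is an assumption taken from the question.
CKPT_FILE = re.compile(r"^(model\.ckpt-\d+)\.(index|meta|data-\d+-of-\d+)$")


def describe_checkpoint_dir(model_dir):
    """Group the files in a model directory by checkpoint prefix."""
    roles = {"checkpoints": {}, "other": []}
    for name in sorted(os.listdir(model_dir)):
        match = CKPT_FILE.match(name)
        if match:
            prefix, suffix = match.groups()
            kind = "data" if suffix.startswith("data") else suffix
            roles["checkpoints"].setdefault(prefix, set()).add(kind)
        else:
            # e.g. "checkpoint", events files, graph dumps: no weights in these
            roles["other"].append(name)
    return roles


def restorable_prefixes(model_dir):
    """Return prefixes that have both an .index file and .data shards, by step."""
    found = describe_checkpoint_dir(model_dir)["checkpoints"]
    complete = [p for p, kinds in found.items() if {"index", "data"} <= kinds]
    return sorted(complete, key=lambda p: int(p.rsplit("-", 1)[1]))
```

If the prefix passed as init_checkpoint is not in the list returned by restorable_prefixes, the weights for that step are incomplete or missing, which is worth ruling out before looking elsewhere.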

Related

How to do transfer learning or fine-tune YOLOv4-darknet with some layers frozen?

I'm a beginner in the object detection field.
First, I followed the YOLOv4 custom-training tutorial from here and completed it successfully. Then I started thinking: if I have a new task that is similar to what the pre-trained YOLOv4 was trained on (COCO, 80 classes) and only a small dataset, it would be great to fine-tune the model (unfreezing only the last layer) to keep or even improve detector performance using only a small, similar dataset. This reference seems to support the kind of fine-tuning I want to do.
Then I went to Alexey's GitHub here to check how to freeze layers, and found that I should use stopbackward=1. It says:
"...set param stopbackward=1 for layer-136 in cfg-file"
But I have no idea where "layer-136" is in the cfg-file here, and I also have no idea where to put stopbackward=1 if I only want to unfreeze the last layer (freezing all the other layers). So, to summarize my questions:
Where (on which line) should I put stopbackward=1 in yolov4-custom.cfg if I want to unfreeze the last layer and freeze the other layers?
What is the "layer-136" mentioned in the Alexey GitHub reference? (Is it one of the classifier layers, or something else?)
On which line of yolov4-custom.cfg should I put stopbackward=1 for that layer-136?
Any further information is really appreciated. Please advise.
Thank you in advance.
Regards,
Sona
The "layer-136" is located just before the head of YOLOv4. To make it easy to find, open the .cfg file in the Netron app and read the same .cfg in a text editor, so you can work out where the layer sits. You can see each layer's input and output (the x-layer) when you inspect it in Netron.
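If you prefer a script to counting sections by hand, here is a minimal sketch that maps layer indices to line numbers in a Darknet cfg. It assumes that Darknet numbers layers from 0 and does not count the [net] section as a layer; the function names are hypothetical, so check the result against the layer table Darknet prints at startup.

```python
def index_cfg_layers(cfg_text):
    """Return (layer_index, section_name, line_number) for each layer section."""
    layers = []
    index = -1  # assumption: [net] is not a layer, so the first layer gets index 0
    for line_number, raw in enumerate(cfg_text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            if name in ("net", "network"):
                continue
            index += 1
            layers.append((index, name, line_number))
    return layers


def find_layer(cfg_text, wanted_index):
    """Return (section_name, line_number) for one layer index, or None."""
    for index, name, line_number in index_cfg_layers(cfg_text):
        if index == wanted_index:
            return name, line_number
    return None
```

Calling find_layer(open("yolov4-custom.cfg").read(), 136) would then give the section type and the line where that section starts; stopbackward=1 would go on its own line inside that section, like any other parameter of the layer.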

Evaluate the TensorFlow object detection model

How can I evaluate my object detection model in a simple and understandable way? I used TensorFlow's Object Detection API, but I didn’t understand the TensorBoard graphs. Can I evaluate it manually?
Any help? :(
Welcome to StackOverflow!
In short, yes you can. Yet it could be quite time-consuming to achieve your goal.
Here are the steps you might want to follow (assuming you have some basic understanding of TensorFlow graphs and sessions; otherwise please update your question):
Export your model to a frozen graph (*.pb file) via HERE. This step gives you an out-of-the-box model that you can load without any dependency on the Object Detection API.
Write a script to load your model (the frozen graph) and perform the evaluation. Some instructions can be found HERE. Make sure you use a tool such as Netron to check the input and output node names of your frozen graph.
Once you can run the evaluation, you can compute metrics such as mAP on your own dataset by looping through all the images.
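The last step can be sketched in plain Python. This is only a rough illustration of the building blocks behind mAP (IoU matching, then precision and recall for one image and one class), not the Object Detection API's own evaluation code. The box format (x_min, y_min, x_max, y_max), the 0.5 threshold, and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    # Greedy matching: highest-scoring detections pick their best box first.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Averaging precision over recall levels per class, and then over classes, is what turns these numbers into mAP.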
You could use the confusion matrix to evaluate your model on the test dataset.
After training the model on your dataset, export the inference graph for evaluation.
The link below walks you step by step through the evaluation.
Best of luck!
confusion_matrix
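A minimal sketch of the confusion-matrix idea in plain Python, assuming you have already paired each ground-truth object with the class your model predicted for it (that pairing step is not shown here). The function name and the labels in the usage note are made up for illustration.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]
```

For example, with true labels ["cat", "cat", "dog", "dog"] and predictions ["cat", "dog", "dog", "dog"], the matrix for classes ["cat", "dog"] is [[1, 1], [0, 2]]: one cat was mistaken for a dog, and no dog was mistaken for a cat.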

Looking for clarification on "running" Tensorflow models

From TensorFlow's documentation, there seems to be a large array of options for "running", serving, testing, and predicting using a TensorFlow model. I've made a model very similar to MNIST, where it outputs a distribution from an image. For a beginner, what would be the easiest way to take one or a few images and send them through the model to get an output prediction? It is mostly for experimentation purposes. Sorry if this is redundant, but all my research has led me to so many different ways of doing this, and the documentation doesn't really give any info on the pros and cons of the different methods. Thanks
I guess you are using placeholders for your model input and then using feed_dict to feed values into your model.
If that's the case, the simplest way is to save the trained model with tf.train.Saver. You can then write a test script that restores the model and calls sess.run on your output tensor, with a feed_dict containing whatever you want your input to be.

What does freezing a graph in TensorFlow mean?

I am a beginner in NN APIs and TensorFlow.
I am trying to save my trained model in protobuf format (.pb), and there are many blogs explaining how to save the model as protobuf. One thing I did not understand is the importance of freezing the graph before saving it as protobuf. I read that freezing converts variables to constants; does that mean the model is not trainable anymore?
What else does freezing do to a model?
What does the model lose after freezing?
Can anyone please explain or give some pointers on details of freezing?
This is only a partial answer to your question.
A frozen graph is easy to optimize. When doing inference (forward propagation), for instance, you can fuse some of the layers together. You can't do this with a graph that is split between variables and operations (a non-frozen graph). Why would you want to fuse layers together? There are multiple reasons. One is hardware specific: it might be easier to compute a number of operations together on a group of tensors, depending on the structure of your CPU or GPU. TensorRT, for instance, is a graph optimizer that starts from a frozen graph (more info on the graph optimizations done by TensorRT here: https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference/ ). This software does graph optimizations as well as hardware-specific ones.
As far as I understand, you can unfreeze a graph. I have only worked on optimizing them, so I haven't used this feature. But there is code for it here: https://gist.github.com/tokestermw/795cc1fd6d0c9069b20204cbd133e36b
Here is another question that might be useful:
TensorFlow: Is there a way to convert a frozen graph into a checkpoint model?
It has not yet been answered though.
Freezing the model means producing a single file that contains both the graph and the checkpoint variables, with those variables (the trained weights) saved as constants within the graph structure. This eliminates the additional information saved in the checkpoint files, such as optimizer state, which is included so that the model can be reloaded and training continued from where you left off. As this is not needed when serving a model purely for inference, it is discarded in freezing. A frozen model is saved as a protocol buffer (.pb) file.

How to predict using Tensorflow?

This is a newbie question for the TensorFlow experts:
I am reading a lot of data from a power transformer connected to an array of solar panels using Arduinos. My question is: can I use TensorFlow to predict future power generation?
I am completely new to TensorFlow, so if you can point me to something similar I can start with, or to any GitHub repo that does similar predictive modeling, that would help.
Edit: Kyle pointed me to the MNIST data, which I believe is an image dataset. Again, I'm not sure whether TensorFlow is the right computation library for this problem, or whether it only works on image datasets.
thanks, Rajesh
Surely you can use tensorflow to solve your problem.
TensorFlow™ is an open source software library for numerical
computation using data flow graphs.
So it works not only on image datasets but on other data as well. Don't worry about this.
As for prediction, you first need to train a model (such as linear regression) on your dataset, and then predict. Tutorial code can be found on the TensorFlow homepage.
Get your hands dirty, and you will find it works on your dataset.
Good luck.
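To show the train-then-predict flow mentioned above, here is a minimal sketch of a one-variable least-squares fit. It is written in plain Python rather than TensorFlow so that it runs with no dependencies; the function names and the sample readings are made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical readings: hour index vs. generated power (arbitrary units).
hours = [0, 1, 2, 3]
power = [1.0, 3.0, 5.0, 7.0]
model = fit_line(hours, power)
next_hour_estimate = predict(model, 4)
```

Real solar output is far from linear over a day, so treat this only as the shape of the workflow: fit on past readings, then call the model on a future input.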
You can absolutely use TensorFlow to predict time series. There are plenty of examples out there, like this one. And this is a really interesting one on using an RNN to predict basketball trajectories.
In general, TF is a very flexible platform for solving problems with machine learning. You can create any kind of network you can think of in it, and train that network to act as a model for your process. Depending on what kind of costs you define and how you train it, you can build a network to classify data into categories, predict a time series forward a number of steps, and other cool stuff.
There is, sadly, no short answer for how to do this, but that's just because the possibilities are endless! Have fun!