TFLearn, tf.contrib.learn or tf.estimator?

I've been tooling around with Tensorflow and TFLearn for a few months. I've made some progress. However, I was expecting to be able to construct a functioning scikit-learn type Estimator as a TFLearn.DNN. I can fit, and I can predict, but I can't do cross-validation because evaluate is failing for me. TensorFlow is throwing:
ValueError: Cannot use the given session to evaluate tensor: the tensor's graph is different from the session's graph.
when I call evaluate. I thought the whole point of the TFLearn API was to abstract things like session management away from my code.
I have asked questions about problems I've had with TFLearn in several forums, including on the project's GitHub page. Unfortunately, I'm not getting any answers.
A few days ago, I stumbled on the tf.contrib.learn namespace, and I'm seeing a lot of overlap between those classes and TFLearn. Then I also found the tf.estimator module.
Finally, I just figured out that tensorflow.contrib sub-packages are third-party contributions. This leads me to wonder whether the original TFLearn is simply being absorbed into the larger TensorFlow package. Which direction is the code flowing?
I don't care what I use, as long as I get all the functionality of a scikit-learn estimator object.

I think it's best to use the official sub-modules of TensorFlow like tf.data and tf.estimator. They should be well maintained and features are added quickly.
For instance, @mrry seems to be in charge of tf.data, and the module is very clean, easy to use, and compatible with tf.estimator.
The tf.estimator module is a bit less clear and grew out of tf.contrib.learn. Don't take my word for it, but I think tf.estimator will slowly replace tf.contrib.learn and become the official high-level API for TensorFlow (along with tf.keras).
You can find more information in the official blog post, where they explain the relationship between all modules.
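To make the comparison concrete, here is a minimal sketch of the scikit-learn-style workflow with a canned tf.estimator.DNNClassifier; the toy data, feature names, and hyperparameters are made up for illustration, and graph/session management is handled for you:

    import numpy as np
    import tensorflow as tf

    # Toy data; replace with your own features and labels.
    train_x = {"x": np.random.rand(120, 4).astype(np.float32)}
    train_y = np.random.randint(0, 3, size=120)
    test_x = {"x": np.random.rand(30, 4).astype(np.float32)}
    test_y = np.random.randint(0, 3, size=30)

    def input_fn(features, labels, training=True, batch_size=32):
        # Build a tf.data pipeline from in-memory arrays.
        ds = tf.data.Dataset.from_tensor_slices((features, labels))
        if training:
            ds = ds.shuffle(1000).repeat()
        return ds.batch(batch_size)

    feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]

    estimator = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[16, 16],
        n_classes=3)

    # fit / evaluate / predict analogues; no explicit sessions or graphs.
    estimator.train(input_fn=lambda: input_fn(train_x, train_y), steps=200)
    metrics = estimator.evaluate(
        input_fn=lambda: input_fn(test_x, test_y, training=False))
    print(metrics)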

Related

How to use a custom model with Tensorflow Hub?

My goal is to test out Google's BERT algorithm in Google Colab.
I'd like to use a pre-trained custom model for Finnish (https://github.com/TurkuNLP/FinBERT). The model is not available in the TF Hub library, and I have not found a way to load it with TensorFlow Hub.
Is there a neat way to load and use a custom model with TensorFlow Hub?
Fundamentally: yes. Everyone can create the kind of models that TF Hub hosts, and I hope authors of interesting models do consider that.
For TF1 and the hub.Module format tailored to it, see
https://www.tensorflow.org/hub/tf1_hub_module#creating_a_new_module
For TF2 and its revised SavedModel format, see
https://www.tensorflow.org/hub/tf2_saved_model#creating_savedmodels_for_tf_hub
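For completeness, consuming a model exported in either format is just a matter of pointing tensorflow_hub at a local path or URL; a minimal sketch (the paths below are placeholders, not the FinBERT model):

    import tensorflow_hub as hub

    # TF2-style SavedModel (exported per the tf2_saved_model guide):
    model = hub.load("/path/to/exported_savedmodel")

    # or wrapped as a Keras layer inside a larger model:
    layer = hub.KerasLayer("/path/to/exported_savedmodel", trainable=False)

    # TF1-style hub.Module (requires a TF1 graph/session environment):
    # module = hub.Module("/path/to/exported_module")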
That said, a sophisticated model like BERT requires a bit of attention to export it with all bells and whistles, so it helps to have some tooling to build on. The BERT reference implementation for TF2 at https://github.com/tensorflow/models/tree/master/official/nlp/bert comes with an open-sourced export_tfhub.py script, and anyone can use that to export custom BERT instances created from that code base.
However, I understand from https://github.com/TurkuNLP/FinBERT/blob/master/nlpl_tutorial/training_bert.md#general-info that you are using Nvidia's fork of the original TF1 implementation of BERT. There are Hub modules created from the original research code, but the tooling to that end has not been open-sourced, and Nvidia doesn't seem to have added their own either.
If that's not changing, you'll probably have to resort to doing things the pedestrian way and get acquainted with their codebase and load their checkpoints into it.

Redundancies in tf.keras.backend and tensorflow libraries

I have been working in TensorFlow for about a year now and am transitioning from TF 1.x to TF 2.0, and I am looking for some guidance on how to use the tf.keras.backend library in TF 2.0. I understand that the transition to TF 2.0 is supposed to remove a lot of the redundancy in modeling and building graphs, since there were many ways to create equivalent layers in earlier TensorFlow versions (and I'm insanely grateful for that change!). But I'm getting stuck on understanding when to use tf.keras.backend, because its operations appear redundant with other TensorFlow libraries.
I see that some of the functions in tf.keras.backend are redundant with other TensorFlow libraries. For instance, tf.keras.backend.abs and tf.math.abs are not aliases (or at least, they're not listed as aliases in the documentation), but both take the absolute value of a tensor. After examining the source code, it looks like tf.keras.backend.abs calls the tf.math.abs function, and so I really do not understand why they are not aliases. Other tf.keras.backend operations don't appear to be duplicated in TensorFlow libraries, but it looks like there are TensorFlow functions that can do equivalent things. For instance, tf.keras.backend.cast_to_floatx can be substituted with tf.dtypes.cast as long as you explicitly specify the dtype. I am wondering two things:
when is it best to use the tf.keras.backend library instead of the equivalent TensorFlow functions?
is there a difference in these functions (and other equivalent tf.keras.backend functions) that I am missing?
Short answer: Prefer tensorflow's native API such as tf.math.* to the tf.keras.backend.* API wherever possible.
Longer answer:
The tf.keras.backend.* API can mostly be viewed as a remnant of the keras.backend.* API. The latter served the "exchangeable backend" design of the original (non-TF-specific) Keras, which supported multiple backend libraries, of which tensorflow used to be just one. Back in 2015 and 2016, other backends such as Theano and MXNet were quite popular too, but going into 2017 and 2018, tensorflow became the dominant backend for keras users. Eventually keras became a part of the tensorflow API (in 2.x and later minor versions of 1.x). In the old multi-backend world, the backend.* API provided a backend-independent abstraction over the myriad of supported backends. But in the tf.keras world, the value of the backend API is much more limited.
The various functions in tf.keras.backend.* can be divided into a few categories:
Thin wrappers around the equivalent or mostly-equivalent tensorflow native API. Examples: tf.keras.backend.less, tf.keras.backend.sin
Slightly thicker wrappers around tensorflow native APIs, with more features included. Examples: tf.keras.backend.batch_normalization, tf.keras.backend.conv2d (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/backend.py#L4869). They often perform preprocessing and implement other logic, which makes your life easier than using the native tensorflow API directly.
Unique functions that don't have equivalent in the native tensorflow API. Examples: tf.keras.backend.rnn, tf.keras.backend.set_learning_phase
For category 1, use native tensorflow APIs. For categories 2 and 3, you may want to use the tf.keras.backend.* API, as long as you can find the function on the documentation page: https://www.tensorflow.org/api_docs/python/. The documented ones have backward-compatibility guarantees, so you don't need to worry about a future version of tensorflow removing them or changing their behavior.
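To make category 1 concrete, here is a small sketch (the values are arbitrary) showing that the wrapper and the native op give the same result, and that cast_to_floatx is simply a cast to Keras' floatx dtype (float32 by default):

    import numpy as np
    import tensorflow as tf

    x = tf.constant([-1.5, 2.0, -3.0], dtype=tf.float64)

    # Category 1: thin wrapper vs. native op -- identical results.
    print(tf.math.abs(x))             # preferred
    print(tf.keras.backend.abs(x))    # delegates to the same underlying op

    # cast_to_floatx vs. an explicit cast: equivalent once the dtype is spelled out.
    print(tf.keras.backend.cast_to_floatx(np.array([-1.5, 2.0, -3.0])))
    print(tf.cast(x, tf.float32))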

Recreating BERT extract_features.py output with TensorFlow Hub model

I produced a feature vector I am very happy with by cloning the BERT repo, downloading the "BERT-Base, Uncased" pre-trained model, and running extract_features.py like so:
PYTHONPATH=. python extract_features.py --input_file=~/sandbox/input.txt --output_file=~/sandbox/bert_output.jsonl --vocab_file=$BERT_BASE_DIR/vocab.txt --bert_config_file=$BERT_BASE_DIR/bert_config.json --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt --layers=-2 --max_seq_length=128 --batch_size=8
Note the --layers=-2 arg, which specifies that I want the features from the second-to-last layer.
I am now trying to reproduce the same features using this TensorFlow Hub model, which I believe to be the same model. I used this hack suggested on the TF Hub GitHub to access the desired layer, since only the output layer is exposed. The feature vector I get is suspiciously close, but not identical (individual floats are within about 1% of each other). I have confirmed that my input tokens are identical in both cases. Hoping someone with more knowledge of BERT configurations and internals can spot something obvious that I've overlooked, or suggest a way to proceed with debugging? I'm at a loss since the interfaces are pretty different.
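One way to make the "within about 1%" observation precise while debugging is to compare the two vectors numerically once both are in numpy form; a rough sketch, assuming the second-to-last-layer features from each pipeline have been dumped to .npy files (the file names are placeholders):

    import numpy as np

    repo_vec = np.load("features_extract_features.npy")  # from the extract_features.py run
    hub_vec = np.load("features_tfhub.npy")              # from the TF Hub model

    rel_diff = np.abs(repo_vec - hub_vec) / (np.abs(repo_vec) + 1e-8)
    print("max relative difference:", rel_diff.max())
    print("within 1% everywhere:", np.allclose(repo_vec, hub_vec, rtol=0.01))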

fast.ai equivalent in tensorflow

Is there any equivalent or alternative library to fastai in TensorFlow for easier training and debugging of deep learning models, including analysis of the results of a trained model?
Fastai is built on top of PyTorch; I'm looking for something similar in TensorFlow.
The obvious choice would be to use tf.keras.
It is bundled with tensorflow and is becoming its official "high-level" API -- to the point where in TF 2 you would probably need to go out of your way not to use it at all.
It was clearly a source of inspiration for fastai to ease the use of pytorch as Keras does for tensorflow, as mentioned by the authors time and again:
Unfortunately, Pytorch was a long way from being a good option for part one of the course, which is designed to be accessible to people with no machine learning background. It did not have anything like the clear simple API of Keras for training models. Every project required dozens of lines of code just to implement the basics of training a neural network. Unlike Keras, where the defaults are thoughtfully chosen to be as useful as possible, Pytorch required everything to be specified in detail. However, we also realised that Keras could be even better. We noticed that we kept on making the same mistakes in Keras, such as failing to shuffle our data when we needed to, or vice versa. Also, many recent best practices were not being incorporated into Keras, particularly in the rapidly developing field of natural language processing. We wondered if we could build something that could be even better than Keras for rapidly training world-class deep learning models.
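For a sense of what that "clear simple API" looks like in tf.keras, here is a minimal sketch; the dataset, architecture, and hyperparameters are only illustrative:

    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
    model.evaluate(x_test, y_test)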

How to deploy parsey's cousins with tensorflow serving

Are there instructions or documentation somewhere, or could somebody describe how to deploy the models available as "Parsey's Cousins" (see https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md) with SyntaxNet under Tensorflow Serving? Even deploying just Parsey is a rather complex undertaking that is not really documented anywhere, but how would one do this for the additional 40 languages?
This pull request partially addresses your request, but it still has some issues: https://github.com/tensorflow/models/pull/250.
We do have some tentative plans to provide easier integration between SyntaxNet and Tensorflow Serving, but no precise timeline.
Just for the benefit of anyone else who finds this question, after some digging around on GitHub, one can find the following issue started by Johann Petrak:
https://github.com/dsindex/syntaxnet/issues/7
a model from Parsey's Cousins cannot be exported by that patch due to a version mismatch
So whilst some people have been able to modify syntaxnet so that it works with Tensorflow Serving, this seems to be at the cost of using a version which is not compatible with Parsey's Cousins.
Currently the only way to get Tensorflow Serving working with languages other than English is to use something like dsindex's code and train your own models.