Saving Word2Vec for CNN Text Classification

I want to train my own Word2Vec model for my text corpus. I can get the code from TensorFlow's tutorial. What I don't know is how to save this model to use for CNN text classification later? Should I use pickle to save it and then read it later?

No, pickling is not the usual way to save a model in TensorFlow.
TensorFlow Serving works with models exported as protocol buffers. To save the model itself, create a tf.train.Saver and save the TensorFlow session as a checkpoint:
saver = tf.train.Saver()
saver.save(sess, 'my_test_model', global_step=1000)
Here is the link to the complete answer:
Tensorflow: how to save/restore a model?

You can use pickle to save it to disk. Then when you are creating the CNN model, load the saved word embedding table and use it to initialize the TensorFlow variable that holds the word embeddings for your CNN classifier.
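A minimal sketch of the pickle approach described above, using only the standard library. The vocabulary, the random embedding values, the file name, and the helper names save_embeddings and load_embeddings are all hypothetical stand-ins for whatever your Word2Vec training actually produces.

```python
import os
import pickle
import random
import tempfile

# Hypothetical vocabulary and embedding table standing in for trained Word2Vec output.
vocab = {"<unk>": 0, "good": 1, "bad": 2, "movie": 3}
embedding_dim = 4
rng = random.Random(0)
embeddings = [[rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)] for _ in vocab]


def save_embeddings(path, vocab, embeddings):
    """Pickle the token-to-index mapping together with the embedding table."""
    with open(path, "wb") as f:
        pickle.dump({"vocab": vocab, "embeddings": embeddings}, f)


def load_embeddings(path):
    """Read back the mapping and the table written by save_embeddings."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["vocab"], data["embeddings"]


# Round trip through a temporary file (the file name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "word2vec_embeddings.pkl")
save_embeddings(path, vocab, embeddings)
loaded_vocab, loaded_embeddings = load_embeddings(path)
```

In the CNN script, the loaded table could then serve as the initial value of the embedding variable, for example tf.Variable(loaded_embeddings), provided the classifier maps tokens to the same indices as loaded_vocab.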

Related

How to deploy custom tensorflow model to web?

I'm facing a problem deploying my custom sign-language recognition model. I converted my_ssd_mobnet with exporter_main_v2.py to saved_model.pb and then tried to use the tensorflowjs converter with this code:
import tensorflow as tf
import tensorflowjs as tfjs
def importModel(modelPath):
    model = tf.keras.models.load_model(modelPath)
    tfjs.converters.save_tf_model(model, "tfjsmodel")
importModel("saved_model")
#importModel("modelDirectory")
Then I got this error:
ValueError: Unable to create a Keras model from this SavedModel. This SavedModel was created with tf.saved_model.save, and lacks the Keras metadata. Please save your Keras model by calling model.save or tf.keras.models.save_model.
Finally I decided to convert my model to h5, but I don't know how.
How can I convert the my_ssd_mobnet model to h5?
Thanks!

If you're creating a custom Keras layer in python and wanting to export it to tfjs for the browser to predict, then you'll most likely encounter "Unknown layer" and will have to implement them yourself in JS.
Instead of exporting the layers, it's best to export a graph since you're only using it for prediction and not training in the browser.
tf.saved_model.save(model, 'saved_model')
This saves the model files, including the .pb file, in the saved_model folder.
Use the tensorflowjs_converter tool to convert the model into a tfjs graph model.
tensorflowjs_converter --input_format=tf_saved_model saved_model model
This converts your saved model into a browser-compatible tfjs model without the custom layer. (The Keras layers will be built in.)
Move the resulting model folder to your website's public folder.
In the browser:
const model = await tf.loadGraphModel('/model/model.json')
const img = tf.browser.fromPixels(imageData, 3) // imageElement, videoElement, ImageData
  .toFloat().resizeBilinear([224, 224]) // mobilenet dims
  .div(tf.scalar(255)) // mobilenet [0,1] normalization
  .expandDims()
const { values, indices } = model.predict(img).topk()
const label = indices.dataSync()[0]
const confidence = values.dataSync()[0]
NOTE: The .bin files can end up in the tens of MB, so run this inside a web worker. You can send buffered data from the main thread to the worker thread for processing.

First and foremost, if you used the "exporter_main_v2.py" script to export the model, you only get the model in TensorFlow SavedModel format. This way of exporting is mainly used to run inference on the trained model. So the main problem in your code is that you are trying to load it as a Keras model with the tf.keras.models.load_model() function. Instead of using "exporter_main_v2.py", you have to use the tf.keras.models.save_model() function to export/save your model.
Here is a short video that explains a few of these points:
https://www.youtube.com/watch?v=Lx7OCFXPG8o
After watching the video, you might want to check out the following Colab notebook:
https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l07c01_saving_and_loading_models.ipynb
This material is provided by Udacity as part of its introduction to TensorFlow course. It should help you understand the difference between a TensorFlow model file and a Keras model file.
Have a nice day.
Edit:
HDF5 format
Keras provides a basic save format using the HDF5 standard.
# Create and train a new model instance.
model = create_model()
model.fit(train_images, train_labels, epochs=5)
# Save the entire model to a HDF5 file.
# The '.h5' extension indicates that the model should be saved to HDF5.
model.save('my_model.h5')
Add the '.h5' extension to the filename when calling model.save; the model will then be saved in HDF5 format.

Training a keras model on pretrained weights using load_weights()

I am using a custom Keras model in a Databricks environment.
For a custom Keras model, model.save("model.h5") does not work, because the custom model is not serializable. Instead, it is recommended to use model.save_weights(path) as an alternative.
model.save_weights(pathDirectory) works. This yields three files in pathDirectory: checkpoint, .data-00000-of-00001, and .index.
For loading weights, the following mechanism works fine:
model = Model()
model.load_weights(path)
But I want to continue training my model from the weights I just saved: save the model weights, load them later, and keep training from there.
When I load the model weights and run the training loop, I get this error: TypeError: 'CheckpointLoadStatus' object is not callable

After much research, I have found a workaround:
we can also save the model using
model.save("model.hpy5") and then read the saved model in Databricks.
model.h5 does not work for customized models, but it works for standard models.

Save trained gensim word2vec model as a tensorflow SavedModel

Is there an option to save a trained Gensim Word2Vec model as a SavedModel using TF 2.0's tf.saved_model.save? In other words, how can I save trained embedding vectors as a SavedModel signature to work with TensorFlow 2.0? The following steps are not correct:
model = gensim.models.Word2Vec(...)
model.init_sims(..)
model.train(..)
model.save(..)
module = gensim.models.KeyedVectors.load_word2vec(...)
tf.saved_model.save(
    module,
    export_dir
)
EDIT:
This example helped me figure out how to do it: https://keras.io/examples/nlp/pretrained_word_embeddings/

Gensim does not use TensorFlow and has its own methods for loading and saving models.
You would need to convert the Gensim embeddings into a TensorFlow model, which only makes sense if you plan to use the embeddings within TensorFlow and possibly fine-tune them for your task.
A Gensim Word2Vec model corresponds to two steps in TensorFlow:
Vocabulary lookup: a table that assigns indices to tokens.
Embedding lookup: a layer that picks up the actual embeddings for the indices.
Then, you can save it like any other TensorFlow model.
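The two steps can be illustrated without any framework. This is only a conceptual sketch in plain Python: the tokens, the vector values, and the helper names lookup_indices and lookup_embeddings are hypothetical, and in practice the vocabulary and vectors would be exported from the trained Gensim model.

```python
# Hypothetical toy data standing in for a trained Gensim model's vocabulary and vectors.
tokens = ["<unk>", "cat", "dog", "runs"]
vectors = [
    [0.0, 0.0, 0.0],  # <unk>
    [0.1, 0.2, 0.3],  # cat
    [0.4, 0.5, 0.6],  # dog
    [0.7, 0.8, 0.9],  # runs
]

# Step 1: vocabulary lookup, a table that assigns an index to each token.
token_to_index = {token: i for i, token in enumerate(tokens)}


def lookup_indices(sentence, unk_index=0):
    """Map whitespace-separated tokens to indices; unknown tokens map to unk_index."""
    return [token_to_index.get(tok, unk_index) for tok in sentence.split()]


# Step 2: embedding lookup, which picks the vector stored at each index.
def lookup_embeddings(indices):
    return [vectors[i] for i in indices]


indices = lookup_indices("dog runs fast")
embedded = lookup_embeddings(indices)
```

In TensorFlow, the second step is what an embedding layer such as tf.keras.layers.Embedding does once its weights are set to the exported vectors; the first step needs a lookup table or preprocessing layer built from the same token order.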

How to convert a Tensorflow session to a keras model object?

Suppose I have a pre-trained model stored in a Tensorflow checkpoint. I'd like to convert it into a Keras model. I can load the checkpoint into a TF session alright but that's where I get stuck.

I think it's impossible to create a Keras model directly from a TF checkpoint, but you can copy its weights into an already created Keras model.
Check out this repository: https://github.com/yuyang-huang/keras-inception-resnet-v2
There, extract_weights.py saves the TF weights to numpy arrays, while load_weights.py loads the npy files into the Keras model.
For further reference, this is how I implemented it: https://github.com/DableUTeeF/keras-efficientnet/tree/master/keras_efficientnet

Save and load a Tensorflow model after training to predict new input

Hello TensorFlow community.
I am new to TensorFlow. I use it to classify images, and I am currently working with the cats_dogs dataset.
I want to save my model after training and load it in another program to run predictions on other inputs.
Is there a way to do that?
