I have been trying to fine-tune a conversational model from HuggingFace: Blenderbot. I have tried the conventional method given on the official Hugging Face website, which asks us to do it using the trainer.train() method. I also tried it using the .compile() method. I have tried fine-tuning with both PyTorch and TensorFlow on my dataset. Both methods fail and give an error saying that there is no method called compile or train for the Blenderbot model.
I have also looked everywhere online to check how Blenderbot could be fine-tuned on my custom data, and nowhere is there a proper explanation that runs without throwing an error. I have gone through YouTube tutorials, blogs, and StackOverflow posts, but none of them answer this question. Hoping someone will respond here and help me out. I am open to using other HuggingFace conversational models for fine-tuning as well.
Thank you! :)
Here are the links I am using to fine-tune the Blenderbot model:
Fine-tuning methods: https://huggingface.co/docs/transformers/training
Blenderbot: https://huggingface.co/docs/transformers/model_doc/blenderbot
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration
mname = "facebook/blenderbot-400M-distill"
model = BlenderbotForConditionalGeneration.from_pretrained(mname)
tokenizer = BlenderbotTokenizer.from_pretrained(mname)
# FOR TRAINING (Trainer comes from transformers; training_args,
# small_train_dataset, small_eval_dataset and compute_metrics are set up
# as in the linked fine-tuning guide):
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()
# OR, with TensorFlow/Keras (tf is `import tensorflow as tf`; tf_train_dataset
# and tf_validation_dataset come from the same guide):
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.metrics.SparseCategoricalAccuracy()],
)
model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3)
None of these work! :(
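For reference, here is a rough sketch of what the Trainer recipe from the linked fine-tuning guide could look like when applied to a seq2seq model such as Blenderbot. The dataset column names ("context", "response"), the max length and the hyperparameters are assumptions, not something from the original post, and small_train_dataset is assumed to be a datasets.Dataset with those columns:

from transformers import (
    BlenderbotTokenizer,
    BlenderbotForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

mname = "facebook/blenderbot-400M-distill"
tokenizer = BlenderbotTokenizer.from_pretrained(mname)
model = BlenderbotForConditionalGeneration.from_pretrained(mname)

def preprocess(batch):
    # "context" and "response" are placeholder column names for a custom dataset
    inputs = tokenizer(batch["context"], truncation=True, max_length=128)
    targets = tokenizer(batch["response"], truncation=True, max_length=128)
    inputs["labels"] = targets["input_ids"]
    return inputs

tokenized_train = small_train_dataset.map(
    preprocess, batched=True, remove_columns=small_train_dataset.column_names
)

training_args = TrainingArguments(
    output_dir="blenderbot-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    # pads input_ids and labels to the longest sequence in each batch
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()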
Related
I have spent a few hours troubleshooting this issue but have no clue... hope someone can help.
The error I got is below:
ValueError: Model XXXXXXX cannot be saved either because the input shape is not available or because the forward pass of the model is not defined. To define a forward pass, please override Model.call(). To specify an input shape, either call build(input_shape) directly, or call the model on actual data using Model(), Model.fit(), or Model.predict(). If you have a custom training step, please make sure to invoke the forward pass in train step through Model.__call__, i.e. model(inputs), as opposed to model.call().
import tensorflow as tf
import tensorflow_hub as hub

class queryEncoder(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Load the Universal Sentence Encoder QA module and keep its question encoder
        module = hub.load('https://tfhub.dev/google/universal-sentence-encoder-qa/3')
        self.encoder = module.signatures['question_encoder']

    def call(self, inputs):
        # Encode a batch of query strings and return their embeddings
        return self.encoder(input=tf.constant(inputs))['outputs']

qModel = queryEncoder()
qModel.save('use-query-encoder')
!gsutil cp use-query-encoder gs://ed-model-artifacts-xxyyzz
Actually, what I wanted to do is follow Google's YouTube tutorial to build a Q&A model with Vertex AI.
https://www.youtube.com/watch?v=5iSmX8sqtx8
But the tutorial didn't give any source code, so I typed the code by following the video.
If I don't run qModel.save('use-query-encoder') but instead run qModel(["This is hello world"]), it seems to work: the embeddings are returned successfully.
So the model seems to be working fine, but somehow it cannot be saved.
I have read a lot of posts on StackOverflow, but I still cannot solve my issue. Perhaps I am too new to Python and TF. Some posts said I should run fit() before save(); however, in this case I don't know what parameters to put in fit()...
I am a newbie to Python, so perhaps I did something silly. If so, please point it out :)
I'm quite new to object detection, but I managed to train my first custom TensorFlow model yesterday. I think it worked fine besides some warnings; at least I got my exported_model folder with a checkpoint, a saved_model and pipeline.config. I built it with TensorFlow's exporter_main_v2.py. I just loaded some images of deer and want to try to detect deer in different pictures.
That's what I would like to test now, but I don't know how. I already did an object detection tutorial with pre-trained models and it worked fine. I tried to just replace config_file_path, saved_model_path and image_path with the paths pointing to my exported model, but it didn't work:
error: OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\tensorflow\tf_io.cpp:42: error: (-2:Unspecified error) FAILED: ReadProtoFromBinaryFile(param_file, param). Failed to parse GraphDef file: D:\VSCode\Machine_Learning_Tests\Tensorflow\workspace\exported_models\first_model\saved_model\saved_model.pb in function 'cv::dnn::ReadTFNetParamsFromBinaryFileOrDie'
There are endless tutorials on how to train a custom detector, but I can't find a good explanation of how to manually test my exported model.
Thanks in advance!
EDIT: I need to know how to build a script where I can load a model I saved with TensorFlow's exporter_main_v2.py, feed it an image I want to test the model on, and get a result, either as text or as rectangles drawn on the picture. I have seen many tutorials, but none of them works for me with a model saved with TensorFlow's exporter_main_v2.py.
From the error it looks like you have a model saved as a .pb file. If you want to do inference, you can write something like this:
import tensorflow as tf

# load the model and run it on the test images
model = tf.keras.models.load_model(my_model_dir)
prediction = model.predict(x=x_test, ...)
You'll have to set x, which is the only mandatory argument: it is your test dataset (the images you want to obtain predictions for). Also, predict is useful when you have a large number of images to predict: it handles prediction in batches, avoiding filling up the memory. If you have just a few images, you can directly use the __call__() method of your model, like this:
prediction = model(x_test, training=False)
More about prediction can be found in the TensorFlow documentation.
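If the exported folder comes from the Object Detection API's exporter_main_v2.py rather than from a plain Keras model.save, the SavedModel is usually loaded with tf.saved_model.load instead. A rough sketch under that assumption (the paths, the test file name and the 0.5 threshold are placeholders; the output keys follow the Object Detection API convention):

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the SavedModel exported by exporter_main_v2.py (placeholder path)
detect_fn = tf.saved_model.load("exported_models/first_model/saved_model")

# Read one test image and add a batch dimension; the exported model expects uint8
image = np.array(Image.open("test_image.jpg"))
input_tensor = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.uint8)

detections = detect_fn(input_tensor)

scores = detections["detection_scores"][0].numpy()
boxes = detections["detection_boxes"][0].numpy()   # normalized [ymin, xmin, ymax, xmax]
classes = detections["detection_classes"][0].numpy().astype(int)

for box, score, cls in zip(boxes, scores, classes):
    if score > 0.5:
        print(f"class {cls}  score {score:.2f}  box {box}")

The boxes are normalized to the image size, so multiply them by the image height and width before drawing rectangles with OpenCV or PIL.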
Summary
My question is composed of:
A context in which I present my project, my working environment and my workflow
The detailed problem
The concerned parts of my code
The solutions I tried to solve my problem
A restatement of the question
Context
I've written a Python Keras implementation of a downgraded version of the original Super-Resolution GAN. Now I want to test it using Google Firebase Machine Learning Kit, by hosting it on Google's servers. That's why I have to convert my Keras program to a TensorFlow Lite one.
Environment and workflow (with the problem)
I'm training my program in the Google Colab working environment: there, I've installed TF 2.0.0-beta1 (this choice is motivated by this incorrect answer: https://datascience.stackexchange.com/a/57408/78409).
Workflow (and problem):
I write my Python Keras program locally, keeping in mind that it will run on TF 2. So I use TF 2 imports, for example: from tensorflow.keras.optimizers import Adam and from tensorflow.keras.layers import Conv2D, BatchNormalization
I send my code to my Drive
I run my Google Colab notebook without any problem: TF 2 is used.
I get the output model in my Drive, and I download it.
I try to convert this model to the TFLite format by executing the following CLI command: tflite_convert --output_file=srgan.tflite --keras_model_file=srgan.h5. This is where the problem appears.
The problem
Instead of outputting the TF Lite converted model from the TF (Keras) model, the previous CLI command outputs this error:
ValueError: Unknown loss function:build_vgg19_loss_network
The function build_vgg19_loss_network is a custom loss function that I've implemented and that must be used by the GAN.
Parts of the code that raise this problem
Presenting the custom loss function
The custom loss function is implemented like this:
def build_vgg19_loss_network(ground_truth_image, predicted_image):
loss_model = Vgg19Loss.define_loss_model(high_resolution_shape)
return mean(square(loss_model(ground_truth_image) - loss_model(predicted_image)))
Compiling the generator network with my custom loss function
generator_model.compile(optimizer=the_optimizer, loss=build_vgg19_loss_network)
What I've tried to do in order to solve the problem
As I read on StackOverflow (link at the beginning of this question), TF 2 was supposed to be sufficient to output a Keras model that would be correctly processed by my tflite_convert CLI. But obviously it is not.
As I read on GitHub, I tried to manually register my custom loss function among Keras' loss functions by adding these lines: import tensorflow.keras.losses and tensorflow.keras.losses.build_vgg19_loss_network = build_vgg19_loss_network. It didn't work.
I read on GitHub that I could use custom objects with Keras' load_model function, but I only want to use Keras' compile function, not load_model.
My final question
I want to make only minor changes to my code, since it works fine. So I don't want, for example, to replace compile with load_model. With this constraint, could you please help me make my tflite_convert CLI work with my custom loss function?
Since you are saying that the TFLite conversion is failing due to a custom loss function, you can save the model file without keeping the optimizer details. To do that, set the include_optimizer parameter to False as shown below:
model.save('model.h5', include_optimizer=False)
Now, if all the layers inside your model are convertible, they should get converted into a TFLite file.
Edit:
You can then convert the h5 file like this:
import tensorflow as tf
model = tf.keras.models.load_model('model.h5') # srgan.h5 for you
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
The usual practice for overcoming unsupported operators in TFLite conversion is documented here.
I had the same error. I recommend changing the loss to "mse", since you already have a well-trained model and you don't need to train with the .tflite file.
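A sketch of that suggestion (not from the original answer): recompile the already-trained generator with a built-in loss so the saved .h5 no longer references the custom loss, then save and convert as before. Recompiling only replaces the stored training configuration; the trained weights are untouched.

# generator_model and the_optimizer are the objects from the question above
generator_model.compile(optimizer=the_optimizer, loss="mse")
generator_model.save("srgan.h5")
# then: tflite_convert --output_file=srgan.tflite --keras_model_file=srgan.h5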
I am trying to model a Bayesian network in Python using the pomegranate package. The network should be learned from data, so I am using the .from_samples method. However, I am having trouble with the .predict_proba() method: it gives me an error.
This is how I build the model:
model = BayesianNetwork.from_samples(X_train, algorithm='chow-liu')
and this is how I do prediction:
model.predict_proba(X_train)
and this is the error I get:
ValueError: Sample does not have the same number of dimensions as the model.
Your help would be highly appreciated.
I found the answer: you should define state_names when calling the from_samples method (a minimal sketch follows below).
Another question is how do we do classification using this model?
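A minimal sketch of the state_names fix mentioned above; the names are placeholders for the columns of X_train:

from pomegranate import BayesianNetwork

model = BayesianNetwork.from_samples(
    X_train,
    algorithm='chow-liu',
    state_names=['var_0', 'var_1', 'var_2'],  # one name per column of X_train
)
model.predict_proba(X_train)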
You should use the predict() method to predict the states of the unobserved nodes (see the sketch after these notes).
Check the documentation for more details.
Also, in the repository you can find some interesting tutorials that will help you.
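For example, a hypothetical sketch of classification with pomegranate: pass the node you want to classify as None and predict() fills in its most likely state (the values and column layout are placeholders):

sample = [0, 1, None]            # observed values, with the target node left as None
print(model.predict([sample]))   # returns the sample with None replaced by the prediction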
Please add [] around the sample you are passing
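In other words, a single row has to be passed as a batch of one sample, e.g.:

model.predict_proba([X_train[0]])   # note the extra [] around the single sample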
I fine-tuned the im2txt model and obtained the ckpt.data, ckpt.index and ckpt.meta files and a graph.pbtxt file using the procedure described in the im2txt GitHub repository.
The model seems to work well as it produces almost correct captions.
Now I would like to freeze this model to use it on android.
I used the freeze_graph.py script in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py.
python freeze_graph.py --input_graph=/path/to/graph.pbtxt --input_binary=false --input_checkpoint=/path/to/model.ckpt --output_graph=/path/to/output_graph.pb --output_node_names="softmax,lstm/initial_state,lstm/state"
And I get the following error: AssertionError: softmax is not in graph.
The discussion in https://github.com/tensorflow/models/issues/816 is about the same problem, but it did not help me very much.
Indeed, when I look at the graph.pbtxt generated after fine-tuning, I cannot find softmax, lstm/initial_state or lstm/state.
But in the show_and_tell_model.py file of im2txt, the tensor names seem to be "softmax", "lstm/initial_state" and "lstm/state", so I don't know what's happening.
I hope I was clear enough about what I've tried so far. Thanks in advance for any help.
Regards,
Stephane
I found and verified the answer: in inference_wrapper_base.py, just add something like saver.save(sess, "model/ckpt4") after saver.restore(sess, checkpoint_path) in def _restore_fn(sess):. Then rebuild, run run_inference, and you'll get a model that can be frozen, transformed, and optionally memmapped, to be loaded by iOS and Android apps.
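A sketch of that edit in context (the surrounding code is abbreviated; the checkpoint path is just the example used above):

def _restore_fn(sess):
    saver.restore(sess, checkpoint_path)
    # Re-save immediately so the inference-mode graph gets a checkpoint
    # that matches the graph you are about to freeze.
    saver.save(sess, "model/ckpt4")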
For detailed commands to freeze, transform, and convert to memmapped, see my answer at Error using Model after using optimize_for_inference.py on frozen graph.
OK, I think I finally found the solution. In case it is useful for others, here it is:
After training, you obtain ckpt.data, ckpt.index and ckpt.meta files and a graph.pbtxt file.
You then have to load this model in 'inference' mode (see InferenceWrapper in im2txt). This builds a graph with the correct names 'softmax', 'lstm/initial_state' and 'lstm/state'. You save this graph (in the same ckpt format), and then you can apply the freeze_graph script to obtain the frozen model.
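A rough sketch of those steps in TF 1.x style, as used by im2txt (the paths are placeholders and this is only my reading of the procedure described above):

import tensorflow as tf
from im2txt import configuration, inference_wrapper

# Rebuild the graph in inference mode so that "softmax", "lstm/initial_state"
# and "lstm/state" exist under those names.
g = tf.Graph()
with g.as_default():
    model = inference_wrapper.InferenceWrapper()
    restore_fn = model.build_graph_from_config(
        configuration.ModelConfig(), "/path/to/train_dir"
    )
    saver = tf.train.Saver()

with tf.Session(graph=g) as sess:
    restore_fn(sess)  # load the fine-tuned weights into the inference graph
    saver.save(sess, "/path/to/inference/model.ckpt")
    tf.train.write_graph(g.as_graph_def(), "/path/to/inference", "graph.pbtxt")

# freeze_graph.py can then be run on this graph.pbtxt / model.ckpt pair with
# --output_node_names="softmax,lstm/initial_state,lstm/state".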
Regards,
Stephane