How do I save the tflearn logs into tflearn.model? - tensorflow

So basically I forgot to save my model in each training loop. How do I save the /tmp/tflearn_logs/subdir into the model? Is there any way to collect it as a model, like:
# Save a model
model.save('my_model.tflearn')
from the event logs?
And after that I can automatically load it with:
# Load a model
model.load('my_model.tflearn')
Here are my event logs (screenshot omitted).
Thank you.

Never mind, there's no method to do that. The event logs only store training-event data for visualization purposes, while the model file stores the learned weights; they serve different purposes. Solution: add model.save('my_model.tflearn') to the training script and retrain from scratch. :)
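For reference, a minimal sketch of how saving and loading fit into a tflearn training script. The network shape, paths, and the X, Y training data below are illustrative assumptions, not from the original post:
import tflearn

# Toy network; replace with your own architecture
net = tflearn.input_data(shape=[None, 4])
net = tflearn.fully_connected(net, 8, activation='relu')
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# tensorboard_dir is where the event logs go; they are separate from the model
model = tflearn.DNN(net, tensorboard_dir='/tmp/tflearn_logs')
model.fit(X, Y, n_epoch=10)      # X, Y: your training data (assumed)
model.save('my_model.tflearn')   # writes the weight checkpoint

# Later: rebuild the same graph, then restore the weights
model.load('my_model.tflearn')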

Related

Saving Evaluation to a file while using TF OD API

The evaluation parameters look like this:
https://github.com/armaanpriyadarshan/Training-a-Custom-TensorFlow-2.X-Object-Detector/blob/master/doc/evaluation.png
But the thing is, all of these are displayed only in the terminal. Is there any way to store them in files?
My target is to get a dictionary like this {"model_name" : evaluation_parameters}
TensorBoard is one way out, but I want to automate this process so I can find the best model and display everything at once.
So, any ideas or suggestions are welcome!
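One hedged idea: the TF2 Object Detection API evaluator writes its COCO metrics to TensorBoard event files, so you could read them back programmatically instead of scraping the console. A sketch, where the eval directory and the tensor-encoded scalar format are assumptions:
import tensorflow as tf

eval_dir = 'training/eval'   # assumed: the OD API eval output directory
event_file = tf.io.gfile.glob(eval_dir + '/events.out.tfevents.*')[0]

metrics = {}
for event in tf.compat.v1.train.summary_iterator(event_file):
    for value in event.summary.value:
        # TF2 event files store scalar summaries as tensors
        metrics[value.tag] = float(tf.make_ndarray(value.tensor))

results = {'my_model': metrics}   # the desired {"model_name": evaluation_parameters}
print(results)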

How to add model Checkpoint as Callback, when running model on TPU?

I am trying to save my model using tf.keras.callbacks.ModelCheckpoint with the filepath pointing to a folder in Drive, but I am getting this error:
File system scheme '[local]' not implemented (file: './ckpt/tensorflow/training_20220111-093004_temp/part-00000-of-00001')
Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors.
Does anybody know what is the reason for this and the workaround for this?
It looks to me like you are trying to access the file system of your host VM from the TPU, which is not directly possible.
When using the TPU and you want to access local files in, e.g., Google Colab, you should place the code within:
with tf.device('/job:localhost'):
    <YOUR_CODE>
Now to your problem:
The local host acts as the parameter server when training on a TPU, so if you want to checkpoint your training, the local host must perform the save.
When you check the documentation for said callback, you can find the options parameter.
checkpoint_options = tf.train.CheckpointOptions(experimental_io_device='/job:localhost')
checkpoint = tf.keras.callbacks.ModelCheckpoint(<YOUR_PATH>, options=checkpoint_options)
Hope this solves your issue!
Best,
Sascha
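Putting the two pieces together, a sketch of what this might look like in a Keras TPU training script; the checkpoint path, model, and train_ds dataset are placeholders, and the model is assumed to have been built under a TPUStrategy scope:
import tensorflow as tf

# Route checkpoint I/O through the host VM instead of the TPU workers
checkpoint_options = tf.train.CheckpointOptions(experimental_io_device='/job:localhost')
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    './ckpt/weights.{epoch:02d}',   # assumed local path on the host VM
    save_weights_only=True,         # CheckpointOptions pairs with weight checkpoints
    options=checkpoint_options)

model.fit(train_ds, epochs=5, callbacks=[checkpoint_cb])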

How can I load a single image on Keras, using Google Colab?

I have built a deep learning model which classifies cats and dogs. I have successfully mounted Google Drive and trained on the images as needed. However, I am now trying to make a single prediction by uploading one image and having Keras make a prediction.
In a regular IDE like Spyder, it looks like this:
test_image = image.load_img('image1.jpg',target_size=(64,64))
But it throws this error:
Transport endpoint is not connected: 'image1.jpg'
I remounted the drive, and then it told me:
No such file or directory: 'image1.jpg'
After that, I experimented with how to write the directory in the image.load_img() call, but ran out of ideas at this point.
You can mount the drive, authorize access, then load the files you want and predict using your model. Please check the GitHub gist here. You can follow the steps below and let me know if you need any help. Thanks!
from google.colab import drive
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import tensorflow as tf

# This will prompt for authorization.
drive.mount('/content/drive')

# Predicting roses (loaded_model and labels_string come from earlier steps)
img = mpimg.imread('/content/drive/My Drive/5602220566_5cdde8fa6c_n.jpg')
imgplot = plt.imshow(img)
img = tf.expand_dims(img, 0)                  # make batch_shape = 1
img = img / 255                               # normalize the image
img = tf.image.resize(img, size=(224, 224))   # resize to the model's input size
Prob = loaded_model.predict(img)              # prediction
indd = tf.argmax(Prob[0], axis=-1).numpy()
print(indd)
print(labels_string[indd])
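For the asker's original load_img approach, the missing piece is most likely the full path on the mounted drive rather than the bare filename. A sketch, where the Drive path and the 64x64 input size are taken from the question as assumptions:
import numpy as np
from tensorflow.keras.preprocessing import image

# Full path on the mounted drive, not just 'image1.jpg'
test_image = image.load_img('/content/drive/My Drive/image1.jpg', target_size=(64, 64))
test_image = image.img_to_array(test_image) / 255.0
test_image = np.expand_dims(test_image, axis=0)   # batch of 1
prediction = loaded_model.predict(test_image)     # loaded_model: your trained model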

Recommended way of profiling distributed TensorFlow

Currently, I am using the TensorFlow Estimator API to train my model, with distributed training across roughly 20-50 workers and 5-30 parameter servers depending on the training data size. Since I do not have access to the session, I cannot use RunMetadata with full tracing to look at the Chrome trace. I see there are two other approaches:
1) tf.profiler.profile
2) tf.train.ProfilerHook
I am specifically using
tf.estimator.train_and_evaluate(estimator, train_spec, test_spec)
where my estimator is a prebuilt estimator.
Can someone give me some guidance (concrete code samples and pointers would be really helpful, since I am very new to TensorFlow) on the recommended way to profile an Estimator? Do the two approaches collect different information, or do they serve the same purpose? Is one recommended over the other?
There are two things you can try:
ProfilerContext
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/profiler/profile_context.py
Example usage:
with tf.contrib.tfprof.ProfileContext('/tmp/train_dir') as pctx:
    train_loop()
ProfilerService
https://www.tensorflow.org/tensorboard/r2/tensorboard_profiling_keras
You can start a ProfilerServer via tf.python.eager.profiler.start_profiler_server(port) on all workers and parameter servers, and then use TensorBoard to capture the profile.
Note that this is a very new feature, you may want to use tf-nightly.
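A minimal sketch of the server side, assuming the module path above and an arbitrary port:
from tensorflow.python.eager import profiler

# Run this on every worker and parameter server; the port just has to be
# reachable from the machine where you capture the profile in TensorBoard.
profiler.start_profiler_server(6009)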
TensorFlow has recently added a way to sample multiple workers.
Please have a look at the API:
https://www.tensorflow.org/api_docs/python/tf/profiler/experimental/client/trace?version=nightly
The parameter of the above API that is important in this context is service_addr, a comma-delimited string of gRPC addresses of the workers to profile, e.g.:
service_addr='grpc://localhost:6009'
service_addr='grpc://10.0.0.2:8466,grpc://10.0.0.3:8466'
service_addr='grpc://localhost:12345,grpc://localhost:23456'
Also, please look at this API:
https://www.tensorflow.org/api_docs/python/tf/profiler/experimental/ProfilerOptions?version=nightly
The parameter of the above API that is important in this context is delay_ms, which requests that all hosts start profiling at a timestamp delay_ms milliseconds from the current time. If zero, each host starts profiling immediately upon receiving the request. The default value is None, which lets the profiler guess the best value.
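A sketch of how these fit together on the client side; the worker addresses, logdir, and durations are assumptions:
import tensorflow as tf

options = tf.profiler.experimental.ProfilerOptions(
    delay_ms=1000)   # ask all hosts to start profiling at a common time 1 s out

tf.profiler.experimental.client.trace(
    service_addr='grpc://10.0.0.2:8466,grpc://10.0.0.3:8466',
    logdir='gs://my-bucket/profile-logs',   # assumed: the dir TensorBoard reads
    duration_ms=2000,
    options=options)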

How to tell if there was a save or validation attempt in a Rails model?

I need to check if a model has had any attempt to be saved or validated without actually saving/validating it. The method #valid? does run the validations, so it doesn't fit here.
You can add a custom validation method. It will be executed on any validation attempt, and you can set a flag inside it to know whether it has run. But I see no point in this task, and I doubt you really need to do this.