GANs with TensorFlow - tensorflow

Recently, I started one of the tutorials for GANs created by TensorFlow.
See link: https://github.com/PacktPublishing/Advanced-Deep-Learning-with-Keras/blob/master/chapter4-gan/dcgan-mnist-4.2.1.py
But I am having issues running the entry point (i.e. if __name__ == '__main__'), since I run it
from Colab and it is a .py program. It fails with this error:
usage: ipykernel_launcher.py [-h] [-g GENERATOR]
ipykernel_launcher.py: error: unrecognized arguments: -f
/root/.local/share/jupyter/runtime/kernel-bcd12960-b051-4b8d-b6b0-3ff02367dbcc.json
An exception has occurred, use %tb to see the full traceback.
Would anyone know how to debug it for .ipynb files?
Thank you in advance
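
The error comes from argparse reading the notebook kernel's own command line (the -f .../kernel-....json argument) instead of arguments meant for the script. A minimal sketch of two common workarounds, assuming the script defines only the -g option shown in the usage message (the long option name --generator and the shortened kernel path are assumptions for illustration):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Mirrors the "[-g GENERATOR]" option from the usage message;
    # the long name "--generator" is an assumption.
    parser.add_argument("-g", "--generator", help="generator model to load")
    return parser


# Stand-in for the extra arguments that the notebook kernel passes in.
notebook_argv = ["-f", "/root/.local/share/jupyter/runtime/kernel-example.json"]

parser = build_parser()

# Option 1: parse only what the parser knows and collect the rest.
args, unknown = parser.parse_known_args(notebook_argv)

# Option 2: ignore the command line entirely and use the defaults.
args_default = parser.parse_args(args=[])

print(args.generator, unknown, args_default.generator)
```

In a notebook cell, replacing parser.parse_args() with either parser.parse_known_args()[0] or parser.parse_args(args=[]) should let the rest of the script run unchanged.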

Related

Google Colab Crash When Loading Pre-Trained Transformer

I am new to transformers. I have fine-tuned my MT5 model in Google Colab and saved it with save_pretrained(PATH). However, when I try to load the model with MT5ForConditionalGeneration.from_pretrained(PATH), Google Colab keeps crashing with this error in the log:
terminate called after throwing an instance of 'std::__ios_failure'
what(): basic_filebuf::underflow error reading the file: iostream error
I suspect that it is an out-of-memory problem, but I have no clue how to fix it.
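
One rough way to test the out-of-memory suspicion is to compare the size of the saved checkpoint on disk with the physical RAM of the runtime. A minimal sketch using only the standard library; both helper names are hypothetical, and the RAM query relies on os.sysconf, which is only available on Linux-like systems. Loading a model can need more memory than the files occupy, so this is only a lower bound.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files below path (hypothetical helper)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def total_ram_bytes():
    """Return physical RAM in bytes, or None if it cannot be queried."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


# Usage, where PATH is the directory passed to save_pretrained:
#   print(dir_size_bytes(PATH) / 1e9, "GB on disk")
#   print(total_ram_bytes())
```

If the checkpoint is close to or larger than the reported RAM, memory is a plausible cause; if it is much smaller, or a file is unexpectedly tiny, the saved files themselves are worth checking.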

Create Version Failed. Bad model detected with error: "Error loading the model" - AI Platform Prediction

I created a model through the AI Platform UI that uses a global endpoint. I am trying to deploy a basic TensorFlow 1.15.0 model that I exported using the Saved Model builder. When I try to deploy this model, I get a Create Version Failed. Bad model detected with error: "Error loading the model" error in the UI, and I see the following in the logs:
ERROR:root:Failed to import GA GRPC module. This is OK if the runtime version is 1.x
Failure: Could not reach metadata service: Internal Server Error.
ERROR:root:Command '['/tools/google-cloud-sdk/bin/gsutil', '-o', 'GoogleCompute:service_account=default', 'cp', '-R', 'gs://cml-365057443918-1608667078774578/models/xsqr_global/v6/7349456410861999293/model/*', '/tmp/model/0001']' returned non-zero exit status 1.
ERROR:root:Error loading model: 'generator' object has no attribute 'next'
ERROR:root:Error loading the model
Framework/ML runtime version: Tensorflow 1.15.0
Python: 3.7.3
What is strange is that the gcloud ai-platform local predict works correctly with this exported model, and I can deploy this exact same model on a regional endpoint with no issues. It only gives this error if I try to use a global endpoint model. But I need the global endpoint because I plan on using a custom prediction routine (if I can get this basic model working first).
The logs seem to suggest an issue with copying the model from storage? I've tried giving various IAM roles additional viewer permissions, but I still get the same errors.
Thanks for the help.
I think it's the same issue as https://issuetracker.google.com/issues/175316320
The comment in the issue says the fix is now rolling out.
Today I faced the same error (ERROR: (gcloud.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Error loading the model"). For those who want a summary:
The recommendation is to use n1* machine types (for example: n1-standard-4) via regional endpoints (for example: us-central1) instead of mls1* machines when deploying a version. I also made sure to specify the same region (us-central1) when creating the model itself, using the command below, which resolved the error.
!gcloud ai-platform models create $model_name --region=$REGION

Deeplab: "Failed to find all Cityscapes modules"

I am trying to run the tensorflow DeepLab tutorial on the cityscapes dataset. I downloaded the gtFine dataset and cloned into the cityscapesScripts folder, and set up the directories as recommended in the tutorial. When I ran the following line from the tutorial,
sh convert_cityscapes.sh,
I received an error message stating "Failed to find all Cityscapes modules".
I checked the cityscapesScripts documentation and I think I am missing the labels module, which is likely causing the error. Where can I clone or download the missing module(s)?
One of the dependencies of sh convert_cityscapes.sh is a file that contains syntax that is invalid in Python 3.
You can get it to work on Python 3 by commenting out the line
print type(obj).__name__
at line 238 of cityscapesScripts/helpers/annotation.py
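
Instead of commenting the line out, the Python 2 print statement could also be converted to the Python 3 print function. A minimal sketch; the Point class is a hypothetical stand-in for whatever object the script inspects:

```python
class Point:
    """Hypothetical placeholder for the annotation object."""


obj = Point()

# Python 2 statement form:  print type(obj).__name__
# Python 3 function form:
type_name = type(obj).__name__
print(type_name)
```

The function form also runs under Python 2.7, so the change should not break the script for either interpreter.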

ImportError: cannot import name 'preprocessor_pb2' in the training part after installation was successful

I installed the object detection API by following https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md and verified the installation by running model_builder_test.py.
This gave me an OK result.
Then I moved on to running train.py on my dataset using the following command:
python train.py --logtostderr --pipeline_config_path=pipeline.config --train_dir=train_file
And I am getting the error ImportError: cannot import name 'preprocessor_pb2'
However, preprocessor_pb2.py does exist in the path it is looking in, i.e.
C:\Users\SP-TestMc\Anaconda3\envs\tensorflow\models-master\models-master\research\object_detection\protos
What could be the reason for this error then?
See https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md#protobuf-compilation. Sounds like you need to compile the protos before using the Python scripts.
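
The installation guide compiles the protos with protoc object_detection/protos/*.proto --python_out=. run from the research directory. On Windows the shell may not expand the *.proto wildcard, so here is a minimal sketch that expands it in Python instead. The helper names are hypothetical, and it assumes protoc is on the PATH and that research_dir points at models/research:

```python
import glob
import os
import subprocess


def build_protoc_command(research_dir):
    """Build the protoc call with the *.proto wildcard expanded by Python."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    # protoc is run from research_dir, so pass paths relative to it.
    proto_files = sorted(
        os.path.relpath(path, research_dir) for path in glob.glob(pattern)
    )
    return ["protoc", *proto_files, "--python_out=."]


def compile_protos(research_dir):
    """Run protoc from research_dir; raises if protoc fails."""
    command = build_protoc_command(research_dir)
    if len(command) == 2:
        raise FileNotFoundError("no .proto files found under " + research_dir)
    subprocess.run(command, cwd=research_dir, check=True)


# Usage (the path is an example, adjust it to your checkout):
#   compile_protos(r"C:\path\to\models\research")
```

After a successful run, every .proto file in object_detection/protos should have a matching _pb2.py next to it. If preprocessor_pb2.py is already there, it may be stale or generated by a different protoc version, so recompiling is still worth trying.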

Python Configuration Error when build retrain.py by bazel, following google doc

I am learning transfer learning by following How to Retrain Inception's Final Layer for New Categories. However, when I build 'retrain.py' using bazel, the following error occurs:
python configuration error:'PYTHON_BIN_PATH' environment variable is not set and referenced by '//third_party/py/numpy:headers'
I am sorry, I tried my best to attach a screenshot of the error, but I failed.
I use Python 2.7, Anaconda2, bazel 0.6.1, and TensorFlow 1.3.
I would appreciate any reply!