TensorBoard failed to create session - tensorflow

Hello
I'm using TensorFlow v1.4.0, and when I try to launch TensorBoard with the following command:
tensorboard --logdir="folder_path"
I get an error:
2018-04-11 17:18:44.422839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
totalMemory: 11,91GiB freeMemory: 11,74GiB
2018-04-11 17:18:44.467559: E tensorflow/core/common_runtime/direct_session.cc:167] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE
Traceback (most recent call last):
File "/usr/local/bin/tensorboard", line 11, in <module>
sys.exit(run_main())
File "/usr/local/lib/python3.5/dist-packages/tensorboard/main.py", line 36, in run_main
tf.app.run(main)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "/usr/local/lib/python3.5/dist-packages/tensorboard/main.py", line 45, in main
default.get_assets_zip_provider())
File "/usr/local/lib/python3.5/dist-packages/tensorboard/program.py", line 166, in main
tb = create_tb_app(plugins, assets_zip_provider)
File "/usr/local/lib/python3.5/dist-packages/tensorboard/program.py", line 200, in create_tb_app
window_title=FLAGS.window_title)
File "/usr/local/lib/python3.5/dist-packages/tensorboard/backend/application.py", line 124, in standard_tensorboard_wsgi
plugin_instances = [constructor(context) for constructor in plugins]
File "/usr/local/lib/python3.5/dist-packages/tensorboard/backend/application.py", line 124, in <listcomp>
plugin_instances = [constructor(context) for constructor in plugins]
File "/usr/local/lib/python3.5/dist-packages/tensorboard/plugins/beholder/beholder_plugin.py", line 47, in __init__
self.most_recent_frame = im_util.get_image_relative_to_script('no-data.png')
File "/usr/local/lib/python3.5/dist-packages/tensorboard/plugins/beholder/im_util.py", line 277, in get_image_relative_to_script
return read_image(filename)
File "/usr/local/lib/python3.5/dist-packages/tensorboard/plugins/beholder/im_util.py", line 265, in read_image
return np.array(decode_png(image_file.read()))
File "/usr/local/lib/python3.5/dist-packages/tensorboard/plugins/beholder/im_util.py", line 182, in __call__
self._lazily_initialize()
File "/usr/local/lib/python3.5/dist-packages/tensorboard/plugins/beholder/im_util.py", line 160, in _lazily_initialize
self._session = tf.Session(graph=graph, config=config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1509, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 638, in __init__
self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
TensorBoard worked when I was using TensorFlow 1.6, but I don't think that is the problem, because I went back to version 1.6 today and it is still not working.
I checked my log folder; it does contain an event file ("event.out.po").
Do you know where the problem is?
Thank you

I found the problem. In the shell session, before launching TensorBoard, this command must be run so that the GPU is used:
export CUDA_VISIBLE_DEVICES=0
If the previous command does not work, you can instead hide the GPUs entirely and fall back to the CPU:
export CUDA_VISIBLE_DEVICES=''
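Equivalently, the same variable can be set from Python before TensorFlow is imported; a small sketch of the same mechanism (not part of the original answer):
import os
# Restrict TensorFlow to GPU 0; use '' instead to hide all GPUs and fall back to the CPU.
# This must run before TensorFlow is imported for the first time.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf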

Related

Error in CelebA dataset download by tensorflow_datasets

I am trying to download CelebA with tensorflow_datasets (version 4.5.2) and getting an API error. How can I fix it?
I have updated tensorflow_datasets, but that did not fix the issue.
My code is:
import tensorflow_datasets as tfds
dataset_builder = tfds.builder('celeb_a')
dataset_builder.download_and_prepare()
I am getting the following error:
Downloading and preparing dataset 1.38 GiB (download: 1.38 GiB, generated: 1.62 GiB, total: 3.00 GiB) to /root/tensorflow_datasets/celeb_a/2.0.1...
Dl Size...: 0 MiB [00:00, ? MiB/s] | 0/4 [00:00<?, ? url/s]
Dl Completed...: 0%| | 0/4 [00:00<?, ? url/s]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 464, in download_and_prepare
download_config=download_config,
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1158, in _download_and_prepare
dl_manager, **optional_pipeline_kwargs)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/image/celeba.py", line 129, in _split_generators
"landmarks_celeba": LANDMARKS_DATA,
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/download_manager.py", line 549, in download
return _map_promise(self._download, url_or_urls)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/download_manager.py", line 767, in _map_promise
res = tf.nest.map_structure(lambda p: p.get(), all_promises) # Wait promises
File "/miniconda/lib/python3.7/site-packages/tensorflow/python/util/nest.py", line 867, in map_structure
structure[0], [func(*x) for x in entries],
File "/miniconda/lib/python3.7/site-packages/tensorflow/python/util/nest.py", line 867, in <listcomp>
structure[0], [func(*x) for x in entries],
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/download_manager.py", line 767, in <lambda>
res = tf.nest.map_structure(lambda p: p.get(), all_promises) # Wait promises
File "/miniconda/lib/python3.7/site-packages/promise/promise.py", line 512, in get
return self._target_settled_value(_raise=True)
File "/miniconda/lib/python3.7/site-packages/promise/promise.py", line 516, in _target_settled_value
return self._target()._settled_value(_raise)
File "/miniconda/lib/python3.7/site-packages/promise/promise.py", line 226, in _settled_value
reraise(type(raise_val), raise_val, self._traceback)
File "/miniconda/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/miniconda/lib/python3.7/site-packages/promise/promise.py", line 844, in handle_future_result
resolve(future.result())
File "/miniconda/lib/python3.7/concurrent/futures/_base.py", line 428, in result
return self.__get_result()
File "/miniconda/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/miniconda/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/downloader.py", line 216, in _sync_download
with _open_url(url, verify=verify) as (response, iter_content):
File "/miniconda/lib/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/downloader.py", line 276, in _open_with_requests
url = _get_drive_url(url, session)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/downloader.py", line 298, in _get_drive_url
_assert_status(response)
File "/miniconda/lib/python3.7/site-packages/tensorflow_datasets/core/download/downloader.py", line 310, in _assert_status
response.url, response.status_code))
tensorflow_datasets.core.download.downloader.DownloadError: Failed to get url https://drive.google.com/uc?export=download&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM. HTTP code: 404.
It seems the Google Drive link is broken, which is why this error is shown while fetching the celeb_a dataset. However, you can download the dataset manually until that error is fixed.
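If you go the manual route, tfds can pick the archives up from a local directory instead of the broken Drive URLs; a minimal sketch, assuming the downloaded files were placed under /root/tensorflow_datasets/downloads/manual/ (the exact set of files required depends on the celeb_a version):
import tensorflow_datasets as tfds
# Point the download manager at the manually downloaded CelebA archives.
config = tfds.download.DownloadConfig(manual_dir="/root/tensorflow_datasets/downloads/manual/")
builder = tfds.builder("celeb_a")
builder.download_and_prepare(download_config=config)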

tensorflow.python.framework.errors_impl.NotFoundError while running deep learning model on Google Colaboratory

I'm trying to run this deep learning model in the cloud:
https://github.com/razvanmarinescu/brgm#image-reconstruction-with-pre-trained-stylegan2-generators
What I do is simply use their Colab notebook: https://colab.research.google.com/drive/1G7_CGPHZVGFWIkHOAke4HFg06-tNHIZ4?usp=sharing#scrollTo=qMgE6QFiHuSL
When I try to execute:
!python recon.py recon-real-images --input=/content/drive/MyDrive/boeing/EDGEconnect/val_imgs --masks=/content/drive/MyDrive/boeing/EDGEconnect/val_masks --tag=brains --network=dropbox:brains.pkl --recontype=inpaint --num-steps=1000 --num-snapshots=1
I receive this error:
args: Namespace(command='recon-real-images', input='/content/drive/MyDrive/boeing/EDGEconnect/val_imgs', masks='/content/drive/MyDrive/boeing/EDGEconnect/val_masks', network_pkl='dropbox:brains.pkl', num_snapshots=1, num_steps=1000, recontype='inpaint', superres_factor=4, tag='brains')
Local submit - run_dir: results/00004-brains-inpaint
dnnlib: Running recon.recon_real_images() on localhost...
Processing image 1/4
Loading networks from "dropbox:brains.pkl"...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Failed!
Traceback (most recent call last):
File "recon.py", line 270, in <module>
main()
File "recon.py", line 263, in main
dnnlib.submit_run(sc, func_name_map[subcmd], **kwargs)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/submit.py", line 343, in submit_run
return farm.submit(submit_config, host_run_dir)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/internal/local.py", line 22, in submit
return run_wrapper(submit_config)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/submit.py", line 280, in run_wrapper
run_func_obj(**submit_config.run_func_kwargs)
File "/content/drive/MyDrive/boeing/brgm/brgm/recon.py", line 189, in recon_real_images
recon_real_one_img(network_pkl, img_list[image_idx], masks, num_snapshots, recontype, superres_factor, num_steps)
File "/content/drive/MyDrive/boeing/brgm/brgm/recon.py", line 132, in recon_real_one_img
_G, _D, Gs = pretrained_networks.load_networks(network_pkl)
File "/content/drive/MyDrive/boeing/brgm/brgm/pretrained_networks.py", line 83, in load_networks
G, D, Gs = pickle.load(stream, encoding='latin1')
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/network.py", line 297, in __setstate__
self._init_graph()
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/network.py", line 154, in _init_graph
out_expr = self._build_func(*self.input_templates, **build_kwargs)
File "<string>", line 395, in G_synthesis_stylegan2
File "<string>", line 359, in layer
File "<string>", line 106, in modulated_conv2d_layer
File "<string>", line 75, in apply_bias_act
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 68, in fused_bias_act
return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 122, in _fused_bias_act_cuda
cuda_kernel = _get_plugin().fused_bias_act
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 16, in _get_plugin
return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/custom_ops.py", line 156, in get_plugin
plugin = tf.load_op_library(bin_file)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/_cudacache/fused_bias_act_237d55aca3e3c3ec0547da06888d8e66.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
I found that the very last part of the error:
tensorflow.python.framework.errors_impl.NotFoundError: /content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/_cudacache/fused_bias_act_237d55aca3e3c3ec0547da06888d8e66.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
can reportedly be solved by changing a flag in the CUDA Makefile (https://github.com/mgharbi/hdrnet_legacy/issues/2) or by installing TF 1.14 (Colab runs on 1.15.2, and this change made no positive difference for me).
My question is: how can I get rid of this error? Is there a way to change something inside Google Colab's CUDA Makefile?
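One thing that may be worth trying before touching any Makefile: the failing .so lives in a _cudacache directory on Drive, so it may have been compiled against a different TensorFlow than the 1.15.2 build Colab is running. Deleting the cache forces dnnlib's custom_ops to recompile fused_bias_act against the current runtime; a sketch (the path is taken from the traceback above, and this is a guess rather than a verified fix):
import shutil
# Remove the stale pre-compiled custom op; it is rebuilt the next time recon.py runs.
shutil.rmtree("/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/_cudacache", ignore_errors=True)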

InvalidArgumentError: Cannot assign a device for operation replica_0/lambda_1/Shape

I am testing YOLOv3 (https://github.com/experiencor/keras-yolo3) with tensorflow-gpu 1.15 and keras 2.3.1. The training process is started with:
runfile("train.py",'-c config.json')
Here are the printed out messages:
Using TensorFlow backend.
WARNING:tensorflow:From train.py:40: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.
valid_annot_folder not exists. Spliting the trainining set.
Seen labels: {'kangaroo': 266}
Given labels: ['kangaroo']
Training on: ['kangaroo']
WARNING:tensorflow:From C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
.....
Loading pretrained weights.
C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\callbacks\callbacks.py:998: UserWarning: `epsilon` argument is deprecated and will be removed, use `min_delta` instead.
warnings.warn('`epsilon` argument is deprecated and '
Traceback (most recent call last):
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1348, in _run_fn
self._extend_graph()
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1388, in _extend_graph
tf_session.ExtendSession(self._session)
InvalidArgumentError: Cannot assign a device for operation replica_0/lambda_1/Shape: {{node replica_0/lambda_1/Shape}} was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[replica_0/lambda_1/Shape]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 305, in <module>
_main_(args)
File "train.py", line 282, in _main_
max_queue_size = 8
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\engine\training.py", line 1732, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\engine\training_generator.py", line 42, in fit_generator
model._make_train_function()
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\engine\training.py", line 333, in _make_train_function
**self._function_kwargs)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\backend\tensorflow_backend.py", line 3006, in function
v1_variable_initialization()
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\backend\tensorflow_backend.py", line 420, in v1_variable_initialization
session = get_session()
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\keras\backend\tensorflow_backend.py", line 385, in get_session
return tf_keras_backend.get_session()
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\keras\backend.py", line 486, in get_session
_initialize_variables(session)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\keras\backend.py", line 903, in _initialize_variables
[variables_module.is_variable_initialized(v) for v in candidate_vars])
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
run_metadata_ptr)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
run_metadata)
File "C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
InvalidArgumentError: Cannot assign a device for operation replica_0/lambda_1/Shape: node replica_0/lambda_1/Shape (defined at C:\Users\Dy\Anaconda3\envs\tf1x\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[replica_0/lambda_1/Shape]]
I don't understand what caused the InvalidArgumentError. Is my tensorflow-gpu not installed correctly? Or is there some conflict in how the GPU is being used?
Try changing the "gpus" value in config.json to "0" if it is set to anything else. It should work if you are executing on a GPU.
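To rule out a broken tensorflow-gpu install, it can also help to check which devices the TF 1.x runtime actually sees (a small check, not from the original answer); if only the CPU shows up, the device-assignment error above is expected:
import tensorflow as tf
from tensorflow.python.client import device_lib
# A working GPU setup lists a /device:GPU:0 entry here.
print(device_lib.list_local_devices())
print("GPU available:", tf.test.is_gpu_available())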

Protobuf errors while using Tensorflow Object Detection API locally

I have TensorFlow and the Object Detection API installed on my machine.
A test run shows that everything works.
~ $ cd models/research
research $ protoc object_detection/protos/*.proto --python_out=.
research $ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
research $ python3 object_detection/builders/model_builder_test.py
...............
----------------------------------------------------------------------
Ran 15 tests in 0.144s
OK
Then I tried to retrain a model and got a protobuf error:
research $ cd object_detection
object_detection $ python3 train.py --logtostderr --train_dir=training/ --pipeline_config_path=ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config
WARNING:tensorflow:From /Users/me/models/research/object_detection/trainer.py:257: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
Traceback (most recent call last):
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 135, in load_labelmap
text_format.Merge(label_map_string, label_map)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 533, in Merge
descriptor_pool=descriptor_pool)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 587, in MergeLines
return parser.MergeLines(lines, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 620, in MergeLines
self._ParseOrMerge(lines, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 635, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 735, in _MergeField
merger(tokenizer, message, field)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 823, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 722, in _MergeField
tokenizer.Consume(':')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1087, in Consume
raise self.ParseError('Expected "%s".' % token)
google.protobuf.text_format.ParseError: 3:10 : Expected ":".
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1083, in MergeFromString
if self._InternalParse(serialized, 0, length) != length:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1105, in InternalParse
(tag_bytes, new_pos) = local_ReadTag(buffer, pos)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 181, in ReadTag
while six.indexbytes(buffer, pos) & 0x80:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 184, in <module>
tf.app.run()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "train.py", line 180, in main
graph_hook_fn=graph_rewriter_fn)
File "/Users/me/models/research/object_detection/trainer.py", line 264, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/Users/me/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "train.py", line 121, in get_next
dataset_builder.build(config)).get_next()
File "/Users/me/models/research/object_detection/builders/dataset_builder.py", line 155, in build
label_map_proto_file=label_map_proto_file)
File "/Users/me/models/research/object_detection/data_decoders/tf_example_decoder.py", line 245, in __init__
use_display_name)
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 152, in get_label_map_dict
label_map = load_labelmap(label_map_path)
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 137, in load_labelmap
label_map.ParseFromString(label_map_string)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/message.py", line 185, in ParseFromString
self.MergeFromString(serialized)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1089, in MergeFromString
raise message_mod.DecodeError('Truncated message.')
google.protobuf.message.DecodeError: Truncated message.
object_detection $
I tried the solutions from a bunch of similar problems, but they didn't work for my case. For example, this one suggests encoding the pbtxt file as ASCII.
Python 2 gives an error too. Here is its last line:
google.protobuf.message.DecodeError: Unexpected end-group tag.
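Since the traceback fails inside text_format.Merge on the label map, one way to narrow the problem down is to parse the label map directly with the same parser the API uses; a minimal sketch (the path is a placeholder, use the label_map_path from your pipeline.config):
from google.protobuf import text_format
from object_detection.protos import string_int_label_map_pb2
with open("training/label_map.pbtxt", "r") as f:
    label_map = string_int_label_map_pb2.StringIntLabelMap()
    # Raises the same ParseError seen above if the file is not valid protobuf text format,
    # e.g. if it is actually a binary .pb file or was saved with an unexpected encoding.
    text_format.Merge(f.read(), label_map)
print(label_map)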
Context:
macOS 10.13.4
Local run on CPU
Python 3.6.4
protobuf 3.5.1
libprotoc 3.4.0
tensorflow 1.8.0
Google Cloud SDK 200.0.0
bq 2.0.33
core 2018.04.30
gsutil 4.31

CudnnLSTM runs out of space with Eager Execution

I'm using 3 tf.contrib.cudnn_rnn.CudnnLSTM(1, 128, direction='bidirectional') layers with a batch size of 32 on an AWS p2.xlarge instance. The exact same configuration works correctly with non-eager (standard) TensorFlow. The error log follows:
2018-04-27 18:15:59.139739: E tensorflow/stream_executor/cuda/cuda_dnn.cc:1520] Failed to allocate RNN workspace of 74252288 bytes.
2018-04-27 18:15:59.139758: E tensorflow/stream_executor/cuda/cuda_dnn.cc:1697] Unable to create rnn workspace
Traceback (most recent call last):
File "tf_run_eager.py", line 424, in <module>
run_experiments()
File "tf_run_eager.py", line 417, in run_experiments
train_losses.append(model.optimize(bX, bY).numpy())
File "tf_run_eager.py", line 397, in optimize
loss, grads_and_vars = self.loss(phoneme_features, utterances)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 233, in grad_fn
sources)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/eager/imperative_grad.py", line 65, in imperative_grad
tape._tape, vspace, target, sources, output_gradients, status) # pylint: disable=protected-access
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 141, in grad_fn
op_inputs, op_outputs, orig_outputs)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 109, in _magic_gradient_function
return grad_fn(mock_op, *out_grads)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/contrib/cudnn_rnn/python/ops/cudnn_rnn_ops.py", line 1609, in _cudnn_rnn_backward
direction=op.get_attr("direction"))
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/contrib/cudnn_rnn/ops/gen_cudnn_rnn_ops.py", line 320, in cudnn_rnn_backprop
_six.raise_from(_core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Failed to call ThenRnnBackward [Op:CudnnRNNBackprop]
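The failure is a GPU workspace allocation inside cuDNN's RNN backward pass, so it is essentially an out-of-memory condition. It is not confirmed that this resolves the eager-specific behaviour, but the usual first steps are lowering the batch size or letting the eager runtime grow GPU memory on demand; a minimal TF 1.x sketch:
import tensorflow as tf
# Give the eager runtime the same GPU options a graph-mode Session would get.
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
tf.enable_eager_execution(config=config)
lstm = tf.contrib.cudnn_rnn.CudnnLSTM(1, 128, direction='bidirectional')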