Intermittent OutOfMemoryError in CuPy - cupy

I'm making progress with CuPy and have been able to accelerate my iterative image reconstruction, which previously used NumPy and C++, by about a factor of 3. Now I have an intermittent issue that suggests I'm missing something important.
I'm running this on a Mac Pro (Late 2013), macOS 10.13.6, with an NVIDIA GeForce GTX 1080 Ti.
I'm limited by memory, so I keep loading data to the GPU using cupy.asarray() and then free it by setting the variables to None. This works, but intermittently I get:
cupy.cuda.memory.OutOfMemoryError
I run the following loop within each iteration:
# iterations
for i in range(nr_iterations):
    [...]
    # loop within iterations
    for idx in range(dim2):
        print("GPU memory info - iteration: " + str(i) + " - used: " + str(mempool.used_bytes()) + ", total: " + str(mempool.total_bytes()) + " pinned :" + str(pinned_mempool.n_free_blocks()) + " in loop with idx = " + str(idx))
        cupy_array = cp.asarray(cpp_function(numpy_array[:, flow_idx:flow_idx+1, ...]))
        # do all the work
        [...]
        # last line in for loop
        cupy_array = None
The printout shows stable memory usage until suddenly an error is thrown at the line with cupy_array = cp.asarray(). Here is an excerpt of the logging statements:
GPU memory info - iteration: 0 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 0
GPU memory info - iteration: 0 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 1
GPU memory info - iteration: 1 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 0
GPU memory info - iteration: 1 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 1
GPU memory info - iteration: 2 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 0
GPU memory info - iteration: 2 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 1
GPU memory info - iteration: 3 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 0
GPU memory info - iteration: 3 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 1
GPU memory info - iteration: 4 - used: 3477175296, total: 8529552896 pinned :2 in loop with idx = 0
File "cupy/core/core.pyx", line 1712, in cupy.core.core.array
File "cupy/core/core.pyx", line 1751, in cupy.core.core.array
File "cupy/core/core.pyx", line 134, in cupy.core.core.ndarray.__init__
File "cupy/cuda/memory.pyx", line 518, in cupy.cuda.memory.alloc
File "cupy/cuda/memory.pyx", line 1085, in cupy.cuda.memory.MemoryPool.malloc
File "cupy/cuda/memory.pyx", line 1106, in cupy.cuda.memory.MemoryPool.malloc
File "cupy/cuda/memory.pyx", line 934, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
File "cupy/cuda/memory.pyx", line 949, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
File "cupy/cuda/memory.pyx", line 697, in cupy.cuda.memory._try_malloc
cupy.cuda.memory.OutOfMemoryError: out of memory to allocate 2158119936 bytes (total 10687672832 bytes)
I am specifically surprised that, at the moment of the error, the system seems to require 10687672832 bytes in total while allocating 2158119936 bytes, even though the logging suggests that only 3477175296 bytes are in use.
I realize that I need roughly twice the array's size in free memory to load it, but in this instance it apparently requires more than three times the size of the array being loaded.
Anything obvious that I'm missing? Thank you for looking at this.
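For illustration, here is a self-contained sketch of the same load/compute/free pattern that releases CuPy's memory pools explicitly instead of only rebinding the variable. The array shape and the per-slice "work" are stand-ins, not taken from the original code:
import numpy as np
import cupy as cp

mempool = cp.get_default_memory_pool()
pinned_mempool = cp.get_default_pinned_memory_pool()

numpy_array = np.random.rand(64, 4, 128, 128).astype(np.float32)  # stand-in data

for idx in range(numpy_array.shape[1]):
    cupy_array = cp.asarray(numpy_array[:, idx:idx + 1, ...])  # host -> device copy
    result = float(cupy_array.sum())                           # stand-in for the real work
    del cupy_array                        # drop the reference instead of assigning None
    mempool.free_all_blocks()             # return cached device blocks to the driver
    pinned_mempool.free_all_blocks()      # release cached pinned host memory
Freeing the pools every pass costs some allocation speed but keeps the pool from holding fragmented blocks between slices.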

Related

Using MPI in TensorFlow's distributed deep learning

I am using a cluster with 8 × 2080 Ti GPUs (11 GB each) for distributed deep learning. My goal is to use all the GPUs for training the model. My code uses MPI to gather all the processes across the cluster and tries to distribute the work across all the workers, but this gives me an error. I am using Python 3.9 and TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1.
I am currently lost and don't know what I need to do in this case. I tried installing other TensorFlow versions using conda, but it ends up the same way.
Slurm file:
#!/bin/bash
#SBATCH --job-name=job1 # Job name
#SBATCH --mem=30000 # Job memory request
#SBATCH --gres=gpu:4 # Number of requested GPU(s)
#SBATCH --time=3-23:00:00 # Time limit days-hrs:min:sec
#SBATCH --constraint=rtx_2080 # Specific hardware constraint
#SBATCH --error=slurm.err # Error file name
#SBATCH --output=slurm.out # Output file name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --array=1-2%1
if [ -d "model-final" ]
then
    scancel $SLURM_ARRAY_JOB_ID
else
    module load Anaconda3/2020.07
    module load TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1
    mpirun python -u main.py resume_latest
fi
My error:
Instructions for updating:
use distribute.MultiWorkerMirroredStrategy instead
2023-01-18 13:18:35.789808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9687 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3d:00.0, compute capability: 7.5
2023-01-18 13:18:35.790848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9687 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3e:00.0, compute capability: 7.5
2023-01-18 13:18:35.791743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 9687 MB memory: -> device: 2, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:88:00.0, compute capability: 7.5
2023-01-18 13:18:35.792678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 9687 MB memory: -> device: 3, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5
2023-01-18 13:18:35.804893: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:worker/replica:0/task:0/device:GPU:0 with 9687 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3d:00.0, compute capability: 7.5
2023-01-18 13:18:35.805620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:worker/replica:0/task:0/device:GPU:1 with 9687 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3e:00.0, compute capability: 7.5
2023-01-18 13:18:35.806333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:worker/replica:0/task:0/device:GPU:2 with 9687 MB memory: -> device: 2, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:88:00.0, compute capability: 7.5
2023-01-18 13:18:35.807029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:worker/replica:0/task:0/device:GPU:3 with 9687 MB memory: -> device: 3, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5
2023-01-18 13:18:35.810512: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> g01:37672}
2023-01-18 13:18:35.810736: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:427] Started server with target: grpc://g01:37672
/usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/keras/optimizer_v2/optimizer_v2.py:355: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
warnings.warn(
WARNING:tensorflow:`period` argument is deprecated. Please use `save_freq` to specify the frequency in number of batches seen.
2023-01-18 13:18:42.547198: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:695] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: "TensorSliceDataset/_2"
op: "TensorSliceDataset"
input: "Placeholder/_0"
input: "Placeholder/_1"
attr {
key: "Toutput_types"
value {
list {
type: DT_DOUBLE
type: DT_DOUBLE
}
}
}
attr {
key: "output_shapes"
value {
list {
shape {
dim {
size: 15
}
}
shape {
dim {
size: 13
}
}
}
}
}
2023-01-18 13:18:42.740015: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
[g01:44037:0:44313] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid: 44313) ====
0 0x000000000002137e ucs_debug_print_backtrace() /umbc/ebuild-soft/cascade-lake/build/UCX/1.10.0/GCCcore-10.3.0/ucx-1.10.0/src/ucs/debug/debug.c:656
1 0x000000000382045b tensorflow::NcclCommunicator::Enqueue() collective_communicator.cc:0
2 0x0000000005c9f88a tensorflow::NcclReducer::Run() ???:0
3 0x00000000009086dc tensorflow::BaseCollectiveExecutor::ExecuteAsync(tensorflow::OpKernelContext*, tensorflow::CollectiveParams const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void (tensorflow::Status const&)>)::{lambda()#3}::operator()() base_collective_executor.cc:0
4 0x0000000000b99403 tensorflow::UnboundedWorkQueue::PooledThreadFunc() ???:0
5 0x0000000000b9f6b1 tensorflow::(anonymous namespace)::PThread::ThreadFn() env.cc:0
6 0x0000000000007ea5 start_thread() pthread_create.c:0
7 0x00000000000feb0d __clone() ???:0
=================================
[g01:44037] *** Process received signal ***
[g01:44037] Signal: Segmentation fault (11)
[g01:44037] Signal code: (-6)
[g01:44037] Failing at address: 0x2ecf70000ac05
[g01:44037] [ 0] /lib64/libpthread.so.0(+0xf630)[0x2aaaab7e6630]
[g01:44037] [ 1] /usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so(+0x382045b)[0x2aaab68fc45b]
[g01:44037] [ 2] /usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so(_ZN10tensorflow11NcclReducer3RunESt8functionIFvRKNS_6StatusEEE+0x1ca)[0x2aaab8d7b88a]
[g01:44037] [ 3] /usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/tensorflow/python/../libtensorflow_framework.so.2(+0x9086dc)[0x2aaadc7556dc]
[g01:44037] [ 4] /usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/tensorflow/python/../libtensorflow_framework.so.2(_ZN10tensorflow18UnboundedWorkQueue16PooledThreadFuncEv+0x1b3)[0x2aaadc9e6403]
[g01:44037] [ 5] /usr/ebuild/software/TensorFlow/2.6.0-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/tensorflow/python/../libtensorflow_framework.so.2(+0xb9f6b1)[0x2aaadc9ec6b1]
[g01:44037] [ 6] /lib64/libpthread.so.0(+0x7ea5)[0x2aaaab7deea5]
[g01:44037] [ 7] /lib64/libc.so.6(clone+0x6d)[0x2aaaac468b0d]
[g01:44037] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 44037 on node g01 exited on signal 11 (Segmentation fault).
My code:
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
# Load in the parameter files
from json import load as loadf
with open("params.json", 'r') as inFile:
    params = loadf(inFile)
# Get data files and prep them for the generator
from tensorflow import distribute as D
callbacks = []
devices = getDevices()
print(devices)
set_tf_config_mpi()
strat = D.experimental.MultiWorkerMirroredStrategy(
    communication=D.experimental.CollectiveCommunication.NCCL)
# Create network
from sys import argv
resume_training = False
print(argv)
if "resume_latest" in argv:
    resume_training = True
with strat.scope():
    # Scheduler
    if isinstance(params["learning_rate"], str):
        # Get the string for the importable function
        lr = params["learning_rate"]
        from tensorflow.keras.callbacks import LearningRateScheduler
        # Use a dummy learning rate
        params["learning_rate"] = 0.1
        # model = create_model(**params)
        # Get the importable function
        lr = lr.split(".")
        baseImport = __import__(lr[0], globals(), locals(), [lr[1]], 0)
        lr = getattr(baseImport, lr[1])
        # Make a schedule
        lr = LearningRateScheduler(lr)
        callbacks.append(lr)
    # Resume Model?
    model_name = None
    if resume_training:
        initial_epoch, model_name = getInitialEpochsAndModelName(rank)
    if model_name is None:
        initial_epoch = 0
        model = create_model(**params)
        resume_training = False
    else:
        from tensorflow.keras.models import load_model
        model = load_model(model_name)
    # Load data from disk
    import numpy
    if "root" in params.keys():
        root = params['root']
    else:
        root = "./"
    if "filename" in params.keys():
        filename = params["filename"]
    else:
        filename = "dataset_timeseries.csv"
    restricted = [
        'euc1', 'e1', 'x1', 'y1', 'z1',
        'euc2', 'e2', 'x2', 'y2', 'z2',
        'euc3', 'e3', 'x3', 'y3', 'z3',
    ]
    x, y = getOneHot("{}/{}".format(root, filename), restricted=restricted, **params)
    # val_x, val_y = getOneHot("{}/{}".format(root, val_filename), restricted=restricted)
    val_x, val_y = None, None
    params["gbatch_size"] = params['batch_size'] * len(devices)
    print("x.shape =", x.shape)
    print("y.shape =", y.shape)
    print("epochs =", params['epochs'], type(params['epochs']))
    print("batch =", params['batch_size'], type(params['batch_size']))
    print("gbatch =", params['gbatch_size'], type(params['gbatch_size']))
    # Load data into a distributed dataset
    # Dataset object does nothing in place:
    # https://stackoverflow.com/questions/55645953/shape-of-tensorflow-dataset-data-in-keras-tensorflow-2-0-is-wrong-after-conver
    from tensorflow.data import Dataset
    data = Dataset.from_tensor_slices((x, y))
    # Create validation set
    v = params['validation']
    if val_x is not None:
        vrecord = val_x.shape[0]
        val = Dataset.from_tensor_slices((val_x, val_y))
        validation = val  # data.take(vrecord)
    else:
        vrecord = int(x.shape[0]*v)
        validation = data.take(vrecord)
    validation = validation.batch(params['gbatch_size'])
    validation = validation.repeat(params['epochs'])
    # Validation -- need to do kfold one day
    # This set should NOT be distributed
    vsteps = vrecord // params['gbatch_size']
    if vrecord % params['gbatch_size'] != 0:
        vsteps += 1
    # Shuffle the data during preprocessing or suffer...
    # Parallel randomness == nightmare
    # data = data.shuffle(x.shape[0])
    # Ordering these two things is very important!
    # Consider 3 elements, batch size 2 repeat 2
    # [1 2 3] -> [[1 2] [3]] -> [[1 2] [3] [1 2] [3]] (correct) batch -> repeat
    # [1 2 3] -> [1 2 3 1 2 3] -> [[1 2] [3 1] [2 3]] (incorrect) repeat -> batch
    # data = data.skip(vrecord)
    data = data.batch(params['gbatch_size'])
    data = data.repeat(params['epochs'])
    records = x.shape[0]  # - vrecord
    steps = records // params['gbatch_size']
    if records % params['gbatch_size']:
        steps += 1
    print("steps =", steps)
    # Note that if we are resuming that the number of _remaining_ epochs has
    # changed!
    # The number of epochs * steps is the numbers of samples to drop
    print("initial cardinality = ", data.cardinality())
    print("initial v cardinality = ", data.cardinality())
    data = data.skip(initial_epoch*steps)
    validation = validation.skip(initial_epoch*vsteps)
    print("final cardinality = ", data.cardinality())
    print("final v cardinality = ", data.cardinality())
    # data = strat.experimental_distribute_dataset(data)
    # Split into validation and training
    callbacks = createCallbacks(params, callbacks, rank, resume_training)
    print(callbacks)
    history = model.fit(data, epochs=params['epochs'],
                        batch_size=params['gbatch_size'],
                        steps_per_epoch=steps,
                        verbose=0,
                        initial_epoch=initial_epoch,
                        validation_data=validation,
                        validation_steps=vsteps,
                        callbacks=callbacks)
    if rank == 0:
        model.save("model-final")
    else:
        model.save("checkpoints/model-tmp")
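The deprecation notice in the log ("use distribute.MultiWorkerMirroredStrategy instead") points at the non-experimental constructor that already exists in TF 2.6. Here is a minimal sketch of that switch; it is only an assumption that this is related to the NCCL crash, it merely replaces the deprecated API:
import tensorflow as tf

# Request NCCL through CommunicationOptions instead of the deprecated
# `communication=` keyword of the experimental strategy.
options = tf.distribute.experimental.CommunicationOptions(
    implementation=tf.distribute.experimental.CommunicationImplementation.NCCL)
strat = tf.distribute.MultiWorkerMirroredStrategy(communication_options=options)
If the segfault persists, substituting CommunicationImplementation.RING in the same place is a quick way to check whether the crash is specific to NCCL.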

PyBullet on Colab, cannot connect to X server

I am using rl-baselines-zoo 3 to run DDPG with my custom env on Colab. After I used the show-video function in that zoo repo, it said it cannot connect to the X server. It works fine with the other built-in envs, so I guess it's a problem with my env. Please, I need some help...
I set everything up following the zoo's tutorials.
Traceback:
pybullet build time: Jul 12 2021 20:46:20
/usr/local/lib/python3.7/dist-packages/gym/logger.py:30: UserWarning:
WARN: Box bound precision lowered by casting to float32
startThreads creating 1 threads.
starting thread 0
started thread 0
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=VMware, Inc.
GL_RENDERER=llvmpipe (LLVM 10.0.0, 256 bits)
GL_VERSION=3.3 (Core Profile) Mesa 20.0.8
GL_SHADING_LANGUAGE_VERSION=3.30
pthread_getconcurrency()=0
Version = 3.3 (Core Profile) Mesa 20.0.8
Vendor = VMware, Inc.
Renderer = llvmpipe (LLVM 10.0.0, 256 bits)
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0
MotionThreadFunc thread started
ven = VMware, Inc.
ven = VMware, Inc.
Wrapping the env in a VecTransposeImage.
tcmalloc: large alloc 3276800000 bytes == 0x556b03bda000 # 0x7f7cad04a001 0x7f7caa3f554f 0x7f7caa445b58 0x7f7caa449b17 0x7f7caa4e8203 0x556a81194d54 0x556a81194a50 0x556a81209105 0x556a812037ad 0x556a81196c9f 0x556a811d7d79 0x556a811d4cc4 0x556a81196ea1 0x556a81205bb5 0x556a8119630a 0x556a812087f0 0x556a812037ad 0x556a811963ea 0x556a8120460e 0x556a812034ae 0x556a811963ea 0x556a8120532a 0x556a812034ae 0x556a812031b3 0x556a81201660 0x556a81194b59 0x556a81194a50 0x556a81208453 0x556a812034ae 0x556a811963ea 0x556a812043b5
tcmalloc: large alloc 3276800000 bytes == 0x556bc78da000 # 0x7f7cad04a001 0x7f7caa3f554f 0x7f7caa445b58 0x7f7caa449b17 0x7f7caa4e8203 0x556a81194d54 0x556a81194a50 0x556a81209105 0x556a812037ad 0x556a81196c9f 0x556a811d7d79 0x556a811d4cc4 0x556a81196ea1 0x556a81205bb5 0x556a8119630a 0x556a812087f0 0x556a812037ad 0x556a811963ea 0x556a8120460e 0x556a812034ae 0x556a811963ea 0x556a8120532a 0x556a812034ae 0x556a812031b3 0x556a81201660 0x556a81194b59 0x556a81194a50 0x556a81208453 0x556a812034ae 0x556a811963ea 0x556a812043b5
/content/gdrive/My Drive/hsr/rl-baselines3-zoo/logs/ddpg/FoodHuntingHSR-v0_3/videos/final-model-ddpg-FoodHuntingHSR-v0-step-0-to-step-200.mp4
/usr/local/lib/python3.7/dist-packages/gym/logger.py:30: UserWarning:
WARN: Tried to pass invalid video frame, marking as broken: Your frame has data type float32, but we require uint8 (i.e. RGB values from 0-255).
Saving video to /content/gdrive/My Drive/hsr/rl-baselines3-zoo/logs/ddpg/FoodHuntingHSR-v0_3/videos/final-model-ddpg-FoodHuntingHSR-v0-step-0-to-step-200.mp4
numActiveThreads = 0
stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed
finished
numActiveThreads = 0
btShutDownExampleBrowser stopping threads
Thread with taskId 0 exiting
Thread TERMINATED
destroy semaphore
semaphore destroyed
destroy main semaphore
main semaphore destroyed
Exception ignored in: <function VecVideoRecorder.__del__ at 0x7f7c2b5cc200>
Traceback (most recent call last):
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/vec_video_recorder.py", line 114, in __del__
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/vec_video_recorder.py", line 110, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/base_vec_env.py", line 278, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/dummy_vec_env.py", line 67, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/monitor.py", line 113, in close
File "/usr/local/lib/python3.7/dist-packages/gym/core.py", line 243, in close
File "/usr/local/lib/python3.7/dist-packages/gym/core.py", line 243, in close
File "/content/gdrive/My Drive/hsr/PyLIS/gym-foodhunting/gym_foodhunting/foodhunting/gym_foodhunting.py", line 538, in close
pybullet.error: Not connected to physics server
class FoodHuntingEnv(gym.Env):
    metadata = {'render.modes': ['human', 'rgb_array']}
    GRAVITY = -10.0
    BULLET_STEPS = 120  # p.setTimeStep(1.0 / 240.0), so 1 gym step == 0.5 sec.

    def __init__(self, render=False, robot_model=R2D2, max_steps=500, num_foods=3, num_fakes=0, object_size=1.0, object_radius_scale=1.0, object_radius_offset=1.0, object_angle_scale=1.0):
        """Initialize environment."""
        ### gym variables
        self.observation_space = robot_model.getObservationSpace()  # classmethod
        self.action_space = robot_model.getActionSpace()  # classmethod
        self.reward_range = (-1.0, 1.0)
        self.seed()
        ### pybullet settings
        self.ifrender = render
        self.physicsClient = p.connect(p.GUI if self.ifrender else p.DIRECT)
        p.setAdditionalSearchPath(pybullet_data.getDataPath())
        ### env variables
        self.robot_model = robot_model
        self.max_steps = max_steps
        self.num_foods = num_foods
        self.num_fakes = num_fakes
        self.object_size = object_size
        self.object_radius_scale = object_radius_scale
        self.object_radius_offset = object_radius_offset
        self.object_angle_scale = object_angle_scale
        self.plane_id = None
        self.robot = None
        self.object_ids = []
        ### episode variables
        self.steps = 0
        self.episode_rewards = 0.0

    def close(self):
        """Close environment."""
        p.disconnect(self.physicsClient)

    def reset(self):
        """Reset environment."""
        self.steps = 0
        self.episode_rewards = 0
        p.resetSimulation()
        # p.setTimeStep(1.0 / 240.0)
        p.setGravity(0, 0, self.GRAVITY)
        self.plane_id = p.loadURDF('plane.urdf')
        self.robot = self.robot_model()
        self.object_ids = []
        for i, (pos, orn) in enumerate(self._generateObjectPositions(num=(self.num_foods+self.num_fakes), radius_scale=self.object_radius_scale, radius_offset=self.object_radius_offset, angle_scale=self.object_angle_scale)):
            if i < self.num_foods:
                urdfPath = 'food_sphere.urdf'
            else:
                urdfPath = 'food_cube.urdf'
            object_id = p.loadURDF(urdfPath, pos, orn, globalScaling=self.object_size)
            self.object_ids.append(object_id)
        for i in range(self.BULLET_STEPS):
            p.stepSimulation()
        obs = self._getObservation()
        #print('reset laile')
        #self.robot.printAllJointInfo()
        return obs

    def step(self, action):
        """Apply action to environment, then return observation and reward."""
        self.steps += 1
        self.robot.setAction(action)
        reward = -1.0 * float(self.num_foods) / float(self.max_steps)  # so agent needs to eat foods quickly
        for i in range(self.BULLET_STEPS):
            p.stepSimulation()
            reward += self._getReward()
        self.episode_rewards += reward
        obs = self._getObservation()
        done = self._isDone()
        pos, orn = self.robot.getPositionAndOrientation()
        info = {'steps': self.steps, 'pos': pos, 'orn': orn}
        if done:
            #print('Done laile')
            info['episode'] = {'r': self.episode_rewards, 'l': self.steps}
            # print(self.episode_rewards, self.steps)
            #print(self.robot.getBaseRollPosition(), self.robot.getTorsoLiftPosition(), self.robot.getHeadPosition(), self.robot.getArmPosition(), self.robot.getWristPosition(), self.robot.getGripperPosition()) # for HSR debug
            #print(self.robot.getHeadPosition(), self.robot.getGripperPosition()) # for R2D2 debug
        return obs, reward, done, info

    def render(self, mode='human', close=False):
        """This is a dummy function. This environment cannot control rendering timing."""
        if mode != 'rgb_array':
            return np.array([])
        return self._getObservation()

    def seed(self, seed=None):
        """Set random seed."""
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

    def _getReward(self):
        """Detect contact points and return reward."""
        reward = 0
        contacted_object_ids = [object_id for object_id in self.object_ids if self.robot.isContact(object_id)]
        for object_id in contacted_object_ids:
            reward += 1 if self._isFood(object_id) else -1
            p.removeBody(object_id)
            self.object_ids.remove(object_id)
        return reward

    def _getObservation(self):
        """Get observation."""
        obs = self.robot.getObservation()
        return obs

    def _isFood(self, object_id):
        """Check if object_id is a food."""
        baseLink, urdfPath = p.getBodyInfo(object_id)
        return urdfPath == b'food_sphere.urdf'  # otherwise, fake

    def _isDone(self):
        """Check if episode is done."""
        #print(self.object_ids,'self')
        available_object_ids = [object_id for object_id in self.object_ids if self._isFood(object_id)]
        #print(available_object_ids)
        return self.steps >= self.max_steps or len(available_object_ids) <= 0

    def _generateObjectPositions(self, num=1, retry=100, radius_scale=1.0, radius_offset=1.0, angle_scale=1.0, angle_offset=0.5*np.pi, z=1.5, near_distance=0.5):
        """Generate food positions randomly."""
        def genPos():
            r = radius_scale * self.np_random.rand() + radius_offset
            a = -np.pi * angle_scale + angle_offset
            b = np.pi * angle_scale + angle_offset
            ang = (b - a) * self.np_random.rand() + a
            return np.array([r * np.sin(ang), r * np.cos(ang), z])
        def isNear(pos, poss):
            for p, o in poss:
                if np.linalg.norm(p - pos) < near_distance:
                    return True
            return False
        def genPosRetry(poss):
            for i in range(retry):
                pos = genPos()
                if not isNear(pos, poss):
                    return pos
            return genPos()
        poss = []
        for i in range(num):
            pos = genPosRetry(poss)
            orn = p.getQuaternionFromEuler([0.0, 0.0, 2.0*np.pi*self.np_random.rand()])
            poss.append((pos, orn))
        return poss
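The final pybullet.error ("Not connected to physics server") is raised from close() when the video recorder's __del__ triggers a second close after the client is already gone. Here is a small self-contained sketch of a guarded disconnect; it is an illustration only, not the repository's actual fix:
import pybullet as p

client = p.connect(p.DIRECT)

def safe_disconnect(client_id):
    # p.isConnected() reports 0 once the client is gone, so a second call is a no-op
    if client_id is not None and p.isConnected(client_id):
        p.disconnect(client_id)

safe_disconnect(client)  # disconnects the physics server
safe_disconnect(client)  # silently does nothing instead of raising
Inside FoodHuntingEnv.close(), the same guard around p.disconnect(self.physicsClient) would make repeated closes harmless.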

Debug when "failed to create the sampler"

Is it possible to debug the following error, and is there a function corresponding to browser()?
I want to extract objects of the transformed data block in a Stan file when the function sampling() fails to create a stanfit object.
failed to create the sampler; sampling not done
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) :
Exception: binomial_rng: Probability parameter is nan, but must be in the interval [0, 1] (in 'model23b420c17ad_SBC' at line 247)
AN ANSWER: The print() Function in a Stan File as a Debugger:
Using print(), we can print any object in the transformed data block, regardless of whether rstan::sampling() succeeds.
m <- rstan::stan_model(model_code = '
data { real x; }
transformed data {
  real z;
  z = Phi( (-1.14194 + 2.66963) / (-0.257783) );
  print("Here, we can use print() as a debugger")
  print("")
  print("z = ", z)
}
parameters { real y; }
model { y ~ normal(z, 1); }
generated quantities { real zhat = z; }
')
f <- rstan::sampling(m, data = list(x = 1), iter = 100, chains = 1)
rstan::extract(f)[["zhat"]]
As a result of running the above toy code, the specified objects in the Stan file are printed to the R console as follows:
> f <- rstan::sampling(m, data=list(x=1), iter = 100,chains=1)
Here, we can use print() as a debugger
z = 1.54953e-009
SAMPLING FOR MODEL 'adff65652652045694506de44240c84c' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: WARNING: There aren't enough warmup iterations to fit the
Chain 1: three stages of adaptation as currently configured.
Chain 1: Reducing each adaptation stage to 15%/75%/10% of
Chain 1: the given number of warmup iterations:
Chain 1: init_buffer = 7
Chain 1: adapt_window = 38
Chain 1: term_buffer = 5
Chain 1:
Chain 1: Iteration: 1 / 100 [ 1%] (Warmup)
Chain 1: Iteration: 10 / 100 [ 10%] (Warmup)
Chain 1: Iteration: 20 / 100 [ 20%] (Warmup)
Chain 1: Iteration: 30 / 100 [ 30%] (Warmup)
Chain 1: Iteration: 40 / 100 [ 40%] (Warmup)
Chain 1: Iteration: 50 / 100 [ 50%] (Warmup)
Chain 1: Iteration: 51 / 100 [ 51%] (Sampling)
Chain 1: Iteration: 60 / 100 [ 60%] (Sampling)

How to fix the "list index out of range" error?

The code runs perfectly with custom input, but when run on a competitive programming platform it shows a runtime error.
I have searched for this but couldn't resolve it.
def GCD(num1, num2):
    if num1 < num2:
        small = num1
    else:
        small = num2
    for i in range(1, small + 1):
        if (num1 % i == 0) and (num2 % i == 0):
            gcd = i
    return gcd

arr = [int(i) for i in input().split(' ')]
print(GCD(arr[0], arr[1]))
Runtime Error
Traceback (most recent call last):
  File "Main.py", line 10, in <module>
    print(GCD(arr[0], arr[1]))
IndexError: list index out of range
In the last line you are printing the GCD of the first and second elements of the array, which are at indices 0 and 1. But if the user enters only a single number, index 1 is out of range. So you can check whether the size of the array is less than two and, if so, simply print the element at index 0.
if len(arr) == 1:
    print(arr[0])
else:
    print(GCD(arr[0], arr[1]))
Furthermore, if you are trying to find the GCD of the whole array, that approach is wrong: you have to iterate over the array and accumulate the GCD.
if len(arr) == 1:
    print(arr[0])
else:
    answer = arr[0]
    for i in range(len(arr)):
        answer = GCD(answer, arr[i])
    print(answer)
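For reference, here is a shorter standard-library version of the same idea, assuming the goal is the GCD of all numbers on the input line:
from functools import reduce
from math import gcd

arr = [int(i) for i in input().split(' ')]
# reduce() returns arr[0] unchanged when there is only one number,
# so the single-element case needs no special handling.
print(reduce(gcd, arr))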

Exception appending DataFrame chunk with string values to large HDF5 file using pandas

An exception happens while appending a pandas.DataFrame with string values (numerical values are fine) to HDF5 storage once the file size exceeds approximately 47 GiB. Neither the minimal size of the strings, the number of records, nor the number of columns matters; only the file size does.
The bottom of the exception trace:
File "..\..\hdf5-1.8.14\src\H5FDsec2.c", line 822, in H5FD_sec2_write
file write failed: time = Tue Aug 18 18:26:17 2015
, filename = 'large_file.h5', file descriptor = 4, errno = 22, error message = 'Invalid argument', buf = 0000000066A40018, total write size = 262095, bytes this sub-write = 262095, bytes actually written = 18446744073709551615, offset = 47615949533
The code to reproduce:
import numpy as np
import pandas as pd

for i in range(200):
    df = pd.DataFrame(np.char.mod('random string object (%f)', np.random.rand(5000000, 3)), columns=('A', 'B', 'C'))
    print('writing chunk №', i, '...', end='', flush=True)
    with pd.HDFStore('large_file.h5') as hdf:
        # Construct unique index
        try:
            nrows = hdf.get_storer('df').nrows
        except:
            nrows = 0
        df.index = pd.Series(df.index) + nrows
        # Append the dataframe to the storage. Exception happens here
        hdf.append('df', df, format='table')
    print('done')
Environment:
Windows 7 x64, Python 3.4.3, pandas 0.16.2, PyTables 3.2.0, HDF5 1.8.14.
The question is how to fix the problem if it lies in the Python code above, or how to avoid it if it is related to HDF5. Thanks.
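Not from the original post: since the failure tracks absolute file size rather than content, one hedged workaround is to roll over to a new HDF5 file before reaching the size where writes started failing. The 40 GiB threshold and the part-file naming below are arbitrary placeholders:
import os
import numpy as np
import pandas as pd

MAX_BYTES = 40 * 2**30   # stay below the ~47 GiB point where writes failed (arbitrary margin)
part, offset = 0, 0      # current part-file index and global row offset

for i in range(200):
    df = pd.DataFrame(np.char.mod('random string object (%f)', np.random.rand(5000000, 3)),
                      columns=('A', 'B', 'C'))
    path = 'large_file_part{:03d}.h5'.format(part)       # hypothetical naming scheme
    if os.path.exists(path) and os.path.getsize(path) > MAX_BYTES:
        part += 1
        path = 'large_file_part{:03d}.h5'.format(part)
    df.index = pd.Series(df.index) + offset              # keep the index globally unique
    with pd.HDFStore(path) as hdf:
        hdf.append('df', df, format='table')
    offset += len(df)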