Does High-Ram in Google Colab refer to GPU RAM or CPU RAM? - gpu

I see that you get better GPUs with the Pro account. But does switching to High-Ram affect the GPU's memory or the CPU's memory? I don't see it spelled out exactly anywhere, and maybe this is just common knowledge among ML experts.

Related

Colab pro and GPU availability

I need GPU for my project. Till now I had limited use and used Colab free. Now I think I may need as much as 3 hours a day. Now it says GPU is not available because they are already taken. My question is, what effect does upgrading to Colab pro have on GPU availability? How many hours should I expect to have GPU and are these hours arbitrary chosen by me or not?
I referred Here and There but no good detail about GPU availability is given.
In Their website they tell that these limitations vary and depends on previous usage, and a precise answer might not be even available, so even an approximated answer is welcome.
Thanks.
Yeah.I had the same experience that GPU is not available in colab.
Why not try gpushare.com to run 3090 or 2080ti with free credit.
The platform supports the most popular machine learning frameworks,like TensorFlow and PyTorch,users can be fast to instantiate a VM image.
I think it's appropriate to accelerate your model training.

Is there any limitations for google colab other than the session timeout after 12 hours?

one of the limitations is that we can get only 12 continuous hours per session. Is there any limitations for the usage for GPU and TPU?
Yes, you can only use 1 GPU with a limited memory of 12GB and TPU has 64 GB High Bandwidth Mmeory.You can read here in this article.
So, if you want to use large dataset then I would recommend you to use tf.data.Dataset for preparing it before training.
If you want to use GPUs you can use any TF version. But for TPU I would recommend using TF1.14.
From Colab's documentation,
In order to be able to offer computational resources for free, Colab needs to maintain the flexibility to adjust usage limits and hardware availability on the fly. Resources available in Colab vary over time to accommodate fluctuations in demand, as well as to accommodate overall growth and other factors.
In a nutshell, Colab has dynamic resource provisioning. So they can change the hardware, it it is being taxed too much automatically.
Google giveth and Google taketh away.
Link

How does TensorFlow use both shared and dedicated GPU memory on the GPU on Windows 10?

When running a TensorFlow job I sometimes get a non-fatal error that says GPU memory exceeded, and then I see the "Shared memory GPU usage" go up on the Performance Monitor on Windows 10.
How does TensorFlow achieve this? I have looked at CUDA documentation and not found a reference to the Dedicated and Shared concepts used in the Performance Monitor. There is a Shared Memory concept in CUDA but I think it is something on the device, not the RAM I see in the Performance Monitor, which is allocated by the BIOS from CPU RAM.
Note: A similar question was asked but not answered by another poster.
Shared memory in windows 10 does not refer to the same concept as cuda shared memory (or local memory in opencl), it refers to host accessible/allocated memory from the GPU. For integrated graphics processing host and device memory is usually the same as shared thanks to both the cpu and gpu being located on the same die and being able to access the same ram. For dedicated graphics with their own memory, this is separate memory allocated on the host side for use by the GPU.
Shared memory for compute APIs such as through GLSL compute shaders, or Nvidia CUDA kernels refer to a programmer managed cache layer (some times refereed to as "scratch pad memory") which on Nvidia devices, exists per SM, and can only be accessed by a single SM and is usually between 32kB to 96kB per SM. Its purpose is to speed up memory access to data which is used often.
If you see and increase shared memory used in Tensorflow, you have a dedicated graphics card, and you are experiencing "GPU memory exceeded" it most likely means you are using too much memory on the GPU itself, so it is trying to allocate memory from elsewhere (IE from system RAM). This potentially can make your program much slower as the bandwidth and latency will be much worse on non device memory for a dedicated graphics card.
I think I figured this out by accident. The "Shared GPU Memory" reported by Windows 10 Task Manager Performance tab does get used, if there are multiple processes hitting the GPU simultaneously. I discovered this by writing a Python programming that used multiprocessing to queue up multiple GPU tasks, and I saw the "Shared GPU memory" start filling up. This is the only way I've seen it happen.
So it is only for queueing tasks. Each individual task is still limited to the onboard DRAM minus whatever is permanently allocated to actual graphics processing, which seems to be around 1GB.

Getting the most of the GPU in an Embedded Platform

My platform is Ubuntu running ob Exynos4412CPU which has the Mali400GPU. I would like to do some computer vision using OpenCV and OpenGL, I'm also going to do some fragment shaders. My question is what is the fastest way to copy the contents from the GPU to the CPU, which is really slow on my platform using glreadpixels. Is it beneficial to utilize glreadpixels in its own thread or use OpenMP ? Suggestions are welcome please :).
The Exynos 4412 doesn't have separate CPU and GPU memory at the hardware level; it's all the same RAM and physically accessible by both. Thus, there is likely to be some way to access the GPU's portion of the memory directly from the CPU.

AMD Fusion how can CPU and GPU share the same memory?

AMD announced it's Fusion platform some time ago. Having read a bit about it I'm both excited and sceptic. For example it should make it possible that GPUs and CPUs share the same memory. (and the GPU and CPU are both in the same package) Now since GPUs have a much higher memory bandwidth (around 10x the bandwidth a CPU has) and that the way CPUs and GPUs use cache is fundamentally different, the question arises how the heck can they do this? I wonder if any details are known.
I have also searched for some detailed info on how this APU technically works, but haven't found anything better than AMD's whitepaper on the subject, which, in a slightly marketing-wise tone, does present a lot of good info.
By using high-bandwidth dual ported RAM. AMD is happy to explain.
Here is a good paper (from AMD) explaining the different memory access patterns in an APU
http://amddevcentral.com/afds/assets/presentations/1004_final.pdf
CPUs and GPUs have been sharing memory for years. Fusion is no different, in fact it's far from the first instance of a GPU being integrated into a general-purpose CPU core.
Like all other such solutions, it'll be fine for casual use, but it'll be far from cutting-edge 3D acceleration.