How can you check the number of CUDA cores that are being used from the NVIDIA Visual Profiler? I have the image below taken from the Visual Profiler. Is it just based on the number of "Thread" objects on the left side?
Related
Where can I purchase a good VGA card (GPU) for Microsoft Cognitive Toolkit (CNTK)/TensorFlow programming? Can you suggest a commonly used GPU model at an affordable price?
For TensorFlow programming you can find a list of CUDA-compatible graphics cards here and choose whichever fits your needs best, but note that if you want to use the prebuilt tensorflow-gpu binaries you need a card with a CUDA compute capability of 3.5 or higher (at least on Windows).
If you want to run TensorFlow on Windows 10 with a graphics card of CUDA compute capability 3.0, you could look at this: you would build from source with some edits to the CMakeList here. That would let you use a card with 3.0 capability.
As a fair warning on building your own TensorFlow binary, though: "don't build a TensorFlow binary yourself unless you are very comfortable building complex packages from source and dealing with the inevitable aftermath should things not go exactly as documented." (from here)
You need a CUDA-compatible graphics card with Compute Capability (CC) 3.0 or higher to use CNTK's GPU capabilities.
You can find CUDA-compatible graphics cards here and here (for older cards).
You can also use Microsoft Azure’s specialized Deep Learning Virtual Machine if you’re considering the cloud as an option.
On Windows 8
Is there a way to increase the 2 GB process memory limit? My script needs 2.5 GB of RAM to run, even after I performed garbage collection to the best of my knowledge.
I need to run in 64-bit (this is not related to largeaddressaware).
I am using a Kinect for Xbox (v1) camera on Windows to compute skeleton data and RGB data. I retrieve 30 frames per second, calculate the joint values of the human body, and then calculate the angles between joints (a sketch of that computation follows below). I want my laptop/system to compute the joint values and angles faster and store them into a directory, but the laptop I am currently using computes them very slowly.
The specifications of my laptop are:
500 GB hard disk
6.00 GB RAM
1.7 GHz processor
Kindly tell me which system I should use to get faster calculations; I want a really fast system/laptop for these computations. If anyone has an idea, please tell me.
Please also tell me the complete specifications of such a system. I want to use the latest, fastest technology, or any machine that resolves my issue.
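For reference, the joint-angle computation in question is plain vector arithmetic and is cheap on any modern CPU. A minimal sketch, assuming each joint is a 3-D point (the Joint3 type and AngleAt helper are illustrative and not part of the Kinect SDK):
Public Structure Joint3
    Public X, Y, Z As Double
End Structure

' Angle in degrees at joint B, formed by the segments B->A and B->C.
Public Function AngleAt(a As Joint3, b As Joint3, c As Joint3) As Double
    Dim ux = a.X - b.X, uy = a.Y - b.Y, uz = a.Z - b.Z
    Dim vx = c.X - b.X, vy = c.Y - b.Y, vz = c.Z - b.Z
    Dim dot = ux * vx + uy * vy + uz * vz
    Dim lenU = Math.Sqrt(ux * ux + uy * uy + uz * uz)
    Dim lenV = Math.Sqrt(vx * vx + vy * vy + vz * vz)
    ' Clamp to [-1, 1] to avoid NaN from floating-point rounding.
    Dim cosine = Math.Max(-1.0, Math.Min(1.0, dot / (lenU * lenV)))
    Return Math.Acos(cosine) * 180.0 / Math.PI
End Function
At 30 fps this is a handful of multiplications and one Acos per joint pair, which is trivial work; if the pipeline is slow, per-frame disk writes are a more likely bottleneck than the arithmetic.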
Your computer must have the following minimum capabilities:
32-bit (x86) or 64-bit (x64) processors
Dual-core, 2.66-GHz or faster processor
USB 2.0 bus dedicated to the Kinect
2 GB of RAM
Graphics card that supports DirectX 9.0c
Source: MSDN
Anyway, I suggest:
A desktop PC
with a multi-core processor at 3 GHz or faster (more is usually better)
with a GPU compatible with DirectX 11 and C++ AMP
I've got a program that takes about 24 hours to run. It's all written in VB.NET and it's about 2000 lines long. It's already multi-threaded, and this works perfectly (after some sweat and tears). I typically run the processing with 10 threads, but I'd like to increase that to reduce processing time, which is where using the GPU comes into it. I've searched Google for everything related that I can think of, but had no luck.
What I'm hoping for is a basic example of a VB.NET project that does some general operations, then sends some threads to the GPU for processing. Ideally I don't want to have to pay for it. So, something like:
' Do some initial processing, e.g.:
Dim x As Integer
Dim y As Integer
Dim z As Integer
x = Integer.Parse(TextBox1.Text)
y = Integer.Parse(TextBox2.Text)
z = x * y
' Do some multi-threaded operations on the GPU, e.g. ...
' Show some output to the user once this has finished.
Any help or links will be much appreciated. I've read plenty of articles about this in C++ and other languages, but I'm rubbish at understanding other languages!
Thanks all!
Fraser
The VB.NET compiler does not compile for the GPU; it compiles down to an intermediate language (IL) that is then just-in-time compiled (JITed) for the target architecture at runtime. Currently only x86, x64 and ARM targets are supported. CUDAfy (see below) takes the IL and translates it into CUDA C code, which is in turn compiled with NVCC to generate code that the GPU can execute. Note that this means you are limited to NVIDIA GPUs, as CUDA is not supported on AMD.
There are other projects that have taken the same approach, for example a Python to CUDA translator in Copperhead.
CUDAfy - A wrapper on top of the CUDA APIs with additional libraries for FFTs etc. There is also a commercial version.
CUDAfy Translator
Using SharpDevelop's decompiler ILSpy as its basis, the translator converts .NET code to CUDA C.
There are other projects to allow you to use GPUs from .NET languages. For example:
NMath - A set of math libraries that can be used from .NET and are GPU enabled.
There may be others, but these seem to be the main ones. If you decide to use CUDAfy then you will still need to invest some time in understanding enough of CUDA and how GPUs work to port your algorithm to fit the GPU's data-parallel model, unless it is something that can be done out of the box with one of the math libraries.
It's important to realize that there is still a performance hit for accessing the GPU from a .NET language. You must pay a cost for moving (marshaling) data from the .NET managed runtime into the native runtime. The overhead here depends not only on the size but also on the type of data, and on whether it can be marshaled without conversion. This is in addition to the cost of moving data from the CPU to the GPU and back.
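As a concrete starting point, here is a sketch of CUDAfy's classic vector-add example rewritten in VB.NET. It is adapted from CUDAfy's published C# samples; the type and method names follow the CUDAfy.NET API, but treat it as an illustrative sketch (assuming default VB settings, i.e. Option Strict Off for the late-bound Launch call) rather than a verified program:
Imports Cudafy
Imports Cudafy.Host
Imports Cudafy.Translator

Module VectorAdd
    Const N As Integer = 1024

    ' Methods marked <Cudafy> are translated from IL to CUDA C and run on the GPU.
    <Cudafy>
    Public Sub AddKernel(thread As GThread, a As Integer(), b As Integer(), c As Integer())
        Dim tid As Integer = thread.blockIdx.x
        If tid < N Then c(tid) = a(tid) + b(tid)
    End Sub

    Sub Main()
        ' Translate this module's IL to CUDA C and compile it with NVCC.
        Dim km As CudafyModule = CudafyTranslator.Cudafy()
        Dim gpu As GPGPU = CudafyHost.GetDevice(eGPUType.Cuda)
        gpu.LoadModule(km)

        Dim a(N - 1), b(N - 1), c(N - 1) As Integer
        For i = 0 To N - 1
            a(i) = i
            b(i) = 2 * i
        Next

        ' Allocate GPU memory and copy the inputs across
        ' (the marshaling/transfer cost described above).
        Dim devA = gpu.Allocate(Of Integer)(a)
        Dim devB = gpu.Allocate(Of Integer)(b)
        Dim devC = gpu.Allocate(Of Integer)(c)
        gpu.CopyToDevice(a, devA)
        gpu.CopyToDevice(b, devB)

        ' Launch N blocks of one thread each (late-bound call), then copy back.
        gpu.Launch(N, 1).AddKernel(devA, devB, devC)
        gpu.CopyFromDevice(devC, c)
        gpu.FreeAll()
    End Sub
End Module
Note how the explicit CopyToDevice/CopyFromDevice calls surface the data-movement costs discussed above; minimizing those round trips is usually the first optimization to make.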
Does Parallel.For in .NET 4.0 take advantage of GPU computing automatically? Or do I have to configure some drivers so that it uses the GPU?
No, PFX doesn't do that for you. Take a look at Microsoft Accelerator to run some code on a GPU. I recommend in particular Tomas Petricek's series of articles on F# and Accelerator.
Also watch the gpu branch of LinqOptimizer.
If you want to take advantage of GPU parallelism for .NET, try the open source Brahma library, noting that its current incarnation targets .NET 3.5. It's LINQ-able, just not 4.0 parallel-LINQ-able.
Parallel.For does not use the GPU on your graphics card. It uses threads, multiple cores and hyper-threading to achieve its goals. There are no drivers available that will allow Parallel.For to make use of the GPU.
The only technology I know of that allows you to parallelize general-purpose work on the GPU (other than graphics work) is CUDA.
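To make the distinction concrete, here is a minimal Parallel.For sketch; the iterations are partitioned across CPU cores by the thread pool, and nothing in it touches the GPU:
Imports System.Threading.Tasks

Module ParallelForDemo
    Sub Main()
        Dim results(999) As Double
        ' Iterations are spread across the machine's CPU cores, not the GPU.
        Parallel.For(0, results.Length,
                     Sub(i)
                         results(i) = Math.Sqrt(i) * Math.Sin(i)
                     End Sub)
        Console.WriteLine("Computed " & results.Length & " values on the CPU.")
    End Sub
End Module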