A nice starter kit for OpenCL? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
I've got some experience with OpenGL and its programmable pipeline. I'd like to give OpenCL a try, though.
Could somebody propose a nice integrated kit for working with OpenCL?
I know only of Quartz Composer, which looks nice, but it's Mac-only. Does anyone know whether it supports hand-editing of OpenCL kernels, or is it all done through the GUI?
Any other Linux / Windows alternative?

Quartz Composer does have an OpenCL "patch," in which you can hand-edit your kernel. It's a pretty nice way to experiment with stuff like CL-based vertex or color generation, which you can then display on subsequent patches. Once you get something working there, you can usually make the jump to pure C/C++ code that utilizes the CL/GL interop facilities of your platform.
Using CL as above will definitely give you a feel for the OpenCL C language. You will still need to learn about the OpenCL runtime facilities, however.
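To give a first taste of that language: a minimal OpenCL C kernel looks something like the sketch below. This is a generic vector-add example (the kernel and argument names are made up for illustration), not tied to Quartz Composer's patch conventions; it is device code only, and the host side still has to compile and enqueue it through the OpenCL runtime API.

```c
// Minimal OpenCL C kernel: one work-item computes one output element.
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *out)
{
    size_t i = get_global_id(0);  // global index of this work-item
    out[i] = a[i] + b[i];
}
```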

OpenCL Studio
OpenCL Studio hides much of the boilerplate code you would otherwise have to write by providing a Lua-based infrastructure; that is, you write the host code (the code running on the CPU) in Lua. It also comes with a bunch of examples.
The GUI sometimes feels a bit rough, and I didn't find much help/documentation on how to use the software itself.
All in all, I think it dramatically reduces the effort of learning OpenCL.

(Disclaimer: I'm the developer of OpenCLHelper).
I'm not sure exactly what your requirements are; if you're looking for a full IDE for OpenCL, complete with a debugger and so on, then you can stop reading this reply.
However, if what you want is a way of using OpenCL with much less boilerplate, one that makes it easy to load kernels and pass arguments to them, then you might consider OpenCLHelper:
https://github.com/hughperkins/OpenCLHelper
OpenCLHelper:
handles much of the boilerplate of initializing OpenCL, locating devices, creating queues
makes it easy to pass arguments to, and receive arguments from, a kernel
uses clew, so that binding to OpenCL happens at runtime, in case you need OpenCL to be an optional part of your program rather than a mandatory dependency
makes it relatively painless to take data from one kernel and provide it to the next, without moving it back and forth between GPU and host memory

For Windows (Vista, 7) try VS2010 C++ Express
http://www.microsoft.com/visualstudio/en-us/products/2010-editions/visual-cpp-express
with the AMD App SDK
http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx
There are lots of OpenCL examples. It installs and runs on any Intel x86 or x64 CPU made in the last 7 years.

Some options are
AMD APP SDK
The AMD APP SDK comes with an OpenCL compiler, a profiler, and a Kernel Analyzer.
You can also try AMD gDEBugger, which is probably the best OpenCL GPU debugging tool you can get, compared to what NVIDIA and Intel offer.
NVIDIA OpenCL pack
For the NVIDIA OpenCL pack, you can sign up and download OpenCL samples, Nsight, the Visual Profiler, etc.
Intel OpenCL SDK
The Intel OpenCL SDK is another option if you would like to run your programs on Sandy Bridge processors.
COPRTHR SDK
Brown Deer Technology has the COPRTHR SDK, which contains a rich set of OpenCL libraries.
MulticoreWare Tools
Another option is MulticoreWare's tools, which combine CUDA/OpenCL, Pyon, and DSL code and generate ISA for the underlying hardware.

Related

Is it safer to use OpenCL rather than SYCL when the objective is to have the most hardware-compatible program?

My objective is to be able to parallelize code so it can run on a GPU, and the holy grail would be software that can run in parallel on any GPU or even CPU (Intel, NVIDIA, AMD, and so on).
From what I understood, the best solution would be to use OpenCL. But shortly after that, I also read about SYCL, which is supposed to simplify code that runs on GPUs.
But is that all? Isn't it better to use a lower-level language to be sure it can be used on as much hardware as possible?
I know that all the compatibilities are listed on the Khronos Group website, but I am reading everything and its opposite on the Internet (e.g. that if an NVIDIA card supports CUDA then it supports OpenCL, or that NVIDIA cards will never work with OpenCL, even though OpenCL is supposed to work with everything)...
This is a new topic to me and there is a lot of information on the Internet... It would be great if someone could give me a simple answer to this question.
Probably yes.
OpenCL is supported on all AMD/Nvidia/Intel GPUs and on all Intel CPUs since around 2009. For best compatibility with almost any device, use OpenCL 1.2. The nice thing is that the OpenCL Runtime is included in the graphics drivers, so you don't have to install anything extra to work with it or to get it working on another machine.
SYCL on the other hand is newer and not yet established that well. For example, it is not officially supported (yet) on Nvidia GPUs: https://forums.developer.nvidia.com/t/is-sycl-available-on-cuda/37717/7
But there are already SYCL implementations that are compatible with Nvidia/AMD GPUs, essentially built on top of CUDA or OpenCL 1.2; see here: https://stackoverflow.com/a/63468372/9178992

Developing in PTX instead of CUDA for optimization. Does it make sense? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
I'm developing CUDA code, but new device languages with PTX or SPIR backends have been announced, and I have come across some applications being developed in them. So I think we can say that the PTX language is mature enough to develop something at product level.
As we know, PTX is not real device code; it is just an intermediate language for NVIDIA. But my question is: what if I develop in PTX instead of CUDA? Can I write naturally optimized code if I use PTX? Does it make sense?
On the other hand, what is the motivation behind the PTX language?
Thanks in advance
Yes, it can make sense to implement CUDA code in PTX, just as it can make sense to implement regular CPU code in assembly instead of C++.
For instance, in CUDA C, there is no efficient way of capturing the carry flag and including it in new calculations. So it can be hard to implement efficient math operations that use more bits than what is supported natively by the machine (which is 32 bits on all current GPUs). With PTX, you can efficiently implement such operations.
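As an illustrative sketch (not from the original answer): the usual way to capture the carry is PTX's `add.cc`/`addc` instructions, wrapped in CUDA inline assembly. The helper name below is made up; `add.cc.u32` and `addc.u32` are real PTX opcodes.

```cuda
// 64-bit addition built from 32-bit halves, propagating the carry flag.
// Plain CUDA C has no way to express this carry explicitly.
__device__ void add64(unsigned lo_a, unsigned hi_a,
                      unsigned lo_b, unsigned hi_b,
                      unsigned *lo_r, unsigned *hi_r)
{
    unsigned lo, hi;
    asm("add.cc.u32 %0, %2, %3;\n\t"   // lo = lo_a + lo_b, sets carry
        "addc.u32   %1, %4, %5;"       // hi = hi_a + hi_b + carry
        : "=r"(lo), "=r"(hi)
        : "r"(lo_a), "r"(lo_b), "r"(hi_a), "r"(hi_b));
    *lo_r = lo;
    *hi_r = hi;
}
```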
I implemented a project in both CUDA C and PTX, and saw significant speedup in PTX. Of course, you will only see a speedup if your PTX code is better than the code created by the compiler from plain CUDA C.
I would recommend first creating a CUDA C version for reference. Then create a copy of the reference and start replacing parts of it with PTX, as determined by results from profiling, while making sure the results match that of the reference.
As far as the motivation for PTX, it provides an abstraction that lets NVIDIA change the native machine language between generations of GPUs without breaking backwards compatibility.
The main advantage of developing in PTX is that it can give you access to certain features which are not exposed directly in CUDA C. For instance, certain cache modifiers on load instructions, some packed SIMD operations, and predicates.
That said, I wouldn't advise anyone to code in PTX. On the CUDA Library team, we sometimes wrap PTX routines in a C function via inline assembly, and then use that. But programming in C/C++/Fortran is way easier than writing PTX.
Also, the runtime will recompile your PTX into an internal hardware-specific assembly language. In the process it may reorder instructions, assign registers, and change scheduling, so all of your careful ordering in PTX is mostly unnecessary and usually has little to do with the final assembly code. NVIDIA now ships a disassembler that lets you view the actual internal assembly; you can compare for yourself if you want to play around with it.

Tools for embedded development [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
I would like to learn something about embedded development. I think the best thing would be to buy some hardware and play with it, but I don't know where to start and, if possible, I would like to not pay too much...
If you have experience in this field, which would be the best road to follow?
I assume you mean bare-metal embedded and not embedded Linux or some other operating-system thing.
All of the above are good; sparkfun.com is a GREAT resource for sub-$50 boards. Don't buy the mbed. The Armmite Pro is nice; it is trivial to bypass the high-level canned package and load your own binaries (I have a web page on how to do it if you're interested).
Stellaris is good; the 811 is easy to brick, so be careful, and the 1968 eval board is not a bad one. The problem with the Stellaris boards is that almost all of their I/O is consumed by on-board peripherals. But given what you want to do, that is also the good thing about them: lots of peripherals for you to learn to write embedded code for.
You are eventually going to want a JTAG wiggler; I recommend the Amontec JTAG-Tiny. It will open the door to a number of the Olimex boards from SparkFun. The SAM7 and STM32 header boards are good ones as well.
The LilyPad is a good starting place for Arduino (SparkFun); it's the same price as the Arduino Pro Mini, but you don't have to do any soldering. Get a LilyPad and the little USB-to-serial board that powers it and gives you serial access to program it. Just like with the Armmite Pro, I have a web page on how to erase the as-shipped flash, plus a Linux programmer that lets you load any binary you want, not just ones limited to their sandbox.
Avoid PIC and 8051 unless you are interested in a history lesson. As for the PIC32X, I'm not sure yet; my first one is in the mail. It is a MIPS32, not a PIC core.
The EZ430 MSP430 board is a very good one; the MSP430 has a very nice architecture, better than the AVR's.
You can get your feet wet in simulation as well. I have a Thumb instruction-set emulator, thumbulator.blogspot.com. Thumb is a subset of the ARM instruction set, and if you learn Thumb you can jump right into a Stellaris board or an STM32. My sim does not support Thumb2, but the Thumb2 processors also support Thumb, and the transition from Thumb to Thumb2 is trivial.
Avoid the STM32 Primer boards, avoid the mbed2 boards, and avoid the LPCXpresso boards!!
I recently found a behavioral model of an ARM in Verilog with which you can simulate your programs; I have not played with it much. qemu-arm is probably easier and not a bad place to get your feet wet, although it can be frustrating, which is why I wrote my own.
ARM's own ARMulator is out there, in the gdb source release for example; it is easier to use than qemu-arm, but can be frustrating as well.
Go to CodeSourcery for ARM GCC tools. Use mspgcc4.sf.net for MSP430 tools. LLVM is rapidly catching up with and passing GCC; if nothing else, I expect it to replace GCC as the universal cross-compiler tool. At the moment it is much more stable and portable than GCC when it comes to building for cross-compiling (because it is always/only a cross-compiler wherever you find or use it). The MSP430 backend for LLVM was an afternoon experiment for someone, sadly; I would really like to see it supported. If you use LLVM, use Clang, not llvm-gcc.
If you want to get your feet wet, get a cheap evaluation board like Stellaris LM3S811 Evaluation Kit (EK-LM3S811) which is $50 at Digi-Key then download CodeSourcery G++ which provides free command line tools or the IAR Kickstart Edition which allows you up to 32KB of code.
I would suggest starting with the MSP430. The MSP430 LaunchPad is quite cheap. Alternatively, you could start with the Stellaris (ARM Cortex-M3) boards. You can use the provided libraries to start developing apps right away, and then start writing your own configuration code by referring to the datasheet. You also get example code, relevant documents, and the Keil 32K-limited evaluation version. If you want to do things from scratch, get an ARM-based board with I/O breakout headers and start working; a lot of them are available from vendors like Olimex. One word of caution: ARM is difficult to start with if you are working from scratch with little or no embedded experience. If you are looking for something easier, go for AVR or 8051, but the 8051 core is quite old. So Stellaris would be a good option in my opinion, with its readily available driver libraries and example code.
Well, depending on how much money you want to spend and how much development expertise you have, you could get either an Arduino (arduino.cc) or a FEZ Domino (C# .NET) (tinyclr.com). Both are pre-made MCU boards, with all the tools you need to start developing out of the box.
The Arduino is going to be very simplistic, but probably better for a beginner. The FEZ is a little harder to work with, but FAR more capable. Both have the same physical pinout, so you can use "shields" with either.
I would recommend a KickStart kit from IAR Systems. They're fairly complete and work out of the box.
http://www.iar.com/website1/1.0.1.0/16/1/

Mono for embedded

I'm a C# developer, I'm interested in embedded development for chips like MSP430. Please suggest some tools and tutorials.
The Mono framework is very powerful and customizable; Mono-specific examples would be most helpful.
Mono requires a 32-bit system; it is not going to work on 16-bit systems.
There is currently no full mono support for the MSP430.
Mono doesn't run in a vacuum - you will need to make a program that exposes the microcontroller functionality to Mono, then link to Mono and program the entire thing on the microcontroller. This program will have to provide some functionality to Mono that is normally provided by an operating system.
The page igorgue linked to gives you a good starting point for this process: http://www.mono-project.com/Embedding%5FMono
I don't know what the requirements of the Mono VM are, though. It may be easy to compile and use, or you may have to write a lot of supporting code, or dig deep into mono to disable code you won't be using, or can't support on the chosen microcontroller.
Further, Mono isn't gargantuan, but it's complex and designed with larger 32 bit processors in mind. It may or may not fit onto the relatively limited 16 bit MSP430.
However, the MSP430 does have a GCC port, so you don't have to port the mono code to a new compiler, which should make your job easier.
Good luck, and please let us know what you decide to do, and how it works out!
-Adam
The tools to use Mono on an MSP430 just aren't available. Drop all the C# and use C/C++ instead.
MSP430 devices usually have 8 to 256 KB of flash and 256 bytes (!) to 16 KB of RAM.
Using C# or even C++ is really not an option, and complex frameworks are a no-go.
If you really want to start with MSP430 (which are powerful, fast and extremely low-power processors for their area of use), you should look for the MSPGCC toolchain.
http://mspgcc.sourceforge.net/
It contains a compiler (based on GCC 3.22) along with all the necessary tools (make, a JTAG programmer, etc.). Most MSP processors are supported, with code optimisation and support for internal hardware such as the hardware multiplier.
All you need is an editor (you can use Eclipse, UltraEdit, or even plain Notepad) and some knowledge of writing a simple makefile.
And you should be prepared to write tight code (especially in terms of RAM usage).
I think the Netduino may be of interest to you.
Visit their web site at http://netduino.com/.
It's open-source hardware (like the Arduino, http://www.arduino.cc/).
It runs the .NET Micro Framework (http://www.microsoft.com/en-us/netmf/default.aspx), the variant oriented toward embedded development.
Regards,
Giacomo

Intro to GPU programming [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Everyone has a huge, massively parallel supercomputer on their desktop in the form of a graphics card's GPU.
What is the "hello world" equivalent of the GPU community?
What do I do, where do I go, to get started programming the GPU for the major GPU vendors?
-Adam
Check out CUDA by NVIDIA; IMO it's the easiest platform for GPU programming. There are tons of cool materials to read.
http://www.nvidia.com/object/cuda_home.html
A "hello world" would be to do any kind of calculation using the GPU.
You get programmable vertex and pixel shaders that allow execution of code directly on the GPU to manipulate the buffers that are to be drawn. These languages (i.e. OpenGL's GL Shader Lang and High Level Shader Lang, and DirectX's equivalents) have a C-style syntax and are really easy to use. Some examples of HLSL can be found here for XNA Game Studio and DirectX. I don't have any decent GLSL references, but I'm sure there are a lot around. These shader languages give you an immense amount of power to manipulate what gets drawn at a per-vertex or per-pixel level, directly on the graphics card, making things like shadows, lighting, and bloom really easy to implement.
The second thing that comes to mind is using OpenCL to code for the new lines of general-purpose GPUs. I'm not sure how to use this, but my understanding is that OpenCL gives you the beginnings of being able to access processors on both the graphics card and the normal CPU. This is not mainstream technology yet, and seems to be driven by Apple.
CUDA seems to be a hot topic. CUDA is NVIDIA's way of accessing the GPU's power. Here are some intros
I think the others have answered your second question. As for the first, the "Hello World" of CUDA, I don't think there is a set standard, but personally I'd recommend a parallel adder (i.e. a program that sums N integers).
If you look at the "reduction" example in the NVIDIA SDK, this superficially simple task can be extended to demonstrate numerous CUDA considerations such as coalesced reads, memory bank conflicts, and loop unrolling.
See this presentation for more info:
http://www.gpgpu.org/sc2007/SC07_CUDA_5_Optimization_Harris.pdf
OpenCL is an effort to make a cross-platform library capable of programming code suitable for, among other things, GPUs. It allows one to write the code without knowing what GPU it will run on, thereby making it easier to use some of the GPU's power without targeting several types of GPU specifically. I suspect it's not as performant as native GPU code (or as native as the GPU manufacturers will allow) but the tradeoff can be worth it for some applications.
It's still in its relatively early stages (1.1 as of this answer), but has gained some traction in the industry - for instance it is natively supported on OS X 10.5 and above.
Take a look at the ATI Stream Computing SDK. It is based on BrookGPU developed at Stanford.
In the future all GPU work will be standardized using OpenCL. It's an Apple-sponsored initiative that will be graphics card vendor neutral.
CUDA is an excellent framework to start with. It lets you write GPGPU kernels in C. The compiler produces GPU microcode from your kernel code and sends everything that runs on the CPU to your regular compiler. It is NVIDIA-only, though, and only works on 8-series cards or better. You can check out CUDA Zone to see what can be done with it. There are some great demos in the CUDA SDK. The documentation that comes with the SDK is a pretty good starting point for actually writing code. It will walk you through writing a matrix multiplication kernel, which is a great place to begin.
Another easy way to get into GPU programming, without getting into CUDA or OpenCL, is to do it via OpenACC.
OpenACC works like OpenMP, with compiler directives (like #pragma acc kernels) to send work to the GPU. For example, if you have a big loop (only larger ones really benefit):
#include <stdio.h>

int main(void)
{
    int i;
    float a = 2.0f;
    static float b[10000];  /* static: 40 KB is too large for most stacks */

    #pragma acc kernels
    for (i = 0; i < 10000; ++i) b[i] = 1.0f;

    #pragma acc kernels
    for (i = 0; i < 10000; ++i) {
        b[i] = b[i] * a;
    }
    printf("b[0] = %f\n", b[0]);
    return 0;
}
Edit: unfortunately, only the PGI compiler really supports OpenACC right now, and only for NVIDIA GPUs.
Try GPU++ and libSh.
The libSh link has a good description of how they bound the programming language to the graphics primitives (and, obviously, the primitives themselves), and GPU++ describes what it's all about, both with code examples.
If you use MATLAB, it becomes pretty simple to use GPUs for technical computing (matrix computations and heavy math/number crunching). I find it a useful application of GPU cards outside of gaming. Check out the link below:
http://www.mathworks.com/discovery/matlab-gpu.html