Everyone has this huge massively parallelized supercomputer on their desktop in the form of a graphics card GPU.
-Adam
This question is related to
gpu
OpenCL is an effort to make a cross-platform library capable of programming code suitable for, among other things, GPUs. It allows one to write the code without knowing what GPU it will run on, thereby making it easier to use some of the GPU's power without targeting several types of GPU specifically. I suspect it's not as performant as native GPU code (or as native as the GPU manufacturers will allow) but the tradeoff can be worth it for some applications.
It's still in its relatively early stages (1.1 as of this answer), but has gained some traction in the industry - for instance it is natively supported on OS X 10.5 and above.
Take a look at the ATI Stream Computing SDK. It is based on BrookGPU developed at Stanford.
In the future all GPU work will be standardized using OpenCL. It's an Apple-sponsored initiative that will be graphics card vendor neutral.
I think the others have answered your second question. As for the first, the "Hello World" of CUDA, I don't think there is a set standard, but personally, I'd recommend a parallel adder (i.e. a programme that sums N integers).
If you look the "reduction" example in the NVIDIA SDK, the superficially simple task can be extended to demonstrate numerous CUDA considerations such as coalesced reads, memory bank conflicts and loop unrolling.
See this presentation for more info:
http://www.gpgpu.org/sc2007/SC07_CUDA_5_Optimization_Harris.pdf
CUDA is an excellent framework to start with. It lets you write GPGPU kernels in C. The compiler will produce GPU microcode from your code and send everything that runs on the CPU to your regular compiler. It is NVIDIA only though and only works on 8-series cards or better. You can check out CUDA zone to see what can be done with it. There are some great demos in the CUDA SDK. The documentation that comes with the SDK is a pretty good starting point for actually writing code. It will walk you through writing a matrix multiplication kernel, which is a great place to begin.
I think the others have answered your second question. As for the first, the "Hello World" of CUDA, I don't think there is a set standard, but personally, I'd recommend a parallel adder (i.e. a programme that sums N integers).
If you look the "reduction" example in the NVIDIA SDK, the superficially simple task can be extended to demonstrate numerous CUDA considerations such as coalesced reads, memory bank conflicts and loop unrolling.
See this presentation for more info:
http://www.gpgpu.org/sc2007/SC07_CUDA_5_Optimization_Harris.pdf
Another easy way to get into GPU programming, without getting into CUDA or OpenCL, is to do it via OpenACC.
OpenACC works like OpenMP, with compiler directives (like #pragma acc kernels
) to send work to the GPU. For example, if you have a big loop (only larger ones really benefit):
int i;
float a = 2.0;
float b[10000];
#pragma acc kernels
for (i = 0; i < 10000; ++i) b[i] = 1.0f;
#pragma acc kernels
for (i = 0; i < 10000; ++i) {
b[i] = b[i] * a;
}
Edit: unfortunately, only the PGI compiler really supports OpenACC right now, for NVIDIA GPU cards.
CUDA is an excellent framework to start with. It lets you write GPGPU kernels in C. The compiler will produce GPU microcode from your code and send everything that runs on the CPU to your regular compiler. It is NVIDIA only though and only works on 8-series cards or better. You can check out CUDA zone to see what can be done with it. There are some great demos in the CUDA SDK. The documentation that comes with the SDK is a pretty good starting point for actually writing code. It will walk you through writing a matrix multiplication kernel, which is a great place to begin.
Take a look at the ATI Stream Computing SDK. It is based on BrookGPU developed at Stanford.
In the future all GPU work will be standardized using OpenCL. It's an Apple-sponsored initiative that will be graphics card vendor neutral.
Another easy way to get into GPU programming, without getting into CUDA or OpenCL, is to do it via OpenACC.
OpenACC works like OpenMP, with compiler directives (like #pragma acc kernels
) to send work to the GPU. For example, if you have a big loop (only larger ones really benefit):
int i;
float a = 2.0;
float b[10000];
#pragma acc kernels
for (i = 0; i < 10000; ++i) b[i] = 1.0f;
#pragma acc kernels
for (i = 0; i < 10000; ++i) {
b[i] = b[i] * a;
}
Edit: unfortunately, only the PGI compiler really supports OpenACC right now, for NVIDIA GPU cards.
If you use MATLAB, it becomes pretty simple to use GPU's for technical computing (matrix computations and heavy math/number crunching). I find it useful for uses of GPU cards outside of gaming. Check out the link below:
I think the others have answered your second question. As for the first, the "Hello World" of CUDA, I don't think there is a set standard, but personally, I'd recommend a parallel adder (i.e. a programme that sums N integers).
If you look the "reduction" example in the NVIDIA SDK, the superficially simple task can be extended to demonstrate numerous CUDA considerations such as coalesced reads, memory bank conflicts and loop unrolling.
See this presentation for more info:
http://www.gpgpu.org/sc2007/SC07_CUDA_5_Optimization_Harris.pdf
CUDA is an excellent framework to start with. It lets you write GPGPU kernels in C. The compiler will produce GPU microcode from your code and send everything that runs on the CPU to your regular compiler. It is NVIDIA only though and only works on 8-series cards or better. You can check out CUDA zone to see what can be done with it. There are some great demos in the CUDA SDK. The documentation that comes with the SDK is a pretty good starting point for actually writing code. It will walk you through writing a matrix multiplication kernel, which is a great place to begin.
OpenCL is an effort to make a cross-platform library capable of programming code suitable for, among other things, GPUs. It allows one to write the code without knowing what GPU it will run on, thereby making it easier to use some of the GPU's power without targeting several types of GPU specifically. I suspect it's not as performant as native GPU code (or as native as the GPU manufacturers will allow) but the tradeoff can be worth it for some applications.
It's still in its relatively early stages (1.1 as of this answer), but has gained some traction in the industry - for instance it is natively supported on OS X 10.5 and above.
If you use MATLAB, it becomes pretty simple to use GPU's for technical computing (matrix computations and heavy math/number crunching). I find it useful for uses of GPU cards outside of gaming. Check out the link below:
Take a look at the ATI Stream Computing SDK. It is based on BrookGPU developed at Stanford.
In the future all GPU work will be standardized using OpenCL. It's an Apple-sponsored initiative that will be graphics card vendor neutral.
CUDA is an excellent framework to start with. It lets you write GPGPU kernels in C. The compiler will produce GPU microcode from your code and send everything that runs on the CPU to your regular compiler. It is NVIDIA only though and only works on 8-series cards or better. You can check out CUDA zone to see what can be done with it. There are some great demos in the CUDA SDK. The documentation that comes with the SDK is a pretty good starting point for actually writing code. It will walk you through writing a matrix multiplication kernel, which is a great place to begin.
Take a look at the ATI Stream Computing SDK. It is based on BrookGPU developed at Stanford.
In the future all GPU work will be standardized using OpenCL. It's an Apple-sponsored initiative that will be graphics card vendor neutral.
Source: Stackoverflow.com