Earlier this month, the final version
of the Apple-backed OpenCL specification was delivered to standards
group participants for vetting, and the final OpenCL 1.0 specification
has now been released. Simultaneous with the final spec's release, NVIDIA announced
full support for it in its GPU products. Now we know why Apple has
standardized on NVIDIA GPU hardware across its entire product line.
In a nutshell, OpenCL
is a so-called "GPGPU" specification that enables programmers to tap
the power of the GPU as a data-parallel coprocessor without having to
learn to speak the specialized language of graphics, i.e., OpenGL or a
DirectX flavor. NVIDIA had been pushing things in this direction for
some time with C for CUDA, and Microsoft is also headed there with
DirectX 11 Compute, so it was natural that Apple would move to ensure
that its forthcoming "Snow Leopard" version of Mac OS X would sport
comparable capabilities.
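To make the "data-parallel coprocessor" idea a bit more concrete, here is a minimal sketch in plain C. It is not the OpenCL API: `vector_add_item` is a hypothetical function playing the role of a kernel, and `launch_1d` is a hypothetical stand-in for the runtime that would spread the work across GPU cores. In real OpenCL C, the kernel would fetch its index with `get_global_id(0)` instead of taking it as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "kernel": computes one output element for one work-item.
 * In OpenCL C the index would come from get_global_id(0). */
void vector_add_item(size_t gid, const float *a, const float *b, float *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Hypothetical stand-in for the runtime: invokes the kernel once per index.
 * A GPU would run these invocations in parallel; this loop is serial. */
void launch_1d(size_t global_size, const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t gid = 0; gid < global_size; ++gid) {
        vector_add_item(gid, a, b, out);
    }
}
```

Calling `launch_1d(4, a, b, out)` on two four-element arrays fills `out` with their element-wise sums. The point is that the programmer writes the per-element work and never touches vertices, textures, or shaders.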
Apple and NVIDIA collaborated
heavily on the development of OpenCL, but (as its name implies) the
standard is open and has been vetted and put out under the auspices of
the Khronos Group, a consortium of companies that have banded together
to develop and promote royalty-free media APIs. AMD has also promised OpenCL support on its hardware; Intel is a Khronos Group member, so it will presumably support OpenCL with Larrabee, as well.
Many of the industries that stand to benefit the most dramatically from
GPGPU have been extremely reluctant to invest a lot of development
labor in a single vendor's toolchain (i.e., NVIDIA's C for CUDA, which
has been the only real game in town). OpenCL gives them an open,
multivendor alternative to C for CUDA, although the two specs aren't
quite interchangeable. The following comparison of the two was taken
from a slide in NVIDIA's OpenCL presentation:
C for CUDA                          | OpenCL
C with parallel keywords            | Hardware API, similar to OpenGL
C runtime that abstracts driver API | Programmer has complete access to hardware device
Memory managed by C runtime         | Memory managed by programmer
Generates PTX                       | Generates PTX
(Note: PTX is assembler for CUDA. It's the layer that sits closest to NVIDIA's GPU hardware.)
The official spec launches today, and NVIDIA plans to have a beta of
it running on its hardware by the first quarter of the coming year,
with the final release arriving in the second quarter.
[Figure: OpenCL memory model]
I'm going to speculate that we'll see support for this on consoles
before long, with the PlayStation 3 being the most obvious candidate
(since it's powered by an NVIDIA GPU). This will give game developers
who want to go nuts rethinking the standard SGI render pipeline (I'm
thinking of Epic's Tim Sweeney) a cross-platform way to access GPU horsepower.
In connection with this, it's also worth mentioning that OpenCL can
take a regular CPU as a target, as one of the design goals listed on
slide 13 of the OpenCL slide deck (PDF) is to:
Enable use of all computational resources in a system
- Program GPUs, CPUs, Cell, DSP and other processors as peers
- Support both data- and task- parallel compute models
Note also the support for "task-parallel compute models." Task-parallel
compute models aren't exactly a good fit for a conventional GPU, but
they are for Cell and Larrabee.
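The difference between the two models can be sketched in plain C. Again, this is an illustration under my own assumptions and not OpenCL code: every name here (`run_data_parallel`, `run_task_parallel`, `reduce_job`, and so on) is hypothetical, and the loops run serially where real hardware would run the work concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Data-parallel: the same operation applied to every element of an array. */
void scale_item(size_t gid, float *data, float factor)
{
    data[gid] *= factor;
}

void run_data_parallel(size_t n, float *data, float factor)
{
    assert(data != NULL);
    for (size_t gid = 0; gid < n; ++gid) {   /* one iteration per work-item */
        scale_item(gid, data, factor);
    }
}

/* Task-parallel: different, independent jobs, each a single unit of work. */
typedef void (*task_fn)(void *arg);

typedef struct {
    const float *in;   /* input array */
    size_t n;          /* number of elements */
    float result;      /* filled in by the task */
} reduce_job;

void task_sum(void *arg)
{
    reduce_job *job = arg;
    job->result = 0.0f;
    for (size_t i = 0; i < job->n; ++i) {
        job->result += job->in[i];
    }
}

void task_max(void *arg)
{
    reduce_job *job = arg;
    assert(job->n > 0);
    job->result = job->in[0];
    for (size_t i = 1; i < job->n; ++i) {
        if (job->in[i] > job->result) {
            job->result = job->in[i];
        }
    }
}

/* Runs each task once; on a multicore CPU these could run on separate cores. */
void run_task_parallel(task_fn *tasks, void **args, size_t count)
{
    for (size_t t = 0; t < count; ++t) {
        tasks[t](args[t]);
    }
}
```

In the data-parallel half, thousands of identical `scale_item` calls are exactly what a GPU is built for. In the task-parallel half, `task_sum` and `task_max` are unrelated jobs with their own control flow, which is the kind of work that maps more naturally onto a handful of general-purpose cores.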
Quick note: Apple and OpenCL
I've mentioned before that Apple has an internal "GPGPU" group that
serves the company's in-house app developers by giving them ways to use
the GPU to boost performance. Apple announced that an OpenCL
implementation will be a major feature of Snow Leopard, so third-party
developers will be able to get the same kinds of GPU-based speedups
that internal Apple developers see. This is good for developers, and
it's also good for Apple, because having the Mac ecosystem's collective
eyes on the code means more improvements in the company's
implementation.
Source: http://arstechnica.com/