GPU Computing

 

GPU computing, or GPGPU, is the use of a GPU (graphics processing unit) for general-purpose scientific and engineering computing.

The model for GPU computing is to use a CPU and GPU together in a heterogeneous co-processing computing model. The sequential part of the application runs on the CPU, and the computationally intensive part is accelerated by the GPU. From the user’s perspective, the application simply runs faster because it uses the high performance of the GPU.
Heterogeneous Computing

Over the years, the GPU has evolved to deliver teraflops of floating-point performance. NVIDIA revolutionized the GPGPU and accelerated computing world in 2006-2007 by introducing its massively parallel computing platform, “CUDA.” GPUs built on the CUDA parallel computing platform contain 100s of processor cores that operate together to crunch through the application's data set.

The success of GPGPUs in the past few years owes much to the ease of programming the associated CUDA parallel computing platform. With CUDA, application developers can take the compute-intensive kernels of their application and map them to the GPU. The rest of the application remains on the CPU. Mapping a function to the GPU involves rewriting the function to expose its parallelism and adding CUDA keywords and runtime calls that launch it on the GPU and move data to and from the GPU. The developer writes the function so that 1000s of threads are launched simultaneously. The GPU hardware manages the threads and handles thread scheduling.
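
As a rough illustration of the mapping described above, the following CUDA C++ sketch moves a simple loop (y = a*x + y) to the GPU. It is only a minimal sketch, not NVIDIA's reference code: the kernel name saxpy, the helpers check and saxpy_max_error, the block size of 256 and the problem size are all assumptions chosen for this example. The `__global__` keyword marks the function that runs on the GPU, `cudaMemcpy` moves data between CPU and GPU memory, and the `<<<blocks, threads>>>` launch starts one thread per array element.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Compute-intensive kernel mapped to the GPU: each thread updates one element.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partial block
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical helper for this sketch: stop with a message if a CUDA call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: runs saxpy on the GPU with x = 1 and y = 2 everywhere
// and returns the largest absolute difference from the CPU result a*1 + 2.
float saxpy_max_error(int n, float a)
{
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    float *d_x = nullptr, *d_y = nullptr;
    check(cudaMalloc(&d_x, bytes), "cudaMalloc d_x");
    check(cudaMalloc(&d_y, bytes), "cudaMalloc d_y");

    // Move the input data from CPU memory to GPU memory.
    check(cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice), "copy x");
    check(cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice), "copy y");

    // Launch enough blocks of 256 threads to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
    check(cudaGetLastError(), "saxpy launch");

    // Copy the result back; this call waits for the kernel to finish.
    check(cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost), "copy result");
    check(cudaFree(d_x), "cudaFree d_x");
    check(cudaFree(d_y), "cudaFree d_y");

    float max_err = 0.0f;
    for (int i = 0; i < n; ++i) {
        max_err = std::fmax(max_err, std::fabs(y[i] - (a * 1.0f + 2.0f)));
    }
    return max_err;
}

int main()
{
    // The sequential part (setup, verification, reporting) stays on the CPU.
    float err = saxpy_max_error(1 << 20, 2.0f);
    std::printf("max error: %g\n", err);
    return err == 0.0f ? 0 : 1;
}
```

Only the loop body moved into the kernel; allocation, data transfer and verification remain ordinary host code, which mirrors the CPU/GPU split described in the text.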

The Tesla 20-series GPU is based on the “Fermi” architecture, the latest generation of the CUDA parallel computing platform. Fermi is optimized for scientific applications, with key features such as 500+ gigaflops of IEEE-standard double-precision floating-point hardware support, L1 and L2 caches, ECC memory error protection, local user-managed data caches in the form of shared memory dispersed throughout the GPU, coalesced memory accesses and so on.
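
To make a few of these features concrete, the sketch below sums an array of double-precision values using shared memory as a user-managed cache and a coalesced access pattern (adjacent threads read adjacent addresses). It is an illustrative example under stated assumptions, not code taken from NVIDIA: the names block_sum, gpu_sum, check and kThreads, and the choice of 256 threads per block, are hypothetical, and the final combination of per-block results is done on the CPU for simplicity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block; must be a power of two

// Each block sums kThreads input values in shared memory and writes one partial sum.
__global__ void block_sum(const double *in, double *partial, int n)
{
    __shared__ double tile[kThreads];  // user-managed on-chip cache
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Adjacent threads load adjacent addresses, so the loads can be coalesced.
    tile[tid] = (i < n) ? in[i] : 0.0;
    __syncthreads();

    // Tree reduction within the block: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();
    }
    if (tid == 0) {
        partial[blockIdx.x] = tile[0];
    }
}

// Hypothetical helper for this sketch: stop with a message if a CUDA call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: sums the host vector on the GPU, one partial sum per block.
double gpu_sum(const std::vector<double> &host)
{
    int n = static_cast<int>(host.size());
    if (n == 0) {
        return 0.0;  // a launch with zero blocks is not valid
    }
    int blocks = (n + kThreads - 1) / kThreads;

    double *d_in = nullptr, *d_partial = nullptr;
    check(cudaMalloc(&d_in, n * sizeof(double)), "cudaMalloc d_in");
    check(cudaMalloc(&d_partial, blocks * sizeof(double)), "cudaMalloc d_partial");
    check(cudaMemcpy(d_in, host.data(), n * sizeof(double),
                     cudaMemcpyHostToDevice), "copy input");

    block_sum<<<blocks, kThreads>>>(d_in, d_partial, n);
    check(cudaGetLastError(), "block_sum launch");

    std::vector<double> partial(blocks);
    check(cudaMemcpy(partial.data(), d_partial, blocks * sizeof(double),
                     cudaMemcpyDeviceToHost), "copy partial sums");
    check(cudaFree(d_in), "cudaFree d_in");
    check(cudaFree(d_partial), "cudaFree d_partial");

    double total = 0.0;  // combine the per-block results on the CPU
    for (double p : partial) {
        total += p;
    }
    return total;
}

int main()
{
    std::vector<double> data(100000, 0.5);
    double total = gpu_sum(data);
    std::printf("sum = %f\n", total);
    return total == 50000.0 ? 0 : 1;
}
```

Staging the data in shared memory means each input value is read from global memory once, and the repeated additions then happen in fast on-chip storage.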

"GPUs have evolved to the point where many real-world applications are easily implemented on them and run significantly faster than on multi-core systems. Future computing architectures will be hybrid systems with parallel-core GPUs working in tandem with multi-core CPUs."

Prof. Jack Dongarra
Director of the Innovative Computing Laboratory
The University of Tennessee

History of GPU Computing

Graphics chips started as fixed-function graphics pipelines. Over the years, these graphics chips became increasingly programmable, which led NVIDIA to introduce the first GPU, or Graphics Processing Unit. In the 1999-2000 timeframe, computer scientists in particular, along with researchers in fields such as medical imaging and electromagnetics, started using GPUs to run general-purpose computational applications. They found that the excellent floating-point performance of GPUs led to a huge performance boost for a range of scientific applications. This was the advent of the movement called GPGPU, or General Purpose computing on GPUs.

The problem was that GPGPU required using graphics programming interfaces and languages such as OpenGL and Cg to program the GPU. Developers had to make their scientific applications look like graphics applications and map them into problems that drew triangles and polygons. This limited access to the tremendous performance of GPUs for science.

NVIDIA realized the potential to bring this performance to the larger scientific community and decided to invest in modifying the GPU to make it fully programmable for scientific applications and in adding support for high-level languages like C, C++, and Fortran. This led to the CUDA parallel computing platform for the GPU.

The CUDA Parallel Computing Platform
The CUDA parallel computing platform provides a set of abstractions for expressing fine-grained and coarse-grained data and task parallelism. The programmer can choose to express the parallelism in high-level languages such as C, C++, and Fortran, or through driver APIs such as DirectX™-11 Compute. The CUDA parallel computing platform is now widely deployed with 1000s of GPU-accelerated applications and 1000s of published research papers.

CUDA C

A complete range of CUDA tools and ecosystem solutions is available to developers.

CUDA Zone lists many of the applications and research papers built on CUDA.