
Course: "Massively parallel processors, CUDA architecture and programming environment"

The purpose of this course is to teach how to develop applications for processors with a massively parallel computing architecture. A processor is called "massively parallel" if it can perform more than 64 arithmetic operations per clock cycle. Today, NVIDIA processors fall into this category, and Intel, AMD and IBM processors are expected to adopt massively parallel architectures within the next few years. Programming such processors effectively requires a thorough understanding of parallel programming principles, parallelism models and data exchange, as well as knowledge of the architectural limitations of these processors. The target audience of this course includes software developers and researchers who use heterogeneous computing in their projects.

Brief program of the course

  • Introduction. Existing multi-core systems. The GPU as a massively parallel processor. CUDA "hello, world".
  • Tesla architecture and CUDA programming model.
  • CUDA memory hierarchy. Global memory. Parallel matrix multiplication. Solving systems of linear equations in parallel.
  • CUDA memory hierarchy. Shared memory. Parallel reduction and scan operations in CUDA.
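To give a flavour of the last topic, here is a minimal sketch of a parallel sum reduction that uses shared memory. It is an illustration only, not course material: the names block_sum, sum_on_gpu, check and kThreads are hypothetical, the block size of 256 is an assumption, and the per-block partial sums are simply added on the host rather than in a second kernel pass.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block (assumed; must be a power of two)

// Abort with a message if a CUDA runtime call failed.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Each block loads up to kThreads elements into shared memory, reduces them
// with a tree-shaped sum, and writes one partial sum to out[blockIdx.x].
__global__ void block_sum(const float* in, float* out, int n) {
    __shared__ float cache[kThreads];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = (i < n) ? in[i] : 0.0f;  // pad the tail with zeros
    __syncthreads();

    // Halve the number of active threads at every step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) {
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        }
        __syncthreads();  // all threads reach this barrier, active or not
    }
    if (threadIdx.x == 0) {
        out[blockIdx.x] = cache[0];
    }
}

// Hypothetical host helper: sums a vector on the GPU and finishes the
// reduction of the per-block partial sums on the CPU.
float sum_on_gpu(const std::vector<float>& host) {
    int n = static_cast<int>(host.size());
    if (n == 0) return 0.0f;
    int blocks = (n + kThreads - 1) / kThreads;

    float* d_in = nullptr;
    float* d_out = nullptr;
    check(cudaMalloc(&d_in, n * sizeof(float)));
    check(cudaMalloc(&d_out, blocks * sizeof(float)));
    check(cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice));

    block_sum<<<blocks, kThreads>>>(d_in, d_out, n);
    check(cudaGetLastError());

    std::vector<float> partial(blocks);
    check(cudaMemcpy(partial.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost));
    check(cudaFree(d_in));
    check(cudaFree(d_out));

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}

int main() {
    std::vector<float> data(1000, 1.0f);  // 1000 ones
    std::printf("sum = %.1f\n", sum_on_gpu(data));
    return 0;
}
```

Assuming the file is saved as reduce.cu, it can be built with nvcc reduce.cu -o reduce and run on a machine with a CUDA-capable GPU. Keeping the intermediate sums in shared memory means each input element is read from global memory only once.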

 
