The purpose of this course is to teach how to develop applications for processors with massively parallel computing architectures. A processor is called "massively parallel" if it can perform more than 64 arithmetic operations per clock cycle. NVIDIA processors currently fall into this category, and Intel, AMD and IBM processors will begin using massively parallel architectures within the next few years. Programming such processors effectively requires a thorough understanding of parallel programming principles, parallelism models and data exchange, as well as knowledge of the various architectural limitations of these processors. The target audience of this course includes software developers and researchers who use heterogeneous computing in their projects.