Advanced CUDA Programming Training / CUDA Advanced
Article published online on 14 January 2020
Last modified on 26 March 2021

Trainer: ATOS
Prerequisites: attendance of the CUDA Basics training, or equivalent

Basic knowledge of C/C++ and of parallel programming (threads, POSIX, OpenMP, MPI, ...).

Synopsis:

- Quick recap *
- Data transfer optimizations (pinned memory, zero-copy, CUDA managed memory) *
- Concurrent execution (streams, events, levels of synchronization across warps/blocks) *
- Kernel optimizations (warps, impact of branches, global/constant/shared memory in detail, especially bank conflicts)
- Overall GPU efficiency (occupancy, roofline model)
- Hardware-specific behaviours (Kepler, Pascal, Volta): key differences for the programmer
- Compilation of CUDA in detail (execution model)
- Multi-GPU (device management, CUDA contexts, peer-to-peer in CUDA, NVLink, CUDA + MPI (GDRCopy), Multi-Process Service (MPS))
- Advanced profiling (nvidia-smi, nvprof, nvvp)

* = overlaps with the "CUDA Basics" training