sbrhss/cuda-cpp

Fundamentals of Accelerated Computing with Modern CUDA C++

This tutorial teaches the fundamentals of GPU programming and modern CUDA C++. Lectures corresponding to this course are available on YouTube. The course notebooks are listed under Notebooks.

Brev Launchables for this tutorial should use:

  • L40S, L4, or T4 instances.
  • Crusoe or any other provider with Flexible Ports.

Notebooks

CUDA Made Easy: Accelerating Applications with Parallel Algorithms

Notebook
01.01.01 Introduction
01.02.01 Execution Spaces
01.02.02 Exercise Annotate Execution Spaces
01.02.03 Exercise Changing Execution Space
01.02.04 Exercise Compute Median Temperature
01.03.01 Extending Algorithms
01.03.02 Exercise Computing Variance
01.04.01 Vocabulary Types
01.04.02 Exercise Mdspan
01.05.01 Serial vs Parallel
01.05.02 Exercise Segmented Sum Optimization
01.05.03 Exercise Segmented Mean
01.06.01 Memory Spaces
01.06.02 Exercise Copy
01.07.01 Summary
01.08.01 Advanced

Unlocking the GPU’s Full Potential: Asynchrony and CUDA Streams

Notebook
02.01.01 Introduction
02.02.01 Asynchrony
02.02.02 Exercise Compute IO Overlap
02.02.03 Exercise Nsight
02.02.04 Exercise NVTX
02.03.01 Streams
02.03.02 Exercise Async Copy
02.04.01 Pinned
02.04.02 Exercise Copy Overlap

Implementing New Algorithms with CUDA Kernels

Notebook
03.01.01 Introduction
03.02.01 Kernels
03.02.02 Exercise Symmetry
03.02.03 Exercise Row Symmetry
03.02.04 Dev Tools
03.03.01 Histogram
03.03.02 Exercise Fix Histogram
03.04.01 Sync
03.04.02 Exercise Histogram
03.05.01 Shared
03.05.02 Exercise Optimize Histogram
03.06.01 Cooperative
03.06.02 Exercise Cooperative Histogram
