
Introduction

CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) created by NVIDIA. It lets developers use NVIDIA GPUs for general-purpose processing, an approach known as GPGPU (General-Purpose computing on Graphics Processing Units).

This article records some key learning points about designing a simple CUDA kernel, along with the basic workflow and core concepts.

Here’s a quick overview with a flowchart:

CUDA Flowchart
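As a rough illustration of the basic workflow, here is a minimal sketch of a vector addition: allocate device memory, copy inputs to the GPU, launch a kernel, copy the result back, and free the memory. The kernel name `vecAdd`, the helper `check`, and the sizes used are assumptions made for this example and are not taken from the repository.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Kernel (hypothetical name): each thread adds one pair of elements.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the tail block
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: stop with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    const int n = 1 << 20;                 // illustrative problem size
    const size_t bytes = n * sizeof(float);

    // 1. Prepare input data on the host.
    std::vector<float> hA(n, 1.0f), hB(n, 2.0f), hC(n, 0.0f);

    // 2. Allocate device memory.
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    check(cudaMalloc(&dA, bytes), "cudaMalloc dA");
    check(cudaMalloc(&dB, bytes), "cudaMalloc dB");
    check(cudaMalloc(&dC, bytes), "cudaMalloc dC");

    // 3. Copy inputs from host to device.
    check(cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice), "copy A");
    check(cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice), "copy B");

    // 4. Launch the kernel with enough blocks to cover n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up
    vecAdd<<<blocks, threads>>>(dA, dB, dC, n);
    check(cudaGetLastError(), "kernel launch");

    // 5. Copy the result back; this cudaMemcpy waits for the kernel to finish.
    check(cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost), "copy C");

    // 6. Verify on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (hC[i] != 3.0f) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    // 7. Release device memory.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Assuming the file is saved as `vec_add.cu`, it can be built with `nvcc vec_add.cu -o vec_add`. The bounds check inside the kernel matters because the grid is rounded up and may contain more threads than elements.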

GitHub Repository:

Kernels

When it comes to kernels, there are a few important points to note:

Thread Hierarchy
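CUDA organizes threads into blocks and blocks into a grid, and each thread can derive its own position from the built-in variables `threadIdx`, `blockIdx`, `blockDim`, and `gridDim`. The following is a small sketch of that index arithmetic for a 2D launch; the kernel name `whoAmI` and the grid and block sizes are assumptions chosen only to keep the output short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread prints where it sits in the hierarchy.
__global__ void whoAmI() {
    // Global 2D coordinates: block offset plus position inside the block.
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;

    // Total threads along x, used to flatten (row, col) to a linear index.
    int width = gridDim.x * blockDim.x;
    int linear = row * width + col;

    printf("block (%u,%u) thread (%u,%u) -> global (%d,%d), linear %d\n",
           blockIdx.x, blockIdx.y, threadIdx.x, threadIdx.y,
           col, row, linear);
}

int main() {
    dim3 block(2, 2);  // 4 threads per block (illustrative)
    dim3 grid(2, 1);   // 2 blocks in the grid (illustrative)

    whoAmI<<<grid, block>>>();

    // Wait for the kernel so its printf output is flushed before exit.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With these sizes the launch runs 8 threads in total. The order of the printed lines is not guaranteed, since threads and blocks may be scheduled in any order.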
