【CUDA】overview

Published 2025-11-23 | Updated 2025-11-24 | CUDA


Hello, CUDA

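I. Hardware Architecture Analysis (Ampere)

1. Architecture Overview

Architecture diagram

The architecture overview introduces a handful of core concepts: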

GPC (Graphics Processing Cluster)
TPC (Texture Processing Cluster)
SM (Streaming Multiprocessor)
Warp Scheduler
CUDA Core/Tensor Core
RT Core
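
Some of these concepts can be observed from code. The following is a minimal sketch, assuming a machine with at least one CUDA device and using device 0; it queries the runtime for the SM count and warp size. The helper `warpsNeeded` is a hypothetical name introduced here for illustration. As far as I know, the runtime API does not report GPC or TPC counts directly, so only SM-level figures appear.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: number of warps needed to hold `threads` threads (rounds up).
static int warpsNeeded(int threads, int warpSize)
{
    return (threads + warpSize - 1) / warpSize;
}

int main(void)
{
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (deviceCount == 0) {
        fprintf(stderr, "No CUDA device found\n");
        return 1;
    }

    // Assumption: device 0 is the GPU of interest.
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Device 0: %s\n", prop.name);
    printf("Compute capability: %d.%d\n", prop.major, prop.minor); // 8.x on Ampere
    printf("SM count: %d\n", prop.multiProcessorCount);
    printf("Warp size: %d threads\n", prop.warpSize);
    printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    printf("Max resident threads per SM: %d\n", prop.maxThreadsPerMultiProcessor);
    printf("Warps needed for a 10-thread block: %d\n",
           warpsNeeded(10, prop.warpSize));
    return 0;
}
```

Compile with `nvcc` and run it on the target machine; the numbers depend on the specific GPU, so no expected output is shown here.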

II. CUDA Programming Model

III. Hello World Code Walkthrough

/*
*hello_world.cu
*/

#include <stdio.h>

// Kernel: runs on the GPU, once per launched thread.
__global__ void hello_world(void)
{
  printf("GPU: Hello world!\n");
}

int main(int argc, char **argv)
{
  printf("CPU: Hello world!\n");
  hello_world<<<1, 10>>>(); // launch 1 block of 10 threads
  cudaDeviceReset(); // without this line, main may return before the GPU output is flushed
  return 0;
}

Let's walk through the code. Compared with plain C++ code, three things are unfamiliar:

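__global__: tells the compiler that this function is a kernel, meaning it runs on the device (GPU) and is launched from the host (CPU).
hello_world<<<1, 10>>>: the execution configuration. The first value is the number of blocks in the grid and the second is the number of threads per block, so this launches a grid of 1 block containing 10 threads. The kernel body therefore runs 10 times, once per thread.
cudaDeviceReset: destroys the process's CUDA context on the current device and releases its resources. Kernel launches are asynchronous, so the host (CPU) side would otherwise reach the end of main before the GPU output is flushed. Tearing down the context flushes the device printf buffer as a side effect; cudaDeviceSynchronize() is the more direct way to make the host wait until the GPU has finished.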
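
To make the launch configuration concrete, here is a minimal sketch in which each thread prints its own block and thread index, and the host waits with `cudaDeviceSynchronize()` while checking for errors. The kernel name `who_am_i` and the helper `totalThreads` are hypothetical names introduced for this illustration, not part of the original program.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: total threads in a 1-D launch = blocks * threads per block.
static unsigned int totalThreads(unsigned int blocks, unsigned int threadsPerBlock)
{
    return blocks * threadsPerBlock;
}

// Each thread reports its own coordinates within the launch.
__global__ void who_am_i(void)
{
    printf("GPU: block %u of %u, thread %u of %u\n",
           blockIdx.x, gridDim.x, threadIdx.x, blockDim.x);
}

int main(void)
{
    const unsigned int blocks = 1;           // first value inside <<< >>>
    const unsigned int threadsPerBlock = 10; // second value inside <<< >>>

    printf("CPU: launching %u threads in total\n",
           totalThreads(blocks, threadsPerBlock));
    who_am_i<<<blocks, threadsPerBlock>>>();

    // Errors in the launch itself (e.g. an invalid configuration) show up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Block the host until the kernel has finished; this also flushes device printf.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Changing the configuration to `<<<2, 5>>>` still launches 10 threads, but `blockIdx.x` then takes the values 0 and 1 while `threadIdx.x` only ranges from 0 to 4.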

