AI: The von Neumann Bottleneck

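This article comes from the WeChat official account “科文路”. Follows and comments are welcome. Please credit the source when reposting.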

What is the von Neumann bottleneck? Why is a CPU poorly suited to the computations in deep learning?

This post translates excerpts from “What makes TPUs fine-tuned for deep learning?” on the Google Cloud Blog.

The CPU is a general purpose processor based on the von Neumann architecture. That means a CPU works with software and memory, like this:

The CPU is a general-purpose processor built on the von Neumann architecture. In other words, a CPU works together with software and memory.

The greatest benefit of CPU is its flexibility. With its Von Neumann architecture, you can load any kind of software for millions of different applications. You could use a CPU for word processing in a PC, controlling rocket engines, executing bank transactions, or classifying images with a neural network.

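The CPU's greatest strength is its flexibility. Thanks to the von Neumann architecture, you can load any kind of software for millions of different use cases. You can use a CPU for word processing on a PC, controlling rocket engines, executing bank transactions, or classifying images with a neural network.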

But, because the CPU is so flexible, the hardware doesn’t always know what would be next calculation until it reads the next instruction from the software. A CPU has to store the calculation results on memory inside CPU (so called registers or L1 cache) for every single calculation. This memory access becomes the downside of CPU architecture called the von Neumann bottleneck. Even though the huge scale of neural network calculations means that these future steps are entirely predictable, each CPU’s Arithmetic Logic Units (ALU, the component that holds and controls multipliers and adders) executes them one by one, accessing the memory every time, limiting the total throughput and consuming significant energy.

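But precisely because the CPU is so flexible, the hardware does not know what it has to compute next until it reads the next instruction from the software. For every single calculation, the CPU has to store the result in memory inside the CPU (the registers or L1 cache). This memory access is the downside of the CPU architecture known as the von Neumann bottleneck. Even though the sheer scale of neural network computation means the upcoming steps are entirely predictable, each CPU's arithmetic logic unit (ALU, the component that holds and controls the multipliers and adders) executes them one by one and accesses memory every time, which limits total throughput and consumes significant energy.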
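To make the "one by one, accessing the memory every time" point concrete, here is a minimal Python sketch of a matrix-vector product, the core operation of a neural network layer, computed one multiply-add at a time. It is a toy model, not a description of any real CPU: the function name `matvec_one_op_at_a_time` is made up for this illustration, and counting exactly one result store per multiply-add is a simplifying assumption.

```python
def matvec_one_op_at_a_time(weights, inputs):
    """Multiply a matrix by a vector one multiply-add at a time.

    Toy model (an assumption of this sketch, not a real CPU simulation):
    every multiply-add is followed by one store of the running result,
    standing in for the register / L1 cache write the passage describes.
    Returns (outputs, number_of_result_stores).
    """
    outputs = []
    stores = 0
    for row in weights:
        acc = 0.0
        for w, x in zip(row, inputs):
            acc = acc + w * x  # one ALU operation
            stores += 1        # result written back before the next step
        outputs.append(acc)
    return outputs, stores


weights = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
inputs = [1.0, 0.5, 2.0]

outputs, stores = matvec_one_op_at_a_time(weights, inputs)
print(outputs)  # [8.0, 18.5]
print(stores)   # 6: one store per multiply-add (2 rows x 3 columns)
```

In this model the number of stores grows with rows × columns, even though the whole sequence of operations is known in advance. That is the sense in which the passage calls the memory traffic a bottleneck.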

Author: xlindo

Posted on: 2022-06-30

Updated on: 2023-05-10
