JSSC 2023, Issue 5 | Memory | 12nm

DepFiN A 12-nm Depth-First High-Resolution CNN Processor for IO-Efficient Inferen

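The DepFiN processor uses depth-first execution to optimize high-resolution CNN inference, significantly reducing memory bandwidth and energy consumption.
12-nm process; 20 TOPS/W peak energy efficiency at 0.6 V; 3.95 TOPS/W on the MC-CNN-fast network (including IO power).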
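Convolutional neural networks | Energy-efficiency optimization | Depth-first execution | Hardware accelerator | High-resolution image processing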
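Depth-first execution reduces the storage required for intermediate feature maps
Dynamically configurable acceleration core with support for depthwise separable convolutions
Deep layer fusion lowers off-chip memory bandwidth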
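
To give a rough sense of why depth-first execution shrinks intermediate storage, the sketch below compares the bytes needed to hold one complete feature map (layer-by-layer order) with the bytes needed to hold only the few rows a K x K convolution consumes at a time (a simple line-buffer model of depth-first order). The function names, the 1920 x 1080 x 32 feature-map size, the 8-bit values, and the K-row buffer model are all illustrative assumptions; none of them come from the paper, and this is not DepFiN's actual scheduling.

```python
def layer_by_layer_bytes(height, width, channels, bytes_per_value=1):
    """Bytes to store one complete intermediate feature map at once."""
    return height * width * channels * bytes_per_value


def depth_first_line_buffer_bytes(width, channels, kernel_size=3, bytes_per_value=1):
    """Bytes to store only the rows a KxK convolution needs at a time.

    Simplified assumption: the consumer layer keeps exactly `kernel_size`
    full-width rows of its input resident (a plain line buffer).
    """
    return kernel_size * width * channels * bytes_per_value


# Hypothetical example: a 1920x1080 feature map with 32 channels, 8-bit values.
full_map = layer_by_layer_bytes(height=1080, width=1920, channels=32)
line_buf = depth_first_line_buffer_bytes(width=1920, channels=32, kernel_size=3)

print(f"layer-by-layer: {full_map} bytes")
print(f"depth-first   : {line_buf} bytes")
print(f"reduction     : {full_map // line_buf}x")
```

Under these assumptions the reduction equals the image height divided by the kernel size, which is why the gap grows with resolution: the full map scales with height times width, while the line buffer scales with width only.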
Abstract
Applying convolutional neural networks (CNNs) on high-resolution images leads to very large intermediate feature maps (FMs), which dominate the memory traffic. Processing in the classical layer-by-layer order creates the requirement to store the complete FMs at once, when moving from one layer to the next. As the size of these FMs only realistically allows this in off-chip memory, this leads to high off-chip bandwidth, which comes at great energy costs. The DepFiN processor chip, presented in thi