JSSC 2020, Issue 10 | Digital Circuits | 65nm

An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for I

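Proposes an energy-efficient deep-learning accelerator that supports CNN training, using three kinds of processor cores, each optimized for a different type of computation.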
65nm CMOS, 0.63-1.0V, 50MHz, 40.7mW, 47.4µJ/epoch
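Deep-learning accelerator | Convolutional neural network | Training process | Energy-efficiency optimization | Fixed-point computation
A masking scheme in the propagation core reduces the storage needed for intermediate activation data.
Weight-gradient computation uses a different dataflow architecture to raise PE utilization.
A modified weight-update system supports 8-bit fixed-point computation.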
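The summary does not describe how the masking scheme works. As a rough illustration only, and under the assumption that the mask plays a role similar to a ReLU mask, the sketch below shows how a 1-bit mask recorded in the forward pass can be enough to propagate gradients backward, so the full intermediate activations need not be kept. The function names, the 8-bit activation width used in the storage comparison, and the sample values are all hypothetical and are not taken from the paper.

```python
# Hypothetical illustration: ReLU backward pass driven by a 1-bit mask.
# This is NOT the paper's design; names and bit widths are assumptions.

def relu_forward(x):
    """Return ReLU outputs and a per-element mask (True where x > 0)."""
    y = [max(v, 0) for v in x]
    mask = [v > 0 for v in x]
    return y, mask


def relu_backward(grad_out, mask):
    """Pass gradients through where the mask is set; zero them elsewhere.

    Only the mask is consulted, so the forward activations themselves
    do not have to be stored for this step.
    """
    return [g if m else 0 for g, m in zip(grad_out, mask)]


def storage_bits(n_elements, bits_per_activation=8):
    """Return (bits for full activations, bits for a 1-bit mask)."""
    return n_elements * bits_per_activation, n_elements


x = [-3, 5, 0, 7]
y, mask = relu_forward(x)
grad_in = relu_backward([10, 20, 30, 40], mask)
full_bits, mask_bits = storage_bits(len(x))
```

With the assumed 8-bit activations, the mask takes one eighth of the storage that the activations would. Whether the accelerator achieves its savings in this way is not stated in the summary.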
Abstract
A scalable deep-learning accelerator supporting the training process is implemented for device personalization of deep convolutional neural networks (CNNs). It consists of three processor cores operating with distinct energy-efficient dataflow for different types of computation in CNN training. Unlike the previous works where they implement design techniques to exploit the same characteristics from the inference, we analyze major issues that occurred from training in a resource-constrained sys