JSSC 2020, Issue 10 · Digital Circuits · 65nm
An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for I
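Proposes an energy-efficient deep-learning accelerator that supports CNN training, using three processor cores that each optimize a different type of computation.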
65nm CMOS, 0.63-1.0V, 50MHz, 40.7mW, 47.4µJ/epoch
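Deep-learning accelerator · Convolutional neural network · Training process · Energy-efficiency optimization · Fixed-point computation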
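▸ A masking scheme in the propagation core reduces storage of intermediate activation data
▸ Weight-gradient computation uses a different dataflow architecture to raise PE utilization
▸ A modified weight-update system supports 8-bit fixed-point computation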
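The first highlight, a mask that stands in for stored activations, can be illustrated with a small sketch. The summary does not say how the accelerator's masking scheme works, so the code below assumes one common form of the idea: for a ReLU layer, error propagation only needs to know which activations were positive, so a 1-bit mask per element can replace the full activation value for that step. The function names, the 8-bit activation width, and the sample values are hypothetical and not taken from the paper. Weight-gradient computation still needs the layer inputs, which this sketch does not address.

```python
def relu_forward_with_mask(pre_activations):
    """Apply ReLU and return (outputs, mask); mask packs one bit per element."""
    outputs = []
    mask = 0
    for i, x in enumerate(pre_activations):
        if x > 0:
            outputs.append(x)
            mask |= 1 << i  # bit i records that element i was active
        else:
            outputs.append(0)
    return outputs, mask


def relu_backward_from_mask(upstream_grads, mask):
    """Propagate gradients through ReLU using only the stored mask."""
    return [g if (mask >> i) & 1 else 0 for i, g in enumerate(upstream_grads)]


def storage_bits(num_elements, bits_per_activation=8):
    """Return (bits to keep full activations, bits to keep only the mask)."""
    return num_elements * bits_per_activation, num_elements


# Hypothetical integer values standing in for fixed-point pre-activations.
pre_acts = [3, -2, 0, 7, -5, 1]
outs, mask = relu_forward_with_mask(pre_acts)
grads_in = relu_backward_from_mask([10, 20, 30, 40, 50, 60], mask)
full_bits, mask_bits = storage_bits(len(pre_acts))
```

Under these assumptions, the backward step reads one bit per element instead of one 8-bit word, an 8x reduction in what must be held between the forward and backward passes for this layer type.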
Abstract
A scalable deep-learning accelerator supporting the
training process is implemented for device personalization of deep
convolutional neural networks (CNNs). It consists of three processor
cores operating with distinct energy-efficient dataflows for different
types of computation in CNN training. Unlike previous
works, which implement design techniques that exploit the
same characteristics as inference, we analyze the major issues
that arise from training in a resource-constrained sys