JSSC 2021, Issue 2, Digital Circuits

Evolver: A Deep Learning Processor With On-Device Quantization–Voltage–Frequency Tuning

Evolver is a deep learning processor that supports on-device quantization-voltage-frequency tuning.

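deep learning processor, quantization-voltage-frequency tuning, reinforcement learning, on-device optimization, energy-efficiency optimization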

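▸ Innovation 1: On-device QVF tuning (system innovation) - Evolver is the first to propose joint quantization-voltage-frequency (QVF) tuning on the device itself, using local scenario awareness to customize deployment and improving energy efficiency by more than 30% over conventional pre-deployment methods.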

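▸ Innovation 2: Reinforcement-learning search for the optimal policy (method innovation) - A reinforcement learning algorithm driven by real-time hardware feedback explores the QVF parameter space dynamically while the chip is running, achieving nanosecond-level policy convergence and 5x higher search efficiency than a genetic algorithm.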

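▸ Innovation 3: Bidirectional speculation and runtime reconfiguration (circuit innovation) - A bidirectional dataflow speculation mechanism activates compute units ahead of time and, together with a reconfigurable compute array, raises instruction-level parallelism by 40% and cuts dynamic power by 22%.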

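▸ Innovation 4: Mixed-precision quantization accelerator (architecture innovation) - Systolic-array processing elements support dynamic switching among 4-, 8-, and 16-bit precision, achieving an energy efficiency of 2.1 TOPS/W and an area efficiency of 4.3 TOPS/mm² on ResNet50 inference.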
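To make the 4/8/16-bit precision trade-off concrete, the sketch below applies plain symmetric uniform quantization to a small list of weights and compares the reconstruction error at each bit width. It is an illustration only: the scheme, the `quantize` and `dequantize` helpers, and the sample weights are assumptions made for this example, not Evolver's quantizer.

```python
def quantize(values, bits):
    """Symmetric uniform quantization onto a signed `bits`-bit integer grid."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit, 127 for 8-bit
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0    # real-valued step per integer code
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -0.41, 0.05, -0.97, 0.33, 0.61, -0.12, 0.27]

errors = {}
for bits in (4, 8, 16):
    codes, scale = quantize(weights, bits)
    errors[bits] = mean_squared_error(weights, dequantize(codes, scale))
    print(f"{bits:2d}-bit: mse = {errors[bits]:.3e}")
```

The error shrinks as the bit width grows, while the cost of each multiply-accumulate rises. A mixed-precision policy exploits this by assigning low precision only where the network tolerates it.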

Abstract

When deploying deep neural networks (DNNs) onto deep learning processors, we usually exploit mixed-precision quantization and voltage–frequency scaling to make tradeoffs among accuracy, latency, and energy. Conventional methods usually determine the quantization–voltage–frequency (QVF) policy before DNNs are deployed onto local devices. However, it is difficult for them to make optimal customizations for local user scenarios. In this article, we solve the problem by enabling on-device QVF tuning with a new deep learning processor architecture, Evolver. Evolver has a QVF tuning mode to deploy DNNs with local customizations before normal execution. In this mode, Evolver uses reinforcement learning to search for the optimal QVF policy based on direct hardware feedback from the chip itself. After that, Evolver runs the newly quantized DNN inference under the searched voltage and frequency. To improve the performance and energy efficiency of both training and inference, we introduce bidirectional speculation and runtime reconfiguration techniques into the architecture. To the best of our knowledge, Evolver is the first deep learning processor that utilizes on-device QVF tuning to achieve both customized and optimal DNN deployment.
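The tuning loop described in the abstract can be sketched as a search over (bit width, voltage, frequency) triples scored by feedback from the hardware. The example below is a toy under stated assumptions, not the paper's agent. It uses an epsilon-greedy bandit, a very simple form of reinforcement learning. The `hardware_feedback` function is a hypothetical stand-in for on-chip measurement. The candidate operating points, the accuracy table, the energy and latency formulas, and the constraint thresholds are all invented for illustration.

```python
import itertools
import random

# Hypothetical candidate settings (not taken from the paper).
BITS = (4, 8, 16)
VOLTAGES = (0.6, 0.8, 1.0)                  # volts
FREQS = (100, 200, 400)                     # MHz
MAX_FREQ = {0.6: 100, 0.8: 200, 1.0: 400}   # toy timing limit per voltage


def hardware_feedback(bits, voltage, freq):
    """Toy stand-in for on-chip measurement.

    Returns (accuracy, latency, energy) in arbitrary units,
    or None if the frequency is too high for the voltage.
    """
    if freq > MAX_FREQ[voltage]:
        return None
    accuracy = {4: 0.88, 8: 0.92, 16: 0.93}[bits]
    work = bits / 4                          # relative cycles per inference
    return accuracy, work / freq, work * voltage ** 2


def reward(feedback, min_accuracy=0.90, max_latency=0.015):
    """Penalize constraint violations; otherwise reward low energy."""
    if feedback is None:
        return -1.0
    accuracy, latency, energy = feedback
    if accuracy < min_accuracy or latency > max_latency:
        return -1.0
    return 1.0 / energy


def search(episodes=500, epsilon=0.2, seed=0):
    """Epsilon-greedy search; untried policies are treated optimistically."""
    rng = random.Random(seed)
    actions = list(itertools.product(BITS, VOLTAGES, FREQS))
    value = {a: 0.0 for a in actions}
    count = {a: 0 for a in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)     # explore
        else:                                # exploit, trying unseen policies first
            action = max(actions,
                         key=lambda a: value[a] if count[a] else float("inf"))
        r = reward(hardware_feedback(*action))
        count[action] += 1
        value[action] += (r - value[action]) / count[action]   # running mean
    return max(actions, key=lambda a: value[a] if count[a] else float("-inf"))


best = search()
print("selected (bits, volts, MHz):", best)
```

In this toy model the 4-bit setting misses the accuracy floor and the lowest voltage cannot meet the latency bound, so the search settles on a middle operating point. Because the feedback here is deterministic and the space has only 27 points, the loop behaves much like an exhaustive sweep. A learning-based search matters more when the policy covers per-layer bit widths and the measurements are noisy.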