JSSC 2022, Issue 4 | Memory | 28nm

OmniDRL: An Energy-Efficient Deep Reinforcement Learning Processor With Dual-Mode

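OmniDRL is an energy-efficient deep reinforcement learning processor for edge devices. It reduces memory access through data compression and sparse training.
28 nm CMOS, 3.6 × 3.6 mm², 4.18 TFLOPS peak performance, 29.3 TFLOPS/W peak energy efficiency
Deep reinforcement learning | Edge computing | Energy-efficiency optimization | Data compression | Sparse training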
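Group-sparse training (GST) raises the weight compression ratio
Exponent mean delta encoding further compresses weights and feature maps
An on-chip sparse weight transposer avoids off-chip transposition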
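The three techniques listed above can be illustrated with a minimal Python sketch. It is not OmniDRL's hardware implementation: the helper names (`group_prune`, `exponent_mean_delta`, `decode_exponents`, `transpose_sparse`), the L1-norm pruning criterion, the floor-rounded mean, the half-precision data type, and the coordinate-list sparse format are all assumptions chosen for illustration, since the summary does not specify them.

```python
import struct


def fp16_exponent(x: float) -> int:
    """Return the 5-bit biased exponent field of x stored as IEEE 754 half precision."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return (bits >> 10) & 0x1F


def group_prune(weights, group_size, keep_ratio):
    """Zero out whole groups of weights, keeping the groups with the largest L1 norm.

    Assumption: the L1 norm is the pruning criterion; the source only says
    that sparsity is applied per group.
    """
    groups = [weights[i:i + group_size] for i in range(0, len(weights), group_size)]
    norms = [sum(abs(w) for w in g) for g in groups]
    n_keep = max(1, round(len(groups) * keep_ratio))
    ranked = sorted(range(len(groups)), key=lambda i: norms[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [g if i in keep else [0.0] * len(g) for i, g in enumerate(groups)]


def exponent_mean_delta(values):
    """Encode exponents as one shared mean plus a small signed delta per value.

    Assumption: the mean is floored to an integer. Values with similar
    magnitudes give deltas near zero, which need fewer bits than raw exponents.
    """
    exps = [fp16_exponent(v) for v in values]
    mean = sum(exps) // len(exps)
    return mean, [e - mean for e in exps]


def decode_exponents(mean, deltas):
    """Invert exponent_mean_delta."""
    return [mean + d for d in deltas]


def transpose_sparse(entries):
    """Transpose a sparse matrix given as (row, col, value) triples.

    Only nonzero entries are touched, so the matrix never has to be
    expanded to dense form in order to be transposed.
    """
    return sorted((c, r, v) for r, c, v in entries)


if __name__ == "__main__":
    weights = [0.5, -0.75, 0.01, 0.02, 1.5, 1.25, 0.001, -0.002]
    pruned = group_prune(weights, group_size=2, keep_ratio=0.5)
    survivors = [w for g in pruned for w in g if w != 0.0]
    mean, deltas = exponent_mean_delta(survivors)
    print(pruned, mean, deltas)
    print(transpose_sparse([(0, 1, 0.5), (1, 0, -0.75), (1, 2, 1.5)]))
```

In this sketch, pruning whole groups leaves runs of zeros that are cheap to index, and the surviving weights share a narrow exponent range, which is the property that delta encoding relies on.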
Abstract
In this article, we present an energy-efficient deep reinforcement learning (DRL) processor, OmniDRL, for DRL training on edge devices. Recently, the need for DRL training is growing due to the DRL’s distinct characteristics that can be adapted to each user. However, a massive amount of external and internal memory access limits the implementation of DRL training on resource-constrained platforms. OmniDRL proposes four key features that can reduce external memory access by compressing as much d