JSSC 2022, Issue 4 · Memory · 28nm
OmniDRL: An Energy-Efficient Deep Reinforcement Learning Processor With Dual-Mode
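OmniDRL is an energy-efficient deep reinforcement learning processor for edge devices that reduces memory access through data compression and sparse training.
28 nm CMOS, 3.6 × 3.6 mm², 4.18 TFLOPS peak performance, 29.3 TFLOPS/W peak energy efficiency
Deep reinforcement learning · Edge computing · Energy-efficiency optimization · Data compression · Sparse training
▸ Group-sparse training (GST) raises the weight compression ratio
▸ Exponent mean-delta encoding further compresses weights and feature maps
▸ An on-chip sparse weight transposer avoids off-chip transposition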
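The exponent mean-delta idea named in the highlights can be illustrated with a minimal sketch. The summary gives no bit-level format, so everything here is an assumption made for illustration: the use of IEEE-754 float32, the per-block grouping, and the hypothetical helper names `fp32_exponent`, `encode_exponents`, and `decode_exponents`. It is not the encoder implemented on the chip. The sketch only shows why storing one mean exponent per block plus small signed deltas can take fewer bits than storing every exponent in full.

```python
import struct


def fp32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x stored as IEEE-754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


def encode_exponents(values):
    """Hypothetical mean-delta encoding of the exponent fields of one block.

    Returns (mean_exponent, deltas, bits_per_delta), where bits_per_delta is
    the smallest two's-complement width (at least 1) that holds every delta.
    """
    exps = [fp32_exponent(v) for v in values]
    mean = round(sum(exps) / len(exps))
    deltas = [e - mean for e in exps]
    lo, hi = min(deltas), max(deltas)
    bits = 1
    # Widen until the signed range [-2^(bits-1), 2^(bits-1) - 1] covers all deltas.
    while lo < -(1 << (bits - 1)) or hi > (1 << (bits - 1)) - 1:
        bits += 1
    return mean, deltas, bits


def decode_exponents(mean, deltas):
    """Recover the original biased exponents from the mean and the deltas."""
    return [mean + d for d in deltas]


if __name__ == "__main__":
    # Made-up weights of similar magnitude, as is typical within one layer.
    block = [0.11, -0.07, 0.2, 0.05, -0.13, 0.09]
    mean, deltas, bits = encode_exponents(block)
    print(mean, deltas, bits)
```

For this made-up block, the exponents cluster tightly, so each delta fits in 2 bits instead of the 8 bits of a full float32 exponent field, at the cost of one shared mean per block. Decoding is exact because the deltas are taken relative to the stored mean.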
Abstract
In this article, we present an energy-efficient deep
reinforcement learning (DRL) processor, OmniDRL, for DRL
training on edge devices. Recently, the need for DRL training
is growing due to the DRL’s distinct characteristics that
can be adapted to each user. However, a massive amount of
external and internal memory access limits the implementation
of DRL training on resource-constrained platforms. OmniDRL
proposes four key features that can reduce external memory
access by compressing as much d