JSSC 2022, Issue 5 | Digital Circuits | 28nm | Neural Network Accelerator
A 12.1 TOPS/W Quantized Network Acceleration Processor With Effective-Weight-Based
Proposes an effective-weight-based quantized network acceleration processor that improves CNN processing efficiency.
28nm CMOS, 1.9mm² core area
Quantized network acceleration | Effective weights | Error compensation | Residual pipeline | CNN acceleration
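▸ Innovation 1: Effective-weight-based convolution (EWC) uses algorithm-hardware co-optimization to identify a group of effective weights and use them in place of redundant weights, significantly reducing the number of multiplications. Experiments show that it improves energy efficiency by 1.59×–3.20× across different UCNN implementations.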
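▸ Innovation 2: Error-compensation-based prediction (ECP) replaces some non-critical partial sums with trained compensation values, reducing the redundant additions caused by the ReLU function. On AlexNet, energy efficiency improves by 1.23× over SnaPEA and 1.75× over Pred, with minimal accuracy loss.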
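▸ Innovation 3: A residual pipeline mode processes residual blocks efficiently by optimizing memory access and the computation flow. Compared with existing schemes, memory footprint is reduced by 1.5×, power is reduced by 1.18×, and hardware utilization improves by 13.15% on average.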
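▸ Innovation 4: The overall processor is implemented in TSMC 28nm CMOS with a 1.9mm² core area. Running AlexNet/VGGNet/GoogLeNet/ResNet at 470MHz/0.9V, it reaches 117.4FPS with 131.6mW power consumption, and its energy efficiency is 1.77×–24.20× higher than existing state-of-the-art designs.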
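The accumulate-then-multiply idea behind EWC can be illustrated with a minimal Python sketch. It assumes integer (quantized) weights and covers only the weight-sharing step: activations that share the same weight value are summed first, and each distinct value is multiplied once. How QNAP selects its effective weights, and how it maps unique weights onto them, is not described in this summary and is not modeled here. The function names `dot_direct` and `dot_grouped` are hypothetical.

```python
from collections import defaultdict


def dot_direct(weights, activations):
    """Baseline dot product: one multiplication per (weight, activation) pair."""
    return sum(w * a for w, a in zip(weights, activations))


def dot_grouped(weights, activations):
    """Accumulate activations that share a weight value, then multiply once per value.

    Returns the dot product and the number of multiplications performed.
    """
    sums = defaultdict(int)
    for w, a in zip(weights, activations):
        sums[w] += a  # additions only; no multiplication yet
    result = sum(w * s for w, s in sums.items())  # one multiply per distinct weight
    return result, len(sums)


if __name__ == "__main__":
    weights = [2, -1, 2, 3, -1, 2]      # quantized weights with repeated values
    activations = [1, 4, 5, 2, 3, 7]
    print(dot_direct(weights, activations))   # 6 multiplications
    print(dot_grouped(weights, activations))  # same result, 3 multiplications
```

With six weights but only three distinct values, the grouped version needs three multiplications instead of six. The saving grows as the number of distinct weight values shrinks, which is the effect a smaller group of effective weights would have.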
Abstract
In this article, a quantized network acceleration
processor (QNAP) is proposed to efficiently accelerate CNN
processing by eliminating most unessential operations based on
algorithm-hardware co-optimizations. First, an effective-weight-
based convolution (EWC) is proposed to distinguish a group
of effective weights (EWs) to replace the other unique weights.
Therefore, the input activations corresponding to the same EW
can be accumulated first and then multiplied by the EW to
reduce amounts of mult