JSSC 2024, Issue 1, Digital Circuits, 28nm

Multipurpose Deep-Learning Accelerator for Arbitrary Quantization With Reduction

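Proposes a deep-learning accelerator that supports arbitrary quantization, offering high efficiency and the ability to process data in multiple formats.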
28nm LP CMOS, 1-8 bit, 30% sparsity, 0.87-5.55 TOPS, 15.1-95.9 TOPS/W
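Deep-learning accelerator, arbitrary quantization, runtime reconfiguration, bit-serial execution, zero eliminator
LUT-based runtime reconfiguration
Bit-serial execution to reduce wasted computation
Zero eliminator and runtime density detector compatible with both raw and run-length-compressed formats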
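The three highlights above can be illustrated with a small software sketch. This is not the paper's hardware design. It is a minimal Python model, under stated assumptions, of the general ideas: a lookup table (LUT) that maps arbitrary quantization codes to values, a bit-serial multiply-and-accumulate that walks the activation bits one at a time and skips zero weights, and a run-length decoder for zero-compressed data. All function names, the `(zero_run, value)` pair encoding, and the example LUT contents are hypothetical.

```python
def lut_decode(codes, lut):
    """Map quantization codes to values through a lookup table.

    The LUT can hold any values, so nonlinear quantization levels
    are supported simply by changing the table contents.
    """
    return [lut[c] for c in codes]


def rle_decode(pairs):
    """Expand (zero_run, value) pairs into a dense list.

    Assumed encoding: each pair means `zero_run` zeros followed by `value`.
    """
    dense = []
    for zero_run, value in pairs:
        dense.extend([0] * zero_run)
        dense.append(value)
    return dense


def bit_serial_mac(weights, activations, act_bits):
    """Dot product computed one activation bit-plane at a time.

    Activations are assumed to be unsigned integers below 2**act_bits.
    Zero weights are skipped, mimicking zero elimination.
    """
    acc = 0
    for bit in range(act_bits):
        partial = 0
        for w, a in zip(weights, activations):
            if w == 0:
                continue  # zero weight contributes nothing
            if (a >> bit) & 1:
                partial += w
        acc += partial << bit  # weight the bit-plane by 2**bit
    return acc


# Hypothetical 2-bit codes with nonlinear levels.
lut = [0, 1, -3, 8]
weights = lut_decode([0, 1, 2, 3], lut)
activations = [5, 3, 7, 2]
result = bit_serial_mac(weights, activations, act_bits=3)
```

Because the loop runs once per activation bit, a lower-precision activation needs fewer iterations, which is the usual motivation for bit-serial execution in scalable-precision designs.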
Abstract
Various pruning and quantization heuristics have been proposed to compress recent deep-learning models. However, the rapid development of new optimization techniques makes it difficult for domain-specific accelerators to efficiently process various models showing irregularly stored parameters or nonlinear quantization. This article presents a scalable-precision deep-learning accelerator that supports multiply-and-accumulate operations (MACs) with two arbitrarily quantized data sequences. The