JSSC 2024, Issue 1 · Digital Circuits · 28nm
Multipurpose Deep-Learning Accelerator for Arbitrary Quantization With Reduction
Proposes a deep-learning accelerator that supports arbitrary quantization, offering high efficiency and multi-format data processing.
28nm LP CMOS, 1-8 bit, 30% sparsity, 0.87-5.55 TOPS, 15.1-95.9 TOPS/W
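Deep-learning accelerator · Arbitrary quantization · Run-time reconfiguration · Bit-serial execution · Zero eliminator
▸LUT-based run-time reconfiguration
▸Bit-serial execution to reduce wasted computation
▸Zero eliminator and run-time density detector compatible with both raw and run-length-compressed formats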
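The highlights above can be illustrated with a small software sketch. It is only a conceptual model, not the accelerator's actual datapath: the function names, the two codebooks, and the sample codes are all hypothetical, and values are assumed to be small non-negative integers. Each operand sequence is stored as quantization codes that a lookup table (LUT) decodes, zero operands are skipped, and the multiply is carried out one bit at a time.

```python
def bit_serial_mul(w, a):
    """Multiply w by a non-negative integer a, one bit of a per step (shift-and-add)."""
    acc, shift = 0, 0
    while a:
        if a & 1:              # only set bits contribute a partial product
            acc += w << shift
        a >>= 1
        shift += 1
    return acc


def lut_mac(w_codes, a_codes, w_lut, a_lut):
    """MAC over two code sequences, each decoded through its own LUT."""
    acc = 0
    for wc, ac in zip(w_codes, a_codes):
        w, a = w_lut[wc], a_lut[ac]   # arbitrary (possibly nonlinear) codebooks
        if w == 0 or a == 0:
            continue                  # zero elimination: skip ineffectual products
        acc += bit_serial_mul(w, a)
    return acc


# Hypothetical 2-bit codebooks: power-of-two weights, linear activations.
w_lut = [0, 1, 2, 4]
a_lut = [0, 1, 2, 3]
w_codes = [3, 0, 2, 1]
a_codes = [1, 2, 0, 3]
result = lut_mac(w_codes, a_codes, w_lut, a_lut)
```

Swapping in a different `w_lut` or `a_lut` changes the quantization scheme without touching the MAC loop, which is the intuition behind reconfiguring through lookup tables.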
Abstract
Various pruning and quantization heuristics have been proposed to compress recent deep-learning models. However, the rapid development of new optimization techniques makes it difficult for domain-specific accelerators to efficiently process various models showing irregularly stored parameters or nonlinear quantization. This article presents a scalable-precision deep-learning accelerator that supports multiply-and-accumulate operations (MACs) with two arbitrarily quantized data sequences. The