JSSC 2021, Issue 10 · Memory · 28 nm · CIM
TIMAQ: A Time-Domain Computing-in-Memory-Based Processor Using Predictable Decomposed Convolution for Arbitrary Quantized DNNs
Jianxun Yang, Student Member, IEEE, Yuyao Kong, Zhao Zhang, Zhuangzhi Liu, Jing Zhou, Yiqi Wang, Yonggang Liu, Chenfu Guo, Te Hu, Congcong Li
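The TIMAQ processor uses a time-domain computing-in-memory architecture and accelerates mixed-precision, nonuniformly quantized DNNs through unique-weight convolution, significantly reducing energy consumption and computation count.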
28-nm CMOS
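Time-domain computing-in-memory · Mixed-precision quantization · Nonuniform quantization · Deep neural networks · Energy-efficiency optimization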
▸bit-cross-flipping-based kernel decomposer
▸dual-mode-complementary predictor
▸activation-weight-adaptive pulse quantizer
Abstract
Energy-efficient processors are crucial for accelerating deep neural networks (DNNs) on edge devices with limited battery capacity. To reduce energy consumption, time-domain computing-in-memory (TD-CIM) is a splendid architecture, which consumes low computation and memory access energy due to low toggle rate of time-based signals and less data movements, respectively. When deploying DNNs in TD-CIMs, quantization is required, which has two types: uniform quantization (UQ) and nonuniform quantization (NUQ). To reach the same accuracy for one DNN, NUQ achieves smaller model size than UQ. Due to varying weight distributions across layers, mixed-precision quantization can further reduce model size, without degrading accuracy. However, previous TD-CIMs are inefficient for mixed-precision NUQ-DNNs due to their adopted bit-serial convolution increasing computation amount significantly. To address that, we propose a unique-weight convolution to accelerate mixed-precision NUQ-DNNs by a special kernel decomposition, reducing computation count remarkably. Based on that, we design a TD-CIM-based processor, TIMAQ, with three architectural techniques: 1) bit-cross-flipping-based kernel decomposer to reduce memory accesses and operations of decomposing kernels; 2) dual-mode-complementary predictor to remove redundant computations; and 3) activation-weight-adaptive pulse quantizer to decrease pulse quantization energy and error. Fabricated in 28-nm CMOS technology and tested on 1–8-b NUQ-DNNs
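
The abstract names unique-weight convolution but does not spell out the decomposition. The sketch below illustrates only the general idea, under the assumption that a low-bit NUQ kernel contains few distinct weight values: activations that share a weight value are accumulated first, and each distinct value is multiplied once. The function names, the 2-bit codebook, and the example data are hypothetical and are not taken from the paper. The sketch does not model the time-domain circuits or any of the three architectural techniques.

```python
from collections import defaultdict


def direct_dot(weights, activations):
    """Baseline dot product: one multiplication per weight."""
    return sum(w * a for w, a in zip(weights, activations))


def unique_weight_dot(weights, activations):
    """Dot product grouped by distinct weight value.

    Activations that share the same weight are summed first, then each
    distinct weight is multiplied once with its partial sum.
    Returns (result, number_of_multiplications).
    """
    partial_sums = defaultdict(int)
    for w, a in zip(weights, activations):
        if w != 0:  # zero weights contribute nothing to the result
            partial_sums[w] += a
    result = sum(w * s for w, s in partial_sums.items())
    return result, len(partial_sums)


# Hypothetical flattened 3x3 kernel using a 2-bit nonuniform codebook
# {-4, -1, 1, 4}; values are illustrative only.
weights = [1, 4, 1, -1, 4, 1, -4, -1, 1]
activations = [3, 0, 2, 5, 1, 4, 2, 6, 7]

baseline = direct_dot(weights, activations)
grouped, num_mults = unique_weight_dot(weights, activations)
print(baseline, grouped, num_mults)
```

In this toy case the grouped form needs 4 multiplications instead of 9, because the kernel holds only four distinct values. For a k-bit kernel the number of distinct values is at most 2^k, regardless of kernel size.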