JSSC 2021, Issue 10 · Memory · 28 nm · CIM

TIMAQ: A Time-Domain Computing-in-Memory-Based Processor Using Predictable Decomposed Convolution for Arbitrary Quantized DNNs

Jianxun Yang, Student Member, IEEE, Yuyao Kong, Zhao Zhang, Zhuangzhi Liu, Jing Zhou, Yiqi Wang, Yonggang Liu, Chenfu Guo, Te Hu, Congcong Li

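The TIMAQ processor adopts a time-domain computing-in-memory architecture and uses unique-weight convolution to accelerate mixed-precision, nonuniformly quantized DNNs, significantly reducing energy consumption and computation count.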

28-nm CMOS

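Time-domain computing-in-memory, mixed-precision quantization, nonuniform quantization, deep neural networks, energy-efficiency optimization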

▸bit-cross-flipping-based kernel decomposer

▸dual-mode-complementary predictor

▸activation-weight-adaptive pulse quantizer
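As a rough illustration of the arithmetic behind unique-weight convolution, the computation these three blocks are built around, the sketch below groups the positions of a flattened kernel by weight value, accumulates the activations in each group, and multiplies once per distinct value. It is only a minimal software analogy, not the paper's decomposition scheme or its time-domain hardware mapping. The codebook values, the kernel codes, and the helper names (`decompose_kernel`, `unique_weight_dot`) are made up for the example.

```python
from collections import defaultdict


def direct_dot(acts, weights):
    """Reference dot product: one multiply per kernel position."""
    return sum(a * w for a, w in zip(acts, weights))


def decompose_kernel(weights):
    """Group kernel positions by weight value (hypothetical helper).

    Returns a dict mapping each distinct weight to the list of
    positions where it occurs.
    """
    groups = defaultdict(list)
    for idx, w in enumerate(weights):
        groups[w].append(idx)
    return dict(groups)


def unique_weight_dot(acts, weights):
    """Dot product with one multiply per distinct weight value."""
    groups = decompose_kernel(weights)
    total = 0
    for w, positions in groups.items():
        partial = sum(acts[i] for i in positions)  # additions only
        total += w * partial                       # single multiply per group
    return total, len(groups)


# Assumed 2-bit NUQ codebook: four non-uniformly spaced levels (illustrative).
codebook = [-7, -1, 2, 9]
# Flattened 3x3 kernel stored as 2-bit indices into the codebook.
codes = [0, 3, 1, 1, 2, 0, 3, 2, 1]
weights = [codebook[c] for c in codes]
acts = [3, 1, 4, 1, 5, 9, 2, 6, 5]

result, n_mults = unique_weight_dot(acts, weights)
print(result, n_mults, len(weights))  # -45 4 9
```

In this toy case the nine-position kernel needs four multiplies instead of nine, because a b-bit nonuniform codebook has at most 2^b distinct values regardless of how precisely each value is represented.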

Abstract

Energy-efficient processors are crucial for accelerating deep neural networks (DNNs) on edge devices with limited battery capacity. To reduce energy consumption, time-domain computing-in-memory (TD-CIM) is a splendid architecture, which consumes low computation and memory access energy due to low toggle rate of time-based signals and less data movements, respectively. When deploying DNNs in TD-CIMs, quantization is required, which has two types: uniform quantization (UQ) and nonuniform quantization (NUQ). To reach the same accuracy for one DNN, NUQ achieves smaller model size than UQ. Due to varying weight distributions across layers, mixed-precision quantization can further reduce model size, without degrading accuracy. However, previous TD-CIMs are inefficient for mixed-precision NUQ-DNNs due to their adopted bit-serial convolution increasing computation amount significantly. To address that, we propose a unique-weight convolution to accelerate mixed-precision NUQ-DNNs by a special kernel decomposition, reducing computation count remarkably. Based on that, we design a TD-CIM-based processor, TIMAQ, with three architectural techniques: 1) bit-cross-flipping-based kernel decomposer to reduce memory accesses and operations of decomposing kernels; 2) dual-mode-complementary predictor to remove redundant computations; and 3) activation-weight-adaptive pulse quantizer to decrease pulse quantization energy and error. Fabricated in 28-nm CMOS technology and tested on 1–8-b NUQ-DNNs