JSSC 2023, Issue 6 | Memory | Not specified | CIM

TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator

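Proposes TranCIM, a Transformer accelerator based on digital computing-in-memory (CIM) that supports dynamic sparse attention computation and significantly reduces energy consumption.
15.59 µJ/Token (BERT-base model); energy efficiency improved by 12.08×–36.82× over existing solutions
Computing-in-memory | Transformer accelerator | Attention mechanism | Digital circuits | Sparse computation
Adopts a bitline-transpose CIM architecture to support dynamic matrix multiplication
Proposes reconfigurable pipeline/parallel modes to suit different computing requirements
Designs a sparse attention scheduler to reduce redundant computation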
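To make the idea of sparse attention more concrete, the sketch below shows one generic way redundant attention work can be skipped: each query keeps only its highest-scoring keys, and the softmax and value accumulation run over that subset alone. This is an illustration under stated assumptions, not TranCIM's scheduler. The top-k selection rule, the function names `matmul_transposed` and `topk_sparse_attention`, and the `keep` parameter are all hypothetical and do not come from the paper.

```python
import math


def matmul_transposed(q, k):
    """Return Q @ K^T for matrices given as lists of equal-width rows."""
    return [[sum(a * b for a, b in zip(q_row, k_row)) for k_row in k] for q_row in q]


def topk_sparse_attention(q, k, v, keep):
    """Attention in which each query attends only to its `keep` best keys.

    Assumption: top-k selection stands in for whatever sparsity rule a
    real scheduler applies; the source does not specify one.
    Returns (output rows, number of query-key pairs skipped).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul_transposed(q, k)  # Q and K are both runtime data
    out, skipped = [], 0
    for row in scores:
        # Keep the indices of the `keep` largest scores for this query.
        kept = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:keep]
        skipped += len(row) - len(kept)
        # Numerically stable softmax over the kept scores only.
        top = max(row[j] for j in kept)
        weights = {j: math.exp((row[j] - top) * scale) for j in kept}
        total = sum(weights.values())
        # Weighted sum of the value rows that survived selection.
        out.append([sum(weights[j] / total * v[j][c] for j in kept)
                    for c in range(len(v[0]))])
    return out, skipped
```

With `keep` equal to the number of keys, the function reduces to ordinary dense attention and skips nothing, so the `skipped` count gives a rough measure of how much score and value work a given sparsity level avoids.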
Abstract
Transformer models achieve excellent results in fields such as natural language processing, computer vision, and bioinformatics. Their large numbers of matrix multiplications (MMs) lead to substantial data movement and computation. Although computing-in-memory (CIM) has proven to be an efficient architecture for MM computation, the transformer's attention mechanism raises new challenges in memory access and computation: the dynamic MM in attention layers causes redundant off-chip memory ac