JSSC 2023, Issue 6 · Memory · Not specified · CIM
TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator
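Proposes TranCIM, a Transformer accelerator built on digital computing-in-memory that supports dynamic sparse attention computation and significantly reduces energy consumption.
15.59 µJ/Token (BERT-base model); energy efficiency improved by 12.08×–36.82× over existing solutions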
Computing-in-memory · Transformer accelerator · Attention mechanism · Digital circuits · Sparse computation
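▸ Adopts a bitline-transpose computing-in-memory architecture to support dynamic matrix multiplication
▸ Proposes reconfigurable pipeline/parallel modes to adapt to different computation requirements
▸ Designs a sparse attention scheduler to reduce redundant computation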
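To make the first and third points above concrete, here is a minimal software sketch of sparse attention. It is illustrative only and does not reproduce TranCIM's hardware or its scheduler. The function names (`matmul_transposed`, `topk_mask`, `sparse_attention`) are hypothetical, and per-row top-k pruning is an assumed stand-in for whatever sparsity rule the paper uses, which this summary does not specify. The sketch shows two things: the score matrix is Q times the transpose of K, where both operands are produced at run time, and pruned score entries let the later multiplication with V skip work.

```python
import math


def matmul_transposed(q, k):
    """Compute Q @ K^T. Both operands are runtime data, not fixed weights."""
    return [[sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
            for q_row in q]


def topk_mask(scores, keep):
    """Mark the `keep` largest scores in each row (assumed pruning rule)."""
    mask = []
    for row in scores:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        kept = set(order[:keep])
        mask.append([j in kept for j in range(len(row))])
    return mask


def sparse_attention(q, k, v, keep):
    """Return (output, skipped_macs) for top-k sparse attention."""
    d = len(q[0])
    scores = [[s / math.sqrt(d) for s in row] for row in matmul_transposed(q, k)]
    mask = topk_mask(scores, keep)
    out, skipped_macs = [], 0
    for row, mask_row in zip(scores, mask):
        kept = [j for j, flag in enumerate(mask_row) if flag]
        peak = max(row[j] for j in kept)  # subtract the max for a stable softmax
        weights = {j: math.exp(row[j] - peak) for j in kept}
        total = sum(weights.values())
        # Only kept positions contribute to the multiplication with V.
        out.append([sum(weights[j] / total * v[j][c] for j in kept)
                    for c in range(len(v[0]))])
        skipped_macs += (len(row) - len(kept)) * len(v[0])
    return out, skipped_macs
```

Calling `sparse_attention(q, k, v, keep=len(k))` gives ordinary dense attention with no skipped multiply-accumulates, while a smaller `keep` trades accuracy for fewer operations in the second multiplication.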
Abstract
Transformer models achieve excellent results in
fields such as natural language processing, computer vision, and
bioinformatics. Their large numbers of matrix multiplications
(MMs) lead to substantial data movement and computation.
Although computing-in-memory (CIM) has proven to be an
efficient architecture for MM computation, the transformer’s attention
mechanism raises new challenges in memory access and
computation: the dynamic MM in attention layers causes
redundant
off-chip memory ac