JSSC 2024, Issue 10 · Memory
CIMFormer: A Systolic CIM-Array-Based Transformer Accelerator With Token-Pruning-
CIMFormer is a Transformer accelerator built on a systolic CIM array that uses token pruning to optimize attention computation.
Transformer · CIM accelerator · token pruning · attention mechanism · systolic array
▸ Token-pruning-aware attention reformulation (TPAR)
▸ Principal possibility gather-scatter scheduler (PPGSS)
▸ Systolic X|W-CIM macro array
Abstract
Transformer models have achieved impressive performance in various
artificial intelligence (AI) applications. However, their high computation
cost and memory footprint make inference inefficient. Although digital
compute-in-memory (CIM) is a promising hardware architecture with high
accuracy, the Transformer’s attention mechanism raises three challenges in
the access and computation of CIM: 1) the attention computation involving
Query and Key results in massive data movement and under-util