JSSC 2024, Issue 10, Memory

CIMFormer: A Systolic CIM-Array-Based Transformer Accelerator With Token-Pruning-

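CIMFormer is a Transformer accelerator based on a systolic CIM array that optimizes attention computation through token pruning.
Transformer, CIM accelerator, token pruning, attention mechanism, systolic array
Token-pruning-aware attention reformulation (TPAR)
Principal possibility gather-scatter scheduler (PPGSS)
Systolic X|W-CIM macro array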
Abstract
Transformer models have achieved impressive performance in various artificial intelligence (AI) applications. However, the high cost of computation and memory footprint make its inference inefficient. Although digital compute-in-memory (CIM) is a promising hardware architecture with high accuracy, Transformer’s attention mechanism raises three challenges in the access and computation of CIM: 1) the attention computation involving Query and Key results in massive data movement and under-util