JSSC 2024, Issue 1 · Memory · CIM
MulTCIM Digital Computing-in-Memory-Based Multimodal Transformer Accelerator Wit
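MulTCIM is a Transformer accelerator based on digital computing-in-memory (CIM) that improves energy efficiency through hybrid sparsity.
2.24 µJ/Token, 2.50×–5.91× energy-efficiency improvement
Computing-in-memory · Transformer accelerator · Multimodal · Energy-efficiency optimization · Hybrid sparsity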
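▸Long reuse elimination dynamically reshapes the attention pattern to improve CIM utilization
▸Runtime token pruner (RTP) removes unimportant tokens
▸Modal-adaptive CIM network (MACN) exploits symmetric modal overlap to reduce CIM idle time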
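As a rough illustration of the runtime token pruning idea listed above, the sketch below ranks tokens by the total attention they receive and keeps only the top fraction. This is a common software heuristic, assumed here for illustration only; the summary does not state the criterion the RTP hardware actually uses. The function name `prune_tokens` and the `keep_ratio` parameter are hypothetical.

```python
def prune_tokens(attn, keep_ratio=0.5):
    """Return indices of tokens to keep, in original sequence order.

    attn: square list of lists; attn[i][j] is the attention probability
          that query token i assigns to key token j.
    keep_ratio: hypothetical knob, the fraction of tokens retained.
    """
    n = len(attn)
    # Assumed importance metric: total attention token j receives
    # from all queries (column sum of the attention matrix).
    importance = [sum(attn[i][j] for i in range(n)) for j in range(n)]
    k = max(1, int(n * keep_ratio))  # always keep at least one token
    # Select the k highest-scoring tokens, then restore sequence order.
    top = sorted(range(n), key=lambda j: importance[j], reverse=True)[:k]
    return sorted(top)


# Toy 4-token attention matrix; each row sums to 1.
attn = [
    [0.70, 0.10, 0.15, 0.05],
    [0.60, 0.20, 0.15, 0.05],
    [0.50, 0.10, 0.35, 0.05],
    [0.55, 0.10, 0.30, 0.05],
]
kept = prune_tokens(attn, keep_ratio=0.5)
print(kept)
```

In this toy example tokens 0 and 2 receive the most attention, so later layers would process only those two tokens, halving the matrix-multiplication work for the pruned sequence.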
Abstract
Multimodal Transformers are emerging artificial
intelligence (AI) models that comprehend a mixture of signals
from different modalities like vision, natural language, and
speech. The attention mechanism and massive matrix multiplications
(MMs) cause high latency and energy. Prior work has shown
that a digital computing-in-memory (CIM) network can be an
efficient architecture to process Transformers while maintaining
high accuracy. To further improve energy efficiency, attention-
token-bit hybr