JSSC 2023, Issue 1, Digital Circuits

An Energy-Efficient Transformer Processor Exploiting Dynamic Weak Relevances in G

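Proposes an energy-efficient Transformer processor that reduces computation energy by handling dynamic weak relevances.
Not explicitly stated (consult the full paper for specific metrics).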
Transformer processor, energy-efficiency optimization, dynamic weak relevance, approximate computing, hardware acceleration
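Uses big-exact-small-approximate processing elements (PEs) to adaptively compute weakly related tokens
A bidirectional progressive speculation unit eliminates redundant zero-attention computation
A dedicated hardware architecture optimized for the global attention mechanism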
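The three points above can be illustrated in software. The following is a minimal NumPy sketch, not the processor's actual datapath: it assumes a cheap low-precision pass can predict attention probabilities, skips query-key pairs predicted to be near zero, computes the remaining weak pairs with approximate (quantized) arithmetic, and reserves exact arithmetic for strong pairs. The function names, the 4-bit quantizer, and both thresholds (`skip_below`, `exact_above`) are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis; -inf entries become 0.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def quantize(x, bits=4):
    # Uniform symmetric quantization (hypothetical stand-in for a small approximate PE).
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / levels
    return x if scale == 0 else np.round(x / scale) * scale

def weak_relevance_attention(q, k, v, skip_below=1e-3, exact_above=0.05, bits=4):
    d = q.shape[-1]
    # Each row of probabilities has a maximum of at least 1/n, so capping the
    # skip threshold at 1/n guarantees that no row is skipped entirely.
    skip_below = min(skip_below, 1.0 / k.shape[0])

    # Speculation pass: low-precision scores predict the relevance of each pair.
    approx = quantize(q, bits) @ quantize(k, bits).T / np.sqrt(d)
    spec = softmax(approx)
    skip = spec < skip_below        # predicted near-zero attention: not computed
    strong = spec >= exact_above    # strongly related pairs: exact arithmetic

    # For simplicity the exact scores are computed for every pair here;
    # hardware would evaluate them only where `strong` is set.
    exact = q @ k.T / np.sqrt(d)
    scores = np.where(strong, exact, approx)
    scores = np.where(skip, -np.inf, scores)

    out = softmax(scores) @ v
    stats = {
        "skipped": float(skip.mean()),
        "approximate": float((~skip & ~strong).mean()),
        "exact": float(strong.mean()),
    }
    return out, stats

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 32)) for _ in range(3))
out, stats = weak_relevance_attention(q, k, v)
print(out.shape, stats)
```

The `stats` dictionary reports the fraction of query-key pairs that fall into each class, which is the quantity such a scheme would try to shift toward the skipped and approximate classes. Setting both thresholds to 0 makes every pair exact and reproduces standard softmax attention, which gives a simple baseline for measuring the approximation error.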
Abstract
Transformer-based models achieve tremendous success in many artificial intelligence (AI) tasks, outperforming conventional convolution neural networks (CNNs) from natural language processing (NLP) to computer vision (CV). Their success relies on the self-attention mechanism that provides a global rather than local receptive field as CNNs. Despite its superiority, the global-level self-attention consumes ∼100× more operations than CNNs and cannot be effectively handled by the existing CNN process