JSSC 2020, Issue 4 · Memory · 40nm

A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matri

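A reconfigurable sparse matrix multiplication accelerator in 40nm CMOS, with 48 heterogeneous cores and a reconfigurable memory hierarchy.
40nm CMOS, 12.6× energy-efficiency improvement, 11.7× bandwidth-efficiency improvement, 17.1× compute-density improvement
Sparse matrix multiplication, heterogeneous computing, reconfigurable memory, energy-efficiency optimization, crossbar
Heterogeneous core design (Arm Cortex-M0 and Cortex-M4)
Reconfigurable memory hierarchy
Synthesizable coalescing crossbars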
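
For readers unfamiliar with the operation this accelerator targets, the following is a minimal software sketch of sparse matrix-matrix multiplication (SpMM) and of counting the non-zeros in the result, which is the unit the title's metrics are expressed in. The dict-of-rows storage format, the row-wise loop order, and the names `spmm` and `count_output_nonzeros` are assumptions chosen for illustration; they are not taken from the paper and do not describe the chip's dataflow.

```python
from collections import defaultdict


def spmm(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}} dicts.

    Hypothetical helper for illustration only; zero entries are simply absent.
    """
    c = defaultdict(dict)
    for i, a_row in a.items():
        for k, a_ik in a_row.items():
            # Only rows of B that line up with a non-zero column of A contribute.
            for j, b_kj in b.get(k, {}).items():
                c[i][j] = c[i].get(j, 0.0) + a_ik * b_kj
    return dict(c)


def count_output_nonzeros(c):
    """Count stored entries of the result that are actually non-zero."""
    return sum(1 for row in c.values() for v in row.values() if v != 0.0)


# Tiny example: A is 2x3, B is 3x3, both mostly zero.
a = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
b = {0: {1: 4.0}, 2: {1: 5.0, 2: 6.0}}
c = spmm(a, b)
nnz = count_output_nonzeros(c)
```

In this example row 1 of A only touches row 1 of B, which is empty, so it produces no output entries. Work and memory traffic in SpMM depend on where the non-zeros fall rather than on the matrix dimensions alone.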
Abstract
A sparse matrix–matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40-nm CMOS. The compute fabric consists of dedicated floating-point multiplication units, and general-purpose Arm Cortex-M0 and Cortex-M4 cores. The on-chip memory reconfigures scratchpad or cache, depending on the phase of the algorithm. The memory and compute units are interconnected with synthesizable coalescing crossbars for efficient memory access. The 2.0-