JSSC 2020, Issue 4 · Memory · 40nm
A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matri
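A reconfigurable sparse matrix multiplication accelerator in 40-nm CMOS, with 48 heterogeneous cores and a reconfigurable memory hierarchy.
40-nm CMOS, 12.6× energy-efficiency improvement, 11.7× bandwidth-efficiency improvement, 17.1× compute-density improvement
Sparse matrix multiplication, heterogeneous computing, reconfigurable memory, energy-efficiency optimization, crossbar
▸Heterogeneous core design (Arm Cortex-M0 and Cortex-M4)
▸Reconfigurable memory hierarchy
▸Synthesizable coalescing crossbars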
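For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of sparse matrix-matrix multiplication (SpMM), in which only non-zero entries are stored and multiplied. The dictionary-of-rows storage format and the names `spmm` and `count_nonzeros` are assumptions made for illustration. They do not describe the chip's data layout or the algorithm it runs.

```python
# Minimal SpMM sketch: C = A x B with both operands sparse.
# Matrices are stored as {row: {col: value}} dictionaries. This is an
# assumed, illustrative format, not the accelerator's actual layout.

def spmm(a, b):
    """Multiply sparse matrices a and b, keeping only non-zero outputs."""
    c = {}
    for i, a_row in a.items():
        acc = {}
        for k, a_ik in a_row.items():
            # Only rows of b that match a non-zero column of a contribute.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        # Drop entries that cancelled to exactly zero.
        acc = {j: v for j, v in acc.items() if v != 0.0}
        if acc:
            c[i] = acc
    return c


def count_nonzeros(m):
    """Number of stored non-zero entries in a sparse matrix."""
    return sum(len(row) for row in m.values())


A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0}}
C = spmm(A, B)
print(C, count_nonzeros(C))
```

The metrics in the title are expressed per output non-zero, which corresponds here to `count_nonzeros(C)`: the work that matters scales with the non-zeros produced, not with the full dimensions of the matrices.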
Abstract
A sparse matrix–matrix multiplication (SpMM)
accelerator with 48 heterogeneous cores and a reconfigurable
memory hierarchy is fabricated in 40-nm CMOS. The compute
fabric consists of dedicated floating-point multiplication units
and general-purpose Arm Cortex-M0 and Cortex-M4 cores. The
on-chip memory reconfigures as a scratchpad or a cache, depending
on the phase of the algorithm. The memory and compute units
are interconnected with synthesizable coalescing crossbars for
efficient memory access. The 2.0-