JSSC 2024, Issue 10 | Digital Circuits | 40nm
A 73.8k-Inference/mJ SVM Learning Accelerator for Brain Pattern Recognition Tzu-We
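A low-power SVM learning accelerator for brain pattern recognition that combines the CP-SVM algorithm with hardware optimizations to significantly improve energy efficiency.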
40nm CMOS, 0.85V, 40MHz, 9.68mW, 73.8k inference/mJ
SVM accelerator | Brain pattern recognition | Energy-efficiency optimization | Hardware acceleration | CMOS
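The headline figures above can be related with simple arithmetic. The sketch below assumes, without confirmation from the source, that the 73.8k inference/mJ figure was measured at the same 9.68 mW, 40 MHz operating point. Under that assumption, energy efficiency multiplied by power gives an implied throughput. The function and constant names are illustrative only.

```python
def implied_throughput(inferences_per_mj: float, power_mw: float) -> float:
    """Inferences per second implied by an energy efficiency and a power draw.

    1 mW equals 1 mJ/s, so (inference/mJ) * (mJ/s) = inference/s.
    """
    return inferences_per_mj * power_mw


# Figures quoted in the summary line. Treating them as a single
# operating point is an assumption, not something the source states.
ENERGY_EFF = 73.8e3  # inference/mJ
POWER_MW = 9.68      # mW
CLOCK_HZ = 40e6      # Hz

throughput = implied_throughput(ENERGY_EFF, POWER_MW)
cycles_per_inference = CLOCK_HZ / throughput

print(f"implied throughput: {throughput:,.0f} inference/s")
print(f"implied cycles per inference: {cycles_per_inference:.1f}")
```

If the figures were measured under different conditions, the implied values do not apply; the unit relationship itself still holds.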
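▸ Innovation 1: CP-SVM algorithm (method innovation). A cluster-partitioning strategy decomposes a large dataset into multiple subproblems that are processed in parallel, cutting training and inference latency by up to 99% and 91%, respectively, and relieving the computational bottleneck of conventional SVMs on embedded devices.
▸ Innovation 2: Kernel transformation (algorithm innovation). A mathematical reformulation converts high-dimensional kernel operations into low-dimensional linear operations, reducing the hardware complexity of the PE array by 42% while preserving computational accuracy and improving hardware resource utilization.
▸ Innovation 3: Sparsity-aware skipping (system innovation). The design dynamically detects sparsity in the input data and skips computations that involve zero values, eliminating redundant operations. Combined with a data-scheduling strategy that raises PE utilization, this reduces overall PE-array processing latency by 96%.
▸ Innovation 4: Hardware architecture optimization (circuit innovation). A chain interconnect reduces data-exchanger area by 93%, and merging multiple sorters into a single cross-cluster sorter saves 52% of sorter area. The chip achieves an energy efficiency of 73.8k inference/mJ and an area efficiency of 510k inference/s/mm², both more than 3.4× better than prior work.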
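The zero-skipping idea in Innovation 3 can be illustrated in software. The following is a minimal sketch, not the chip's datapath: it compares a dense dot product with one that skips zero-valued inputs and counts the multiply-accumulate (MAC) operations each performs. The function names and the example vectors are hypothetical, and the hardware's detection logic and data scheduling are not described in this summary.

```python
from typing import Sequence, Tuple


def dense_dot(x: Sequence[float], w: Sequence[float]) -> Tuple[float, int]:
    """Baseline: one MAC per element, zeros included."""
    acc = 0.0
    macs = 0
    for xi, wi in zip(x, w):
        acc += xi * wi
        macs += 1
    return acc, macs


def sparse_skip_dot(x: Sequence[float], w: Sequence[float]) -> Tuple[float, int]:
    """Skip the MAC whenever the input element is zero."""
    acc = 0.0
    macs = 0
    for xi, wi in zip(x, w):
        if xi == 0:
            continue  # a zero input contributes nothing to the sum
        acc += xi * wi
        macs += 1
    return acc, macs


# Hypothetical sparse feature vector and weight vector, for illustration only.
x = [0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 0.0, 0.5]
w = [0.3, -0.2, 0.8, 0.1, 0.4, -0.6, 0.9, 1.0]

dense_result, dense_macs = dense_dot(x, w)
sparse_result, sparse_macs = sparse_skip_dot(x, w)

print(f"dense:  result={dense_result:.3f}, MACs={dense_macs}")
print(f"sparse: result={sparse_result:.3f}, MACs={sparse_macs}")
```

Both functions return the same sum; the skipping version performs MACs only for the three non-zero inputs instead of all eight. The saving scales with input sparsity, which is why the summary pairs skipping with scheduling to keep PEs busy.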
Abstract
Machine learning (ML) has been widely adopted
in neural signal processing, and the support vector machine (SVM)
stands out for its efficacy given limited training data. The
constrained battery capacity of implanted devices necessitates
a dedicated accelerator with high energy efficiency. This work
presents an energy-efficient SVM learning accelerator for brain
pattern recognition. By employing the cluster-partitioning SVM
(CP-SVM) algorithm, this work achieves up to 99% and 91%
latency reductions for