JSSC 2022, Issue 1 · Memory · Emerging Memory

Vega: A Ten-Core SoC for IoT Endnodes With DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive

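Vega is a ten-core SoC for IoT endnodes that supports DNN acceleration and cognitive wake-up, with ultra-low power consumption and high energy efficiency.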

1.7 µW sleep power, 32.2 GOPS peak performance, 615 GOPS/W (8-bit INT), 1.3 TOPS/W (8-bit DNN), 79 GFLOPS/W (32-bit FP), 129 GFLOPS/W (16-bit FP)

IoT endnodes · Deep learning acceleration · RISC-V · Energy efficiency optimization · Cognitive wake-up

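▸ Ten-core RISC-V architecture supporting multi-precision SIMD integer and floating-point computation

▸ Integrates 1.6 MB of SRAM and 4 MB of MRAM with state retention

▸ Two programmable machine learning accelerators boost energy efficiency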

Abstract

The Internet-of-Things (IoT) requires endnodes with ultra-low-power always-on capability for a long battery lifetime, as well as high performance, energy efficiency, and extreme flexibility to deal with complex and fast-evolving near-sensor analytics algorithms (NSAAs). We present Vega, an IoT endnode system on chip (SoC) capable of scaling from a 1.7-µW fully retentive cognitive sleep mode up to 32.2-GOPS (at 49.4 mW) peak performance on NSAAs, including mobile deep neural network (DNN) inference, exploiting 1.6 MB of state-retentive SRAM, and 4 MB of non-volatile magnetoresistive random access memory (MRAM). To meet the performance and flexibility requirements of NSAAs, the SoC features ten RISC-V cores: one core for SoC and IO management and a nine-core cluster supporting multi-precision single instruction multiple data (SIMD) integer and floating-point (FP) computation. Vega achieves the state-of-the-art (SoA)-leading efficiency of 615 GOPS/W on 8-bit INT computation (boosted to 1.3 TOPS/W for 8-bit DNN inference with hardware acceleration). On FP computation, it achieves the SoA-leading efficiency of 79 and 129 GFLOPS/W on 32- and 16-bit FP, respectively. Two programmable machine learning (ML) accelerators boost energy efficiency in cognitive sleep and active states.
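As a rough sanity check on the figures quoted above, the short sketch below computes two derived quantities: the efficiency implied at the peak-performance operating point (32.2 GOPS at 49.4 mW) and the ratio between that active power and the 1.7 µW sleep power. The helper names are hypothetical and used only for illustration. The abstract does not state at which operating points the 615 GOPS/W and 1.3 TOPS/W figures were measured, so the quotient computed here need not match either of them.

```python
# Figures taken from the abstract; everything else is illustrative.
SLEEP_POWER_W = 1.7e-6   # fully retentive cognitive sleep mode
PEAK_POWER_W = 49.4e-3   # power drawn at peak performance
PEAK_GOPS = 32.2         # peak throughput on NSAAs


def gops_per_watt(gops: float, watts: float) -> float:
    """Hypothetical helper: throughput divided by power."""
    return gops / watts


def power_ratio(active_w: float, sleep_w: float) -> float:
    """Hypothetical helper: how many times larger active power is than sleep power."""
    return active_w / sleep_w


peak_eff = gops_per_watt(PEAK_GOPS, PEAK_POWER_W)
ratio = power_ratio(PEAK_POWER_W, SLEEP_POWER_W)

print(f"Efficiency at the peak-performance point: {peak_eff:.0f} GOPS/W")
print(f"Active-to-sleep power ratio: {ratio:.0f}x")
```

The ratio spans more than four orders of magnitude, which is one way to read the abstract's claim that the SoC scales from always-on sleep to high-performance inference.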