Neural Processing Unit (NPU): Accelerating neural network inference
Author
Emery, Stéphane
Abstract
Artificial intelligence (AI) algorithms such as neural networks underpin many advanced computer vision and signal processing capabilities, often exceeding human-level performance. By processing input data through millions of computations, these algorithms automate repetitive tasks with high accuracy and reliable performance. However, the significant memory and power required to process these networks poses a challenge for deployment in miniaturized, wearable, and other energy-constrained applications.
Local processing of AI/ML algorithms within the end node, known as edge processing, can enhance the performance of many applications. However, unless a large battery or frequent recharging can be accommodated, embedded AI/ML processing is restricted to low-complexity tasks or must be offloaded to the cloud, at the cost of energy-intensive radio communication, increased latency, and additional privacy concerns.
Optimized and energy-efficient AI/ML chips address these challenges by accelerating neural network computations within a constrained power budget, thereby enhancing edge processing capabilities.
CSEM’s next-generation neural processing unit (NPU) has been designed to address these challenges, enabling neural network edge processing in power- and energy-constrained embedded systems. The NPU is a standalone AI/ML accelerator IP that delivers state-of-the-art ML acceleration performance optimized for embedded edge processing. With nearly 200x higher measured throughput (910 GOP/s) and significantly higher efficiency (3.5 TOPS/W) than its predecessor, CSEM’s latest AI/ML system-on-chip showcases the NPU’s cutting-edge performance for low-power AI chips.
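As a rough sanity check, the quoted peak figures together imply a power envelope on the order of a few hundred milliwatts; this is an inference from the two numbers above (assuming peak throughput and peak efficiency are reached simultaneously), not a figure stated in the source:

\[
P \approx \frac{\text{peak throughput}}{\text{peak efficiency}} = \frac{910\,\text{GOP/s}}{3.5\,\text{TOPS/W}} = \frac{0.91\,\text{TOPS}}{3.5\,\text{TOPS/W}} \approx 0.26\,\text{W}
\]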
Publication Reference
Emery, Stéphane, "Neural Processing Unit: Accelerating neural network inference," ASICs for the Edge
Year
2026