Machine learning-based accelerator modeling for rapid, hardware-aware neural architecture search

Author
Azarkhish, E.
Emery, S.
DOI
Abstract
As AI workloads scale, efficient hardware design becomes critical for competitiveness. This work presents a machine-learning (ML) framework that delivers rapid and accurate performance and power estimation for Neural Processing Units (NPUs), overcoming the limitations of cycle-accurate simulation and analytical models. Trained on more than 3,000 cycle-accurate simulations, a multi-stage ML pipeline predicts bandwidth, Giga-Operations-Per-Second (GOPS) performance, and execution time, while power estimation leverages gate-level data and silicon measurements. The approach achieves near-RTL-simulation accuracy while reducing prediction time from hours to seconds, enabling fast design-space exploration and hardware-aware Neural Architecture Search (NAS). Unlike traditional methods, it captures complex hardware interactions without manual tuning. These gains accelerate design cycles and shorten time-to-market for next-generation AI accelerators. Future work will expand the datasets, refine the power models, and integrate the solution into automated NAS pipelines for scalable, energy-efficient AI hardware design.
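The multi-stage prediction idea in the abstract can be illustrated with a minimal sketch. The report does not specify the model family, the input features, or how the stages are chained, so everything below is an assumption made for illustration: the three unnamed workload features, the chaining order (bandwidth, then GOPS, then execution time), the linear least-squares models, and the synthetic data standing in for simulator outputs. The names `StagedPredictor`, `make_synthetic_data`, `fit_linear`, and `predict_linear` are hypothetical and are not taken from the report.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with a bias term; returns the weight vector."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict_linear(w, X):
    """Apply weights from fit_linear to a feature matrix."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ w


def make_synthetic_data(n, seed=0):
    """Synthetic stand-in for simulator results (NOT real NPU data).

    X holds three hypothetical, normalized workload features. The targets
    follow made-up linear relations so that the staged structure is visible.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, 3))
    bandwidth = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.01, n)
    gops = 1.0 + 0.5 * bandwidth + 2.0 * X[:, 2] + rng.normal(0, 0.01, n)
    exec_time = 5.0 - 0.8 * gops + 1.5 * X[:, 1] + rng.normal(0, 0.01, n)
    return X, bandwidth, gops, exec_time


class StagedPredictor:
    """Three chained regressors; each stage consumes the previous estimate."""

    def fit(self, X, bandwidth, gops, exec_time):
        # Stage 1: bandwidth from the first two (assumed memory-related) features.
        self.w_bw = fit_linear(X[:, :2], bandwidth)
        bw_hat = predict_linear(self.w_bw, X[:, :2])
        # Stage 2: GOPS from the predicted bandwidth plus the third feature.
        X2 = np.column_stack([bw_hat, X[:, 2]])
        self.w_gops = fit_linear(X2, gops)
        gops_hat = predict_linear(self.w_gops, X2)
        # Stage 3: execution time from the predicted GOPS plus the second feature.
        X3 = np.column_stack([gops_hat, X[:, 1]])
        self.w_time = fit_linear(X3, exec_time)
        return self

    def predict(self, X):
        bw = predict_linear(self.w_bw, X[:, :2])
        gops = predict_linear(self.w_gops, np.column_stack([bw, X[:, 2]]))
        t = predict_linear(self.w_time, np.column_stack([gops, X[:, 1]]))
        return bw, gops, t


if __name__ == "__main__":
    X_train, bw, gops, t = make_synthetic_data(3000, seed=0)
    model = StagedPredictor().fit(X_train, bw, gops, t)
    X_new, *_ = make_synthetic_data(5, seed=1)
    for name, values in zip(("bandwidth", "gops", "exec_time"), model.predict(X_new)):
        print(name, np.round(values, 3))
```

In a setting like the one the abstract describes, the linear fits would presumably be replaced by stronger regressors and the synthetic targets by cycle-accurate simulation results; the sketch only shows how one stage's estimate can feed the next so that inference reduces to a few matrix products.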
Publication Reference
CSEM Scientific and Technical Report 2025, p. 13–14
Year
2025
Sponsors