The Effect of Asymmetric Transistor Aging on Systolic Arrays for Mission Critical Machine Learning Applications

Firas Ramadan*, Gil Shomron, Freddy Gabbay

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Deep neural networks (DNNs) excel in various applications, such as computer vision, natural language processing, and other mission-critical systems. As the computational complexity of these models grows, there is an increasing need for specialized accelerators to handle the demanding workloads. In response, advancements in Very Large Scale Integration (VLSI) process nodes have significantly intensified the development of machine learning (ML) accelerators, offering enhanced transistor miniaturization and power efficiency. However, the susceptibility of these advanced nodes to transistor aging poses risks to ML accelerator performance, prediction accuracy, and reliability, which can impact the functional safety of mission-critical systems. This study focuses on the impact of asymmetric transistor aging, induced by Bias Temperature Instability (BTI), on systolic arrays (SAs), which are integral to many ML accelerators in mission-critical systems. Our aging-aware analysis indicates that SAs experience asymmetric aging, causing logical elements to age at varying rates. In addition, our simulations show that asymmetric transistor aging introduces persistent and transient faults in the SA's datapath, compromising the overall resiliency of the ML model. With less than 1% of transient failure events, the top-1 prediction accuracy of the ResNet-18 model drops by 32-50%, and with approximately 0.8% of transient failure events, the accuracy of PTQ4ViT drops by almost 90%. To address this issue, we propose new hardware mechanisms and design flow solutions that can successfully mitigate the impact of asymmetric transistor aging on ML accelerator reliability with minimal power and area overhead.

Original language: English
Pages (from-to): 44041-44061
Number of pages: 21
Journal: IEEE Access
Volume: 13
State: Published - 2025

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Keywords

  • Asymmetric aging
  • bias temperature instability
  • deep neural networks
  • machine learning accelerators
  • mission critical applications
  • systolic arrays
  • transistor aging
  • very large scale integration
