Statistical Runtime Verification for LLMs via Robustness Estimation

Natan Levy, Adiel Ashrov*, Guy Katz

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Adversarial robustness verification is essential for ensuring the safe deployment of Large Language Models (LLMs) in runtime-critical applications. However, formal verification techniques remain computationally infeasible for modern LLMs due to their exponential runtime and white-box access requirements. This paper presents a case study adapting and extending the RoMA statistical verification framework to assess its feasibility as an online runtime robustness monitor for LLMs in black-box deployment settings. Our adaptation of RoMA analyzes confidence score distributions under semantic perturbations to provide quantitative robustness assessments with statistically validated bounds. Our empirical validation against formal verification baselines demonstrates that RoMA achieves comparable accuracy (within 1% deviation), and reduces verification times from hours to minutes. We evaluate this framework across semantic, categorial, and orthographic perturbation domains. Our results demonstrate RoMA’s effectiveness for robustness monitoring in operational LLM deployments. These findings point to RoMA as a potentially scalable alternative when formal methods are infeasible, with promising implications for runtime verification in LLM-based systems.

Original language: English
Title of host publication: Runtime Verification - 25th International Conference, RV 2025, Proceedings
Editors: Bettina Könighofer, Hazem Torfah
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 457-476
Number of pages: 20
ISBN (Print): 9783032054340
State: Published - 2026
Event: 25th International Conference on Runtime Verification, RV 2025 - Graz, Austria
Duration: 15 Sep 2025 to 19 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16087 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Runtime Verification, RV 2025
Country/Territory: Austria
City: Graz
Period: 15/09/25 to 19/09/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

Keywords

  • LLM safety
  • LLM verification
  • Neural Network Verification
  • Robustness
