2026-05-27 | agents | infra | code

How VLAs Fail Differently: Black-Box Action Monitoring Reveals Architecture-Specific Failure Signatures

Krishnam Gupta

Key claim

Architecture-matched monitoring is essential for VLA safety.

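This paper reports that different VLA architectures exhibit distinct failure patterns at the motor-command level, which calls for monitoring tailored to each architecture. In the reported evaluations, direction reversal rate is the most consistent failure predictor across architectures (AUROC 0.93 on VQ-BeT, 0.91 on ACT, 0.79 on Diffusion Policy), while velocity-violation checks, a common safety mechanism, perform near or below chance (AUROC 0.52 on ACT, 0.41 on Diffusion Policy). Jerk RMS illustrates the architecture dependence most clearly: 0.88 on VQ-BeT but 0.41 on Diffusion Policy. For developers deploying VLA systems, the practical message is that a monitor validated on one architecture should not be assumed to transfer to another.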
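To make the central metric concrete, here is a minimal sketch of one way a direction reversal rate could be computed from a logged action sequence. The supplied text does not give the paper's exact definition, so the definition used here (the fraction of consecutive step pairs in which an action dimension changes sign), the function name, and the `eps` dead-band are all assumptions made for illustration.

```python
import numpy as np


def direction_reversal_rate(actions, eps: float = 1e-6) -> float:
    """Fraction of consecutive step pairs where an action dimension flips direction.

    ASSUMED definition for illustration; not taken from the paper.
    actions: array of shape (T, D), one commanded action per timestep.
    eps: deltas with magnitude <= eps count as "no movement" and never
         register as a reversal.
    """
    actions = np.asarray(actions, dtype=float)
    if actions.ndim != 2 or actions.shape[0] < 3:
        return 0.0  # need at least two deltas to observe a reversal
    deltas = np.diff(actions, axis=0)  # shape (T-1, D)
    signs = np.sign(np.where(np.abs(deltas) > eps, deltas, 0.0))
    # A reversal is a pair of consecutive deltas with opposite nonzero signs.
    reversals = (signs[:-1] * signs[1:]) < 0
    return float(reversals.mean())


# Usage: a smooth ramp never reverses, a zigzag reverses at every step.
smooth = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
zigzag = np.array([[0.0], [1.0], [0.0], [1.0], [0.0]])
print(direction_reversal_rate(smooth))  # 0.0
print(direction_reversal_rate(zigzag))  # 1.0
```

Because the function only reads the commanded actions, it fits the black-box setting described here: no access to model weights or internal activations is needed.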

Novelty

8.0/10

The paper presents a new finding: failure signatures in the action stream differ by VLA architecture, which motivates architecture-specific rather than one-size-fits-all monitoring.

Reliability

8.0/10

The claims are supported by a clear methodology and an evaluation spanning three architectures (VQ-BeT, Diffusion Policy, ACT) on two tasks (PushT, ALOHA).

Deep reliability assessment

The methodology supports the claim that, under these evaluation protocols, action-space metrics correlate differently with task failure for VQ-BeT, Diffusion Policy, and ACT across PushT and ALOHA episodes. The paper overclaims, however, when it generalizes to all VLA architectures and deployment settings: three architectures, two tasks, and AUROC correlations are not enough to establish architecture-family laws or real-world safety guarantees.
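Since the assessment rests on AUROC, a short sketch of how that number is obtained from per-episode monitor scores may help in reading the results. This uses the standard pairwise (Mann-Whitney) formulation; the function name and the toy scores are illustrative assumptions, not data from the paper.

```python
import numpy as np


def auroc(scores, failed) -> float:
    """Probability that a random failed episode scores higher than a random success.

    scores: one monitor value per episode (e.g. a reversal rate).
    failed: truthy where the episode failed the task.
    """
    scores = np.asarray(scores, dtype=float)
    failed = np.asarray(failed, dtype=bool)
    pos, neg = scores[failed], scores[~failed]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one failed and one successful episode")
    # Compare every failed episode with every successful one; ties count half.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


# Toy example with four episodes (made-up numbers):
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUROC of 0.5 means the score carries no ranking information, and a value below 0.5 (such as the reported 0.41) means higher scores are, if anything, associated with success rather than failure. AUROC measures ranking quality only; it does not by itself supply an alarm threshold or a false-alarm rate.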

Reproducibility

Code: yes, the paper links to an open-source repository. Dataset: no clear standalone dataset release is mentioned; experiments use demonstration episodes and evaluation protocols for PushT and ALOHA, but no dataset URL is provided in the supplied text.

Discussion questions

1. Is direction reversal rate genuinely detecting model failure, or is it mostly a proxy for task phase, controller frequency, action scaling, or environment-specific contact dynamics?
2. For builders deploying VLAs, should action monitors be used only as alarms, or should they actively modify actions through clipping and velocity clamping despite the risk of changing policy behavior?
3. What result would falsify the architecture-specific story: a larger study where jerk predicts continuous diffusion/flow policies well, or where reversal rate fails on more diverse real-robot tasks?

Key figure

The key architectural diagram would show SafeContract inserted as a black-box layer between the VLA action output and robot motors, applying conformal bounds, velocity constraints, action-health metrics, logging, and shift detection before commands reach the robot.
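The placement described above can be sketched as a thin wrapper around a black-box policy. This is not SafeContract's actual API, which is not shown in the supplied text: the class name `MonitoredPolicy`, its fields, and the simple per-step clamp are hypothetical, and fixed `low`/`high` bounds stand in for the conformal bounds the paper mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

import numpy as np


@dataclass
class MonitoredPolicy:
    """Hypothetical black-box layer between a policy's action output and the motors."""

    policy: Callable[[np.ndarray], np.ndarray]  # observation -> raw action
    low: np.ndarray          # lower action bound (stand-in for a conformal bound)
    high: np.ndarray         # upper action bound (stand-in for a conformal bound)
    max_step: float          # largest allowed per-dimension change between steps
    log: List[dict] = field(default_factory=list)
    _prev: Optional[np.ndarray] = None

    def __call__(self, obs: np.ndarray) -> np.ndarray:
        raw = np.asarray(self.policy(obs), dtype=float)
        safe = np.clip(raw, self.low, self.high)  # bound check
        if self._prev is not None:
            # Velocity clamp: limit the change relative to the last sent command.
            step = np.clip(safe - self._prev, -self.max_step, self.max_step)
            safe = self._prev + step
        modified = bool(not np.allclose(raw, safe))
        self.log.append({"raw": raw, "sent": safe, "modified": modified})
        self._prev = safe
        return safe


# Usage with a stand-in "policy" that simply echoes its observation.
monitored = MonitoredPolicy(policy=lambda obs: obs, low=-1.0, high=1.0, max_step=0.1)
print(monitored(np.array([0.0])))  # first command passes through unchanged
print(monitored(np.array([0.5])))  # clamped to a 0.1 step from the previous command
```

Logging both the raw and the sent action keeps the two roles separable: the same wrapper can run as an alarm only (inspect `modified`, send `raw`) or as an active filter (send `safe`), which is the trade-off raised in the discussion questions.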

Benchmark results

~ PushT / ALOHA evaluation episodes. AUROC using direction reversal rate: 0.93 (baseline: N/A). Monitor evaluated on VQ-BeT.

~ PushT / ALOHA evaluation episodes. AUROC using direction reversal rate: 0.79 (baseline: N/A). Monitor evaluated on Diffusion Policy.

~ PushT / ALOHA evaluation episodes. AUROC using direction reversal rate: 0.91 (baseline: N/A). Monitor evaluated on ACT.

~ PushT / ALOHA evaluation episodes. AUROC using jerk RMS: 0.88 (baseline: N/A). Monitor evaluated on VQ-BeT.

~ PushT / ALOHA evaluation episodes. AUROC using jerk RMS: 0.69 (baseline: N/A). Monitor evaluated on ACT.

~ PushT / ALOHA evaluation episodes. AUROC using jerk RMS: 0.41 (baseline: N/A). Monitor evaluated on Diffusion Policy.

~ ALOHA. AUROC using velocity violations: 0.52 (baseline: N/A). Monitor evaluated on ACT.

~ PushT / ALOHA evaluation episodes. AUROC using velocity violations: 0.41 (baseline: N/A). Monitor evaluated on Diffusion Policy.
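For readers who want to try the other two monitors listed in these results, here is a hedged sketch of jerk RMS and a velocity-violation rate computed from an action trajectory. The exact definitions used in the paper are not given in the supplied text; treating actions as positions sampled every `dt` seconds, using finite differences, and the names `jerk_rms` and `velocity_violation_rate` are all assumptions.

```python
import numpy as np


def jerk_rms(actions, dt: float = 1.0) -> float:
    """Root-mean-square of the third finite difference of the actions.

    ASSUMED definition. actions: array of shape (T, D).
    """
    actions = np.asarray(actions, dtype=float)
    if actions.shape[0] < 4:
        return 0.0  # a third difference needs at least four samples
    jerk = np.diff(actions, n=3, axis=0) / dt**3
    return float(np.sqrt(np.mean(jerk**2)))


def velocity_violation_rate(actions, v_max: float, dt: float = 1.0) -> float:
    """Fraction of steps where any dimension's speed exceeds v_max.

    ASSUMED definition. actions: array of shape (T, D).
    """
    actions = np.asarray(actions, dtype=float)
    if actions.shape[0] < 2:
        return 0.0
    speed = np.abs(np.diff(actions, axis=0)) / dt
    return float(np.mean(np.any(speed > v_max, axis=1)))


# Usage: a constant-acceleration trajectory (t^2) has zero jerk,
# while a cubic trajectory (t^3) has a constant third difference of 6.
t = np.arange(10, dtype=float).reshape(-1, 1)
print(jerk_rms(t**2))  # 0.0
print(jerk_rms(t**3))  # 6.0
```

Each function returns one scalar per episode, so its output can be scored against task outcomes with AUROC separately for each architecture, which is the comparison the rows above report.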

GitHub: 1 repo

krishnam94/vla-edge (Official)