Supervisory AI Safety Agent Detected More Suicide-Risk Vignettes
TL;DR: A 2026 preprint on medRxiv tested suicide-risk vignettes and found that an independent supervisory safety agent detected intervention-level risk far more often than native ChatGPT Health safeguards.

Key Findings

224 paired evaluations: Researchers tested suicide-related clinical vignettes under two information conditions, creating 224 paired comparisons between native safeguards and an external supervisory system.

91.5% …
