New benchmark evaluates AI safety alignment under stress
AI Safety Journal
Researchers release a suite to test AI safety behaviors with adversarial prompts and red-teaming patterns.
Papers | Safety | Feb 14, 2026
AI News Platform