The Safety Divide: Open-Source AI Models Fall Short on Guardrails for Antisemitic, Dangerous Content
Report
ADL study finds that popular open-source LLMs have weak safety guardrails and can easily be manipulated by malicious actors into producing antisemitic, extremist, and dangerous content.