3 Results

The Safety Divide: Open-Source AI Models Fall Short on Guardrails for Antisemitic, Dangerous Content

Report

ADL study finds popular open-source LLMs can easily be manipulated by malicious actors to produce antisemitic, extremist, and dangerous content amid weak safety guardrails.

December 09, 2025

Breaking the Building Blocks of Hate: A Case Study of Minecraft Servers

Report

The first analysis of hate and harassment based on Minecraft server data.

July 26, 2022

Investigating Digital Abuse: Mitigating Harm Online and On the Ground: A Toolkit for Law Enforcement

Action Guide

ADL’s toolkit on digital abuse and online hate equips law enforcement with tools to address hate or harassment that starts online but does not always stay there.

February 06, 2024