Xiao Zhang
LLM Jailbreak Attacks
Jul 8, 2025
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
We propose AutoDefense, a multi-agent defense framework that filters harmful responses generated by LLMs.
Yifan Zeng, Yiran Wu, Xiao Zhang, Huazheng Wang, Qingyun Wu