Xiao Zhang's Homepage
Response Filtering
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
We propose AutoDefense, a multi-agent defense framework that uses response filtering to filter out harmful responses from LLMs.
Yifan Zeng, Yiran Wu, Xiao Zhang, Huazheng Wang, Qingyun Wu