Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
May 23, 2023
Yi Liu
Gelei Deng
Zhengzi Xu
Yuekang Li
Yaowen Zheng
Ying Zhang
Lida Zhao
Tianwei Zhang
Yang Liu
Abstract
Large Language Models (LLMs) have revolutionized natural language processing, but their safety mechanisms can be circumvented through carefully crafted prompts. This empirical study systematically investigates jailbreaking techniques against ChatGPT, analyzing prompt engineering methods that bypass its content moderation and safety filters.
Publication
arXiv preprint arXiv:2305.13860
This work presents the first comprehensive empirical study of jailbreaking ChatGPT through prompt engineering, identifying key vulnerability patterns and offering insights for strengthening LLM safety mechanisms.