What Makes a Good LLM Agent for Real-world Penetration Testing?

Feb 19, 2026

Gelei Deng, Yi Liu, Yuekang Li, Ruozhao Yang, Xiaofei Xie, Jie Zhang, Han Qiu, Tianwei Zhang

Abstract

This work analyzes LLM-based penetration testing agents, identifies distinct engineering and planning failure modes, and introduces Excalibur, a difficulty-aware penetration testing agent that couples typed tooling, retrieval-augmented knowledge, and evidence-guided attack tree search.

Type

Preprint

Publication

arXiv preprint arXiv:2602.17622

This paper examines why LLM penetration testing systems succeed or fail in real-world settings, then proposes Excalibur to improve task selection and attack-chain planning through difficulty-aware reasoning.

Last updated on Feb 19, 2026