AISI Evaluates OpenAI’s GPT-5.5 Cyber Capabilities

The UK’s AI Security Institute (AISI) has evaluated OpenAI’s GPT-5.5 and found it to be one of the strongest AI models it has tested on cyber tasks. Notably, GPT-5.5 solved a complex multi-step corporate network attack simulation end-to-end in 2 of 10 attempts, demonstrating advanced cyber-offensive skills.

On a suite of 95 cyber tasks, the model achieved a 71.4% success rate on the expert-level challenges, surpassing previous models including Anthropic’s Claude Mythos Preview. GPT-5.5 also completed a difficult reverse-engineering challenge in just over 10 minutes, a task that took human experts hours.

Although GPT-5.5 did not solve an industrial control system attack simulation, its results point to rapid advances in AI cyber capabilities. The evaluation underscores the need for robust safeguards and defensive measures as AI-driven cyber threats evolve.

Source: AISI Blog