OpenAI has released a new generation of voice models for its API, led by GPT-Realtime-2. The models power real-time conversational agents that can listen, reason, handle interruptions mid-dialogue, call tools, and complete tasks during a conversation.

Three models were announced:

  • GPT-Realtime-2 — a production-ready voice agent model with GPT-5-level reasoning, interruption handling, tool calls, and more natural dialogue.
  • GPT-Realtime-Translate — streaming real-time translation supporting over 70 input languages and 13 output languages.
  • GPT-Realtime-Whisper — streaming speech transcription for subtitles, notes, and live captions.

Source: OpenAI, "Advancing voice intelligence with new models in the API"