OpenAI’s GPT-5.4 mini and nano are built for the subagent era


On Tuesday, OpenAI released GPT-5.4 mini and nano, two smaller models designed for the tasks that agentic AI systems delegate: codebase searches, file reviews, and parallel subtasks that need to be fast and affordable.

This is OpenAI's first new mini and nano release in a while; the last was GPT-5 mini and nano in 2025.

What's especially interesting is that, in some areas, the trade-off between using the mini model and the full GPT-5.4 isn't even that large, especially on coding and computer-use benchmarks, while mini also runs more than twice as fast, OpenAI says. Nano is the stripped-down version for high-volume work: classification, data extraction, ranking, and lightweight coding support. Both became available on Tuesday.

Pricing and availability

GPT-5.4 mini is available in the API, Codex, and ChatGPT. It has a 400,000-token context window, accepts text and image inputs, and costs $0.75 per million input tokens and $4.50 per million output tokens.

For developers using OpenAI's Codex agentic coding engine, mini consumes only 30% of the GPT-5.4 quota, OpenAI says, which should help developers handle routine coding tasks without burning through their allowance.

OpenAI is taking a different approach with GPT-5.4 nano. It's API-only, but at $0.20 per million input tokens and $1.25 per million output tokens, it's OpenAI's cheapest model right now.

How close is mini to the flagship model?

On SWE-bench Pro, a benchmark that tests models on real software engineering tasks, mini scores 54.38%, only 3 percentage points behind the full GPT-5.4. On OSWorld-Verified, which measures computer-use ability, mini scores 72.13%, almost matching the flagship's 75.03% (all of these runs used 'high' reasoning effort).

Nano, unsurprisingly, doesn't perform as well. It still outperforms the original GPT-5 mini on coding and tool-calling tasks, but scores lower on OSWorld-Verified (39.01% vs. 42%).
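To put the published rates in concrete terms, here is a minimal cost-estimation sketch. The per-million-token prices are the ones OpenAI quotes; the function name and the example workload (1 million input tokens, 200,000 output tokens) are illustrative, not part of any official SDK:

```python
# Per-million-token rates as published by OpenAI for the new models.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    rates = PRICES[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# A hypothetical subagent workload: 1M input tokens, 200k output tokens.
mini_cost = estimate_cost("gpt-5.4-mini", 1_000_000, 200_000)  # 0.75 + 0.90 = $1.65
nano_cost = estimate_cost("gpt-5.4-nano", 1_000_000, 200_000)  # 0.20 + 0.25 = $0.45
```

At that workload, nano comes in at roughly a quarter of mini's cost, which is the kind of spread that makes routing high-volume classification and extraction work to the cheaper model attractive.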
You definitely don’t want the nano model to surf the internet for you. (Credit: OpenAI)

Built for delegation

The overall pattern OpenAI is highlighting here is becoming familiar. In Codex, GPT-5.4 handles planning, coordination, and final review. Mini subagents run in parallel underneath, handling focused tasks: searching a codebase, reviewing a large file, processing supporting documents. In these settings, OpenAI says in its announcement, “the best model is often not the largest one—it’s the one that can respond quickly, use tools reliably, and still perform well on complex professional tasks.”

Notion AI Engineering Lead Abhisek Modi says this shift is already real. “GPT-5.4 mini handles focused, well-defined tasks with impressive precision. For editing pages specifically, it matched and often exceeded GPT-5.2 on handling complex formatting at a fraction of the compute,” he says. “Until recently, only the most expensive models could reliably navigate agentic tool calling. Today, smaller models like GPT-5.4 mini and nano can easily handle it, which will let users building Custom Agents on Notion pick exactly the amount of intelligence they need.”

OpenAI’s competitors are taking a similar approach with their smaller models. Anthropic’s Claude 4.5 Haiku is designed for lightweight agent tasks, and Google’s Gemini 3 Flash targets similar use cases. As agents take on more complex work, most of the computing goes to these cheap workhorse models, not the frontier model at the top of the leaderboard.

The post OpenAI’s GPT-5.4 mini and nano are built for the subagent era appeared first on The New Stack.