Microsoft’s new AI system finds 16 Windows flaws, including four critical RCEs

Microsoft has unveiled a new AI-driven vulnerability discovery system that identified 16 previously unknown Windows vulnerabilities, including four critical remote code execution flaws, in what security analysts say could mark a major shift in how software vulnerabilities are discovered and remediated.

The system, codenamed MDASH, was developed by Microsoft’s Autonomous Code Security team alongside the Windows Attack Research and Protection group.

The platform will enter private preview for enterprise customers next month, Microsoft said in a blog post announcing the system.

The vulnerabilities were patched as part of Microsoft’s May 12 Patch Tuesday release.

“Cyber defenders are facing an increasingly asymmetric battle,” Microsoft added in the blog post. “Attackers are using AI to increase the speed, scale, and sophistication of attacks.”

Critical Windows components affected

The four critical vulnerabilities affected core Windows components broadly deployed across enterprise environments, Microsoft said in the blog.

Among them was CVE-2026-33827, a remote unauthenticated use-after-free flaw in the Windows IPv4 stack reachable through specially crafted packets carrying the Strict Source and Record Route option, Microsoft said.

Another flaw, CVE-2026-33824, involved a pre-authentication double-free issue in the IKEEXT service affecting RRAS VPN, DirectAccess, and Always-On VPN deployments.

Two additional critical flaws affected Netlogon and the Windows DNS Client, both carrying CVSS scores of 9.8.

The remaining 12 vulnerabilities rated “Important” included denial-of-service, privilege-escalation, information disclosure, and security feature bypass flaws affecting components such as tcpip.sys, http.sys, ikeext.dll, and telnet.exe, according to Microsoft.

How MDASH orchestrates AI agents

According to Microsoft, MDASH orchestrates more than 100 specialized AI agents across multiple frontier and distilled models, with each agent assigned to a different stage of the vulnerability discovery pipeline.

Some agents scan source code for potential flaws, others validate whether findings are genuine, and another stage attempts to construct triggering inputs capable of reproducing the issue before the finding reaches a human engineer for review.

“The model is one input. The system is the product,” Taesoo Kim, Microsoft vice president for agentic security, wrote in the blog.

Microsoft said the architecture was intentionally designed to remain largely model-agnostic, allowing the company to swap underlying AI models without rebuilding the broader orchestration pipeline.

That detail matters because MDASH arrives only weeks after Microsoft announced Project Glasswing, a partnership involving Anthropic and others to evaluate AI-driven vulnerability discovery using Anthropic’s Claude Mythos Preview model.

“Microsoft is now operating as platform owner, security vendor, AI infrastructure player, OpenAI partner, Mythos integrator, and agentic security supplier,” said Sanchit Vir Gogia, chief analyst at Greyhound Research. “That is a formidable position. It is also a concentration of influence that security leaders must examine with clear eyes.”

AI vs AI vulnerability race

The announcement also highlights growing concern that AI-driven vulnerability discovery could accelerate offensive operations as well as defensive research.

Anthropic has previously said its Mythos Preview model identified thousands of high-severity vulnerabilities, including a decades-old OpenBSD flaw and a long-undetected FFmpeg issue that traditional fuzzing tools failed to uncover despite millions of attempts.

“We’ve entered an AI-versus-AI vulnerability discovery race,” said Sunil Varkey, advisor at Beagle Security. “The winners won’t be the organizations with the best static scanners anymore. They’ll be the ones who can run these agentic systems fastest against their own code and remediate at machine speed.”

Varkey said enterprises should pursue early access to systems such as MDASH where possible rather than waiting for broader commercial availability.

“Early access isn’t just nice-to-have,” he said. “It’s becoming a defensive necessity in the AI era.”

For CISOs, the broader implication may be that vulnerability management is shifting from periodic scanning toward continuous, AI-assisted discovery and remediation.

“The future belongs to security teams that can find, validate, contain, and fix in one governed motion,” Gogia said.

Benchmarks show progress, but analysts urge caution

To support its claims, Microsoft published benchmark results showing MDASH identified all 21 deliberately planted vulnerabilities in an internal Windows test driver without false positives. The company also said the system successfully recovered nearly all historical Microsoft Security Response Center cases tested against older Windows component snapshots.

On the public CyberGym benchmark for vulnerability reproduction tasks, Microsoft said MDASH achieved a score of 88.45%, topping the public leaderboard at publication time.

Gogia said the results show the category is maturing but warned against treating benchmark scores as direct proof of enterprise value.

“CyberGym is a signal, not a buying decision,” he said. “The machinery around the model is beginning to resemble a serious security research workflow.”

He added that many enterprises still lack the governance maturity required to operationalize machine-generated vulnerability discovery effectively.

“Discovery without remediation discipline is theatre,” Gogia said. “It produces dashboards, not resilience.”
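
For readers who want a concrete picture of the staged workflow Microsoft describes (scan, validate, build a triggering input, then hand off to a human), the following is a minimal Python sketch. It is not MDASH's design: Microsoft's post, as reported here, does not disclose interfaces, so every name in it (`Finding`, `run_pipeline`, and the `toy_*` stages) is a hypothetical stand-in, and simple string matching takes the place of the AI agents.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# Hypothetical types and stage names, for illustration only. Nothing here
# reflects MDASH's real interfaces, which the article does not describe.


@dataclass
class Finding:
    location: str                    # where the suspected flaw sits
    description: str                 # what the scanning stage flagged
    validated: bool = False          # set once the validation stage accepts it
    trigger: Optional[bytes] = None  # reproducing input, if one was built


Scanner = Callable[[str], Iterable[Finding]]
Validator = Callable[[Finding], bool]
Reproducer = Callable[[Finding], Optional[bytes]]


def run_pipeline(source: str, scanners: List[Scanner],
                 validator: Validator, reproducer: Reproducer) -> List[Finding]:
    """Return only the findings that survive validation and reproduction."""
    review_queue: List[Finding] = []
    for scan in scanners:
        for finding in scan(source):
            if not validator(finding):
                continue              # dropped as a likely false positive
            finding.validated = True
            trigger = reproducer(finding)
            if trigger is None:
                continue              # could not be reproduced, so not escalated
            finding.trigger = trigger
            review_queue.append(finding)  # ready for a human engineer
    return review_queue


# Toy stages standing in for model-backed agents (assumptions, not real logic).
def toy_scanner(source: str) -> Iterable[Finding]:
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "cpy(" in line:
            yield Finding(location=f"line {lineno}", description=line.strip())


def toy_validator(finding: Finding) -> bool:
    # Pretend only the unbounded strcpy calls are genuine issues.
    return "strcpy(" in finding.description


def toy_reproducer(finding: Finding) -> Optional[bytes]:
    # Pretend an oversized buffer is enough to trigger the flaw.
    return b"A" * 64


if __name__ == "__main__":
    sample = "strcpy(dst, src);\nmemcpy(dst, src, n);\nreturn 0;"
    for f in run_pipeline(sample, [toy_scanner], toy_validator, toy_reproducer):
        print(f.location, f.description)
```

Because each stage is passed in as a plain callable, any one of them can be swapped without touching `run_pipeline`. That loosely mirrors the model-agnostic goal Microsoft describes, though the real orchestration is presumably far more involved.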