How Anthropic's Claude Helped Mozilla to Improve Firefox's Security

Wait 5 sec.

"It took Anthropic's most advanced artificial-intelligence model about 20 minutes to find its first Firefox browser bug during an internal test of its hacking prowess," reports the Wall Street Journal.The Anthropic team submitted it, and Firefox's developers quickly wrote back: This bug was serious. Could they get on a call? "What else do you have? Send us more," said Brian Grinstead, an engineer with Mozilla, Firefox's parent organization. Anthropic did. Over a two-week period in January, Claude Opus 4.6 found more high-severity bugs in Firefox than the rest of the world typically reports in two months, Mozilla said... In the two weeks it was scanning, Claude discovered more than 100 bugs in total, 14 of which were considered "high severity..." Last year, Firefox patched 73 bugs that it rated as either high severity or critical. A Mozilla blog post calls Firefox "one of the most scrutinized and security-hardened codebases on the web. Open source means our code is visible, reviewable, and continuously stress-tested by a global community." So they're impressed — and also thankful Anthropic provided test cases "that allowed our security team to quickly verify and reproduce each issue."Within hours, our platform engineers began landing fixes, and we kicked off a tight collaboration with Anthropic to apply the same technique across the rest of the browser codebase... . A number of the lower-severity findings were assertion failures, which overlapped with issues traditionally found through fuzzing, an automated testing technique that feeds software huge numbers of unexpected inputs to trigger crashes and bugs. However, the model also identified distinct classes of logic errors that fuzzers had not previously uncovered... We view this as clear evidence that large-scale, AI-assisted analysis is a powerful new addition in security engineers' toolbox. Firefox has undergone some of the most extensive fuzzing, static analysis, and regular security review over decades. Despite this, the model was able to reveal many previously unknown bugs. This is analogous to the early days of fuzzing; there is likely a substantial backlog of now-discoverable bugs across widely deployed software. "In the time it took us to validate and submit this first vulnerability to Firefox, Claude had already discovered fifty more unique crashing inputs" in 6,000 C++ files, Anthropic says in a blog post (which points out they've also used Claude Opus 4.6 to discover vulnerabilities in the Linux kernel). "Anthropic "also rolled out Claude Code Security, an automated code security testing tool, last month," reports Axios, noting the move briefly rattled cybersecurity stocks...Read more of this story at Slashdot.