Anthropic’s had a rough streak of luck lately. Last week, Fortune reported on an accidental leak that revealed the AI company’s development of a new model, Mythos. Less than a week later, Anthropic was exposed again; this time, its source code was showing.

Security researcher Chaofan Shou discovered that the company had shipped version 2.1.88 of Claude Code with a 59.8MB source map file attached to the npm package, effectively providing a full view of the codebase.
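The mechanism here is worth pausing on. Source maps exist so debuggers can translate minified, bundled JavaScript back to the original code, and they typically embed every original file verbatim in a sourcesContent field; shipping one alongside the bundle is, in effect, shipping the source itself. Here is a minimal sketch of what recovery looks like, using Mozilla's source-map library against a hypothetical cli.js.map pulled from a published npm tarball (the file name is illustrative, and this is not necessarily how Shou worked):

```typescript
// Minimal sketch: dump the original sources embedded in a shipped source map.
// Assumes Mozilla's "source-map" library (npm install source-map) and a
// hypothetical cli.js.map taken from a published npm tarball.
import { readFileSync } from "fs";
import { SourceMapConsumer } from "source-map";

async function dumpSources(mapPath: string): Promise<void> {
  const rawMap = JSON.parse(readFileSync(mapPath, "utf8"));
  await SourceMapConsumer.with(rawMap, null, (consumer) => {
    for (const source of consumer.sources) {
      // sourceContentFor returns the original file verbatim when the map
      // embeds sourcesContent, or null (instead of throwing) otherwise.
      const content = consumer.sourceContentFor(source, true);
      if (content !== null) {
        console.log(`=== ${source} (${content.length} chars) ===`);
        console.log(content);
      }
    }
  });
}

dumpSources("cli.js.map").catch(console.error);
```

No reverse engineering is involved: if the map carries sourcesContent, the original file tree falls out intact, which is how a single 59.8MB file can amount to a full view of a codebase.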
The saga continued as Anthropic invoked U.S. digital copyright law, asking GitHub to take down repositories containing the leaked code. As reported by TechCrunch, the takedown ended up removing upwards of 8,000 repositories, which an Anthropic spokesperson chalked up to an accident: “The takedown reached more repositories than intended.”

Anthropic has since retracted the takedown notice for the other repositories, but the clunky maneuver adds another stain to the company’s messy week. While Anthropic likely hopes this is the end of unwanted surprises, there’s no resealing the now-opened cans of worms, and no telling what new security risks lie ahead.

A look behind the Anthropic curtain

With 512,000 lines of code exposed, the AI community at large now has an unfettered view into Claude Code’s full architecture. Zahra Timsah, Ph.D., co-founder and CEO of i-GENTIC AI, who has contributed to global AI governance and policy for the World Economic Forum, says calling the event a leak is “too convenient”: “What you are actually looking at is a structural exposure of how the system thinks and enforces boundaries,” she tells The New Stack. “When system prompts, orchestration logic, and hidden flags are exposed, you are no longer dealing with a black box.”

Beyond the leaked codebase, voyeurs also got an inside scoop this past week via an unsecured, publicly accessible data store for a new model Anthropic has been developing, Claude Mythos. An Anthropic spokesperson told Fortune it is “the most capable [model] we’ve built to date,” representing “a step change” in AI performance.

Also in the data store reviewed by Fortune were details for a new tier of AI models to be called Capybara. As the Anthropic literature describes it, Capybara is “larger and more intelligent than our Opus models — which were, until now, our most powerful.”

Security risks, now and later

Anthropic removed public access to the data store after being notified by Fortune, but now that the cat’s out of the bag, there’s plenty to mull over.

First, what will come of the leaked source map that exposed Claude Code’s full architecture? “The leaked source exposes Claude Code’s exact permission-enforcement logic, its hook-orchestration paths, and the trust boundaries it uses to decide when to execute code in unfamiliar repositories,” explains earlier coverage from The New Stack. This hands bad actors ready-made pointers for exploiting weaknesses and bypassing safeguards.

But these are only the immediate concerns. By Anthropic’s own account, more cybersecurity threats lurk ahead.

As revealed in one of the leaked documents first reported on by Fortune, Anthropic is well aware of the cybersecurity risks its new model Capybara could pose, which is why the AI company gave early access to select organizations: “We want to understand the model’s potential near-term risks in the realm of cybersecurity — and share the results to help cyber defenders prepare.” These risks stem from the model’s rapid progress; Anthropic describes it as “currently far ahead of any other AI model in cyber capabilities.” Should bad actors get their hands on these capabilities, Anthropic warns, “it presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

Move fast, and things will certainly be broken

Together, the leaked Mythos and Capybara plans, the exposed source code, and the sloppy GitHub takedown make for an awkward moment for Anthropic.

“Anthropic built its positioning on being the responsible actor. That positioning just took a hit,” says Timsah. While she says it’s clear that the AI company invested in constraining model behavior, she thinks it wasn’t equally rigorous about the release pipeline and infrastructure controls: “You do not get to claim safety leadership if it only applies to the model layer.”

Shayne Adler, co-founder and CEO of Aetos Data Consulting, an advisory firm for data privacy, AI governance, and cybersecurity, agrees, calling for a more comprehensive approach to AI governance: “Building trust in AI systems depends as much on proper, consistent governance and change control as it does on the performance of the frontier model,” Adler tells The New Stack.

While Anthropic has been on a feature-release sprint as of late (including last week’s Claude computer-use capabilities), it’s not the only AI company going full steam ahead. And in this ferocious race to get ahead, it may seem that at least a few sloppy leaks are inevitable. Asked whether she considers such accidents inevitable, Timsah pushes back: speed shouldn’t have to come at the cost of accountability. “Fast-moving AI companies are optimizing for velocity and retrofitting accountability later,” she tells The New Stack. “As long as companies prioritize shipping over enforcement, you will keep seeing variations of this.”
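Timsah’s point about enforcement has a concrete, if mundane, translation for anyone shipping npm packages. As a minimal sketch of the kind of release-pipeline check she is describing (an assumed setup, not Anthropic’s actual pipeline), a script can ask npm what the published tarball would contain and fail the release if any source map is on the list:

```typescript
// Minimal sketch of a prepublish guardrail, not Anthropic's actual pipeline:
// refuse to publish if the npm tarball would contain any source maps.
import { execFileSync } from "child_process";

// "npm pack --dry-run --json" reports the exact contents of the tarball
// npm would publish, without writing it to disk.
const output = execFileSync("npm", ["pack", "--dry-run", "--json"], {
  encoding: "utf8",
});
const report = JSON.parse(output);

const shipped: string[] = report[0].files.map((f: { path: string }) => f.path);
const sourceMaps = shipped.filter((p) => p.endsWith(".map"));

if (sourceMaps.length > 0) {
  console.error("Refusing to publish; source maps in tarball:");
  for (const p of sourceMaps) console.error("  " + p);
  process.exit(1);
}

console.log(`OK: ${shipped.length} files, no source maps.`);
```

Run from CI or an npm prepublishOnly script, a check like this makes shipping the codebase by accident something the pipeline has to actively override rather than something it silently allows.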