New, more powerful artificial intelligence (AI) models are announced pretty regularly these days: the latest version of ChatGPT or Claude or Gemini always has new features and new capabilities that its makers are eager for customers to try out.

Now Anthropic has announced a new model, called Mythos, with great fanfare – but is only giving access to a select handful of users. The New York Times calls this a “terrifying warning sign” of the model’s power. Rather than releasing Mythos publicly, the company has started an initiative called Project Glasswing to use the model for good instead of evil.

Why? Early reports indicated that the model, with instruction, had been able to move outside a contained testing “sandbox” and send an email to a researcher.

A little alarming, perhaps. But more significantly, Anthropic claims Mythos has uncovered software vulnerabilities and bugs “in every major operating system and every major web browser”.

Finding hidden vulnerabilities

In one remarkable example, the model found a flaw in OpenBSD, a security-focused operating system used in firewalls and routers, which had gone undetected for 27 years.

According to Anthropic, it also found a 16-year-old vulnerability in FFmpeg, a little-known but widely used behind-the-scenes piece of software that helps computers, apps and websites handle audio and video files.

Anthropic also says Mythos found several vulnerabilities in the kernel of the Linux operating system, and chained them together in a way that could give an attacker complete control of a machine.

Anthropic’s internal testing (which has not been independently verified) showed the Mythos model was far more successful than earlier models at turning software bugs into working exploits.

Anthropic’s internal assessment of the model highlights both its technical promise and the need for vigilance.
The report outlines a hypothetical risk that an advanced AI might exploit its access within an organisation, but concludes that the model poses a very low threat of harmful autonomous actions. In other words, it is unlikely to “go rogue” – but it may follow human directions to do things that cause harm.

Why Anthropic is keeping Mythos off-limits

Anthropic says it decided not to release the model publicly because of its capabilities and the potential risks it poses.

At the same time, the company launched Project Glasswing. The effort brings together a broad coalition of tech companies such as Microsoft, Amazon, Google, Apple, Cisco and NVIDIA, open-source organisations such as the Linux Foundation, and major financial players such as JPMorganChase, to channel Mythos towards cyber defence rather than misuse.

The idea is to give defenders a head start in finding and fixing weaknesses in critical software before similar AI capabilities become widely available to attackers.

Reading between the lines of Anthropic’s messages

This is not the first time an AI firm has decided a model was too powerful to release widely. In 2019, years before the ChatGPT era, OpenAI did something similar with its (now quite primitive-looking) GPT-2 model. (Dario Amodei, now chief executive of Anthropic, was a key OpenAI researcher at the time.)

That history doesn’t mean these announcements should not be taken seriously, however. Anthropic has published unusually detailed material for a model it is not widely releasing, and reports suggest US authorities convened major US bank CEOs in Washington to discuss the cyber risks associated with Mythos.

Still, we should exercise caution about Anthropic’s claims, because outsiders cannot yet verify most of the underlying evidence. Anthropic says more than 99% of the vulnerabilities it found are still undisclosed because they have not yet been patched.
That is responsible disclosure, but it also means the public is being asked to trust a great deal it cannot fully inspect.

What Mythos could mean for the future of cybersecurity

Cybersecurity failures can have real effects on individuals. In Australia, the Optus breach exposed the personal information of about 9.5 million people. In another case, stolen Medibank records included sensitive health information, and some of the data was later released on the dark web.

These were not just database problems. They became crises of privacy, identity and trust.

That is why Mythos matters. Mythos and other AI models like it could change the basic economics of cybersecurity. In the past, serious vulnerabilities have often stayed hidden simply because nobody found them – and that, in turn, was because finding them took rare skill, patience and time.

If models like Mythos can scan the hidden plumbing of the internet – operating systems, browsers, routers and shared open-source code – at unprecedented scale, then what is now specialised hacking could become a routine, automated process.

For organisations and software development firms, Mythos is a double-edged sword. It could rapidly uncover hidden flaws in their own code, but it also raises the fear that attackers could find those vulnerabilities first.

The implications reach well beyond tech companies. Much of that underlying, invisible software supports the services people rely on every day, from electricity and water to airlines, banking, retail and hospitals.

What now?

So far, cybersecurity and software companies have been remarkably quiet in public about Anthropic’s Mythos. Many firms appear to be waiting and watching, unwilling to signal their stance in case the model exposes weaknesses in their own systems.

But developments like Mythos are a reason to stop treating cybersecurity as somebody else’s problem. For now, for individuals, the response is simple: basic cyber hygiene matters more than ever.
Update phones, laptops, browsers and routers. Replace unsupported devices. Use a password manager. Turn on multi-factor authentication. Do not ignore patch notices.

Those are the immediate steps. Beyond them lies a harder set of questions about AI and cybersecurity – about who gets access to powerful AI models, who oversees their use, and who decides what counts as the “right hands”.

Stan is a member of the Association for Information Systems.

Saeed Akhlaghpour does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.