Large language models (LLMs) are widely known for their ability to churn out essays and other forms of text in seconds. But for years, researchers have been using math problems that involve advanced reasoning as a test of what artificial intelligence (AI) systems are truly capable of.

The race to build AI systems that rival human intelligence has led to several claims of mathematical breakthroughs, along with questions about the validity of certain benchmark tests.

But this week provided a clearer sign of progress, as two AI models – developed by OpenAI and Google DeepMind – achieved scores high enough to win gold medals at the International Mathematical Olympiad (IMO) 2025, a prestigious math competition for high school students. This is the first time any AI model has achieved such a high level of success on these kinds of problems.

Every year since 1959, countries from around the world have sent their brightest ‘mathletes’ to compete at the IMO. The Olympiad takes place in two sessions, and participants are expected to solve three challenging math problems in each session. Each session lasts 4.5 hours.

The AI models solved five out of six math problems under the same conditions as human participants. Each problem carries seven points, and the problems cover topics such as algebra, combinatorics, geometry, and number theory.

The two AI models scored 35 out of 42 points (five solved problems at seven points each), which was the cut-off this year for winning a gold medal. Both OpenAI and Google DeepMind used experimental AI reasoning models. Reasoning models are different from conventional LLMs because they are said to work through a problem step-by-step before finally arriving at an answer.

India also bagged three gold medals, two silver, and one bronze at IMO 2025. Among the three gold-medal winners were Kanav Talwar and Aarav Gupta from Delhi Public School (DPS) Faridabad, while their schoolmate Archit Manas took home the bronze.

“It’s both surprising and impressive that AI systems can now solve IMO-level problems. However, the exact methods these AI systems use to arrive at their solutions remain somewhat unclear,” Kanav Talwar told The Indian Express. Talwar is the only member of the Indian contingent who outperformed both AI models, scoring two points higher than them.

“This is not unexpected. AI systems work on large amounts of training data, so if they are fed enough Olympiad-level problems and their solutions, the AI system can memorise patterns (e.g., spotting cyclic quadrilaterals in geometry) for solving certain scenarios,” Aarav Gupta, another gold medal-winner, said.

What’s behind the rivalry between OpenAI and Google DeepMind?

This year was the first time IMO organisers officially worked with tech companies to allow their AI models to take part in the competition. While Google was part of this inaugural cohort, OpenAI wasn’t. The final scores achieved by the officially participating AI models were certified by IMO judges, and companies were reportedly asked to wait a few months before publishing the results in order not to steal the spotlight from the human medal-winners.

However, OpenAI was the first to go public with the results. The Microsoft-backed AI startup announced on Saturday, July 19, that its unreleased AI model had achieved a gold medal-worthy score, but the results have not been certified by IMO judges. Instead, OpenAI relied on third-party former IMO medallists to verify and grade the AI-generated solutions.

Google, on the other hand, used a general-purpose model called Gemini Deep Think at the competition. The experimental model has “an enhanced reasoning mode for complex problems that incorporates some of our latest research techniques, including parallel thinking. This setup enables the model to simultaneously explore and combine multiple possible solutions before giving a final answer, rather than pursuing a single, linear chain of thought,” Google said.
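Google has not published how parallel thinking is implemented, but the description maps loosely onto best-of-n sampling: generate several independent reasoning attempts, then keep the strongest one. A minimal Python sketch of that idea, in which generate_attempt and score_attempt are hypothetical placeholders rather than any real Gemini API:

```python
import concurrent.futures

# Hypothetical placeholders: Google has not published Gemini Deep Think's
# internals, and these are not real Gemini API calls.
def generate_attempt(problem: str, seed: int) -> str:
    # In a real system, this would sample one full chain of reasoning.
    return f"candidate solution {seed} for: {problem}"

def score_attempt(attempt: str) -> float:
    # In a real system, this would estimate how sound the attempt looks.
    return float(len(attempt))

def parallel_think(problem: str, n: int = 8) -> str:
    """Explore n reasoning paths concurrently and keep the best-scoring
    one, instead of committing to a single linear chain of thought."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        attempts = list(pool.map(lambda s: generate_attempt(problem, s), range(n)))
    return max(attempts, key=score_attempt)

print(parallel_think("an IMO-style problem statement"))
```

A real system would also need to combine partial ideas across attempts, as Google’s description suggests, which a simple pick-the-best selection like this does not capture.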
Gemini Deep Think’s results have been officially certified by the IMO. “We can confirm that Google DeepMind has reached the much-desired milestone […] Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow,” said Dr Gregor Dolinar, the president of the IMO.

While Google has said this version of its Deep Think model will be made available to AI Ultra subscribers after testing, OpenAI has said it does not plan to release an AI model with this level of math capability for several months.

Why are IMO gold medals a big deal for AI companies?

Despite the rivalry between OpenAI and Google DeepMind, both companies’ AI models essentially tied with the same final score. Their performances also underscore how rapidly AI models are evolving.

Last year, Google DeepMind announced that its AI tools, AlphaProof and AlphaGeometry, had achieved an IMO score equivalent to a silver medal. But these tools were specially fine-tuned for solving math problems. They also relied on human experts to first translate the problems from natural language into formal languages such as Lean, and to translate the output back. The computation for the proofs also took significantly more time.
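For context, a formal language like Lean restates a problem so that a proof assistant can mechanically check every step. A toy illustration in Lean 4, assuming the Mathlib library (this is an invented example, not an actual IMO problem):

```lean
import Mathlib.Tactic

-- Informal claim: the sum of two even integers is even.
-- Before a system like AlphaProof can search for a proof, the claim
-- must be restated in Lean's formal syntax so that the Lean kernel
-- can verify every step of the resulting proof.
theorem even_add_even (m n : ℤ)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  obtain ⟨a, ha⟩ := hm
  obtain ⟨b, hb⟩ := hn
  exact ⟨a + b, by rw [ha, hb]; ring⟩
```

The payoff of formalisation is that proofs become machine-checkable, though, as noted above, last year’s translation-based pipeline was also significantly slower.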
“This year, our advanced Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions – all within the 4.5-hour competition time limit,” Google said.

Researchers behind OpenAI and Google’s IMO efforts this year also claimed that the gold-medal results showed how far AI reasoning models have come in solving problems whose answers cannot be easily checked or verified.

While AI models may be approaching elite human mathematical reasoning, India’s math talents believe that AI still cannot match the emotion and creativity involved.

“IMO participants not only solve problems but also experience the unique emotions, excitement, and mental challenge that come with the exam environment, which is what makes the IMO truly special,” Talwar said. Archit Manas also agreed that AI models would probably find it hard to solve mathematical problems that require truly new ideas. “For example, an AI model trained on pre-IMO 2007 ideas would find it hard to solve IMO 2007/6,” he told The Indian Express.

Where could mathematical AI models be used?

The achievements of AI models at this year’s IMO suggest that they could be used to crack unsolved research problems in fields like cryptography and space exploration. But LLMs are also prone to stumbling on simple questions, such as whether 9.11 is bigger than 9.9 (it is not: 9.9 = 9.90, which is greater than 9.11). Hence, they are said to possess ‘jagged intelligence’, a term coined by Andrej Karpathy, a founding member of OpenAI.

“AI can help mathematicians solve problems and take care of the more mundane computational elements in their work,” Talwar said.

“Maybe AI can be used for checking proofs and even for brainstorming, but in my opinion, AI being able to replace mathematicians is a long way off,” Gupta opined.

When asked if they would recommend using AI tools to train for future Olympiads, Talwar said, “Maybe in the future, AI can help in math Olympiad preparation in a way that is analogous to chess, where the AI could suggest better ideas on specific problems.”