For generations, mathematicians have tried and failed to solve an 80-year-old problem. In May, an artificial intelligence (AI) model successfully tackled it.

OpenAI, the maker of ChatGPT, said one of its internal models had made a breakthrough with the challenge first posed by Hungarian mathematician Paul Erdős in 1946: the planar unit distance problem. The mathematician was known for creating such challenges, which are collectively known as “Erdős problems”.

Claims of AI models solving maths problems are not new, and they have attracted criticism as well as scepticism. Experts, however, say this particular result stands out. Had a human carried it out, it would merit publication in a top maths journal, they say. “No previous AI-generated proof has come close” to meeting those high standards, wrote Tim Gowers, a mathematician at the University of Cambridge, in commentary for OpenAI.

Thomas Bloom, a researcher at the University of Manchester who maintains the website erdosproblems.com, told The Indian Express that he had ranked the “unit distance problem” among his top 10 Erdős problems and expected that a solution was still a long way off. These breakthroughs, according to him, matter because they show that AI is capable of doing real research.

The problem is deceptively simple. Take a piece of paper and place some dots on it. As you keep adding dots, even into the millions and trillions, how do you arrange them to get the maximum number of pairs of dots that are at the same distance from each other?

Erdős posited that the best strategy to get the maximum number of equal-distance pairs is to arrange the dots in a shape roughly resembling a square grid.

For decades, mathematicians thought that this was indeed the best way but could not find a proof. “Like most people, I was expecting a proof,” said Bloom, who is among nine mathematicians who have verified this result.

But the OpenAI model, instead of confirming the conjecture, disproved the square-grid assumption, “discovering an entirely new family of constructions that performs better”, according to OpenAI.

This means that it essentially found a new pattern, one that is difficult to represent visually, by drawing on different fields of mathematics to show that it was possible to get even more equal-distance pairs.

Maths and AI models

AI models that once struggled with basic arithmetic are now solving SAT problems and tackling Olympiad-level benchmarks. Some of these claims, however, have attracted scrutiny.

In October 2025, OpenAI’s former Chief Product Officer Kevin Weil said on X that GPT-5 had found solutions to 10 Erdős problems. He later deleted this post. At that time, Bloom had described the claim as a “dramatic misrepresentation”, saying the model had only found existing literature.

This is exactly why the new result is so significant. Rather than simply drawing on existing data, it shows that the AI model was able to “read academic papers and understand them well enough to apply them in new ways”, said Bloom.

Days after the OpenAI announcement, Google DeepMind also said its AI system AlphaProof Nexus had solved nine Erdős problems.

So why is solving maths problems so important in the field of AI? According to Bloom, with creative writing, people can argue about whether something is well-written or not.
“With a mathematical proof, it’s either right or wrong, and everyone who’s read and understood it can agree on which.”

Disproving the conjecture

“Modern mathematics is very specialised, and people tend to have depth rather than breadth… a powerful technique from another field might have applications in a completely different field that all humans have missed,” said Bloom.

But unlike humans, AI systems are less constrained by assumptions about which fields connect naturally.

What makes the Erdős result particularly striking is how the AI model drew from multiple domains, Sayan Ranu, a professor at the Indian Institute of Technology Delhi, told The Indian Express. “The model solved a problem in discrete geometry using tools from algebraic number theory, branches of mathematics that experts had not previously connected to this question.”

By gathering literature and synthesising information across vast domains, AI models can appear more efficient in certain fields than an expert working in a single field. They can dig for solutions tirelessly.

Some caveats

Despite advancements in automation, experts said verification still requires human intervention. “The model did not invent something fundamentally new that nobody saw coming,” Sébastien Bubeck, who is leading OpenAI’s mathematical explorations, told Scientific American. “It just executed like an amazing mathematician.”

Ranu said high-profile successes can be misleading. “LLMs (large language models) continue to hallucinate. They have improved, but they still make trivial errors on elementary math today, errors that simply don’t make the headlines.”

Highlighting the element of luck, he said a model that cracks a deep open problem in one run may fail at a much simpler calculation in another.
So, while these landmark results are real, they do not mean the systems are reliably correct.

While OpenAI relied on human mathematicians to verify and simplify the output, Google DeepMind linked its AI to formal proof checkers such as Lean.

“With formal proof systems like Lean, the AI can produce code that demonstrates the proof is correct without the need for a human to check each line carefully; all they need to do is check the headline is accurate,” said Bloom.

So why do these results matter?

Bloom said, “These results in pure mathematics may have little effect on most people’s lives, but once AI can do that with maths papers, it could start to do the same with biology, physics, medicine, engineering.”

Ranu said that the ultimate setup is a hybrid, where automation acts as a filter, with humans as the final arbiter for the hard cases.

According to him, the models are already arriving at original scientific reasoning in a meaningful way, though a gap remains between these systems and the physical world. This, he said, is the next frontier where real progress will be seen.

The result, he believes, dispels a common misconception that LLMs are essentially a more powerful Google search, retrieving answers from a giant database.

But here’s the catch. These models are far from perfect, as reflected in Ranu’s cautionary note that LLMs make mistakes while sounding confident, so what they say must be verified. He pointed out that they are not yet better than the very best specialists in a given field, such as a leading mathematician on a problem they have spent decades on.

Erdős’ favourite problem, thus, became a test of whether AI has moved beyond the simple retrieval of information that search engines perform. “This problem had no answer to retrieve.
The model produced a new construction that humans had not found in eighty years of trying. That is reasoning and creation, not search,” Ranu said.
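
The dot puzzle at the centre of this story can be tried at toy scale. The Python sketch below is an illustration only. It assumes dots placed on whole-number coordinates and checks every pair by brute force. The function names are made up for this example, and it does not reproduce the construction found by the OpenAI model. It counts how many pairs of dots sit exactly one unit apart, first for 16 dots in a row and then for the same number of dots in a square grid.

```python
from itertools import combinations


def count_equal_distance_pairs(points, squared_distance):
    """Count unordered pairs of points whose squared distance equals the target.

    Squared distances are compared so that all arithmetic stays in integers.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == squared_distance
    )


def points_on_a_line(count):
    """Hypothetical helper: `count` dots spaced one unit apart in a row."""
    return [(x, 0) for x in range(count)]


def square_grid(side):
    """Hypothetical helper: a side-by-side grid of dots, one unit apart."""
    return [(x, y) for x in range(side) for y in range(side)]


# 16 dots in a row: only neighbouring dots are one unit apart.
print(count_equal_distance_pairs(points_on_a_line(16), 1))  # 15

# The same 16 dots in a 4-by-4 grid: horizontal and vertical neighbours both count.
print(count_equal_distance_pairs(square_grid(4), 1))  # 24
```

Checking every pair is only practical for a few thousand dots. The open question concerns how the maximum count grows as the number of dots becomes very large, which is why it took a proof rather than a computer search to settle it.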