Chinese AI startup Moonshot outperforms GPT-5 and Claude Sonnet 4.5: What you need to know

A Chinese AI startup, Moonshot, has disrupted expectations in artificial intelligence development after its Kimi K2 Thinking model surpassed OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5 across multiple performance benchmarks, sparking renewed debate about whether America’s AI dominance is being challenged by cost-efficient Chinese innovation.

Beijing-based Moonshot AI, valued at US$3.3 billion and backed by tech giants Alibaba Group Holding and Tencent Holdings, released the open-source Kimi K2 Thinking model on November 6, achieving what industry observers are calling another “DeepSeek moment”, a reference to the Hangzhou-based startup’s earlier disruption of AI cost assumptions.

Kimi.ai (@Kimi_Moonshot), November 6, 2025:
Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
SOTA on HLE (44.9%) and BrowseComp (60.2%)
Executes up to 200 – 300 sequential tool calls without human interference
Excels in reasoning, agentic search, and coding
256K context window
Built… pic.twitter.com/lZCNBIgbV2

Performance metrics challenge US models

According to the company’s GitHub blog post, Kimi K2 Thinking scored 44.9% on Humanity’s Last Exam, a large language model benchmark consisting of 2,500 questions across a broad range of subjects, exceeding GPT-5’s 41.7%.

The model also achieved 60.2% on the BrowseComp benchmark, which evaluates the web browsing proficiency and information-seeking persistence of large language model agents, and led the Seal-0 benchmark, designed to challenge search-augmented models on real-world research queries, with 56.3%.

VentureBeat reported that a fully open-weight release meeting or exceeding GPT-5’s scores marks a turning point at which the gap between closed frontier systems and publicly available models has effectively collapsed for high-end reasoning and coding.

Artificial Analysis (@ArtificialAnlys), November 7, 2025:
Kimi K2 Thinking is the new leading open weights model: it demonstrates particular strength in agentic contexts but is very verbose, generating the most tokens of any model in completing our Intelligence Index evals
@Kimi_Moonshot's Kimi K2 Thinking achieves a 67 in the… pic.twitter.com/m6SvpW7iif

Cost efficiency raises questions

The popularity of the model grew after CNBC reported its training cost was merely US$4.6 million, though Moonshot AI did not comment on the cost. According to calculations by the South China Morning Post, Kimi K2 Thinking’s application programming interface was six to 10 times cheaper than those of OpenAI’s and Anthropic’s models.

The model uses a Mixture-of-Experts architecture with one trillion total parameters, of which 32 billion are activated per inference, and was trained using INT4 quantisation to achieve a roughly twofold improvement in generation speed while maintaining state-of-the-art performance.

Thomas Wolf, co-founder of Hugging Face, commented on X that Kimi K2 Thinking was another case of an open-source model passing a closed-source model, asking, “Is this another DeepSeek moment? Should we expect [one] every couple of months now?”

Technical capabilities and limitations

Moonshot AI researchers said Kimi K2 Thinking set “new records across benchmarks that assess reasoning, coding and agent capabilities”.
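
As a rough illustration of the Mixture-of-Experts figures quoted in this article (one trillion total parameters, 32 billion activated per inference), the short Python sketch below works out the share of the model that is active on each pass and a back-of-the-envelope estimate of weight storage at 16 and 4 bits. The function names are hypothetical, and the storage estimate assumes every weight is stored at the same width, which is a simplification rather than a description of how Moonshot AI actually stores the model.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 1_000_000_000_000   # one trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion activated per inference


def active_fraction(active: int, total: int) -> float:
    """Share of parameters used on a single inference pass."""
    return active / total


def weight_storage_gb(params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: every weight is stored at the same width; activations,
    caches, and quantisation metadata are ignored.
    """
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per inference: {share:.1%}")
    print(f"weights at 16 bits: {weight_storage_gb(TOTAL_PARAMS, 16):,.0f} GB")
    print(f"weights at 4 bits:  {weight_storage_gb(TOTAL_PARAMS, 4):,.0f} GB")
```

On these assumptions, about 3.2% of the parameters are active per inference, and 4-bit weights take roughly a quarter of the space of 16-bit weights (about 500 GB versus 2,000 GB). This is a storage estimate only and is separate from the roughly twofold generation speed-up reported above.
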
The model can execute up to 200-300 sequential tool calls without human interference, reasoning coherently across hundreds of steps to solve complex problems.

Independent testing by consultancy Artificial Analysis placed Kimi K2 at the top of its Tau-2 Bench Telecom agentic benchmark with 93% accuracy, which the consultancy described as the highest score it had independently measured.

However, Nathan Lambert, a researcher at the Allen Institute for AI, suggested there is still a lag of approximately four to six months in raw performance between the best closed and open models, though he acknowledged that Chinese labs are closing in and performing very strongly on key benchmarks.

Market implications and competitive pressure

Zhang Ruiwang, a Beijing-based information technology system architect, said the trend was for Chinese companies to keep costs down, explaining, “The overall performance of Chinese models still lags behind top US models, so they have to compete in the realms of cost-effectiveness to have a way out”.

Zhang Yi, chief analyst at consultancy iiMedia, said the training costs of Chinese AI models were seeing a “cliff-like drop” driven by innovation in model architecture and training techniques and by the input of quality training data, marking a shift away from the early practice of piling on computing resources.

The model was released under a Modified MIT License that grants full commercial and derivative rights, with one restriction: deployers serving over 100 million monthly active users or generating over US$20 million per month in revenue must prominently display “Kimi K2” on the product’s user interface.

Industry response and future outlook

Deedy Das, a partner at early-stage venture capital firm Menlo Ventures, wrote in a post on X that “Today is a turning point in AI. A Chinese open-source model is #1. Seminal moment in AI”.

Deedy (@deedydas), November 7, 2025:
Today is a turning point in AI.
A Chinese open source model is #1.
Kimi K2 Thinking scored 51% in Humanity's Last Exam, higher than GPT-5 and every other model. $0.6/M in, $2.5/M output.
The best at writing, and does 15tps on two Mac M3 Ultras!
Seminal moment in AI.
Try it… pic.twitter.com/fmxlxpCGbE

Nathan Lambert wrote in a Substack article that the success of Chinese open-source AI developers, including Moonshot AI and DeepSeek, showed how they “made the closed labs sweat,” adding, “There’s serious pricing pressure and expectations that [the US developers] need to manage”.

The release positions Moonshot AI alongside other Chinese AI companies such as DeepSeek, Qwen, and Baichuan that are increasingly challenging the narrative of American AI supremacy through cost-efficient innovation and open-source development strategies. Whether this represents a sustainable competitive advantage or a temporary convergence in capabilities remains to be seen as both US and Chinese companies continue advancing their models.

(Photo by Moonshot AI)
The post Chinese AI startup Moonshot outperforms GPT-5 and Claude Sonnet 4.5: What you need to know appeared first on AI News.