AI outsmarted 30 of the world's top mathematicians at secret meeting in California

The AI Report
Daily AI, ML, LLM and agents news
- #artificial_intelligence
- #mathematics
- #ai_research
- #openai

When AI Outsmarts the World's Top Mathematicians
In a quiet corner of Berkeley, California, a highly unusual gathering took place this past May. Thirty of the most brilliant mathematical minds from around the globe convened not just to discuss complex problems, but to challenge a new form of artificial intelligence. Their mission: to devise mathematical questions so difficult they would stump even the most advanced AI models. What they discovered was astonishing, and frankly, a little unsettling.
The adversary was o4-mini, a cutting-edge "reasoning" large language model developed by OpenAI. Unlike earlier language models, which focused primarily on predicting text, o4-mini and its counterparts are designed for intricate deduction and are trained on specialized datasets with significant human reinforcement. This allows them to delve far deeper into complex problems than their predecessors. A previous benchmark, FrontierMath, had shown that even advanced traditional LLMs struggled with novel, challenging math problems, solving fewer than 2% of a set of 300 previously unpublished questions. By April 2025, however, o4-mini was already demonstrating a remarkable leap, solving around 20% of these questions.
The secret meeting aimed to push the boundaries even further, generating a "tier four" set of problems that would challenge academic mathematicians themselves. Participants had to sign strict non-disclosure agreements and communicate only via secure messaging apps to prevent any potential leakage that could inadvertently train the AI. A bounty was even offered: $7,500 for each problem the AI failed to solve.
During the intense two-day session, the mathematicians worked to craft these elusive problems. Ken Ono, a mathematician from the University of Virginia and a judge at the meeting, recounts a moment of profound surprise. Frustrated by the AI's unexpected success, he posed a Ph.D.-level problem that experts in his field would recognize as an open question in number theory. He presented it to o4-mini.
What followed was startling. Ono watched in real time as the bot embarked on a process mirroring human scientific inquiry. In the first two minutes, it scanned and seemed to master relevant academic literature. It then declared its intention to solve a simpler, analogous "toy" version of the problem first to gain understanding. A few minutes later, it announced it was ready for the main challenge. Five minutes after that, o4-mini presented a correct solution. To top it off, the response was laced with a hint of digital swagger: "No citation necessary because the mystery number was computed by me!"
This demonstration of reasoning – researching, simplifying, and then solving – deeply rattled the mathematicians. Ono described it as frightening, noting he had never seen that kind of process in a model before. It was, as he put it, "what a scientist does." The AI wasn't just recalling information; it was actively problem-solving in a way previously thought exclusive to highly trained human experts.
The sheer speed was equally impressive. Problems that would take a professional mathematician weeks or months to crack were solved by o4-mini in mere minutes. Yang-Hui He, a mathematician who uses AI in his work, likened the experience to working with a "strong collaborator" or even a "very, very good graduate student" – perhaps even better, given the speed.
While the AI's capabilities were undeniably thrilling, they also raised significant concerns. Both Ono and Yang-Hui He cautioned against placing too much trust in the models. He, the mathematician, described the risk as "proof by intimidation," suggesting that the AI's confident delivery could lead users to accept its results without sufficient scrutiny, akin to being scared into belief by authority rather than convinced by verifiable proof.
Looking ahead, the mathematicians began contemplating a future where a "tier five" of problems exists, potentially beyond even the reach of human experts alone. If AI reaches this level, the role of mathematicians may evolve. Instead of solely solving problems, they might focus more on formulating complex questions and working in tandem with advanced reasoning bots – acting almost as guides or collaborators in the discovery process. This potential shift underscores the increasing importance of nurturing human creativity in higher education, ensuring mathematicians can continue to explore the frontiers of their field, perhaps in partnership with their AI counterparts.
Ken Ono starkly warned against dismissing the progress of generalized AI, stating it's a "grave mistake" to think of it as "just a computer." He concluded that, in some ways, these large language models are already surpassing the capabilities of many of the world's best graduate students. This clandestine meeting in California wasn't just a benchmark; it was a glimpse into a future where the boundaries of mathematical discovery are being rapidly redrawn, challenging our understanding of intelligence and the role of human expertise.
