Former Google engineer and influential AI researcher Francois Chollet is co-founding a non-profit organization to help develop benchmarks that will probe AI for “human-level” intelligence.
The non-profit, the ARC Prize Foundation, will be led by Greg Kamradt, a former Salesforce director of engineering and the founder of AI product studio Leverage. Kamradt will serve as chairman and member of the board.
“[W]e're growing into a proper non-profit foundation to act as a useful north star toward artificial general intelligence,” Chollet wrote in a post on the nonprofit's website. (Artificial general intelligence is a nebulous term, but it generally refers to AI that can perform most tasks that humans can.) “[W]e're trying to inspire progress by highlighting [the gap] in basic human capabilities.”
The ARC Prize Foundation will expand on ARC-AGI, a test developed by Chollet to assess whether an AI system can efficiently acquire new skills outside the data it was trained on. It consists of puzzle-like problems in which an AI has to generate the correct “answer” grid from a collection of different-colored squares. The problems were designed to force an AI to adapt to new problems it had not seen before.
Chollet introduced ARC-AGI, short for “Abstraction and Reasoning Corpus for Artificial General Intelligence,” in 2019. Many AI systems can pass Math Olympiad exams and come up with possible solutions to PhD-level problems. But as of this year, the best-performing AI could solve only a third of ARC-AGI's tasks.
“Unlike most frontier AI benchmarks, we're not trying to measure AI risk with superhuman test questions,” Chollet wrote in the post. “Future versions of the ARC-AGI benchmark will focus on shrinking [the human capability] gap toward zero.”
Last June, Chollet and Zapier co-founder Mike Knoop launched a competition to create an AI capable of besting ARC-AGI. OpenAI's unreleased o3 model was the first to achieve a qualifying score, but only with an extraordinary amount of computing power.
Chollet has made it clear that ARC-AGI has flaws (many models have been able to brute-force their way to high scores) and that he doesn't believe o3 possesses human-level intelligence.
“[E]arly data points suggest that the upcoming [successor to the ARC-AGI] benchmark will still pose a significant challenge to o3, potentially dropping its score below 30% even at high compute (while a smart human would still be able to score over 95% with no training),” Chollet said in a statement last December. “You'll know artificial general intelligence is here when the exercise of creating tasks that are easy for regular humans but difficult for AI becomes simply impossible.”
Knoop said the plan is to launch a new competition this year as well as a second-generation ARC-AGI benchmark. The non-profit organization will also begin designing the third version of ARC-AGI.
It remains to be seen how the ARC Prize Foundation addresses the criticism Chollet has faced for overselling ARC-AGI as a benchmark for reaching AGI. The very definition of AGI is hotly debated; an OpenAI staff member recently claimed that AGI has “already” been achieved if one defines AGI as AI “better than most humans at most tasks.”
Interestingly, OpenAI CEO Sam Altman said in December that the company intends to partner with the ARC-AGI team to develop future benchmarks. Chollet did not provide any updates on the potential partnership in today's announcement.