PewDiePie has revealed that he trained his own AI model and claims it outperformed ChatGPT on a coding benchmark.
OpenAI wants to retire the leading AI coding benchmark—and the reasons reveal a deeper problem with how the whole industry measures itself.
Google just released its most capable Gemini 3.1 Pro AI model that beats all frontier models on Humanity's Last Exam and ...
Google has unveiled Gemini 3.1 Pro, an upgraded AI model that outperforms its predecessor and competitors on major logic and ...
Google says that its most advanced thinking model yet outperforms Claude and ChatGPT on Humanity's Last Exam and other key ...
MLCommons today released AILuminate, a new benchmark test for evaluating the safety of large language models. Launched in 2020, MLCommons is an industry consortium backed by several dozen tech firms.
Gemini 3.1 Pro promises a Google LLM capable of handling more complex forms of work.
OpenAI today detailed o3, its new flagship large language model for reasoning tasks. The model’s introduction caps off a 12-day product announcement series that started with the launch of a new ...
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Microsoft has unveiled a groundbreaking artificial intelligence model, ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results