User Benchmarks - Search News

Scale AI launches Voice Showdown, the first real-world benchmark for voice AI — and the results are humbling for some top models

The results, drawn from thousands of spontaneous voice conversations across more than 60 languages, reveal capability gaps ...

TechCrunch

A new AI benchmark tests whether chatbots protect human well-being

AI chatbots have been linked to serious mental health harms in heavy users, but there have been few standards for measuring whether they safeguard human well-being or just maximize for engagement. A ...

Semiconductor Engineering

The Problem With Benchmarks

What makes a good benchmark and who should create it? This is an issue the industry has been slow to address, but progress is being made. Benchmarks long have been used to compare products, but what ...
