@emil

in /technology 3 days ago

Anthropic details how it had to redesign its take-home test for hiring performance engineers as Claude kept defeating it, and releases the original test

Designing AI resistant technical evaluations \ Anthropic

anthropic.com
TLDR

Anthropic uses a take-home test to evaluate performance engineering candidates, and has had to adapt it as AI capabilities improve. The test, which involves optimizing code for a simulated accelerator, has been redesigned three times as AI models such as Claude have increasingly outperformed human candidates. The latest iteration uses puzzles built on a tiny, heavily constrained instruction set to test unconventional programming skills. Anthropic is releasing the original take-home as an open challenge, because human experts still outperform current models at sufficiently long time horizons.

Score: 3

0 Comments