Comment on Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

FrankFrankson@lemmy.world 3 weeks ago

That is how much individual testing humans cost when you buy them in bulk.
