Comment on Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

floquant@lemmy.dbzer0.com 9 hours ago

Is this a gotcha? Not sure where you got the “visual” from, but yes, it is best human performance vs. best LLM performance.
