Comment on Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

tatterdemalion@programming.dev 3 days ago

LLMs might suck at this game, but I’m pretty sure DeepMind’s deep reinforcement learning AI could solve these easily.
