The ARC Prize organization designs benchmarks built around tasks that humans complete easily but that are difficult for AIs such as LLMs, “reasoning” models, and agentic frameworks.
ARC-AGI-3 is the first fully interactive benchmark in the ARC-AGI series. ARC-AGI-3 represents hundreds of original turn-based environments, each handcrafted by a team of human game designers. There are no instructions, no rules, and no stated goals. To succeed, an AI agent must explore each environment on its own, figure out how it works, discover what winning looks like, and carry what it learns forward across increasingly difficult levels.
Previous ARC-AGI benchmarks predicted and tracked major AI breakthroughs, from reasoning models to coding agents. ARC-AGI-3 points to what’s next: the gap between AI that can follow instructions and AI that can genuinely explore, learn, and adapt in unfamiliar situations.
You can try the tasks yourself here: arcprize.org/arc-agi/3
Here is the current leaderboard for ARC-AGI-3, using state-of-the-art models:
- OpenAI GPT-5.4 High - 0.3% success rate at $5.2K
- Google Gemini 3.1 Pro - 0.2% success rate at $2.2K
- Anthropic Opus 4.6 Max - 0.2% success rate at $8.9K
- xAI Grok 4.20 Reasoning - 0.0% success rate at $3.8K
Figure: ARC-AGI 3 Leaderboard (logarithmic cost on the horizontal axis)
RustyShackleford@piefed.social 3 weeks ago
As a psychiatrist, I have a theory about what’s missing in AI. First, it lacks childhood dependency and attachments. Second, it struggles to overcome repeated pain and suffering. Third, it lacks regular eating and restroom breaks. Fourth, it struggles to accept loss in everyday situations. Finally, it lacks the concept of our inevitable death. Without these nagging memories and concepts, machines will simply revert to the simpler concepts we use them for in our recent times, such as stealing cryptocurrency. After all, we live in a world run by capitalism, so it’s only logical. ¯\_(ツ)_/¯
CosmicTurtle0@lemmy.dbzer0.com 3 weeks ago
As a technologist, I have to remind everyone that AI is not intelligence. It’s a word prediction/statistical machine. It’s guessing at a surprisingly good rate what words follow the words before it.
It’s math. All the way down.
We as humans have simply taken these words and have said that it is “intelligence”.
unpossum@sh.itjust.works 3 weeks ago
As another technologist, I have to remind everyone that unless you subscribe to some rather fringe theories, humans are also based on standard physics.
Which is math. All the way down.
silverneedle@lemmy.ca 3 weeks ago
As someone who knows a thing or two about biology I think LLMs strip away >90% of what makes animals think.
msage@programming.dev 3 weeks ago
Are you anthropomorphizing a word suggester into a being experiencing things?
MagicShel@lemmy.zip 3 weeks ago
The major thing AI lacks is continuous parallel “prompting” through a variety of channels including sensory, biofeedback, and introspection / meta-thought about internal state and thinking.
AI currently transforms a given input into an output. However, it cannot accept new input in the middle of an output. It can’t evaluate the quality of its own reasoning except through trial and error.
If you had 1000 AIs operating in tandem and fed a continuous stream of prompts in the form of pictures, text, meta-inspection, and perhaps a simulation of biomechanical feedback with the right configuration, I think it might be possible to create a system that is a hell of an approximation of sentience. But it would be slow and I’m not sure the result would be any better than a human — you’d introduce a lot of friction to the “thought” process. And I have to assume the energy cost would be pretty enormous.
In the end it would be a cool experiment to be part of, but I doubt that version would be worth the investment.
ExFed@programming.dev 3 weeks ago
It could also be that it lacks the machinery to feel any emotions at all. You don’t (normally) have to train people to be afraid of bears or heights or loneliness or boredom. You also don’t (normally) have to train people to have empathy or compassion.
I argue that our obsession with AI is, itself, a misalignment with our environment; it disproportionately tickles psychological reward centers which evolved under unrecognizably different circumstances.
Havoc8154@mander.xyz 2 weeks ago
I guess you don’t have children.
You absolutely do have to train them to be afraid of bears, heights, and every fucking thing you can imagine. You absolutely do have to teach them empathy and compassion. There may be some nugget of instinct, but without reinforcement it might as well not exist.
2xsaiko@discuss.tchncs.de 3 weeks ago
So what are you implying about people who don’t experience these?