ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Submitted 5 months ago by homesweethomeMrL@lemmy.world to retrogaming@lemmy.world

https://www.tomshardware.com/tech-industry/artificial-intelligence/chatgpt-got-absolutely-wrecked-by-atari-2600-in-beginners-chess-match-openais-newest-model-bamboozled-by-1970s-logic


Comments


MadMadBunny@lemmy.ca ⁨5⁩ ⁨months⁩ ago
Attempting to badly quote someone on another post: « How can people honestly think a glorified word-autocomplete function could understand what a logarithm is? »

- Ephera@lemmy.ml ⁨5⁩ ⁨months⁩ ago
  You can make external tools available to the LLM and then provide it with instructions for when/how to use them.
  So, for example, you’d describe to it that if someone asks it about math or chess, then it should generate JSON text according to a given schema and generate the command text to parametrize a script with it. The script can then e.g. make an API call to Wolfram Alpha or call into Stockfish or whatever.
  
  This isn’t going to be 100% reliable. For example, there’s a decent chance of the LLM fucking up when generating the relatively big JSON you need for describing the entire state of the chessboard, especially with general-purpose LLMs which are configured to introduce some amount of randomness in their output.
  
  But well, in particular, ChatGPT just won’t have the instructions built-in for calling a chess API/program, so for this particular case, it is likely as dumb as auto-complete. It will likely have a math API hooked up, though, so it should be able to calculate a logarithm through such an external tool. Of course, it might still not understand when to use a logarithm, for example.
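A minimal sketch of the tool-calling setup described above, assuming the model has been told to answer with JSON of the form `{"tool": ..., "args": ...}`. The schema, the tool names, and the `dispatch` function are hypothetical, and the chess tool is only a stub standing in for a real engine call. The point is the error path: when the model emits truncated JSON or a malformed board, the wrapper script has to catch it.

```python
import json
import math


def tool_log(args):
    """Compute a logarithm; stands in for a call to a real math service."""
    return math.log(args["x"], args.get("base", math.e))


def tool_chess(args):
    """Stub for handing a position to a chess engine.

    A real setup might pass the FEN string on to an engine; this sketch
    only checks that the board field describes 8 ranks.
    """
    ranks = args["fen"].split()[0].split("/")
    if len(ranks) != 8:
        raise ValueError("malformed board: expected 8 ranks")
    return "position accepted"


# Hypothetical registry of tools the model is allowed to request.
TOOLS = {"log": tool_log, "chess": tool_chess}


def dispatch(model_output):
    """Parse the model's text and run the requested tool, or report an error."""
    try:
        request = json.loads(model_output)
        tool = TOOLS[request["tool"]]
        return {"ok": True, "result": tool(request["args"])}
    except Exception as err:
        # Invalid JSON, an unknown tool, or bad arguments: treat any failure
        # as a rejected tool call instead of crashing the wrapper script.
        return {"ok": False, "error": f"{type(err).__name__}: {err}"}


if __name__ == "__main__":
    print(dispatch('{"tool": "log", "args": {"x": 8, "base": 2}}'))
    print(dispatch('{"tool": "log", "args": {"x": 8'))  # truncated JSON
```

The larger the JSON the model has to produce (a full board state, say), the more often the second branch is likely to fire, which is the reliability problem mentioned above.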
  
Electricblush@lemmy.world ⁨5⁩ ⁨months⁩ ago
This is so stupid and pointless…

“Thing not made to solve specific task fails against thing made for it…”

This is like saying that a really old hand-pushed lawn mower is better than an SUV at cutting grass…

- SpaceNoodle@lemmy.world ⁨5⁩ ⁨months⁩ ago
  SUVs aren’t marketed as grass mowers. LLMs are marketed as AI with all the answers.
  
  - otp@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
    I’d be interested in seeing marketing of ChatGPT as a competitive boardgame player. Is there any?
    
  - SchmidtGenetics@lemmy.world ⁨5⁩ ⁨months⁩ ago
    Source?
    
  - arararagi@ani.social ⁨5⁩ ⁨months⁩ ago
    Hear hear.
    
daniskarma@lemmy.dbzer0.com ⁨5⁩ ⁨months⁩ ago
My €2 calculator obliterates a €200,000 Ferrari at multiplication.

Xanthobilly@lemmy.world ⁨5⁩ ⁨months⁩ ago
Image

Venus_Ziegenfalle@feddit.org ⁨5⁩ ⁨months⁩ ago
In other news: My toaster makes better toast than my vacuum.

- chonglibloodsport@lemmy.world ⁨5⁩ ⁨months⁩ ago
  If ChatGPT were marketed as a toaster nobody would bat an eye. The reason so many are laughing is because ChatGPT is marketed as a general intelligence tool.
  
  - Railcar8095@lemm.ee ⁨5⁩ ⁨months⁩ ago
    Do you have any OpenAI stuff (ad, interview, presentation…) that claims it’s AGI? Because I’ve never seen such a thing, only people hyping it for clicks and ad revenue.
    
- homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
  Your vacuum uses more power than a 150,000-person city just to clean an 8’ square rug?
  
  That does suck.
  
  Heh.
  
bridgeenjoyer@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
Is this just because gibbity couldn’t recognize the chess pieces? I’d love to believe this is true otherwise, love my 2600 haha.

- Stillwater@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
  At first it blamed its poor performance on the icons used, but then they switched to chess notation and it still failed hard.
  
  - bridgeenjoyer@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
    That is baffling
    
redsunrise@programming.dev ⁨5⁩ ⁨months⁩ ago
in other words, a hammer “got absolutely wrecked” by a handsaw in a board-halving competition

- Redkey@programming.dev ⁨5⁩ ⁨months⁩ ago
  When all you have (or you try to convince others that all they need) is a hammer, everything looks like a nail. I guess this shows that it isn’t.
  
- JeeBaiChow@lemmy.world ⁨5⁩ ⁨months⁩ ago
  Clearly you didn’t swing the hammer hard enough
  
- homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
  One of those Fisher-Price plastic hammers with the hole in the handle?
  
pinball_wizard@lemmy.zip ⁨5⁩ ⁨months⁩ ago
That’s on them for taking on the Atari 2600, where “the games don’t get older, they get better!”

arararagi@ani.social ⁨5⁩ ⁨months⁩ ago
Man, all these people coping. I thought ChatGPT was supposed to be a generic one able to do anything?

- homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
  It depends. Have you used one? If not - Yes! It does do . . . all the things.
  
  If you have used one, I’m sorry, that was incorrect. You simply need to pay for the upgraded subscription. Oh, and as a trusted insider now we can let you in on a secret - the next version of this thing is gonna be, like, wow! Boom shanka! Everyone else will be so far behind!
  
  - stormeuh@lemmy.world ⁨5⁩ ⁨months⁩ ago
    You know, when you put it like that, it sounds kind of like Scientology…
    
JeeBaiChow@lemmy.world ⁨5⁩ ⁨months⁩ ago
If LLMs are statistics-based, wouldn’t there be many, many more losing games than perfectly winning ones? It’s like Dr. Strange saying ‘this is the only way’.

- Railcar8095@lemm.ee ⁨5⁩ ⁨months⁩ ago
  It’s not even that. It’s not a chess AI or an AGI (which doesn’t exist). It will speak and pretend to play, but it has no memory of the exact position of the pieces, nor the capability to plan several steps ahead. For all intents and purposes, it’s like asking my toddler what time it is (she always says something that sounds like a time, but doesn’t understand the concept of hours or what the time is).
  
  The fact that somebody posted this on LinkedIn and not only wasn’t shamed out of his job but there are several articles about it is truly infuriating.
  
QueenHawlSera@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
True AI does not and will not exist

OsrsNeedsF2P@lemmy.ml ⁨5⁩ ⁨months⁩ ago
What happens if you ask ChatGPT to code you a chess AI though?

- 4am@lemm.ee ⁨5⁩ ⁨months⁩ ago
  It doesn’t work without 200 hours of un-fucking
  
- pedz@lemmy.ca ⁨5⁩ ⁨months⁩ ago
  It probably consumes as much energy as a family house for a day just to come up with that program. That’s what happens.
  
  In fact, I did a Google search and didn’t have any choice but to have an “AI” answer, even if I don’t want it. Here’s what it says:
  
  Each ChatGPT query is estimated to use around 10 times more electricity than a traditional Google search, with a single query consuming approximately 3 watt-hours, compared to 0.3 watt-hours for a Google search. This translates to a daily energy consumption of over half a million kilowatts, equivalent to the power used by 180,000 US households.
  
  - daniskarma@lemmy.dbzer0.com ⁨5⁩ ⁨months⁩ ago
    Average energy consumption for a family in the US is said to be around 30,000 Wh per day.
    
    At 3 Wh per query, it would take about 10,000 ChatGPT queries per day to equal that.
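A quick sketch of that arithmetic, using only the figures quoted in this thread (3 Wh per query, about 30,000 Wh per household per day). It also reads the quoted "half a million kilowatts" as 500,000 kWh per day, which is an assumption, since the quote gives a unit of power and not energy. The constant and function names are made up for illustration.

```python
# Figures taken from the comments above; none are independently verified here.
WH_PER_QUERY = 3.0                # quoted estimate per ChatGPT query
HOUSEHOLD_WH_PER_DAY = 30_000.0   # quoted average US household per day
QUOTED_DAILY_KWH = 500_000.0      # assumption: "kilowatts" means kWh per day


def queries_to_match_household():
    """Queries per day needed to equal one household's daily consumption."""
    return HOUSEHOLD_WH_PER_DAY / WH_PER_QUERY


def households_equivalent():
    """Households whose daily use adds up to the quoted daily total."""
    return QUOTED_DAILY_KWH * 1000 / HOUSEHOLD_WH_PER_DAY


if __name__ == "__main__":
    print(f"queries per household-day: {queries_to_match_household():,.0f}")
    print(f"household equivalents:     {households_equivalent():,.0f}")
```

Under these assumptions the quoted daily total works out to roughly 17,000 households, not the 180,000 in the AI answer, so at least one of the quoted numbers looks off by about a factor of ten.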
    
homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
clop - clop - clop - clop - clop - clop

. . .

*bloop*

. . .

[screen goes black for 20 minutes]

. . .

Hmmmmm.

clop - clop - clop - clop - clop - clop - clop - clop - clop - clop

*bloop*

- OsrsNeedsF2P@lemmy.ml ⁨5⁩ ⁨months⁩ ago
  Hey I don’t mean to ruin your day, but maybe you should Google what you just commented…
  
  - homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
    There is 100% no chance google knows what that is
    
- homesweethomeMrL@lemmy.world ⁨5⁩ ⁨months⁩ ago
  Little disappointed more people didn’t get this.
  