ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

⁨182⁩ ⁨likes⁩

Submitted ⁨⁨4⁩ ⁨days⁩ ago⁩ by ⁨homesweethomeMrL@lemmy.world⁩ to ⁨retrogaming@lemmy.world⁩

https://www.tomshardware.com/tech-industry/artificial-intelligence/chatgpt-got-absolutely-wrecked-by-atari-2600-in-beginners-chess-match-openais-newest-model-bamboozled-by-1970s-logic

Comments

  • MadMadBunny@lemmy.ca ⁨4⁩ ⁨days⁩ ago

    Attempting to badly quote someone from another post: « How can people honestly think a glorified word autocomplete function could understand what a logarithm is? »

    • Ephera@lemmy.ml ⁨4⁩ ⁨days⁩ ago

      You can make external tools available to the LLM and then provide it with instructions for when/how to use them.
      So, for example, you’d describe to it that if someone asks it about math or chess, then it should generate JSON text according to a given schema and generate the command text to parametrize a script with it. The script can then e.g. make an API call to Wolfram Alpha or call into Stockfish or whatever.

      This isn’t going to be 100% reliable. For example, there’s a decent chance of the LLM fucking up when generating the relatively big JSON you need for describing the entire state of the chessboard, especially with general-purpose LLMs which are configured to introduce some amount of randomness in their output.

      But well, in particular, ChatGPT just won’t have the instructions built-in for calling a chess API/program, so for this particular case, it is likely as dumb as auto-complete. It will likely have a math API hooked up, though, so it should be able to calculate a logarithm through such an external tool. Of course, it might still not understand when to use a logarithm, for example.
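
The tool-calling flow described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's actual API: the tool names (`logarithm`, `chess_move`), the JSON shape (`tool` plus `args`), and the function `dispatch_tool_call` are all assumptions made up for this example. The math tool uses Python's `math.log` as a stand-in for a service like Wolfram Alpha, and the chess tool only checks the FEN string's field count rather than calling Stockfish.

```python
import json
import math

# Hypothetical tool registry: these names and argument shapes are
# assumptions for illustration, not a real vendor API.
def tool_logarithm(args):
    # Stand-in for an external math service such as Wolfram Alpha.
    return math.log(float(args["value"]), float(args.get("base", math.e)))

def tool_chess_move(args):
    # Stand-in for handing a position to an engine such as Stockfish.
    # It only checks that the FEN string has the six expected fields.
    fen = args["fen"]
    if len(fen.split()) != 6:
        raise ValueError("malformed FEN")
    return "engine would be called with: " + fen

TOOLS = {"logarithm": tool_logarithm, "chess_move": tool_chess_move}

def dispatch_tool_call(raw_text):
    """Parse the model's JSON and route it to a tool, or report why not."""
    try:
        call = json.loads(raw_text)
        tool = TOOLS[call["tool"]]
        return {"ok": True, "result": tool(call["args"])}
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as err:
        # Model output is not guaranteed to be valid, so failures are
        # returned to the caller instead of raised.
        return {"ok": False, "error": f"{type(err).__name__}: {err}"}

if __name__ == "__main__":
    print(dispatch_tool_call('{"tool": "logarithm", "args": {"value": 8, "base": 2}}'))
    print(dispatch_tool_call('{"tool": "logarithm", "args": {'))  # truncated JSON
```

The error branch is the part that matters for the reliability point: a truncated or malformed JSON blob from the model is caught and reported, so the caller can retry or fall back instead of crashing.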

  • Electricblush@lemmy.world ⁨4⁩ ⁨days⁩ ago

    This is so stupid and pointless…

    “Thing not made to solve specific task fails against thing made for it…”

    This is like saying that a really old hand-pushed lawn mower is better than an SUV at cutting grass…

    • SpaceNoodle@lemmy.world ⁨4⁩ ⁨days⁩ ago

      SUVs aren’t marketed as grass mowers. LLMs are marketed as AI with all the answers.

      • otp@sh.itjust.works ⁨4⁩ ⁨days⁩ ago

        I’d be interested in seeing marketing of ChatGPT as a competitive boardgame player. Is there any?

      • arararagi@ani.social ⁨3⁩ ⁨days⁩ ago

        Hear hear.

      • SchmidtGenetics@lemmy.world ⁨4⁩ ⁨days⁩ ago

        Source?

  • daniskarma@lemmy.dbzer0.com ⁨3⁩ ⁨days⁩ ago

    My 2€ calculator obliterates a 200,000€ Ferrari at doing multiplications.

  • Venus_Ziegenfalle@feddit.org ⁨4⁩ ⁨days⁩ ago

    In other news: My toaster makes better toast than my vacuum.

    • chonglibloodsport@lemmy.world ⁨3⁩ ⁨days⁩ ago

      If ChatGPT were marketed as a toaster, nobody would bat an eye. The reason so many are laughing is that ChatGPT is marketed as a general intelligence tool.

      • Railcar8095@lemm.ee ⁨3⁩ ⁨days⁩ ago

        Do you have any OpenAI stuff (ad, interview, presentation…) that claims it’s AGI? Because I’ve never seen such a thing, only people hyping it for clicks and ad revenue.

    • homesweethomeMrL@lemmy.world ⁨3⁩ ⁨days⁩ ago

      Your vacuum uses more power than a 150,000-person city just to clean an 8’ square rug?

      That does suck.

      Heh.

  • Xanthobilly@lemmy.world ⁨4⁩ ⁨days⁩ ago

    [Image]

  • pinball_wizard@lemmy.zip ⁨3⁩ ⁨days⁩ ago

    That’s on them for taking on the Atari 2600, where “the games don’t get older, they get better!”

  • arararagi@ani.social ⁨3⁩ ⁨days⁩ ago

    Man, all these people coping. I thought ChatGPT was supposed to be a generic one able to do anything?

    • homesweethomeMrL@lemmy.world ⁨3⁩ ⁨days⁩ ago

      It depends. Have you used one? If not - Yes! It does do . . . all the things.

      If you have used one, I’m sorry, that was incorrect. You simply need to pay for the upgraded subscription. Oh, and as a trusted insider now we can let you in on a secret - the next version of this thing is gonna be, like, wow! Boom shanka! Everyone else will be so far behind!

      • stormeuh@lemmy.world ⁨3⁩ ⁨days⁩ ago

        You know, when you put it like that, it sounds kind of like Scientology…

  • JeeBaiChow@lemmy.world ⁨3⁩ ⁨days⁩ ago

    If LLMs are statistics-based, wouldn’t there be many, many more losing games than perfectly winning ones? It’s like Dr. Strange saying ‘this is the only way’.

    • Railcar8095@lemm.ee ⁨3⁩ ⁨days⁩ ago

      It’s not even that. It’s not a chess AI or an AGI (which doesn’t exist). It will speak and pretend to play, but it has no memory of the exact position of the pieces, nor the capability to plan several steps ahead. For all intents and purposes, it’s like asking my toddler what time it is (she always says something that sounds like a time, but doesn’t understand the concept of hours or what the time is).

      The fact that somebody posted this on LinkedIn and not only wasn’t shamed out of his job but there are several articles about it is truly infuriating.

  • bridgeenjoyer@sh.itjust.works ⁨4⁩ ⁨days⁩ ago

    Is this just because gibbity couldn’t recognize the chess pieces? I’d love to believe this is true otherwise, love my 2600 haha.

    • Stillwater@sh.itjust.works ⁨4⁩ ⁨days⁩ ago

      At first it blamed its poor performance on the icons used, but then they switched to chess notation and it still failed hard.

      • bridgeenjoyer@sh.itjust.works ⁨4⁩ ⁨days⁩ ago

        That is baffling

  • redsunrise@programming.dev ⁨4⁩ ⁨days⁩ ago

    In other words, a hammer “got absolutely wrecked” by a handsaw in a board-halving competition.

    • Redkey@programming.dev ⁨4⁩ ⁨days⁩ ago

      When all you have (or you try to convince others that all they need) is a hammer, everything looks like a nail. I guess this shows that it isn’t.

    • JeeBaiChow@lemmy.world ⁨3⁩ ⁨days⁩ ago

      Clearly you didn’t swing the hammer hard enough

    • homesweethomeMrL@lemmy.world ⁨4⁩ ⁨days⁩ ago

      One of those Fisher-Price plastic hammers with the hole in the handle?

  • OsrsNeedsF2P@lemmy.ml ⁨4⁩ ⁨days⁩ ago

    What happens if you ask ChatGPT to code you a chess AI though?

    • 4am@lemm.ee ⁨4⁩ ⁨days⁩ ago

      It doesn’t work without 200 hours of un-fucking

    • pedz@lemmy.ca ⁨4⁩ ⁨days⁩ ago

      It probably consumes as much energy as a family house for a day just to come up with that program. That’s what happens.

      In fact, I did a Google search and didn’t have any choice but to have an “AI” answer, even if I don’t want it. Here’s what it says:

      Each ChatGPT query is estimated to use around 10 times more electricity than a traditional Google search, with a single query consuming approximately 3 watt-hours, compared to 0.3 watt-hours for a Google search. This translates to a daily energy consumption of over half a million kilowatts, equivalent to the power used by 180,000 US households. 

      • daniskarma@lemmy.dbzer0.com ⁨3⁩ ⁨days⁩ ago

        Average daily energy consumption for a family in the US is said to be around 30,000 Wh.

        It would take about 10,000 ChatGPT queries per day to equal that.
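
The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted in this thread (3 Wh per ChatGPT query, 0.3 Wh per Google search, 30,000 Wh per household per day); none of them are independently verified here. Reading the quoted "half a million kilowatts" as 500,000 kWh per day is an assumption, since kilowatts are a unit of power and not of energy.

```python
# All inputs are figures quoted in the thread, not verified measurements.
WH_PER_CHATGPT_QUERY = 3.0
WH_PER_GOOGLE_SEARCH = 0.3
HOUSEHOLD_WH_PER_DAY = 30_000.0

# Assumption: the quoted "half a million kilowatts" per day is taken
# to mean 500,000 kWh of energy per day.
DAILY_KWH = 500_000.0

# How many times more energy one query uses than one search.
ratio = WH_PER_CHATGPT_QUERY / WH_PER_GOOGLE_SEARCH

# Queries needed to match one household's daily consumption.
queries_per_household_day = HOUSEHOLD_WH_PER_DAY / WH_PER_CHATGPT_QUERY

# Households whose daily consumption adds up to the quoted daily total.
households_equivalent = DAILY_KWH * 1000 / HOUSEHOLD_WH_PER_DAY

if __name__ == "__main__":
    print(f"query vs. search ratio: {ratio:.0f}x")
    print(f"queries per household-day: {queries_per_household_day:,.0f}")
    print(f"households matching the daily total: {households_equivalent:,.0f}")
```

With these inputs, one household-day works out to about 10,000 queries, and 500,000 kWh per day corresponds to roughly 16,700 households at 30,000 Wh each. That does not match the 180,000 households in the quoted answer, so at least one of the quoted inputs is probably off.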

  • homesweethomeMrL@lemmy.world ⁨4⁩ ⁨days⁩ ago

    clop - clop - clop - clop - clop - clop

    . . .

    *bloop*

    . . .

    [screen goes black for 20 minutes]

    . . .

    Hmmmmm.

    clop - clop - clop - clop - clop - clop - clop - clop - clop - clop

    *bloop*

    • homesweethomeMrL@lemmy.world ⁨3⁩ ⁨days⁩ ago

      Little disappointed more people didn’t get this.

    • OsrsNeedsF2P@lemmy.ml ⁨4⁩ ⁨days⁩ ago

      Hey I don’t mean to ruin your day, but maybe you should Google what you just commented…

      • homesweethomeMrL@lemmy.world ⁨4⁩ ⁨days⁩ ago

        There is 100% no chance Google knows what that is.