
ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

⁨334⁩ ⁨likes⁩

Submitted ⁨⁨5⁩ ⁨hours⁩ ago⁩ by ⁨Lifecoach5000@lemmy.world⁩ to ⁨technology@lemmy.world⁩

https://www.tomshardware.com/tech-industry/artificial-intelligence/chatgpt-got-absolutely-wrecked-by-atari-2600-in-beginners-chess-match-openais-newest-model-bamboozled-by-1970s-logic


Comments

  • Furbag@lemmy.world ⁨2⁩ ⁨minutes⁩ ago

    Can ChatGPT actually play chess now? Last I checked, it couldn’t remember more than 5 moves of history, so it wouldn’t be able to see the true board state and would make illegal moves, take its own pieces, materialize pieces out of thin air, etc.

  • Objection@lemmy.ml ⁨1⁩ ⁨hour⁩ ago

    Tbf, the article should probably mention the fact that machine learning programs designed to play chess blow everything else out of the water.

  • FMT99@lemmy.world ⁨5⁩ ⁨hours⁩ ago

    Does the author think ChatGPT is in fact an AGI? It’s a chatbot. Why would it be good at chess? It’s like saying an Atari 2600 running a dedicated chess program can beat Google Maps at chess.

    • spankmonkey@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      AI including ChatGPT is being marketed as super awesome at everything, which is why that and similar AI is being forced into absolutely everything and being sold as a replacement for people.

      Something marketed as AGI should be treated as AGI when proving it isn’t AGI.

      • pelespirit@sh.itjust.works ⁨5⁩ ⁨hours⁩ ago

        Not to help the AI companies, but why don’t they program them to look up math programs and outsource chess to other programs when they’re asked for that stuff? It’s obvious they’re shit at it, why do they answer anyway? It’s because they’re programmed by know-it-all programmers, isn’t it.

      • PixelatedSaturn@lemmy.world ⁨4⁩ ⁨hours⁩ ago

        I don’t think AI is being marketed as awesome at everything. It’s got obvious flaws. Right now it’s not good for stuff like chess, probably not even tic-tac-toe. It’s a language model, so it’s hard for it to calculate the playing field. But AI is in development; it might not need much to start playing chess.

    • Broken@lemmy.ml ⁨1⁩ ⁨hour⁩ ago

      I agree with your general statement, but in theory, since all ChatGPT does is regurgitate information back and a lot of chess is memorization of historical games and types, it might actually perform well. No, it can’t think, but it can remember everything, so at some point that might tip the results in its favor.

      • Eagle0110@lemmy.world ⁨6⁩ ⁨minutes⁩ ago

        Regurgitating an impression of, not regurgitating verbatim; that’s the problem here.

        Chess is 100% deterministic, so it falls flat.

    • suburban_hillbilly@lemmy.ml ⁨5⁩ ⁨hours⁩ ago

      Most people do. It’s just called AI in the media everywhere and marketing works. I think online folks forget that something as simple as getting a Lemmy account by yourself puts you into the top quintile of tech literacy.

    • adhdplantdev@lemm.ee ⁨2⁩ ⁨hours⁩ ago

      Articles like this are good because they expose the flaws in the AI and show that it can’t be trusted with complex multi-step tasks.

      They help people who think AI is close to a human see that it’s not, and that it’s missing critical functionality.

    • saltesc@lemmy.world ⁨4⁩ ⁨hours⁩ ago

      I like referring to LLMs as VI (Virtual Intelligence from Mass Effect) since they merely give the impression of intelligence but are little more than search engines. In the end all one is doing is displaying expected results based on a popularity algorithm. However they do this inconsistently due to bad data in and limited caching.

    • TowardsTheFuture@lemmy.zip ⁨5⁩ ⁨hours⁩ ago

      I think that’s generally the point: most people think ChatGPT is this sentient thing that knows everything and… no.

    • FartMaster69@lemmy.dbzer0.com ⁨3⁩ ⁨hours⁩ ago

      I mean, OpenAI seems to forget it isn’t.

    • x00z@lemmy.world ⁨3⁩ ⁨hours⁩ ago

      In all fairness, machine learning in chess engines is actually pretty strong.

      AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind).

      www.chess.com/terms/alphazero-chess-engine

  • vane@lemmy.world ⁨1⁩ ⁨hour⁩ ago

    It’s not that hard to beat a dumb 6-year-old whose only purpose is to mine your privacy to sell you ads or product-place some shit for you in the future.

  • anubis119@lemmy.world ⁨5⁩ ⁨hours⁩ ago

    A strange game. How about a nice game of Global Thermonuclear War?

    • ada@piefed.blahaj.zone ⁨5⁩ ⁨hours⁩ ago

      No thank you. The only winning move is not to play

    • Lifecoach5000@lemmy.world ⁨4⁩ ⁨hours⁩ ago

      Lmao! 🤣 that made me spit!!

    • Xanthobilly@lemmy.world ⁨4⁩ ⁨hours⁩ ago

      Image

    • MadMadBunny@lemmy.ca ⁨4⁩ ⁨hours⁩ ago

      Frak off, toaster

  • floofloof@lemmy.ca ⁨5⁩ ⁨hours⁩ ago

    ChatGPT the word prediction machine? Why would anyone expect it to be good at chess?

    • otp@sh.itjust.works ⁨5⁩ ⁨hours⁩ ago

      Because people who don’t know how to use a chatbot want to feel superior because they can count the number of "r"s in the word “strawberry”, lol

      • electricyarn@lemmy.world ⁨5⁩ ⁨hours⁩ ago

        Yeah, just because I can’t count the number of r’s in the word strawberry doesn’t mean I shouldn’t be put in charge of the US nuclear arsenal!

  • Lembot_0003@lemmy.zip ⁨5⁩ ⁨hours⁩ ago

    The Atari chess program can play chess better than the Boeing 747 too. And better than the North Pole. Amazing!

    • CarbonatedPastaSauce@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      Neither of those things are marketed as being artificially intelligent.

      • Lembot_0003@lemmy.zip ⁨4⁩ ⁨hours⁩ ago

        Marketers aren’t intelligent either, so I see no reason to listen to them.

    • SpaceNoodle@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      Are either of those marketed as powerful AI?

  • capuccino@lemmy.world ⁨5⁩ ⁨hours⁩ ago

    This made my day

    • hogmomma@lemmy.world ⁨2⁩ ⁨hours⁩ ago

      Get your booty on the floor tonight.

  • NotMyOldRedditName@lemmy.world ⁨2⁩ ⁨hours⁩ ago

    Okay, but could ChatGPT be used to vibe code a chess program that beats the Atari 2600?

    • GreenKnight23@lemmy.world ⁨3⁩ ⁨minutes⁩ ago

      no.

      the answer is always, no.

  • Nurse_Robot@lemmy.world ⁨5⁩ ⁨hours⁩ ago

    I’m often impressed at how good ChatGPT is at generating text, but I’ll admit it’s hilariously terrible at chess. It loves to manifest pieces out of thin air, or make absurd illegal moves, like jumping its king halfway across the board and claiming checkmate.

    • Blaster_M@lemmy.world ⁨3⁩ ⁨hours⁩ ago

      ChatGPT is playing Anarchy Chess

    • Lifecoach5000@lemmy.world ⁨4⁩ ⁨hours⁩ ago

      Yeah! I’ve loved watching GothamChess’s videos on these. They’ve always been good for a laugh.

  • Kolanaki@pawb.social ⁨5⁩ ⁨hours⁩ ago

    There was a chess game for the Atari 2600? :O

    I wanna see them W I D E pieces.

    • FMT99@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      Prepare to be delighted. Full disclosure, my Atari isn’t hooked up and also I don’t have the Video Chess cart even if it was, so this was fetched from Google Images.

      Image

      • NotMyOldRedditName@lemmy.world ⁨2⁩ ⁨hours⁩ ago

        I’m annoyed the pieces are bottom adjusted…

      • Kolanaki@pawb.social ⁨5⁩ ⁨hours⁩ ago

        Those are some funky looking knights lol

      • homesweethomeMrL@lemmy.world ⁨4⁩ ⁨hours⁩ ago

        Can confirm.

        And if you play it on expert mode, you can leave for college and get your degree before it’s your turn again.

    • youngalfred@lemm.ee ⁨5⁩ ⁨hours⁩ ago

      Here you go (online emulator): www.retrogames.cz/play_716-Atari2600.php

      • over_clox@lemmy.world ⁨4⁩ ⁨hours⁩ ago

        WTF? I just played that just long enough for my queen to take over their queen, and it turned my queen into a rook?

        Is that even a legit rule in any variation of chess rules?

    • over_clox@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      I wasn’t aware of that either, now I’m kinda curious to try to find it in my 512 Atari 2600 ROMs archive…

  • muntedcrocodile@lemm.ee ⁨5⁩ ⁨hours⁩ ago

    This isn’t the strength of GPT-4o; the model has been optimised for tool use as an agent. That’s why it’s so good at image gen relative to their other models: it uses tools to construct an image piece by piece, similar to a human. Also, probably poor system prompting. An LLM is not a universal thinking machine, it’s a universal process machine. An LLM understands the process and uses tools to accomplish the process, hence its strengths in writing code (especially as an agent).

    It’s similar to how a monkey is infinitely better at remembering a sequence of numbers than a human could ever be, but is totally incapable of even comprehending writing down numbers.

    • cheese_greater@lemmy.world ⁨4⁩ ⁨hours⁩ ago

      Do you have a source for that re:monkeys memorizing numerical sequences? What do you mean by that?

      • RememberTheEnding@lemmy.world ⁨4⁩ ⁨hours⁩ ago

        www.youtube.com/watch?v=MKvX9PPmI-Q

      • shalafi@lemmy.world ⁨4⁩ ⁨hours⁩ ago

        That threw me as well.

  • krigo666@lemmy.world ⁨5⁩ ⁨hours⁩ ago

    Next, pit ChatGPT against 1K Chess on a ZX81.

  • IsaamoonKHGDT_6143@lemmy.zip ⁨5⁩ ⁨hours⁩ ago

    They used ChatGPT 4o, instead of using o1 or o3.

    Obviously it was going to fail.

  • Asswardbackaddict@lemmy.world ⁨4⁩ ⁨hours⁩ ago

    While you guys suck at using tools, I’m making up for my lack of coding experience with AI, and successfully simulating the behavior of my aether (fuck you guys. Your search for a static ether is irrelevant to how mine behaves, and you shouldn’t have dismissed everybody from Diogenes to Einstein), showing soliton-like structure emergence and particle-like interactions (with 1D relativistic restraints [I’m gonna need a fucking supercomputer to scale to 3D]). Anyways, whether you’re wrong about your latest fun fact, cutting your thumb off trying to split a 2x4, or believing any idiot you talk to, this is user error, bro. Creating functional code for my simulations has saved me months, if not years, of my life. Just setting up a GUI was ridiculous for a novice like me, let alone translating walls of relativistic equation results (mainly stress-energy tensor) into code a computer can use.

  • seven_phone@lemmy.world ⁨2⁩ ⁨hours⁩ ago

    You say you produce good oranges but my machine for testing apples gave your oranges a very low score.
