Newer AI Coding Assistants Are Failing in Insidious Ways

137 likes

Submitted 2 days ago by brianpeiris@lemmy.ca to technology@lemmy.world

https://spectrum.ieee.org/ai-coding-degrades

Comments

  • unexposedhazard@discuss.tchncs.de 1 day ago

    “I use LLM-generated code extensively in my role as CEO of Carrington Labs, a provider of predictive-analytics risk models for lenders.”

    Well, now you know where not to buy…

    • panda_abyss@lemmy.ca 23 hours ago

      Oh yeah, I read that and thought, “this has all the problems of the garden of forking paths.”

      Based on the description, the whole company is just p-hacking. There’s a reason why nobody uses stepwise feature selection; you just replaced the evaluation/step mechanism with AI.

  • Feyd@programming.dev 1 day ago

    Unexpected???

    • TaviRider@reddthat.com 1 day ago

      Unexpected to AI true believers.

      • WanderingThoughts@europe.pub 1 day ago

        I’ve lived long enough to go through multiple AI winters. This is just business failure as usual.

    • luciferofastora@feddit.org 1 day ago

      Unexpected in the same way as you don’t expect leopards to eat your face…

  • nightlily@leminal.space 1 day ago

    Paint huffer surprised when other paint huffers are happy to accept any old solvent.

  • riskable@programming.dev 1 day ago

    Correction: Newer versions of ChatGPT (GPT-5.x) are failing in insidious ways. The article doesn’t mention the other popular services or the dozens of open-source coding-assistant models (e.g. Qwen, gpt-oss, etc.).

    The open-source models are amazing and get better just as quickly as the big AI options. Yet they’re boring, so they don’t make the news.

    • count_dongulus@lemmy.world 1 day ago

      That’s because OpenAI is in panic mode. They’re now spending their resources on making the LLM cheaper to operate and capable of injecting paid results.

  • atrielienz@lemmy.world 1 day ago

    GIGO (garbage in, garbage out).

  • 1Fuji2Taka3Nasubi@piefed.zip 1 day ago

    Not failing, just Skynet doing its work.
