Newer AI Coding Assistants Are Failing in Insidious Ways

⁨140⁩ ⁨likes⁩

Submitted ⁨⁨4⁩ ⁨months⁩ ago⁩ by ⁨brianpeiris@lemmy.ca⁩ to ⁨technology@lemmy.world⁩

https://spectrum.ieee.org/ai-coding-degrades


Comments


unexposedhazard@discuss.tchncs.de ⁨4⁩ ⁨months⁩ ago

“I use LLM-generated code extensively in my role as CEO of Carrington Labs, a provider of predictive-analytics risk models for lenders.”

Well you know where not to buy now…

- panda_abyss@lemmy.ca ⁨4⁩ ⁨months⁩ ago
  Oh yeah, I read that and thought “this has all the problems of the garden of forking paths.”
  
  Based on the description, the whole company is just p-hacking. There’s a reason why nobody uses stepwise feature selection; you’ve just replaced the evaluation/step mechanism with AI.
  
Feyd@programming.dev ⁨4⁩ ⁨months⁩ ago
Unexpected???

- TaviRider@reddthat.com ⁨4⁩ ⁨months⁩ ago
  Unexpected to AI true believers.
  
  - WanderingThoughts@europe.pub ⁨4⁩ ⁨months⁩ ago
    I’ve lived long enough to go through multiple AI winters. This is just business failure as usual.
    
- luciferofastora@feddit.org ⁨4⁩ ⁨months⁩ ago
  Unexpected in the same way as you don’t expect leopards to eat your face…
  
nightlily@leminal.space ⁨4⁩ ⁨months⁩ ago
Paint huffer surprised when other paint huffers are happy to accept any old solvent.

riskable@programming.dev ⁨4⁩ ⁨months⁩ ago
Correction: Newer versions of ChatGPT (GPT-5.x) are failing in insidious ways. The article doesn’t mention the other popular services or the dozens of open-source AI coding-assistant models (e.g. Qwen, gpt-oss).

The open-source stuff is amazing and gets better just as quickly as the big AI options. Yet it’s boring, so it doesn’t make the news.

- count_dongulus@lemmy.world ⁨4⁩ ⁨months⁩ ago
  That’s because OpenAI is in panic mode. They’re now spending their resources on making the LLM cheaper to operate and capable of injecting paid results.
  
atrielienz@lemmy.world ⁨4⁩ ⁨months⁩ ago
GIGO (garbage in, garbage out).

1Fuji2Taka3Nasubi@piefed.zip ⁨4⁩ ⁨months⁩ ago
Not failing, just Skynet doing its work.
