Comment on AI agents wrong ~70% of time: Carnegie Mellon study
jsomae@lemmy.ml 4 days ago
The problem is they are not i.i.d., so this doesn’t really work. It works a bit, which is in my opinion why chain-of-thought is effective (it gives the LLM a chance to posit a couple answers first). However, we’re already looking at “agents,” so they’re probably already doing chain-of-thought.
Knock_Knock_Lemmy_In@lemmy.world 4 days ago
Very fair comment. In my experience, even when increasing the temperature you get stuck in local minima.
I was just trying to illustrate how 70% failure rates can still be useful.
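The repeated-attempts argument behind that illustration can be sketched quickly. This assumes attempts are independent with a fixed 70% failure rate, which is exactly the i.i.d. assumption disputed above; real agent retries are correlated, so treat these numbers as an upper bound.

```python
# If each attempt fails independently with probability 0.7,
# the chance that n attempts ALL fail is 0.7**n, so the chance
# of at least one success is 1 - 0.7**n.
fail = 0.7
for n in (1, 3, 5, 10):
    p_success = 1 - fail ** n
    print(f"{n} attempts: P(at least one success) = {p_success:.3f}")
# 1 attempt -> 0.300, 3 -> 0.657, 5 -> 0.832, 10 -> 0.972
```

Correlated failures (the non-i.i.d. case) make all of these probabilities worse, which is the commenter's point.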