Comment on Large Language Model Performance Doubles Every 7 Months
Eranziel@lemmy.world 1 week ago
That graph is hilarious. Enormous error bars, totally arbitrary quantization of complexity, and its title? “Task time for a human that an AI model completes with a 50 percent success rate”. 50 percent success is useless, lmao.
On a more sober note, I’m very disappointed that IEEE is publishing this kind of trash.
vrighter@discuss.tchncs.de 1 week ago
In yes/no-type questions, a 50% success rate is the absolute worst one can do. Any worse and you’re just giving an inverted correct answer more than half the time.
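To make the inversion point concrete, here is a rough Python sketch. It assumes a purely hypothetical answerer that gets each yes/no question right with a fixed probability; the function names and the simulation setup are made up for illustration and are not taken from the article or the graph.

```python
import random


def simulate_accuracy(p_correct, n_questions=10_000, seed=0):
    """Simulate a hypothetical yes/no answerer that is right with
    probability p_correct. Returns (accuracy, accuracy_if_inverted)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_questions):
        truth = rng.random() < 0.5  # the true yes/no answer
        # Give the true answer with probability p_correct, else the opposite.
        answer = truth if rng.random() < p_correct else not truth
        correct += answer == truth
    accuracy = correct / n_questions
    # Flipping every answer turns each miss into a hit and vice versa.
    return accuracy, 1 - accuracy


def best_achievable(p_correct):
    """Accuracy after optionally inverting every answer."""
    return max(p_correct, 1 - p_correct)


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        acc, inverted = simulate_accuracy(p)
        print(f"p={p}: accuracy={acc:.3f}, inverted={inverted:.3f}")
```

Under these assumptions, `best_achievable` never drops below 0.5: an answerer that is right 30% of the time becomes right about 70% of the time once every answer is flipped, so 50% is the floor for binary questions.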