and assuming that improvement doesn’t plateau, ever,
Comment on Large Language Model Performance Doubles Every 7 Months
technocrit@lemmy.dbzer0.com 1 week ago
Classic pseudo-science. Vague definitions, sloppy measurements, extremely biased, etc.
catty@lemmy.world 1 week ago
Eranziel@lemmy.world 1 week ago
That graph is hilarious. Enormous error bars, totally arbitrary quantization of complexity, and its title? “Task time for a human that an AI model completes with a 50 percent success rate”. 50 percent success is useless, lmao.
On a more sober note, I’m very disappointed that IEEE is publishing this kind of trash.
vrighter@discuss.tchncs.de 1 week ago
In yes/no type questions, a 50% success rate is the absolute worst one can do. Any worse and you’re just giving an inverted correct answer more than half the time, so flipping every answer would put you back above 50%.
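To make the inversion point concrete, here is a minimal sketch. The ten questions and the 30%-accurate predictor are made up purely for illustration, and the helper names `accuracy` and `invert` are hypothetical, not taken from the article or its benchmark.

```python
def accuracy(predictions, truth):
    """Fraction of yes/no answers that match the true answers."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def invert(predictions):
    """Flip every yes/no answer."""
    return [not p for p in predictions]


# Made-up ground truth for ten yes/no questions (illustrative only).
truth = [True, False, True, True, False, False, True, False, True, False]

# Hypothetical predictor that is right on the first 3 questions and wrong on the other 7.
bad = truth[:3] + [not t for t in truth[3:]]

print(accuracy(bad, truth))          # below 50%
print(accuracy(invert(bad), truth))  # above 50% after flipping every answer
```

Any predictor with accuracy p on binary questions turns into one with accuracy 1 - p once its answers are flipped, which is why 50% is the floor for yes/no questions specifically. That argument does not carry over directly to open-ended tasks, where there is no single "opposite" answer to flip to.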