Comment on Large Language Model Performance Doubles Every 7 Months

That graph is hilarious. Enormous error bars, totally arbitrary quantization of complexity, and its title? “Task time for a human that an AI model completes with a 50 percent success rate.” 50 percent success is useless, lmao.

On a more sober note, I’m very disappointed that IEEE is publishing this kind of trash.

vrighter@discuss.tchncs.de 10 months ago
In yes/no-type questions, a 50% success rate is the absolute worst one can do. Any worse and you’re just giving the inverse of the correct answer more than half the time, so flipping every answer would put you back above 50%.
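To make the arithmetic concrete, here is a minimal sketch. It assumes a purely binary question and that you are allowed to flip every answer; the function name `effective_accuracy` is made up for illustration. An answerer that is right with probability p is wrong with probability 1 - p, so inverting it yields accuracy 1 - p, and the better of the two options never drops below 0.5.

```python
def effective_accuracy(p: float) -> float:
    """Best accuracy obtainable from a yes/no answerer with raw accuracy p,
    assuming we may choose to invert every answer it gives."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    # Keeping the answers gives p; flipping them all gives 1 - p.
    return max(p, 1.0 - p)


if __name__ == "__main__":
    # A few raw accuracies, including one that is worse than a coin flip.
    for p in (0.3, 0.5, 0.9):
        print(f"raw={p:.2f} best={effective_accuracy(p):.2f}")
```

Note this only holds for two-way questions. On open-ended tasks with many possible wrong outcomes, there is no "inverse" answer to fall back on, so 50% is not a floor there.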
