blue_zephyr@lemmy.world 1 year ago
This paper is pretty unbelievable to me in the literal sense. First of all they couldn’t even bother to check for simple spelling mistakes. Second, all they’re doing is asking whether a number is prime or not and then extrapolating the results to be representative of solving math problems.
But most importantly I don’t believe for a second that the same model with a few adjustments over a 3 month period would completely flip performance on any representative task. I suspect there’s something seriously wrong with how they probe the models.