
This particular coding leaderboard matches my own experience. Llama 4 is hitting ~15%; Claude Opus 4 ~70% (I haven't used the others personally).
