Comment on [deleted]

brucethemoose@lemmy.world 6 days ago

One thing about Anthropic/OpenAI models is that they go off the rails over lots of conversation turns or long contexts. Like when they need to remember a lot of vending machine conversation, I guess.

A more objective look: arxiv.org/abs/2505.06120v1

Gemini is much better. TBH the only models I’ve seen that are half decent at this are:

But most models are overtuned for one-shots like "fix this table" or "write me a function", and their makers don't invest much in long-context performance because it's not very flashy.
