This is more about the “reasoning” aspect of the model, where it outputs a bunch of “thinking” before the actual result. In a lot of cases it easily adds 2-3x to the number of tokens generated. This isn’t really useful output; it’s the model getting into a state where it can respond better.
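To make the overhead concrete, here’s a rough sketch of how you could measure it, assuming the model wraps its chain of thought in `<think>…</think>` tags (as DeepSeek-R1 does) and using character counts as a crude stand-in for tokens:

```python
import re

def reasoning_overhead(response: str) -> float:
    """Ratio of the full output length to the final answer alone.

    Assumes <think>...</think> delimiters around the chain of thought
    (DeepSeek-R1 style); characters are a rough proxy for tokens.
    """
    answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    return len(response) / max(len(answer), 1)

# Toy response: a long "thinking" preamble followed by a short answer.
resp = "<think>" + "step " * 200 + "</think>The answer is 42."
print(f"{reasoning_overhead(resp):.1f}x output for the same answer")
```

The exact ratio depends on the prompt, but anything above 1x is tokens you pay for without seeing them in the answer.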
Comment on DeepSeek might not be such good news for energy after all
misk@sopuli.xyz 3 weeks ago
That’s kind of a weird benchmark. Wouldn’t you want a more detailed reply? How is quality measured? I thought the biggest technical feats here were the ability to run reasonably well in constrained memory settings and the lower cost to train (and less energy used there).
wewbull@feddit.uk 3 weeks ago
jacksilver@lemmy.world 3 weeks ago
Longer != Detailed
Generally what they’re calling out is that DeepSeek currently rambles more. With LLMs the challenge is how to get the right answer most succinctly, because each extra word is a lot of time/money.
That being said, I suspect that really it’s all roughly the same. We’ve been seeing this back and forth with LLMs for a while and DeepSeek, while using a different approach, doesn’t really break the mold.