DeepSeek might not be such good news for energy after all

Submitted ⁨⁨1⁩ ⁨year⁩ ago⁩ by ⁨Aatube@kbin.melroy.org⁩ to ⁨technology@lemmy.world⁩

http://archive.today/2025.02.03-000924/https://www.technologyreview.com/2025/01/31/1110776/deepseek-might-not-be-such-good-news-for-energy-after-all/

Overall, when tested on 40 prompts, DeepSeek was found to have a similar energy efficiency to the Meta model, but DeepSeek tended to generate much longer responses and therefore was found to use 87% more energy.

source

Comments

Sort:hotnew top

misk@sopuli.xyz ⁨1⁩ ⁨year⁩ ago
That’s kind of a weird benchmark. Wouldn’t you want a more detailed reply? How is quality measured? I thought the biggest technical feats here were ability to run reasonably well in a constrained memory settings and lower cost to train (and less energy used there).

source
- jacksilver@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Longer!=Detailed
  
  Generally what they’re calling out is that DeepSeek currently rambles more. With LLMs the challenge is how to get the right answer most sussinctly because each extra word is a lot of time/money.
  
  That being said, I suspect that really it’s all roughly the same. We’ve been seeing this back and forth with LLMs for a while and DeepSeek, while using a different approach, doesn’t really break the mold.
  
  source
- wewbull@feddit.uk ⁨1⁩ ⁨year⁩ ago
  This is more about the “reasoning” aspect of the model where it outputs a bunch of “thinking” before the actual result. In a lot of cases it easily adds 2-3x onto the number of tokens needed to be generated. This isn’t really useful output. It the model getting into a state where it can better respond.
  
  source
Viri4thus@feddit.org ⁨1⁩ ⁨year⁩ ago
The FUD is hilarious. Even an llm would tell you the article compares apples and oranges… FFS.

source
vk6flab@lemmy.radio ⁨1⁩ ⁨year⁩ ago
And here I thought that the energy consumption was in the training.

source
JohnDClay@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
The original claims of energy efficiency came from mixing up the energy usage of their much smaller model with their big model I think.

source
- peanuts4life@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
  This article is comparing apples to oranges here. The deepseek R1 model is a mixture of experts, reasoning model with 600 billion parameters, and the meta model is a dense 70 billion parameter model without reasoning which preforms much worse.
  
  They should be comparing deepseek to reasoning models such as openai’s O1. They are comparable with results, but O1 cost significantly more to run. It’s impossible to know how much energy it uses because it’s a closed source model and openai doesn’t publish that information, but they charge a lot for it on their API.
  
  Tldr: It’s a bad faith comparison. Like comparing a train to a car and complaining about how much more diesel the train used on a 3 mile trip between stations.
  
  source
thechaoticchicken@lemmy.world ⁨1⁩ ⁨year⁩ ago
So the answer, as always, is ban useless, power-sucking, unreliable, copyright-infringing AI.

Why do I feel that this will never happen? That we will continue to use this horrible technology to the end of our days with people constantly making excuses for its existence? Ugh.

source
- simple@lemm.ee ⁨1⁩ ⁨year⁩ ago
  
  So the answer, as always, is ban useless, power-sucking, unreliable, copyright-infringing AI.
  
  That’s just naive. It’s way too late for any of that. If some country decided to ban AI, all the engineers will just move somewhere else.
  
  source
- ShepherdPie@midwest.social ⁨1⁩ ⁨year⁩ ago
  Everyone is making way too much money off of this for a blanket ban to ever happen.
  
  source
Taalnazi@lemmy.world ⁨11⁩ ⁨months⁩ ago
A bit flawed. What if the same prompts are used but both models are required to keep their responses equally brief?

source