Comment on Nvidia Sales Jump 56%, a Sign the A.I. Boom Isn’t Slowing Down
brucethemoose@lemmy.world 2 days ago
The power usage is massively overstated, and a meme perpetuated by Altman so he’ll get more money for ‘scaling’.
GPT-5 is already proof that scaling without innovation doesn’t work. And tech in the pipeline, like bitnet, is coming to disrupt that even more; the future is small, specialized, augmented models, mostly running locally on your phone/PC because they’re so cheap and low power.
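For readers unfamiliar with bitnet: the core idea is squeezing weights down to ternary values {-1, 0, +1} (about 1.58 bits each) so inference needs far less memory and power. Here is a minimal sketch of the absmean quantization scheme described for BitNet b1.58, using NumPy; the function names are mine, and this is an illustration of the idea, not the actual training-time implementation:

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to {-1, 0, +1} with one per-tensor scale.

    Follows the absmean scheme: divide by the mean absolute weight,
    round, then clip to the ternary range.
    """
    scale = float(np.mean(np.abs(w))) + 1e-8  # epsilon avoids divide-by-zero
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q.astype(np.int8), scale

def dequantize(w_q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the ternary weights."""
    return w_q.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.02, size=(4, 4)).astype(np.float32)
    w_q, scale = absmean_ternary_quantize(w)
    print(w_q)   # every entry is -1, 0, or +1; the matrix now fits in int8
    print(dequantize(w_q, scale))
```

The power argument follows from the storage: a ternary matmul can be done with additions and sign flips instead of full floating-point multiplies, which is why people expect it to run cheaply on phones and laptops.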
frezik@lemmy.blahaj.zone 2 days ago
Except none of these companies are making money. Like almost literally none. We’re about three years into the LLM craze, and nobody has figured out how to turn a profit. Hell, forget profit, not bleeding through prodigious piles of cash would be a big deal.
brucethemoose@lemmy.world 2 days ago
Nods vigorously.
The future of LLMs is basically unprofitable for the actual AI companies. We are in a hell of a bubble, which I can’t wait to pop so I can pick up a GPU at liquidation prices (or at least rent one cheap).
That doesn’t mean power usage is the issue, though. In fact, the sheer inefficiency of OpenAI/Grok and such seems like a nail in their coffins.
frezik@lemmy.blahaj.zone 2 days ago
Power usage is what’s sucking up the cash. What else could it be? Not all of these companies are building out datacenters the way OpenAI is; they built what they have, and are now trying to make money on it.
The companies charging for AI are charging about as much as buyers are willing to pay, but that’s orders of magnitude too little to cover their costs. The big cost is power usage.
brucethemoose@lemmy.world 2 days ago
On the training side, it’s mostly:
Paying devs to prepare the training runs with data, software architecture, frameworks, things like that.
Paying other devs to get the training to scale across 800+ nodes.
Building the data centers, where the construction and GPU hardware costs kind of dwarf power usage in the short term.
On the inference side:
Sometimes building an optimized deployment framework like DeepSeek’s, though many seem to use something off the shelf like SGLang.
Renting or deploying GPU servers individually. They don’t need to be networked at scale the way training does, with the highest end I’ve heard of (DeepSeek’s optimized framework) being around 18 servers. And again, the sticker price of the GPUs is the big cost here.
Developing tool use frameworks.
On both sides, the big players burn tons of money on Tech Bro “superstar” developers who, frankly, seem to tweet more than they develop interesting things.