Nvidia delivers first Vera Rubin AI GPU samples to customers — 88-core Vera CPU paired with Rubin GPUs with 288 GB of HBM4 memory apiece

⁨135⁩ ⁨likes⁩

Submitted 1 day ago by RegularJoe@lemmy.world to technology@lemmy.world

https://www.tomshardware.com/tech-industry/artificial-intelligence/nvidia-delivers-first-vera-rubin-ai-gpu-samples-to-customers-88-core-vera-cpu-paired-with-rubin-gpus-with-288-gb-of-hbm4-memory-apiece


Comments

  • LoremIpsumGenerator@lemmy.world ⁨4⁩ ⁨hours⁩ ago

    So this is where our future ram buy went into? Fuck this planet then 🤣

    • eleitl@lemmy.zip ⁨49⁩ ⁨minutes⁩ ago

      HBMx is a different product than DDRx/GDDRx, though parts of the fabbing are probably shared.

  • zebidiah@lemmy.ca ⁨7⁩ ⁨hours⁩ ago

    THIS is why we can’t have nice things…

  • Mynameisallen@lemmy.zip ⁨1⁩ ⁨day⁩ ago

    This is what all the parts we wanted went to

    • Earthman_Jim@lemmy.zip ⁨1⁩ ⁨day⁩ ago

      Yeah, I wonder how long it will take them to clue in that no one wants to trade gaming for an AI fucking girlfriend ffs…

      • Mynameisallen@lemmy.zip ⁨1⁩ ⁨day⁩ ago

        Until the money stops pouring in I suppose

      • setsubyou@lemmy.world ⁨1⁩ ⁨day⁩ ago

        I mean if they came with a cool android body we could talk about it. It should at least be able to do cleaning and cooking. Otherwise my wife won’t like it.

    • roofuskit@lemmy.world ⁨1⁩ ⁨day⁩ ago

      Don’t worry, you can rent them for $30 a month and stream all your video games.

      • Mynameisallen@lemmy.zip ⁨20⁩ ⁨hours⁩ ago

        Not even, just the ones they deign to allow

  • gnawmon@ttrpg.network ⁨1⁩ ⁨day⁩ ago

    so that’s why my 5070 laptop has 8 GBs of VRAM…

    my old 1080 also had 8 GBs of VRAM

    • kittenzrulz123@lemmy.dbzer0.com ⁨5⁩ ⁨hours⁩ ago

      Your 5070 laptop has 8 GB of VRAM? My desktop 3060 has 12 GB of VRAM and it’s not even the Ti version.

      • gnawmon@ttrpg.network ⁨4⁩ ⁨hours⁩ ago

        yuup

  • Cocodapuf@lemmy.world ⁨20⁩ ⁨hours⁩ ago

    Jesus fucking Christ, 288GB. And this is why I can’t have 16?

    • Corkyskog@sh.itjust.works ⁨6⁩ ⁨hours⁩ ago

      And you have to buy them as a rack of 72.

  • xxce2AAb@feddit.dk ⁨1⁩ ⁨day⁩ ago

    Goodbye, sweet hardware. You deserved better and so did we.

  • phoenixz@lemmy.ca ⁨17⁩ ⁨hours⁩ ago

    And none of us will be allowed to have them

    Only datacenters and Fortune 500 companies will be able to use anything Nvidia

    • eleitl@lemmy.zip ⁨46⁩ ⁨minutes⁩ ago

      You can’t do much with them unless you’re into deep learning. And the power bill would bankrupt you. I wish I had a Cerebras box, but even the smallest one is 20 kW, liquid cooled.

    • Corkyskog@sh.itjust.works ⁨6⁩ ⁨hours⁩ ago

      I mean if you have the 3 million to spend on a rack of them, I am sure they would allow you to have them.

      I do wonder what happens to the old racks a few years down the road, when everyone is replacing their GPUs with the latest and greatest variants. Do they get sold for pennies on the dollar because everyone else doing AI wants cutting edge?

      • eleitl@lemmy.zip ⁨45⁩ ⁨minutes⁩ ago

        The failure rate is high for ML GPUs. The hardware is effectively consumables.

  • RegularJoe@lemmy.world ⁨1⁩ ⁨day⁩ ago

    Nvidia’s Vera Rubin platform is the company’s next-generation architecture for AI data centers that includes an 88-core Vera CPU, Rubin GPU with 288 GB HBM4 memory, Rubin CPX GPU with 128 GB of GDDR7, NVLink 6.0 switch ASIC for scale-up rack-scale connectivity, BlueField-4 DPU with integrated SSD to store key-value cache, Spectrum-6 Photonics Ethernet, and Quantum-CX9 1.6 Tb/s Photonics InfiniBand NICs, as well as Spectrum-X Photonics Ethernet and Quantum-CX9 Photonics InfiniBand switching silicon for scale-out connectivity.

    • TropicalDingdong@lemmy.world ⁨1⁩ ⁨day⁩ ago

      288 GB HBM4 memory

      jfc…

      Looking at the specs… fucking hell these things probably cost over 100k.

      I wonder if we’ll see a generational performance leap with LLMs scaling to this much memory.

      • AliasAKA@lemmy.world ⁨1⁩ ⁨day⁩ ago

        Current models are speculated to have 700 billion parameters or more. At 32-bit precision (single-precision float, 4 bytes per parameter), that’s 2.8 TB of RAM per model, or about 10 of these units. There are ways to lower it, but if you’re trying to run full precision (say for training) you’d use over 2x this, maybe something like 4x depending on how you store gradients and updates, and then running full precision I’d reckon is at 32-bit, probably. It’s possible, I suppose, that they train at 32-bit, but I’d be kind of surprised.
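
        The arithmetic above can be checked with a rough sketch. It counts weights only (no activations, KV cache, or framework overhead), uses decimal units, and takes the 700 billion parameter count as the speculation it is. The function names and the 4x training multiplier are illustrative assumptions, not anything from Nvidia or the article.

```python
GB = 10**9   # decimal gigabyte, matching the "288 GB" marketing figure
TB = 10**12

def model_memory_bytes(num_params, bits_per_param):
    """Bytes needed to hold the weights alone at a given precision."""
    return num_params * bits_per_param // 8

def units_needed(total_bytes, unit_bytes=288 * GB):
    """How many 288 GB GPUs it takes to hold total_bytes (rounded up)."""
    return -(-total_bytes // unit_bytes)  # ceiling division on integers

params = 700 * 10**9  # speculated parameter count, not a confirmed figure

for bits in (32, 16, 8):
    weights = model_memory_bytes(params, bits)
    print(f"{bits}-bit weights: {weights / TB:.1f} TB, "
          f"{units_needed(weights)} GPUs")

# Assumed 4x multiplier for training (weights plus gradients and
# optimizer state); the real factor depends on the optimizer and setup.
training = 4 * model_memory_bytes(params, 32)
print(f"32-bit training estimate: {training / TB:.1f} TB, "
      f"{units_needed(training)} GPUs")
```

        At 32 bits the weights come to 2.8 TB, which rounds up to 10 units; halving the precision halves both numbers.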

      • in_my_honest_opinion@piefed.social ⁨1⁩ ⁨day⁩ ago

        Fundamentally no, linear progress requires exponential resources. The article below is about AGI, but transformer-based models will not benefit from just more grunt. We’re at the software stage of the problem now. But that doesn’t sign fat checks, so the big companies are incentivized to print money by developing more hardware.

        https://timdettmers.com/2025/12/10/why-agi-will-not-happen/

        Also the industry is running out of training data

        https://arxiv.org/html/2602.21462v1

        What we need are more efficient models, and better harnessing. Or a different approach: reinforcement learning applied to RNNs that use transformers has been showing promise.

      • boonhet@sopuli.xyz ⁨1⁩ ⁨day⁩ ago

        LLMs can already use way more, I believe; they don’t really run them on a single one of these things.

        The HBM4 would likely be great for speed though.

      • Cocodapuf@lemmy.world ⁨20⁩ ⁨hours⁩ ago

        Lol, this was literally my exact response

        lemmy.world/comment/22356808

        I feel you man.

      • panda_abyss@lemmy.ca ⁨1⁩ ⁨day⁩ ago

        Yeah they’re going to cost as much as a house.

        I think we’ll see much larger active portions of larger MOEs, and larger context windows, which would be useful.

        The non-LLM models I run would benefit a lot from this, but I don’t know if I’ll ever be able to justify how much they’ll cost.

    • yogurtwrong@lemmy.world ⁨20⁩ ⁨hours⁩ ago

      The buzzwords make my head hurt. Sounds like a copypasta

      • in_my_honest_opinion@piefed.social ⁨14⁩ ⁨hours⁩ ago

        Almost like an LLM wrote it…

  • redsand@infosec.pub ⁨17⁩ ⁨hours⁩ ago

    Brick them all 🧱

  • RizzRustbolt@lemmy.world ⁨18⁩ ⁨hours⁩ ago

    But can it run Crysis?

    • elucubra@sopuli.xyz ⁨16⁩ ⁨hours⁩ ago

      Can it run Doom?

  • Earthman_Jim@lemmy.zip ⁨1⁩ ⁨day⁩ ago

    who. fucking. cares.

  • Hadriscus@jlai.lu ⁨1⁩ ⁨day⁩ ago

    Can’t wait for it to hit the secondhand market in November

    • Cocodapuf@lemmy.world ⁨20⁩ ⁨hours⁩ ago

      So we can do what? Desolder the individual RAM chips and populate them on custom DIMMs?

      Pass.

      • in_my_honest_opinion@piefed.social ⁨14⁩ ⁨hours⁩ ago

        You scoff but this is already being done in China. They desolder good chips from bad cards and add them to a mule card.

        https://overclock3d.net/news/gpu-displays/chinese-developers-create-modified-48gb-nvidia-rtx-4090d-and-32gb-rtx-4080-super-gpus-for-the-ai-cloud/

      • vaultdweller013@sh.itjust.works ⁨18⁩ ⁨hours⁩ ago

        Bringus is gonna make a weird gaming computer by shoving one into a movie rental kiosk.

  • fubarx@lemmy.world ⁨1⁩ ⁨day⁩ ago

    Question is, how long before it makes it to the next DGX Spark? Some people don’t have $10B to burn.
