Comment on Nvidia delivers first Vera Rubin AI GPU samples to customers — 88-core Vera CPU paired with Rubin GPUs with 288 GB of HBM4 memory apiece

<- View Parent
TropicalDingdong@lemmy.world ⁨4⁩ ⁨days⁩ ago

Yeah I’ve read that before. I don’t necessarily agree with their framework. And even working within their framework, this article is about a challenge to their third bullet.

I’m just not quite ready to rule out the idea that if you can scale single models above a certain boundary, you’ll get a fundamentally different/ novel behavior. This is consistent with other networked systems, and somewhat consistent with the original performance leaps we saw (the ones I think really matter are ones from 2019-2023, its really plateaued since and is mostly engineering tittering at the edges). It genuinely could be that 8 in a MoE configuration with single models maxing out each one could actually show a very different level of performance. We just don’t know because we just can’t test that with the current generation of hardware.

Its possible there really is something “just around the corner”; possible and unlikely.

source
Sort:hotnewtop