Comment on Apple reveals M3 Ultra, taking Apple silicon to a new extreme

REDACTED@infosec.pub 4 weeks ago

Isn’t unified memory terrible for AI though? I doubt it even has the bandwidth of five-year-old VRAM.

KingRandomGuy@lemmy.world 4 weeks ago

This kind of setup is mostly used for inference with extremely large models, where a single GPU has far too little VRAM to even load the model into memory. Nobody expects it to be particularly fast; the point is to get the model running at all.

KoalaUnknown@lemmy.world 4 weeks ago

While GDDR7 VRAM obviously has far higher bandwidth, the sheer amount of unified memory can be a major advantage for some models.
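As a rough back-of-the-envelope illustration of why capacity matters (the 70B parameter count and byte-widths below are illustrative assumptions, not figures from the thread):

```python
# Rough memory-footprint estimate for just loading an LLM's weights
# (ignores activations and KV cache, which add more on top).

def weights_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB."""
    return num_params * bytes_per_param / 1024**3

params = 70e9  # hypothetical 70B-parameter model (assumption)

fp16 = weights_gib(params, 2.0)  # 16-bit weights: ~130 GiB
q4   = weights_gib(params, 0.5)  # 4-bit quantized: ~33 GiB

print(f"fp16: {fp16:.0f} GiB, 4-bit: {q4:.0f} GiB")
```

Even the 4-bit quantized version exceeds the 24 GB of a top consumer GPU, while a Mac configured with hundreds of GB of unified memory can hold the full fp16 weights — slowly, but at all.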