Comment on Those who are hosting on bare metal: What is stopping you from using Containers or VM's? What are you self hosting?

brucethemoose@lemmy.world 5 months ago

In my case it’s performance and sheer RAM usage.

GLM 4.5 needs like 112GB RAM and absolutely every megabyte of VRAM from the GPU. It simply cannot afford the overhead. I think containers may slow down CPU<->GPU transfers slightly, but don’t quote me on that.


kiol@lemmy.world 5 months ago
Can anyone confirm whether containers would actually impact CPU-to-GPU transfers?

- brucethemoose@lemmy.world 5 months ago
  To be clear, VMs absolutely have overhead, but Docker/Podman is the open question. It might be negligible.
  
  And this is a particularly weird scenario (since prompt processing literally has to shuffle 112GB over the PCIe bus for each batch). Most GPGPU apps aren’t so sensitive to that.
  
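For a rough sense of why this workload is so sensitive to the bus, here is a back-of-envelope sketch in Python. It takes the 112GB figure from the comment above and divides it by assumed link bandwidths. The bandwidth numbers are rounded theoretical one-direction peaks for x16 links, not measurements, and real sustained throughput is typically lower. The helper name `transfer_seconds` is made up for illustration. This says nothing about whether containers add overhead; it only shows how long the transfer takes at best.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Best-case time to move size_gb over a link sustaining bandwidth_gb_per_s."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


# Figure quoted in the thread: data shuffled over PCIe per batch.
MODEL_GB = 112.0

# ASSUMPTION: rounded theoretical one-direction peaks for x16 links.
# Real-world throughput is lower and depends on the hardware.
PCIE_X16_GB_PER_S = {
    "PCIe 3.0 x16": 16.0,
    "PCIe 4.0 x16": 32.0,
    "PCIe 5.0 x16": 64.0,
}

if __name__ == "__main__":
    for link, bandwidth in PCIE_X16_GB_PER_S.items():
        seconds = transfer_seconds(MODEL_GB, bandwidth)
        print(f"{link}: at least {seconds:.2f} s per batch")
```

Under these assumptions, each batch costs seconds of pure transfer time before any compute happens, so even a small percentage slowdown on the bus would be noticeable here in a way it would not be for most GPGPU apps.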