brucethemoose@lemmy.world 1 month ago
To be clear, VMs absolutely have overhead, but whether Docker/Podman adds any is the real question. It might be negligible.
And this is a particularly weird scenario, since prompt processing literally has to shuffle 112 GB over the PCIe bus for each batch. Most GPGPU apps aren’t nearly so sensitive to that.
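For a rough sense of why that scenario is so bus-bound, here is a back-of-the-envelope sketch. The bandwidth figures are approximate theoretical x16 per-direction peaks that I am assuming for illustration, not measurements from this setup, and `transfer_seconds` is just a hypothetical helper name.

```python
# Approximate theoretical per-direction bandwidth of an x16 link, in GB/s.
# These are assumed round figures; real-world throughput is lower.
PCIE_X16_GB_PER_S = {
    "PCIe 3.0": 15.75,
    "PCIe 4.0": 31.5,
    "PCIe 5.0": 63.0,
}


def transfer_seconds(gigabytes, gb_per_s, efficiency=1.0):
    """Lower-bound time to move `gigabytes` over a link of `gb_per_s`.

    `efficiency` (0 < efficiency <= 1) scales the peak bandwidth down to
    whatever fraction you believe is actually achieved.
    """
    if gb_per_s <= 0:
        raise ValueError("gb_per_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return gigabytes / (gb_per_s * efficiency)


if __name__ == "__main__":
    data_gb = 112  # per-batch transfer size mentioned above
    for generation, bandwidth in PCIE_X16_GB_PER_S.items():
        seconds = transfer_seconds(data_gb, bandwidth)
        print(f"{generation} x16: at least {seconds:.1f} s per batch")
```

Even at the theoretical PCIe 4.0 x16 peak, that is roughly 3.6 seconds per batch spent purely on transfers, so any extra cost on that path would show up here long before it would in a typical GPGPU workload.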