Comment on Can I run ollama on RTX 3060 and Intel iGPU to increase speed?
theunknownmuncher@lemmy.world 2 months ago
Models are computed sequentially (the output of each layer is the input to the next layer in the sequence), so more GPUs do not offer any kind of performance benefit.
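To make the sequential point concrete, here is a rough toy sketch in Python. It is not how ollama schedules work internally; the function `forward_latency`, the per-layer times, and the transfer cost are all made-up values for illustration. It also assumes, generously, that the iGPU runs a layer as fast as the discrete GPU.

```python
# Toy model: each layer has to wait for the previous layer's output,
# so splitting layers across devices does not shorten one forward pass.

def forward_latency(layer_times, device_of_layer, transfer_cost=0.0):
    """Return the latency of one forward pass through layers run in order.

    layer_times: seconds each layer takes (hypothetical numbers)
    device_of_layer: device label per layer, e.g. "gpu0" or "igpu"
    transfer_cost: seconds added whenever activations move between devices
    """
    total = 0.0
    previous_device = None
    for seconds, device in zip(layer_times, device_of_layer):
        # Moving activations to a different device costs extra time.
        if previous_device is not None and device != previous_device:
            total += transfer_cost
        total += seconds
        previous_device = device
    return total


layer_times = [1.0, 1.0, 1.0, 1.0]  # hypothetical per-layer cost

# All four layers on the RTX 3060.
one_gpu = forward_latency(layer_times, ["gpu0"] * 4)

# Two layers on the RTX 3060, two on the iGPU, with a hypothetical copy cost.
two_devices = forward_latency(
    layer_times, ["gpu0", "gpu0", "igpu", "igpu"], transfer_cost=0.5
)

print(one_gpu, two_devices)
```

In this toy setup the split run is never faster than the single-GPU run, because only one device is busy at any moment and the copy between devices adds time on top. The sketch only covers a single request processed layer by layer; it does not model batching several requests at once.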
jeena@piefed.jeena.net 2 months ago
I see, that's a shame, thanks for explaining it.
Blue_Morpho@lemmy.world 2 months ago
You can. But I don’t think it will help.
medium.com/…/llm-multi-gpu-batch-inference-with-a…