Comment on Can I run ollama on RTX 3060 and Intel iGPU to increase speed?
theunknownmuncher@lemmy.world 3 weeks ago
Models are computed sequentially (the output of each layer is the input to the next layer in the sequence), so splitting a model across more GPUs does not make a single request generate any faster.
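To make the sequential point concrete, here is a rough toy model in Python. It is not how ollama actually schedules work; the function name, the layer costs, and the device speeds are all made-up assumptions, and it ignores the cost of copying data between devices. It only illustrates that when each layer has to wait for the previous one, the devices take turns instead of working at the same time.

```python
def single_request_latency(layer_costs, device_speeds):
    """Toy latency of one forward pass with layers split across devices.

    layer_costs: hypothetical work per layer (arbitrary units).
    device_speeds: hypothetical relative speed of each device.
    Layers are split into contiguous chunks, one chunk per device.
    """
    num_devices = len(device_speeds)
    per_device = -(-len(layer_costs) // num_devices)  # ceiling division
    total = 0.0
    for i, speed in enumerate(device_speeds):
        chunk = layer_costs[i * per_device:(i + 1) * per_device]
        # Each chunk needs the previous chunk's output, so the times
        # add up instead of overlapping.
        total += sum(chunk) / speed
    return total


layer_costs = [1.0] * 8  # 8 layers, 1 unit of work each (made-up)

fast_only = single_request_latency(layer_costs, [1.0])
two_equal = single_request_latency(layer_costs, [1.0, 1.0])
fast_plus_slow = single_request_latency(layer_costs, [1.0, 0.25])

print(fast_only, two_equal, fast_plus_slow)  # 8.0 8.0 20.0
```

In this toy model two equally fast devices take exactly as long as one, and pairing a fast device with a much slower one (like a discrete card plus an iGPU) makes the single request slower, because half the layers now run on the slow device.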
jeena@piefed.jeena.net 3 weeks ago
I see, that's a shame, thanks for explaining it.
Blue_Morpho@lemmy.world 3 weeks ago
You can. But I don’t think it will help.
medium.com/…/llm-multi-gpu-batch-inference-with-a…