Comment - FBXL Lotide

How are you running a 34B model without a GPU? You must be getting one token an hour! How much RAM do you have in the LLM box?


variety4me@lemmy.zip 3 weeks ago
It's an MoE model (en.wikipedia.org/wiki/Mixture_of_experts), so only about 3B parameters are actually active for any given token

I have 32GB RAM
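
For a rough sense of why that works on CPU, here is a back-of-the-envelope sketch. The 34B total, 3B active, and 32 GB figures come from this thread; the quantization level (about 4.5 bits per weight) and the memory bandwidth (50 GB/s) are assumptions picked for illustration, not measurements from my box. It also treats decoding as purely memory-bound and ignores the KV cache, so the speeds are upper bounds at best.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30


def decode_tokens_per_second_bound(
    active_params_billion: float,
    bits_per_weight: float,
    bandwidth_gb_per_s: float,
) -> float:
    """Upper bound on decode speed if each active weight is read from RAM once per token."""
    bytes_read_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_per_s * 1e9 / bytes_read_per_token


TOTAL_PARAMS_B = 34.0      # from the thread
ACTIVE_PARAMS_B = 3.0      # from the thread
RAM_GB = 32.0              # from the thread
BITS_PER_WEIGHT = 4.5      # assumption: a typical 4-bit-style quantization
BANDWIDTH_GB_PER_S = 50.0  # assumption: ballpark dual-channel desktop RAM

weights = weight_gib(TOTAL_PARAMS_B, BITS_PER_WEIGHT)
dense_bound = decode_tokens_per_second_bound(
    TOTAL_PARAMS_B, BITS_PER_WEIGHT, BANDWIDTH_GB_PER_S
)
moe_bound = decode_tokens_per_second_bound(
    ACTIVE_PARAMS_B, BITS_PER_WEIGHT, BANDWIDTH_GB_PER_S
)

print(f"weights: {weights:.1f} GiB (RAM available: {RAM_GB:.0f} GB)")
print(f"dense 34B bound:     {dense_bound:.1f} tokens/s")
print(f"MoE 3B-active bound: {moe_bound:.1f} tokens/s")
print(f"speedup from sparsity: {moe_bound / dense_bound:.1f}x")
```

All of the weights still have to fit in RAM, since any expert can be picked for any token. Only the per-token read traffic shrinks, which is where the speedup over a dense model of the same size comes from.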

cecilkorik@lemmy.ca 3 weeks ago
Obviously not what OP is using, but AMD X3D CPUs and Mac systems can be quite competitive for AI if you're short on VRAM. Not all CPUs struggle with inference, and some GPUs aren't so hot at it either. GPUs are generally better, especially the really high-end ones, but once you compare low- and mid-range cards against high-end CPUs, the picture starts to look a lot muddier.
