Comment on Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard

Fisch@lemmy.ml 7 months ago

Unfortunately, LLMs need a lot of VRAM. You could try koboldcpp: it runs on the CPU but lets you offload some layers onto the GPU. That way you might be able to stay within those 4 GB even with larger models.
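
Roughly what that offload split looks like, sketched with llama-cpp-python (the same llama.cpp backend that koboldcpp wraps; the model path and layer count below are placeholders) — koboldcpp exposes the equivalent as a "GPU Layers" setting in its launcher:

```python
# Sketch only: splitting a quantized GGUF model between CPU RAM and GPU VRAM.
# Shown with llama-cpp-python for illustration; koboldcpp uses the same backend.
# Raise n_gpu_layers until the offloaded layers nearly fill your ~4 GB of VRAM;
# the remaining layers run from system RAM on the CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="model.Q4_K_M.gguf",  # placeholder: any quantized GGUF model
    n_gpu_layers=20,                 # layers moved to VRAM; the rest stay on the CPU
    n_ctx=2048,                      # context length also affects memory use
)

print(llm("Hello, how are you?", max_tokens=64)["choices"][0]["text"])
```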
