Comment on Bewildered enthusiasts decry memory price increases of 100% or more — the AI RAM squeeze is finally starting to hit PC builders where it hurts

brucethemoose@lemmy.world ⁨2⁩ ⁨days⁩ ago

> 4GB VRAM

Mmmmm… I would wait a few days and try a GGUF quantization of Kimi Linear once it’s better supported: huggingface.co/…/Kimi-Linear-48B-A3B-Instruct

Otherwise you can mess with Qwen 3 VL now, in the native llama.cpp UI: huggingface.co/…/Qwen3-VL-30B-A3B-Instruct-UD-Q4_…

If you’re interested, I can work out an optimal launch command. But to be blunt, with that little VRAM you’re kinda better off using free LLM APIs with a local chat UI.
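For a rough idea of what such a launch looks like, here’s a sketch of a low-VRAM llama.cpp server invocation for a MoE model (the model path is a placeholder, and the context size and layer/expert split are guesses you’d tune for your own GPU, not a worked-out optimal command):

```shell
# Sketch only: model.gguf is a placeholder path, and the numeric
# values below are assumptions to tune per machine, not a recipe.
llama-server \
  -m model.gguf \
  -c 8192 \            # modest context window to limit KV-cache VRAM use
  -ngl 99 \            # offload all layers to the GPU...
  --n-cpu-moe 40       # ...but keep most MoE expert tensors in system RAM
```

The trick for MoE models on small GPUs is that flag combination: offload everything, then push the bulky expert weights back to CPU so only the dense layers and KV cache compete for the 4GB.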
