4GB VRAM
Mmmmm… I would wait a few days and try a GGUF quantization of Kimi Linear once it's better supported: huggingface.co/…/Kimi-Linear-48B-A3B-Instruct
Otherwise you can mess with Qwen 3 VL right now in the native llama.cpp UI: huggingface.co/…/Qwen3-VL-30B-A3B-Instruct-UD-Q4_…
If you’re interested, I can work out an optimal launch command (rough sketch below). But to be blunt, with that setup, you’re kinda better off using free LLM APIs with a local chat UI.
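For context, here’s roughly the kind of launch command I mean. A minimal sketch only, assuming a recent CUDA build of llama.cpp: the model path and filename are placeholders, and the `--n-cpu-moe` count is a guess you’d tune until VRAM is full.

```sh
# Rough sketch for a 4GB card, assuming a recent CUDA build of llama.cpp
# and that the Q4 GGUF is already downloaded (path/filename are placeholders).
llama-server \
  -m ~/models/Qwen3-VL-30B-A3B-Instruct-Q4.gguf \
  -c 8192 \
  -ngl 99 \
  --n-cpu-moe 48 \
  --host 127.0.0.1 --port 8080
# -ngl 99        offloads every layer to the GPU...
# --n-cpu-moe 48 ...but keeps the MoE expert tensors (the bulk of the
#                weights) in system RAM; 48 assumes all layers are MoE,
#                so lower it until your VRAM is actually used up
# -c 8192        modest context, leaves VRAM headroom for the KV cache
# For image input, add --mmproj with the model's matching mmproj GGUF.
# The built-in web UI is then at http://127.0.0.1:8080
```

The idea is that a 30B-A3B MoE model only activates about 3B parameters per token, so parking the expert tensors in system RAM keeps the hot path (attention weights plus KV cache) inside your 4GB.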
Passerby6497@lemmy.world 1 day ago
Thanks for the info. I would like to run locally if possible, but I’m not opposed to using an API and just limiting what I surface.