Comment on Bewildered enthusiasts decry memory price increases of 100% or more — the AI RAM squeeze is finally starting to hit PC builders where it hurts
Looks like I’m running an AMD Ryzen 5 2600 CPU, AMD Radeon RX 570 GPU, and 32GB RAM
4GB VRAM
Mmmmm… I would wait a few days, and try a GGUF quantization of Kimi Linear once its better supported: huggingface.co/…/Kimi-Linear-48B-A3B-Instruct
Otherwise you can mess with Qwen 3 VL now, in the native llama.cpp UI: huggingface.co/…/Qwen3-VL-30B-A3B-Instruct-UD-Q4_…
If you’re interested, I can work out an optimal launch command. But to be blunt, with that setup, you’re kinda better off using free LLM APIs with a local chat UI.
Thanks for the info. I would like to run locally if possible, but I’m not opposed to using API and just limiting what I surface.
brucethemoose@lemmy.world 2 days ago
Mmmmm… I would wait a few days, and try a GGUF quantization of Kimi Linear once its better supported: huggingface.co/…/Kimi-Linear-48B-A3B-Instruct
Otherwise you can mess with Qwen 3 VL now, in the native llama.cpp UI: huggingface.co/…/Qwen3-VL-30B-A3B-Instruct-UD-Q4_…
If you’re interested, I can work out an optimal launch command. But to be blunt, with that setup, you’re kinda better off using free LLM APIs with a local chat UI.
Passerby6497@lemmy.world 1 day ago
Thanks for the info. I would like to run locally if possible, but I’m not opposed to using API and just limiting what I surface.