You can use Vulkan fairly easily as long as you have 8GB of VRAM
I for one would enjoy triggering your unskippable cutscenes in setting up local CPU-based AI, if it can work on Linux with an older AMD card.
Don’t have funds for anything fancy, but I would be interested in playing around with it. Been wanting to get something like that set up for Home Assistant.
ag10n@lemmy.world 1 month ago
brucethemoose@lemmy.world 1 month ago
The key is which model, and how you run it, though.
For the really sparse models, you might be better off trying ik_llama.cpp, especially if you are targeting a ‘small’ quant.
Passerby6497@lemmy.world 5 weeks ago
Only got 4GB of VRAM, unfortunately
SabinStargem@lemmy.today 1 month ago
If you just want an easy way to set up AI on Windows or Linux, KoboldCPP is my recommendation for your backend. It supports the GGUF format, which allows you to use both RAM and VRAM simultaneously. It won’t be the fastest thing, but it is easy enough to set up, with a bundled GUI for prep and actual usage. Through the IP address it gives you, you can hook the backend into a frontend of your choice.
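As a rough illustration of that last step, here is a minimal sketch of querying a running KoboldCPP backend from a script instead of a frontend. It assumes the default port (5001) and the KoboldAI-style /api/v1/generate endpoint; check the address KoboldCPP prints on startup for your actual URL.

```python
# Minimal sketch: send a prompt to a locally running KoboldCPP backend.
# Assumption: default port 5001 and the /api/v1/generate endpoint.
import requests

KOBOLD_URL = "http://localhost:5001/api/v1/generate"  # assumed default address

payload = {
    "prompt": "Explain what a GGUF quant is in one sentence.",
    "max_length": 120,    # number of tokens to generate
    "temperature": 0.7,
}

resp = requests.post(KOBOLD_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```

Any frontend that speaks the KoboldAI API (or the OpenAI-compatible endpoint KoboldCPP also exposes) is doing essentially the same thing under the hood.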
brucethemoose@lemmy.world 1 month ago
Plenty of folks do AMD. A popular ‘homelab’ setup is 32GB AMD MI50s. Even Intel is fine these days!
But what’s your setup, precisely? CPU, RAM, and GPU.
afk_strats@lemmy.world 1 month ago
I have an MI50/7900 XTX gaming/AI setup at home which I use for learning and to test out different models. Happy to answer questions.
Passerby6497@lemmy.world 1 month ago
Looks like I’m running an AMD Ryzen 5 2600 CPU, AMD Radeon RX 570 GPU, and 32GB RAM
brucethemoose@lemmy.world 1 month ago
Mmmmm… I would wait a few days and try a GGUF quantization of Kimi Linear once it’s better supported: huggingface.co/…/Kimi-Linear-48B-A3B-Instruct
Otherwise you can mess with Qwen 3 VL now, in the native llama.cpp UI: huggingface.co/…/Qwen3-VL-30B-A3B-Instruct-UD-Q4_…
If you’re interested, I can work out an optimal launch command. But to be blunt, with that setup, you’re kinda better off using free LLM APIs with a local chat UI.
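Either way, the plumbing looks the same from the chat UI’s side. Here’s a minimal sketch, assuming an OpenAI-compatible endpoint: llama.cpp’s llama-server serves one at port 8080 by default, and a hosted free API would just use a different base URL and a real key.

```python
# Minimal sketch: one client, two possible backends. Point BASE_URL at a
# local llama-server (OpenAI-compatible /v1 API, assumed default port 8080)
# or at a hosted provider's endpoint; only the URL and API key change.
import requests

BASE_URL = "http://localhost:8080/v1"   # assumption: local llama-server default
API_KEY = "not-needed-locally"          # a hosted API would need a real key

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        # llama-server serves whatever model it was launched with; hosted
        # APIs require a real model name here.
        "model": "local",
        "messages": [{"role": "user", "content": "Hello from my RX 570 box!"}],
        "max_tokens": 128,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```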
Passerby6497@lemmy.world 5 weeks ago
Thanks for the info. I would like to run locally if possible, but I’m not opposed to using an API and just limiting what I surface.