Comment on Bewildered enthusiasts decry memory price increases of 100% or more — the AI RAM squeeze is finally starting to hit PC builders where it hurts
You can use Vulkan fairly easily as long as you have 8 GB of VRAM.
blog.linux-ng.de/…/running-llms-with-llama-cpp-us…
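For anyone curious what that looks like in code: here's a minimal sketch, assuming the llama-cpp-python bindings compiled against llama.cpp's Vulkan backend (e.g. `CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python`). The model filename is hypothetical; any GGUF that fits your card works.

```python
# Assumes llama-cpp-python was installed with the Vulkan backend enabled:
#   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-q4_k_m.gguf",  # hypothetical model file
    n_gpu_layers=-1,  # offload all layers; a 7B Q4 quant fits in ~8 GB VRAM
    n_ctx=4096,       # context window; larger contexts need more memory
)

out = llm("Explain what the Vulkan backend does, in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

The nice part about Vulkan over CUDA/ROCm is that it runs on pretty much any recent GPU, including iGPUs, without vendor-specific toolkits.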
Only got 4 GB of VRAM, unfortunately.
brucethemoose@lemmy.world 1 month ago
The key is which model you pick, and how you run it, though.

For really sparse models, you might be better off trying ik_llama.cpp, especially if you are targeting a 'small' quant.
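ik_llama.cpp itself is a C++ fork of llama.cpp driven from its own CLI, and as far as I know it has no Python bindings, so here's a hedged sketch of the same low-VRAM idea (small quant plus partial layer offload) using mainline llama-cpp-python; the file name and layer count are illustrative guesses to tune for a 4 GB card.

```python
from llama_cpp import Llama

# A small quant (Q3/IQ3 class) plus partial offload is the usual way to
# squeeze a model onto a 4 GB card. The layer count is a starting guess:
# raise it until you run out of VRAM, then back off.
llm = Llama(
    model_path="model-iq3_xs.gguf",  # hypothetical small-quant GGUF
    n_gpu_layers=16,  # offload only some layers to the GPU
    n_ctx=2048,       # a smaller context also trims the KV-cache footprint
)

print(llm("Hello", max_tokens=32)["choices"][0]["text"])
```

The reason sparse (MoE) models are interesting here is that only a fraction of the weights are active per token, which is exactly the path forks like ik_llama.cpp optimize for at small quant sizes.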