Indeed, Ollama is going a shady route. github.com/ggml-org/llama.cpp/pull/11016#issuecom…
I started playing with RamaLama (the name is a mouthful), and it works great. The setup takes one or two extra steps, but I've achieved great performance, and the project makes good use of standards (OCI, Jinja, unmodified llama.cpp, from what I understand).
Go check it out; it's compatible with models from HF and Ollama too.
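For anyone curious, here's a rough sketch of what that looks like, based on my understanding of the RamaLama CLI (the model names are just examples; check `ramalama --help` for the exact syntax on your version):

```shell
# Install RamaLama (it runs models in OCI containers on top of llama.cpp).
# Either of these should work:
#   pip install ramalama
#   curl -fsSL https://ramalama.ai/install.sh | bash

# Pull and run a model from the Ollama registry (model name illustrative):
ramalama run ollama://tinyllama

# Or pull one straight from Hugging Face:
ramalama run huggingface://TinyLlama/TinyLlama-1.1B-Chat-v1.0-GGUF

# Serve a local REST endpoint instead of an interactive chat:
ramalama serve ollama://tinyllama
```

The `ollama://` and `huggingface://` prefixes are what give you the cross-registry compatibility mentioned above.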
Sims@lemmy.ml 9 months ago
Thanks for the Lemonade hint. For Ryzen AI: github.com/lemonade-sdk/lemonade (Linux = CPU only for now)
brucethemoose@lemmy.world 9 months ago
You can still use the IGP, which might be faster in some cases.