Comment on Frustratingly bad at self hosting. Can someone help me access LLMs on my rig from my phone

brucethemoose@lemmy.world 20 hours ago

At the risk of getting more technical, ik_llama.cpp has a fantastic built-in web UI:

github.com/ikawrakow/ik_llama.cpp/

(screenshot of the ik_llama.cpp web UI)
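For the "from my phone" part: the web UI is plain HTTP, so any device on your network can open http://<rig-ip>:<port> in a browser once the server is bound to 0.0.0.0. Here's a rough sketch of hitting it programmatically too, assuming the fork keeps llama.cpp's usual OpenAI-compatible endpoint (the address and port below are placeholders for your rig):

```python
# Minimal sketch: query a llama.cpp-style server on the LAN from any device.
# Assumes the server was started with --host 0.0.0.0 and exposes the usual
# OpenAI-compatible /v1/chat/completions endpoint.
# 192.168.1.50:8080 is a placeholder for your rig's LAN address and port.
import json
import urllib.request

RIG = "http://192.168.1.50:8080"

payload = {
    "messages": [{"role": "user", "content": "Say hello from my phone."}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    f"{RIG}/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])
```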

Getting more technical, it's also way better than ollama. You can run far smarter models on the same hardware than ollama can manage.

For reference, I’m running GLM-4 (667 GB of raw weights) on a single RTX 3090/Ryzen gaming rig, at reading speed, with pretty low quantization distortion.
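Rough back-of-the-envelope math on why that fits (the bits-per-weight figure below is just an illustrative assumption, not a specific quant):

```python
# Back-of-the-envelope memory math for running a huge model on a 24 GB GPU.
# BITS_QUANT is an illustrative assumption, not a quant recommendation.
RAW_GB = 667          # raw 16-bit weights
BITS_RAW = 16
BITS_QUANT = 3.2      # assumed average bits per weight after quantization

params_b = RAW_GB * 8 / BITS_RAW          # ~333 billion parameters
quant_gb = params_b * BITS_QUANT / 8      # ~133 GB of quantized weights

print(f"~{params_b:.0f}B params -> ~{quant_gb:.0f} GB quantized")
# For MoE models (the usual case at this size), the bulky expert tensors can
# sit in system RAM while the always-active layers fit in the 3090's 24 GB,
# which is what makes reading-speed generation on a gaming rig plausible.
```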

And if you want a ‘look this up on the internet for me’ assistant (which these models need to be truly useful), you’ll need to run another Docker project as well.
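The pattern those projects implement is roughly: run a web search, stuff the results into the prompt, and let the local model write the answer. A hand-wavy sketch, where web_search() is a stand-in for whatever search backend that separate project provides and the endpoint is the same placeholder as above:

```python
# Rough shape of a 'look this up for me' assistant. web_search() is a
# hypothetical stand-in for a separately hosted search service; the
# endpoint and address are placeholders, not real defaults.
import json
import urllib.request

RIG = "http://192.168.1.50:8080"

def web_search(query: str) -> str:
    # Placeholder: in practice a separate self-hosted service handles this,
    # not the LLM server itself.
    raise NotImplementedError

def ask_with_search(question: str) -> str:
    context = web_search(question)
    payload = {
        "messages": [
            {"role": "system",
             "content": "Answer using the search results below.\n" + context},
            {"role": "user", "content": question},
        ],
    }
    req = urllib.request.Request(
        f"{RIG}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```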

…That’s just how LLM self-hosting is now. It’s simply too hardware-intensive to be easy. You can indeed host a small LLM without much understanding, but it’s going to be pretty dumb.

source