Comment on The AI bubble is so big it's propping up the US economy (for now)
dubyakay@lemmy.ca 2 days ago
I did not understand half of what you’ve written. But what do I need to get this running on my home PC?
tomkatt@lemmy.world 1 day ago
You can probably just use ollama and import the model.
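A minimal sketch of what that looks like with the official `ollama` Python client (pip install ollama); the model tag is a placeholder, since GLM 4.5 may not be published in the ollama library under that name, or at all:

```python
# Minimal sketch using the official `ollama` Python client.
# The model tag below is an assumption; browse the ollama library
# for whatever the model you want is actually published as.
import ollama

ollama.pull("qwen3:8b")  # fetch the weights into ollama's local store
reply = ollama.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Hello from my home PC"}],
)
print(reply["message"]["content"])
```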
brucethemoose@lemmy.world 1 day ago
It’s going to be slow as molasses on ollama. It needs a better runtime, and GLM 4.5 probably isn’t supported at this moment anyway.
WorldsDumbestMan@lemmy.today 1 day ago
I’m running Qwen 3B and it is seldom useful
brucethemoose@lemmy.world 1 day ago
It’s too small.
IDK what your platform is, but have you tried Qwen3 A3B? Or smallthinker 21B?
huggingface.co/…/SmallThinker-21BA3B-Instruct
The speed should be similar, since both are MoE models with only ~3B parameters active per token.
WorldsDumbestMan@lemmy.today 1 day ago
Qwen3 8B, sorry, idiot spelling. I use it to talk through problems when I have no internet or have maxed out on Claude. I can rarely trust it with anything reasoning-related; it’s faster and easier to do most things myself.
brucethemoose@lemmy.world 1 day ago
I am referencing this: z.ai/blog/glm-4.5
The full GLM? Basically a 3090 or 4090 and a budget EPYC CPU. Or maybe 2 GPUs on a threadripper system.
GLM Air? Now this would work on a 16GB+ VRAM desktop, just slap in 96GB+ (maybe 64GB?) of fast RAM. Or the recent Framework desktop, or any mini PC/laptop with the 128GB Ryzen 395 config.
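To put rough numbers on that, here is some back-of-the-envelope sizing, assuming GLM Air’s published 106B-total / 12B-active parameter count and a ~4.5 bits/weight quant (both figures are my assumptions, not from the comment):

```python
# Back-of-the-envelope memory math, assuming GLM-4.5-Air is
# 106B total / 12B active parameters at a ~4.5 bits/weight quant.
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    # 1e9 params * (bits / 8) bytes each == params_billion * bits / 8 GB
    return params_billion * bits_per_weight / 8

print(f"full weights: ~{weights_gb(106, 4.5):.0f} GB")    # ~60 GB -> 64-96 GB RAM
print(f"active per token: ~{weights_gb(12, 4.5):.0f} GB")  # ~7 GB -> fits 16 GB VRAM
```

That arithmetic is why the MoE suits a “small GPU plus lots of RAM” box: the ~7 GB of active experts can live in VRAM while the rest of the weights sit in system memory.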
You’d download the weights, quantize yourself if needed, and run them in ik_llama.cpp (which should get support imminently).
github.com/ikawrakow/ik_llama.cpp/
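A sketch of that pipeline from Python, with the caveat that the repo id is a placeholder and the binary/flag names are taken from mainline llama.cpp; check ik_llama.cpp’s README for the fork’s exact equivalents and GLM support status:

```python
# Hypothetical end-to-end sketch: the repo id is a placeholder, and
# the tool/flag names come from mainline llama.cpp, not the fork.
import subprocess
from huggingface_hub import snapshot_download  # pip install huggingface_hub

# 1. Download a pre-quantized GGUF (or full weights, if you plan to
#    quantize yourself).
path = snapshot_download("someuser/GLM-4.5-Air-GGUF",
                         allow_patterns=["*Q4_K_M*"])

# 2. Optional self-quantization from an f16 GGUF:
# subprocess.run(["./llama-quantize", "glm-air-f16.gguf",
#                 "glm-air-q4_k_m.gguf", "Q4_K_M"], check=True)

# 3. Serve it, offloading as many layers as fit in VRAM and leaving
#    the rest in system RAM.
subprocess.run(["./llama-server",
                "-m", f"{path}/glm-air-q4_k_m.gguf",
                "--n-gpu-layers", "99",
                "--ctx-size", "8192"], check=True)
```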