Comment on The AI bubble is so big it's propping up the US economy (for now)
brucethemoose@lemmy.world 3 days ago
Open models are going to kick the stool out. Hopefully.
GLM 4.5 is already #2 on LM Arena, above Grok and ChatGPT, and runnable on homelab rigs, yet it has just 32B active parameters, which is mad. Extrapolate that out even a little, and it’s just a race to the bottom.
dubyakay@lemmy.ca 3 days ago
I did not understand half of what you’ve written. But what do I need to get this running on my home PC?
brucethemoose@lemmy.world 2 days ago
I am referencing this: z.ai/blog/glm-4.5
The full GLM? Basically a 3090 or 4090 plus a budget EPYC CPU, or maybe two GPUs on a Threadripper system.
GLM Air? That would work on a desktop with 16GB+ of VRAM; just slap in 96GB+ (maybe 64GB?) of fast RAM. Or the recent Framework Desktop, or any mini PC/laptop with the 128GB Ryzen AI Max 395 config.
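Rough napkin math on why that’s enough (my ballpark figures, not official specs: ~355B total/32B active params for full GLM 4.5, ~106B/12B for Air, and ~4.5 bits per weight for a Q4-class GGUF quant):

```python
# Back-of-envelope sizing for quantized MoE weights.
# Assumed figures, not from the thread: GLM 4.5 ~355B total / 32B
# active params, GLM 4.5 Air ~106B / 12B, ~4.5 bits/weight (Q4-class).

BITS_PER_WEIGHT = 4.5

def quant_size_gb(params_billions: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return params_billions * 1e9 * BITS_PER_WEIGHT / 8 / 1e9

for name, total_b, active_b in [("GLM 4.5", 355, 32), ("GLM 4.5 Air", 106, 12)]:
    print(f"{name}: ~{quant_size_gb(total_b):.0f} GB of weights total, "
          f"~{quant_size_gb(active_b):.0f} GB touched per token")
```

The full set of weights sits in system RAM, but only the active experts’ worth of data gets read per token, which is why one consumer GPU plus a pile of RAM is workable.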
You’d download the weights, quantize yourself if needed, and run them in ik_llama.cpp (which should get support imminently).
github.com/ikawrakow/ik_llama.cpp/
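Something like this, as a rough sketch; the repo name, file name, and flags below are my guesses, not a tested recipe:

```python
# Sketch: fetch a GGUF quant from Hugging Face, then launch
# ik_llama.cpp's server with the MoE expert tensors kept in system RAM.
import subprocess
from huggingface_hub import snapshot_download

model_dir = snapshot_download(
    repo_id="unsloth/GLM-4.5-Air-GGUF",  # hypothetical quant repo
    allow_patterns=["*Q4_K_M*"],         # pull just one quant size
)

subprocess.run([
    "./llama-server",           # built from ik_llama.cpp
    "-m", f"{model_dir}/GLM-4.5-Air-Q4_K_M.gguf",
    "-ngl", "99",               # offload everything that fits to the GPU
    "-ot", "exps=CPU",          # tensor override: keep experts in system RAM
    "-c", "8192",               # context length
])
```

Then point any OpenAI-compatible client at the local server.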
tomkatt@lemmy.world 3 days ago
You can probably just use Ollama and import the model.
brucethemoose@lemmy.world 2 days ago
It’s going to be slow as molasses on Ollama. It needs a better runtime, and GLM 4.5 probably isn’t supported at the moment anyway.
WorldsDumbestMan@lemmy.today 3 days ago
I’m running Qwen 3B and it is seldom useful
brucethemoose@lemmy.world 2 days ago
It’s too small.
IDK what your platform is, but have you tried Qwen3 A3B? Or SmallThinker 21B?
huggingface.co/…/SmallThinker-21BA3B-Instruct
The speed should be somewhat similar.
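If you want to try it, here’s roughly what it looks like with llama-cpp-python (the model file name is assumed; grab whatever Qwen3 30B-A3B quant fits your hardware):

```python
# Sketch: run a small MoE model locally with llama-cpp-python.
# Only ~3B params are active per token, so it's fast despite 30B total.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3-30B-A3B-Q4_K_M.gguf",  # hypothetical local quant file
    n_gpu_layers=-1,  # offload as many layers as fit on the GPU
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain MoE in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```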