Comment on The AI bubble is so big it's propping up the US economy (for now)
dubyakay@lemmy.ca 2 days ago
I did not understand half of what you’ve written. But what do I need to get this running on my home PC?
tomkatt@lemmy.world 1 day ago
You can probably just use ollama and import the model.
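A minimal sketch of what that looks like with the official `ollama` Python client (pip install ollama); the model tag is a placeholder, since GLM 4.5 may not be published in the ollama library under that name, or at all:

```python
# Minimal sketch using the official `ollama` Python client.
# The model tag below is an assumption; browse the ollama library
# for whatever the model you want is actually published as.
import ollama

ollama.pull("qwen3:8b")  # fetch the weights into ollama's local store
reply = ollama.chat(
    model="qwen3:8b",
    messages=[{"role": "user", "content": "Hello from my home PC"}],
)
print(reply["message"]["content"])
```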
brucethemoose@lemmy.world 1 day ago
It’s going to be slow as molasses on ollama. It needs a better runtime, and GLM 4.5 probably isn’t supported at this moment anyway.
WorldsDumbestMan@lemmy.today 1 day ago
I’m running Qwen 3B and it is seldom useful
brucethemoose@lemmy.world 1 day ago
It’s too small.
IDK what your platform is, but have you tried Qwen3 A3B? Or smallthinker 21B?
huggingface.co/…/SmallThinker-21BA3B-Instruct
The speed should be similar, since both are MoE models with only ~3B parameters active per token.
WorldsDumbestMan@lemmy.today 1 day ago
Qwen3 8B, sorry, idiot spelling. I use it to talk through problems when I have no internet or have maxed out on Claude. I can rarely trust it with anything reasoning-related; it’s faster and easier to do most things myself.
brucethemoose@lemmy.world 1 day ago
I am referencing this: z.ai/blog/glm-4.5
The full GLM? Basically a 3090 or 4090 and a budget EPYC CPU. Or maybe 2 GPUs on a threadripper system.
GLM Air? Now this would work on a 16GB+ VRAM desktop, just slap in 96GB+ (maybe 64GB?) of fast RAM. Or the recent Framework desktop, or any mini PC/laptop with the 128GB Ryzen 395 config.
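To put rough numbers on that, here is some back-of-the-envelope sizing, assuming GLM Air’s published 106B-total / 12B-active parameter count and a ~4.5 bits/weight quant (both figures are my assumptions, not from the comment):

```python
# Back-of-the-envelope memory math, assuming GLM-4.5-Air is
# 106B total / 12B active parameters at a ~4.5 bits/weight quant.
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    # 1e9 params * (bits / 8) bytes each == params_billion * bits / 8 GB
    return params_billion * bits_per_weight / 8

print(f"full weights: ~{weights_gb(106, 4.5):.0f} GB")    # ~60 GB -> 64-96 GB RAM
print(f"active per token: ~{weights_gb(12, 4.5):.0f} GB")  # ~7 GB -> fits 16 GB VRAM
```

That arithmetic is why the MoE suits a “small GPU plus lots of RAM” box: the ~7 GB of active experts can live in VRAM while the rest of the weights sit in system memory.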
You’d download the weights, quantize yourself if needed, and run them in ik_llama.cpp (which should get support imminently).
github.com/ikawrakow/ik_llama.cpp/
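A sketch of that pipeline from Python, with the caveat that the repo id is a placeholder and the binary/flag names are taken from mainline llama.cpp; check ik_llama.cpp’s README for the fork’s exact equivalents and GLM support status:

```python
# Hypothetical end-to-end sketch: the repo id is a placeholder, and
# the tool/flag names come from mainline llama.cpp, not the fork.
import subprocess
from huggingface_hub import snapshot_download  # pip install huggingface_hub

# 1. Download a pre-quantized GGUF (or full weights, if you plan to
#    quantize yourself).
path = snapshot_download("someuser/GLM-4.5-Air-GGUF",
                         allow_patterns=["*Q4_K_M*"])

# 2. Optional self-quantization from an f16 GGUF:
# subprocess.run(["./llama-quantize", "glm-air-f16.gguf",
#                 "glm-air-q4_k_m.gguf", "Q4_K_M"], check=True)

# 3. Serve it, offloading as many layers as fit in VRAM and leaving
#    the rest in system RAM.
subprocess.run(["./llama-server",
                "-m", f"{path}/glm-air-q4_k_m.gguf",
                "--n-gpu-layers", "99",
                "--ctx-size", "8192"], check=True)
```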