Comment on The AI bubble is so big it's propping up the US economy (for now)
brucethemoose@lemmy.world 3 days ago
Open models are going to kick the stool out. Hopefully.
GLM 4.5 is already #2 on LM Arena, above Grok and ChatGPT, and runnable on homelab rigs, yet it has just 32B active parameters, which is mad. Extrapolate that out even a little, and it’s just a race to the bottom.
dubyakay@lemmy.ca 3 days ago
I did not understand half of what you’ve written. But what do I need to get this running on my home PC?
brucethemoose@lemmy.world 2 days ago
I am referencing this: z.ai/blog/glm-4.5
The full GLM? Basically a 3090 or 4090 plus a budget EPYC CPU, or maybe two GPUs on a Threadripper system.
GLM Air? That would work on a desktop with 16GB+ of VRAM; just slap in 96GB+ (maybe 64GB?) of fast RAM. Or the recent Framework Desktop, or any mini PC/laptop with the 128GB Ryzen AI Max 395 config.
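Rough napkin math on why that’s enough (my ballpark figures, not official specs: ~355B total/32B active params for full GLM 4.5, ~106B/12B for Air, and ~4.5 bits per weight for a Q4-class GGUF quant):

```python
# Back-of-envelope sizing for quantized MoE weights.
# Assumed figures, not from the thread: GLM 4.5 ~355B total / 32B
# active params, GLM 4.5 Air ~106B / 12B, ~4.5 bits/weight (Q4-class).

BITS_PER_WEIGHT = 4.5

def quant_size_gb(params_billions: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return params_billions * 1e9 * BITS_PER_WEIGHT / 8 / 1e9

for name, total_b, active_b in [("GLM 4.5", 355, 32), ("GLM 4.5 Air", 106, 12)]:
    print(f"{name}: ~{quant_size_gb(total_b):.0f} GB of weights total, "
          f"~{quant_size_gb(active_b):.0f} GB touched per token")
```

The full set of weights sits in system RAM, but only the active experts’ worth of data gets read per token, which is why one consumer GPU plus a pile of RAM is workable.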
You’d download the weights, quantize yourself if needed, and run them in ik_llama.cpp (which should get support imminently).
github.com/ikawrakow/ik_llama.cpp/
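Something like this, as a rough sketch; the repo name, file name, and flags below are my guesses, not a tested recipe:

```python
# Sketch: fetch a GGUF quant from Hugging Face, then launch
# ik_llama.cpp's server with the MoE expert tensors kept in system RAM.
import subprocess
from huggingface_hub import snapshot_download

model_dir = snapshot_download(
    repo_id="unsloth/GLM-4.5-Air-GGUF",  # hypothetical quant repo
    allow_patterns=["*Q4_K_M*"],         # pull just one quant size
)

subprocess.run([
    "./llama-server",           # built from ik_llama.cpp
    "-m", f"{model_dir}/GLM-4.5-Air-Q4_K_M.gguf",
    "-ngl", "99",               # offload everything that fits to the GPU
    "-ot", "exps=CPU",          # tensor override: keep experts in system RAM
    "-c", "8192",               # context length
])
```

Then point any OpenAI-compatible client at the local server.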
tomkatt@lemmy.world 3 days ago
You can probably just use Ollama and import the model.
brucethemoose@lemmy.world 2 days ago
It’s going to be slow as molasses on Ollama. It needs a better runtime, and GLM 4.5 probably isn’t supported at the moment anyway.
WorldsDumbestMan@lemmy.today 3 days ago
I’m running Qwen 3B and it is seldom useful
brucethemoose@lemmy.world 2 days ago
It’s too small.
IDK what your platform is, but have you tried Qwen3 A3B? Or SmallThinker 21B?
huggingface.co/…/SmallThinker-21BA3B-Instruct
The speed should be somewhat similar.
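If you want to try it, here’s roughly what it looks like with llama-cpp-python (the model file name is assumed; grab whatever Qwen3 30B-A3B quant fits your hardware):

```python
# Sketch: run a small MoE model locally with llama-cpp-python.
# Only ~3B params are active per token, so it's fast despite 30B total.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3-30B-A3B-Q4_K_M.gguf",  # hypothetical local quant file
    n_gpu_layers=-1,  # offload as many layers as fit on the GPU
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain MoE in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```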