DandomRude@lemmy.world 3 weeks ago
You mean Deepseek on a local device?
khepri@lemmy.world 3 weeks ago
Naw, I mean more that the kind of people who would uncritically take everything a chatbot says at face value are probably better off in ChatGPT's little curated garden anyway. Because people like that are going to get grifted by whatever comes along first no matter what, and a lot of those grifts are far more dangerous to the rest of us than a bot that won't talk great replacement with you.
DandomRude@lemmy.world 3 weeks ago
Ahh, thank you, I had misunderstood that. Deepseek is (more or less) an open-source LLM from China that can also be run and fine-tuned locally on your own hardware.
ranzispa@mander.xyz 3 weeks ago
Do you have a cluster with 10 A100s lying around? Because that's what it takes to run Deepseek. It is open source, but it is far from accessible to run on your own hardware.
khepri@lemmy.world 3 weeks ago
I run quantized versions of Deepseek that are usable enough for chat, and it's on a home setup so old and slow by today's standards that I won't even mention the specs lol. Let's just say the rig is from 2018, and it wasn't anywhere near the best even back then.
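If anyone wants to try the same thing, here's a minimal sketch of the pattern using the llama-cpp-python bindings (the filename is just a placeholder, grab whatever quantized GGUF actually fits your RAM):

    from llama_cpp import Llama

    # Load a quantized GGUF; n_gpu_layers=0 keeps everything on the CPU,
    # which is fine for old rigs like mine.
    llm = Llama(
        model_path="deepseek-distill-q4_k_m.gguf",  # placeholder filename
        n_ctx=4096,       # modest context to keep memory use down
        n_gpu_layers=0,   # bump this if you have any usable VRAM
    )

    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hi in one sentence."}],
        max_tokens=128,
    )
    print(reply["choices"][0]["message"]["content"])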
brucethemoose@lemmy.world 3 weeks ago
That’s not strictly true.
I have a Ryzen desktop: a 7800, a 3090, and 128GB DDR5. And I can run the full GLM 4.6 with quite acceptable token divergence compared to the unquantized model, see: huggingface.co/…/GLM-4.6-128GB-RAM-IK-GGUF
If I had an EPYC/Threadripper homelab, I could run Deepseek the same way.
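The trick is that GLM 4.6 is a MoE model, so the bulk of the weights can sit in system RAM while the GPU handles the hot path. ik_llama.cpp's CLI lets you pin the routed experts to CPU specifically, which is where the usable speeds come from; the mainline llama-cpp-python bindings only expose a cruder whole-layer split, but the general shape looks like this (a sketch, filename is a placeholder):

    from llama_cpp import Llama

    # Hybrid CPU/GPU split: offload as many whole layers as fit in the
    # 3090's 24GB of VRAM; everything else stays in system RAM.
    llm = Llama(
        model_path="GLM-4.6-IQ4-00001-of-00005.gguf",  # placeholder; use a real quant
        n_ctx=8192,
        n_gpu_layers=30,  # tune this until you stop running out of VRAM
    )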
DandomRude@lemmy.world 3 weeks ago
Yes, that's true. It is resource-intensive, but unlike other capable LLMs it is at least somewhat possible to run yourself: not for most private individuals, given the requirements, but for companies with the necessary budget.
brucethemoose@lemmy.world 3 weeks ago
No one is really running Deepseek locally. What ollama advertises (and basically lies about) are the now-obsolete Qwen 2.5 distillations.
…I mean, some are, but it’s exclusively lunatics with EPYC homelab servers, heh.
DandomRude@lemmy.world 3 weeks ago
Thx for clarifying.
I once tried a community version from Huggingface (distilled), which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven't had much time to look into this stuff lately, but I want to check it out again at some point.
brucethemoose@lemmy.world 3 weeks ago
Also, I'm a quant cooker myself. Say the word, and I can upload an IK quant tailored to whatever your hardware/aim is.
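For the curious: "cooking" a quant is mostly running llama.cpp's quantize tool over a full-precision GGUF with an importance matrix (the IK-specific quant types need ik_llama.cpp's own build of the tool). Roughly like this, with all paths as placeholders:

    import subprocess

    # Re-quantize a full-precision GGUF down to ~4 bits per weight.
    # --imatrix supplies importance data so the quant loses less quality.
    subprocess.run(
        [
            "./llama-quantize",        # path to your llama.cpp build
            "--imatrix", "imatrix.dat",
            "model-f16.gguf",          # input: full-precision GGUF
            "model-IQ4_XS.gguf",       # output: quantized file
            "IQ4_XS",                  # target quant type
        ],
        check=True,
    )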
DandomRude@lemmy.world 3 weeks ago
Thank you! I might get back to you on that sometime.
brucethemoose@lemmy.world 3 weeks ago
You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm’s ik_llama.cpp quants on Huggingface; that’s state of the art right now.
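One quick sanity check before you download anything: the GGUF, plus a few GB of headroom for KV cache and the OS, has to fit in your combined RAM and VRAM. Back-of-envelope version (the numbers are just my rule of thumb, not exact accounting):

    import os

    RAM_GB, VRAM_GB, HEADROOM_GB = 48, 12, 6  # example gaming desktop

    def fits(gguf_paths):
        # Total size of all GGUF shards vs. a rough memory budget.
        size_gb = sum(os.path.getsize(p) for p in gguf_paths) / 1e9
        budget_gb = RAM_GB + VRAM_GB - HEADROOM_GB
        print(f"model: {size_gb:.1f} GB, budget: {budget_gb} GB")
        return size_gb <= budget_gb

    # e.g. fits(["GLM-4.5-Air-IQ4.gguf"])  # placeholder filename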