DandomRude@lemmy.world 3 weeks ago
You mean Deepseek on a local device?
khepri@lemmy.world 3 weeks ago
Naw, I mean more that the kind of people who would uncritically take everything a chatbot says at face value are probably better off in ChatGPT's little curated garden anyway. Because people like that are going to get grifted by whatever comes along first no matter what, and a lot of those grifts are far more dangerous to the rest of us than a bot that won't talk great replacement with you.
DandomRude@lemmy.world 3 weeks ago
Ahh, thank you, I had misunderstood that. Deepseek is (more or less) an open-source LLM from China that can also be run and fine-tuned locally on your own hardware.
ranzispa@mander.xyz 3 weeks ago
Do you have a cluster with 10 A100s lying around? Because that's what it takes to run Deepseek. It is open source, but it is far from accessible to run on your own hardware.
khepri@lemmy.world 3 weeks ago
I run quantized versions of Deepseek that are usable enough for chat, and it's on a home setup so old and slow by today's standards that I won't even mention the specs lol. Let's just say the rig is from 2018, and it wasn't anywhere near the best even back then.
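If anyone wants to try the same thing, here's a minimal sketch of the pattern using the llama-cpp-python bindings (the filename is just a placeholder, grab whatever quantized GGUF actually fits your RAM):

    from llama_cpp import Llama

    # Load a quantized GGUF; n_gpu_layers=0 keeps everything on the CPU,
    # which is fine for old rigs like mine.
    llm = Llama(
        model_path="deepseek-distill-q4_k_m.gguf",  # placeholder filename
        n_ctx=4096,       # modest context to keep memory use down
        n_gpu_layers=0,   # bump this if you have any usable VRAM
    )

    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Say hi in one sentence."}],
        max_tokens=128,
    )
    print(reply["choices"][0]["message"]["content"])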
brucethemoose@lemmy.world 3 weeks ago
That’s not strictly true.
I have a Ryzen desktop: a 7800, a 3090, and 128GB DDR5. And I can run the full GLM 4.6 with quite acceptable token divergence compared to the unquantized model, see: huggingface.co/…/GLM-4.6-128GB-RAM-IK-GGUF
If I had an EPYC/Threadripper homelab, I could run Deepseek the same way.
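The trick is that GLM 4.6 is a MoE model, so the bulk of the weights can sit in system RAM while the GPU handles the hot path. ik_llama.cpp's CLI lets you pin the routed experts to CPU specifically, which is where the usable speeds come from; the mainline llama-cpp-python bindings only expose a cruder whole-layer split, but the general shape looks like this (a sketch, filename is a placeholder):

    from llama_cpp import Llama

    # Hybrid CPU/GPU split: offload as many whole layers as fit in the
    # 3090's 24GB of VRAM; everything else stays in system RAM.
    llm = Llama(
        model_path="GLM-4.6-IQ4-00001-of-00005.gguf",  # placeholder; use a real quant
        n_ctx=8192,
        n_gpu_layers=30,  # tune this until you stop running out of VRAM
    )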
DandomRude@lemmy.world 3 weeks ago
Yes, that's true. It is resource-intensive, but unlike other capable LLMs it is at least somewhat possible to run yourself: not for most private individuals, given the requirements, but for companies with the necessary budget.
brucethemoose@lemmy.world 3 weeks ago
No one is really running Deepseek locally. What ollama advertises (and basically lies about) are the now-obsolete Qwen 2.5 distillations.
…I mean, some are, but it’s exclusively lunatics with EPYC homelab servers, heh.
DandomRude@lemmy.world 3 weeks ago
Thx for clarifying.
I once tried a community version from Huggingface (distilled), which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven't had much time to look into this stuff lately, but I want to check it out again at some point.
brucethemoose@lemmy.world 3 weeks ago
Also, I'm a quant cooker myself. Say the word, and I can upload an IK quant tailored to whatever your hardware/aim is.
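For the curious: "cooking" a quant is mostly running llama.cpp's quantize tool over a full-precision GGUF with an importance matrix (the IK-specific quant types need ik_llama.cpp's own build of the tool). Roughly like this, with all paths as placeholders:

    import subprocess

    # Re-quantize a full-precision GGUF down to ~4 bits per weight.
    # --imatrix supplies importance data so the quant loses less quality.
    subprocess.run(
        [
            "./llama-quantize",        # path to your llama.cpp build
            "--imatrix", "imatrix.dat",
            "model-f16.gguf",          # input: full-precision GGUF
            "model-IQ4_XS.gguf",       # output: quantized file
            "IQ4_XS",                  # target quant type
        ],
        check=True,
    )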
DandomRude@lemmy.world 3 weeks ago
Thank you! I might get back to you on that sometime.
brucethemoose@lemmy.world 3 weeks ago
You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm’s ik_llama.cpp quants on Huggingface; that’s state of the art right now.
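One quick sanity check before you download anything: the GGUF, plus a few GB of headroom for KV cache and the OS, has to fit in your combined RAM and VRAM. Back-of-envelope version (the numbers are just my rule of thumb, not exact accounting):

    import os

    RAM_GB, VRAM_GB, HEADROOM_GB = 48, 12, 6  # example gaming desktop

    def fits(gguf_paths):
        # Total size of all GGUF shards vs. a rough memory budget.
        size_gb = sum(os.path.getsize(p) for p in gguf_paths) / 1e9
        budget_gb = RAM_GB + VRAM_GB - HEADROOM_GB
        print(f"model: {size_gb:.1f} GB, budget: {budget_gb} GB")
        return size_gb <= budget_gb

    # e.g. fits(["GLM-4.5-Air-IQ4.gguf"])  # placeholder filename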