That’s great if they actually work. But my experience with the big, corporate-funded models has been pretty freaking abysmal after more than a year of trying to adopt them into my daily workflow. I can’t imagine the performance of local models is much better when they’re trained on much, much smaller datasets and run with much, much less computing power.
I’m happy to be proven wrong, of course, but I just don’t see how it’s possible for local models to compete with the Big Boys in terms of quality… and the quality of the largest models is only middling at best.
eronth@lemmy.world 6 months ago
Which ones are you running?
FaceDeer@fedia.io 6 months ago
I've found Qwen3-30B-A3B-Thinking-2507 to be the best all-around "do stuff for me" model that fits on my hardware. I've mostly been using it for analyzing and summarizing documents I've got on my local hard drive: meeting transcripts, books, and so forth. It's done surprisingly well on those transcripts; I daresay its summaries tease out patterns that a human wouldn't have had an easy time spotting.
When it comes to creative writing I mix it up with Llama-3.3-70B-Instruct to enrich the text; using multiple models helps keep the output from becoming repetitive and too recognizable in style.
I've got Qwen3-Coder-30B-A3B-Instruct kicking around as a programming assistant, but while it's competent at its job I've been finding that the big online models do better (unsurprisingly), so I use those more. Perhaps if I were focusing on code analysis and cleanup I'd be using that one instead, but when it comes to writing big new classes or applications in one swoop it pays to go with the best right off the bat. Maybe once the IDEs get a little better at integrating LLMs it might catch up.
I've been using Ollama as the framework for running them. It's got a nice simple API, and it runs in the background so it claims and releases memory as demand comes and goes. I used to use KoboldCPP, but I had to manually start and stop it a lot and that got tedious.
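For anyone who hasn't tried it, here's a rough sketch of what talking to Ollama's local API looks like from Python. The model tag and prompt are just placeholders (pick whatever you've actually pulled with `ollama pull`), and this assumes the default port 11434:

```python
import json
import urllib.request

# Default Ollama endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # /api/generate wants at least "model" and "prompt";
    # "stream": False asks for a single JSON reply instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The non-streaming reply carries the generated text in "response".
        return json.loads(resp.read())["response"]

# Example (needs a running Ollama server and a pulled model):
# print(generate("qwen3:30b", "Summarize this transcript: ..."))
```

Since it's just HTTP, it's easy to script batch summarization over a folder of transcripts, which is most of what I use it for.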