Comment on I've just created c/Ollama!
brucethemoose@lemmy.world 1 day ago
Honestly Perplexity, the online service, is pretty good.
But the first question is: how much RAM does your Mac have? That's basically the deciding factor for what model you can and should run.
WhirlpoolBrewer@lemmings.world 1 day ago
8GB
brucethemoose@lemmy.world 1 day ago
8GB?
You might be able to run Qwen3 4B: huggingface.co/mlx-community/…/main
But honestly you don’t have enough RAM to spare, and even a small model might bog things down. I’d run Open Web UI or LM Studio with a free LLM API, like Gemini Flash, or pay a few bucks for something off OpenRouter. Or maybe the Cerebras API.
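If you do want to try it locally anyway, here's a rough sketch using mlx-lm (`pip install mlx-lm`); the repo id below is just a placeholder, swap in whatever MLX quant of Qwen3 4B you actually grab from that link:

```python
# Rough sketch: run a small MLX-quantized model locally with mlx-lm.
# The repo id below is a placeholder -- use the actual mlx-community
# Qwen3 4B quant you download from Hugging Face.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-4B-4bit")  # placeholder repo id

prompt = "Explain what quantization does to an LLM in two sentences."
reply = generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True)
print(reply)
```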
WhirlpoolBrewer@lemmings.world 1 day ago
Good to know. I’d hate to buy a new machine strictly for running an LLM. It could be an excuse to pick up something like a Framework 16, but realistically, I don’t see myself doing that. I think you might be right about using something like Open Web UI or LM Studio.
brucethemoose@lemmy.world 1 day ago
Yeah, just paying for LLM APIs is dirt cheap, and they (supposedly) don’t scrape data. Again, I’d recommend OpenRouter and Cerebras! And you get your pick of models to try from them.
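Most of them expose an OpenAI-compatible endpoint, so the API route is only a few lines. A minimal sketch against OpenRouter (the model slug is just an example; pick anything from their model list):

```python
# Minimal sketch: call OpenRouter through its OpenAI-compatible endpoint.
# Assumes `pip install openai` and an OPENROUTER_API_KEY in the environment;
# the model slug is only an example -- any model id from openrouter.ai works.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="qwen/qwen3-4b:free",  # example slug, not a recommendation
    messages=[{"role": "user", "content": "One tip for running LLMs on 8GB of RAM?"}],
)
print(resp.choices[0].message.content)
```

Open Web UI and LM Studio can point at the same kind of endpoint, so you get a local chat interface without running the model on your own machine.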
Even a Framework 16 is not great for LLMs, TBH. The Framework Desktop is (as it uses a special AMD chip), but it’s very expensive. Honestly the whole hardware market is so screwed up, hence most ‘local LLM enthusiasts’ buy a used RTX 3090 and stick it in a desktop or server, heh.