Comment on I've just created c/Ollama!
brucethemoose@lemmy.world 1 day ago
Honestly Perplexity, the online service, is pretty good.
But the first question is: how much RAM does your Mac have? That's basically the deciding factor for what model you can and should run.
WhirlpoolBrewer@lemmings.world 1 day ago
8GB
brucethemoose@lemmy.world 1 day ago
8GB?
You might be able to run Qwen3 4B: huggingface.co/mlx-community/…/main
But honestly you don’t have enough RAM to spare, and even a small model might bog things down. I’d run Open Web UI or LM Studio with a free LLM API, like Gemini Flash, or pay a few bucks for something off OpenRouter. Or maybe the Cerebras API.
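If you do want to try it locally anyway, here's a rough sketch using mlx-lm (`pip install mlx-lm`); the repo id below is just a placeholder, swap in whatever MLX quant of Qwen3 4B you actually grab from that link:

```python
# Rough sketch: run a small MLX-quantized model locally with mlx-lm.
# The repo id below is a placeholder -- use the actual mlx-community
# Qwen3 4B quant you download from Hugging Face.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-4B-4bit")  # placeholder repo id

prompt = "Explain what quantization does to an LLM in two sentences."
reply = generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True)
print(reply)
```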
WhirlpoolBrewer@lemmings.world 1 day ago
Good to know. I’d hate to buy a new machine strictly for running an LLM. It could be an excuse to pick up something like a Framework 16, but realistically, I don’t see myself doing that. I think you might be right about using something like Open Web UI or LM Studio.
brucethemoose@lemmy.world 1 day ago
Yeah, just paying for LLM APIs is dirt cheap, and they (supposedly) don’t scrape data. Again, I’d recommend OpenRouter and Cerebras! And you get your pick of models to try from them.
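Most of them expose an OpenAI-compatible endpoint, so the API route is only a few lines. A minimal sketch against OpenRouter (the model slug is just an example; pick anything from their model list):

```python
# Minimal sketch: call OpenRouter through its OpenAI-compatible endpoint.
# Assumes `pip install openai` and an OPENROUTER_API_KEY in the environment;
# the model slug is only an example -- any model id from openrouter.ai works.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="qwen/qwen3-4b:free",  # example slug, not a recommendation
    messages=[{"role": "user", "content": "One tip for running LLMs on 8GB of RAM?"}],
)
print(resp.choices[0].message.content)
```

Open Web UI and LM Studio can point at the same kind of endpoint, so you get a local chat interface without running the model on your own machine.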
Even a Framework 16 is not great for LLMs, TBH. The Framework Desktop is (as it uses a special AMD chip), but it’s very expensive. Honestly the whole hardware market is so screwed up, hence most ‘local LLM enthusiasts’ buy a used RTX 3090 and stick it in a desktop or server, heh.