Comment on OpenAI to remove non-profit control and give Sam Altman equity

PumpkinEscobar@lemmy.world 5 months ago

It’s a good thing that real open source models are getting good enough to compete with or exceed OpenAI.

Vorticity@lemmy.world 5 months ago
Can you recommend some models to try?

- PumpkinEscobar@lemmy.world 5 months ago
  First, a caveat: you’ll need a beefy GPU to run the larger models, though some smaller models perform pretty well.
  
  Here’s a moderate amount of extra information for you or anyone else who wants to get into running models locally.
  
  Tools
  
  Ollama - great app for downloading/managing/running models locally
  
  OpenWebUI - A web app that provides a UI like the ChatGPT web app, but can use local models
  
  continue.dev - A VS Code extension that can use Ollama to give you a GitHub Copilot-like AI assistant running against a local model (it can also connect to Anthropic Claude, etc…)
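
  For anyone who wants to script against these tools, here is a minimal Python sketch of sending a prompt to a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434) and that the model you name has already been pulled; the helper names build_request and generate are made up for illustration.

  ```python
  import json
  import urllib.request

  # Assumption: Ollama is running locally on its default port.
  OLLAMA_URL = "http://localhost:11434/api/generate"


  def build_request(model: str, prompt: str) -> urllib.request.Request:
      """Build a non-streaming generate request for a local Ollama server."""
      payload = {"model": model, "prompt": prompt, "stream": False}
      return urllib.request.Request(
          OLLAMA_URL,
          data=json.dumps(payload).encode("utf-8"),
          headers={"Content-Type": "application/json"},
          method="POST",
      )


  def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
      """Send the prompt and return the generated text from the JSON reply."""
      request = build_request(model, prompt)
      with urllib.request.urlopen(request, timeout=timeout) as resp:
          return json.loads(resp.read())["response"]
  ```

  With the server up, something like generate("llama3.1", "Explain quantization in one sentence.") should return the model’s answer as a string. OpenWebUI and continue.dev talk to the same local server, so this is only needed if you want to call a model from your own scripts.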
  
  Models
  
  You can browse the available models at ollama.com/library?sort=featured
  
  Model size is measured by parameter count. Generally, models with more parameters are better (“smarter”, more accurate), but it’s very challenging/slow to run anything over 25b parameters on consumer GPUs. I tend to find 8-13b parameter models are a sort of sweet spot. The 1-4b parameter models are meant more for really low-power devices; they’ll give you OK results for simple requests and summarizing, but they’re not going to wow you.
  
  If you look at the ‘tags’ for the models listed below, you’ll see things like 8b-instruct-q8_0 or 8b-instruct-q4_0. The q part refers to quantization, i.e. shrinking/compressing a model, and the number after it is roughly how many bits are kept per weight, so a smaller number means more aggressive compression. Note the size of each tag and how the size shrinks as the quantization gets more aggressive (smaller numbers). You can roughly think of this size number as “how much video RAM do I need to run this model”. Models can run partially or even fully on a CPU, but that’s much slower. Ollama doesn’t yet support the NPUs found in new laptops/processors, but work is happening there.
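
  To make the “how much video RAM do I need” rule of thumb concrete, here is a rough Python sketch that estimates weight size from a tag. It assumes the number after q is the nominal bits per weight and that a tag without a q part is 16-bit; real files are somewhat larger, and running a model needs extra memory for context, so treat the result as a lower bound. The function names are hypothetical.

  ```python
  import re

  # Assumption: a tag with no "q" part holds 16-bit weights.
  FP16_BITS = 16


  def parse_tag(tag: str) -> tuple[float, int]:
      """Return (billions of parameters, bits per weight) for a tag like '8b-instruct-q4_0'."""
      params = re.search(r"(\d+(?:\.\d+)?)b", tag)
      quant = re.search(r"q(\d+)", tag)
      if params is None:
          raise ValueError(f"no parameter count in tag: {tag!r}")
      bits = int(quant.group(1)) if quant else FP16_BITS
      return float(params.group(1)), bits


  def estimate_gb(tag: str) -> float:
      """Rough weight size in GB: (billions of parameters) * bits / 8."""
      billions, bits = parse_tag(tag)
      return billions * bits / 8


  print(estimate_gb("8b-instruct-q8_0"))  # 8.0
  print(estimate_gb("8b-instruct-q4_0"))  # 4.0
  ```

  By this estimate, halving the bits per weight halves the footprint, which is why a q4 tag of an 8b model fits on GPUs where the q8 tag does not.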
  
  Llama 3.1 - The 8b instruct model is pretty good, with decent speed and good quality. This is a good “default” model to use
  
  Llama 3.2 - This model was just released yesterday. I’m only seeing the 1b and 3b models right now. They’ve changed the 8b model to 11b, and I’m assuming the 11b model is going to be my new go-to when it’s available.
  
  Deepseek Coder v2 - A great coding assistant model
  
  Command-r - This is a more niche model, mainly useful for RAG. It’s only available as a 35b parameter model, so it’s not all that feasible to run locally
  
  Mistral small - A really good model, in the ballpark of Llama. I haven’t had quite as much luck with this as with Llama, but it is good. I just saw that a new version was released 8 days ago, so I’ll need to check it out again
  
  - 8orange8@lemm.ee 5 months ago
    A really nice summary. Very useful. Thanks!
    