Comment on OpenAI to remove non-profit control and give Sam Altman equity
Vorticity@lemmy.world 1 month ago
Can you recommend some models to try?
PumpkinEscobar@lemmy.world 1 month ago
First, a caveat/warning: you’ll need a beefy GPU to run larger models, though there are some smaller models that perform pretty well.
Adding a medium amount of extra information for you or anyone else who might want to get into running models locally.
Tools
Models
If you look at ollama.com/library?sort=featured you can browse the featured models.
Model size is measured by parameter count. Generally, higher-parameter models are better (“smarter”, more accurate), but it’s very challenging/slow to run anything over 25b parameters on consumer GPUs. I tend to find 8-13b parameter models are a sort of sweet spot. The 1-4b parameter models are meant more for really low-power devices; they’ll give you OK results for simple requests and summarizing, but they’re not going to wow you.
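For a rough feel of why parameter count matters so much, here's a back-of-the-envelope sketch in Python. It assumes the memory needed for the weights alone is about parameters × bits per weight / 8, and it ignores context/cache memory and runtime overhead, so real usage will be somewhat higher. The function name and the bits-per-weight values (16 for half-precision weights, 8 and 4 for compressed ones) are my own illustrative choices, not anything taken from Ollama.

```python
def approx_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough weights-only memory estimate in GB (1 GB = 1e9 bytes).

    Hypothetical helper for illustration: it ignores context/cache memory
    and runtime overhead, so treat the result as a lower bound.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Compare a few model sizes at different (assumed) bits per weight.
    for params in (3, 8, 13, 25):
        for bits in (16, 8, 4):
            gb = approx_weight_memory_gb(params, bits)
            print(f"{params}b at {bits} bits/weight: ~{gb:.1f} GB")
```

By this estimate an 8b model needs very roughly 16 GB at 16 bits per weight but only about 4 GB at 4 bits, which is why the smaller, more compressed downloads are the ones that tend to fit on consumer GPUs.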
If you look at the ‘tags’ for the models listed below, you’ll see things like 8b-instruct-q8_0 or 8b-instruct-q4_0. The q part refers to quantization, or shrinking/compressing a model, and the number after that is roughly how aggressively it was compressed. Note the size of each tag and how the size reduces as the quantization gets more aggressive (smaller numbers). You can roughly think of this size number as “how much video RAM do I need to run this model”. Models can run partially or even fully on a CPU, but that’s much slower. Ollama doesn’t yet support the new NPUs found in newer laptops/processors, but work is happening there.
8orange8@lemm.ee 1 month ago
A really nice summary. Very useful. Thanks!