brucethemoose@lemmy.world 1 week ago

Oh actually that’s a good card for LLM serving!

Build and use the llama.cpp server from source; it has better support for Pascal cards than anything else:

github.com/ggml-org/llama.cpp/…/multimodal.md
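Once llama-server is up, anything that speaks the OpenAI chat API can hit it. A minimal sketch (the port, model name, and prompt are placeholders I made up, and assumes you started the server with default settings):

```python
# Query a local llama-server through its OpenAI-compatible chat endpoint.
# Assumes something like: llama-server -m model.gguf --port 8080
import json
import urllib.request

payload = {
    "model": "local",  # largely ignored: llama-server serves whatever it loaded
    "messages": [
        {"role": "user", "content": "Summarize llama.cpp's Pascal support in one line."}
    ],
    "max_tokens": 256,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])
```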


Gemma 3 is a hair too big (like 17-18GB), so I’d start with InternVL3 14B at Q5_K_XL: huggingface.co/…/InternVL3-14B-Instruct-GGUF

Or Mistral Small 3.2 24B at IQ4_XS for more ‘text’ intelligence than vision: huggingface.co/…/Mistral-Small-3.2-24B-Instruct-2…
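For a rough sense of why those quants fit: GGUF weight size is roughly parameter count times average bits per weight. The bit widths below are my own approximations for those quant types, not official numbers, and KV cache comes on top:

```python
# Back-of-envelope GGUF weight size: params (billions) * bits-per-weight / 8 -> GB.
# Bit widths are rough averages per quant type (assumption, not exact).
QUANT_BITS = {"Q5_K_XL": 5.5, "IQ4_XS": 4.25}

def gguf_size_gb(params_billion: float, quant: str) -> float:
    return params_billion * QUANT_BITS[quant] / 8

for name, params, quant in [
    ("InternVL3 14B", 14, "Q5_K_XL"),
    ("Mistral Small 3.2 24B", 24, "IQ4_XS"),
]:
    print(f"{name} @ {quant}: ~{gguf_size_gb(params, quant):.1f} GB weights (+ KV cache)")
```

That works out to roughly 10 GB and 13 GB of weights respectively, which is why both leave headroom where Gemma 3 doesn't.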
