Comment on I've just created c/Ollama!

brucethemoose@lemmy.world ⁨1⁩ ⁨day⁩ ago

Actually, to go ahead and answer: the “easiest” path would be LM Studio (which supports MLX quants natively and is quick to install), paired with a DWQ quantization (a newer, higher-quality variant of MLX quants).

Probably one of these models, depending on how much RAM you have:

huggingface.co/…/Magistral-Small-2506-4bit-DWQ

huggingface.co/…/Qwen3-30B-A3B-4bit-DWQ-0508

huggingface.co/…/GLM-4-32B-0414-4bit-DWQ

With a bit more time invested, you could set up Open Web UI as an alternative interface (which has its own built-in web search, like Gemini): openwebui.com

And then use LM Studio (or some other MLX backend, or even free online API models) as the “engine”.
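As a rough sketch of that setup (ports and flags here are the documented defaults from the Open WebUI quickstart and LM Studio docs, so double-check them against the current docs):

```shell
# 1. In LM Studio, start the local server (Developer tab -> "Start Server").
#    By default it exposes an OpenAI-compatible API at http://localhost:1234/v1

# 2. Run Open WebUI in Docker, pointing it at LM Studio as the engine
#    via its OpenAI-compatible endpoint:
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OPENAI_API_BASE_URL=http://host.docker.internal:1234/v1 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

# 3. Open http://localhost:3000 in a browser; LM Studio's loaded models
#    should appear in the model picker.
```

Because Open WebUI speaks the OpenAI API, you can swap the base URL for any other OpenAI-compatible backend (or a hosted API) without changing anything else.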
