Comment on Selfhost an LLM
iii@mander.xyz 1 week ago
One of these projects might be of interest to you:
Do note that CPU inference is quite a lot slower than GPU. I currently like the quantized DeepSeek models as the best balance between reply quality and inference time when not using a GPU.
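If it helps, here's a minimal sketch of CPU-only inference with the llama-cpp-python bindings; the model filename and generation parameters are placeholders, and any quantized GGUF (e.g. a DeepSeek distill) would work the same way.

```python
# Minimal CPU-only inference sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-r1-distill-qwen-7b-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,       # context window
    n_threads=8,      # CPU threads; tune to your core count
    n_gpu_layers=0,   # 0 = pure CPU inference
)

out = llm("Explain quantization in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])
```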
ProperlyProperTea@lemmy.ml 1 week ago
Indeed, beyond getting the model running at all, having decent hardware is the next most important part.
A 3060 12GB is probably the cheapest card worth getting; go for a 3090 or another 24GB card if you can.
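As a rough back-of-the-envelope check for whether a model fits in VRAM: the weights take roughly parameters × bits-per-weight ÷ 8 bytes, plus extra for the KV cache and runtime overhead. A quick sketch of that arithmetic (figures are approximate):

```python
# Rough VRAM estimate for quantized model weights (ignores KV cache and runtime overhead).
def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 13B model at 4-bit quantization needs roughly 6.5 GB for weights,
# which leaves some headroom on a 12 GB card; a 24 GB card can hold ~30B-class models.
print(approx_weight_gb(13, 4))   # ~6.5
print(approx_weight_gb(30, 4))   # ~15.0
```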