Comment on Self Hosted Parrot.AI or Otter.AI Alternatives

chmclhpby@lemmy.world ⁨10⁩ ⁨months⁩ ago

LLaMA-2 was just released, and the fine-tunes people have made of it currently top the leaderboards for open source language models. For inference, don’t forget to look into quantization so you can run larger models on limited VRAM. I’ve heard good things about vLLM and llama.cpp (and its derivatives).
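To show why quantization helps with limited VRAM, here is a minimal sketch of symmetric round-to-nearest int8 quantization (a toy illustration of the idea, not what llama.cpp actually implements; its GGUF formats use block-wise schemes):

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: store int8 values plus one
    float scale instead of full float32 weights (toy example)."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale = quantize_int8(w)

print(w.nbytes // q.nbytes)  # 4: int8 storage is 4x smaller than float32
err = np.abs(dequantize(q, scale) - w).max()
print(err <= scale)          # True: round-to-nearest error is bounded by the scale
```

Real quantization schemes (4-bit, block-wise scales, k-quants) push the savings further at some accuracy cost, which is how a 13B model fits on a consumer card.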

If you’re looking for a GPU around $300, I’ve heard a used 3060 is better value than a 4060 right now on performance and memory throughput, though not power efficiency (if you want an easy time with ML, unfortunately the only real option is Nvidia).
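A quick back-of-envelope check on whether a model fits in a card’s VRAM: weight memory is roughly parameter count times bits per weight, plus some headroom for the KV cache and activations (the 1.2x overhead factor below is my assumption, not a precise rule):

```python
def vram_gb(n_params_billion: float, bits_per_weight: float,
            overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weight bytes times a fudge factor
    for KV cache and activations (assumed, not exact)."""
    bytes_weights = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_weights * overhead / 1e9

# A 7B model: float16 needs far more than 12 GB, 4-bit fits easily
print(round(vram_gb(7, 16), 1))  # ~16.8
print(round(vram_gb(7, 4), 1))   # ~4.2
```

By this estimate, a used 12 GB 3060 can run a 4-bit 7B (or even 13B) model, while float16 would not fit, which is why quantization matters at this budget.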

Good luck! It would be nice to get an update if you find a good solution, since it seems others share your use case.
