cross-posted from: programming.dev/post/51407459
Check which models you can run and at how many tokens per second. It has examples of many models and quantization levels. Huge resource!
Submitted 1 week ago by anzo@programming.dev to selfhosting@slrpnk.net
The best model is no model.
Doctorbllk@slrpnk.net 1 week ago
I know I will invite ire with this, but I think a self-hosted model is relatively acceptable. Get rid of the generative art, stick to things like code and evaluation, and run a model that isn't served from a massive data center (plus you get the capability to train a model in a way you may find even more acceptable than the default), and most if not all of the questionable aspects of LLMs fade away.