Comment on Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows
korewa@reddthat.com 1 year ago
Dang, I need to try these. For now, only the Stable Diffusion extension for Automatic1111 is available.
I wonder if it will accelerate 30B models that don’t fit entirely in GPU VRAM.
If it only accelerates 13B models, those were already fast enough.