cross-posted from: lemdro.id/post/2377716 (!aistuff@lemdro.id)
Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows
Submitted 1 year ago by ijeff@lemdro.id to technology@lemmy.world
https://blogs.nvidia.com/blog/2023/10/17/tensorrt-llm-windows-stable-diffusion-rtx/
korewa@reddthat.com 1 year ago
Dang, I need to try these. For now only the Stable Diffusion extension for AUTOMATIC1111 is available.
I wonder if it will accelerate 30B models that don’t fit entirely in GPU VRAM.
If it only accelerates 13B models, those were already fast enough.
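Whether a model fits comes down to parameter count times bytes per weight. A rough back-of-envelope sketch (my own estimate, not from NVIDIA's post; it counts weights only and ignores KV cache and runtime overhead):

```python
def weights_vram_gib(params_billion: float, bytes_per_param: float) -> float:
    """Approximate GiB of VRAM for model weights alone (no KV cache/overhead)."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

# Compare 13B vs 30B at fp16 (2 bytes/weight) and 4-bit (0.5 bytes/weight).
for params in (13, 30):
    for label, bpp in (("fp16", 2.0), ("int4", 0.5)):
        print(f"{params}B @ {label}: ~{weights_vram_gib(params, bpp):.1f} GiB")
```

By this estimate a 30B model at fp16 needs well over the 24 GB on a 4090, while a 4-bit quantized 30B would fit; a 13B at fp16 is right at the edge.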