Comment on What's your self-hosting success of the week?

Shimitar@downonthestreet.eu, 2 weeks ago

NVIDIA Corporation GA104GL [RTX A4000] (rev a1)

From lspci

It has 16 GB of VRAM, which is not a lot but enough to run gpt-oss 20b and a few other models pretty nicely.

I've noticed that it's better to stick to a single model; I imagine that unloading one model and reloading another into VRAM takes time.
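
For anyone who wants to try the same thing: a rough sketch of how you could ask the server to keep one model resident in VRAM between requests. This assumes the model is served through Ollama on its default local port, which I haven't stated above, so treat the endpoint, the `gpt-oss:20b` tag, and the 30 minute value as placeholders. `build_request` and `generate` are just illustrative helper names.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, keep_alive: str = "30m") -> dict:
    """Build a generate request asking the server to keep the model loaded.

    keep_alive controls how long the model stays in memory after the
    request finishes, so back-to-back prompts skip the reload.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,          # return one JSON object instead of a stream
        "keep_alive": keep_alive,
    }


def generate(model: str, prompt: str, keep_alive: str = "30m") -> str:
    """Send the request and return the generated text (needs a running server)."""
    payload = json.dumps(build_request(model, prompt, keep_alive)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Hypothetical model tag; use whatever tag your server actually lists.
    print(generate("gpt-oss:20b", "Say hello in one sentence."))
```

Sending every request to the same model tag with a long `keep_alive` should avoid the unload and reload cycle; switching tags between requests is what would force a swap on a 16 GB card.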
