tal@lemmy.today 5 days ago

Will more VRAM solve the problem of not retaining context?

IIRC (I ran KoboldAI with 24GB of VRAM, so I wasn't super-constrained) there are some limits on the number of tokens that can be sent as a prompt that are imposed by VRAM, which I did not hit. However, there are also limits imposed by the software; you can only increase the number of tokens that get fed in so far, regardless of VRAM. More VRAM does let you use larger, more "knowledgeable" models.

I'm not sure whether those limits are purely arbitrary, there to keep performance reasonable, or whether there are other technical issues with very large prompts.

It definitely isn't capable of keeping the entire previous conversation as input when generating a new response, though, once the conversation gets to any length.
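
To illustrate what I mean, here's a minimal sketch (not KoboldAI's actual code) of why older turns fall out of the prompt: the prompt builder can only keep as many recent turns as fit within a fixed token budget. The `tokenize()` helper and the 2048-token limit are hypothetical stand-ins for the model's real tokenizer and context size.

```python
def tokenize(text: str) -> list[str]:
    # Stand-in for the model's actual tokenizer.
    return text.split()

def build_prompt(history: list[str], max_context_tokens: int = 2048) -> str:
    """Keep only the most recent turns that fit in the context window."""
    kept: list[str] = []
    used = 0
    for turn in reversed(history):        # walk from newest to oldest
        n = len(tokenize(turn))
        if used + n > max_context_tokens:
            break                         # older turns are simply dropped
        kept.append(turn)
        used += n
    return "\n".join(reversed(kept))      # restore chronological order
```

So no matter how much VRAM you have, anything past that budget just never gets shown to the model.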
