I had never heard of Kobold AI. I was going to self-host Ollama and try that, but I’ll take a look at Kobold. I’d never heard about controls for world-building and dialogue triggers either; there’s a lot to learn.
Will more VRAM solve the problem of not retaining context? Can I throw 48GB of VRAM towards an 8B model to help it remember stuff?
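(Partly answering my own question from reading around: as I understand it, more VRAM doesn’t make a model “remember” by itself — the context window is a property of the model and the runtime setting — but it does let you crank that setting up, because the KV cache grows linearly with context length. A back-of-the-envelope sketch; the Llama-3-8B-style numbers here are my assumptions, not measurements:)

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    # 2 tensors (K and V) cached per layer, one entry per token in context
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Assumed Llama-3-8B-ish config: 32 layers, 8 KV heads (GQA), head dim 128, fp16 cache
gib = kv_cache_bytes(32, 8, 128, 8192) / 2**30
print(f"{gib:.2f} GiB")  # ~1 GiB of cache at 8k context, on top of the weights
```

So 48GB would leave a ton of headroom for long contexts on an 8B model, assuming the model itself supports them.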
Yes, I’m looking at image generation (Stable Diffusion) too. Thanks!
fishynoob@infosec.pub 1 week ago
Thanks for the edit. That’s a very intriguing idea: a second LLM running in the background, keeping a summary of the conversation plus the static context, could improve things a lot. I don’t know if anyone has implemented it, or how one could DIY it with Kobold/Ollama.