Comment on Does anyone else have experience with koboldcpp? How do I make it give me longer outputs?
fhein@lemmy.world 3 months ago
llama.cpp uses the GPU if you compile it with GPU support and tell it to use the GPU…
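For example, something like this (a sketch assuming a recent llama.cpp checkout with an NVIDIA card; build flags and binary names have changed across versions, e.g. older builds used `LLAMA_CUBLAS=1` and `./main`):

```shell
# Build llama.cpp with CUDA support (Apple Silicon would use -DGGML_METAL=ON instead)
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# Tell it to use the GPU by offloading layers with -ngl / --n-gpu-layers;
# a large number offloads as many layers as fit in VRAM
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```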
Never used koboldcpp, so I don’t know why it would give you shorter responses if both the model and the prompt are the same (also assuming you’ve generated multiple times and it’s always the same). If you don’t want to use Discord to visit the official koboldcpp server, you might get more answers from a more LLM-focused community such as !localllama@sh.itjust.works
PenisWenisGenius@lemmynsfw.com 3 months ago
Cool, I didn’t know llama.cpp could do GPU acceleration at all. I’m going to look into that.