I don’t think giving the temperature knob to end users is the answer.
Turning it all the way down for max correctness and low creativity won’t work in an intuitive way.
Sure, turning it up from the balanced middle value will make it more “creative” and unexpected, and this is useful for idea generation, etc. But a knob that goes from “good” to “sort of off the rails, but in a good way” isn’t a great user experience for most people.
Most people understand this stuff as intended to be intelligent. Correct. Etc. Or they at least understand that’s the goal. Once you give them a knob to adjust the “intelligence level,” you’ll get more pushback when these things don’t meet their goals. “I clearly had it in factual/correct/intelligent mode. Not creativity mode. I don’t understand why it left out these facts and invented a backstory for this small thing I mentioned…”
Not everyone is an engineer. Temperature is an obscure thing.
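For anyone wondering what the knob actually does under the hood, here is a rough sketch of temperature-scaled softmax sampling weights. The logits are made-up numbers for illustration, and `softmax_with_temperature` is a hypothetical helper name, not any vendor’s API. Lower temperature piles probability onto the most likely token; higher temperature flattens the distribution so unlikely tokens get picked more often. Note that it never touches “correctness,” only how spread out the choices are.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature (must be > 0)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate next tokens (assumption, not real model output).
logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Which is sort of the point: the dial changes randomness, and that doesn’t map cleanly onto what people mean by “factual mode.”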
1rre@discuss.tchncs.de 1 week ago
I’ve found Gemini overwhelmingly terrible at pretty much everything. It responds more like a 7B model running on a home PC, or a model from two years ago, than a medium commercial model: it completely ignores what you ask and just latches on to keywords… It’s almost like they’ve played with their tokenisation, or trained it exclusively for tech support where it links you to an irrelevant article or something.
brucethemoose@lemmy.world 1 week ago
Gemini Flash Thinking from earlier this year was good, but it regressed a ton.
Gemini 1.5 is literally better than the new 2.0 in some of my tests, especially long-context ones.
Imgonnatrythis@sh.itjust.works 1 week ago
Bing/ChatGPT is just as bad. It loves to tell you it’s doing something and then just ignores you completely.