Comment on ChatGPT spends 'tens of millions of dollars' on people saying 'please' and 'thank you', but Sam Altman says it's worth it

Dran_Arcana@lemmy.world 1 day ago

Anecdotally, I use it a lot, and I feel like the responses I get are better when I’m polite. I have a couple of theories as to why.

  1. More tokens in the context window of your question, and a clear separator between ideas in a conversation, make it easier for the model to tell disparate ideas apart at inference time. (The tokenizer itself only splits text into tokens; any "recognizing" happens in the model.)

  2. Higher-quality datasets contain American boomer/millennial notions of “politeness”, and when responses are structured in kind, they’re more likely to contain tokens from those higher-quality datasets.

I haven’t mathematically proven any of this with the llama.cpp tokenizer, but I strongly suspect that I could at least show a correlation between polite input tokens and output tokens that are well represented in those datasets.
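
A minimal sketch of how that correlation check could be set up, in Python. None of this is llama.cpp code: `naive_tokens` is a crude stand-in for a real subword tokenizer, `POLITE_MARKERS` is a made-up word list, and the quality scores passed to `pearson` would have to come from your own ratings of real responses.

```python
import re
from statistics import mean

# Hypothetical marker list, chosen for illustration only.
POLITE_MARKERS = {"please", "thanks", "thank", "kindly"}

def naive_tokens(text):
    """Stand-in tokenizer: lowercase words plus single punctuation marks.
    Real LLM tokenizers split text into subword pieces instead."""
    return re.findall(r"[a-z']+|[^\sa-z']", text.lower())

def polite_count(text):
    """Number of tokens in the prompt that appear in POLITE_MARKERS."""
    return sum(1 for tok in naive_tokens(text) if tok in POLITE_MARKERS)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

To use it, you would compute `polite_count(prompt)` for each prompt you sent, pair it with whatever score you gave the matching response, and pass the two lists to `pearson`. The same tokenizer also shows the first theory in miniature: `naive_tokens("Please summarize this. Thank you!")` yields more tokens than `naive_tokens("Summarize this.")`. A correlation here would only be suggestive, since it says nothing about which datasets the output tokens came from.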
