TheLeadenSea@sh.itjust.works 1 day ago
They have RLHF (reinforcement learning from human feedback) so any negative, biased, or rude responses would have been filtered out in training. That’s the idea anyway, obviously no system is perfect.
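The core of the RLHF idea is training a reward model from human preference pairs, then steering the LLM toward responses that score higher. A minimal sketch of the pairwise (Bradley-Terry) preference loss used in reward-model training, with a toy hand-written scoring function standing in for the learned neural reward model (the word list and scoring rule here are made-up placeholders, not anything a real system uses):

```python
import math

# Placeholder "polite word" list -- purely illustrative; real reward
# models learn scores from data rather than using a word list.
POLITE = {"happy", "help", "thanks", "please"}

def reward(response: str) -> float:
    # Toy stand-in for a learned reward model: fraction of words
    # that appear on the hypothetical polite list.
    words = response.lower().split()
    return sum(w in POLITE for w in words) / max(len(words), 1)

def preference_loss(chosen: str, rejected: str) -> float:
    # Bradley-Terry pairwise loss used in RLHF reward-model training:
    # loss = -log(sigmoid(r_chosen - r_rejected)).
    # It is small when the human-preferred response already scores higher.
    diff = reward(chosen) - reward(rejected)
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# Rude responses paired against polite ones produce a higher loss when
# ranked as "chosen", so training pushes the model away from them.
polite, rude = "happy to help thanks", "figure it out yourself"
print(preference_loss(polite, rude))  # low: ranking already correct
print(preference_loss(rude, polite))  # high: ranking is backwards
```

In a full RLHF pipeline this learned reward signal is then used (e.g. via PPO) to fine-tune the LLM itself, which is how the "filtering out" of rude responses actually happens.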
SpaceNoodle@lemmy.world 23 hours ago
Then why are they all still smarmy assholes?
SkyNTP@lemmy.ml 23 hours ago
That’s what was said. LLMs have been reinforced to respond exactly the way they do. In other words, that “smarmy asshole” attitude you describe was a deliberate choice.