Comment on ChatGPT Mostly Source Wikipedia; Google AI Overviews Mostly Source Reddit
AbouBenAdhem@lemmy.world 1 week ago
There was a recent paper claiming that LLMs are better at avoiding toxic speech when such speech is actually included in their training data, since models that haven’t been trained on it have no way of recognizing it. With that in mind, maybe using Reddit for training isn’t as bad an idea as it seems.