Comment on Father sues Google, claiming Gemini chatbot drove son into fatal delusion
MoffKalast@lemmy.world 5 hours ago
That would be my bet. LLMs really gravitate towards playing along and continuing whatever’s already written. And Gemini especially has a 1M-token context window, so it could be drawing on a book’s worth of prior text and reinforcing it up the wazoo.
That said, there is something really unhinged about Google’s Gemma series even in short conversations and I see the big version is no better. Something’s not quite right with their RLHF dataset.
calamitycastle@lemmy.world 4 hours ago
What is an RLHF dataset?
wonderingwanderer@sopuli.xyz 3 hours ago
Reinforcement Learning from Human Feedback
It’s a method of fine-tuning and aligning LLMs that requires active human input: annotators rank pairs of model responses, and those preferences are used to train a reward model that steers the LLM’s behavior.
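To make that concrete: the human preference data is typically used to train a reward model with a Bradley–Terry style loss, which pushes the model to score the human-preferred answer above the rejected one. Here's a minimal, hypothetical sketch of that loss (not anything specific to Gemini or Gemma):

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry style loss used when training RLHF reward models:
    loss = -log(sigmoid(r_chosen - r_rejected)).
    Small when the reward model already scores the human-preferred
    response higher; large when it prefers the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Reward model agrees with the human ranking -> low loss
print(preference_loss(2.0, 0.0))  # ~0.127
# Reward model disagrees -> high loss, big gradient signal
print(preference_loss(0.0, 2.0))  # ~2.127
```

If the preference dataset itself is skewed (e.g. annotators rewarding sycophantic or overwrought answers), this loss faithfully bakes those skews into the reward model, which is the kind of issue the comment above is speculating about.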