Comment on GitHub hits CTRL-Z, decides it will train its AI with user data after all
S4m_S3p1l@infosec.pub 1 week ago
I’m not surprised. Companies are starting to realise that AI is only as useful as the data it’s trained on. If you blast it with all the internet slop we have, completely unfiltered, it’s going to start fucking up all its responses. It’s not just about the volume of data, it’s about the quality of that data. Sites like GitHub, and academic journals, contain the exact data that companies need to create well-rounded LLMs that don’t go off on racist rants and declare themselves “MechaHitler”. That makes data like GitHub’s pure gold.
MDCCCLV@lemmy.ca 1 week ago
Counterpoint: I’ve poisoned it with absolute dumb shit and the worst code you’ve ever seen.
luftruessel@feddit.org 1 week ago
Intentionally, right? Right?