Comment on Reddit stock falls for second day as references to its content in ChatGPT responses plummet
plyth@feddit.org 5 days ago
They could sell the cleaned votes to AI companies and keep the dirty data public for the scrapers.
M1ch431@slrpnk.net 5 days ago
Meta and OpenAI openly pirating everything they can to train their LLMs is a good example of how data-hungry these AI companies are.
Is it plausible for companies to ask Reddit to narrow the data down, e.g. by demographic or geographic location, and offer that subset for purchase? Sure, but LLMs seemingly require all the data these companies can get their hands on. Given the scale of the data theft being committed, I highly doubt they care whether Reddit data is tainted. If anything, it might even be desirable to them.