Okay, but it is those niche subs that are the most valuable.
Comment on Reddit stock falls for second day as references to its content in ChatGPT responses plummet
M1ch431@slrpnk.net 2 weeks ago
Citation needed.
FlexibleToast@lemmy.world 2 weeks ago
M1ch431@slrpnk.net 2 weeks ago
Are you somebody invested in Reddit? Genuine question.
FlexibleToast@lemmy.world 2 weeks ago
No, I’m not. I don’t care at all if they’re successful or go under.
Sure, but again it’s not likely to be most. You don’t seem to realize how hard it is to get data that is already classified. That stuff is gold to people developing AI. Most of the work in data science is cleaning data and getting it into a usable form.
M1ch431@slrpnk.net 2 weeks ago
It’s noise, a very large part of it. Reddit is financially motivated to make the data appear as if it is signal. It isn’t - they have taken extremely minimal steps to ensure actual human participation.
This doesn’t matter to AI companies, but it only warps that technology more and more. AI is a sinking ship with current methodologies. Reddit will die when the AI bubble bursts and those involved with Reddit already cashed out enough to be filthy rich (e.g. Steve Huffman sold 500,000 of his shares in the IPO, indicating he will make $17mn).
plyth@feddit.org 2 weeks ago
They could sell the cleaned votes to AI companies and keep the dirty data public for the scrapers.
M1ch431@slrpnk.net 2 weeks ago
Meta/OpenAI openly pirating everything they can to train their LLMs is a good example of how data hungry these AI/etc. companies are.
Is it plausible for companies to ask Reddit to narrow down data, e.g. by demographic or geographic location, and request that data for purchase? Sure, but the LLMs seemingly require all the data these companies can get their hands on. Given the scale of data theft being committed, I highly doubt they care about Reddit data being tainted. If anything, it might even be desirable to them.