Comment on Reddit Will License Its Data to Train LLMs, So We Made a Firefox Extension That Lets You Replace Your Comments With Any (Non-Copyrighted) Text

lvxferre@mander.xyz 8 months ago

They almost certainly do, if only because of the practicalities of adding a new comment

If this is true, it shifts the problem from “not having it” to “not knowing which version should be used” (to train the LLM).

They could feed it the unedited versions and call it a day, but people often edit their content to correct it or add further info, especially for “meatier” content (like tutorials). So there’s still some value in the edits, and I believe that Google will be at least tempted to use them.

If that’s correct, editing comments into nonsense will lower the value of edited comments for LLM training. It should have an impact, just not as big as it would be if they kept no version system.

It would also help with any administration/moderation tasks if they could see whether people posted rule-breaking content and then tried to hide it behind edits.

I know from experience (I’m a former Reddit janny) that moderators can’t see earlier versions of the content, only the last one. The admins might, though.

That said, one of the many Spez controversies did show that they are capable of making actual edits on the back end if they wished.

The one from TD, right?
