Everyone say it with me now…
FUCK SPEZ!
Submitted 8 hours ago by misk@piefed.social to technology@lemmy.zip
https://www.theverge.com/news/757538/reddit-internet-archive-wayback-machine-block-limit
You rang?
And they will victim-blame themselves later!
It’s another move to protect against AI scraping.
Ha! That just means all the content deleted by users who left Reddit is actually inaccessible.
No, they reverted a lot of that, bulk-restoring even “overwritten” post data weeks and months after the fact, once most people had stopped checking.
I don’t think they did. Unless you have evidence otherwise, I think this is a rumour which comes from a misunderstanding of how deletion tools worked. Until recently, the API only provided access to 1000 posts per feed, i.e. the 1000 most recent comments in your /comments/new feed. So if you try to mass delete “everything”, you really only delete the 1k posts from each feed, which can leave a lot of your posts/comments unfindable. This has been a commonly complained-about limitation of the API for 15 years. But people would run a “mass deletion” tool, think they’d deleted everything, then later find a comment that wasn’t deleted and get all conspiratorial. I seriously doubt Reddit cares about your comment that much, much as we love to hate them.
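For context on that 1k cap, this is roughly what “walking a feed” looks like against the public listing endpoint: the listing simply ends once the cap is reached, even though older comments still exist. A minimal sketch, assuming the unauthenticated .json listing still works for your account and with `myusername` as a placeholder (in practice you may need OAuth and a proper User-Agent):

```python
# Rough sketch: count how many comments one feed actually exposes.
import requests

USERNAME = "myusername"  # placeholder account name
URL = f"https://www.reddit.com/user/{USERNAME}/comments.json"
HEADERS = {"User-Agent": "listing-depth-check/0.1"}

seen = 0
after = None
while True:
    params = {"limit": 100}
    if after:
        params["after"] = after
    resp = requests.get(URL, headers=HEADERS, params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()["data"]
    children = data["children"]
    seen += len(children)
    after = data["after"]
    # The listing ends (after == null) once the feed's cap is hit,
    # historically around 1000 items, even if older comments exist.
    if not after or not children:
        break

print(f"Comments reachable through this feed: {seen}")
```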
I personally wrote a script to scrape search engine results for a “myusername site:reddit.com” search, looking for my comments to delete. After running various mass deletion tools which claimed to have deleted everything (and made my profile look empty, since all feeds had been exhausted), I was able to delete tens of thousands more comments which weren’t findable via the API. I… used reddit a lot.
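If anyone wants to replicate the search-engine approach, here’s a rough sketch of the idea. It assumes you’ve saved the result pages of a `myusername site:reddit.com` search to local HTML files (the folder name and username are placeholders); hitting engines programmatically is usually against their terms and the markup changes constantly, so parsing saved pages keeps it simple:

```python
# Sketch: pull Reddit comment permalinks out of saved search-result pages.
import glob
import re
from html.parser import HTMLParser

PERMALINK = re.compile(r"https?://(?:www\.)?reddit\.com/r/[^/]+/comments/\S+")

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and PERMALINK.match(value):
                self.links.add(value.split("?")[0])

found = set()
for path in glob.glob("search_results/*.html"):  # placeholder folder of saved pages
    parser = LinkCollector()
    with open(path, encoding="utf-8") as fh:
        parser.feed(fh.read())
    found |= parser.links

# These permalinks can then be fed into whatever deletion step you use
# (the API's delete endpoint, or browser automation).
for link in sorted(found):
    print(link)
```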
The API has recently been updated to allow more (?) posts to be visible. Since that change I was able to view a couple thousand more comments via my profile, so I deleted those too. If you have that volume of posts it’s really just a case of trying to find them all however you can. The API has always been limited.
My deleted posts have stayed deleted so far.
Pfffft who cares let the bots fight the bots. Tis all that is left anyhoo.
Who?
Problem is, scraper bots are way more aggressive and harder to block. If the scrapers were ignoring Reddit because they could take the content from the IA instead, and the IA obeys robots.txt while scraper bots don’t, then Reddit has just shifted the load of serving those bots (and playing whack-a-mole with their block-evasion mechanisms) onto itself. They aren’t going to stop the bots. It may help them negotiate a license with the bigger players, but that’s likely not going to make up for the money they spend dealing with the bots in the long run. Of course companies like this don’t really think long term; it just looks good to investors this quarter.
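To illustrate the robots.txt point: a polite crawler like the IA checks the published rules before fetching anything, which is exactly the step aggressive scrapers skip. A toy sketch (the user-agent strings here are stand-ins, not Reddit’s actual rules):

```python
# Sketch: what "obeying robots.txt" means for a well-behaved crawler.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.reddit.com/robots.txt")
rp.read()

# Placeholder user agents; the real rules are whatever Reddit publishes.
for agent in ("ia_archiver", "SomeLLMScraperBot"):
    allowed = rp.can_fetch(agent, "https://www.reddit.com/r/technology/")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```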
FreedomAdvocate@lemmy.net.au 2 hours ago
“It’s another move to protect against AI scraping.”
Not because they’re against AI getting their data, oh no - because they SELL their data to Google to use for their AI.