It is my understanding that if you block the Wayback Machine from indexing your site, it will delist the history as well.
Comment on Reddit will block the Internet Archive
captainastronaut@seattlelunarsociety.org 8 months ago
As long as the previous collections of archives are still intact. We probably don’t need all of their new spam posts in the Wayback Machine anyway.
hamFoilHat@lemmy.world 8 months ago
Natanael@infosec.pub 8 months ago
The ability to block crawling is separate from the ability to delist old pages. The latter usually happens after domains change owners.
Jason2357@lemmy.ca 8 months ago
They do archive sites against the owner’s wishes when they consider it an important site for public archiving, like some news sites. They are under no obligation to delete the archives, and I hope they don’t.
tal@lemmy.today 8 months ago
Parties have archived the data from Pushshift, which covers a lot of Reddit history.
kagis
academictorrents.com/…/56aa49f9653ba545f48df2e336…
Subreddit comments/submissions 2005-06 to 2023-12
This is the top 40,000 subreddits from reddit’s history in separate files. You can use your torrent client to only download the subreddits you’re interested in.
I mean, that won’t have the past 18 months or some low-traffic subreddits, but…
Sxan@piefed.zip 8 months ago
LOL, I should have scrolled down first. You said what I said, with fewer words, first.