
Reddit will block the Internet Archive

⁨0⁩ ⁨likes⁩

Submitted ⁨⁨8⁩ ⁨months⁩ ago⁩ by ⁨General_Effort@lemmy.world⁩ to ⁨technology@lemmy.world⁩

https://www.theverge.com/news/757538/reddit-internet-archive-wayback-machine-block-limit

source

Comments

  • Suffa@lemmy.wtf ⁨7⁩ ⁨months⁩ ago

    I stopped using reddit long ago.

    source
  • buddascrayon@lemmy.world ⁨7⁩ ⁨months⁩ ago

    Not that reddit isn’t hot garbage right now, and has been for a while actually, but there are a lot of people here who have glossed over the reason why reddit instituted this policy.

    AI companies are scraping the Wayback Machine. This is something that should concern all of us.

    source
    • General_Effort@lemmy.world ⁨7⁩ ⁨months⁩ ago

      Why?

      source
      • Midnight1938@reddthat.com ⁨7⁩ ⁨months⁩ ago

        Circumventing sites with ‘no ai scraping’ rules

        source
  • Eh_I@lemmy.world ⁨7⁩ ⁨months⁩ ago

    Fuck Spez

    source
  • Evono@lemmy.dbzer0.com ⁨7⁩ ⁨months⁩ ago

    Reddit warned my account (first warning in 10 years) and deleted the comment when I told an American he can strike peacefully to show the government they are against it.

    I got a warning from an AI for recommending violence; the human who checked it agreed and didn’t remove the warning, haha.

    Reddit is just afraid of their censorship going public.

    source
    • Eh_I@lemmy.world ⁨7⁩ ⁨months⁩ ago

      I was on Reddit for like 15 years, then got all my warnings and a ban in like a month or two earlier this year. Oh well, lol.

      source
      • ArmchairAce1944@discuss.online ⁨7⁩ ⁨months⁩ ago

        I was on reddit for 11 years before getting banned due to zionists. I have a throwaway reddit account now for porn and other shit, but I don't post.

        source
      • lukaro@lemmy.zip ⁨7⁩ ⁨months⁩ ago

        I just replied “Liar, or fucking liar.” to every republican lie I saw. Only took 2 days for a permaban. I feel if they can lie we should be able to call them out on it at least.

        source
  • MangioneDontMiss@lemmy.ca ⁨7⁩ ⁨months⁩ ago

    reddit can go fuck itself.

    source
    • Eh_I@lemmy.world ⁨7⁩ ⁨months⁩ ago

      That’s the kind of talk that can get you banned from Reddit. 😜

      source
      • MangioneDontMiss@lemmy.ca ⁨7⁩ ⁨months⁩ ago

        I imagine almost my entire Post history can get me banned on Reddit.

        source
  • Njos2SQEZtPVRhH@piefed.social ⁨8⁩ ⁨months⁩ ago

    People who posted on Reddit (speaking in the past tense, because who would continue to do so now that we have better things?) never intended for it to be of limited access. Reddit was a publicly accessible place, and people shared their thoughts and comments on it because it was the front page of the internet, so the place of choice to share things with the world. That being scraped should not be a problem. But clearly Reddit didn't want to give you a platform to share your thoughts with the world; they wanted you to donate your thoughts so they could take them as their property and capitalize on them.

    source
    • General_Effort@lemmy.world ⁨7⁩ ⁨months⁩ ago

      I don’t know… I mean, I agree. But I’m seeing a lot of demands that instances should prevent scraping. Ok, it could be astroturf; a campaign by Reddit/data brokers to neutralize the free competition. But you have seen all those deleted posts on Reddit. Those are some special little minds.

      source
      • Njos2SQEZtPVRhH@piefed.social ⁨7⁩ ⁨months⁩ ago

        You're right, there are probably some anti-AI/anti-scraping folks on there as well as here. Personally I most definitely hate intellectual property more than I do generative AI. But you're right, different people on there will feel differently. But the point still stands that for those who thought they shared their thoughts with the world, their ideas that they donated were taken from them.

        source
  • Peculiaris@lemmy.zip ⁨8⁩ ⁨months⁩ ago

    In the lead-up to an IPO, u/spez has actively destroyed everything that made Reddit good! Gatekeeping the API, thinking it’ll help with making some bigshot LLM some day lol

    source
    • DrSoap@lemmy.world ⁨7⁩ ⁨months⁩ ago

      Lol every platform seems to live long enough to shoot themselves in the foot.

      source
      • ILikeBoobies@lemmy.ca ⁨7⁩ ⁨months⁩ ago

        Phpbb/mybb/smf haven’t seemed to do that.

        source
      • Peculiaris@lemmy.zip ⁨7⁩ ⁨months⁩ ago

        Enshittification

        source
  • sturmblast@lemmy.world ⁨8⁩ ⁨months⁩ ago

    Fuck Reddit

    source
  • PattyMcB@lemmy.world ⁨8⁩ ⁨months⁩ ago

    Another nail in the coffin.

    source
  • User79185@discuss.tchncs.de ⁨8⁩ ⁨months⁩ ago

    This is a huge blow to archiving, thanks to corporate greed and the enshittification of reddit. Worst MBA-filled POS.

    source
  • bigbabybilly@lemmy.world ⁨8⁩ ⁨months⁩ ago

    That place is becoming more and more of a shithole. Bots, Ads, trolls, garbage mods… deleted the app last month.

    source
    • espentan@lemmy.world ⁨7⁩ ⁨months⁩ ago

      I quit reddit, cold turkey, the day they shut off free API access for 3rd parties. Except for a couple of fairly niche subs I haven’t missed it at all.

      source
      • AstralPath@lemmy.ca ⁨7⁩ ⁨months⁩ ago

        Same here. I’ve been better off ever since.

        source
  • CaptPretentious@lemmy.world ⁨8⁩ ⁨months⁩ ago

    If you can’t archive something, did it ever really exist?

    source
    • Jax@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

      In a causal sense, yes. In a ‘the average person is fucking stupid’ sense, no.

      source
  • SocialMediaRefugee@lemmy.world ⁨8⁩ ⁨months⁩ ago

    So reddit will become even less valuable

    source
  • kokesh@lemmy.world ⁨8⁩ ⁨months⁩ ago

    They can keep their shit for themselves, stopped caring a long time ago.

    source
  • MehBlah@lemmy.world ⁨8⁩ ⁨months⁩ ago

    Once reddit has mutated a few more times, they'll start erasing stuff themselves. It will be lost to time and that fills me with hope.

    source
  • phantomwise@lemmy.ml ⁨8⁩ ⁨months⁩ ago

    Nice of them to protect their (users’) content from AI scraping. So that they can charge AI companies for it instead.

    source
    • muusemuuse@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

      They aren’t doing that. They are protecting content from being scraped for free. Reddit is perfectly happy to charge for AI access to user-generated content.

      source
      • ebolapie@lemmy.world ⁨8⁩ ⁨months⁩ ago

        No, that’s not what’s happening. They’re preventing scrapers from accessing the content at no charge. They’re totally willing to make deals for access to their content in exchange for money.

        source
  • Jhex@lemmy.world ⁨8⁩ ⁨months⁩ ago

    what’s a reddit?

    source
    • Bloomcole@lemmy.world ⁨8⁩ ⁨months⁩ ago

      You use it to scratch your butt I think.

      source
  • NigelFrobisher@aussie.zone ⁨8⁩ ⁨months⁩ ago

    Is that even possible?

    source
    • General_Effort@lemmy.world ⁨8⁩ ⁨months⁩ ago

      Technologically no. Reddit sends out the data to 10s of millions of users as part of their normal operations. They need to try to block those who collect that data for the IA. Reddit has the very short end of the stick.

      The problem is that evading such counter-measures may be criminal in the US. Obviously, EU laws are much harsher.

      source
      • SocialMediaRefugee@lemmy.world ⁨8⁩ ⁨months⁩ ago

        Not to mention all of Asia, South America, Africa…

        source
      • Bloomcole@lemmy.world ⁨8⁩ ⁨months⁩ ago

        Slightly related, can you explain how (a few times for me) an archived page I tried to revisit got erased?

        source
  • forkDestroyer@infosec.pub ⁨8⁩ ⁨months⁩ ago

    AI can scrape books and journals for info, but can’t scrape Reddit?

    source
    • General_Effort@lemmy.world ⁨8⁩ ⁨months⁩ ago

      Reddit can be scraped just as much as online books and journals.

      source
      • zarkanian@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

        So what’s the point of this?

        source
    • hunnybubny@discuss.tchncs.de ⁨8⁩ ⁨months⁩ ago

      Yes. Rules for thee.

      source
  • BD89@lemmy.sdf.org ⁨8⁩ ⁨months⁩ ago

    And I will block reddit.

    source
  • MedicPigBabySaver@lemmy.world ⁨8⁩ ⁨months⁩ ago

    Fuck Reddit and Fuck Spez.

    source
  • ozoned@piefed.social ⁨8⁩ ⁨months⁩ ago

    Good plan. Keep locking down your big tech platforms, and we'll all be over here letting folks know where they can find freedom.

    source
    • Bloomcole@lemmy.world ⁨8⁩ ⁨months⁩ ago

      ‘freedom’ as long as the mod agrees with you.

      source
    • aquovie@lemmy.cafe ⁨8⁩ ⁨months⁩ ago

      Careful. Lemmy is too small to draw the attention of sophisticated, persistent abuse. As a company, Reddit has struggled with revenue and we’ve all seen those struggles quite publicly. Lemmy instances with those same challenges would probably just fold and close up.

      Federated networks give you freedom but the potential for abuse is proportional to that freedom while at the same time, federation is far more expensive taken as a whole.

      source
      • bytesonbike@discuss.online ⁨8⁩ ⁨months⁩ ago

        Lemmy instances with those same challenges would probably just fold and close up.

        Can confirm. I set up a pixelfed instance for my city with the goal of moving people from Insta to this version. After about three months, user accounts went from 1-10 signups a week to a hundred a week.

        No way did that many business owners sign up. And yep, all spam.

        After a while, my random weekend project in Spring became a full time job. I closed it last month.

        source
      • girsaysdoom@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

        I’m sure it would persist even after an event of malicious activity. It may just turn out like email with servers needing to be added to an allowlist at worst and more moderation. I think scalability might be the limiting factor at some point though and as a result we could end up with several disconnected islands of server clusters instead of globally meshed servers.

        source
    • yarr@feddit.nl ⁨8⁩ ⁨months⁩ ago

      Or… let them stay on Reddit. I like lemmy much better, and it’s possibly due to the people that are not present and the lack of commercial interest.

      source
      • Capybara_mdp@reddthat.com ⁨7⁩ ⁨months⁩ ago

        Does anyone have any good tech-related forums on Lemmy? I’m still digging around as I find a lot of interesting but “quiet” ones.

        source
      • Jason2357@lemmy.ca ⁨8⁩ ⁨months⁩ ago

        I think if the fediverse was ever to become more mainstream, it would naturally splinter. For example, the corporate stuff would be big, and those people who value the small-instance experience we have now would probably de-federate from it. There would always be small fediverses, even if the big fediverses got REALLY big.

        source
      • bytesonbike@discuss.online ⁨8⁩ ⁨months⁩ ago

        In the tech world, we call that a honeytrap.

        source
      • ZombieMantis@lemmy.world ⁨8⁩ ⁨months⁩ ago

        Just make your own invite-only server if you’re so worried about it. Digital freedom should be for everyone, not just a few antisocial nerds.

        source
      • ozoned@piefed.social ⁨8⁩ ⁨months⁩ ago

        No harm in that. To each their own. :-) Everyone gets to decide at least.

        source
  • bathing_in_bismuth@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

    That means big news is coming, and the media doesn’t want to fuck up the reporting that is coming. Reddit is preparing for mass submission of articles.

    source
  • MonkderVierte@lemmy.zip ⁨8⁩ ⁨months⁩ ago

    The company limited search crawlers to google, why are you surprised?

    source
  • Blackmist@feddit.uk ⁨8⁩ ⁨months⁩ ago

    It’s another move to protect against AI scraping that isn’t paying them for access.

    source
    • sqgl@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

      Wasn’t Reddit complaining a couple of years ago that too many AI bot crawls were stressing their servers?

      Doesn’t the internet archive relieve that stress?

      source
      • supersquirrel@sopuli.xyz ⁨7⁩ ⁨months⁩ ago

        Doesn’t the internet archive relieve that stress?

        I think that was probably the real reason for the block. The Internet Archive is too functional, scalable and accessible a service: unless reddit came up with an excuse to block it, reddit’s lame excuses about needing to gatekeep access to the community-created content on their website would make reddit look totally stupid.

        source
  • conorab@lemmy.conorab.com ⁨8⁩ ⁨months⁩ ago

    As somebody who often ends up using Reddit like Stackoverflow and in some cases needing the Internet Archive (IA) to find the original post after it’s been deleted or garbled, I think this is a wakeup call for those who go to Reddit both to get technical help and to post it. More than ever, Reddit is becoming an unreliable place to find answers for old obscure issues, and if they are going to lock out places like the IA then I think it’s time people stopped contributing their solutions to Reddit.

    source
    • Sxan@piefed.zip ⁨8⁩ ⁨months⁩ ago

      Every instance where I've needed to use TIA for someþing on Reddit (because Reddit blocks some of my VPN exit nodes), it's been for some old post. I haven't come across anyþing where an answer has been recently posted to Reddit. Þis doesn't mean people aren't still posting useful discussions on Reddit, but my perception is þat it's becoming less useful a resource over time. Maybe because þe knowledgeable people have mostly migrated off?

      Ofttimes what I've looked up in TIA for Reddit was already cached. Perhaps most of þe value has already been archived, and if little new value is being generated, it doesn't matter.

      Þe upshot is, I'm not sure how much effect þis will actually have.

      source
      • mrgoosmoos@lemmy.ca ⁨8⁩ ⁨months⁩ ago

        exact same here. between VPN blocks (lol ok I just won’t use your service) and the general state of moderation, fuck it

        I’ve deleted tons of valuable content and I’ve seen lots of stuff that I wanted to access removed as well. it’s annoying, but oh well. other forums will remain

        source
    • mazzilius_marsti@lemmy.world ⁨8⁩ ⁨months⁩ ago

      most of my technical questions about Linux are not even answered lol. So difficult to get good answers on reddit.

      source
    • mojofrododojo@lemmy.world ⁨8⁩ ⁨months⁩ ago

      yup. continuing to feed them traffic after their repeated attacks on the userbase is just sad. stop using them. yeah it sucks the info is gone, but acting like they’ll wake up and change is absurd.

      source
    • NauticalNoodle@lemmy.ml ⁨8⁩ ⁨months⁩ ago

      When I joined Lemmy I decided it was unwise to trust anything on Reddit less than a year old. Now it’s anything under two years old.

      source
    • cashsky@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

      Searching anywhere in general is getting shittier and shittier by the day. Web searches are riddled with hallucinated AI-generated garbage pages. Finding the right answer for difficult problems is getting worse and worse. We are sliding rapidly into Idiocracy.

      source
      • baggachipz@sh.itjust.works ⁨8⁩ ⁨months⁩ ago

        We are sliding rapidly into Idiocracy.

        Buddy, we are already there. “Ow, my balls!” would be high-brow TV these days.

        source
      • dizzy@lemmy.ml ⁨8⁩ ⁨months⁩ ago

        Not to mention so many projects putting their support in walled garden chat services like Discord that you can’t even search via search engine. Even if you can figure out who asked the right question and when, you have to trawl through a sea of inane garbled chat to get to the developer/expert response.

        Specialised topic forums really need to make a resurgence but I doubt they will.

        source
  • JakenVeina@midwest.social ⁨8⁩ ⁨months⁩ ago

    The company says that AI companies have scraped data from the Wayback Machine, so it’s going to limit what the Wayback Machine can access.

    Yeah, wouldn’t want those AI companies to get all that data for free. Gotta make 'em pay for it.

    source
    • brygphilomena@lemmy.dbzer0.com ⁨8⁩ ⁨months⁩ ago

      Instead of regulating tech, they are going the fuck over everyone route.

      source
  • Keyboard@lemmy.world ⁨8⁩ ⁨months⁩ ago

    I already gave up on Reddit a long time ago. Deleted everything.

    source
    • Truscape@lemmy.blahaj.zone ⁨8⁩ ⁨months⁩ ago

      When RIF died, Voyager became the new forum app for me.

      source
      • Keyboard@lemmy.world ⁨7⁩ ⁨months⁩ ago

        Thanks for sharing. I will check it out

        source
      • Keyboard@lemmy.world ⁨8⁩ ⁨months⁩ ago

        Maybe I should try voyager too

        source
      • boonhet@sopuli.xyz ⁨8⁩ ⁨months⁩ ago

        Apollo and Voyager for me so I straight-up retained the same UI.

        source
    • jjlinux@lemmy.zip ⁨8⁩ ⁨months⁩ ago

      Yup, same here.

      source
      • mojofrododojo@lemmy.world ⁨8⁩ ⁨months⁩ ago

        this is the way.

        source
  • Cornpop@lemmy.world ⁨8⁩ ⁨months⁩ ago

    Time to just ignore them and scrape it anyways

    source
  • adespoton@lemmy.ca ⁨8⁩ ⁨months⁩ ago

    OK, I stopped posting on Reddit but left my account and comments in place because I considered them part of the public record. If Reddit is taking that record private, it’s time for me to start removing my content from the platform.

    Does anyone know if historical Reddit content will remain in IA? If not, I’m going to have to back up years of content somewhere else.

    source
    • General_Effort@lemmy.world ⁨8⁩ ⁨months⁩ ago

      Reddit is archived and available as torrent up until the API change.

      source
    • ludicolo@lemmy.ml ⁨8⁩ ⁨months⁩ ago

      There are some browser extensions that will edit your comments and turn each of them into a bunch of random words. I do not know how effective they are so I cannot vouch for them.

      I know that if you tried to just delete the comment, the information would still be there but the username is deleted. Which is frustrating, I didn’t know that until I had already deleted every post and comment, went back to make sure the job was done. It wasn’t. I just came to terms that at least I wasn’t contributing to their hub of knowledge anymore.

      source
    • cyberpunk007@lemmy.ca ⁨8⁩ ⁨months⁩ ago

      You can’t remove it. It’s there forever.

      source
      • Appoxo@lemmy.dbzer0.com ⁨8⁩ ⁨months⁩ ago

        Wrong.
        You can request deletion of archived pages.

        source
    • xthexder@l.sw0.com ⁨8⁩ ⁨months⁩ ago

      I’m assuming IA will continue to host their historical archives of Reddit, they’ll just not have any new captures after this. Unless IA has said otherwise, it’d be very strange to wipe their archive of Reddit

      source